AI in Chemistry: Study Highlights Strengths and Weaknesses
Computing power in the chemistry lab: Kevin Jablonka (left) and his team at HIPOLE Jena. Photo: Renzo Paulus
How well does artificial intelligence perform compared to human experts? A research team at HIPOLE Jena set out to answer this question in the field of chemistry. Using a newly developed evaluation method called “ChemBench,” the researchers compared the performance of modern language models such as GPT-4 with that of experienced chemists.
The study has recently been published in the journal Nature Chemistry (DOI 10.1038/s41557-025-01815-x).
More than 2,700 chemistry tasks from research and education were tested, ranging from fundamental knowledge to complex problems. In areas such as reaction prediction or the analysis of large datasets, AI models often performed well and efficiently. However, a critical weakness became apparent: the models gave confident answers even when those answers were factually incorrect. Human chemists, by contrast, were more cautious and questioned their own assessments.
“Our study shows that AI can be a valuable tool—but it is no substitute for human expertise,” says Dr. Kevin M. Jablonka, lead author of the study. The findings offer important insights for the responsible use of AI in chemical research and education.
HIPOLE Jena (Helmholtz Institute for Polymers in Energy Applications Jena) is an institute of HZB in cooperation with Friedrich Schiller University Jena (FSU Jena).
ma
https://www.helmholtz-berlin.de/pubbin/news_seite?nid=30246;sprache=en
- Successful master's degree in IR thermography on solar facades
We are delighted to congratulate our student employee Luca Raschke on successfully completing her Master's degree in Renewable Energies at the Hochschule für Technik und Wirtschaft Berlin - and with distinction!
- BESSY II: Phosphorus chains – a 1D material with 1D electronic properties
For the first time, a team at BESSY II has succeeded in demonstrating the one-dimensional electronic properties of a material through a highly refined experimental process. The samples consisted of short chains of phosphorus atoms that self-organise at specific angles on a silver substrate. Through sophisticated analysis, the team was able to disentangle the contributions of these differently aligned chains. This revealed that the electronic properties of each chain are indeed one-dimensional. Calculations predict a phase transition once these chains are packed more closely: while a material consisting of individual chains spaced far apart is semiconducting, a very dense chain structure would be metallic.
- Did marine life in the Palaeocene use a compass?
Some ancient marine organisms produced mysterious magnetic particles of unusually large size, which can now be found as fossils in marine sediments. An international team has succeeded in mapping the magnetic domains of one such ‘giant magnetofossil’ using a sophisticated method at the Diamond X-ray source. Their analysis shows that these particles could have allowed the organisms to sense tiny variations in both the direction and intensity of the Earth’s magnetic field, enabling them to geolocate themselves and navigate across the ocean. The method offers a powerful tool for magnetically testing whether putative biological iron oxide particles in Mars samples have a biogenic origin.