AI in Chemistry: Study Highlights Strengths and Weaknesses

Computing power in the chemistry lab: Kevin Jablonka (left) and his team at HIPOLE Jena. Photo: Renzo Paulus

How well does artificial intelligence perform compared to human experts? A research team at HIPOLE Jena set out to answer this question in the field of chemistry. Using a newly developed evaluation method called “ChemBench,” the researchers compared the performance of modern language models such as GPT-4 with that of experienced chemists. 

The study has recently been published in the journal Nature Chemistry (DOI 10.1038/s41557-025-01815-x).

The team tested more than 2,700 chemistry tasks from research and education, ranging from fundamental knowledge to complex problems. In areas such as reaction prediction or the analysis of large datasets, AI models often performed well and worked efficiently. However, a critical weakness became apparent: the models answered with confidence even when their answers were factually incorrect. Human chemists, by contrast, were more cautious and questioned their own assessments.

“Our study shows that AI can be a valuable tool—but it is no substitute for human expertise,” says Dr. Kevin M. Jablonka, lead author of the study. The findings offer important insights for the responsible use of AI in chemical research and education.

HIPOLE Jena (Helmholtz Institute for Polymers in Energy Applications Jena) is an institute of Helmholtz-Zentrum Berlin (HZB) in cooperation with Friedrich Schiller University Jena (FSU Jena).

ma
