AI in Chemistry: Study Highlights Strengths and Weaknesses

Computing power in the chemistry lab: Kevin Jablonka (left) and his team at HIPOLE Jena. Photo: Renzo Paulus

How well does artificial intelligence perform compared to human experts? A research team at HIPOLE Jena set out to answer this question in the field of chemistry. Using a newly developed evaluation method called “ChemBench,” the researchers compared the performance of modern language models such as GPT-4 with that of experienced chemists. 

The study has recently been published in the journal Nature Chemistry (DOI 10.1038/s41557-025-01815-x).

More than 2,700 chemistry tasks from research and education were tested, ranging from fundamental knowledge to complex problems. In areas such as reaction prediction or the analysis of large datasets, the AI models often performed well and efficiently. However, a critical weakness became apparent: the models gave confident answers even when those answers were factually incorrect. Human chemists, by contrast, were more cautious and questioned their own assessments.

“Our study shows that AI can be a valuable tool—but it is no substitute for human expertise,” says Dr. Kevin M. Jablonka, lead author of the study. The findings offer important insights for the responsible use of AI in chemical research and education.

HIPOLE Jena (Helmholtz Institute for Polymers in Energy Applications Jena) is an institute of Helmholtz-Zentrum Berlin (HZB) in cooperation with Friedrich Schiller University Jena (FSU Jena).

ma

