AI in Chemistry: Study Highlights Strengths and Weaknesses

Computing power in the chemistry lab: Kevin Jablonka (left) and his team at HIPOLE Jena. Photo: Renzo Paulus

How well does artificial intelligence perform compared to human experts? A research team at HIPOLE Jena set out to answer this question in the field of chemistry. Using a newly developed evaluation method called “ChemBench,” the researchers compared the performance of modern language models such as GPT-4 with that of experienced chemists. 

The study has recently been published in the journal Nature Chemistry (DOI 10.1038/s41557-025-01815-x).

More than 2,700 chemistry tasks from research and education were tested, ranging from fundamental knowledge to complex problems. In areas such as reaction prediction or the analysis of large datasets, the AI models often performed well and efficiently. However, a critical weakness became apparent: the models gave confident answers even when those answers were factually incorrect. Human chemists, by contrast, were more cautious and questioned their own assessments.

“Our study shows that AI can be a valuable tool—but it is no substitute for human expertise,” says Dr. Kevin M. Jablonka, lead author of the study. The findings offer important insights for the responsible use of AI in chemical research and education.

HIPOLE Jena (Helmholtz Institute for Polymers in Energy Applications Jena) is an institute of HZB in cooperation with Friedrich Schiller University Jena (FSU Jena).

ma
