Calculating the "fingerprints" of molecules with artificial intelligence

The graph neural network (GNN) receives small molecules as input and is tasked with determining their spectral responses. By comparing its results with known spectra, the GNN programme learns to calculate spectra reliably. © K. Singh, A. Bande/HZB

With conventional methods, calculating the spectral fingerprint of larger molecules is extremely time-consuming. Yet this is a prerequisite for correctly interpreting experimentally obtained data. Now, a team at HZB has achieved very good results in significantly less time using self-learning graph neural networks.

"Macromolecules but also quantum dots, which often consist of thousands of atoms, can hardly be calculated in advance using conventional methods such as DFT [density functional theory]," says PD Dr. Annika Bande at HZB. Together with her team, she has now investigated how computing time can be shortened using methods from artificial intelligence.

The idea: a computer programme from the family of "graph neural networks" (GNNs) receives small molecules as input and is tasked with determining their spectral responses. In the next step, the GNN programme compares the calculated spectra with known target spectra (from DFT or experiment) and adjusts its internal parameters accordingly. Round after round, the result improves. The GNN programme thus learns on its own, with the help of known spectra, how to calculate spectra reliably.
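The learning loop described above can be illustrated with a deliberately tiny sketch. It is not SchNet and not the team's code: the two toy molecules, their three-bin "spectra", and all function names are hypothetical and chosen only for illustration. Each molecule is a graph of atoms (nodes) and bonds (edges). One fixed message-passing step mixes neighbouring atom features, and a linear readout is then corrected round after round against the target spectra. In a real GNN the message-passing layers would be trained as well.

```python
# Toy sketch only: this is NOT SchNet and not the HZB team's code.
# All molecules, "spectra" and names below are hypothetical.

def message_pass(features, edges):
    """One message-passing step: every atom adds its bonded neighbours' features."""
    updated = [list(f) for f in features]
    for i, j in edges:  # each bond is undirected
        for k in range(len(features[i])):
            updated[i][k] += features[j][k]
            updated[j][k] += features[i][k]
    return updated

def embed(molecule):
    """Sum-pool the atom features into a single vector per molecule."""
    atoms = message_pass(molecule["features"], molecule["edges"])
    return [sum(column) for column in zip(*atoms)]

def predict(weights, vec):
    """Linear readout: one predicted intensity per spectral bin."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def mse(pred, target):
    """Mean squared error between predicted and target spectrum."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def train(molecules, n_bins, rounds=2000, lr=0.02):
    """Compare predictions with target spectra and correct the readout weights."""
    dim = len(molecules[0]["features"][0])
    weights = [[0.0] * dim for _ in range(n_bins)]
    history = []  # average error per round
    for _ in range(rounds):
        total = 0.0
        for mol in molecules:
            vec = embed(mol)
            pred = predict(weights, vec)
            total += mse(pred, mol["spectrum"])
            for b in range(n_bins):
                err = pred[b] - mol["spectrum"][b]
                for k in range(dim):
                    # gradient of the mean squared error w.r.t. weights[b][k]
                    weights[b][k] -= lr * 2 * err * vec[k] / n_bins
        history.append(total / len(molecules))
    return weights, history

# Hypothetical data: atom features are one-hot element labels,
# "spectrum" is a made-up 3-bin target, not a real measurement.
molecules = [
    {"features": [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
     "edges": [(0, 1), (1, 2)],
     "spectrum": [0.1, 0.8, 0.3]},
    {"features": [[1.0, 0.0], [0.0, 1.0]],
     "edges": [(0, 1)],
     "spectrum": [0.5, 0.2, 0.0]},
]

weights, history = train(molecules, n_bins=3)
print(f"error after first round: {history[0]:.4f}")
print(f"error after last round:  {history[-1]:.4f}")
```

The printed error after the last round should be smaller than after the first, which is the "round after round" improvement in miniature. Models such as SchNet additionally use interatomic distances and learn the message-passing step itself.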

"We have trained five newer GNNs and found that enormous improvements can be achieved with one of them, the SchNet model: The accuracy increases by 20% and this is done in a fraction of the computation time," says first author Kanishka Singh. Singh participates in the HEIBRiDS graduate school and is supervised by two experts from different backgrounds: computer science expert Prof. Ulf Leser from Humboldt University Berlin and theoretical chemist Annika Bande.

"Recently developed GNN frameworks could do even better," says Bande. "And the demand is very high. We therefore want to strengthen this line of research and are planning to create a new postdoctoral position for it from summer onwards as part of the Helmholtz project 'eXplainable Artificial Intelligence for X-ray Absorption Spectroscopy'."

 

Note:

The work was carried out within the framework of the HEIBRiDS graduate school and is being supported by the Helmholtz project "eXplainable Artificial Intelligence for X-ray Absorption Spectroscopy" (XAI-4-XAS).

The core of the project is to extend the GNNs used at HZB to very large molecules, in combination with the probabilistic analysis of molecular motifs developed at HEREON. This analysis is used to capture only the relevant part of the molecules' configuration phase space, which is necessary for the accurate prediction of X-ray spectra. The results of the ML predictions allow a rigorous interpretation of XAS experiments, so that characteristic parts of the spectrum of an extended material can be assigned one-to-one to its specific structural subgroups.

 

arö
