New software based on Artificial Intelligence helps to interpret complex data

Experimental data are often not only high-dimensional, but also noisy and full of artefacts, which makes them difficult to interpret. Now a team at HZB has designed software that uses self-learning neural networks to compress the data in a smart way and then reconstruct a low-noise version. This makes it possible to recognise correlations that would otherwise not be discernible. The software has now been used successfully in photon diagnostics at the FLASH free-electron laser at DESY, but it is suitable for very different applications in science.

More is not always better; sometimes it is a problem. In highly complex data, which have many dimensions because of their numerous parameters, correlations are often no longer recognisable, especially since experimentally obtained data are additionally noisy and distorted by influences that cannot be controlled.

Helping humans to interpret the data

Now, new software based on artificial intelligence methods can help. It uses a special class of neural networks (NNs) that experts call a "disentangled variational autoencoder" (β-VAE). Put simply, the first NN compresses the data, while the second NN subsequently reconstructs it. "In the process, the two NNs are trained so that the compressed form can be interpreted by humans," explains Dr Gregor Hartmann. The physicist and data scientist supervises the Joint Lab on Artificial Intelligence Methods at HZB, which is run by HZB together with the University of Kassel.
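The compress-then-reconstruct idea can be illustrated with a minimal sketch. This is not the HZB software: the encoder and decoder here are untrained random linear maps standing in for the two neural networks, the input is synthetic noise rather than FLASH spectra, and the names `encode`, `decode`, `reparameterise` and `beta_vae_loss`, the latent size of 2, the squared-error reconstruction term and the weight `beta=4.0` are all assumptions made for illustration. What it shows is the generic β-VAE training objective: a reconstruction error plus a β-weighted penalty that pushes the compressed representation towards independent, standard-normal components.

```python
import numpy as np


def encode(x, w_mu, w_log_var):
    """Hypothetical linear encoder: maps each sample to a latent mean and log-variance."""
    return x @ w_mu, x @ w_log_var


def reparameterise(mu, log_var, rng):
    """Draw a latent sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def decode(z, w_dec):
    """Hypothetical linear decoder: maps the latent sample back to data space."""
    return z @ w_dec


def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Reconstruction error plus beta-weighted KL divergence, averaged over the batch."""
    # Squared reconstruction error, summed over features for each sample.
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # KL divergence between N(mu, exp(log_var)) and a standard normal prior.
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(recon + beta * kl))


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 100))             # 8 synthetic samples with 100 bins each
w_mu = 0.01 * rng.standard_normal((100, 2))   # compress 100 bins to 2 latent values
w_log_var = 0.01 * rng.standard_normal((100, 2))
w_dec = 0.01 * rng.standard_normal((2, 100))

mu, log_var = encode(x, w_mu, w_log_var)
z = reparameterise(mu, log_var, rng)
x_recon = decode(z, w_dec)
loss = beta_vae_loss(x, x_recon, mu, log_var)
print(f"latent shape: {z.shape}, loss: {loss:.2f}")
```

In an actual β-VAE the weights of both networks would be adjusted by gradient descent to minimise this loss. Setting β above 1 puts more weight on the penalty term, which is what encourages the individual latent values to capture separate, interpretable factors.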

Extracting core principles without prior knowledge

Google DeepMind had already proposed the use of β-VAEs in 2017. Many experts assumed that applying them in the real world would be challenging, as non-linear components are difficult to disentangle. "After several years of learning how the NNs learn, it finally worked," says Hartmann. β-VAEs are able to extract the underlying core principle from data without prior knowledge.

Photon energy of FLASH determined

In the study now published, the group used the software to determine the photon energy of FLASH from single-shot photoelectron spectra. "We succeeded in extracting this information from noisy electron time-of-flight data, and much better than with conventional analysis methods," says Hartmann. Even data with detector-specific artefacts can be cleaned up this way.

A powerful tool for different problems

"The method is really good when it comes to impaired data," Hartmann emphasises. The programme is even able to reconstruct tiny signals that were not visible in the raw data. Such networks can help uncover unexpected physical effects or correlations in large experimental data sets. "AI-based intelligent data compression is a very powerful tool, not only in photon science," says Hartmann.

Now plug and play

In total, Hartmann and his team spent three years developing the software. "But now, it is more or less plug and play. We hope that soon many colleagues will come with their data and we can support them."

arö
