How machine learning can support atmospheric compound discovery – Technology Org

Identifying chemical compounds found in the atmosphere is challenging and tedious, and currently relies on mass spectrometry measurements. A new perspective paper discusses the promise machine learning holds for accelerating ongoing studies that map new atmospheric compounds and for improving their accuracy. Such compounds are worth studying because they contribute to atmospheric particle formation, which directly affects climate and air quality.

Illustration by Hilda Sandström

CEST researchers Hilda Sandström and Patrick Rinke, along with collaborators from Aalto University, the University of Helsinki and Tampere University, conducted a comprehensive review of the current state of data-driven compound identification in atmospheric mass spectrometry. This perspective article outlines crucial steps required from the atmospheric chemistry community to implement the identification of compounds using modern smart algorithms.

Atmospheric organic compounds are acknowledged to be complex and extremely numerous, yet detailed knowledge of their reaction mechanisms, intermediates, and products is lacking. Efforts to gain new fundamental knowledge about these atmospheric processes continue, relying primarily on mass spectrometry. However, existing experimental data libraries and manual identification methods struggle to cope with the sheer number, large variability and complexity inherent in atmospheric compounds and processes.

While smart compound identification algorithms have demonstrated state-of-the-art performance in other chemical disciplines, their implementation in atmospheric chemistry has been hindered by the scarcity of training data from such atmospheric mass spectrometry studies. The researchers have provided examples of how these machine learning-based compound identification tools could be effectively utilized in conjunction with soft ionization techniques commonly employed in atmospheric mass spectrometry.

Establishing automated and improved identification methods for atmospheric compounds is pivotal to advancing our basic understanding of atmospheric chemistry. Crucially, the paper proposes an action plan to create an infrastructure for the development of data-driven compound identification in atmospheric mass spectrometry. Following this initial review, Sandström and collaborators now aim to begin developing and testing these identification methods to help identify atmospheric compounds.

Source: Aalto University