

Figure reproduced from H. Toru Shay et al. (2025), The Astrophysical Journal. The molecular species detected in TMC–1 are converted into numerical vectors with the VICGAE model. The vectorized structures and their corresponding column densities are then used to train a machine learning regressor. By learning the relationship between vectorized molecular structures and column densities, the model is optimized to find the best fit. Once trained, the best-fit model can predict column densities for new molecules and be used to identify potential detection targets in TMC–1.
In the last few years, several approaches have used machine learning to reproduce the inventory and column densities of interstellar molecular sources.
These models can also predict other molecules that might make good targets for discovery in a source, along with a predicted column density for each. A molecular target and column density alone are often not enough to identify a good target; spectroscopic considerations (e.g., permanent electric dipole moments, line intensities, and line frequencies) need to be accounted for as well.
Here we describe a new approach to take these considerations into account, in an attempt to identify the hidden likely candidates among thousands of molecules and associated column densities recommended by machine learning. We leverage machine learning results alongside quantum chemical and astronomical considerations to define a “detectability” metric and then apply this analysis to machine learning–recommended molecules in TMC–1.
We find that this approach significantly narrows the field of candidate species but also highlights some of the ongoing issues with results from machine learning models trained on the very small (for machine learning) data sets produced by observations. We discuss these in the context of the need for human intervention, finding that our approach reduces but certainly does not eliminate this requirement.
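As a rough illustration of why column density alone is not enough, the sketch below ranks candidates by a toy score, log10(N × μ²), where N is a predicted column density and μ is the permanent electric dipole moment. It relies only on the general fact that rotational line strength scales with μ². This is not the detectability metric defined in the paper, which the abstract does not specify. The names `Candidate`, `toy_detectability`, and `rank`, as well as all numerical values, are hypothetical placeholders.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str              # placeholder label, not a real species
    column_density: float  # predicted column density in cm^-2 (made-up value)
    dipole_debye: float    # permanent dipole moment in debye (made-up value)


def toy_detectability(c: Candidate) -> float:
    """Toy proxy log10(N * mu^2); -inf when the species has no dipole moment."""
    if c.column_density <= 0 or c.dipole_debye <= 0:
        return float("-inf")
    return math.log10(c.column_density) + 2 * math.log10(c.dipole_debye)


def rank(candidates):
    """Sort candidates from highest to lowest toy score."""
    return sorted(candidates, key=toy_detectability, reverse=True)


# Illustrative inputs only; these are not observed or predicted values.
candidates = [
    Candidate("placeholder_A", 1e12, 0.1),
    Candidate("placeholder_B", 1e11, 4.0),
    Candidate("placeholder_C", 1e13, 0.0),
]

ranked = rank(candidates)
for c in ranked:
    print(f"{c.name}: {toy_detectability(c):.2f}")
```

In this toy example, the candidate with the highest column density ranks last because it has no permanent dipole moment, while a less abundant but strongly polar candidate ranks first. A realistic metric would also need line frequencies, partition functions, excitation conditions, and telescope sensitivity, none of which are modeled here.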
Bridging Machine Learning and Spectroscopy: A New Analysis for Astrochemical Target Selection, The Astrophysical Journal (open access)
Astrobiology, Astrochemistry




