PuSH - Publication Server of Helmholtz Zentrum München

Hartog, P. ; Westerlund, A.M.* ; Tetko, I.V. ; Genheden, S.*

Investigations into the efficiency of computer-aided synthesis planning.

J. Chem. Inf. Model. 65, 1771-1781 (2025)
Open Access Gold (Paid Option)
Creative Commons license
The efficiency of machine learning (ML) models is crucial to minimize inference times and reduce the carbon footprints of models deployed in production environments. Current models employed in retrosynthesis to generate a synthesis route from a target molecule to purchasable compounds are prohibitively slow. Such a model operates in a single-step fashion within a tree search algorithm, predicting reactant molecules given a product molecule as input. In this study, we investigate the ability of alternative transformer architectures, knowledge distillation (KD), and simple hyper-parameter optimization to decrease inference times of the Chemformer model. Initially, we assess closely related transformer architectures and conclude that these models under-performed when using KD. Additionally, we investigate the effects of feature-based and response-based KD together with hyper-parameters optimized based on inference sample time and model accuracy. We find that although reducing model size and improving single-step speed are important, multi-step search efficiency is more strongly influenced by the diversity and confidence of single-step models. Based on this work, further research should use KD in combination with other techniques, as multi-step speed continues to prevent proper integration of synthesis planning. However, in Monte Carlo-based (MC) multi-step retrosynthesis, other factors play a crucial role in balancing exploration and exploitation during the search process, often outweighing the direct impact of single-step model speed and carbon footprints.
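The abstract distinguishes response-based KD (the student matches the teacher's output distribution) from feature-based KD (the student matches the teacher's intermediate representations). The following is a minimal sketch of the two loss terms in plain Python, not the authors' implementation. The function names, the temperature of 2.0, and the mixing weight `alpha` are illustrative assumptions; the feature-based term assumes student and teacher features have already been projected to the same size.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def response_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Response-based KD: KL(teacher || student) on softened outputs.

    The T^2 factor is the usual rescaling that keeps the gradient
    magnitude comparable across temperatures (temperature is an
    assumed, illustrative value here).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


def feature_kd_loss(student_features, teacher_features):
    """Feature-based KD: mean squared error between hidden features."""
    if len(student_features) != len(teacher_features):
        raise ValueError("feature vectors must have the same length")
    diffs = [(s - t) ** 2 for s, t in zip(student_features, teacher_features)]
    return sum(diffs) / len(diffs)


def distillation_loss(task_loss, kd_loss, alpha=0.5):
    """Weighted sum of the distillation term and the ordinary task loss."""
    return alpha * kd_loss + (1.0 - alpha) * task_loss
```

In a training loop, `task_loss` would be the student's usual cross-entropy on the ground-truth reactants, and `kd_loss` either of the two terms above (or a sum of both); how the terms are weighted is a tuning choice rather than something this record specifies.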
Publication type Article: Journal article
Document type Scientific article
Corresponding author
Keywords Transformer
ISSN (print) / ISBN 0021-9576
e-ISSN 1520-5142
Source details Volume: 65, Issue: 4, Pages: 1771-1781
Publisher American Chemical Society (ACS)
Place of publication 1155 16th St, NW, Washington, DC 20036, USA
Non-patent literature Publications
Review status Peer reviewed
Funding European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie Actions Innovative Training Network European Industrial Doctorate grant agreement "Advanced machine learning for Innovative Drug Discovery (AIDD)"