PuSH - Publication Server of Helmholtz Zentrum München

Hartog, P. ; Westerlund, A.M.* ; Tetko, I.V. ; Genheden, S.*

Investigations into the efficiency of computer-aided synthesis planning.

J. Chem. Inf. Model. 65, 1771-1781 (2025)
Open Access Hybrid
Creative Commons license agreement
The efficiency of machine learning (ML) models is crucial to minimize inference times and reduce the carbon footprints of models deployed in production environments. Current models employed in retrosynthesis to generate a synthesis route from a target molecule to purchasable compounds are prohibitively slow. The model operates in a single-step fashion in a tree search algorithm by predicting reactant molecules given a product molecule as input. In this study, we investigate the ability of alternative transformer architectures, knowledge distillation (KD), and simple hyper-parameter optimization to decrease inference times of the Chemformer model. Initially, we assess the ability of closely related transformer architectures and conclude that these models under-performed when using KD. Additionally, we investigate the effects of feature-based and response-based KD together with hyper-parameters optimized based on inference sample time and model accuracy. We find that although reducing model size and improving single-step speed are important, our results indicate that multi-step search efficiency is more significantly influenced by the diversity and confidence of single-step models. Based on this work, further research should use KD in combination with other techniques, as multi-step speed continues to prevent proper integration of synthesis planning. However, in Monte Carlo-based (MC) multi-step retrosynthesis, other factors play a crucial role in balancing exploration and exploitation during the search process, often outweighing the direct impact of single-step model speed and carbon footprints.
Impact Factor 5.600
Scopus SNIP 0.000
Publication type Article: Journal article
Document type Scientific Article
Keywords Transformer
Language English
Publication Year 2025
HGF-reported in Year 2025
ISSN (print) / ISBN 0021-9576
e-ISSN 1520-5142
Source details Volume: 65, Issue: 4, Pages: 1771-1781
Publisher American Chemical Society (ACS)
Publishing Place 1155 16th St NW, Washington, DC 20036, USA
Reviewing status Peer reviewed
POF-Topic(s) 30203 - Molecular Targets and Therapies
Research field(s) Enabling and Novel Technologies
PSP Element(s) G-503093-001, G-503000-001
Grants European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie Actions Innovative Training Network European Industrial Doctorate grant agreement "Advanced machine learning for Innovative Drug Discovery (AIDD)"
Scopus ID 85216732449
PubMed ID 39889203
Date of entry 2025-03-26