PuSH - Publication Server of Helmholtz Zentrum München

Siebenmorgen, T. ; Cardoso Micu Menezes, F.M. ; Benassou, S.* ; Merdivan, E. ; Didi, K.* ; Mourao, A. ; Kitel, R.* ; Liò, P.* ; Kesselheim, S.* ; Piraud, M. ; Theis, F.J. ; Sattler, M. ; Popowicz, G.M.

MISATO: Machine learning dataset of protein-ligand complexes for structure-based drug discovery.

Nat. Comput. Sci. 4, 367–378 (2024)
Open Access Gold (Paid Option)
Creative Commons license
Large language models have greatly enhanced our ability to understand biology and chemistry, yet robust methods for structure-based drug discovery, quantum chemistry and structural biology are still sparse. Precise biomolecule-ligand interaction datasets are urgently needed for large language models. To address this, we present MISATO, a dataset that combines quantum mechanical properties of small molecules and associated molecular dynamics simulations of ~20,000 experimental protein-ligand complexes with extensive validation of experimental data. Starting from the existing experimental structures, semi-empirical quantum mechanics was used to systematically refine these structures. A large collection of molecular dynamics traces of protein-ligand complexes in explicit water is included, accumulating over 170 μs. We give examples of machine learning (ML) baseline models proving an improvement of accuracy by employing our data. An easy entry point for ML experts is provided to enable the next generation of drug discovery artificial intelligence models.
Publication type Article: Journal article
Document type Scientific Article
Keywords Scoring Function; Force-field; Binding; Affinity; Efficient; Models; Parameterization; Generation; Prediction; Accuracy
ISSN (print) / ISBN 2662-8457
e-ISSN 2662-8457
Source Volume: 4, Pages: 367–378
Publisher Springer
Publishing Place Campus, 4 Crinan St, London, N1 9XW, England
Non-patent literature Publications
Reviewing status Peer reviewed
Grants Helmholtz Association's Initiative and Networking Fund on the HAICORE@FZJ partition
BMBF
Bundesministerium für Bildung und Forschung (Federal Ministry of Education and Research)