PuSH - Publication Server of Helmholtz Zentrum München: Transformer-CNN: Swiss knife for QSAR modeling and interpretation.

Karpov, P. ; Godin, G.* ; Tetko, I.V.

Transformer-CNN: Swiss knife for QSAR modeling and interpretation.

J. Cheminformatics 12:17 (2020)

Open Access Gold

Abstract

We present SMILES embeddings derived from the internal encoder state of a Transformer [1] model trained to canonicalize SMILES as a Seq2Seq problem. Applying a CharNN [2] architecture to these embeddings yields higher-quality, interpretable QSAR/QSPR models on diverse benchmark datasets covering both regression and classification tasks. The proposed Transformer-CNN method uses SMILES augmentation for training and inference, so each prediction is based on an internal consensus. Because both the augmentation and the transfer learning are based on embeddings, the method provides good results for small datasets. We discuss the reasons for this effectiveness and outline future directions for the development of the method. The source code and the embeddings needed to train a QSAR model are available at https://github.com/bigchem/transformer-cnn. The repository also contains a standalone program for QSAR prediction that calculates individual atom contributions, thus interpreting the model's result. The OCHEM [3] environment (https://ochem.eu) hosts an online implementation of the proposed method.
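The internal consensus mentioned in the abstract can be illustrated with a minimal sketch: one molecule is written as several equivalent SMILES strings, a model scores each string, and the scores are averaged. Everything named here is an assumption made for illustration. The functions consensus_predict and toy_model are hypothetical and are not taken from the transformer-cnn repository, the equivalent SMILES strings are written by hand rather than generated by an augmentation routine, and toy_model is a stand-in for a trained network.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence, Tuple


def consensus_predict(
    smiles_variants: Sequence[str],
    predict_one: Callable[[str], float],
) -> Tuple[float, float]:
    """Average a model's predictions over equivalent SMILES strings.

    Returns the mean prediction and the population standard deviation,
    which gives a rough idea of how much the variants disagree.
    """
    if not smiles_variants:
        raise ValueError("need at least one SMILES string")
    predictions = [predict_one(s) for s in smiles_variants]
    return mean(predictions), pstdev(predictions)


def toy_model(smiles: str) -> float:
    # Hypothetical stand-in for a trained model: its output depends on
    # how the string is written, which is why averaging is useful.
    return smiles.count("O") + 0.1 * smiles.find("O")


# Three hand-written, equivalent SMILES strings for ethanol.
ethanol_variants = ["CCO", "OCC", "C(O)C"]

value, spread = consensus_predict(ethanol_variants, toy_model)
print(f"consensus = {value:.3f}, spread = {spread:.3f}")
```

In practice the variants would come from a cheminformatics toolkit that enumerates non-canonical SMILES, and predict_one would wrap the trained model; the averaging step itself stays the same.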

Publication type Article: Journal article

Document type Scientific Article

Keywords Augmentation ; Character-based Models ; Cheminformatics ; Classification ; Convolutional Neural Networks ; Embeddings ; QSAR ; Regression ; SMILES ; Transformer Model ; Aqueous Solubility ; Neural Networks

e-ISSN 1758-2946

Journal Journal of Cheminformatics

Source details Volume: 12, Issue: 1, Article Number: 17

Publisher BioMed Central

Publishing Place Campus, 4 Crinan St, London N1 9XW, England

Reviewing status Peer reviewed

Institute(s) Institute of Structural Biology (STB)