PuSH - Publication Server of Helmholtz Zentrum München

Inferring protein from transcript abundances using convolutional neural networks.

BioData Min. 18:18 (2025)
Open Access Gold
Creative Commons license
BACKGROUND: Although transcript abundance is often used as a proxy for protein abundance, it is an unreliable predictor. As proteins execute biological functions and their expression levels influence phenotypic outcomes, we developed a convolutional neural network (CNN) to predict protein abundances from mRNA abundances, protein sequence, and mRNA sequence in Homo sapiens (H. sapiens) and the reference plant Arabidopsis thaliana (A. thaliana).

RESULTS: After hyperparameter optimization and initial data exploration, we implemented distinct training modules for value-based and sequence-based data. By analyzing the learned weights, we revealed common and organism-specific sequence features that influence protein-to-mRNA ratios (PTRs), including known and putative sequence motifs. Adding condition-specific protein interaction information identified genes correlated with many PTRs but did not improve predictions, likely due to insufficient data. The integrated model predicted protein abundance on unseen genes with a coefficient of determination (r²) of 0.30 in H. sapiens and 0.32 in A. thaliana.

CONCLUSIONS: For H. sapiens, our model improves prediction performance by nearly 50% compared to previous sequence-based approaches, and for A. thaliana it represents the first model of its kind. The model's learned motifs recapitulate known regulatory elements, supporting its utility in systems-level and hypothesis-driven research approaches related to protein regulation.
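For readers unfamiliar with the two quantities the abstract reports, the following minimal Python sketch shows one common way to compute a protein-to-mRNA ratio and a coefficient of determination. It is an illustration only, not the authors' code: the function names `log_ptr` and `r_squared`, the log10 scaling of the ratio, and the toy values are all assumptions. The abstract also does not say whether r² is defined as 1 - SS_res/SS_tot (used here) or as a squared Pearson correlation.

```python
import math


def log_ptr(protein, mrna):
    """Log10 protein-to-mRNA ratio for one gene.

    Assumption: both abundances are positive and on comparable scales;
    the log10 transform is an illustrative choice, not taken from the paper.
    """
    return math.log10(protein / mrna)


def r_squared(observed, predicted):
    """Coefficient of determination, defined as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error between observation and prediction
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variance of the observations around their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy, made-up values for held-out genes (not data from the publication)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(observed, predicted)
```

Under this definition, a score of 0.30 would mean that the predictions account for about 30% of the variance in the observed protein abundances of genes the model did not see during training.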
Publication type: Article: Journal article
Document type: Scientific article
Corresponding author:
Keywords: Convolutional Neural Networks; Explainable AI; Protein-to-mRNA Ratio; Regression Analysis; Translational Regulation; RNA-binding Proteins; Messenger RNA; Translation; Interactome; Regions; Codon; Tool; Seq
ISSN (print) / ISBN: 1756-0381
e-ISSN: 1756-0381
Journal: BioData Mining
Citation details: Volume: 18, Issue: 1, Article number: 18
Publisher: BioMed Central
Place of publication: London
Non-patent literature: Publications
Review status: Peer reviewed
Institute(s): Institute of Network Biology (INET)
Funding: Horizon 2020 Framework Programme