Although aberrant gene expression is frequently implicated in disease, algorithms that predict which genes are aberrantly expressed in an individual are lacking. To address this need, we compile an aberrant expression prediction benchmark covering 8.2 million rare variants from 633 individuals across 49 tissues. While not geared toward aberrant expression, the deleteriousness score CADD and the loss-of-function predictor LOFTEE show mild predictive ability (1 to 1.6% average precision). Leveraging these and further variant annotations, we next train AbExp, a model that yields 12% average precision by combining, in a tissue-specific fashion, expression variability with variant effects on isoforms and on aberrant splicing. Integrating expression measurements from clinically accessible tissues leads to a further two-fold improvement. Furthermore, we show on UK Biobank blood traits that performing rare variant association testing with the continuous and tissue-specific AbExp variant scores, instead of the LOFTEE variant burden, increases gene discovery sensitivity and enables improved phenotype predictions.
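Because the results above are reported as average precision, a small sketch of that metric may help in reading them. The code below is not the authors' evaluation code. The function name `average_precision` and the toy labels and scores are invented for illustration, and ties between scores are not treated specially.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision of a ranking: mean of precision@k over the ranks k
    at which a positive (label == 1) is found, ranking by descending score."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank items from highest to lowest score (ties keep their input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / n_pos


# Toy data (hypothetical): 1 marks an aberrantly expressed gene.
labels = [0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.1, 0.9, 0.8, 0.2, 0.3, 0.05, 0.0, 0.4]

ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)  # rough reference level for a random ranking
print(f"average precision = {ap:.2f}, prevalence = {prevalence:.2f}")
```

When positives are rare, as aberrant expression events are, a random ranking is expected to score an average precision close to the fraction of positives, so even single-digit values can sit well above chance.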
Grants: National Health Service (NHS); NCI; Common Fund of the Office of the Director of the National Institutes of Health; Helmholtz Association - Free State of Bavaria's Hightech Agenda through the Institute of AI for Health; European Union's Horizon Europe research and innovation program; Deutsche Forschungsgemeinschaft (DFG, German Research Foundation); VALE; German Bundesministerium für Bildung und Forschung (BMBF) through the Model Exchange for Regulatory Genomics project; NHGRI; NHLBI; NIDA; Diabetes UK; Cancer Research UK; British Heart Foundation; Welsh Government; Northwest Regional Development Agency; Wellcome Trust medical charity; Medical Research Council; Department of Health; Scottish Government; Answer ALS Consortium; NINDS; NIMH; Bundesministerium für Bildung und Forschung (Federal Ministry of Education and Research)