PuSH - Publication Server of Helmholtz Zentrum München: Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries.

An, U.* ; Pazokitoroudi, A.* ; Alvarez, M.* ; Huang, L. ; Bacanu, S.A.* ; Schork, A.J.* ; Kendler, K.* ; Pajukanta, P.* ; Flint, J.* ; Zaitlen, N.* ; Cai, N. ; Dahl, A.* ; Sankararaman, S.*

Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries.

Nat. Genet. 55, 2269-2276 (2023)

	Open Access Hybrid

Abstract

Biobanks that collect deep phenotypic and genomic data across many individuals have emerged as a key resource in human genetics. However, phenotypes in biobanks are often missing across many individuals, limiting their utility. We propose AutoComplete, a deep learning-based imputation method to impute or ‘fill-in’ missing phenotypes in population-scale biobank datasets. When applied to collections of phenotypes measured across ~300,000 individuals from the UK Biobank, AutoComplete substantially improved imputation accuracy over existing methods. On three traits with notable amounts of missingness, we show that AutoComplete yields imputed phenotypes that are genetically similar to the originally observed phenotypes while increasing the effective sample size by about twofold on average. Further, genome-wide association analyses on the resulting imputed phenotypes led to a substantial increase in the number of associated loci. Our results demonstrate the utility of deep learning-based phenotype imputation to increase power for genetic discoveries in existing biobank datasets.

Publication type Article: Journal article

Document type Scientific Article

Keywords Genome-wide Association; LD Score Regression; DNA

ISSN (print) / ISBN 1061-4036

e-ISSN 1546-1718

Journal Nature Genetics

Citation details Volume: 55, Issue: 12, Pages: 2269-2276

Publisher Nature Publishing Group

Publishing Place New York, NY

Reviewing status Peer reviewed

Institute(s) Helmholtz Pioneer Campus (HPC)

Grants Lundbeckfonden (Lundbeck Foundation)
U.S. Department of Health & Human Services | National Institutes of Health (NIH)
NSF | Directorate for Biological Sciences (BIO)
NSF | Directorate for Computer & Information Science & Engineering | Division of Information and Intelligent Systems (Information & Intelligent Systems)
NSF | BIO | Division of Biological Infrastructure (DBI)
National Science Foundation (NSF)