PuSH - Publication Server of Helmholtz Zentrum München

Finner, H.* ; Strassburger, K.* ; Heid, I.M. ; Herder, C.* ; Rathmann, W.* ; Giani, G.* ; Dickhaus, T.* ; Lichtner, P. ; Meitinger, T. ; Wichmann, H.-E. ; Illig, T. ; Gieger, C.

How to link call rate and p-values for Hardy-Weinberg equilibrium as measures of genome-wide SNP data quality.

Stat. Med. 29, 2347-2358 (2010)
Green Open Access is possible as soon as the postprint has been submitted to the ZB.
We study the link between two quality measures of SNP (single nucleotide polymorphism) data in genome-wide association (GWA) studies, that is, per-SNP call rates (CR) and p-values for testing Hardy-Weinberg equilibrium (HWE). The aim is to improve these measures by applying methods based on realized randomized p-values, the false discovery rate and estimates for the proportion of false hypotheses. Exact non-randomized conditional p-values for testing HWE cannot be recommended for estimating the proportion of false hypotheses; their realized randomized counterparts should be used instead. P-values corresponding to the asymptotic unconditional chi-square test lead to reasonable estimates only if SNPs with low minor allele frequency are excluded. We provide an algorithm to compute the probability that SNPs violate HWE given the observed CR, which yields an improved measure of data quality. The proposed methods are applied to SNP data from the KORA (Cooperative Health Research in the Region of Augsburg, Southern Germany) 500 K project, a GWA study in a population-based sample genotyped by Affymetrix GeneChip 500 K arrays using the calling algorithm BRLMM 1.4.0. We show that all SNPs with CR = 100 per cent are in nearly perfect HWE, which suggests that the population meets the conditions required for HWE, at least for these SNPs. Moreover, we show that the proportion of SNPs not in HWE increases as CR decreases. We conclude that using a single threshold for judging HWE p-values without taking the CR into account is problematic. Instead, we recommend a stratified analysis with respect to CR.
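The three kinds of p-values named in the abstract, and an estimate of the proportion of false hypotheses, can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes the standard conditional distribution of the heterozygote count given the allele counts, one common construction of a realized randomized p-value (mass of strictly less likely outcomes plus a uniform fraction of the tied mass), and a Storey-type estimator with a hypothetical tuning parameter `lam`. All function names are illustrative.

```python
import math
import random


def hwe_het_distribution(n_aa, n_ab, n_bb):
    """Conditional distribution of the heterozygote count under HWE,
    given the sample size and the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab            # copies of allele A
    n_b = 2 * n - n_a                # copies of allele B
    log_const = (math.lgamma(n + 1) + math.lgamma(n_a + 1)
                 + math.lgamma(n_b + 1) - math.lgamma(2 * n + 1))
    probs = {}
    # The heterozygote count must have the same parity as n_a.
    for k in range(n_a % 2, min(n_a, n_b) + 1, 2):
        aa = (n_a - k) // 2
        bb = (n_b - k) // 2
        log_p = (log_const + k * math.log(2.0) - math.lgamma(aa + 1)
                 - math.lgamma(k + 1) - math.lgamma(bb + 1))
        probs[k] = math.exp(log_p)
    return probs


def hwe_exact_pvalues(n_aa, n_ab, n_bb, u=None):
    """Return (non-randomized, realized randomized) exact p-values.

    Assumed construction: outcomes are ordered by their conditional
    probability; u is a realization of a Uniform(0, 1) variable.
    """
    probs = hwe_het_distribution(n_aa, n_ab, n_bb)
    p_obs = probs[n_ab]
    tied = sum(p for p in probs.values()
               if math.isclose(p, p_obs, rel_tol=1e-9))
    smaller = sum(p for p in probs.values()
                  if p < p_obs and not math.isclose(p, p_obs, rel_tol=1e-9))
    if u is None:
        u = random.random()
    return min(1.0, smaller + tied), smaller + u * tied


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Asymptotic chi-square (1 df) p-value; needs both alleles observed."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    stat = sum((o - e) ** 2 / e
               for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


def estimate_false_proportion(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of false null hypotheses."""
    m = len(pvalues)
    pi0 = sum(1 for p in pvalues if p > lam) / ((1 - lam) * m)
    return 1.0 - min(1.0, pi0)


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, AB, BB) for a single SNP.
    print(hwe_exact_pvalues(30, 40, 30))
    print(hwe_chisq_pvalue(30, 40, 30))
```

To mimic the stratified analysis recommended in the abstract, one could group SNPs into call-rate bins and apply `estimate_false_proportion` to the randomized p-values within each bin. The bin boundaries would be a further assumption.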
Impact Factor
Scopus SNIP
Web of Science
Times Cited
Scopus
Cited By
Altmetric
1.220
1.220
7
11
Publication type Article: Journal article
Document type Scientific article
Keywords False discovery rate; Genome-wide association (GWA) studies; Genotyping errors; Multiple comparisons; Multiple hypotheses testing; Quality control; Randomized p-values
Language English
Publication year 2010
HGF reporting year 2010
ISSN (print) / ISBN 0277-6715
e-ISSN 1097-0258
Source Volume: 29, Issue: 22, Pages: 2347-2358
Publisher Wiley
Review status Peer reviewed
Institute(s) Institute of Epidemiology (EPI)
Institute of Human Genetics (IHG)
POF Topic(s) 30503 - Chronic Diseases of the Lung and Allergies
30501 - Systemic Analysis of Genetic and Environmental Factors that Impact Health
30202 - Environmental Health
Research field(s) Genetics and Epidemiology
PSP element(s) G-503900-001
G-500700-001
G-504090-001
PubMed ID 20641143
Scopus ID 77956928025
Date of entry 2010-10-01