Finner, H.* ; Strassburger, K.* ; Heid, I.M. ; Herder, C.* ; Rathmann, W.* ; Giani, G.* ; Dickhaus, T.* ; Lichtner, P. ; Meitinger, T. ; Wichmann, H.-E. ; Illig, T. ; Gieger, C.
     
    
        
How to link call rate and p-values for Hardy-Weinberg equilibrium as measures of genome-wide SNP data quality.
    
    
        
    
    
        
        Stat. Med. 29, 2347-2358 (2010)
    
    
 	
    
	
		
		
			Open Access Green as soon as Postprint is submitted to ZB.
		
     
    
      
      
	
	    We study the link between two quality measures for single nucleotide polymorphism (SNP) data in genome-wide association (GWA) studies: per-SNP call rates (CR) and p-values for testing Hardy-Weinberg equilibrium (HWE). The aim is to improve these measures by applying methods based on realized randomized p-values, the false discovery rate, and estimates of the proportion of false hypotheses. Exact non-randomized conditional p-values for testing HWE cannot be recommended for estimating the proportion of false hypotheses; their realized randomized counterparts should be used instead. P-values from the asymptotic unconditional chi-square test lead to reasonable estimates only if SNPs with low minor allele frequency are excluded. We provide an algorithm to compute the probability that a SNP violates HWE given its observed CR, which yields an improved measure of data quality. The proposed methods are applied to SNP data from the KORA (Cooperative Health Research in the Region of Augsburg, Southern Germany) 500 K project, a GWA study in a population-based sample genotyped with Affymetrix GeneChip 500 K arrays using the calling algorithm BRLMM 1.4.0. We show that the SNPs with CR = 100 per cent are almost perfectly in HWE, which suggests that the population meets the conditions required for HWE, at least for these SNPs. Moreover, we show that the proportion of SNPs not in HWE increases with decreasing CR. We conclude that using a single threshold for judging HWE p-values without taking the CR into account is problematic. Instead, we recommend a stratified analysis with respect to CR.
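The ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes the standard conditional distribution of the heterozygote count given the allele counts, one common definition of a realized randomized p-value (probability of strictly less likely outcomes plus U times the probability of equally likely outcomes, with U uniform on [0, 1]), and a Storey-type estimator with tuning parameter 0.5 for the proportion of false hypotheses. All function names and the call-rate strata are hypothetical choices made for illustration.

```python
import math
import random


def hwe_het_distribution(n_aa, n_ab, n_bb):
    """Distribution of the heterozygote count under HWE, given the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n - n_a              # copies of allele B
    probs = {}
    # Feasible heterozygote counts share the parity of n_a.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        log_p = (math.lgamma(n + 1) - math.lgamma(hom_a + 1)
                 - math.lgamma(het + 1) - math.lgamma(hom_b + 1)
                 + het * math.log(2.0)
                 + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
                 - math.lgamma(2 * n + 1))
        probs[het] = math.exp(log_p)
    total = sum(probs.values())    # renormalize to absorb rounding error
    return {het: p / total for het, p in probs.items()}


def exact_hwe_pvalues(n_aa, n_ab, n_bb, u=None, rel_tol=1e-9):
    """Return (non-randomized, realized randomized) exact conditional p-values."""
    dist = hwe_het_distribution(n_aa, n_ab, n_bb)
    p_obs = dist[n_ab]
    less = sum(p for p in dist.values() if p < p_obs * (1.0 - rel_tol))
    tie = sum(p for p in dist.values() if abs(p - p_obs) <= rel_tol * p_obs)
    if u is None:
        u = random.random()        # one realization of U ~ Uniform(0, 1)
    return less + tie, less + u * tie


def false_null_fraction(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of false hypotheses (assumed estimator)."""
    m = len(pvalues)
    pi0 = sum(p > lam for p in pvalues) / ((1.0 - lam) * m)
    return min(1.0, max(0.0, 1.0 - pi0))


def cr_stratum(call_rate):
    """Hypothetical call-rate strata; the cut-offs are illustrative only."""
    if call_rate >= 1.0:
        return "CR = 100%"
    if call_rate >= 0.99:
        return "99% <= CR < 100%"
    return "CR < 99%"


def false_fraction_by_stratum(snps, seed=0):
    """snps: iterable of (call_rate, n_aa, n_ab, n_bb) tuples."""
    rng = random.Random(seed)
    groups = {}
    for call_rate, n_aa, n_ab, n_bb in snps:
        _, p_rand = exact_hwe_pvalues(n_aa, n_ab, n_bb, u=rng.random())
        groups.setdefault(cr_stratum(call_rate), []).append(p_rand)
    return {label: false_null_fraction(ps) for label, ps in groups.items()}
```

Calling `false_fraction_by_stratum` on a list of per-SNP call rates and genotype counts returns one estimated proportion of HWE violations per call-rate stratum. Comparing these estimates across strata is the kind of stratified view the abstract recommends over a single HWE p-value threshold.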
	
	
	    
	
       
      
	
	    
		
	     
	    
	 
       
      
     
    
        Publication type
        Article: Journal article
    
 
    
        Document type
        Scientific Article
    
 
    
        Keywords
        False discovery rate; Genome-wide association (GWA) studies; Genotyping errors; Multiple comparisons; Multiple hypotheses testing; Quality control; Randomized p-values
    
 
    
        Language
        English
    
 
    
        Publication Year
        2010
    
 
    
        HGF-reported in Year
        2010
    
 
    
    
        ISSN (print) / ISBN
        0277-6715
    
 
    
        e-ISSN
        1097-0258
    
 
    
        Source details
        Volume: 29
        Issue: 22
        Pages: 2347-2358
	
    
 
    
        
            Publisher
            Wiley
        
 
        
        Reviewing status
        Peer reviewed
    
 
     
    
        POF-Topic(s)
        30503 - Chronic Diseases of the Lung and Allergies
        30501 - Systemic Analysis of Genetic and Environmental Factors that Impact Health
        30202 - Environmental Health
    
 
    
        Research field(s)
        Genetics and Epidemiology
    
 
    
        PSP Element(s)
        G-503900-001
        G-500700-001
        G-504090-001
    
 
    
        Date of record entry
        2010-10-01