PLS-optimal: A stepwise D-optimal design based on latent variables.
    
    
        
    
    
        
        J. Chem. Inf. Model. 52, 975-983 (2012)
    
    
    
      
      
	
	    Several applications, such as risk assessment within REACH or drug discovery, require reliable methods for the design of experiments and efficient testing strategies. Keeping the number of experiments as low as possible is important from both a financial and an ethical point of view, as exhaustive testing of compounds requires significant financial resources and animal lives. Given a large initial set of compounds, experimental design techniques can be used to select a representative subset for testing. Once measured, these compounds can be used to develop quantitative structure-activity relationship (QSAR) models that predict properties of the remaining compounds, which reduces the required resources and time. D-optimal design is frequently used to select an optimal set of compounds by analyzing data variance. We developed a new sequential approach that applies a D-optimal design to latent variables derived from a partial least squares (PLS) model instead of principal components. The stepwise procedure selects a new set of molecules to be measured after each measurement cycle. We show that this D-optimal selection generates models with significantly improved performance on four different data sets with end points relevant for REACH. Compared to models derived from selection on principal components, PLS models derived from selection on latent variables had a lower root-mean-square error and a higher Q² and R². This improvement is statistically significant, especially when only a small number of compounds is selected.
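The stepwise procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pls1_rotation` and `greedy_d_optimal`, the greedy (rather than exchange-based) search, the small ridge term, the synthetic data, and the choice to run the first round on centered raw descriptors are all assumptions made for illustration. The sketch fits a PLS1 model on the compounds measured so far, projects every candidate onto the resulting latent variables, and then adds the candidates that most increase the determinant of the information matrix.

```python
import numpy as np

def pls1_rotation(X, y, n_components):
    """NIPALS PLS1 on mean-centered data (hypothetical helper).
    Returns R such that latent scores are T = X_centered @ R."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, P = [], []
    for _ in range(n_components):
        w = Xr.T @ yr                    # weight: covariance direction with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                       # scores of the measured compounds
        tt = t @ t
        p = Xr.T @ t / tt                # X loadings
        Xr = Xr - np.outer(t, p)         # deflate X
        yr = yr - t * (yr @ t) / tt      # deflate y
        W.append(w)
        P.append(p)
    W = np.array(W).T
    P = np.array(P).T
    return W @ np.linalg.inv(P.T @ W)

def greedy_d_optimal(T, n_select, chosen=(), ridge=1e-6):
    """Greedily add rows of T that maximize det(T_S^T T_S + ridge * I).
    By the matrix determinant lemma, adding row t multiplies the determinant
    by 1 + t^T M^-1 t, so each step picks the row with the largest t^T M^-1 t."""
    chosen = list(chosen)
    available = np.ones(T.shape[0], dtype=bool)
    available[chosen] = False
    M = ridge * np.eye(T.shape[1])
    for i in chosen:
        M += np.outer(T[i], T[i])
    for _ in range(n_select):
        Minv = np.linalg.inv(M)
        gain = np.einsum("ij,jk,ik->i", T, Minv, T)
        gain[~available] = -np.inf       # never pick a compound twice
        best = int(np.argmax(gain))
        chosen.append(best)
        available[best] = False
        M += np.outer(T[best], T[best])
    return chosen

# Synthetic stand-in for a descriptor matrix and an end point (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=60)

# Round 0: no responses yet, so select on the centered descriptors.
measured = greedy_d_optimal(X - X.mean(axis=0), 10)

# Later rounds: refit PLS on measured compounds, select in latent space.
for _ in range(2):
    R = pls1_rotation(X[measured], y[measured], n_components=2)
    T_all = (X - X[measured].mean(axis=0)) @ R
    measured = greedy_d_optimal(T_all, 5, chosen=measured)
```

After two further rounds, `measured` holds the indices of 20 distinct compounds. A production design would more likely use an exchange algorithm and validated PLS code; the greedy search is used here only because it keeps the example short.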
	
	
	    
	
       
      
	
	    
		
	     
	    
	 
       
      
     
    
        Publication type
        Article: Journal article
    
 
    
        Document type
        Scientific Article
    
 
    
    
    
        Keywords
        TETRAHYMENA-PYRIFORMIS; REPRESENTATIVE SUBSET; APPLICABILITY DOMAIN; PRINCIPAL COMPONENTS; MULTIVARIATE DESIGN; COMPOUND SELECTION; QSAR; RECONSTRUCTION; PREDICTION; TOXICITY
    
 
    
    
 
    
        Publication Year
        2012
    
 
    
    
 
    
        HGF-reported in Year
        2012
    
 
    
    
        ISSN (print) / ISBN
        0021-9576
    
 
    
        e-ISSN
        1520-5142
    
 
    
    
 
     
	
    
        Source details
        
	    Volume: 52,  
	    Issue: 4,  
	    Pages: 975-983 
	
    
 
    
        
        
 
        
            Publisher
            American Chemical Society (ACS)
        
 
        
        
 
    
        Reviewing status
        Peer reviewed
    
 
     
    
        POF-Topic(s)
        30203 - Molecular Targets and Therapies
    
 
    
        Research field(s)
        Enabling and Novel Technologies
    
 
    
        PSP Element(s)
        G-503000-001
    
 
    
    
 	
    
    
    
    
    
        Date of record entry
        2012-07-23