Background: To investigate associations between air pollution and adverse health effects, consistent fine spatial air pollution surfaces are needed across large areas to provide cohorts with comparable exposures. The aim of this paper is to develop and evaluate fine spatial scale land use regression models for four major health-relevant air pollutants (PM2.5, NO2, BC, O3) across Europe.

Methods: We developed West-European land use regression (LUR) models for 2010 estimating annual mean PM2.5, NO2, BC and O3 concentrations (including cold and warm season estimates for O3). The models were based on AirBase routine monitoring data (PM2.5, NO2 and O3) and ESCAPE monitoring data (BC), and incorporated satellite observations, dispersion model estimates, land use and traffic data. Kriging was performed on the residual spatial variation from the LUR models and added to the exposure estimates. One model was developed using all sites (100%). Robustness of the models was evaluated by five-fold hold-out validation and, for PM2.5 and NO2, additionally by independent comparison with ESCAPE measurements. To evaluate the stability of each model's spatial structure over time, separate models were developed for different years (NO2 and O3: 2000 and 2005; PM2.5: 2013).

Results: The PM2.5, BC, NO2, O3 annual, O3 warm season and O3 cold season models explained respectively 72%, 54%, 59%, 65%, 69% and 83% of spatial variation in the measured concentrations. Kriging proved an efficient technique to explain part of the residual spatial variation for the pollutants with a strong regional component, accounting for 10%, 24% and 16% of the R² in the PM2.5, O3 warm season and O3 cold season models, respectively. Explained variance at fully independent sites vs the internal hold-out validation was slightly lower for PM2.5 (65% vs 66%) and lower for NO2 (49% vs 57%). Predictions from the 2010 model correlated highly with models developed in other years at the overall European scale.

Conclusions: We developed robust PM2.5, NO2, O3 and BC hybrid LUR models. At the West-European scale, models were robust in time, becoming less robust at smaller spatial scales. Models were applied to 100 x 100 m surfaces across Western Europe to allow for exposure assignment for 35 million participants from 18 European cohorts participating in the ELAPSE study.
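
The general workflow described in the Methods (a regression on predictors, kriging of the regression residuals, and five-fold hold-out validation) can be illustrated with a minimal Python sketch. It is not the study's implementation. The function names (`fit_ols`, `predict_ols`, `krige_residuals`, `holdout_r2`), the exponential covariance with fixed `sill`, `range_` and `nugget`, ordinary kriging as the kriging variant, and the definition of explained variance as 1 - SS_res/SS_tot are all assumptions made for illustration only.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X (X has shape (n, p))."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict_ols(beta, X):
    """Regression prediction for new predictor rows X."""
    return np.column_stack([np.ones(len(X)), X]) @ beta


def krige_residuals(xy_train, resid, xy_new, sill, range_, nugget):
    """Ordinary kriging of residuals with an assumed exponential covariance.

    The covariance parameters are taken as given; a real analysis would
    estimate them from an empirical variogram.
    """
    def cov(a, b):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
        return sill * np.exp(-d / range_)

    n = len(xy_train)
    # Kriging system: covariance block plus the unbiasedness constraint.
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = cov(xy_train, xy_train) + nugget * np.eye(n)
    K[:n, n] = 1.0
    K[n, :n] = 1.0
    rhs = np.vstack([cov(xy_train, xy_new), np.ones((1, len(xy_new)))])
    weights = np.linalg.solve(K, rhs)[:n]  # drop the Lagrange multiplier row
    return weights.T @ resid


def holdout_r2(X, xy, y, k=5, seed=0, sill=1.0, range_=1.0, nugget=0.0):
    """k-fold hold-out R² of regression plus kriged residuals."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        beta = fit_ols(X[train], y[train])
        resid = y[train] - predict_ols(beta, X[train])
        # Prediction at held-out sites: regression part + kriged residual.
        pred[test] = predict_ols(beta, X[test]) + krige_residuals(
            xy[train], resid, xy[test], sill, range_, nugget
        )
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Here `X` would hold the predictor values at monitoring sites, `xy` the site coordinates and `y` the measured concentrations. Each held-out site is predicted only from models fitted to the remaining sites, so the returned value reflects out-of-sample performance rather than model fit.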