TY - JOUR AB - Single-cell RNA-seq data from clinical samples often suffer from batch effects, but data sharing is limited due to genomic privacy concerns. We present FedscGen, a privacy-preserving, communication-efficient federated method built upon the scGen model, enhanced with secure multiparty computation. FedscGen supports federated training and batch effect correction workflows, including the integration of new studies. We benchmark FedscGen across diverse datasets, showing competitive performance, matching scGen on key metrics such as NMI, GC, ILF1, ASW_C, kBET, and EBM on the Human Pancreas dataset. Published as a FeatureCloud app, FedscGen enables secure, real-world collaboration for scRNA-seq batch effect correction. AU - Bakhtiari, M.* AU - Bonn, S.* AU - Theis, F.J. AU - Zolotareva, O.* AU - Baumbach, J.* C1 - 75219 C2 - 57856 CY - Campus, 4 Crinan St, London N1 9xw, England TI - FedscGen: Privacy-preserving federated batch effect correction of single-cell RNA sequencing data. JO - Genome Biol. VL - 26 IS - 1 PB - Bmc PY - 2025 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Field inoculation of crops with beneficial microbes is a promising sustainable strategy to enhance plant fitness and nutrient acquisition. However, effectiveness can vary due to environmental factors, microbial competition, and methodological challenges, while their precise modes of action remain uncertain. This underscores the need for further research to optimize inoculation strategies for consistent agricultural benefits. RESULTS: Using a comprehensive, multidisciplinary approach, we investigate the effects of a consortium of beneficial microbes (BMc) (Pseudomonas sp. RU47, Bacillus atrophaeus ABi03, Trichoderma harzianum OMG16) on maize (Zea mays cv. Benedictio) through an inoculation experiment conducted within a long-term field trial across intensive and extensive farming practices. 
Additionally, an unexpected early drought stress emerged as a climatic variable, offering further insight into the effectiveness of the microbial consortium. Our findings demonstrate that BMc root inoculation primarily enhanced plant growth and fitness, particularly by increasing iron uptake, which is crucial for drought adaptation. Inoculated maize plants show improved shoot growth and fitness compared to non-inoculated plants, regardless of farming practices. Specifically, BMc modulate plant hormonal balance, enhance the detoxification of reactive oxygen species, and increase root exudation of iron-chelating metabolites. Amplicon sequencing reveals shifts in rhizosphere bacterial and fungal communities mediated by the consortium. Metagenomic shotgun sequencing indicates enrichment of genes related to antimicrobial lipopeptides and siderophores. CONCLUSIONS: Our findings highlight the multifaceted benefits of BMc inoculation on plant fitness, significantly influencing metabolism, stress responses, and the rhizosphere microbiome. These improvements are crucial for advancing sustainable agricultural practices by enhancing plant resilience and productivity. AU - Francioli, D.* AU - Kampouris, I.D.* AU - Kuhl-Nagel, T.* AU - Babin, D.* AU - Sommermann, L.* AU - Behr, J.H.* AU - Chowdhury, S.P. AU - Zrenner, R.* AU - Moradtalab, N.* AU - Schloter, M. AU - Geistlinger, J.* AU - Ludewig, U.* AU - Neumann, G.* AU - Smalla, K.* AU - Grosch, R.* C1 - 74854 C2 - 57606 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Microbial inoculants modulate the rhizosphere microbiome, alleviate plant stress responses, and enhance maize growth at field scale. JO - Genome Biol. VL - 26 IS - 1 PB - Bmc PY - 2025 SN - 1474-760X ER - TY - JOUR AB - With rapid advancements in single-cell DNA sequencing (scDNA-seq), various computational methods have been developed to study evolution and call variants on single-cell level. 
However, modeling deletions remains challenging because they affect total coverage in ways that are difficult to distinguish from technical artifacts. We present DelSIEVE, a statistical method that infers cell phylogeny and single-nucleotide variants, accounting for deletions, from scDNA-seq data. DelSIEVE distinguishes deletions from mutations and artifacts, detecting more evolutionary events than previous methods. Simulations show high performance, and application to cancer samples reveals varying amounts of deletions and double mutants in different tumors. AU - Kang, S.* AU - Borgsmüller, N.* AU - Valecha, M.* AU - Markowska, M.* AU - Kuipers, J.* AU - Beerenwinkel, N.* AU - Posada, D.* AU - Szczurek, E. C1 - 75399 C2 - 57957 CY - Campus, 4 Crinan St, London N1 9xw, England TI - DelSIEVE: Cell phylogeny modeling of single nucleotide variants and deletions from single-cell DNA sequencing data. JO - Genome Biol. VL - 26 IS - 1 PB - Bmc PY - 2025 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Direct conversion of reactive glial cells to neurons is a promising avenue for neuronal replacement therapies after brain injury or neurodegeneration. The overexpression of neurogenic fate determinants in glial cells results in conversion to neurons. For repair purposes, the conversion should ideally be induced in the pathology-induced neuroinflammatory environment. However, very little is known regarding the influence of the injury-induced neuroinflammatory environment and released growth factors on the direct conversion process. RESULTS: We establish a new in vitro culture system of postnatal astrocytes without epidermal growth factor that reflects the direct conversion rate in the injured, neuroinflammatory environment in vivo. We demonstrate that the growth factor combination corresponding to the injured environment defines the ability of glia to be directly converted to neurons. 
Using this culture system, we show that chromatin structural protein high mobility group box 2 (HMGB2) regulates the direct conversion rate downstream of the growth factor combination. We further demonstrate that Hmgb2 cooperates with neurogenic fate determinants, such as Neurog2, in opening chromatin at the loci of genes regulating neuronal maturation and synapse formation. Consequently, early chromatin rearrangements occur during direct fate conversion and are necessary for full fate conversion. CONCLUSIONS: Our data demonstrate novel growth factor-controlled regulation of gene expression during direct fate conversion. This regulation is crucial for proper maturation of induced neurons and could be targeted to improve the repair process. AU - Maddhesiya, P. AU - Lepko, T. AU - Steiner‑Mezzardi, A. AU - Schneider, J. AU - Schwarz, V. AU - Merl-Pham, J. AU - Berger, F. AU - Hauck, S.M. AU - Ronfani, L.* AU - Bianchi, M.E.* AU - Simon, T.* AU - Krontira, A. AU - Masserdotti, G. AU - Götz, M. AU - Ninkovic, J. C1 - 74146 C2 - 57351 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Hmgb2 improves astrocyte to neuron conversion by increasing the chromatin accessibility of genes associated with neuronal maturation in a proneuronal factor-dependent manner. JO - Genome Biol. VL - 26 IS - 1 PB - Bmc PY - 2025 SN - 1474-760X ER - TY - JOUR AB - Following publication of the original article [1], the authors identified an error in the author name of Zhanna Balkhiyarova. The incorrect author name is: Zhanna Balkiyarova The correct author name is: Zhanna Balkhiyarova The author group has been updated above and the original article [1] has been corrected. 
AU - Bradfield, J.P.* AU - Kember, R.L.* AU - Ulrich, A.* AU - Balkhiyarova, Z.* AU - Alyass, A.* AU - Aris, I.M.* AU - Bell, J.A.* AU - Broadaway, K.A.* AU - Chen, Z.* AU - Chai, J.F.* AU - Davies, N.M.* AU - Fernandez-Orth, D.* AU - Bustamante, M.* AU - Fore, R.* AU - Ganguli, A.* AU - Heiskala, A.* AU - Hottenga, J.J.* AU - Iñiguez, C.* AU - Kobes, S.* AU - Leinonen, J.T.* AU - Lowry, E.* AU - Lyytikäinen, L.P.* AU - Mahajan, A.* AU - Pitkänen, N.* AU - Schnurr, T.M.* AU - Have, C.T.* AU - Strachan, D.P.* AU - Thiering, E. AU - Vogelezang, S.* AU - Wade, K.H.* AU - Wang, C.A.* AU - Wong, A.* AU - Holm, L.A.* AU - Chesi, A.* AU - Choong, C.* AU - Cruz, M.* AU - Elliott, P.* AU - Franks, S.* AU - Frithioff-Bøjsøe, C.* AU - Gauderman, W.J.* AU - Glessner, J.T.* AU - Gilsanz, V.* AU - Griesman, K.* AU - Hanson, R.L.* AU - Kaakinen, M.* AU - Kalkwarf, H.* AU - Kelly, A.* AU - Kindler, J.* AU - Kähönen, M.* AU - Lanca, C.* AU - Lappe, J.* AU - Lee, N.R.* AU - McCormack, S.E.* AU - Mentch, F.D.* AU - Mitchell, J.A.* AU - Mononen, N.* AU - Niinikoski, H.* AU - Oken, E.* AU - Pahkala, K.* AU - Sim, X.* AU - Teo, Y.Y.* AU - Baier, L.J.* AU - van Beijsterveldt, T.* AU - Adair, L.S.* AU - Boomsma, D.I.* AU - de Geus, E.* AU - Guxens, M.* AU - Eriksson, J.G.* AU - Felix, J.F.* AU - Gilliland, F.D.* AU - Hansen, T.* AU - Hardy, R.* AU - Hivert, M.F.* AU - Holm, J.C.* AU - Jaddoe, V.W.V.* AU - Järvelin, M.R.* AU - Lehtimäki, T.* AU - Mackey, D.A.* AU - Meyre, D.* AU - Mohlke, K.L.* AU - Mykkänen, J.* AU - Oberfield, S.* AU - Pennell, C.E.* AU - Perry, J.R.B.* AU - Raitakari, O.* AU - Rivadeneira, F.* AU - Saw, S.M.* AU - Sebert, S.* AU - Shepherd, J.A.* AU - Standl, M. 
AU - Sørensen, T.I.A.* AU - Timpson, N.J.* AU - Torrent, M.* AU - Willemsen, G.* AU - Hyppönen, E.* AU - Power, C.* AU - McCarthy, M.I.* AU - Freathy, R.M.* AU - Widén, E.* AU - Hakonarson, H.* AU - Prokopenko, I.* AU - Voight, B.F.* AU - Zemel, B.S.* AU - Grant, S.F.A.* AU - Cousminer, D.L.* C1 - 70726 C2 - 55726 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Author Correction: Trans-ancestral genome-wide association study of longitudinal pubertal height growth and shared heritability with adult health outcomes. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Pubertal growth patterns correlate with future health outcomes. However, the genetic mechanisms mediating growth trajectories remain largely unknown. Here, we modeled longitudinal height growth with Super-Imposition by Translation And Rotation (SITAR) growth curve analysis on ~ 56,000 trans-ancestry samples with repeated height measurements from age 5 years to adulthood. We performed genetic analysis on six phenotypes representing the magnitude, timing, and intensity of the pubertal growth spurt. To investigate the lifelong impact of genetic variants associated with pubertal growth trajectories, we performed genetic correlation analyses and phenome-wide association studies in the Penn Medicine BioBank and the UK Biobank. RESULTS: Large-scale growth modeling enables an unprecedented view of adolescent growth across contemporary and 20th-century pediatric cohorts. We identify 26 genome-wide significant loci and leverage trans-ancestry data to perform fine-mapping. Our data reveals genetic relationships between pediatric height growth and health across the life course, with different growth trajectories correlated with different outcomes. 
For instance, a faster tempo of pubertal growth correlates with higher bone mineral density, HOMA-IR, fasting insulin, type 2 diabetes, and lung cancer, whereas being taller at early puberty, taller across puberty, and having quicker pubertal growth were associated with higher risk for atrial fibrillation. CONCLUSION: We report novel genetic associations with the tempo of pubertal growth and find that genetic determinants of growth are correlated with reproductive, glycemic, respiratory, and cardiac traits in adulthood. These results aid in identifying specific growth trajectories impacting lifelong health and show that there may not be a single "optimal" pubertal growth pattern. AU - Bradfield, J.P.* AU - Kember, R.L.* AU - Ulrich, A.* AU - Balkiyarova, Z.* AU - Alyass, A.* AU - Aris, I.M.* AU - Bell, J.A.* AU - Broadaway, K.A.* AU - Chen, Z.* AU - Chai, J.F.* AU - Davies, N.M.* AU - Fernandez-Orth, D.* AU - Bustamante, M.* AU - Fore, R.* AU - Ganguli, A.* AU - Heiskala, A.* AU - Hottenga, J.J.* AU - Iñiguez, C.* AU - Kobes, S.* AU - Leinonen, J.T.* AU - Lowry, E.* AU - Lyytikäinen, L.P.* AU - Mahajan, A.* AU - Pitkänen, N.* AU - Schnurr, T.M.* AU - Have, C.T.* AU - Strachan, D.P.* AU - Thiering, E. 
AU - Vogelezang, S.* AU - Wade, K.H.* AU - Wang, C.A.* AU - Wong, A.* AU - Holm, L.A.* AU - Chesi, A.* AU - Choong, C.* AU - Cruz, M.* AU - Elliott, P.* AU - Franks, S.* AU - Frithioff-Bøjsøe, C.* AU - Gauderman, W.J.* AU - Glessner, J.T.* AU - Gilsanz, V.* AU - Griesman, K.* AU - Hanson, R.L.* AU - Kaakinen, M.* AU - Kalkwarf, H.* AU - Kelly, A.* AU - Kindler, J.* AU - Kähönen, M.* AU - Lanca, C.* AU - Lappe, J.* AU - Lee, N.R.* AU - McCormack, S.E.* AU - Mentch, F.D.* AU - Mitchell, J.A.* AU - Mononen, N.* AU - Niinikoski, H.* AU - Oken, E.* AU - Pahkala, K.* AU - Sim, X.* AU - Teo, Y.Y.* AU - Baier, L.J.* AU - van Beijsterveldt, T.* AU - Adair, L.S.* AU - Boomsma, D.I.* AU - de Geus, E.* AU - Guxens, M.* AU - Eriksson, J.G.* AU - Felix, J.F.* AU - Gilliland, F.D.* AU - Hansen, T.* AU - Hardy, R.* AU - Hivert, M.F.* AU - Holm, J.C.* AU - Jaddoe, V.W.V.* AU - Järvelin, M.R.* AU - Lehtimäki, T.* AU - Mackey, D.A.* AU - Meyre, D.* AU - Mohlke, K.L.* AU - Mykkänen, J.* AU - Oberfield, S.* AU - Pennell, C.E.* AU - Perry, J.R.B.* AU - Raitakari, O.* AU - Rivadeneira, F.* AU - Saw, S.M.* AU - Sebert, S.* AU - Shepherd, J.A.* AU - Standl, M. AU - Sørensen, T.I.A.* AU - Timpson, N.J.* AU - Torrent, M.* AU - Willemsen, G.* AU - Hyppönen, E.* AU - Power, C.* AU - McCarthy, M.I.* AU - Freathy, R.M.* AU - Widén, E.* AU - Hakonarson, H.* AU - Prokopenko, I.* AU - Voight, B.F.* AU - Zemel, B.S.* AU - Grant, S.F.A.* AU - Cousminer, D.L.* C1 - 69827 C2 - 55209 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Trans-ancestral genome-wide association study of longitudinal pubertal height growth and shared heritability with adult health outcomes. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - Single-cell multiplexing techniques (cell hashing and genetic multiplexing) combine multiple samples, optimizing sample processing and reducing costs. 
Cell hashing conjugates antibody-tags or chemical-oligonucleotides to cell membranes, while genetic multiplexing allows mixing of genetically diverse samples and relies on aggregation of RNA reads at known genomic coordinates. We develop hadge (hashing deconvolution combined with genotype information), a Nextflow pipeline that combines 12 methods to perform both hashing- and genotype-based deconvolution. We propose a joint deconvolution strategy combining best-performing methods and demonstrate how this approach leads to the recovery of previously discarded cells in a nuclei hashing of fresh-frozen brain tissue. AU - Curion, F. AU - Wu, X. AU - Heumos, L. AU - Gonzales André, M.M. AU - Halle, L. AU - Ozols, M.* AU - Grant-Peters, M.* AU - Rich-Griffin, C.* AU - Yeung, H.Y.* AU - Dendrou, C.A.* AU - Schiller, H. AU - Theis, F.J. C1 - 70553 C2 - 55867 CY - Campus, 4 Crinan St, London N1 9xw, England TI - hadge: A comprehensive pipeline for donor deconvolution in single-cell studies. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - Single-cell multiomic analysis of the epigenome, transcriptome, and proteome allows for comprehensive characterization of the molecular circuitry that underpins cell identity and state. However, the holistic interpretation of such datasets presents a challenge given a paucity of approaches for systematic, joint evaluation of different modalities. Here, we present Panpipes, a set of computational workflows designed to automate multimodal single-cell and spatial transcriptomic analyses by incorporating widely-used Python-based tools to perform quality control, preprocessing, integration, clustering, and reference mapping at scale. Panpipes allows reliable and customizable analysis and evaluation of individual and integrated modalities, thereby empowering decision-making before downstream investigations. AU - Curion, F. AU - Rich-Griffin, C.* AU - Agarwal, D.* AU - Ouologuem, S. 
AU - Rue-Albrecht, K.* AU - May, L. AU - Garcia, G.E.L.* AU - Heumos, L. AU - Thomas, T.* AU - Lason, W.* AU - Sims, D.* AU - Theis, F.J. AU - Dendrou, C.A.* C1 - 71069 C2 - 55933 TI - Panpipes: A pipeline for multiomic single-cell and spatial transcriptomic data analysis. JO - Genome Biol. VL - 25 IS - 1 PY - 2024 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: The fatal diffuse midline gliomas (DMG) are characterized by an undruggable H3K27M mutation in H3.1 or H3.3. K27M impairs normal development by stalling differentiation. The identification of targetable pathways remains very poorly explored. Toward this goal, we undertake a multi-omics approach to evaluate replication timing profiles, transcriptomics, and cell cycle features in DMG cells from both H3.1K27M and H3.3K27M subgroups and perform a comparative, integrative data analysis with healthy brain tissue. RESULTS: DMG cells present differential replication timing in each subgroup, which, in turn, correlates with significant differential gene expression. Differentially expressed genes in S phase are involved in various pathways related to DNA replication. We detect increased expression of DNA replication genes earlier in the cell cycle in DMG cell lines compared to normal brain cells. Furthermore, the distance between origins of replication in DMG cells is smaller than in normal brain cells and their fork speed is slower, a read-out of replication stress. Consistent with these findings, DMG tumors present high replication stress signatures in comparison to normal brain cells. Finally, DMG cells are specifically sensitive to replication stress therapy. CONCLUSIONS: This whole genome multi-omics approach provides insights into the cell cycle regulation of DMG via the H3K27M mutations and establishes a pharmacologic vulnerability in DNA replication, which resolves a potentially novel therapeutic strategy for this non-curable disease. AU - Hains, A.E.* AU - Chetal, K.* AU - Nakatani, T. 
AU - Marques, J.G.* AU - Ettinger, A. AU - Junior, C.A.O.B.* AU - Gonzalez-Sandoval, A.* AU - Pillai, R.* AU - Filbin, M.G.* AU - Torres-Padilla, M.E. AU - Sadreyev, R.I.* AU - Van Rechem, C.* C1 - 72876 C2 - 56766 TI - Multi-omics approaches reveal that diffuse midline gliomas present altered DNA replication and are susceptible to replication stress therapy. JO - Genome Biol. VL - 25 IS - 1 PY - 2024 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes. 
AU - Karollus, A.* AU - Hingerl, J.* AU - Gankin, D.* AU - Grosshauser, M.* AU - Klemon, K.* AU - Gagneur, J. C1 - 70355 C2 - 55518 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Species-aware DNA language models capture regulatory elements and their evolution. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - Simultaneous profiling of single-cell gene expression and lineage history holds enormous potential for studying cellular decision-making. Recent computational approaches combine both modalities into cellular trajectories; however, they cannot make use of all available lineage information in destructive time-series experiments. Here, we present moslin, a Gromov-Wasserstein-based model to couple cellular profiles across time points based on lineage and gene expression information. We validate our approach in simulations and demonstrate on Caenorhabditis elegans embryonic development how moslin predicts fate probabilities and putative decision driver genes. Finally, we use moslin to delineate lineage relationships among transiently activated fibroblast states during zebrafish heart regeneration. AU - Lange, M. AU - Piran, Z.* AU - Klein, M.* AU - Spanjaard, B.* AU - Klein, D. AU - Junker, J.P.* AU - Theis, F.J. AU - Nitzan, M.* C1 - 72114 C2 - 56507 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Mapping lineage-traced cells across time points with moslin. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - Background: The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. 
Results: Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. Conclusions: Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead. AU - Critical Assessment of Genome Interpretation Consortium (Müller, N.S.) AU - Critical Assessment of Genome Interpretation Consortium (Eraslan, G.) C1 - 70324 C2 - 55515 CY - Campus, 4 Crinan St, London N1 9xw, England TI - CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - Tumours exhibit high genotypic and transcriptional heterogeneity. Both affect cancer progression and treatment, but have been predominantly studied separately in follicular lymphoma. To comprehensively investigate the evolution and genotype-to-phenotype maps in follicular lymphoma, we introduce CaClust, a probabilistic graphical model integrating deep whole exome, single-cell RNA and B-cell receptor sequencing data to infer clone genotypes, cell-to-clone mapping, and single-cell genotyping. CaClust outperforms a state-of-the-art model on simulated and patient data. In-depth analyses of single cells from four samples showcase effects of driver mutations, follicular lymphoma evolution, possible therapeutic targets, and single-cell genotyping that agrees with an independent targeted resequencing experiment. 
AU - Oksza-Orzechowski, K.* AU - Quinten, E.* AU - Shafighi, S.* AU - Kiełbasa, S.M.* AU - van Kessel, H.W.* AU - de Groen, R.A.L.* AU - Vermaat, J.S.P.* AU - Sepúlveda Yáñez, J.H.* AU - Navarrete, M.A.* AU - Veelken, H.* AU - van Bergen, C.A.M.* AU - Szczurek, E. C1 - 72242 C2 - 56519 CY - Campus, 4 Crinan St, London N1 9xw, England TI - CaClust: Linking genotype to transcriptional heterogeneity of follicular lymphoma using BCR and exomic variants. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - Many datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While biospecimen and experimental information is often captured, detailed metadata standards related to data matrices and analysis workflows are currently lacking. To address this, we develop the matrix and analysis metadata standards (MAMS) to serve as a resource for data centers, repositories, and tool developers. We define metadata fields for matrices and parameters commonly utilized in analytical workflows and developed the rmams package to extract MAMS from single-cell objects. Overall, MAMS promotes the harmonization, integration, and reproducibility of single-cell data across platforms. AU - Sarfraz, I.* AU - Wang, Y.* AU - Shastry, A.* AU - Teh, W.K.* AU - Sokolov, A.* AU - Herb, B.R.* AU - Creasy, H.H.* AU - Virshup, I. AU - Dries, R.* AU - Degatano, K.* AU - Mahurkar, A.* AU - Schnell, D.J.* AU - Madrigal, P.* AU - Hilton, J.* AU - Gehlenborg, N.* AU - Tickle, T.* AU - Campbell, J.D.* C1 - 71397 C2 - 56106 TI - MAMS: Matrix and analysis metadata standards to facilitate harmonization and reproducibility of single-cell data. JO - Genome Biol. VL - 25 IS - 1 PY - 2024 SN - 1474-760X ER - TY - JOUR AB - The identification of gene regulatory networks (GRNs) is crucial for understanding cellular differentiation. 
Single-cell RNA sequencing data encode gene-level covariations at high resolution, yet data sparsity and high dimensionality hamper accurate and scalable GRN reconstruction. To overcome these challenges, we introduce NetID leveraging homogenous metacells while avoiding spurious gene-gene correlations. Benchmarking demonstrates superior performance of NetID compared to imputation-based methods. By incorporating cell fate probability information, NetID facilitates the prediction of lineage-specific GRNs and recovers known network motifs governing bone marrow hematopoiesis, making it a powerful toolkit for deciphering gene regulatory control of cellular differentiation from large-scale single-cell transcriptome data. AU - Wang, W. AU - Wang, Y.* AU - Lyu, R.* AU - Grün, D.* C1 - 72058 C2 - 56517 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Scalable identification of lineage-specific gene regulatory networks from metacells with NetID. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - CRISPR interference (CRISPRi) is the leading technique to silence gene expression in bacteria; however, design rules remain poorly defined. We develop a best-in-class prediction algorithm for guide silencing efficiency by systematically investigating factors influencing guide depletion in genome-wide essentiality screens, with the surprising discovery that gene-specific features substantially impact prediction. We develop a mixed-effect random forest regression model that provides better estimates of guide efficiency. We further apply methods from explainable AI to extract interpretable design rules from the model. This study provides a blueprint for predictive models for CRISPR technologies where only indirect measurements of guide activity are available. AU - Yu, Y.* AU - Gawlitt, S.* AU - Barros De Andrade E Sousa, L. AU - Merdivan, E. AU - Piraud, M. 
AU - Beisel, C.L.* AU - Barquist, L.* C1 - 69733 C2 - 55198 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Improved prediction of bacterial CRISPRi guide efficiency from depletion screens through mixed-effect machine learning and data integration. JO - Genome Biol. VL - 25 IS - 1 PB - Bmc PY - 2024 SN - 1474-760X ER - TY - JOUR AB - We present RBPNet, a novel deep learning method, which predicts CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs and enables variant-impact scoring via in silico mutagenesis. Together, RBPNet improves imputation of protein-RNA interactions, as well as mechanistic interpretation of predictions. AU - Horlacher, M. AU - Wagner, N.* AU - Moyon, L. AU - Kuret, K.* AU - Goedert, N. AU - Salvatore, M.* AU - Ule, J.* AU - Gagneur, J. AU - Winther, O.* AU - Marsico, A. C1 - 68033 C2 - 54511 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Towards in silico CLIP-seq: predicting protein-RNA interaction via sequence-to-signal learning. JO - Genome Biol. VL - 24 IS - 1 PB - Bmc PY - 2023 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: The largest sequence-based models of transcription control to date are obtained by predicting genome-wide gene regulatory assays across the human genome. This setting is fundamentally correlative, as those models are exposed during training solely to the sequence variation between human genes that arose through evolution, questioning the extent to which those models capture genuine causal signals. 
RESULTS: Here we confront predictions of state-of-the-art models of transcription regulation against data from two large-scale observational studies and five deep perturbation assays. The most advanced of these sequence-based models, Enformer, by and large, captures causal determinants of human promoters. However, models fail to capture the causal effects of enhancers on expression, notably in medium to long distances and particularly for highly expressed promoters. More generally, the predicted impact of distal elements on gene expression predictions is small and the ability to correctly integrate long-range information is significantly more limited than the receptive fields of the models suggest. This is likely caused by the escalating class imbalance between actual and candidate regulatory elements as distance increases. CONCLUSIONS: Our results suggest that sequence-based models have advanced to the point that in silico study of promoter regions and promoter variants can provide meaningful insights and we provide practical guidance on how to use them. Moreover, we foresee that it will require significantly more and particularly new kinds of data to train models accurately accounting for distal elements. AU - Karollus, A.* AU - Mauermeier, T.* AU - Gagneur, J. C1 - 67658 C2 - 53965 CY - Campus, 4 Crinan St, London N1 9xw, England SP - 56 TI - Current sequence-based models capture gene expression determinants in promoters but mostly ignore distal enhancers. JO - Genome Biol. VL - 24 IS - 1 PB - Bmc PY - 2023 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Expression quantitative trait loci (eQTL) studies show how genetic variants affect downstream gene expression. Single-cell data allows reconstruction of personalized co-expression networks and therefore the identification of SNPs altering co-expression patterns (co-expression QTLs, co-eQTLs) and the affected upstream regulatory processes using a limited number of individuals. 
RESULTS: We conduct a co-eQTL meta-analysis across four scRNA-seq peripheral blood mononuclear cell datasets using a novel filtering strategy followed by a permutation-based multiple testing approach. Before the analysis, we evaluate the co-expression patterns required for co-eQTL identification using different external resources. We identify a robust set of cell-type-specific co-eQTLs for 72 independent SNPs affecting 946 gene pairs. These co-eQTLs are replicated in a large bulk cohort and provide novel insights into how disease-associated variants alter regulatory networks. One co-eQTL SNP, rs1131017, that is associated with several autoimmune diseases, affects the co-expression of RPS26 with other ribosomal genes. Interestingly, specifically in T cells, the SNP additionally affects co-expression of RPS26 and a group of genes associated with T cell activation and autoimmune disease. Among these genes, we identify enrichment for targets of five T-cell-activation-related transcription factors whose binding sites harbor rs1131017. This reveals a previously overlooked process and pinpoints potential regulators that could explain the association of rs1131017 with autoimmune diseases. CONCLUSION: Our co-eQTL results highlight the importance of studying context-specific gene regulation to understand the biological implications of genetic variation. With the expected growth of sc-eQTL datasets, our strategy and technical guidelines will facilitate future co-eQTL identification, further elucidating unknown disease mechanisms. AU - Li, S.* AU - Schmid, K. AU - de Vries, D.H.* AU - Korshevniuk, M.* AU - Losert, C. AU - Oelen, R.* AU - van Blokland, I.V.* AU - Groot, H.E.* AU - Swertz, M.A.* AU - van der Harst, P.* AU - Westra, H.J.* AU - van der Wijst, M.G.P.* AU - Heinig, M. 
AU - Franke, L.* C1 - 67687 C2 - 53994 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Identification of genetic variants that impact gene co-expression relationships using large-scale single-cell data. JO - Genome Biol. VL - 24 IS - 1 PB - Bmc PY - 2023 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Xenobiotics are primarily metabolized by hepatocytes in the liver, and primary human hepatocytes are the gold standard model for the assessment of drug efficacy, safety, and toxicity in the early phases of drug development. Recent advances in single-cell genomics demonstrate liver zonation and ploidy as main drivers of cellular heterogeneity. However, little is known about the impact of hepatocyte specialization on liver function upon metabolic challenge, including hepatic metabolism, detoxification, and protein synthesis. RESULTS: Here, we investigate the metabolic capacity of individual human hepatocytes in vitro. We assess how chronic accumulation of lipids enhances cellular heterogeneity and impairs the metabolisms of drugs. Using a phenotyping five-probe cocktail, we identify four functional subgroups of hepatocytes responding differently to drug challenge and fatty acid accumulation. These four subgroups display differential gene expression profiles upon cocktail treatment and xenobiotic metabolism-related specialization. Notably, intracellular fat accumulation leads to increased transcriptional variability and diminishes the drug-related metabolic capacity of hepatocytes. CONCLUSIONS: Our results demonstrate that, upon a metabolic challenge such as exposure to drugs or intracellular fat accumulation, hepatocyte subgroups display different and heterogeneous transcriptional responses. AU - Sánchez Quant, E.S. AU - Richter, M. AU - Colomé-Tatché, M. AU - Martinez Jimenez, C.P. 
C1 - 68643 C2 - 54846 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Single-cell metabolic profiling reveals subgroups of primary human hepatocytes with heterogeneous responses to drug challenge. JO - Genome Biol. VL - 24 IS - 1 PB - Bmc PY - 2023 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Histone lactylation has been recently described as a novel histone post-translational modification linking cellular metabolism to epigenetic regulation. RESULTS: Given the expected relevance of this modification and current limited knowledge of its function, we generate genome-wide datasets of H3K18la distribution in various in vitro and in vivo samples, including mouse embryonic stem cells, macrophages, adipocytes, and mouse and human skeletal muscle. We compare them to profiles of well-established histone modifications and gene expression patterns. Supervised and unsupervised bioinformatics analysis shows that global H3K18la distribution resembles H3K27ac, although we also find notable differences. H3K18la marks active CpG island-containing promoters of highly expressed genes across most tissues assessed, including many housekeeping genes, and positively correlates with H3K27ac and H3K4me3 as well as with gene expression. In addition, H3K18la is enriched at active enhancers that lie in proximity to genes that are functionally important for the respective tissue. CONCLUSIONS: Overall, our data suggests that H3K18la is not only a marker for active promoters, but also a mark of tissue specific active enhancers. AU - Galle, E.* AU - Wong, C.W.* AU - Ghosh, A.* AU - Desgeorges, T.* AU - Melrose, K.* AU - Hinte, L.C.* AU - Castellano-Castillo, D.* AU - Engl, M.* AU - de Sousa, J.A.* AU - Ruiz Ojeda, F.J. AU - de Bock, K.* AU - Ruiz, J.R.* AU - von Meyenn, F.* C1 - 66358 C2 - 52803 TI - H3K18 lactylation marks tissue-specific active enhancers. JO - Genome Biol. 
VL - 23 IS - 1 PY - 2022 SN - 1474-760X ER - TY - JOUR AB - Cost-efficient library generation by early barcoding has been central in propelling single-cell RNA sequencing. Here, we optimize and validate prime-seq, an early barcoding bulk RNA-seq method. We show that it performs equivalently to TruSeq, a standard bulk RNA-seq method, but is fourfold more cost-efficient due to almost 50-fold cheaper library costs. We also validate a direct RNA isolation step, show that intronic reads are derived from RNA, and compare cost-efficiencies of available protocols. We conclude that prime-seq is currently one of the best options to set up an early barcoding bulk RNA-seq protocol from which many labs would profit. AU - Janjic, A.* AU - Wange, L.E.* AU - Bagnoli, J.W.* AU - Geuder, J.* AU - Nguyen, P.* AU - Richter, D.* AU - Vieth, B.* AU - Vick, B. AU - Jeremias, I. AU - Ziegenhain, C.* AU - Hellmann, I.* AU - Enard, W.* C1 - 64747 C2 - 51962 TI - Prime-seq, efficient and powerful bulk RNA sequencing. JO - Genome Biol. VL - 23 IS - 1 PY - 2022 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. RESULTS: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. 
We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. CONCLUSIONS: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk. AU - Kanoni, S.* AU - Graham, S.E.* AU - Wang, Y.* AU - Surakka, I.* AU - Ramdas, S.* AU - Zhu, X.* AU - Clarke, S.L.* AU - Bhatti, K.F.* AU - Vedantam, S.* AU - Winkler, T.W.* AU - Locke, A.E.* AU - Marouli, E.* AU - Zajac, G.J.M.* AU - Wu, K.H.H.* AU - Ntalla, I.* AU - Hui, Q.* AU - Klarin, D.* AU - Hilliard, A.T.* AU - Wang, Z.* AU - Xue, C.* AU - Thorleifsson, G.* AU - Helgadottir, A.* AU - Gudbjartsson, D.F.* AU - Holm, H.* AU - Olafsson, I.* AU - Hwang, M.Y.* AU - Han, S.* AU - Akiyama, M.* AU - Sakaue, S.* AU - Terao, C.* AU - Kanai, M.* AU - Zhou, W.* AU - Brumpton, B.M.* AU - Rasheed, H.* AU - Havulinna, A.S.* AU - Veturi, Y.* AU - Pacheco, J.A.* AU - Rosenthal, E.A.* AU - Lingren, T.* AU - Feng, Q.P.* AU - Kullo, I.J.* AU - Narita, A.* AU - Takayama, J.* AU - Martin, H.C.* AU - Hunt, K.A.* AU - Trivedi, B.* AU - Haessler, J.* AU - Giulianini, F.* AU - Bradford, Y.* AU - Miller, J.E.* AU - Campbell, A.* AU - Lin, K.* AU - Millwood, I.Y.* AU - Rasheed, A.* AU - Hindy, G.* AU - Faul, J.D.* AU - Zhao, W.* AU - Weir, D.R.* AU - Turman, C.* AU - Huang, H.* AU - Graff, M.* AU - Choudhury, A.* AU - Sengupta, D.* AU - Mahajan, A.* AU - Brown, M.R.* AU - Zhang, W.* AU - Yu, K.* AU - Schmidt, E.M.* AU - Pandit, A.* AU - Gustafsson, S.* AU - Yin, X.* AU - Luan, J.* AU - Zhao, J.H.* AU - Matsuda, F.* AU - Jang, H.M.* AU - Yoon, K.* AU - Medina-Gomez, C.* AU - Pitsillides, A.* AU - 
Hottenga, J.J.* AU - Wood, A.R.* AU - Ji, Y.* AU - Gao, Z. AU - Haworth, S.* AU - Yousri, N.A.* AU - Mitchell, R.E.* AU - Chai, J.F.* AU - Aadahl, M.* AU - Bjerregaard, A.A.* AU - Yao, J.* AU - Manichaikul, A.* AU - Hwu, C.M.* AU - Hung, Y.J.* AU - Warren, H.R.* AU - Ramirez, J.* AU - Bork-Jensen, J.* AU - Kårhus, L.L.* AU - Goel, A.* AU - Sabater-Lleal, M.* AU - Noordam, R.* AU - Mauro, P.* AU - Møllehave, L.T.* AU - Munz, M.* AU - Zeng, L.* AU - Kurbasic, A.* AU - Lamina, C.* AU - Scholz, M.* AU - Zmuda, J.M* AU - Brody, J.A.* AU - Engmann, J.* AU - Slieker, R.C.* AU - Zilhao, N.R.* AU - Iha, H.* AU - Schmidt, B.* AU - Fernandez‑Lopez, J.C.* AU - Oldmeadow, C.* AU - Prasad, G.* AU - Lorés‑Motta, L.* AU - Nutile, T.* AU - Banas, B.* AU - Hebbar, P.* AU - Hofer, E.* AU - Bentley, A.R.* AU - Southam, L. AU - Rayner, N.W. AU - Wang, C.A.* AU - Couture, C.* AU - Cuellar‑Partida, G.* AU - Giannakopoulou, O.* AU - van Setten, J.* AU - Liang, J.* AU - Terzikhan, N.* AU - Kawaguchi, T.* AU - Nalls, M.A.* AU - Raitakari, O.T.* AU - Campbell, H.* AU - Ikram, M.A.* AU - Asselbergs, F.W.* AU - Pasterkamp, G.* AU - Bandinelli, S.* AU - Wickremasinghe, A.R.* AU - Bharadwaj, D.* AU - Koistinen, H.A.* AU - Yokota, M.* AU - Pramstaller, P.P.* AU - Kronenberg, F.* AU - Sabanayagam, C.* AU - Peters, A. AU - Gieger, C. AU - Hattersley, A.T.* AU - Pedersen, N.L.* AU - Cupples, L.A.* AU - Langenberg, C.* AU - Zeggini, E. AU - Kuusisto, J.* AU - Laakso, M.* AU - Saleheen, D.* AU - Jousilahti, P.* AU - Salomaa, V.* AU - Zhang, J.* AU - Deloukas, P.* AU - Willer, C.J.* AU - Assimes, T.* AU - Peloso, G.M.* C1 - 67096 C2 - 53489 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis. JO - Genome Biol. 
VL - 23 IS - 1 PB - Bmc PY - 2022 SN - 1474-760X ER - TY - JOUR AB - Population-scale single-cell RNA sequencing (scRNA-seq) is now viable, enabling finer resolution functional genomics studies and leading to a rush to adapt bulk methods and develop new single-cell-specific methods to perform these studies. Simulations are useful for developing, testing, and benchmarking methods but current scRNA-seq simulation frameworks do not simulate population-scale data with genetic effects. Here, we present splatPop, a model for flexible, reproducible, and well-documented simulation of population-scale scRNA-seq data with known expression quantitative trait loci. splatPop can also simulate complex batch, cell group, and conditional effects between individuals from different cohorts as well as genetically-driven co-expression. AU - Azodi, C.B.* AU - Zappia, L. AU - Oshlack, A.* AU - McCarthy, D.J.* C1 - 63849 C2 - 51730 CY - Campus, 4 Crinan St, London N1 9xw, England TI - splatPop: Simulating population scale single-cell RNA sequencing data. JO - Genome Biol. VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - Most research articles presenting new data analysis methods claim that "the new method performs better than existing methods," but the veracity of such statements is questionable. Our manuscript discusses and illustrates consequences of the optimistic bias occurring during the evaluation of novel data analysis methods, that is, all biases resulting from, for example, selection of datasets or competing methods, better ability to fix bugs in a preferred method, and selective reporting of method variants. We quantitatively investigate this bias using an example from epigenetic analysis: normalization methods for data generated by the Illumina HumanMethylation450K BeadChip microarray. AU - Buchka, S.* AU - Hapfelmeier, A.* AU - Gardner, P.P.* AU - Wilson, R. 
AU - Boulesteix, A.L.* C1 - 62038 C2 - 50555 CY - Campus, 4 Crinan St, London N1 9xw, England TI - On the optimistic performance evaluation of newly introduced bioinformatic methods. JO - Genome Biol. VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - Following publication of the original paper [1], it was noticed that a typesetting error occurred. Julien Gagneur was mistakenly not indicated as a corresponding author. This has been corrected and the original article [1] has been updated. AU - Cheng, J.* AU - Çelik, M.H.* AU - Kundaje, A.* AU - Gagneur, J. C1 - 61847 C2 - 50191 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Publisher Correction: MTSplice predicts effects of genetic variants on tissue-specific splicing. JO - Genome Biol. VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - We develop the free and open-source model Multi-tissue Splicing (MTSplice) to predict the effects of genetic variants on splicing of cassette exons in 56 human tissues. MTSplice combines MMSplice, which models constitutive regulatory sequences, with a new neural network that models tissue-specific regulatory sequences. MTSplice outperforms MMSplice on predicting tissue-specific variations associated with genetic variants in most tissues of the GTEx dataset, with largest improvements on brain tissues. Furthermore, MTSplice predicts that autism-associated de novo mutations are enriched for variants affecting splicing specifically in the brain. We foresee that MTSplice will aid interpreting variants associated with tissue-specific disorders. AU - Cheng, J.* AU - Çelik, M.H.* AU - Kundaje, A.* AU - Gagneur, J. C1 - 61754 C2 - 50190 CY - Campus, 4 Crinan St, London N1 9xw, England TI - MTSplice predicts effects of genetic variants on tissue-specific splicing. JO - Genome Biol. 
VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells. AU - Fischer, D.S. AU - Dony, L. AU - König, M. AU - Moeed, A. AU - Zappia, L. AU - Heumos, L. AU - Tritschler, S. AU - Holmberg, O. AU - Aliee, H. AU - Theis, F.J. C1 - 62879 C2 - 49951 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Sfaira accelerates data and model reuse in single cell genomics. JO - Genome Biol. VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Biological aging estimators derived from DNA methylation data are heritable and correlate with morbidity and mortality. Consequently, identification of genetic and environmental contributors to the variation in these measures in populations has become a major goal in the field. RESULTS: Leveraging DNA methylation and SNP data from more than 40,000 individuals, we identify 137 genome-wide significant loci, of which 113 are novel, from genome-wide association study (GWAS) meta-analyses of four epigenetic clocks and epigenetic surrogate markers for granulocyte proportions and plasminogen activator inhibitor 1 levels, respectively. We find evidence for shared genetic loci associated with the Horvath clock and expression of transcripts encoding genes linked to lipid metabolism and immune function. 
Notably, these loci are independent of those reported to regulate DNA methylation levels at constituent clock CpGs. A polygenic score for GrimAge acceleration showed strong associations with adiposity-related traits, educational attainment, parental longevity, and C-reactive protein levels. CONCLUSION: This study illuminates the genetic architecture underlying epigenetic aging and its shared genetic contributions with lifestyle factors and longevity. AU - McCartney, D.L.* AU - Min, J.L.* AU - Richmond, R.C.* AU - Lu, A.T.* AU - Sobczyk, M.K.* AU - Davies, G.* AU - Broer, L.* AU - Guo, X.* AU - Jeong, A.* AU - Jung, J.* AU - Kasela, S.* AU - Katrinli, S.* AU - Kuo, P.L.* AU - Matias-Garcia, P.R. AU - Mishra, P.P.* AU - Nygaard, M.* AU - Palviainen, T.* AU - Patki, A.* AU - Raffield, L.M.* AU - Ratliff, S.M.* AU - Richardson, T.G.* AU - Robinson, O.* AU - Soerensen, M.* AU - Sun, D.* AU - Tsai, P.C.* AU - van der Zee, M.D.* AU - Walker, R.M.* AU - Wang, X.* AU - Wang, Y.* AU - Xia, R.* AU - Xu, Z.* AU - Yao, J.* AU - Zhao, W.* AU - Correa, A.* AU - Boerwinkle, E.* AU - Dugué, P.A.* AU - Durda, P.* AU - Elliott, H.R.* AU - Gieger, C. AU - de Geus, E.J.C.* AU - Harris, S.E.* AU - Hemani, G.* AU - Imboden, M.* AU - Kähönen, M.* AU - Kardia, S.L.R.* AU - Kresovich, J.K.* AU - Li, S.* AU - Lunetta, K.L.* AU - Mangino, M.* AU - Mason, D.* AU - McIntosh, A.M.* AU - Mengel-From, J.* AU - Moore, A.Z.* AU - Murabito, J.M.* AU - Ollikainen, M.* AU - Pankow, J.S.* AU - Pedersen, N.L.* AU - Peters, A. AU - Polidoro, S.* AU - Porteous, D.J.* AU - Raitakari, O.* AU - Rich, S.S.* AU - Sandler, D.P.* AU - Sillanpää, E.* AU - Smith, A.K.* AU - Southey, M.C.* AU - Strauch, K. 
AU - Tiwari, H.* AU - Tanaka, T.* AU - Tillin, T.* AU - Uitterlinden, A.G.* AU - Van Den Berg, D.J.* AU - van Dongen, J.* AU - Wilson, J.G.* AU - Wright, J.* AU - Yet, I.* AU - Arnett, D.* AU - Bandinelli, S.* AU - Bell, J.T.* AU - Binder, A.M.* AU - Boomsma, D.I.* AU - Chen, W.* AU - Christensen, K.* AU - Conneely, K.N.* AU - Elliott, P.* AU - Ferrucci, L.* AU - Fornage, M.* AU - Hägg, S.* AU - Hayward, C.* AU - Irvin, M.R.* AU - Kaprio, J.* AU - Lawlor, D.A.* AU - Lehtimäki, T.* AU - Lohoff, F.W.* AU - Milani, L.* AU - Milne, R.L.* AU - Probst-Hensch, N.* AU - Reiner, A.P.* AU - Ritz, B.* AU - Rotter, J.I.* AU - Smith, J.A.* AU - Taylor, J.A.* AU - van Meurs, J.B.J.* AU - Vineis, P.* AU - Waldenberger, M. AU - Deary, I.J.* AU - Relton, C.L.* AU - Horvath, S.* AU - Marioni, R.E.* C1 - 62448 C2 - 50892 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Genome-wide association studies identify 137 genetic loci for DNA methylation biomarkers of aging. JO - Genome Biol. VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Little is known about the impact of trans-acting genetic variation on the rates with which proteins are synthesized by ribosomes. Here, we investigate the influence of such distant genetic loci on the efficiency of mRNA translation and define their contribution to the development of complex disease phenotypes within a panel of rat recombinant inbred lines. RESULTS: We identify several tissue-specific master regulatory hotspots that each control the translation rates of multiple proteins. One of these loci is restricted to hypertrophic hearts, where it drives a translatome-wide and protein length-dependent change in translational efficiency, altering the stoichiometric translation rates of sarcomere proteins. 
Mechanistic dissection of this locus across multiple congenic lines points to a translation machinery defect, characterized by marked differences in polysome profiles and misregulation of the small nucleolar RNA SNORA48. Strikingly, from yeast to humans, we observe reproducible protein length-dependent shifts in translational efficiency as a conserved hallmark of translation machinery mutants, including those that cause ribosomopathies. Depending on the factor mutated, a pre-existing negative correlation between protein length and translation rates could either be enhanced or reduced, which we propose to result from mRNA-specific imbalances in canonical translation initiation and reinitiation rates. CONCLUSIONS: We show that distant genetic control of mRNA translation is abundant in mammalian tissues, exemplified by a single genomic locus that triggers a translation-driven molecular mechanism. Our work illustrates the complexity through which genetic variation can drive phenotypic variability between individuals and thereby contribute to complex disease. AU - Witte, F.* AU - Ruiz-Orera, J.* AU - Mattioli, C.C.* AU - Blachut, S.* AU - Adami, E.* AU - Schulz, J.F.* AU - Schneider-Lunitz, V.* AU - Hummel, O.* AU - Patone, G.* AU - Mücke, M.B.* AU - Silhavý, J.* AU - Heinig, M. AU - Bottolo, L.* AU - Sanchis, D.* AU - Vingron, M.* AU - Chekulaeva, M.* AU - Pravenec, M.* AU - Hubner, N.* AU - Van Heesch, S.* C1 - 62453 C2 - 50833 TI - A trans locus causes a ribosomopathy in hypertrophic hearts that affects mRNA translation in a protein length-dependent fashion. JO - Genome Biol. VL - 22 IS - 1 PY - 2021 SN - 1474-760X ER - TY - JOUR AB - Recent years have seen a revolution in single-cell RNA-sequencing (scRNA-seq) technologies, datasets, and analysis methods. Since 2016, the scRNA-tools database has cataloged software tools for analyzing scRNA-seq data. With the number of tools in the database passing 1000, we provide an update on the state of the project and the field. 
This data shows the evolution of the field and a change of focus from ordering cells on continuous trajectories to integrating multiple samples and making use of reference datasets. We also find that open science practices reward developers with increased recognition and help accelerate the field. AU - Zappia, L. AU - Theis, F.J. C1 - 63386 C2 - 51359 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape. JO - Genome Biol. VL - 22 IS - 1 PB - Bmc PY - 2021 SN - 1474-760X ER - TY - JOUR AB - Background: Plants can transmit somatic mutations and epimutations to offspring, which in turn can affect fitness. Knowledge of the rate at which these variations arise is necessary to understand how plant development contributes to local adaption in an ecoevolutionary context, particularly in long-lived perennials. Results: Here, we generate a new high-quality reference genome from the oldest branch of a wild Populus trichocarpa tree with two dominant stems which have been evolving independently for 330 years. By sampling multiple, age-estimated branches of this tree, we use a multi-omics approach to quantify age-related somatic changes at the genetic, epigenetic, and transcriptional level. We show that the per-year somatic mutation and epimutation rates are lower than in annuals and that transcriptional variation is mainly independent of age divergence and cytosine methylation. Furthermore, a detailed analysis of the somatic epimutation spectrum indicates that transgenerationally heritable epimutations originate mainly from DNA methylation maintenance errors during mitotic rather than during meiotic cell divisions. Conclusion: Taken together, our study provides unprecedented insights into the origin of nucleotide and functional variation in a long-lived perennial plant. AU - Hofmeister, B.T.* AU - Denkena, J. AU - Colomé-Tatché, M. 
AU - Shahryary, Y.* AU - Hazarika, R.* AU - Grimwood, J.* AU - Mamidi, S.* AU - Jenkins, J.* AU - Grabowski, P.P.* AU - Sreedasyam, A.* AU - Shu, S.* AU - Barry, K.* AU - Lail, K.* AU - Adam, C.* AU - Lipzen, A.* AU - Sorek, R.* AU - Kudrna, D.* AU - Talag, J.* AU - Wing, R.* AU - Hall, D.W.* AU - Jacobsen, D.* AU - Tuskan, G.A.* AU - Schmutz, J.* AU - Johannes, F.* AU - Schmitz, R.J.* C1 - 60304 C2 - 49374 CY - Campus, 4 Crinan St, London N1 9xw, England TI - A genome assembly and the somatic genetic and epigenetic mutation rate in a wild long-lived perennial Populus trichocarpa. JO - Genome Biol. VL - 21 IS - 1 PB - Bmc PY - 2020 SN - 1474-760X ER - TY - JOUR AB - Background: The presence of nuclear mitochondrial DNA (numtDNA) has been reported within several nuclear genomes. Next to mitochondrial protein-coding genes, numtDNA sequences also encode for mitochondrial tRNA genes. However, the biological roles of numtDNA remain elusive. Results: Employing in silico analysis, we identify 281 mitochondrial tRNA homologs in the human genome, which we term nimtRNAs (nuclear intronic mitochondrial-derived tRNAs), being contained within introns of 76 nuclear host genes. Despite base changes in nimtRNAs when compared to their mtRNA homologs, a canonical tRNA cloverleaf structure is maintained. To address potential functions of intronic nimtRNAs, we insert them into introns of constitutive and alternative splicing reporters and demonstrate that nimtRNAs promote pre-mRNA splicing, dependent on the number and positioning of nimtRNA genes and splice site recognition efficiency. A mutational analysis reveals that the nimtRNA cloverleaf structure is required for the observed splicing increase. Utilizing a CRISPR/Cas9 approach, we show that a partial deletion of a single endogenous nimtRNALys within intron 28 of the PPFIBP1 gene decreases inclusion of the downstream-located exon 29 of the PPFIBP1 mRNA. 
By employing a pull-down approach followed by mass spectrometry, a 3′-splice site-associated protein network is identified, including KHDRBS1, which we show directly interacts with nimtRNATyr by an electrophoretic mobility shift assay. Conclusions: We propose that nimtRNAs, along with associated protein factors, can act as a novel class of intronic splicing regulatory elements in the human genome by participating in the regulation of splicing. AU - Hoser, S.M.* AU - Hoffmann, A. AU - Meindl, A.* AU - Gamper, M.* AU - Fallmann, J.* AU - Bernhart, S.H.* AU - Müller, L.* AU - Ploner, M.* AU - Misslinger, M.* AU - Kremser, L.* AU - Lindner, H.* AU - Geley, S.* AU - Schaal, H.* AU - Stadler, P.F.* AU - Huettenhofer, A.* C1 - 60753 C2 - 49502 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Intronic tRNAs of mitochondrial origin regulate constitutive and alternative splicing. JO - Genome Biol. VL - 21 IS - 1 PB - Bmc PY - 2020 SN - 1474-760X ER - TY - JOUR AB - The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands - or even millions - of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years. 
AU - Lähnemann, D.* AU - Köster, J.* AU - Szczurek, E.* AU - McCarthy, D.J.* AU - Hicks, S.C.* AU - Robinson, M.D.* AU - Vallejos, C.A.* AU - Campbell, K.R.* AU - Beerenwinkel, N.* AU - Mahfouz, A.* AU - Pinello, L.* AU - Skums, P.* AU - Stamatakis, A.* AU - Attolini, C.S.O.* AU - Aparicio, S.* AU - Baaijens, J.* AU - Balvert, M.* AU - Barbanson, B.d.* AU - Cappuccio, A.* AU - Corleone, G.* AU - Dutilh, B.E.* AU - Florescu, M.* AU - Guryev, V.* AU - Holmer, R.* AU - Jahn, K.* AU - Lobo, T.J.* AU - Keizer, E.M.* AU - Khatri, I.* AU - Kielbasa, S.M.* AU - Korbel, J.O.* AU - Kozlov, A.M.* AU - Kuo, T.H.* AU - Lelieveldt, B.P.F.* AU - Mandoiu, I.I.* AU - Marioni, J.C.* AU - Marschall, T.* AU - Mölder, F.* AU - Niknejad, A.* AU - Raczkowski, L.* AU - Reinders, M.* AU - Ridder, J.d.* AU - Saliba, A.E.* AU - Somarakis, A.* AU - Stegle, O.* AU - Theis, F.J. AU - Yang, H.* AU - Zelikovsky, A.* AU - McHardy, A.C.* AU - Raphael, B.J.* AU - Shah, S.P.* AU - Schönhuth, A.* C1 - 58955 C2 - 48431 TI - Eleven grand challenges in single-cell data science. JO - Genome Biol. VL - 21 IS - 1 PY - 2020 SN - 1474-760X ER - TY - JOUR AU - Lloyd, K.C.K.* AU - Adams, D.J.* AU - Baynam, G.* AU - Beaudet, A.L.* AU - Bosch, F.* AU - Boycott, K.M.* AU - Braun, R.E.* AU - Caulfield, M.* AU - Cohn, R.* AU - Dickinson, M.E.* AU - Dobbie, M.S.* AU - Flenniken, A.M.* AU - Flicek, P.* AU - Galande, S.* AU - Gao, X.* AU - Grobler, A.* AU - Heaney, J.D.* AU - Herault, Y.* AU - Hrabě de Angelis, M. AU - Lupski, J.R.* AU - Lyonnet, S.* AU - Mallon, A.M.* AU - Mammano, F.* AU - MacRae, C.A.* AU - McInnes, R.* AU - McKerlie, C.* AU - Meehan, T.F.* AU - Murray, S.A.* AU - Nutter, L.M.J.* AU - Obata, Y.* AU - Parkinson, H.* AU - Pepper, M.S.* AU - Sedlacek, R.* AU - Seong, J.K.* AU - Shiroishi, T.* AU - Smedley, D.* AU - Tocchini-Valentini, G.* AU - Valle, D.* AU - Wang, C.-K.L.* AU - Wells, S.* AU - White, J.* AU - Wurst, W. 
AU - Xu, Y.* AU - Brown, S.D.M.* C1 - 58032 C2 - 48158 CY - Campus, 4 Crinan St, London N1 9xw, England TI - The deep genome project. JO - Genome Biol. VL - 21 IS - 1 PB - Bmc PY - 2020 SN - 1474-760X ER - TY - JOUR AB - Stochastic changes in DNA methylation (i.e., spontaneous epimutations) contribute to methylome diversity in plants. Here, we describe AlphaBeta, a computational method for estimating the precise rate of such stochastic events using pedigree-based DNA methylation data as input. We demonstrate how AlphaBeta can be employed to study transgenerationally heritable epimutations in clonal or sexually derived mutation accumulation lines, as well as somatic epimutations in long-lived perennials. Application of our method to published and new data reveals that spontaneous epimutations accumulate neutrally at the genome-wide scale, originate mainly during somatic development and that they can be used as a molecular clock for age-dating trees. AU - Shahryary, Y.* AU - Symeonidi, A.* AU - Hazarika, R.R.* AU - Denkena, J. AU - Mubeen, T.* AU - Hofmeister, B.* AU - Van Gurp, T.* AU - Colomé-Tatché, M. AU - Verhoeven, K.J.F.* AU - Tuskan, G.* AU - Schmitz, R.J.* AU - Johannes, F.* C1 - 60306 C2 - 49375 CY - Campus, 4 Crinan St, London N1 9xw, England TI - AlphaBeta: Computational inference of epimutation rates and spectra from high-throughput DNA methylation data in plants. JO - Genome Biol. VL - 21 IS - 1 PB - Bmc PY - 2020 SN - 1474-760X ER - TY - JOUR AB - Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. 
Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods. AU - Ashuach, T.* AU - Fischer, D.S. AU - Kreimer, A.* AU - Ahituv, N.* AU - Theis, F.J. AU - Yosef, N.* C1 - 56844 C2 - 47337 CY - Campus, 4 Crinan St, London N1 9xw, England TI - MPRAnalyze: Statistical framework for massively parallel reporter assays. JO - Genome Biol. VL - 20 IS - 1 PB - Bmc PY - 2019 SN - 1474-760X ER - TY - JOUR AB - Background: Genomic imprinting is an epigenetic phenomenon that allows a subset of genes to be expressed mono-allelically based on the parent of origin and is typically regulated by differential DNA methylation inherited from gametes. Imprinting is pervasive in murine extra-embryonic lineages, and uniquely, the imprinting of several genes has been found to be conferred non-canonically through maternally inherited repressive histone modification H3K27me3. However, the underlying regulatory mechanisms of non-canonical imprinting in post-implantation development remain unexplored. Results: We identify imprinted regions in post-implantation epiblast and extra-embryonic ectoderm (ExE) by assaying allelic histone modifications (H3K4me3, H3K36me3, H3K27me3), gene expression, and DNA methylation in reciprocal C57BL/6 and CAST hybrid embryos. We distinguish loci with DNA methylation-dependent (canonical) and independent (non-canonical) imprinting by assaying hybrid embryos with ablated maternally inherited DNA methylation. We find that non-canonical imprints are localized to endogenous retrovirus-K (ERVK) long terminal repeats (LTRs), which act as imprinted promoters specifically in extra-embryonic lineages. 
Transcribed ERVK LTRs are CpG-rich and located in close proximity to gene promoters, and imprinting status is determined by their epigenetic patterning in the oocyte. Finally, we show that oocyte-derived H3K27me3 associated with non-canonical imprints is not maintained beyond pre-implantation development at these elements and is replaced by secondary imprinted DNA methylation on the maternal allele in post-implantation ExE, while being completely silenced by bi-allelic DNA methylation in the epiblast. Conclusions: This study reveals distinct epigenetic mechanisms regulating non-canonical imprinted gene expression between embryonic and extra-embryonic development and identifies an integral role for ERVK LTR repetitive elements. AU - Hanna, C.W.* AU - Pérez-Palacios, R.* AU - Gahurova, L.* AU - Schubert, M.* AU - Krueger, F.* AU - Biggins, L.* AU - Andrews, S.* AU - Colomé-Tatché, M. AU - Bourc'his, D.* AU - Dean, W.* AU - Kelsey, G.* C1 - 57214 C2 - 47619 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Endogenous retroviral insertions drive non-canonical imprinting in extra-embryonic tissues. JO - Genome Biol. VL - 20 IS - 1 PB - Bmc PY - 2019 SN - 1474-760X ER - TY - JOUR AB - Chromosome-scale genome sequence assemblies underpin pan-genomic studies. Recent genome assembly efforts in the large-genome Triticeae crops wheat and barley have relied on the commercial closed-source assembly algorithm DeNovoMagic. We present TRITEX, an open-source computational workflow that combines paired-end, mate-pair, 10X Genomics linked-read with chromosome conformation capture sequencing data to construct sequence scaffolds with megabase-scale contiguity ordered into chromosomal pseudomolecules. We evaluate the performance of TRITEX on publicly available sequence data of tetraploid wild emmer and hexaploid bread wheat, and construct an improved annotated reference genome sequence assembly of the barley cultivar Morex as a community resource. 
AU - Monat, C.* AU - Padmarasu, S.* AU - Lux, T. AU - Wicker, T.* AU - Gundlach, H. AU - Himmelbach, A.* AU - Ens, J.* AU - Li, C.* AU - Muehlbauer, G.J.* AU - Schulman, A.H.* AU - Waugh, R.* AU - Braumann, I.* AU - Pozniak, C.* AU - Scholz, U.* AU - Mayer, K.F.X. AU - Spannagl, M. AU - Stein, N.* AU - Mascher, M.* C1 - 57658 C2 - 47995 CY - Campus, 4 Crinan St, London N1 9xw, England TI - TRITEX: Chromosome-scale sequence assembly of Triticeae genomes with open-source tools. JO - Genome Biol. VL - 20 IS - 1 PB - Bmc PY - 2019 SN - 1474-760X ER - TY - JOUR AB - We describe a highly sensitive, quantitative, and inexpensive technique for targeted sequencing of transcript cohorts or genomic regions from thousands of bulk samples or single cells in parallel. Multiplexing is based on a simple method that produces extensive matrices of diverse DNA barcodes attached to invariant primer sets, which are all pre-selected and optimized in silico. By applying the matrices in a novel workflow named Barcode Assembly foR Targeted Sequencing (BART-Seq), we analyze developmental states of thousands of single human pluripotent stem cells, either in different maintenance media or upon Wnt/beta-catenin pathway activation, which identifies the mechanisms of differentiation induction. Moreover, we apply BART-Seq to the genetic screening of breast cancer patients and identify BRCA mutations with very high precision. The processing of thousands of samples and dynamic range measurements that outperform global transcriptomics techniques make BART-Seq the first targeted sequencing technique suitable for numerous research applications. AU - Uzbas, F. AU - Opperer, F. AU - Sönmezer, C. AU - Shaposhnikov, D. AU - Sass, S. AU - Krendl, C. AU - Angerer, P. AU - Theis, F.J. AU - Müller, N.S. AU - Drukker, M. 
C1 - 56720 C2 - 47247 CY - Campus, 4 Crinan St, London N1 9xw, England TI - BART-Seq: cost-effective massively parallelized targeted sequencing for genomics, transcriptomics, and single-cell analysis. JO - Genome Biol. VL - 20 IS - 1 PB - Bmc PY - 2019 SN - 1474-760X ER - TY - JOUR AB - Single-cell RNA-seq quantifies biological heterogeneity across both discrete cell types and continuous cell transitions. Partition-based graph abstraction (PAGA) provides an interpretable graph-like map of the arising data manifold, based on estimating connectivity of manifold partitions (https://github.com/theislab/paga). PAGA maps preserve the global topology of data, allow analyzing data at different resolutions, and result in much higher computational efficiency of the typical exploratory data analysis workflow. We demonstrate the method by inferring structure-rich cell maps with consistent topology across four hematopoietic datasets, adult planaria and the zebrafish embryo and benchmark computational performance on one million neurons. AU - Wolf, F.A. AU - Hamey, F.K.* AU - Plass, M.* AU - Solana, J.* AU - Dahlin, J.S.* AU - Göttgens, B.* AU - Rajewsky, N.* AU - Simon, L. AU - Theis, F.J. C1 - 55717 C2 - 46481 CY - Campus, 4 Crinan St, London N1 9xw, England TI - PAGA: Graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. JO - Genome Biol. 
VL - 20 IS - 1 PB - Bmc PY - 2019 SN - 1474-760X ER - TY - JOUR AB - Background: Numerous scaffold-level sequences for wheat are now being released and, in this context, we report on a strategy for improving the overall assembly to a level comparable to that of the human genome. Results: Using chromosome 7A of wheat as a model, sequence-finished megabase-scale sections of this chromosome were established by combining a new independent assembly using a bacterial artificial chromosome (BAC)-based physical map, BAC pool paired-end sequencing, chromosome-arm-specific mate-pair sequencing and Bionano optical mapping with the International Wheat Genome Sequencing Consortium RefSeq v1.0 sequence and its underlying raw data. The combined assembly results in 18 super-scaffolds across the chromosome. The value of finished genome regions is demonstrated for two approximately 2.5 Mb regions associated with yield and the grain quality phenotype of fructan carbohydrate grain levels. In addition, the 50 Mb centromere region analysis incorporates cytological data highlighting the importance of non-sequence data in the assembly of this complex genome region. Conclusions: Sufficient genome sequence information is shown to now be available for the wheat community to produce sequence-finished releases of each chromosome of the reference genome. The high-level completion identified that an array of seven fructosyl transferase genes underpins grain quality and that yield attributes are affected by five F-box-only-protein-ubiquitin ligase domain and four root-specific lipid transfer domain genes. The completed sequence also includes the centromere. 
AU - Keeble-Gagnère, G.* AU - Rigault, P.* AU - Tibbits, J.* AU - Pasam, R.K.* AU - Hayden, M.* AU - Forrest, K.* AU - Frenkel, Z.* AU - Korol, A.* AU - Huang, B.E.* AU - Cavanagh, C.* AU - Taylor, J.* AU - Abrouk, M.* AU - Sharpe, A.* AU - Konkin, D.* AU - Sourdille, P.* AU - Darrier, B.* AU - Choulet, F.* AU - Bernard, A.* AU - Rochfort, S.* AU - Dimech, A.* AU - Watson-Haigh, N.* AU - Baumann, U.* AU - Eckermann, P.* AU - Fleury, D.* AU - Juhász, A.* AU - Boisvert, S.* AU - Nolin, M.A.* AU - Doležel, J.* AU - Šimková, H.* AU - Toegelová, H.* AU - Šafář, J.* AU - Luo, M.C.* AU - Câmara, F.* AU - Pfeifer, M. AU - Isdale, D.* AU - Nyström-Persson, J.* AU - Iwgsc, .* AU - Koo, D.H.* AU - Tinning, M.* AU - Cui, D.* AU - Ru, Z.* AU - Appels, R.* C1 - 54141 C2 - 45337 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Optical and physical mapping with local finishing enables megabase-scale resolution of agronomically important regions in the wheat genome. JO - Genome Biol. VL - 19 IS - 1 PB - Bmc PY - 2018 SN - 1474-760X ER - TY - JOUR AB - Background: Genome-wide association studies conducted on QRS duration, an electrocardiographic measurement associated with heart failure and sudden cardiac death, have led to novel biological insights into cardiac function. However, the variants identified fall predominantly in non-coding regions and their underlying mechanisms remain unclear. Results: Here, we identify putative functional coding variation associated with changes in the QRS interval duration by combining Illumina HumanExome BeadChip genotype data from 77,898 participants of European ancestry and 7695 of African descent in our discovery cohort, followed by replication in 111,874 individuals of European ancestry from the UK Biobank and deCODE cohorts. We identify ten novel loci, seven within coding regions, including ADAMTS6, significantly associated with QRS duration in gene-based analyses. ADAMTS6 encodes a secreted metalloprotease of currently unknown function. 
In vitro validation analysis shows that the QRS-associated variants lead to impaired ADAMTS6 secretion and loss-of function analysis in mice demonstrates a previously unappreciated role for ADAMTS6 in connexin 43 gap junction expression, which is essential for myocardial conduction. Conclusions: Our approach identifies novel coding and non-coding variants underlying ventricular depolarization and provides a possible mechanism for the ADAMTS6-associated conduction changes. AU - Prins, B.P.* AU - Mead, T.J.* AU - Brody, J.A.* AU - Sveinbjornsson, G.* AU - Ntalla, I.* AU - Bihlmeyer, N.A.* AU - van den Berg, M.* AU - Bork-Jensen, J.* AU - Cappellani, S.* AU - Van Duijvenboden, S.* AU - Klena, N.T.* AU - Gabriel, G.C.* AU - Liu, X.* AU - Gulec, C.* AU - Grarup, N.* AU - Haessler, J.* AU - Hall, L.M.* AU - Iorio, A.* AU - Isaacs, A.* AU - Li-Gao, R.* AU - Lin, H.* AU - Liu, C.-T.* AU - Lyytikäinen, L.-P.* AU - Marten, J.* AU - Mei, H.* AU - Müller-Nurasyid, M. AU - Orini, M.* AU - Padmanabhan, S.* AU - Radmanesh, F.* AU - Ramirez, J.* AU - Robino, A.* AU - Schwartz, M.* AU - van Setten, J.* AU - Smith, A.V.* AU - Verweij, N.* AU - Warren, H.R.* AU - Weiss, S.* AU - Alonso, A.* AU - Arnar, D.O.* AU - Bots, M.L.* AU - de Boer, R.A.* AU - Dominiczak, A.F.* AU - Eijgelsheim, M.* AU - Ellinor, P.T.* AU - Guo, X.* AU - Felix, S.B.* AU - Harris, T.B.* AU - Hayward, C.* AU - Heckbert, S.R.* AU - Huang, P.L.* AU - Strauch, K. AU - Jamshidi, Y.* AU - Kors, J.A.* AU - Lambiase, P.D.* AU - Launer, L.J.* AU - Li, M.* AU - Linneberg, A.* AU - Nelson, C.P.* AU - Pedersen, O.* AU - Perez, M.L.* AU - Peters, A. AU - Polasek, O.* AU - Psaty, B.M.* AU - Raitakari, O.T.* AU - Rice, K.M.* AU - Rotter, J.I.* AU - Sinner, M.F.* AU - Soliman, E.Z.* AU - Spector, T.D.* AU - Waldenberger, M. AU - Lo, C.W.* C1 - 54001 C2 - 45194 TI - Exome-chip meta-analysis identifies novel loci associated with cardiac conduction, including ADAMTS6. JO - Genome Biol. 
VL - 19 IS - 1 PY - 2018 SN - 1474-760X ER - TY - JOUR AB - Background: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the dynamics of wheat genomes on a megabase scale. Results: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes, the old landrace Chinese Spring and the elite Swiss spring wheat line 'CH Campala Lr22a'. Both chromosomes were assembled into megabase-sized scaffolds. There is a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations reveals four large indels of more than 100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the molecular mechanisms that caused these indels. Three of the large indels affect copy number of NLRs, a gene family involved in plant immunity. Analysis of SNP density reveals four haploblocks of 4, 8, 9 and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Gene content across the two chromosomes was highly conserved. Ninety-nine percent of the genic sequences were present in both genotypes and the fraction of unique genes ranged from 0.4 to 0.7%. Conclusions: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations and gene content. The insight obtained from this analysis will form the basis of future wheat pan-genome studies. AU - Thind, A.K.* AU - Wicker, T.* AU - Müller, T.* AU - Ackermann, P.M.* AU - Steuernagel, B.* AU - Wulff, B.B.H.* AU - Spannagl, M. AU - Twardziok, S.O. AU - Felder, M. AU - Lux, T. AU - Mayer, K.F.X. 
AU - International Wheat Genome Sequencing Consortium* AU - Keller, B.* AU - Krattinger, S.G.* C1 - 54143 C2 - 45314 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome dynamics between two wheat cultivars. JO - Genome Biol. VL - 19 IS - 1 PB - Bmc PY - 2018 SN - 1474-760X ER - TY - JOUR AB - Background: Transposable elements (TEs) are major components of large plant genomes and main drivers of genome evolution. The most recent assembly of hexaploid bread wheat recovered the highly repetitive TE space in an almost complete chromosomal context and enabled a detailed view into the dynamics of TEs in the A, B, and D subgenomes. Results: The overall TE content is very similar between the A, B, and D subgenomes, although we find no evidence for bursts of TE amplification after the polyploidization events. Despite the near-complete turnover of TEs since the subgenome lineages diverged from a common ancestor, 76% of TE families are still present in similar proportions in each subgenome. Moreover, spacing between syntenic genes is also conserved, even though syntenic TEs have been replaced by new insertions over time, suggesting that distances between genes, but not sequences, are under evolutionary constraints. The TE composition of the immediate gene vicinity differs from the core intergenic regions. We find the same TE families to be enriched or depleted near genes in all three subgenomes. Evaluations at the subfamily level of timed long terminal repeat-retrotransposon insertions highlight the independent evolution of the diploid A, B, and D lineages before polyploidization and cases of concerted proliferation in the AB tetraploid. Conclusions: Even though the intergenic space is changed by the TE turnover, an unexpected preservation is observed between the A, B, and D subgenomes for features like TE family proportions, gene spacing, and TE enrichment near genes. 
AU - Wicker, T.* AU - Gundlach, H. AU - Spannagl, M. AU - Uauy, C.* AU - Borrill, P.* AU - Ramírez-González, R.H.* AU - De Oliveira, R.* AU - Mayer, K.F.X. AU - Paux, E.* AU - Choulet, F.* C1 - 54142 C2 - 45311 CY - Campus, 4 Crinan St, London N1 9xw, England TI - Impact of transposable elements on genome structure and evolution in bread wheat. JO - Genome Biol. VL - 19 IS - 1 PB - Bmc PY - 2018 SN - 1474-760X ER - TY - JOUR AB - SCANPY is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with SCANPY, we present ANNDATA, a generic class for handling annotated data matrices (https://github.com/theislab/anndata). AU - Wolf, F.A. AU - Angerer, P. AU - Theis, F.J. C1 - 52893 C2 - 44321 CY - London TI - SCANPY: Large-scale single-cell gene expression data analysis. JO - Genome Biol. VL - 19 IS - 1 PB - Biomed Central Ltd PY - 2018 SN - 1474-760X ER - TY - JOUR AB - Upon publication of the original article [1] it was highlighted by the authors that a transposition error affected Additional file 1, causing the misplacement of several columns and rendering the table difficult to read. This transposition does not influence any of the results nor analyses presented in the paper and has since been formally noted in this correction article; the corrected file is available here as an Additional File. The publisher apologizes for this error. AU - Zannas, A.S.* AU - Knauer-Arloth, J. 
AU - Carrillo-Roa, T.* AU - Iurato, S.* AU - Roeh, S.* AU - Ressler, K.J.* AU - Nemeroff, C.B.* AU - Smith, A.K.* AU - Bradley, B.* AU - Heim, C.* AU - Menke, A.* AU - Lange, J.F.* AU - Brueckl, T.* AU - Ising, M.* AU - Wray, N.R.* AU - Erhardt, A.* AU - Binder, E.B.* AU - Mehta, D.* C1 - 53781 C2 - 45011 CY - 1200 New York Ave, Nw, Washington, Dc 20005 Usa TI - Correction: Lifetime stress accelerates epigenetic aging in an urban, African American cohort: Relevance of glucocorticoid signaling [Genome Biol., 16, 1, (2015) (266)] DOI: 10.1186/s13059-015-0828-5. JO - Genome Biol. VL - 19 IS - 1 PB - Amer Assoc Advancement Science PY - 2018 SN - 1474-760X ER - TY - JOUR AB - Single-cell RNA-sequencing (scRNA-seq) allows studying heterogeneity in gene expression in large cell populations. Such heterogeneity can arise due to technical or biological factors, making decomposing sources of variation difficult. We here describe f-scLVM (factorial single-cell latent variable model), a method based on factor analysis that uses pathway annotations to guide the inference of interpretable factors underpinning the heterogeneity. Our model jointly estimates the relevance of individual factors, refines gene set annotations, and infers factors without annotation. In applications to multiple scRNA-seq datasets, we find that f-scLVM robustly decomposes scRNA-seq datasets into interpretable components, thereby facilitating the identification of novel subpopulations. AU - Buettner, F. AU - Pratanwanich, N.* AU - McCarthy, D.J.* AU - Marioni, J.C.* AU - Stegle, O.* C1 - 52915 C2 - 44356 CY - London TI - f-scLVM: Scalable and versatile factor analysis for single-cell RNA-seq. JO - Genome Biol. VL - 18 PB - Biomed Central Ltd PY - 2017 SN - 1474-760X ER - TY - JOUR AB - Open Science is encouraged by the European Union and many other political and scientific institutions. However, scientific practice is proving slow to change. 
We propose, as early career researchers, that it is our task to change scientific research into open scientific research and commit to Open Science principles. AU - Farnham, A.* AU - Kurz, C.F. AU - Ötzürk, M.A.* AU - Solbiati, M.* AU - Myllyntaus, O.* AU - Meekes, J.* AU - Pham, T.M.* AU - Paz, C.* AU - Langiewicz, M.* AU - Andrews, S.* AU - Kanninen, L.* AU - Agbemabiese, C.* AU - Guler, A.T.* AU - Durieux, J.* AU - Jasim, S.* AU - Viessmann, O.* AU - Frattini, S.* AU - Yembergenova, D.* AU - Benito, C.M.* AU - Porte, M.* AU - Grangeray-Vilmint, A.* AU - Prieto Curiel, R.* AU - Rehncrona, C.* AU - Malas, T.* AU - Esposito, F.* AU - Hettne, K.* C1 - 52340 C2 - 43885 CY - London TI - Early career researchers want Open Science. JO - Genome Biol. VL - 18 IS - 1 PB - Biomed Central Ltd PY - 2017 SN - 1474-760X ER - TY - JOUR AB - Background: Genetic variation is an important determinant of RNA transcription and splicing, which in turn contributes to variation in human traits, including cardiovascular diseases. Results: Here we report the first in-depth survey of heart transcriptome variation using RNA-sequencing in 97 patients with dilated cardiomyopathy and 108 non-diseased controls. We reveal extensive differences of gene expression and splicing between dilated cardiomyopathy patients and controls, affecting known as well as novel dilated cardiomyopathy genes. Moreover, we show a widespread effect of genetic variation on the regulation of transcription, isoform usage, and allele-specific expression. Systematic annotation of genome-wide association SNPs identifies 60 functional candidate genes for heart phenotypes, representing 20% of all published heart genome-wide association loci. Focusing on the dilated cardiomyopathy phenotype we found that eQTL variants are also enriched for dilated cardiomyopathy genome-wide association signals in two independent cohorts. 
Conclusions: RNA transcription, splicing, and allele-specific expression are each important determinants of the dilated cardiomyopathy phenotype and are controlled by genetic factors. Our results represent a powerful resource for the field of cardiovascular genetics. AU - Heinig, M. AU - Adriaens, M.E.* AU - Schafer, S.* AU - van Deutekom, H.W.M.* AU - Lodder, E.M.* AU - Ware, J.S.* AU - Schneider, V.* AU - Felkin, L.E.* AU - Creemers, E.E.* AU - Meder, B.* AU - Katus, H.A.* AU - Rühle, F.* AU - Stoll, M.* AU - Cambien, F.* AU - Villard, E.* AU - Charron, P.* AU - Varro, A.* AU - Bishopric, N.H.* AU - George, A.L.* AU - dos Remedios, C.* AU - Moreno-Moral, A.* AU - Pesce, F.* AU - Bauerfeind, A.* AU - Rüschendorf, F.* AU - Rintisch, C.* AU - Petretto, E.* AU - Barton, P.J.* AU - Cook, S.A.* AU - Pinto, Y.M.* AU - Bezzina, C.R.* AU - Hubner, N.* C1 - 51984 C2 - 43631 CY - London TI - Natural genetic variation of the cardiac transcriptome in non-diseased donors and patients with dilated cardiomyopathy. JO - Genome Biol. VL - 18 IS - 1 PB - Biomed Central Ltd PY - 2017 SN - 1474-760X ER - TY - JOUR AB - Background: Whole-exome sequencing (WES) has been successful in identifying genes that cause familial Parkinson's disease (PD). However, until now this approach has not been deployed to study large cohorts of unrelated participants. To discover rare PD susceptibility variants, we performed WES in 1148 unrelated cases and 503 control participants. Candidate genes were subsequently validated for functions relevant to PD based on parallel RNA-interference (RNAi) screens in human cell culture and Drosophila and C. elegans models. Results: Assuming autosomal recessive inheritance, we identify 27 genes that have homozygous or compound heterozygous loss-of-function variants in PD cases. Definitive replication and confirmation of these findings were hindered by potential heterogeneity and by the rarity of the implicated alleles. 
We therefore looked for potential genetic interactions with established PD mechanisms. Following RNAi-mediated knockdown, 15 of the genes modulated mitochondrial dynamics in human neuronal cultures and four candidates enhanced α-synuclein-induced neurodegeneration in Drosophila. Based on complementary analyses in independent human datasets, five functionally validated genes (GPATCH2L, UHRF1BP1L, PTPRH, ARSB, and VPS13C) also showed evidence consistent with genetic replication. Conclusions: By integrating human genetic and functional evidence, we identify several PD susceptibility gene candidates for further investigation. Our approach highlights a powerful experimental strategy with broad applicability for future studies of disorders with complex genetic etiologies. AU - Jansen, I.E.* AU - Ye, H.* AU - Heetveld, S.* AU - Lechler, M.C.* AU - Michels, H.* AU - Seinstra, R.I.* AU - Lubbe, S.J.* AU - Drouet, V.* AU - Lesage, S.* AU - Majounie, E.* AU - Gibbs, J.R.* AU - Nalls, M.A.* AU - Ryten, M.* AU - Botia, J.A.* AU - Vandrovcova, J.* AU - Simon-Sanchez, J.* AU - Castillo-Lizardo, M.* AU - Rizzu, P.* AU - Blauwendraat, C.* AU - Chouhan, A.K.* AU - Li, Y.* AU - Yogi, P.* AU - Amin, N.* AU - van Duijn, C.M.* AU - Morris, H.R.* AU - Brice, A.* AU - Singleton, A.B.* AU - David, D.C.* AU - Nollen, E.A.* AU - Jain, S.* AU - Shulman, J.M.* AU - Heutink, P.* AU - Hernandez, D.G.* AU - Arepalli, S.* AU - Brooks, J.M.* AU - Price, R.* AU - Nicolas, A.* AU - Chong, S.* AU - Cookson, M.R.* AU - Dillman, A.* AU - Moore, M.* AU - Traynor, B.J.* AU - Plagnol, V.* AU - Wood, N.W.* AU - Sheerin, U.M.* AU - Bras, J.M.* AU - Charlesworth, G.* AU - Gardner, M.* AU - Guerreiro, R.* AU - Trabzuni, D.* AU - Hardy, J.* AU - International Parkinson's Disease Genomics Consortium (IPDGC) (Illig, T. AU - Lichtner, P.) 
AU - Schulte, C.* AU - Corvol, J.C.* AU - Dürr, A.* AU - Vidailhet, M.* AU - Sveinbjörnsdóttir, S.* AU - Barker, R.A.* AU - Williams-Gray, C.H.* AU - Ben-Shlomo, Y.* AU - Berendse, H.W.* AU - van Dijk, K.D.* AU - Berg, D.* AU - Brockmann, K.* AU - Wurster, I.* AU - Mätzler, W.* AU - Gasser, T.* AU - Martinez, M.* AU - de Bie, R.M.A.* AU - Biffi, A.* AU - Velseboer, D.* C1 - 50481 C2 - 42411 TI - Discovery and functional prioritization of Parkinson's disease candidate genes from large-scale whole exome sequencing. JO - Genome Biol. VL - 18 IS - 1 PY - 2017 SN - 1474-760X ER - TY - JOUR AB - Background: Chromosome instability leads to aneuploidy, a state in which cells have abnormal numbers of chromosomes, and is found in two out of three cancers. In a chromosomal instable p53 deficient mouse model with accelerated lymphomagenesis, we previously observed whole chromosome copy number changes affecting all lymphoma cells. This suggests that chromosome instability is somehow suppressed in the aneuploid lymphomas or that selection for frequently lost/gained chromosomes out-competes the CIN-imposed mis-segregation. Results: To distinguish between these explanations and to examine karyotype dynamics in chromosome instable lymphoma, we use a newly developed single-cell whole genome sequencing (scWGS) platform that provides a complete and unbiased overview of copy number variations (CNV) in individual cells. To analyse these scWGS data, we develop AneuFinder, which allows annotation of copy number changes in a fully automated fashion and quantification of CNV heterogeneity between cells. Single-cell sequencing and AneuFinder analysis reveals high levels of copy number heterogeneity in chromosome instability-driven murine T-cell lymphoma samples, indicating ongoing chromosome instability. Application of this technology to human B cell leukaemias reveals different levels of karyotype heterogeneity in these cancers. 
Conclusion: Our data show that even though aneuploid tumours select for particular and recurring chromosome combinations, single-cell analysis using AneuFinder reveals copy number heterogeneity. This suggests ongoing chromosome instability that other platforms fail to detect. As chromosome instability might drive tumour evolution, karyotype analysis using single-cell sequencing technology could become an essential tool for cancer treatment stratification. AU - Bakker, B.* AU - Taudt, A.* AU - Belderbos, M.E.* AU - Porubsky, D.* AU - Spierings, D.C.J.* AU - de Jong, T.V.* AU - Halsema, N.* AU - Kazemier, H.G.* AU - Hoekstra-Wakker, K.* AU - Bradley, A.* AU - de Bont, E.S.J.M.* AU - van den Berg, A.* AU - Guryev, V.* AU - Lansdorp, P.M.* AU - Colomé-Tatché, M. AU - Foijer, F.* C1 - 48801 C2 - 41417 CY - London TI - Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies. JO - Genome Biol. VL - 17 IS - 1 PB - Biomed Central Ltd PY - 2016 SN - 1474-760X ER - TY - JOUR AB - By performing high-throughput chromosome conformation capture analyses in embryonic stem cells depleted of the linker histone H1, Geeven and colleagues have uncovered exciting new evidence concerning a role for this histone in modulating three-dimensional genome architecture and chromatin organization. Please see link to Research article: http://www.genomebiology.com/2015/16/1/289 AU - Izzo, A.* AU - Schneider, R. C1 - 47748 C2 - 39570 TI - H1 gets the genome in shape. JO - Genome Biol. VL - 17 PY - 2016 SN - 1474-760X ER - TY - JOUR AB - Background: Chronic low-grade inflammation reflects a subclinical immune response implicated in the pathogenesis of complex diseases. Identifying genetic loci where DNA methylation is associated with chronic low-grade inflammation may reveal novel pathways or therapeutic targets for inflammation. 
Results: We performed a meta-analysis of epigenome-wide association studies (EWAS) of serum C-reactive protein (CRP), which is a sensitive marker of low-grade inflammation, in a large European population (n = 8863) and trans-ethnic replication in African Americans (n = 4111). We found differential methylation at 218 CpG sites to be associated with CRP (P < 1.15 × 10⁻⁷) in the discovery panel of European ancestry and replicated (P < 2.29 × 10⁻⁴) 58 CpG sites (45 unique loci) among African Americans. To further characterize the molecular and clinical relevance of the findings, we examined the association with gene expression, genetic sequence variants, and clinical outcomes. DNA methylation at nine (16%) CpG sites was associated with whole blood gene expression in cis (P < 8.47 × 10⁻⁵), ten (17%) CpG sites were associated with a nearby genetic variant (P < 2.50 × 10⁻³), and 51 (88%) were also associated with at least one related cardiometabolic entity (P < 9.58 × 10⁻⁵). An additive weighted score of replicated CpG sites accounted for up to 6% inter-individual variation (R²) of age-adjusted and sex-adjusted CRP, independent of known CRP-related genetic variants. Conclusion: We have completed an EWAS of chronic low-grade inflammation and identified many novel genetic loci underlying inflammation that may serve as targets for the development of novel therapeutic interventions for inflammation. AU - Ligthart, S.* AU - Marzi, C. AU - Aslibekyan, S.* AU - Mendelson, M.M.* AU - Conneely, K.N.* AU - Tanaka, T.* AU - Colicino, E.* AU - Waite, L.L.* AU - Joehanes, R.* AU - Guan, W.* AU - Brody, J.A.* AU - Elks, C.E.* AU - Marioni, R.E.* AU - Jhun, M.A.* AU - Agha, G.* AU - Bressler, J.* AU - Ward-Caviness, C.K. AU - Chen, B.H.* AU - Huan, T.* AU - Bakulski, K.* AU - Salfati, E.L.* AU - Fiorito, G.* AU - Wahl, S. AU - Schramm, K. 
AU - Sha, J.* AU - Hernandez, D.G.* AU - Just, A.C.* AU - Smith, J.A.* AU - Sotoodehnia, N.* AU - Pilling, L.C.* AU - Pankow, J.S.* AU - Tsao, P.S.* AU - Liu, C.* AU - Zhao, W.* AU - Guarrera, S.* AU - Michopoulos, V.J.* AU - Smith, A.K.* AU - Peters, M.J.* AU - Melzer, D.* AU - Vokonas, P.* AU - Fornage, M.* AU - Prokisch, H. AU - Bis, J.C.* AU - Chu, A.Y.* AU - Herder, C.* AU - Grallert, H. AU - Yao, C.* AU - Shah, S.* AU - McRae, A.F.* AU - Lin, H.* AU - Horvath, S.* AU - Fallin, D.* AU - Hofman, A.* AU - Wareham, N.J.* AU - Wiggins, K.L.* AU - Feinberg, A.P.* AU - Starr, J.M.* AU - Visscher, P.M.* AU - Murabito, J.M.* AU - Kardia, S.L.R.* AU - Absher, D.M.* AU - Binder, E.B.* AU - Singleton, A.B.* AU - Bandinelli, S.* AU - Peters, A. AU - Waldenberger, M. AU - Matullo, G.* AU - Schwartz, J.D.* AU - Demerath, E.W.* AU - Uitterlinden, A.G.* AU - Meurs, J.B.J.* AU - Franco, O.H.* AU - Chen, Y.D.I.* AU - Levy, D.* AU - Turner, S.T.* AU - Deary, I.J.* C1 - 50168 C2 - 41259 CY - London TI - DNA methylation signatures of chronic low-grade inflammation are associated with complex diseases. JO - Genome Biol. VL - 17 IS - 1 PB - Biomed Central Ltd PY - 2016 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: Dent and Flint represent two major germplasm pools exploited in maize breeding. Several traits differentiate the two pools, like cold tolerance, early vigor, and flowering time. A comparative investigation of their genomic architecture relevant for quantitative trait expression has not been reported so far. Understanding the genomic differences between germplasm pools may contribute to a better understanding of the complementarity in heterotic patterns exploited in hybrid breeding and of mechanisms involved in adaptation to different environments. RESULTS: We perform whole-genome screens for signatures of selection specific to temperate Dent and Flint maize by comparing high-density genotyping data of 70 American and European Dent and 66 European Flint inbred lines. 
We find 2.2 % and 1.4 % of the genes are under selective pressure, respectively, and identify candidate genes associated with agronomic traits known to differ between the two pools. Taking flowering time as an example for the differentiation between Dent and Flint, we investigate candidate genes involved in the flowering network by phenotypic analyses in a Dent-Flint introgression library and find that the Flint haplotypes of the candidates promote earlier flowering. Within the flowering network, the majority of Flint candidates are associated with endogenous pathways in contrast to Dent candidate genes, which are mainly involved in response to environmental factors like light and photoperiod. The diversity patterns of the candidates in a unique panel of more than 900 individuals from 38 European landraces indicate a major contribution of landraces from France, Germany, and Spain to the candidate gene diversity of the Flint elite lines. CONCLUSIONS: In this study, we report the investigation of pool-specific differences between temperate Dent and Flint on a genome-wide scale. The identified candidate genes represent a promising source for the functional investigation of pool-specific haplotypes in different genetic backgrounds and for the evaluation of their potential for future crop improvement like the adaptation to specific environments. AU - Unterseer, S. AU - Pophaly, S.D.* AU - Peis, R.* AU - Westermeier, P.* AU - Mayer, M.* AU - Seidel, M. AU - Haberer, G. AU - Mayer, K.F.X. AU - Ordas, B.* AU - Pausch, H.* AU - Tellier, A.* AU - Bauer, E.* AU - Schön, C.C.* C1 - 49047 C2 - 41606 CY - London TI - A comprehensive study of the genomic differentiation between temperate Dent and Flint maize. JO - Genome Biol. VL - 17 IS - 1 PB - Biomed Central Ltd PY - 2016 SN - 1474-760X ER - TY - JOUR AU - van den Bos, H.* AU - Spierings, D.C.J.* AU - Taudt, A. 
AU - Bakker, B.* AU - Porubsky, D.* AU - Falconer, E.* AU - Novoa, C.* AU - Halsema, N.* AU - Kazemier, H.G.* AU - Hoekstra-Wakker, K.* AU - Guryev, V.* AU - den Dunnen, W.F.A.* AU - Foijer, F.* AU - Colomé-Tatché, M. AU - Boddeke, H.W.G.M.* AU - Lansdorp, P.M.* C1 - 49032 C2 - 41566 CY - London TI - Erratum to: Single-cell whole genome sequencing reveals no evidence for common aneuploidy in normal and Alzheimer's disease neurons [Genome Biol, (2016), 17, 116]. JO - Genome Biol. VL - 17 IS - 1 PB - Biomed Central Ltd PY - 2016 SN - 1474-760X ER - TY - JOUR AB - Background: Alzheimer's disease (AD) is a neurodegenerative disease of the brain and the most common form of dementia in the elderly. Aneuploidy, a state in which cells have an abnormal number of chromosomes, has been proposed to play a role in neurodegeneration in AD patients. Several studies using fluorescence in situ hybridization have shown that the brains of AD patients contain an increased number of aneuploid cells. However, because the reported rate of aneuploidy in neurons ranges widely, a more sensitive method is needed to establish a possible role of aneuploidy in AD pathology. Results: In the current study, we used a novel single-cell whole genome sequencing (scWGS) approach to assess aneuploidy in isolated neurons from the frontal cortex of normal control individuals (n = 6) and patients with AD (n = 10). The sensitivity and specificity of our method was shown by the presence of three copies of chromosome 21 in all analyzed neuronal nuclei of a Down's syndrome sample (n = 36). Very low levels of aneuploidy were found in the brains from control individuals (n = 589) and AD patients (n = 893). In contrast to other studies, we observe no selective gain of chromosomes 17 or 21 in neurons of AD patients. Conclusion: scWGS showed no evidence for common aneuploidy in normal and AD neurons. Therefore, our results do not support an important role for aneuploidy in neuronal cells in the pathogenesis of AD. 
This will need to be confirmed by future studies in larger cohorts. AU - van den Bos, H.* AU - Spierings, D.C.J.* AU - Taudt, A. AU - Bakker, B.* AU - Porubsky, D.* AU - Falconer, E.* AU - Novoa, C.* AU - Halsema, N.* AU - Kazemier, H.G.* AU - Hoekstra-Wakker, K.* AU - Guryev, V.* AU - den Dunnen, W.F.A.* AU - Foijer, F.* AU - Colomé-Tatché, M. AU - Boddeke, H.W.G.M.* AU - Lansdorp, P.M.* C1 - 48984 C2 - 41520 CY - London TI - Single-cell whole genome sequencing reveals no evidence for common aneuploidy in normal and Alzheimer's disease neurons. JO - Genome Biol. VL - 17 IS - 1 PB - Biomed Central Ltd PY - 2016 SN - 1474-760X ER - TY - JOUR AB - Background: Hematopoietic stem cells (HSCs) are a rare cell type with the ability of long-term self-renewal and multipotency to reconstitute all blood lineages. HSCs are typically purified from the bone marrow using cell surface markers. Recent studies have identified significant cellular heterogeneities in the HSC compartment with subsets of HSCs displaying lineage bias. We previously discovered that the transcription factor Bcl11a has critical functions in the lymphoid development of the HSC compartment. Results: In this report, we employ single-cell transcriptomic analysis to dissect the molecular heterogeneities in HSCs. We profile the transcriptomes of 180 highly purified HSCs (Bcl11a +/+ and Bcl11a -/-). Detailed analysis of the RNA-seq data identifies cell cycle activity as the major source of transcriptomic variation in the HSC compartment, which allows reconstruction of HSC cell cycle progression in silico. Single-cell RNA-seq profiling of Bcl11a -/- HSCs reveals abnormal proliferative phenotypes. Analysis of lineage gene expression suggests that the Bcl11a -/- HSCs are constituted of two distinct myeloerythroid-restricted subpopulations. 
Remarkably, similar myeloid-restricted cells could also be detected in the wild-type HSC compartment, suggesting selective elimination of lymphoid-competent HSCs after Bcl11a deletion. These defects are experimentally validated in serial transplantation experiments where Bcl11a -/- HSCs are myeloerythroid-restricted and defective in self-renewal. Conclusions: Our study demonstrates the power of single-cell transcriptomics in dissecting cellular process and lineage heterogeneities in stem cell compartments, and further reveals the molecular and cellular defects in the Bcl11a-deficient HSC compartment. AU - Tsang, J.C.H.* AU - Yu, Y.* AU - Burke, S.* AU - Buettner, F. AU - Wang, C.* AU - Kolodziejczyk, A.A.* AU - Teichmann, S.A.* AU - Lu, L.* AU - Liu, P.* C1 - 46886 C2 - 39015 TI - Single-cell transcriptomic reconstruction reveals cell cycle and multi-lineage differentiation defects in Bcl11a-deficient hematopoietic stem cells. JO - Genome Biol. VL - 16 IS - 1 PY - 2015 SN - 1474-760X ER - TY - JOUR AB - Background: Chronic psychological stress is associated with accelerated aging and increased risk for aging-related diseases, but the underlying molecular mechanisms are unclear. Results: We examined the effect of lifetime stressors on a DNA methylation-based age predictor, epigenetic clock. After controlling for blood cell-type composition and lifestyle parameters, cumulative lifetime stress, but not childhood maltreatment or current stress alone, predicted accelerated epigenetic aging in an urban, African American cohort (n = 392). This effect was primarily driven by personal life stressors, was more pronounced with advancing age, and was blunted in individuals with higher childhood abuse exposure. Hypothesizing that these epigenetic effects could be mediated by glucocorticoid signaling, we found that a high number (n = 85) of epigenetic clock CpG sites were located within glucocorticoid response elements. 
We further examined the functional effects of glucocorticoids on epigenetic clock CpGs in an independent sample with genome-wide DNA methylation (n = 124) and gene expression data (n = 297) before and after exposure to the glucocorticoid receptor agonist dexamethasone. Dexamethasone induced dynamic changes in methylation in 31.2 % (110/353) of these CpGs and transcription in 81.7 % (139/170) of genes neighboring epigenetic clock CpGs. Disease enrichment analysis of these dexamethasone-regulated genes showed enriched association for aging-related diseases, including coronary artery disease, arteriosclerosis, and leukemias. Conclusions: Cumulative lifetime stress may accelerate epigenetic aging, an effect that could be driven by glucocorticoid-induced epigenetic changes. These findings contribute to our understanding of mechanisms linking chronic stress with accelerated aging and heightened disease risk. AU - Zannas, A.S.* AU - Knauer-Arloth, J. AU - Carrillo-Roa, T.* AU - Iurato, S.* AU - Röh, S.* AU - Ressler, K.J.* AU - Nemeroff, C.B.* AU - Smith, A.K.* AU - Bradley, B.* AU - Heim, C.* AU - Menke, A.* AU - Lange, J.F.* AU - Brückl, T.* AU - Ising, M.* AU - Wray, N.R.* AU - Erhardt, A.* AU - Binder, E.B.* AU - Mehta, D.* C1 - 47593 C2 - 39413 TI - Lifetime stress accelerates epigenetic aging in an urban, African American cohort: Relevance of glucocorticoid signaling. JO - Genome Biol. VL - 16 IS - 1 PY - 2015 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: There is growing evidence for the prevalence of copy number variation (CNV) and its role in phenotypic variation in many eukaryotic species. Here we use array comparative genomic hybridization to explore the extent of this type of structural variation in domesticated barley cultivars and wild barleys. RESULTS: A collection of 14 barley genotypes including eight cultivars and six wild barleys were used for comparative genomic hybridization. CNV affects 14.9% of all the sequences that were assessed. 
Higher levels of CNV diversity are present in the wild accessions relative to cultivated barley. CNVs are enriched near the ends of all chromosomes except 4H, which exhibits the lowest frequency of CNVs. CNV affects 9.5% of the coding sequences represented on the array and the genes affected by CNV are enriched for sequences annotated as disease-resistance proteins and protein kinases. Sequence-based comparisons of CNV between cultivars Barke and Morex provided evidence that DNA repair mechanisms of double-strand breaks via single-stranded annealing and synthesis-dependent strand annealing play an important role in the origin of CNV in barley. CONCLUSIONS: We present the first catalog of CNVs in a diploid Triticeae species, which opens the door for future genome diversity research in a tribe that comprises the economically important cereal species wheat, barley, and rye. Our findings constitute a valuable resource for the identification of CNV affecting genes of agronomic importance. We also identify potential mechanisms that can generate variation in copy number in plant genomes. AU - Muñoz-Amatriaín, M.* AU - Eichten, S.R.* AU - Wicker, T.* AU - Richmond, T.A.* AU - Mascher, M.* AU - Steuernagel, B.* AU - Scholz, U.* AU - Ariyadasa, R.* AU - Spannagl, M. AU - Nussbaumer, T. AU - Mayer, K.F.X. AU - Taudien, S.* AU - Platzer, M.* AU - Jeddeloh, J.A.* AU - Springer, N.M.* AU - Muehlbauer, G.J.* AU - Stein, N.* C1 - 25166 C2 - 31839 TI - Distribution, functional impact, and origin mechanisms of copy number variation in the barley genome. JO - Genome Biol. VL - 14 IS - 6 PB - BioMed Central PY - 2013 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: As for other major crops, achieving a complete wheat genome sequence is essential for the application of genomics to breeding new and improved varieties. 
To overcome the complexities of the large, highly repetitive and hexaploid wheat genome, the International Wheat Genome Sequencing Consortium established a chromosome-based strategy that was validated by the construction of the physical map of chromosome 3B. Here, we present improved strategies for the construction of highly integrated and ordered wheat physical maps, using chromosome 1BL as a template, and illustrate their potential for evolutionary studies and map-based cloning. RESULTS: Using a combination of novel high throughput marker assays and an assembly program, we developed a high quality physical map representing 93% of wheat chromosome 1BL, anchored and ordered with 5,489 markers including 1,161 genes. Analysis of the gene space organization and evolution revealed that gene distribution and conservation along the chromosome results from the superimposition of the ancestral grass and recent wheat evolutionary patterns leading to a peak of synteny in the central part of the chromosome arm and an increased density of non collinear genes towards the telomere. With a density of about 11 markers per Mb, the 1BL physical map provides 916 markers, including 193 genes, for fine mapping the 40 QTLs mapped on this chromosome. CONCLUSIONS: Here, we demonstrate that high marker density physical maps can be developed in complex genomes such as wheat to accelerate map-based cloning, gain new insights into genome evolution, and provide a foundation for reference sequencing. AU - Philippe, R.* AU - Paux, E.* AU - Bertin, I.* AU - Sourdille, P.* AU - Choulet, F.* AU - Laugier, C.* AU - Šimková, H.* AU - Safar, J.* AU - Bellec, A.* AU - Vautrin, S.* AU - Frenkel, Z.* AU - Cattonaro, F.* AU - Magni, F.* AU - Scalabrin, S.* AU - Martis, M.M. AU - Mayer, K.F.X. AU - Korol, A.* AU - Bergès, H.* AU - Dolezel, J.* AU - Feuillet, C.* C1 - 25164 C2 - 31838 TI - A high density physical map of chromosome 1BL supports evolutionary studies, map-based cloning and sequencing in wheat. 
JO - Genome Biol. VL - 14 IS - 6 PB - BioMed Central PY - 2013 SN - 1474-760X ER - TY - JOUR AB - BACKGROUND: The mouse inbred line C57BL/6J is widely used in mouse genetics and its genome has been incorporated into many genetic reference populations. More recently large initiatives such as The International Knockout Mouse Consortium (IKMC) are using the C57BL/6N mouse strain to generate null alleles for all mouse genes. Hence both strains are now widely used in mouse genetics studies. Here we perform a comprehensive genomic and phenotypic analysis of the two strains to identify differences that may influence their underlying genetic mechanisms. RESULTS: We undertake genome sequence comparisons of C57BL/6J and C57BL/6N to identify SNPs, indels and structural variants, with a focus on identifying all coding variants. We annotate 34 SNPs and 2 indels that distinguish C57BL/6J and C57BL/6N coding sequences, as well as 15 structural variants that overlap a gene. In parallel we assess the comparative phenotypes of the two inbred lines utilizing the EMPReSSslim phenotyping pipeline, a broad based assessment encompassing diverse biological systems. We perform additional secondary phenotyping assessments to explore other phenotype domains and to elaborate phenotype differences identified in the primary assessment. We uncover significant phenotypic differences between the two lines, replicated across multiple centers, in a number of physiological, biochemical and behavioral systems. CONCLUSIONS: Comparison of C57BL/6J and C57BL/6N demonstrates a range of phenotypic differences that have the potential to impact upon penetrance and expressivity of mutational effects in these strains. Moreover, the sequence variants we identify provide a set of candidate genes for the phenotypic differences observed between the two strains. AU - Simon, M.M.* AU - Greenaway, S.* AU - White, J.K.* AU - Fuchs, H. AU - Gailus-Durner, V.
AU - Sorg, T.* AU - Wong, K.* AU - Bedu, E.* AU - Cartwright, E.J.* AU - Dacquin, R.* AU - Estabel, J.* AU - Graw, J. AU - Ingham, N.J.* AU - Jackson, I.J.* AU - Lengeling, A.* AU - Mandillo, S.* AU - Marvel, J.* AU - Meziane, H.* AU - Preitner, F.* AU - Puk, O. AU - Roux, M.* AU - Adams, D.J.* AU - Atkins, S.* AU - Ayadi, A.* AU - Becker, L. AU - Blake, A.* AU - Brooker, D.* AU - Cater, H.* AU - Champy, M.-F.* AU - Combe, R.* AU - Danecek, P.* AU - di Fenza, A.* AU - Gates, H.* AU - Gerdin, A.-K.* AU - Golini, E.* AU - Hancock, J.M.* AU - Hans, W. AU - Hölter, S.M. AU - Hough, T.* AU - Jurdic, P.* AU - Keane, T.M.* AU - Morgan, H.* AU - Müller, W.* AU - Neff, F. AU - Nicholson, G.* AU - Pasche, B.* AU - Roberson, L.-A.* AU - Rozman, J. AU - Sanderson, M.* AU - Santos, L.* AU - Selloum, M.* AU - Shannon, C.* AU - Southwell, A.* AU - Tocchini-Valentini, G.P.* AU - Vancollie, V.E.* AU - Wells, S.* AU - Westerberg, H.* AU - Wurst, W. AU - Zi, M.* AU - Yalcin, B.* AU - Ramirez-Solis, R.* AU - Steel, K.P.* AU - Mallon, A.-M.* AU - Hrabě de Angelis, M. AU - Herault, Y.* AU - Brown, S.D.M.* C1 - 26715 C2 - 32357 TI - A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains. JO - Genome Biol. VL - 14 IS - 7 PB - Biomed Central PY - 2013 SN - 1474-760X ER - TY - JOUR AB - ABSTRACT: The pathobiology of common diseases is influenced by heterogeneous factors interacting in complex networks. CIDeR http://mips.helmholtz-muenchen.de/cider/ is a publicly available, manually curated, integrative database of metabolic and neurological disorders. The resource provides structured information on 18,813 experimentally validated interactions between molecules, bioprocesses and environmental factors extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make CIDeR a versatile knowledge base for biologists, analysis of large-scale data and systems biology approaches. AU - Lechner, M.
AU - Höhn, V. AU - Brauner, B. AU - Dunger, I. AU - Fobo, G. AU - Frishman, G. AU - Montrone, C. AU - Kastenmüller, G. AU - Wägele, B. AU - Ruepp, A. C1 - 10561 C2 - 30304 TI - CIDeR: Multifactorial interaction networks in human diseases. JO - Genome Biol. VL - 13 IS - 7 PB - BioMed Central Ltd PY - 2012 SN - 1474-760X ER - TY - JOUR AB - In recent years, microRNAs have been shown to play important roles in physiological as well as malignant processes. The PhenomiR database http://mips.helmholtz-muenchen.de/phenomir provides data from 542 studies that investigate deregulation of microRNA expression in diseases and biological processes as a systematic, manually curated resource. Using the PhenomiR dataset, we could demonstrate that, depending on disease type, independent information from cell culture studies contrasts with conclusions drawn from patient studies. AU - Ruepp, A. AU - Kowarsch, A. AU - Schmidl, D. AU - Buggenthin, F. AU - Brauner, B. AU - Dunger, I. AU - Fobo, G. AU - Frishman, G. AU - Montrone, C. AU - Theis, F.J. C1 - 5164 C2 - 27864 TI - PhenomiR: A knowledgebase for microRNA expression in diseases and biological processes. JO - Genome Biol. VL - 11 IS - 1 PB - BioMed Central Ltd. PY - 2010 SN - 1474-760X ER - TY - JOUR AB - The majority of the 2 million bovine single nucleotide polymorphisms (SNPs) currently available in dbSNP have been identified in a single breed, Hereford cattle, during the bovine genome project. In an attempt to evaluate the variance of a second breed, we have produced a whole genome sequence at low coverage of a single Fleckvieh bull. Results: We generated 24 gigabases of sequence, mainly using 36-bp paired-end reads, resulting in an average 7.4-fold sequence depth. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown, and 115,000 small indels. 
A comparison with the genotypes of the same animal, generated on a 50 k oligonucleotide chip, revealed a detection rate of 74% and 30% for homozygous and heterozygous SNPs, respectively. The false positive rate, as determined by comparison with genotypes determined for 196 randomly selected SNPs, was approximately 1.1%. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. 95% of the SNPs were polymorphic with an average minor allele frequency of 24.5% and with 83% of the SNPs having a minor allele frequency larger than 5%. Conclusions: This work provides the first single cattle genome by next-generation sequencing. The chosen approach, low to medium coverage re-sequencing, added more than 2 million novel SNPs to the currently publicly available SNP resource, providing a valuable resource for the construction of high density oligonucleotide arrays in the context of genome-wide association studies. AU - Eck, S.H. AU - Benet-Pagès, A. AU - Flisikowski, K.* AU - Meitinger, T. AU - Fries, R.* AU - Strom, T.M. C1 - 1418 C2 - 26437 TI - Whole genome sequencing of a single Bos taurus animal for single nucleotide polymorphism discovery. JO - Genome Biol. VL - 10 IS - 8 PB - Biomed Central Ltd PY - 2009 SN - 1474-760X ER - TY - JOUR AB - Identifying the biochemical basis of microbial phenotypes is a main objective of comparative genomics. Here we present a novel method using multivariate machine learning techniques for comparing automatically derived metabolic reconstructions of sequenced genomes on a large scale. Applying our method to 266 genomes directly led to testable hypotheses such as the link between the potential of microorganisms to cause periodontal disease and their ability to degrade histidine, a link also supported by clinical studies. AU - Kastenmüller, G. AU - Schenk, M.E. AU - Gasteiger, J.* AU - Mewes, H.-W.
C1 - 395 C2 - 26769 TI - Uncovering metabolic pathways relevant to phenotypic traits of microbial genomes. JO - Genome Biol. VL - 10 IS - 3 PB - Biomed Central PY - 2009 SN - 1474-760X ER - TY - JOUR AB - With many genomes now sequenced, computational annotation methods to characterize genes and proteins from their sequence are increasingly important. The BioSapiens Network has developed tools to address all stages of this process, and here we review progress in the automated prediction of protein function based on protein sequence and structure. AU - Loewenstein, Y.* AU - Raimondo, D.* AU - Redfern, O.C.* AU - Watson, J.* AU - Frishman, D. AU - Linial, M.* AU - Orengo, C.* AU - Thornton, J.* AU - Tramontano, A.* C1 - 2153 C2 - 26537 TI - Protein function annotation by homology-based inference. JO - Genome Biol. VL - 10 IS - 2 PB - Biomed Central Ltd. PY - 2009 SN - 1474-760X ER - TY - JOUR AB - Addiction is a pathological dysregulation of the brain's reward systems, determined by several complex genetic pathways. The conditioned place preference test provides an evaluation of the effects of drugs in animal models, allowing the investigation of substances at a biologically relevant level with respect to reward. Our lab has previously reported the development of a reliable conditioned place preference paradigm for zebrafish. Here, this test was used to isolate a dominant N-ethyl-N-nitrosourea (ENU)-induced mutant, no addiction (nad(dne3256)), which fails to respond to amphetamine, and which we used as an entry point towards identifying the behaviorally relevant transcriptional response to amphetamine. Results: Through the combination of microarray experiments comparing the adult brain transcriptome of mutant and wild-type siblings under normal conditions, as well as their response to amphetamine, we identified genes that correlate with the mutants' altered conditioned place preference behavior. 
In addition to pathways classically involved in reward, this gene set shows a striking enrichment in transcription factor-encoding genes classically involved in brain development, which later appear to be reused within the adult brain. We selected a subset of them for validation by quantitative PCR and in situ hybridization, revealing that specific brain areas responding to the drug through these transcription factors include domains of ongoing adult neurogenesis. Finally, network construction revealed functional connections between several of these genes. Conclusions: Together, our results identify a new network of coordinated gene regulation that influences or accompanies amphetamine-triggered conditioned place preference behavior and that may underlie the susceptibility to addiction. AU - Webb, K.J. AU - Norton, W.H.J. AU - Trümbach, D. AU - Meijer, A.H.* AU - Ninkovic, J. AU - Topp, S. AU - Heck, D. AU - Marr, C. AU - Wurst, W. AU - Theis, F.J. AU - Spaink, H.P.* AU - Bally-Cuif, L. C1 - 1747 C2 - 26418 TI - Zebrafish reward mutants reveal novel transcripts mediating the behavioral effects of amphetamine. JO - Genome Biol. VL - 10 IS - 7 PB - Biomed Central PY - 2009 SN - 1474-760X ER - TY - JOUR AB - ABSTRACT: KEGG spider is a web-based tool for interpretation of experimentally derived gene lists in order to gain understanding of metabolism variations at a genomic level. KEGG spider implements a 'pathway-free' framework that overcomes a major bottleneck of enrichment analyses: it provides global models uniting genes from different metabolic pathways. Analyzing a number of experimentally derived gene lists, we demonstrate that KEGG spider provides deeper insights into metabolism variations in comparison to existing methods. AU - Antonov, A.V. AU - Dietmann, S. AU - Mewes, H.-W. C1 - 4379 C2 - 25874 TI - KEGG spider: Interpretation of genomics data in the context of the global gene metabolic network. JO - Genome Biol. 
VL - 9 IS - 12 PB - BioMed Central PY - 2008 SN - 1474-760X ER - TY - JOUR AB - We show that although the currently available isochore mapping methods agree on the isochore classification of about two-thirds of the human DNA, they produce significantly different results with regard to the location of isochore boundaries and isochore length distribution. We present a new consensus isochore assignment method based on majority voting and provide IsoBase, a comprehensive on-line database of isochore maps for all completely sequenced vertebrate genomes. AU - Schmidt, T. AU - Frishman, D. C1 - 1370 C2 - 25531 TI - Assignment of isochores for all completely sequenced vertebrate genomes using a consensus. JO - Genome Biol. VL - 9 IS - 6 PB - BioMed Central PY - 2008 SN - 1474-760X ER - TY - JOUR AB - A large number of cDNA inserts were sequenced from a high-quality library of chicken bursal lymphocyte cDNAs. Comparisons to public gene databases indicate that the cDNA collection represents more than 2,000 new, full-length transcripts. This resource defines the structure and the coding potential of a large fraction of B-cell specific and housekeeping genes whose function can be analyzed by disruption in the chicken DT40 B-cell line. AU - Caldwell, R.B. AU - Kierzek, A.M.* AU - Arakawa, H. AU - Bezzubov, Y. AU - Zaim, J.* AU - Fiedler, P. AU - Kutter, S. AU - Blagodatski, A. AU - Kostovska, D. AU - Koter, M. AU - Plachy, J.* C1 - 5216 C2 - 22379 SP - 6-6.9 TI - Full-length cDNAs from chicken bursal lymphocytes to facilitate gene function analysis. JO - Genome Biol. VL - 6 PY - 2004 SN - 1474-760X ER - TY - JOUR AB - In animals, steroid hormones regulate gene expression by binding to nuclear receptors. Plants lack genes for nuclear receptors, yet genetic evidence from Arabidopsis suggests developmental roles for lipids/sterols analogous to those in animals. 
In contrast to nuclear receptors, the lipid/sterol-binding StAR-related lipid transfer (START) protein domains are conserved, making them candidates for involvement in both animal and plant lipid/sterol signal transduction. AU - Schrick, K.* AU - Nguyen, D.* AU - Karlowski, W.M. AU - Mayer, K.F.X. C1 - 1599 C2 - 22224 SP - R41-R41-16 TI - START lipid/sterol-binding domains are amplified in plants and are predominantly associated with homeodomain transcription factors. JO - Genome Biol. VL - 5 PY - 2004 SN - 1474-760X ER - TY - JOUR AB - In computational analysis, the RING-finger domain is one of the most frequently detected domains in the Arabidopsis proteome. In fact, it is more abundant in Arabidopsis than in other eukaryotic genomes. However, computational analysis might classify ambiguous domains of the closely related PHD and LIM motifs as RING domains by mistake. Thus, we set out to define an ordered set of Arabidopsis RING domains by evaluating predicted domains on the basis of recent structural data. RESULTS: Inspection of the proteome with a current InterPro release predicts 446 RING domains. We evaluated each detected domain and as a result eliminated 59 false positives. The remaining 387 domains were grouped by cluster analysis and according to their metal-ligand arrangement. We further defined novel patterns for additional computational analyses of the proteome. They were based on recent structural data that enable discrimination between the related RING, PHD and LIM domains. These patterns allow us to predict with different degrees of certainty whether a particular domain is indeed likely to form a RING finger. CONCLUSIONS: AU - Kosarev, P. AU - Mayer, K.F.X. AU - Hardtke, C.S.* C1 - 22363 C2 - 21241 TI - Evaluation and classification of RING-finger domains encoded by the Arabidopsis genome. JO - Genome Biol. 
VL - 3 PY - 2002 SN - 1474-760X ER - TY - JOUR AB - In microarray data analysis, the comparison of gene-expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large datasets. Less work has been published concerning the assessment of the reliability of gene-selection procedures. Here we describe a method to assess reliability in multivariate microarray data analysis using permutation-validated principal components analysis (PCA). The approach is designed for microarray data with a group structure. RESULTS: We used PCA to detect the major sources of variance underlying the hybridization conditions followed by gene selection based on PCA-derived and permutation-based test statistics. We validated our method by applying it to well characterized yeast cell-cycle data and to two datasets from our laboratory. We could describe the major sources of variance, select informative genes and visualize the relationship of genes and arrays. We observed differences in the level of the explained variance and the interpretability of the selected genes. CONCLUSIONS: Combining data visualization and permutation-based gene selection, permutation-validated PCA enables one to illustrate gene-expression variance between several conditions and to select genes by taking into account the relationship of between-group to within-group variance of genes. The method can be used to extract the leading sources of variance from microarray data, to visualize relationships between genes and hybridizations and to select informative genes in a statistically reliable manner. This selection accounts for the level of reproducibility of replicates or group structure as well as gene-specific scatter. Visualization of the data can support a straightforward biological interpretation. AU - Landgrebe, J.* AU - Wurst, W. AU - Welzl, G. 
C1 - 10159 C2 - 20774 SP - 3-11 TI - Permutation-validated principal components analysis of microarray data. JO - Genome Biol. VL - 3 PY - 2002 SN - 1474-760X ER -