TY - JOUR
AB - BACKGROUND: Comparative genomics, genetic spread analysis, and context-aware ranking are crucial in understanding microbial dynamics' impact on public health. gSpreadComp streamlines the path from in silico analysis to hypothesis generation. By integrating comparative genomics, genome annotation, normalization, plasmid-mediated gene transfer, and microbial resistance-virulence risk-ranking into a unified workflow, gSpreadComp facilitates hypothesis generation from complex microbial datasets. FINDINGS: The gSpreadComp workflow works through 6 modular steps: taxonomy assignment, genome quality estimation, antimicrobial resistance (AMR) gene annotation, plasmid/chromosome classification, virulence factor annotation, and downstream analysis. Our workflow calculates gene spread using normalized weighted average prevalence and ranks potential resistance-virulence risk by integrating microbial resistance, virulence, and plasmid transmissibility data and producing an HTML report. As a use case, we analyzed 3,566 metagenome-assembled genomes recovered from human gut microbiomes across diets. Our findings indicated consistent AMR across diets, with diet-specific resistance patterns, such as increased bacitracin in vegans and tetracycline in omnivores. Notably, ketogenic diets showed a slightly higher resistance-virulence rank, while vegan and vegetarian diets encompassed more plasmid-mediated gene transfer. CONCLUSIONS: The gSpreadComp workflow aims to facilitate hypothesis generation for targeted experimental validations by identifying concerning resistance hotspots in complex microbial datasets. Our study calls for a more thorough investigation of the critical role of diet in microbial community dynamics and the spread of AMR. This research underscores the importance of integrating genomic data into public health strategies to combat AMR. The gSpreadComp workflow is available at https://github.com/mdsufz/gSpreadComp/.
AU - Kasmanas, J.C.*
AU - Magnúsdóttir, S.*
AU - Zhang, J.*
AU - Smalla, K.*
AU - Schloter, M.
AU - Stadler, P.F.*
AU - de Leon Ferreira de Carvalho, A.C.P.*
AU - Rocha, U.*
C1 - 75026
C2 - 57708
CY - Great Clarendon St, Oxford Ox2 6dp, England
TI - Integrating comparative genomics and risk classification by assessing virulence, antimicrobial resistance, and plasmid spread in microbial communities with gSpreadComp.
JO - GigaScience
VL - 14
PB - Oxford Univ Press
PY - 2025
ER -
TY - JOUR
AB - BACKGROUND: The growing number of metabolomics studies, based on high-dimensional data measured by hyphenated mass spectrometry (MS) and/or nuclear magnetic resonance (NMR) spectroscopy, has sparked the creation of several public metabolomics data repositories. Each repository emphasizes different aspects regarding data selection and representation, but most offer only limited options for privacy-preserving data sharing. RESULTS: We present MetaboSERV, an open-source, browser-based metabolomics platform dedicated to the selection, integration, and sharing of quantitative metabolomics data and metadata with controlled data access. MetaboSERV aims to aid researchers in analyzing their results by facilitating means to browse, visualize, and compare data across available datasets. It provides different access control functionalities, creating an environment in which data can be shared safely in a privacy-preserving manner to support collaborative and interdisciplinary research. Furthermore, it is designed to be extensible and adaptable to existing data management infrastructures through the creation of self-managed MetaboSERV instances, for which we provide the source code and a set of configurable Docker images. CONCLUSIONS: The public MetaboSERV instance is available at https://metaboserv.ckdn.app, and the source code can be found at https://gitlab.gwdg.de/MedBioinf/metabolomics/metaboserv. The Research Resource Identifier (RRID) for MetaboSERV is SCR_025496.
AU - Tucholski, T.*
AU - Maennel, A.*
AU - Njipouombe Nsangou, Y.A.
AU - Schuchardt, S.*
AU - Gruber, M.*
AU - Kellermeier, F.*
AU - Dettmer, K.*
AU - Oefner, P.J.*
AU - Gronwald, W.*
AU - Altenbuchinger, M.*
AU - Dönitz, J.
AU - Zacharias, H.U.*
C1 - 75279
C2 - 57923
CY - Great Clarendon St, Oxford Ox2 6dp, England
TI - MetaboSERV-a platform for selecting, exchanging, and visualizing metabolomics data with controlled data access.
JO - GigaScience
VL - 14
PB - Oxford Univ Press
PY - 2025
ER -
TY - JOUR
AU - Tsepilov, Y.A.*
AU - Sharapov, S.Z.*
AU - Zaytseva, O.O.*
AU - Krumsiek, J.
AU - Prehn, C.
AU - Adamski, J.
AU - Kastenmüller, G.
AU - Wang-Sattler, R.
AU - Strauch, K.
AU - Gieger, C.
AU - Aulchenko, Y.S.*
C1 - 57739
C2 - 47883
CY - Great Clarendon St, Oxford Ox2 6dp, England
TI - A network-based conditional genetic association analysis of the human metabolome (vol 7, gij137, 2018).
JO - GigaScience
VL - 8
IS - 12
PB - Oxford Univ Press
PY - 2019
ER -
TY - JOUR
AB - Background: With the advent of the age of big data in bioinformatics, large volumes of data and high-performance computing power enable researchers to perform re-analyses of publicly available datasets at an unprecedented scale. Ever more studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts; however, its generic nature also enables the detection of microbial and viral transcripts. Findings: We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from six independent, controlled infection experiments of cell line models and compared them with an alternative metatranscriptomic mapping strategy.
We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from more than 17,000 samples from more than 400 studies relevant to human disease using state-of-the-art high-performance computing systems. The resulting data from this large-scale re-analysis are made available in the presented MetaMap resource. Conclusions: Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for hypothesis generation toward the role of the microbiome in human disease. Additionally, code to process new datasets and perform statistical analyses is made available.
AU - Simon, L.
AU - Karg, S.
AU - Westermann, A.J.*
AU - Engel, M.
AU - Elbehery, A.H.A.
AU - Hense, B.A.
AU - Heinig, M.
AU - Deng, L.
AU - Theis, F.J.
C1 - 53652
C2 - 44932
CY - Great Clarendon St, Oxford Ox2 6dp, England
SP - 1-8
TI - MetaMap: An atlas of metatranscriptomic reads in human disease-related RNA-seq data.
JO - GigaScience
VL - 7
IS - 6
PB - Oxford Univ Press
PY - 2018
ER -
TY - JOUR
AB - Background: Genome-wide association studies have identified hundreds of loci that influence a wide variety of complex human traits; however, little is known regarding the biological mechanism of action of these loci. The recent accumulation of functional genomics ("omics"), including metabolomics data, has created new opportunities for studying the functional role of specific changes in the genome. Functional genomic data are characterized by their high dimensionality, the presence of (strong) statistical dependency between traits, and, potentially, complex genetic control. Therefore, the analysis of such data requires specific statistical genetics methods.
Results: To facilitate our understanding of the genetic control of omics phenotypes, we propose a trait-centered, network-based conditional genetic association (cGAS) approach for identifying the direct effects of genetic variants on omics-based traits. For each trait of interest, we selected from a biological network a set of other traits to be used as covariates in the cGAS. The network can be reconstructed either from biological pathway databases (a mechanistic approach) or directly from the data, using a Gaussian graphical model applied to the metabolome (a data-driven approach). We derived mathematical expressions that allow comparison of the power of univariate analyses with conditional genetic association analyses. We then tested our approach using data from a population-based Cooperative Health Research in the region of Augsburg (KORA) study (n = 1,784 subjects, 1.7 million single-nucleotide polymorphisms) with measured data for 151 metabolites. Conclusions: We found that compared to single-trait analysis, performing a genetic association analysis that includes biologically relevant covariates can either gain or lose power, depending on specific pleiotropic scenarios, for which we provide empirical examples. In the context of analyzed metabolomics data, the mechanistic network approach had more power compared to the data-driven approach. Nevertheless, we believe that our analysis shows that neither a prior-knowledge-only approach nor a phenotypic-data-only approach is optimal, and we discuss possibilities for improvement.
AU - Tsepilov, Y.A.*
AU - Sharapov, S.Z.*
AU - Zaytseva, O.O.*
AU - Krumsiek, J.
AU - Prehn, C.
AU - Adamski, J.
AU - Kastenmüller, G.
AU - Wang-Sattler, R.
AU - Strauch, K.
AU - Gieger, C.
AU - Aulchenko, Y.S.*
C1 - 54852
C2 - 45835
TI - A network-based conditional genetic association analysis of the human metabolome.
JO - GigaScience
VL - 7
IS - 12
PY - 2018
ER -
TY - JOUR
AB - BACKGROUND: Three-dimensional (3D) imaging mass spectrometry (MS) is an analytical chemistry technique for the 3D molecular analysis of a tissue specimen, entire organ, or microbial colonies on an agar plate. 3D-imaging MS has unique advantages over existing 3D imaging techniques, offers novel perspectives for understanding the spatial organization of biological processes, and has growing potential to be introduced into routine use in both biology and medicine. Owing to the sheer quantity of data generated, the visualization, analysis, and interpretation of 3D imaging MS data remain a significant challenge. Bioinformatics research in this field is hampered by the lack of publicly available benchmark datasets needed to evaluate and compare algorithms. FINDINGS: High-quality 3D imaging MS datasets from different biological systems at several labs were acquired, supplied with overview images and scripts demonstrating how to read them, and deposited into MetaboLights, an open repository for metabolomics data. 3D imaging MS data were collected from five samples using two types of 3D imaging MS. 3D matrix-assisted laser desorption/ionization imaging (MALDI) MS data were collected from murine pancreas, murine kidney, human oral squamous cell carcinoma, and interacting microbial colonies cultured in Petri dishes. 3D desorption electrospray ionization (DESI) imaging MS data were collected from a human colorectal adenocarcinoma. CONCLUSIONS: With the aim of stimulating computational research in the field of 3D imaging MS, selected high-quality 3D imaging MS datasets are provided that could be used by algorithm developers as benchmark datasets.
AU - Oetjen, J.*
AU - Veselkov, K.*
AU - Watrous, J.*
AU - McKenzie, J.S.*
AU - Becker, M.*
AU - Hauberg-Lotte, L.*
AU - Kobarg, J.H.*
AU - Strittmatter, N.*
AU - Mróz, A.K.*
AU - Hoffmann, F.*
AU - Trede, D.*
AU - Palmer, A.*
AU - Schiffler, S.*
AU - Steinhorst, K.*
AU - Aichler, M.
AU - Goldin, R.*
AU - Guntinas-Lichius, O.*
AU - von Eggeling, F.*
AU - Thiele, H.*
AU - Maedler, K.*
AU - Walch, A.K.
AU - Maass, P.*
AU - Dorrestein, P.C.*
AU - Takats, Z.*
AU - Alexandrov, T.*
C1 - 44656
C2 - 36994
TI - Benchmark datasets for 3D MALDI- and DESI-imaging mass spectrometry.
JO - GigaScience
VL - 4
PY - 2015
ER -