Dieckmann, M.A.* ; Beyvers, S.* ; Nkouamedjo-Fankep, R.C.* ; Hanel, P. ; Jelonek, L.* ; Blom, J.* ; Goesmann, A.*
EDGAR3.0: Comparative genomics and phylogenomics on a scalable infrastructure.
Nucleic Acids Res. 49, W185-W192 (2021)
The EDGAR platform, a web server providing databases of precomputed orthology data for thousands of microbial genomes, is one of the most established tools in the field of comparative genomics and phylogenomics. Based on precomputed gene alignments, EDGAR allows quick identification of the differential gene content, i.e. the pan genome, the core genome, or singleton genes. Furthermore, EDGAR features a wide range of analyses and visualizations such as Venn diagrams, synteny plots, and phylogenetic trees, as well as Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI) matrices. During the last few years, the average number of genomes analyzed in an EDGAR project increased by two orders of magnitude. To handle this massive increase, a completely new technical backend infrastructure for the EDGAR platform was designed and launched as EDGAR3.0. For the calculation of new EDGAR3.0 projects, we are now using a scalable Kubernetes cluster running in a cloud environment. A new storage infrastructure was developed using a file-based high-performance storage backend, which ensures timely data handling and efficient access. The new data backend guarantees a memory-efficient calculation of orthologs, and parallelization has led to drastically reduced processing times. Based on the advanced technical infrastructure, new analysis features could be implemented, including POCP and FastANI genome similarity indices, UpSet intersecting set visualization, and circular genome plots. The public database section of EDGAR was also largely updated and now offers access to 24,317 genomes in 749 free-to-use projects. In summary, EDGAR3.0 provides a new, scalable infrastructure for comprehensive microbial comparative gene content analysis. The web server is accessible at http://edgar3.computational.bio.
Publication type
Article: Journal article
Document type
Scientific article
Keywords
Species Definition; Visualization; Package; Circos; Web
Language
English
Year of publication
2021
HGF reporting year
2021
ISSN (print) / ISBN
0305-1048
e-ISSN
1362-4962
Source details
Volume: 49
Issue: W1
Pages: W185-W192
Publisher
Oxford University Press
Place of publication
Great Clarendon St, Oxford OX2 6DP, England
Review status
Peer reviewed
POF Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP element(s)
G-503800-001
Funding
de.NBI cloud
Budgetary funds
German Federal Ministry of Education and Research within the de.NBI network
Date of record entry
2021-06-18