McMurry, J.A.* ; Juty, N.* ; Blomberg, N.* ; Burdett, T.* ; Conlin, T.* ; Conte, N.* ; Courtot, M.* ; Deck, J.* ; Dumontier, M.* ; Fellows, D.K.* ; Gonzalez-Beltran, A.* ; Gormanns, P. ; Grethe, J.* ; Hastings, J.* ; Hériché, J.K.* ; Hermjakob, H.* ; Ison, J.C.* ; Jimenez, R.C.* ; Jupp, S.* ; Kunze, J.* ; Laibe, C.* ; Le Novère, N.* ; Malone, J.* ; Martin, M.J.* ; McEntyre, J.R.* ; Morris, C.* ; Muilu, J.* ; Müller, W.* ; Rocca-Serra, P.* ; Sansone, S.A.* ; Sariyar, M.* ; Snoep, J.L.* ; Soiland-Reyes, S.* ; Stanford, N.J.* ; Swainston, N.* ; Washington, N.* ; Williams, A.R.* ; Wimalaratne, S.M.* ; Winfree, L.M.* ; Wolstencroft, K.* ; Goble, C.* ; Mungall, C.J.* ; Haendel, M.A.* ; Parkinson, H.*
Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data.
PLoS Biol. 15:e2001414 (2017)
In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.
Publication type
Article: Journal article
Document type
Scientific Article
Keywords
Gene Name Errors; Ontologies; Community
Language
English
Publication Year
2017
HGF-reported in Year
2017
ISSN (print) / ISBN
1544-9173
e-ISSN
1545-7885
Source details
Volume: 15
Issue: 6
Article Number: e2001414
Publisher
Public Library of Science (PLoS)
Publishing Place
San Francisco
Reviewing status
Peer reviewed
POF-Topic(s)
30201 - Metabolic Health
Research field(s)
Genetics and Epidemiology
PSP Element(s)
G-500691-001
Date of record entry
2017-07-26