PuSH - Publication Server of Helmholtz Zentrum München

Super paramagnetic clustering of protein sequences.

BMC Bioinformatics 6:82 (2005)
Open Access Gold
Creative Commons license
BACKGROUND: Detection of sequence homologues is a challenging task that is important for the discovery of protein families and the reliable application of automatic annotation methods. The presence of domains in protein families of diverse function, together with the inhomogeneity and differing sizes of protein families, creates considerable difficulties for published clustering methods.

RESULTS: Our work analyses the Super Paramagnetic Clustering (SPC) algorithm and its extension, the global SPC (gSPC) algorithm. These algorithms cluster input data using a method analogous to the treatment of an inhomogeneous ferromagnet in physics. For the SwissProt and SCOP databases we show that gSPC improves the specificity and sensitivity of clustering over the original SPC and the Markov Cluster algorithm (TRIBE-MCL) by up to 30%. The three algorithms gave similar results for the MIPS FunCat 1.3 annotation of four bacterial genomes: Bacillus subtilis, Helicobacter pylori, Listeria innocua and Listeria monocytogenes. However, gSPC covered about 12% more sequences than the other methods. The SPC algorithm was programmed in house in C++ and is available at http://mips.gsf.de/proj/spc. The FunCat annotation is available at http://mips.gsf.de.

CONCLUSION: The gSPC algorithm achieved higher accuracy or covered a larger number of sequences than the TRIBE-MCL algorithm. It is thus a useful approach for automatic detection of protein families and unsupervised annotation of full genomes.
Impact Factor 5.423
Scopus SNIP 0.000
Web of Science Times Cited 34
Scopus Cited By 41
Publication type Article: Journal article
Document type Scientific Article
Keywords NEURAL-NETWORK; GENE-EXPRESSION; WHOLE GENOMES; YEAST GENOME; DATABASE; CLASSIFICATION; ANNOTATION; ALGORITHM; FAMILIES; GENERATION
Language English
Publication Year 2005
HGF-reported in Year 0
ISSN (print) / ISBN 1471-2105
e-ISSN 1471-2105
Source details Volume: 6, Article Number: 82
Publisher BioMed Central
Reviewing status Peer reviewed
POF-Topic(s) 30505 - New Technologies for Biomedical Discoveries
Research field(s) Enabling and Novel Technologies
PSP Element(s) G-503700-001
PubMed ID 15804359
Scopus ID 25444458854
Date of entry 2005-12-31