
A SEquence Globally Unique IDentifier (SEGUID) Proteome Database

 

 

There are numerous publicly available protein sequence databases containing millions of unique entries. However, each database uses its own identifiers for the same protein sequence. Although these databases normally list the aliases used at other sources, end users must invest substantial effort to bring the data together and keep it up to date. We propose the use of a unique sequence identifier (SEGUID) that is derived from the primary sequence itself and can be generated easily by any user. SEGUIDs are resilient to changes in public and private databases because they remain constant throughout the lifetime of a given protein sequence. The SEGUID Proteome Database (http://bioinformatics.anl.gov/seguid/) provides aliases for the annotated entries available from several public databases and can be downloaded or generated easily at remote sites. SEGUIDs have been used in our proteomics laboratory for years and have proved useful for integrating mass spectrometry results, two-dimensional gel electrophoresis data, and bioinformatics information. Because SEGUIDs are stable, predictions based on the primary sequence need to be calculated only once. On-line prediction servers could quickly generate the SEGUID for a submitted sequence and return the pre-calculated prediction results if they are available. We have generated around 500 different calculations for more than 2.5 million sequences, and the results are available on-line or from our FTP site (ftp://bioinformatics.anl.gov/seguid).
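As an illustration of how an identifier can be derived from the primary sequence alone, the sketch below follows the commonly cited SEGUID recipe: the SHA-1 digest of the upper-cased sequence, base64-encoded with the trailing "=" padding removed. That recipe is not stated on this page, so treat it as an assumption and check it against the identifiers in the downloadable database before relying on it. The `precalculated` store and the `lookup_or_compute` helper are hypothetical names, included only to show how a prediction server might reuse results keyed by SEGUID.

```python
import base64
import hashlib


def seguid(sequence):
    """Return a SEGUID-style identifier for a primary sequence.

    Assumed recipe: SHA-1 of the upper-cased sequence, base64-encoded,
    with the trailing '=' padding stripped.
    """
    digest = hashlib.sha1(sequence.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


# Hypothetical in-memory store of pre-calculated results, keyed by SEGUID.
precalculated = {}


def lookup_or_compute(sequence, compute):
    """Return (seguid, result), calling `compute` only for unseen sequences."""
    key = seguid(sequence)
    if key not in precalculated:
        # First time this sequence is seen: run the calculation and keep it.
        precalculated[key] = compute(sequence)
    return key, precalculated[key]
```

Because the key depends only on the sequence, two submissions of the same protein map to the same entry regardless of which database accession they came from, so a call such as `lookup_or_compute(seq, predictor)` runs the hypothetical `predictor` only once per distinct sequence.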

 


Operated by the University of Chicago for the U.S. Department of Energy
This is a Federal computer (see Security Notice). For conditions of use, see ANL Disclaimer.
