From: "Simon"
Date: Thu, 13 Jul 2000 00:04:48 +0200
Subject: HELP!!
Organization: TIN

I am an Italian chemical researcher, and I study proteins and peptides with LC-MS/MS. I have an LCQ Deca instrument and recently bought the SEQUEST program, which identifies proteins by analyzing the MS/MS spectra of peptides from protein digests. I have a problem: I need to obtain protein databases in FASTA format to use this program. Do you know where I can find these databases on the Internet?

Thank you very much for your help.

******************************************************************************
From: phains@NOSPAMproteome.org.au (Peter Hains)
Date: 13 Jul 2000 00:03:06 GMT
Subject: Re: HELP!!
Organization: Australian Proteome Analysis Facility

You can obtain a number of databases in FASTA format; it depends on which database you want.

For SWISS-PROT and TREMBL, go to http://expasy.proteome.org.au/sprot/ and follow the instructions for downloading the database. You have to look for the FASTA-format database; the links take you to the "SWISS-PROT" format, not the FASTA format.

For NCBI, go to http://prospector.ucsf.edu/ucsfhtml3.2/instruct/allman.htm#database

For OWL: http://www.biochem.ucl.ac.uk/bsm/dbbrowser/OWL/gettingowl.html

For species-specific genomes, go to: http://www.tigr.org/

Hope this helps,
Peter.

--
I'm afraid I don't have a clever saying to put here.
Remove NOSPAM to e-mail me.

Peter Hains (PhD)                        Ph. +61 2 9850 6216
Australian Proteome Analysis Facility    Fax. +61 2 9850 6200
Level 4 Building F7B
Macquarie University, Sydney 2109

******************************************************************************
From: "MSweeney"
Date: Thu, 13 Jul 2000 09:45:40 -0700
Subject: Re: HELP!!
Organization: *

Also... although SEQUEST comes with documentation, I find http://fields.scripps.edu/sequest/index.html useful as well.
Many people use SEQUEST without considering the database issues. For example, if you are searching human MS/MS data, there may be little reason to search against a database with prokaryotic sequences. Most people I know use the NCBI nr database (it is big), although the EST database is also popular for human data. Search times are much faster with smaller databases, and so if you don't have to search against it... The bottom line is that you should know what the database contains, what the sample origin is, and what the analytical goals are.

Recently on the ABRF newsgroup, Len Packman described a method for taking LCQ dta files and searching them over the web at http://www.matrixscience.com/cgi/index.pl?page=../home.html
Go to the www.abrf.com site and check the newsgroup postings from the last month or so for details (or write me). Search times using Mascot at Matrix Science are really fast. Len has a batch file listing in one of his posts that allows a large set of dta files (generated from a big LC/MS/MS triple play run) to be combined so that they can be searched over the web. Running Mascot this way gives you less control over the databases you search than you get from SEQUEST running locally.

################################################################
I include a Q&A from NCBI's newsletter
http://www.ncbi.nlm.nih.gov/About/newsletter.html
The issue here also applies to SEQUEST searches.
http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/winter00.pdf

If I run a BLAST search against only the nr database, am I likely to miss anything important?

Yes. The BLAST htgs (High Throughput Genomic Sequence) database is excluded from nr and must be searched separately; the same is true of the BLAST EST and GSS databases. The Microbial Genomes: Finished and Unfinished link on the main BLAST page provides access to data on 68 finished and unfinished microbial genomes, which are also not contained in the nr database.
Researchers interested in BLASTing against human contig data should access these data by using the Human Genome BLAST link from the main BLAST page, rather than searching nr.

******************************************************************************
From: David Stranz
Date: Thu, 13 Jul 2000 12:41:59 -0700
Subject: Re: HELP!!
Organization: Sierra Analytics, LLC

There are a number of protein databases available in FASTA format. The Human Genome Mapping Project in the UK has copies of most of them:

http://www.hgmp.mrc.ac.uk/Bioinformatics/Databases/

The Swiss-Prot non-redundant protein databases are available in FASTA format via anonymous FTP from:

ftp://ftp.expasy.ch/databases/sp_tr_nrdb/fasta/

(Warning: the Swiss-Prot database, as a compressed file, is 21 MB to 59 MB, depending on which version you download. The European Bioinformatics Institute requires a license fee for commercial use of these databases.)

Hope this helps.

Regards,
David Stranz
Sierra Analytics, LLC

******************************************************************************
From: "M Sweeney - MSMS Consulting"
Date: Mon, 07 Aug 2000 20:45:28 GMT
Subject: Re: HELP!! (fasta db locations)
Organization: EarthLink Inc. -- http://www.EarthLink.net

This site lists many:

http://gimr.garvan.unsw.edu.au/public/corthals

Pick the tiny "fasta db" link (it takes you to the bottom of the page).

The site also has a chapter or two from a good recent book in this area:

Corthals, G.L., Gygi, S.P., Aebersold, R. and Patterson, S.D., Identification of proteins by mass spectrometry, in Proteome research: 2D gel electrophoresis and detection methods, Ed. Rabilloud, T., Springer, New York, 1999, pp. 197-231

at http://gimr.garvan.unsw.edu.au/public/corthals/book/IPMS.html

Matt Sweeney
mattsweeney@earthlink.net
Mass Spec Consulting
Training/Operations/Consulting/Method Development
LC/MS Pharmacokinetics, Peptides, Proteins, Metabolism, Maintenance Classes,
Specialist in Finnigan Equipment and Software
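
******************************************************************************
The thread notes that search times are much faster with smaller databases, and that there may be little reason to search human MS/MS data against prokaryotic sequences. A minimal Python sketch of that idea follows: it cuts a large FASTA file down to the records for one organism before searching. It assumes the organism name appears somewhere in each header line, which varies from database to database, so check a few headers first. The function names (filter_fasta, write_fasta) and the two example records are made up for illustration; they are not part of SEQUEST or of any database distribution.

```python
import io


def filter_fasta(lines, keyword):
    """Yield (header, sequence) for FASTA records whose header line
    contains `keyword` (case-insensitive match)."""
    key = keyword.lower()
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one if it matched.
            if header is not None and key in header.lower():
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None and line:
            seq.append(line)
    # Emit the final record, which has no following ">" line.
    if header is not None and key in header.lower():
        yield header, "".join(seq)


def write_fasta(records, out, width=60):
    """Write (header, sequence) pairs as FASTA, wrapping sequence lines."""
    for header, seq in records:
        out.write(">" + header + "\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")


# Invented example records, only to show the input and output shape.
example = io.StringIO(
    ">TEST1 hypothetical protein [Homo sapiens]\n"
    "MKTAYIAKQR\n"
    "QISFVKSHFS\n"
    ">TEST2 hypothetical protein [Escherichia coli]\n"
    "MSDNELKQLA\n"
)

human = list(filter_fasta(example, "Homo sapiens"))
subset = io.StringIO()
write_fasta(human, subset)
print(subset.getvalue(), end="")
```

For a real database, pass open file handles in place of the io.StringIO objects, for example filter_fasta(open("nr.fasta"), "Homo sapiens"), where nr.fasta stands for whatever file you downloaded. A plain substring match is crude: it will miss records that label the organism differently, so the subset is only as good as the headers.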