What is a database and why do I need one for my sample?

One misconception about proteomics (at least bottom-up proteomics) is that we can identify your protein sequence directly from the data our mass spectrometers generate. Although this is technically possible using something called de novo sequencing, in practice it still does not work very well.

What we really do is match the MS/MS patterns we generate in the mass spectrometer against theoretical MS/MS patterns calculated from existing sequences. These sequences are usually stored in a FASTA-formatted text file. People sometimes refer to this file as a database, which it really isn’t.
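To make the matching idea concrete, here is a minimal Python sketch. It is not what any particular search engine does internally, just an illustration under simplifying assumptions: peptides come from a plain tryptic digest (cleave after K or R, but not before P), only singly charged b and y ions are considered, there are no modifications, and the 0.02 m/z tolerance is an arbitrary choice. The function names (parse_fasta, digest_trypsin, fragment_mz, count_matches) are made up for this example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added to y ions
PROTON = 1.007276   # charge carrier for singly charged ions


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    entries, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


def digest_trypsin(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        at_end = i + 1 == len(sequence)
        if aa in "KR" and (at_end or sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def fragment_mz(peptide):
    """Return (b_ions, y_ions) as lists of singly charged theoretical m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, running = [], 0.0
    for m in masses[:-1]:            # b ions grow from the N-terminus
        running += m
        b_ions.append(running + PROTON)
    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):   # y ions grow from the C-terminus
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


def count_matches(observed_mz, theoretical_mz, tol=0.02):
    """Count theoretical peaks that have an observed peak within tol (m/z)."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mz)
        for theo in theoretical_mz
    )
```

In this sketch, every peptide from every sequence in the FASTA file would get a set of theoretical peaks, and the peptide whose peaks best explain the observed spectrum would be the candidate identification. That is why a protein missing from the FASTA file cannot be identified this way.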

So where do we get this FASTA file? There are a few good places to look.

There are a few others too

Another option is to generate your own transcriptome if you cannot find any available protein sequences.

Here is a good article that can get you started


I’ll add more to this section, hopefully soon.

Please contact me at 530-754-5298 if you can’t find a good database. We can usually find out whether one exists.


Category: Protein ID
