Monday, March 18, 2013

Decoy Fusion on traditional target + decoy database

A PEAKS user asked us a question today about FDR result validation. He used PEAKS DB for peptide identification and enabled the built-in decoy fusion method to estimate the FDR. When examining the result, he realized that the FASTA database used for the search was already a concatenation of target and decoy proteins. His question: is the FDR control still valid, or does he have to re-run the search?

The decoy fusion method concatenates the decoy and target sequences of the same protein together as a "fused" sequence (a detailed explanation can be found here). This ensures that the target and decoy lengths are always the same. If the decoy length in the searched database is already the same as the target length, then PEAKS DB with decoy fusion in effect searches three times as much decoy sequence as target sequence: the decoys already in the database, plus a fused decoy for every database entry, target and decoy alike.

As long as the decoy proteins in the searched database are distinguishable from the target proteins, the user can simply discard the hits to them. The FDR reported by PEAKS is still safe to use, because the extra decoy sequence only makes it more conservative.
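
To see why the estimate only gets more conservative, here is a rough back-of-the-envelope sketch in Python. It is not how PEAKS computes FDR internally. It assumes that false hits land uniformly across the searched sequence, that the FDR is estimated as a simple ratio of decoy hits to target hits at a fixed score threshold, and that the database decoys are as long as the targets. The function name and the numbers are made up for illustration.

```python
def fdr_estimates(true_hits, false_per_unit):
    """Compare the reported FDR with the actual FDR after discarding
    hits to the decoys that were already in the FASTA database.

    Assumptions (illustrative only):
      - the target part of the database has length 1 unit, and the
        database decoys also have length 1 unit;
      - decoy fusion treats every database entry as a "target" and
        appends a fused decoy of equal length, so 2 units in total;
      - false hits are spread evenly, false_per_unit per unit of sequence.
    """
    false_in_targets = false_per_unit       # false hits to real targets
    false_in_db_decoys = false_per_unit     # false hits to database decoys
    false_in_fused = 2 * false_per_unit     # hits to the fused decoys

    # What the search reports: fused-decoy hits over everything it
    # regards as a target hit, database decoys included.
    reported = false_in_fused / (true_hits + false_in_targets + false_in_db_decoys)

    # What is actually left once the user discards database-decoy hits.
    actual = false_in_targets / (true_hits + false_in_targets)
    return reported, actual


reported, actual = fdr_estimates(true_hits=1000, false_per_unit=10)
print(f"reported FDR: {reported:.4f}, actual FDR after discarding: {actual:.4f}")
```

Under these assumptions the reported FDR is always at least as large as the actual one, which is the sense in which the result stays safe.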

