Friday, April 26, 2013

Common pitfalls of FDR estimation part three

The third pitfall is also caused by an over-emphasis on sensitivity.

Another trend in database search software is to re-score the peptide identification results using machine learning. The idea is straightforward: after the search, we know which hits are decoys. The algorithm can take advantage of this and retrain the parameters of the scoring function to get rid of the decoy hits. In doing so, it also gets rid of many of the false target hits.

The method is valid, except that it may cause FDR underestimation. The machine learning algorithm knows which hits are decoys, but it does not know which target hits are false. There is therefore a risk that it removes more decoy hits than false target hits, so that the surviving decoys under-count the surviving false targets.
This over-fitting risk is well known in machine learning. A machine learning expert can reduce the risk but can never eliminate it.
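
The effect can be reproduced in a small simulation. The sketch below is a toy illustration only, not the re-scoring method of any real search engine. It assumes that each hit has one informative feature (standing in for the original search score) plus many pure-noise features, that decoys and false targets are drawn from the same distribution, and that the "retraining" is a simple centroid-difference linear scorer. All names and numbers are hypothetical.

```python
import random


def simulate(n=400, noise_dims=400, seed=0):
    """Re-score one dataset with weights fitted on that dataset's own decoys."""
    rng = random.Random(seed)

    def hit(search_score_mean):
        # Feature 0 mimics the original search score; the rest are pure noise.
        return [rng.gauss(search_score_mean, 1.0)] + [
            rng.gauss(0.0, 1.0) for _ in range(noise_dims)
        ]

    true_targets = [hit(3.0) for _ in range(n)]
    false_targets = [hit(0.0) for _ in range(n)]  # labels unknown to the learner
    decoys = [hit(0.0) for _ in range(n)]         # same distribution as false targets

    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    # Toy "retraining": weights point from the decoy centroid to the target centroid.
    mu_target = centroid(true_targets + false_targets)
    mu_decoy = centroid(decoys)
    weights = [t - d for t, d in zip(mu_target, mu_decoy)]

    def rescore(rows):
        return [sum(w * x for w, x in zip(weights, row)) for row in rows]

    return {
        "true": rescore(true_targets),
        "false": rescore(false_targets),
        "decoy": rescore(decoys),
    }


def fdr_at(threshold, scores):
    """Return (decoy-estimated FDR, actual FDR) among targets scoring >= threshold."""
    n_true = sum(s >= threshold for s in scores["true"])
    n_false = sum(s >= threshold for s in scores["false"])
    n_decoy = sum(s >= threshold for s in scores["decoy"])
    n_target = max(1, n_true + n_false)
    return n_decoy / n_target, n_false / n_target


if __name__ == "__main__":
    estimated, actual = fdr_at(2.5, simulate())
    print(f"estimated FDR {estimated:.3f} vs actual FDR {actual:.3f}")
```

Because the weights were fitted against these specific decoys, the noise features push the decoys down while the false targets, which the learner treated as targets, are pushed slightly up. The decoy count above the threshold then falls short of the false-target count, which is exactly the underestimation described above.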

The solution to pitfall number 3 is trickier.

The first suggestion: don’t use it. The philosophy here is that judges cannot be players. If we want to use the decoys for result validation, the decoy information should never be released to the search algorithm.

If this re-scoring method must be used because of the low performance of some database search software, it should only be used on very large datasets, to reduce the risk of over-fitting.

Perhaps the best solution is the third one. That is, the retraining of the score parameters should be done once for each instrument type, instead of for each dataset. This keeps much of the benefit provided by machine learning, but without the problem of over-fitting to the decoys of the dataset being validated. Indeed, this third approach is what we do in the PEAKS DB algorithm.
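
The difference between the two training regimes can be sketched with the same kind of toy model. This is not the PEAKS DB implementation. It assumes a hypothetical "calibration" run from the same instrument type, the same synthetic features as before (one informative score plus pure noise), and a simple centroid-difference scorer. The quantity compared is how far the re-scored decoys drift below the re-scored false targets; a gap near zero means the decoys still represent the false targets fairly.

```python
import random


def make_dataset(rng, n=400, noise_dims=400):
    """Synthetic hits: feature 0 is informative, the rest are pure noise."""
    def hit(search_score_mean):
        return [rng.gauss(search_score_mean, 1.0)] + [
            rng.gauss(0.0, 1.0) for _ in range(noise_dims)
        ]
    return {
        "true": [hit(3.0) for _ in range(n)],
        "false": [hit(0.0) for _ in range(n)],
        "decoy": [hit(0.0) for _ in range(n)],
    }


def fit_weights(data):
    """Toy retraining: direction from the decoy centroid to the target centroid."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    mu_target = centroid(data["true"] + data["false"])
    mu_decoy = centroid(data["decoy"])
    return [t - d for t, d in zip(mu_target, mu_decoy)]


def decoy_gap(weights, data):
    """Mean re-scored false-target score minus mean re-scored decoy score."""
    def mean_score(rows):
        return sum(sum(w * x for w, x in zip(weights, row)) for row in rows) / len(rows)
    return mean_score(data["false"]) - mean_score(data["decoy"])


def compare(seed=0):
    rng = random.Random(seed)
    calibration = make_dataset(rng)  # earlier run, same (hypothetical) instrument type
    new_run = make_dataset(rng)      # the dataset whose FDR we want to estimate
    per_dataset = decoy_gap(fit_weights(new_run), new_run)
    per_instrument = decoy_gap(fit_weights(calibration), new_run)
    return per_dataset, per_instrument


if __name__ == "__main__":
    per_dataset, per_instrument = compare()
    print(f"gap with per-dataset retraining:    {per_dataset:.2f}")
    print(f"gap with per-instrument parameters: {per_instrument:.2f}")
```

With per-dataset retraining the gap is clearly positive, because the scorer has seen the very decoys that are later used as judges. With parameters fixed in advance, the decoys of the new run were never shown to the scorer, so the gap stays close to zero in this toy setting.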

*The content of this post is extracted from "Practical Guide to Significantly Improve Peptide Identification Sensitivity and Accuracy" by Dr. Bin Ma, CTO of Bioinformatics Solutions Inc. You can find the link to the guide on this page.

