PEAKS Blog: Common pitfalls of FDR estimation part three

Friday, April 26, 2013

Common pitfalls of FDR estimation part three

The third pitfall is also caused by an over-emphasis on sensitivity.

Another trend in database search software is to re-score the peptide identification results using machine learning. The idea is straightforward: after the search, we know which hits are decoys. The algorithm can take advantage of this and retrain the parameters of the scoring function to get rid of the decoy hits. In doing so, it will get rid of many of the false target hits as well.
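As a rough illustration of the re-scoring idea, here is a minimal sketch in Python. It assumes each hit has two hypothetical features that are combined as `feature_a + weight * feature_b`, and it picks the weight that accepts the most target hits at a given decoy-based FDR. The function names, the features, and the simple grid search are assumptions made for this example; real re-scoring tools use richer models.

```python
def count_targets_at_fdr(psms, weight, max_fdr=0.01):
    """Count target hits accepted at the deepest score cutoff whose
    decoy/target ratio stays at or below max_fdr.

    psms: list of (feature_a, feature_b, is_decoy) tuples (hypothetical features).
    """
    # Re-score every hit with the candidate weight and sort best-first.
    scored = sorted(
        ((a + weight * b, is_decoy) for a, b, is_decoy in psms),
        key=lambda item: item[0],
        reverse=True,
    )
    targets = decoys = best = 0
    for _, is_decoy in scored:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # The decoy count stands in for the unknown number of false target hits.
        if targets > 0 and decoys / targets <= max_fdr:
            best = targets
    return best


def retrain_weight(psms, candidate_weights, max_fdr=0.01):
    """Pick the weight that pushes the most decoys below the cutoff.

    Note that this uses the decoy labels of the very dataset being scored,
    which is exactly what opens the door to over-fitting.
    """
    return max(candidate_weights,
               key=lambda w: count_targets_at_fdr(psms, w, max_fdr))


# Tiny made-up dataset: (feature_a, feature_b, is_decoy)
toy_psms = [
    (5.0, 1.0, False),
    (4.0, 1.0, False),
    (3.0, 0.0, True),
    (2.0, 2.0, False),
    (1.0, 0.0, True),
]
best_weight = retrain_weight(toy_psms, [0.0, 1.0])
print(best_weight, count_targets_at_fdr(toy_psms, best_weight))
```

On this toy data, the retrained weight moves one more target hit above all the decoys. Nothing in the procedure can tell whether that extra hit is a true or a false target.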

The method is valid, except that it may cause FDR underestimation. The false target hits are unknown to the machine learning algorithm, so there is a risk that it removes a larger fraction of the decoy hits than of the false target hits. When that happens, the remaining decoys under-count the remaining false targets.
This over-fitting risk is well known in machine learning. A machine learning expert can reduce the risk but can never eliminate it.
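The arithmetic behind the underestimation can be sketched with made-up numbers. The counts and the removal rates below (90% of decoys, 60% of false targets) are assumptions chosen only to show the effect, not measurements from any real search.

```python
def fdr(n_false, n_target_hits):
    """Fraction of accepted target hits that are false (0.0 if none accepted)."""
    return n_false / n_target_hits if n_target_hits else 0.0


# Hypothetical counts above the score threshold, before re-scoring.
true_targets, false_targets, decoys = 900, 100, 100

# Before re-scoring, the decoys mirror the false targets one-to-one.
est_before = fdr(decoys, true_targets + false_targets)          # decoy-based estimate
real_before = fdr(false_targets, true_targets + false_targets)  # unknowable in practice

# Assumed over-fit: re-scoring removes 90% of decoys but only 60% of false targets.
decoys_after = decoys * 10 // 100                # 10 decoys remain
false_targets_after = false_targets * 40 // 100  # 40 false targets remain
targets_after = true_targets + false_targets_after

est_after = fdr(decoys_after, targets_after)
real_after = fdr(false_targets_after, targets_after)

print(f"before: estimated {est_before:.3f}, actual {real_before:.3f}")
print(f"after:  estimated {est_after:.3f}, actual {real_after:.3f}")
```

With these assumed numbers the reported FDR drops to about 1.1% while the actual FDR is about 4.3%. The result looks cleaner than it is.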

The solution to pitfall number three is trickier.

The first suggestion: don’t use it. The philosophy here is that judges cannot be players. If we want to use the decoys for result validation, the decoy information should never be released to the search algorithm.

If this re-scoring method must be used because of the low performance of some database search software, it should be used only for very large datasets, to reduce the risk of over-fitting.

Perhaps the best solution is the third one: retrain the score parameters for each instrument type, instead of for each dataset. This gains much of the benefit provided by machine learning, but without the problem of over-fitting. Indeed, this third approach is what we do in the PEAKS DB algorithm.
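To make the contrast with per-dataset retraining concrete, here is a minimal sketch of scoring with parameters fixed per instrument type. It is not the PEAKS DB implementation. The instrument keys, feature names, and weight values are all hypothetical, and the weights are assumed to have been trained once on separate reference data.

```python
# Hypothetical weights, assumed to have been trained once per instrument type
# on separate reference data, never on the dataset being searched.
PRETRAINED_WEIGHTS = {
    "orbitrap": {"match_score": 2.0, "mass_error_ppm": -0.5},
    "ion_trap": {"match_score": 1.5, "mass_error_ppm": -0.1},
}


def rescore(features, instrument_type):
    """Score one hit with the fixed weights for its instrument type.

    The decoy labels of the current dataset are not an input here, so the
    decoys remain an independent judge for FDR estimation.
    """
    weights = PRETRAINED_WEIGHTS[instrument_type]
    return sum(w * features[name] for name, w in weights.items())


example_hit = {"match_score": 10.0, "mass_error_ppm": 4.0}
print(rescore(example_hit, "orbitrap"))
```

The design point is that `rescore` has no access to which hits in the current search are decoys, so it cannot learn to remove them selectively.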

*The content of this post is extracted from "Practical Guide to Significantly Improve Peptide Identification Sensitivity and Accuracy" by Dr. Bin Ma, CTO of Bioinformatics Solutions Inc. You can find the link to the guide on this page.
