Analysis of histone modifications with PEAKS 7
The complex nature of histone modification patterns has long posed a
challenge for bioinformatics analysis. Yuan et al. [1] benchmarked the performance of current proteomic search engines using two datasets from human HeLa
histone samples.
The article was published in J Proteome Res. on 2014 Aug 28 (PubMed),
and the two datasets, HCD_Histone and CID_Histone (PXD001118), were made publicly available through ProteomeXchange. Using these data, the article
compares and evaluates the performance and capability of eight proteomic search engines: pFind,
Mascot, SEQUEST, ProteinPilot, PEAKS 6, OMSSA, TPP and MaxQuant.
The study used PEAKS 6 for this comparison. However, PEAKS 7, released in November 2013, is
the latest version of the PEAKS Studio software. PEAKS 7 offers not only
better performance than PEAKS 6 but also many additional and improved
features. Our team reanalyzed the two datasets, HCD_Histone and CID_Histone, with
PEAKS 7 to update the ID results presented in the publication by Yuan et al.
The updated results show that, with PEAKS 7 in place of PEAKS 6, it is PEAKS, pFind and Mascot
that identify the most confident results.
Proportion of Confident IDs
As indicated in the article, each search engine examined the two HeLa
histone datasets using the same database
search parameters. Seven variable-modification settings for histones were used in the
study; they are reproduced in Table 1 below.
Table 1. Modification parameters for database search
Fixed modification | Propionyl[Peptide N-term]/+56.02
Variable modification |
First (un) | Propionyl[K]/+56.026
Second (ac) | Propionyl[K]/+56.026; Acetyl[K]/+42.011
Third (me) | Propionyl[K]/+56.026; Methyl_Propionyl[K]/+70.042
Fourth (di) | Propionyl[K]/+56.026; Dimethyl[K]/+28.031
Fifth (tr) | Propionyl[K]/+56.026; Trimethyl[K]/+42.047
Sixth (ph) | Propionyl[K]/+56.026; Phospho[ST]/+79.966
Seventh (co) | Propionyl[K]/+56.026; Acetyl[K]/+42.011; Methyl_Propionyl[K]/+70.042; Dimethyl[K]/+28.031; Trimethyl[K]/+42.047; Phospho[ST]/+79.966
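The seven variable-modification settings in Table 1 can be written down as a small data structure, which makes it easy to check that the seventh (co) search is simply the union of the other six. The sketch below is illustrative only: the mass shifts are copied from Table 1, but the variable names, the dictionary layout and the helper functions are assumptions made for this example and are not a PEAKS configuration format.

```python
# Mass shifts (Da) copied from Table 1. The names and layout here are
# hypothetical and chosen only for illustration.
FIXED = ("Propionyl[Peptide N-term]", 56.02)

PROPIONYL_K = ("Propionyl[K]", 56.026)
ACETYL_K = ("Acetyl[K]", 42.011)
METHYL_PROPIONYL_K = ("Methyl_Propionyl[K]", 70.042)
DIMETHYL_K = ("Dimethyl[K]", 28.031)
TRIMETHYL_K = ("Trimethyl[K]", 42.047)
PHOSPHO_ST = ("Phospho[ST]", 79.966)

# One entry per search, keyed by the short labels used in Table 1.
SEARCHES = {
    "un": [PROPIONYL_K],
    "ac": [PROPIONYL_K, ACETYL_K],
    "me": [PROPIONYL_K, METHYL_PROPIONYL_K],
    "di": [PROPIONYL_K, DIMETHYL_K],
    "tr": [PROPIONYL_K, TRIMETHYL_K],
    "ph": [PROPIONYL_K, PHOSPHO_ST],
    "co": [PROPIONYL_K, ACETYL_K, METHYL_PROPIONYL_K,
           DIMETHYL_K, TRIMETHYL_K, PHOSPHO_ST],
}


def combined_is_union(searches, combined="co"):
    """Return True if the combined search lists exactly the modifications
    that appear in the other searches."""
    single = set()
    for name, mods in searches.items():
        if name != combined:
            single.update(mods)
    return single == set(searches[combined])


def mass_gap(mod_a, mod_b):
    """Absolute difference between two mass shifts, rounded to 3 decimals."""
    return round(abs(mod_a[1] - mod_b[1]), 3)


if __name__ == "__main__":
    print(combined_is_union(SEARCHES))
    print(mass_gap(ACETYL_K, TRIMETHYL_K))
```

With the Table 1 values, `mass_gap(ACETYL_K, TRIMETHYL_K)` comes to 0.036 Da, so the acetyl and trimethyl shifts listed in the table are very close to each other.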
When the data were run with PEAKS 7 using the same parameters, we created an updated comparison
of the IDs and confident IDs from the article by Yuan et al., shown in Figure
1. The comparison includes the results produced by the eight
search engines. IDs (shown as solid bars) are each search engine's identifications
with an FDR < 1%, whereas confident IDs (shown as striped bars) are the
IDs from each search engine that are also present in the ‘all_Confident’
group. The term ‘all_Confident’ denotes IDs that were found
by at least two of the eight search engines.
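The ‘all_Confident’ definition above amounts to a simple set operation, sketched below. The function names, the toy engine names and the peptide labels are hypothetical and carry no real results; the sketch also assumes that each engine's IDs have already been filtered to FDR < 1% and that IDs from different engines can be compared as plain strings.

```python
from collections import Counter


def all_confident(ids_by_engine, min_engines=2):
    """Return the IDs reported by at least `min_engines` search engines."""
    counts = Counter()
    for ids in ids_by_engine.values():
        counts.update(set(ids))  # count each ID once per engine
    return {identifier for identifier, n in counts.items() if n >= min_engines}


def confident_counts(ids_by_engine, min_engines=2):
    """Map each engine to (number of IDs, number of confident IDs)."""
    consensus = all_confident(ids_by_engine, min_engines)
    return {
        engine: (len(set(ids)), len(set(ids) & consensus))
        for engine, ids in ids_by_engine.items()
    }


if __name__ == "__main__":
    # Toy data only: three made-up engines instead of the eight in the study.
    example = {
        "engine_A": {"pep1", "pep2", "pep3"},
        "engine_B": {"pep2", "pep3", "pep4"},
        "engine_C": {"pep3", "pep5"},
    }
    print(sorted(all_confident(example)))
    print(confident_counts(example))
```

In the toy data, `pep2` and `pep3` are each reported by at least two engines, so they form the consensus group; `engine_C` has two IDs, of which one is confident.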
Figure 1 (a-g). Comparison of the number of IDs and confident IDs for the seven variable modifications produced by the different search engines using HeLa histone HCD and CID data.
Panels show the number of IDs for each search: (a) first (un); (b) second (ac); (c) third (me); (d) fourth (di); (e) fifth (tr); (f) sixth (ph); and (g) seventh (co).
The graphs in Figure 1 show that PEAKS 7, along with pFind and Mascot, produces the most confident results of the search
engines evaluated in the study. This holds for six of the seven searches
(un, ac, di, tr, ph and co; for ph, PEAKS 7 tied with pFind and Mascot, and for co it tied for first with Mascot). The exception is the third search (me),
where pFind and Mascot found the most confident results.
Running Time
For this analysis, PEAKS 7 was run on a typical desktop computer with an i7 CPU and
16 GB of RAM. PEAKS 7 finished each of the first
six searches (un, ac, me, di, tr and
ph) in around 22 minutes for the HCD_Histone database search and 14
minutes for the CID_Histone database search. Compared with the 2h-7h reported in [1] for PEAKS
6, PEAKS 7 is much faster. For the seventh search, which involved
multiple PTMs (co), PEAKS 7 spent 30
minutes on the HCD_Histone database search and 14 minutes on the
CID_Histone database search.
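As a rough worked example, the per-search times above can be added up to estimate the total search time per dataset. The sketch below uses only the approximate minutes quoted in this section; the variable and function names are hypothetical. The speed-up figure additionally assumes that the 2h-7h range from [1] refers to individual PEAKS 6 searches and that the hardware was comparable, neither of which is stated here.

```python
# Approximate PEAKS 7 search times in minutes, taken from the text above.
SINGLE_PTM_SEARCHES = 6  # un, ac, me, di, tr, ph
MINUTES = {
    "HCD_Histone": {"single": 22, "combined": 30},
    "CID_Histone": {"single": 14, "combined": 14},
}

# Lower end of the 2h-7h range reported in [1] for PEAKS 6, in minutes.
PEAKS6_FASTEST_MINUTES = 2 * 60


def total_minutes(times, n_single=SINGLE_PTM_SEARCHES):
    """Total time for the six single-PTM searches plus the combined search."""
    return n_single * times["single"] + times["combined"]


def minimum_speedup(minutes_by_dataset, peaks6_fastest=PEAKS6_FASTEST_MINUTES):
    """Conservative ratio: fastest PEAKS 6 time over slowest PEAKS 7 search.

    Assumes the PEAKS 6 range is per search, which is an assumption.
    """
    slowest = max(max(t.values()) for t in minutes_by_dataset.values())
    return peaks6_fastest / slowest


if __name__ == "__main__":
    for dataset, times in MINUTES.items():
        print(dataset, total_minutes(times), "minutes")
    print("minimum speed-up:", minimum_speedup(MINUTES))
```

Under these assumptions, all seven searches take roughly 162 minutes for HCD_Histone and 98 minutes for CID_Histone, and even the slowest PEAKS 7 search (30 minutes) is about four times faster than the fastest time in the PEAKS 6 range.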
The running time of PEAKS 7 is therefore comparable to that of the other search engines reported in [1], and consistent with the performance presented in (http://peaksblog.bioinfor.com/2013/12/boost-your-analysis-speed-with-peaks-7.html).
References
1. Yuan ZF, Lin S, Molden RC, Garcia BA. Evaluation of proteomic search engines for the analysis of histone modifications. J Proteome Res. 2014 Aug 28. [Epub ahead of print]