10.5061/DRYAD.SB669
van Wijk, Klaas J.
Cornell University
Friso, Giulia
Cornell University
Walther, Dirk
Max Planck Institute for Molecular Plant Physiology
Schulze, Waltraud X.
University of Hohenheim
Data from: Meta-analysis of Arabidopsis thaliana phospho-proteomics data
reveals compartmentalization of phosphorylation motifs
Dryad
dataset
2014
phosphoproteomics
Systems biology
Arabidopsis thaliana
phosphoproteome
2014-06-13T18:32:58Z
2014-06-13T18:32:58Z
en
https://doi.org/10.1105/tpc.114.125815
17212456 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Protein (de)phosphorylation plays an important role in plants. To provide
a robust foundation for subcellular phosphorylation signaling network
analysis and kinase-substrate relationships, we performed a meta-analysis
of 27 published and unpublished in-house mass spectrometry–based
phospho-proteome data sets for Arabidopsis thaliana covering a range of
processes, (non)photosynthetic tissue types, and cell cultures. This
resulted in an assembly of 60,366 phospho-peptides matching to 8141
nonredundant proteins. Filtering the data for quality and consistency
generated a set of medium and a set of high confidence phospho-proteins
and their assigned phospho-sites. The relation between single and
multiphosphorylated peptides is discussed. The distribution of p-proteins
across cellular functions and subcellular compartments was determined and
showed overrepresentation of protein kinases. Extensive differences in
frequency of pY were found between individual studies due to proteomics
and mass spectrometry workflows. Interestingly, pY was underrepresented in
peroxisomes but overrepresented in mitochondria. Using motif-finding
algorithms motif-x and MMFPh at high stringency, we identified
compartmentalization of phosphorylation motifs likely reflecting localized
kinase activity. The filtering of the data assembly improved signal/noise
ratio for such motifs. Identified motifs were linked to kinases through
(bioinformatic) enrichment analysis. This study also gives nonexperts
insight into the challenges and pitfalls of using large-scale
phospho-proteomic data sets.
Supplemental Data Set 1. Detailed overview of the 27 published
p-proteomics studies and unpublished in-house data with their respective
metadata.
SupplementalDataset_1r.xlsx

Supplemental Data Set 2. The complete unfiltered set of 60,366 p-peptides
with matched protein id, their metadata, p-15-mers, annotation from PPDB,
SUBA3 consensus prediction and assignment to one of seven locations.
SupplementalDataSet_2rfinal.xlsx

Supplemental Data Set 3. Non-redundant Arabidopsis p-proteins before
filtering (8141 proteins), after filters 1 and 2 (set A, 4494 proteins)
and after filters 1, 2, 3 and 4 (set B, 3687 proteins) with their
annotations.
SupplementalDataSet_3r.xlsx

Supplemental Data Set 4. Non-redundant p-15-mers prior to filtering and
for sets A and B.
SupplementalDataSet_4r.xlsx

Supplemental Data Set 5A,B. Analysis of pY peptides (A) and pY proteins
(B).
SupplementalDataSet_5r.xlsx

Supplemental Data Set 6. Published plant p-motifs in various plant species
based on motif-x searches against p-proteomics data.
SupplementalDataSet_6r.xlsx

Supplemental Data Set 7. P-motifs for pS, pT and pY and their
fold-enrichment in sets A and B, and the localization of p-15-mer sets
using motif-x at the 10^-6 threshold and 1%, 3% and 5% occurrence rates.
SupplementalDataSet_7r.xlsx

Supplemental Data Set 8. P-motifs for pS, pT and pY in sets A and B and
the localization of p-15-mer sets using MMFPh at the 10^-6 threshold and
1%, 5% and 10% occurrence rates.
SupplementalDataSet_8r.xlsx

Supplemental Data Set 9. Motifs for pS, pT and pY found by motif-x and
MMFPh for sets A and B and subcellular sets at all occurrence thresholds.
SupplementalDataSet_9r.xlsx

Supplemental Data Set 10. P-proteins with their p-15-mers and their most
significant motifs (from Table 2).
SupplementalDataSet_10r.xlsx

Supplemental Figure 1A. Hierarchical clustering (average linkage method)
of all 364 non-redundant pS motifs identified by motif-x and/or MMFPh in
sets A, B and the subcellular sets.
SupplementalFigure_1A.jpeg

Supplemental Figure 1B. Hierarchical clustering (average linkage method)
of all 26 non-redundant pT motifs identified by motif-x and/or MMFPh in
sets A, B and the subcellular sets.
SupplementalFigure_1B.jpeg

Supplemental Figure 2. Kinase recognition motifs for different kinase
families in Arabidopsis.
SupplementalFigure-2revised.pdf
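
Several of the data sets above are organized around p-15-mers. The record
does not define the term; the sketch below assumes the common motif-x
convention of a 15-residue window centered on the phosphorylated residue
(seven residues on each side). The function name p15mer, the 1-based site
numbering and the "X" padding character for sites near a protein terminus
are illustrative assumptions, not taken from the data set or the
publication.

```python
def p15mer(protein_seq: str, site: int, flank: int = 7, pad: str = "X") -> str:
    """Return the window of 2*flank+1 residues centered on a 1-based site.

    Assumption: positions that fall outside the protein are filled with
    `pad`, so every window has the same length.
    """
    if not 1 <= site <= len(protein_seq):
        raise ValueError("site lies outside the protein sequence")
    idx = site - 1  # convert to 0-based index
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    # Pad on the outer edges only, keeping the phosphosite in the center.
    return left.rjust(flank, pad) + protein_seq[idx] + right.ljust(flank, pad)
```

Fixed-length, centered windows of this kind are what motif-finding tools
compare position by position; whether the authors padded or discarded
windows near termini is not stated in this record.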