Protein phosphorylation is a key post-translational modification that plays a central role in many cellular processes. With recent advances in biotechnology, thousands of phosphorylated sites can be identified and quantified in a given sample, enabling proteome-wide screening of cellular signaling. However, for most (>90%) of the phosphorylation sites identified in these experiments, the kinase(s) that target them are unknown. To broadly utilize available structural, functional, evolutionary, and contextual information in predicting kinase-substrate associations (KSAs), we develop a network-based machine learning framework. Our framework integrates a multitude of data sources to characterize the landscape of functional relationships and associations among phosphosites and kinases. To construct a phosphosite-phosphosite association network, we use sequence similarity, shared biological pathways, co-evolution, co-occurrence, and co-phosphorylation of phosphosites across different biological states. To construct a kinase-kinase association network, we integrate protein-protein interactions, shared biological pathways, and membership in common kinase families. We use node embeddings computed from these heterogeneous networks to train machine learning models for predicting kinase-substrate associations. Our systematic computational experiments using the PhosphoSitePlus database show that the resulting algorithm, NetKSA, outperforms two state-of-the-art algorithms, KinomeXplorer and LinkPhinder, in overall KSA prediction. By stratifying the ranking of kinases, NetKSA also enables annotation of phosphosites that are targeted by relatively less-studied kinases. Availability: The code and data are available at compbio.case.edu/NetKSA/.
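The general idea of scoring kinase-substrate pairs from two association networks can be illustrated with a minimal sketch. This is not the NetKSA implementation: the toy networks, the use of random-walk-with-restart proximity vectors as a stand-in for node embeddings, the outer-product pair features, and the small logistic regression are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math

def rwr_embeddings(adj, restart=0.5, iters=50):
    """Row s is the random-walk-with-restart proximity vector of node s.
    Assumes every node has at least one edge (stand-in for node embeddings)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    emb = []
    for s in range(n):
        p = [1.0 if i == s else 0.0 for i in range(n)]
        for _ in range(iters):
            nxt = [0.0] * n
            for i in range(n):
                for j in range(n):
                    if adj[i][j]:
                        nxt[j] += (1 - restart) * p[i] * adj[i][j] / deg[i]
            nxt[s] += restart  # teleport back to the start node
            p = nxt
        emb.append(p)
    return emb

def pair_features(kin_vec, site_vec):
    """Flattened outer product, so the model can learn kinase-site interactions."""
    return [a * b for a in kin_vec for b in site_vec]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=1.0, epochs=200):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
            b += lr * err
    return w, b

# Hypothetical toy networks (edge weights are made up).
# Sites form a chain s0 - s1 ~ s2 - s3 with a weak middle link.
site_adj = [[0, 1, 0, 0], [1, 0, 0.2, 0], [0, 0.2, 0, 1], [0, 0, 1, 0]]
# Kinases form two "families": {K0, K1} and {K2, K3}.
kin_adj = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

site_emb = rwr_embeddings(site_adj)
kin_emb = rwr_embeddings(kin_adj)

# Known (kinase, site) labels; K1 and K3 have no known substrates.
known = [((0, 0), 1), ((2, 3), 1), ((0, 3), 0), ((2, 0), 0)]
X = [pair_features(kin_emb[k], site_emb[s]) for (k, s), _ in known]
y = [label for _, label in known]
w, b = train_logistic(X, y)

def score(k, s):
    x = pair_features(kin_emb[k], site_emb[s])
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Rank candidate sites for the unannotated kinase K1.
ranking = sorted(range(4), key=lambda s: score(1, s), reverse=True)
print(ranking)
```

In this toy setting, K1 has no known substrates, yet it inherits a preference for sites near s0 because it is adjacent to K0 in the kinase network. This mirrors, in a very simplified form, how network context can support predictions for less-studied kinases.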
phosphoproteomics, kinase-substrate association, network embedding
Pacific Symposium on Biocomputing
© 2022 The Authors.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Prediction of Kinase-Substrate Associations Using The Functional Landscape of Kinases and Phosphorylation Sites. Marzieh Ayati, Serhan Yilmaz, Filipa Blasco Tavares Pereira Lopes, Mark Chance, and Mehmet Koyuturk. Biocomputing 2023, November 2022, pp. 73-84.