Project topic
Project number
Department
Student names
Email
Supervisor names
Machine learning-based approaches for predicting modulators of protein-protein interactions.
Abstract (Hebrew)
Abstract (English)
Protein–protein interactions (PPIs) are promising but challenging targets for drug discovery, because screening traditional small-molecule libraries often fails to identify hits. Small-molecule modulators of PPIs are being pursued for the development of novel anticancer, antiviral, and antimicrobial drugs. Previous machine learning work relied only on handcrafted fingerprints or chemical descriptors of the modulator molecules. The datasets used for such prediction tasks contain only features that describe the modulator molecules and none that describe the PPIs themselves, and the classifier returns a binary answer. When predicting inhibitors for a specific PPI target, the positive examples are molecules that modulate that PPI, and the negative set is selected from molecules that modulate other PPIs. In our research, we aim to develop a novel neural network pipeline that predicts modulators for specific PPIs from triplet examples, whose features are embedding representations of the two proteins participating in the PPI complex and an embedding representation of the small molecule. Protein embeddings are learned with graph neural networks on a large protein-protein interaction network. Molecular embeddings are created with a message-passing graph neural network. In the final binary classifier, each triplet of pre-trained embeddings (two proteins and one molecule) is fed to three fully connected layers. Our preliminary results indicate that the triplet dataset setting outperforms the alternative of building a separate molecular-fingerprint dataset and training a new model for each PPI.
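The final classification step described above can be illustrated with a minimal sketch in plain Python. It is not the project's implementation: the embedding sizes, hidden-layer widths, ReLU activations, sigmoid output, and all function names (`build_classifier`, `predict_proba`) are assumptions made for illustration, the weights are untrained, and random vectors stand in for the pre-trained protein and molecule embeddings. It only shows how a triplet is concatenated and passed through three fully connected layers to yield a binary decision.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine map: one output per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_classifier(protein_dim, molecule_dim, hidden_dims=(64, 32), seed=0):
    """Three layers: input -> 64 -> 32 -> 1 (hidden sizes are assumptions)."""
    rng = random.Random(seed)
    dims = [2 * protein_dim + molecule_dim, *hidden_dims, 1]
    return [init_layer(a, b, rng) for a, b in zip(dims[:-1], dims[1:])]


def predict_proba(layers, protein_a, protein_b, molecule):
    """Concatenate the triplet and return the modulator probability."""
    x = list(protein_a) + list(protein_b) + list(molecule)
    for layer in layers[:-1]:
        x = relu(dense(x, layer))
    return sigmoid(dense(x, layers[-1])[0])


# Placeholder embeddings; real ones would come from the pre-trained GNNs.
rng = random.Random(1)
protein_a = [rng.gauss(0, 1) for _ in range(16)]
protein_b = [rng.gauss(0, 1) for _ in range(16)]
molecule = [rng.gauss(0, 1) for _ in range(8)]

layers = build_classifier(protein_dim=16, molecule_dim=8)
p = predict_proba(layers, protein_a, protein_b, molecule)
is_modulator = p >= 0.5  # binary answer from the classifier
```

Because the two protein embeddings are simply concatenated in this sketch, the output depends on their order; a real pipeline would need to decide whether to fix an ordering or make the input symmetric.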