Prediction of the affinity of ligands of interest using support vector machines (SVM)
The affinity of ligands of interest is classically predicted by virtual screening, which relies on the structure of a target and of known small molecules. These structure-based approaches require many steps of design, computation, analysis and cross-checking against experimental data to assess their predictive performance. When enough experimental data are available, alternative learning-based approaches can be considered. In this context, the internship will evaluate the predictive power of support vector machines (SVM) on a target of interest known to the laboratory. The project will first consist of integrating the learning mechanism into an existing portal, docknmine, as an additional service alongside the tools already developed. The prediction rate of this service will then be evaluated on the human galectin family, for which the docknmine portal will be used to enter experimental data for about 75 ligands.
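To make the evaluation step concrete, here is a minimal sketch of how a prediction rate could be estimated on a small ligand set. It assumes the task is framed as binary classification (binder versus non-binder), which the description above does not specify. The two descriptors, the eight data points, the hyperparameters and all function names are invented for illustration and are not taken from docknmine or from the galectin data. The classifier is a linear soft-margin SVM fitted by plain batch sub-gradient descent on the hinge loss, and the prediction rate is estimated by leave-one-out cross-validation, a common choice when only a few dozen ligands are available.

```python
# Illustrative sketch only: descriptors, labels and hyperparameters are
# invented for this example and do not come from docknmine.

def decision_value(w, b, x):
    """Return the SVM score w.x + b for one descriptor vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.05, n_iter=500):
    """Fit a linear soft-margin SVM by batch sub-gradient descent.

    Minimises (lam / 2) * ||w||^2 + mean(max(0, 1 - y_i * (w.x_i + b))).
    X is a list of feature vectors, y a list of labels in {-1, +1}.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_value(w, b, xi) < 1.0:  # margin violated
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Predict +1 (binder) or -1 (non-binder) from the sign of the score."""
    return 1 if decision_value(w, b, x) >= 0.0 else -1


def leave_one_out_accuracy(X, y):
    """Train on all ligands but one, test on the held-out one, and repeat."""
    correct = 0
    for i in range(len(X)):
        w, b = train_linear_svm(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += int(predict(w, b, X[i]) == y[i])
    return correct / len(X)


# Hypothetical, already scaled descriptors (two per ligand) and labels:
# +1 for a binder, -1 for a non-binder.
LIGANDS = [
    [1.0, 0.8], [0.9, 1.1], [1.2, 1.0], [0.8, 0.9],
    [-1.0, -0.9], [-0.8, -1.1], [-1.1, -1.0], [-0.9, -0.8],
]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]

if __name__ == "__main__":
    accuracy = leave_one_out_accuracy(LIGANDS, LABELS)
    print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

With real data, the descriptors would have to be computed from the ligand structures and scaled before training, and a non-linear kernel or a regression variant of the SVM might fit measured affinities better than this linear classifier.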
@article{Gheyouche2019,
title = {Docknmine, a web portal to assemble and analyse virtual and experimental interaction data},
author = {Ennys Gheyouche and Romain Launay and Jean Lethiec and Antoine Labeeuw and Caroline Roze and Alan Amossé and Stéphane Téletchéa},
doi = {10.3390/ijms20205062},
issn = {1422-0067},
year = {2019},
date = {2019-10-01},
journal = {International Journal of Molecular Sciences},
volume = {20},
number = {20},
publisher = {MDPI AG},
abstract = {Scientists have to perform multiple experiments producing qualitative and quantitative data to determine if a compound is able to bind to a given target. Due to the large diversity of the potential ligand chemical space, the possibility of experimentally exploring a lot of compounds on a target rapidly becomes out of reach. Scientists therefore need to use virtual screening methods to determine the putative binding mode of ligands on a protein and then post-process the raw docking experiments with a dedicated scoring function in relation with experimental data. Two of the major difficulties for comparing docking predictions with experiments mostly come from the lack of transferability of experimental data and the lack of standardisation in molecule names. Although large portals like PubChem or ChEMBL are available for general purpose, there is no service allowing a formal expert annotation of both experimental data and docking studies. To address these issues, researchers build their own collection of data in flat files, often in spreadsheets, with limited possibilities of extensive annotations or standardisation of ligand descriptions allowing cross-database retrieval. We have conceived the dockNmine platform to provide a service allowing an expert and authenticated annotation of ligands and targets. First, this portal allows a scientist to incorporate controlled information in the database using reference identifiers for the protein (Uniprot ID) and the ligand (SMILES description), the data and the publication associated to it. Second, it allows the incorporation of docking experiments using forms that automatically parse useful parameters and results. Last, the web interface provides a lot of pre-computed outputs to assess the degree of correlations between docking experiments and experimental data.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}