Classification, Similarity Analysis and Modeling of Drug Activities Using Chemometric Techniques: Introduction of Classical Relativity in Chemical Space

Mani-Varnosfaderani, Ahmad | 2012

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 43727 (03)
  4. University: Sharif University of Technology
  5. Department: Chemistry
  6. Advisor(s): Jalali-Heravi, Mehdi
  7. Abstract:
  8. The present research is devoted to the application, development and implementation of clustering, classification and regression techniques for modeling the biological activity of different drugs and drug-like molecules. First, the predictive ability of Bayesian regression techniques was evaluated for describing and predicting the inhibition behavior of integrin antagonists. Next, complementary local search techniques were used to improve the performance of the Bayesian regularized genetic neural network (BRGNN) algorithm. The results indicated that the pattern search algorithm has great potential as a feature selection method in chemoinformatics. In line with the application of Bayesian techniques, the Markov chain Monte Carlo (MCMC) search engine was used for the first time for modeling the retention indices of volatile components of Artemisia species. The results showed that the MCMC engine can be used in quantitative structure-activity relationship (QSAR) studies for pattern recognition and dimension reduction.
(1) A total of 5580 anti-HIV molecules, comprising diverse sets of CCR5 modulators and HIV-1 reverse transcriptase, HIV-1 protease and HIV-1 integrase inhibitors, were collected from the literature. A total of 1497 molecular descriptors were calculated for each molecule using the DRAGON software. A combination of a genetic algorithm and counterpropagation artificial neural networks (GA-PS-CPANN) was used to classify the molecules based on their activities and therapeutic targets. The results showed that the molecules can be separated according to their therapeutic targets using their molecular properties and the trained CPANN model. Next, the classification and regression tree (CART) technique was used for charting the chemical space of anti-HIV molecules. Validation of the CART model revealed that this method correctly classifies more than 87% of the active molecules. Finally, active-inactive binary classifiers were developed for discriminating between active and inactive molecules. The area under the curve (AUC) values of the receiver operating characteristic (ROC) curves showed that these classifiers are good candidates for screening large compound databases.
(2) A total of 6230 drug-like molecules, comprising inhibitors of aromatase, epidermal growth factor receptor, histone deacetylase and matrix metalloproteinase, were collected from Binding-Database. A total of 1497 molecular descriptors were calculated for each molecule. The GA-PS-CPANN algorithm was used to classify the molecules according to their target types. The results revealed that the local dipole moments, the average vertex index of the molecular graph and the number of halogen atoms play a considerable role in determining the classes of the molecules. This project defined a new way of determining the position of active drug-like molecules in chemical space by using classification models.
(3) In order to study the inhibitors and modulators of the central nervous system (CNS), a total of 21800 molecules were extracted from Binding-Database. A very diverse set of molecular descriptors was calculated for each molecule. A combination of a genetic algorithm and quadratic discriminant analysis (GA-QDA) was proposed for classifying the molecules based on their target types and activities. The results showed that the developed classifiers separated the molecules by target type and activity to a certain extent. Next, the developed classifiers were used for screening random subsets of the PubChem and ZINC databases. To the best of the authors' knowledge, this is the first study in which target-based classifiers have been used for screening very large molecular libraries. These models use the relative distances from the centers of the active clusters of molecules as an index for sorting compound databases in drug discovery projects. This approach is proposed as “classical relativity in chemical space”, and we showed its potential as a new strategy for speeding up the early stages of drug discovery projects. These models help to extend the concept of “chemography” in chemical space.
(4) In order to extend the concept of “classical relativity” in chemical space, all known drug-like molecules with annotated biological activities were collected from Binding-Database and analyzed using classification techniques. The collection comprised more than 211000 molecules that modulate or inhibit 114 different biological targets. A total of 1497 molecular descriptors were calculated for the collected molecules and stored via an SQL interface. A graphical user interface (GUI) was written in C# for selecting and classifying the molecules according to their target types. A variety of discriminant analysis and optimization techniques were implemented in this software for developing tuned classification models. Screening of random subsets of PubChem, ZINC and Binding-DB was also implemented in this software. Finally, the chemical space of the collected molecules can be visualized by mapping them against different selected molecular descriptors. This helps in better understanding the inhibition mechanisms of different drug-like molecules in terms of their important physicochemical properties.
(5) This project concerns the fragmentation analysis of the largest publicly available molecular library, GDB-13, developed by Reymond et al. at the University of Bern, Switzerland. This database contains about 1 billion molecules composed of C, N, O, Cl and S atoms. The main aim of this project was to propose a very fast strategy for searching GDB-13. The proposed algorithm consists of six different fragmentation steps, starting from the simplest frameworks and moving to more complex ones. A GUI was written in Java for easily fragmenting the molecules and defining queries based on the selected fragments. It supports effective similarity analysis based on the shapes of the molecules and proposes a new strategy for virtual screening named “scaffold forest”. This software circumvents the low speed of virtual screening algorithms and can screen about 500 thousand molecules per second on an ordinary computer.

  9. Keywords:
  10. Drug Design ; Chemometrics Method ; Quantitative Structure-Activity Relationship (QSAR) Model ; Drug-Like Molecules ; Drug-Like Molecules Similarity Analysis ; Drug-Like Molecules Virtual Screening ; Classical Relativity in Chemical Space
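
The abstract describes sorting compound databases by the relative distance of each molecule from the center of a cluster of active molecules. The following is a minimal sketch of that general idea, not the dissertation's implementation. It assumes the cluster center is the plain mean of the actives' descriptor vectors and that distance is Euclidean; the abstract specifies neither. The function names `centroid` and `rank_by_distance`, the molecule names and the descriptor values are hypothetical.

```python
import math


def centroid(points):
    """Return the per-descriptor mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def rank_by_distance(library, actives):
    """Sort library molecule names by distance to the centroid of the actives.

    Assumption: Euclidean distance on already-scaled descriptors.
    """
    center = centroid(actives)
    scored = [(math.dist(vector, center), name) for name, vector in library.items()]
    return [name for _, name in sorted(scored)]


# Hypothetical, already-scaled descriptor vectors (illustration only).
actives = [[1.0, 2.0], [3.0, 2.0]]
library = {"mol_a": [2.0, 2.5], "mol_b": [9.0, 9.0], "mol_c": [2.0, 2.0]}

ranking = rank_by_distance(library, actives)
print(ranking)  # closest to the active-cluster center first
```

In practice the descriptors would need to be put on a common scale before distances are compared, since otherwise a single descriptor with a large numeric range would dominate the ranking.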
