Browsing by Author "Le-Thi-Thu H."
Item Generalized molecular descriptors derived from event-based discrete derivative (Bentham Science Publishers B.V., 2016) Martínez-Santiago O.; Cabrera R.M.; Marrero-Ponce Y.; Barigye S.J.; Le-Thi-Thu H.; Torres, Javier; Zambrano C.H.; Yaber Goenaga, Iván; Cruz-Monteagudo, M.; López Y.M.; Giménez F.P.; Torrens, F.
In the present study, a generalized approach for molecular structure characterization is introduced, based on the relation frequency matrix (F) representation of the molecular graph and the subsequent calculation of the corresponding discrete derivative (finite difference) over a pair of elements (atoms). In earlier publications (22-24), a unique event, named connected subgraphs (based on the Kier-Hall subgraphs), was systematically employed for the computation of the matrix F. The present report generalizes this notion by introducing eleven additional events for F generation, classified in three categories: topological (terminal paths, vertex path incidence, quantum subgraphs, walks of length k, Sachs subgraphs), fingerprints (MACCS, E-state and substructure fingerprints) and atomic contributions (Ghose and Crippen atom types for hydrophobicity and refractivity). The events are intended to capture diverse information by generating or searching for different kinds of substructures in the graph representation of a molecule. The discrete derivatives over duplex atom relations are calculated for each event, and the resulting derivatives, local vertex invariants (LOVIs), are finally obtained. These LOVIs are subsequently employed as the basis for the calculation of global and local indices over groups of atoms (heteroatoms, halogens, methyl carbons, etc.), by using norms, means, statistics and classical algorithms as aggregator (fusion) operators. These indices were implemented in our in-house software DIVATI (Derivative Type Indices, a new module of the TOMOCOMD-CARDD system).
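The aggregation step described above, in which per-atom LOVIs are fused into global or group-local indices, can be sketched as follows. This is only an illustration under stated assumptions: the function name `aggregate_lovis`, the choice of four operators and the numeric LOVI values are hypothetical and are not taken from DIVATI.

```python
import math
from statistics import mean, pstdev


def aggregate_lovis(lovis, atoms, group=None):
    """Fuse per-atom values (LOVIs) into a few scalar indices.

    lovis: one number per atom (hypothetical values here).
    atoms: element symbol of each atom, same order as lovis.
    group: optional set of element symbols for a local index;
           None means all atoms (a global index).
    """
    selected = [v for v, a in zip(lovis, atoms) if group is None or a in group]
    if not selected:
        return {}
    return {
        "manhattan_norm": sum(abs(v) for v in selected),
        "euclidean_norm": math.sqrt(sum(v * v for v in selected)),
        "arithmetic_mean": mean(selected),
        "std_dev": pstdev(selected),  # population standard deviation
    }


# Made-up LOVIs for the three heavy atoms of ethanol (C, C, O).
atoms = ["C", "C", "O"]
lovis = [1.0, 2.0, -2.0]

total = aggregate_lovis(lovis, atoms)                      # global indices
hetero = aggregate_lovis(lovis, atoms, group={"N", "O", "S", "P"})  # local
```

Passing a different `group` (for example halogens) would give the corresponding local index over that atom set.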
DIVATI provides a user-friendly, cross-platform graphical user interface developed in the Java programming language, and is freely available at http://www.tomocomd.com. Factor analysis shows that the presented events are rather orthogonal and collect diverse information about the chemical structure. Finally, QSPR models were built to describe the logP and logK of 34 furylethylene derivatives using the eleven events. Generally, the equations obtained according to these events showed high correlations, with the Sachs subgraphs and Multiplicity events showing the best behavior in the description of logK (Q²LOO value of 99.06%) and logP (Q²LOO value of 98.1%), respectively. These results show that these new event-based indices constitute a powerful approach for chemoinformatics studies. © 2016 Bentham Science Publishers.

Item Multi-output model with Box-Jenkins operators of quadratic indices for prediction of malaria and cancer inhibitors targeting ubiquitin-proteasome pathway (UPP) proteins (Bentham Science Publishers B.V., 2016) Casañola-Martín G.M.; Le-Thi-Thu H.; Pérez-Giménez F.; Marrero-Ponce Y.; Merino-Sanjuán M.; Abad C.; González-Díaz H.
The ubiquitin-proteasome pathway (UPP) is the primary degradation system of short-lived regulatory proteins. Cellular processes such as the cell cycle, signal transduction, gene expression, DNA repair and apoptosis are regulated by the UPP, and dysfunctions in this system have important implications in the development of cancer and of neurodegenerative, cardiac and other human pathologies. The UPP also seems to be very important in the function of eukaryotic cells of human parasites such as Plasmodium falciparum, the causal agent of the neglected disease malaria. Hence, the UPP could be considered an attractive target for the development of compounds with antimalarial or anticancer properties.
Recent online databases like ChEMBL contain a large quantity of information on pharmacological assay protocols and on compounds tested as UPP inhibitors under many different conditions. This large amount of data gives new openings for the computer-aided identification of UPP inhibitors, but the intrinsic data diversity is an obstacle to the development of successful classifiers. To solve this problem, we used Box-Jenkins moving average operators and the atom-based quadratic molecular indices calculated with the software TOMOCOMD-CARDD (TC) to develop a quantitative model for the prediction of the multiple outputs in this complex dataset. Our multi-target model can predict results for drugs against 22 molecular or cellular targets of different organisms with accuracies above 70% in both training and validation sets. © 2016 Bentham Science Publishers.

Item Optimum search strategies or novel 3D molecular descriptors: Is there a stalemate? (Bentham Science Publishers B.V., 2015) Marrero-Ponce Y.; García-Jacas C.R.; Barigye S.J.; Valdés-Martiní J.R.; Rivera-Borroto O.M.; Pino-Urias R.W.; Cubillán, Néstor; Alvarado Y.J.; Le-Thi-Thu H.
The present manuscript describes a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS Duplex) based on algebraic bilinear, quadratic and linear forms on the kth two-tuple spatial-(dis)similarity matrix. Generalization schemes for the inter-atomic spatial distance using diverse (dis)similarity measures are discussed. In addition, normalization approaches for the two-tuple spatial-(dis)similarity matrix using simple- and double-stochastic and mutual probability schemes are introduced. To take particular inter-atomic interactions into consideration in total or local-fragment indices, path and length cut-off constraints are used. Also, in order to generalize the use of the linear combination of atom-level indices to yield global (molecular) definitions, a set of aggregation operators (invariants) is applied.
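The three normalization schemes named above can be illustrated with a minimal sketch. It assumes the usual definitions, which the abstract itself does not spell out: simple-stochastic divides each row by its sum, double-stochastic makes rows and columns both sum to one (obtained here by Sinkhorn-Knopp iteration), and mutual probability divides every entry by the sum of the whole matrix. The function names and the 3 x 3 example matrix are hypothetical and are not taken from the QuBiLS-MIDAS software.

```python
def simple_stochastic(matrix):
    """Divide each row by its sum so that every row sums to 1."""
    result = []
    for row in matrix:
        row_sum = sum(row)
        result.append([x / row_sum if row_sum else 0.0 for x in row])
    return result


def mutual_probability(matrix):
    """Divide each entry by the sum of all entries."""
    total = sum(sum(row) for row in matrix)
    return [[x / total if total else 0.0 for x in row] for row in matrix]


def double_stochastic(matrix, iterations=200):
    """Sinkhorn-Knopp: alternate row and column normalization."""
    current = [[float(x) for x in row] for row in matrix]
    n = len(current)
    for _ in range(iterations):
        current = simple_stochastic(current)
        col_sums = [sum(current[i][j] for i in range(n)) for j in range(n)]
        current = [
            [current[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)
        ]
    return current


# Made-up symmetric distance-like matrix for three atoms.
distances = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]

ss = simple_stochastic(distances)
ds = double_stochastic(distances)
mp = mutual_probability(distances)
```

The fixed iteration count is a simplification; a convergence check on the row sums would be the more careful choice for larger matrices.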
A Shannon’s entropy-based variability study of the proposed 3D algebraic form-based indices and the DRAGON molecular descriptor families demonstrates superior performance for the former. A principal component analysis reveals that the novel indices codify structural information orthogonal to that captured by the DRAGON indices. Finally, a QSAR study of the binding affinity to the corticosteroid-binding globulin using Cramer’s steroid database is performed. This study reveals that the QuBiLS-MIDAS Duplex approach yields performance statistics similar or superior to those of all the 3D-QSAR methods reported in the literature so far, even with fewer degrees of freedom, both when using the 31 steroids as the training set and when using the popular division of Cramer’s database into training [1-21] and test [22-31] sets. It is thus expected that this methodology provides useful tools for the diversity analysis of compound datasets and high-throughput screening structure–activity data. © 2015 Bentham Science Publishers.

Item Towards better BBB passage prediction using an extensive and curated data set (Wiley-VCH Verlag, 2015) Brito-Sánchez Y.; Marrero-Ponce Y.; Barigye S.J.; Yaber Goenaga, Iván; Morell Pérez C.; Le-Thi-Thu H.; Cherkasov A.
In the present report, the challenging task of drug delivery across the blood-brain barrier (BBB) is addressed via a computational approach. The BBB passage was modeled using classification and regression schemes on a novel extensive and curated data set (the largest to the best of our knowledge) in terms of log BB. Prior to model development, data analysis steps comprising chemical data curation and structural, cutoff and cluster analysis (CA) were conducted. Linear Discriminant Analysis (LDA) and Multiple Linear Regression (MLR) were used to fit classification and correlation functions. The best LDA-based model showed overall accuracies over 85% and 83% for the training and test sets, respectively.
Also, an MLR-based model that acceptably explains more than 69% of the variance in the experimental log BB was developed. A brief and general interpretation of the proposed models allowed an estimation of how 'near' our computational approach is to the factors that determine the passage of molecules through the BBB. In a final effort, some popular and powerful Machine Learning methods were considered. Their performance was comparable to that of the simpler linear techniques. Most of the compounds with anomalous behavior were set aside in a group denoted the controversial set, and a discussion of these compounds is provided. Finally, our results were compared with methodologies previously reported in the literature, showing comparable to better results. The results could represent useful tools, available to and reproducible by the whole scientific community, in the early stages of neuropharmaceutical drug discovery/development projects. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
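The two statistics quoted in the abstract above, overall accuracy for the classification model and the fraction of explained variance for the regression model, can be computed as in the following sketch. It assumes the standard definitions (share of correctly classified compounds, and R² = 1 - SS_res / SS_tot). The function names, the class labels "BBB+" and "BBB-", and all numeric values are hypothetical and are not data from the study.

```python
def overall_accuracy(true_labels, predicted_labels):
    """Fraction of compounds whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def explained_variance(observed, predicted):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up class labels for four compounds.
true_classes = ["BBB+", "BBB+", "BBB-", "BBB-"]
predicted_classes = ["BBB+", "BBB-", "BBB-", "BBB-"]
accuracy = overall_accuracy(true_classes, predicted_classes)

# Made-up log BB values for four compounds.
log_bb_observed = [0.0, 1.0, 2.0, 3.0]
log_bb_predicted = [0.0, 1.0, 2.0, 2.0]
r_squared = explained_variance(log_bb_observed, log_bb_predicted)
```

In this reading, "more than 69% of the variance" would correspond to an R² above 0.69 on the experimental log BB values.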