Browsing by Author "Alvarado Y.J."
Now showing 1 - 3 of 3
Item Novel 3D bio-macromolecular bilinear descriptors for protein science: Predicting protein structural classes (Academic Press, 2015)
Marrero-Ponce Y.; Contreras-Torres E.; García-Jacas C.R.; Barigye S.J.; Cubillán, Néstor; Alvarado Y.J.
In the present study, we introduce novel 3D protein descriptors based on the bilinear algebraic form in the ℝⁿ space on the coulombic matrix. To calculate these descriptors, macromolecular vectors in ℝⁿ, whose components represent certain amino acid side-chain properties, were used as weighting schemes. Generalization approaches based on Minkowski metrics are proposed for calculating the spatial distances between amino acid residues. The simple- and double-stochastic schemes were defined as approaches to normalize the coulombic matrix. Local-fragment indices for both amino acid types and amino acid groups are presented so that fragments of interest in proteins can be characterized. In addition, geometric and topological cut-offs are defined so that specific interactions among amino acids can be taken into account in global or local indices. To assess the utility of the global and local indices, a classification model for predicting the four major protein structural classes was built with the Linear Discriminant Analysis (LDA) technique. The developed LDA model correctly classifies 92.6% and 92.7% of the proteins in the training and test sets, respectively. The model showed high values of the generalized square correlation coefficient (GC²) on both the training and test series. The statistical parameters derived from the internal and external validation procedures demonstrate the robustness, stability and high predictive power of the proposed model.
The performance of the LDA model demonstrates that the proposed indices not only codify relevant biochemical information related to the structural classes of proteins, but also lend themselves to interpretation. It is anticipated that the current method will benefit the prediction of other protein attributes or functions. © 2015 Elsevier Ltd.

Item Novel global and local 3D atom-based linear descriptors of the Minkowski distance matrix: theory, diversity–variability analysis and QSPR applications (Kluwer Academic Publishers, 2015)
Cubillán, Néstor; Marrero-Ponce Y.; Ariza-Rico H.; Barigye S.J.; García-Jacas C.R.; Valdes-Martini J.R.; Alvarado Y.J.
A new family of alignment-free 3D descriptors based on the TOMOCOMD-CARDD framework has been designed, namely the 3D-linear indices. In this report, we propose the use of a generalized form of the geometric pairwise atom-atom distance matrix as the structural information matrix. This matrix, termed non-stochastic, is used as the matrix form of the linear maps, together with its algebraic transformations: the stochastic, double-stochastic and mutual probability matrices. The methodology for 3D-QSAR studies is based on the combined use of global and local approaches. Principal component analysis reveals that the novel indices capture structural information not codified by the indices implemented in the DRAGON software. Moreover, a Shannon entropy-based variability analysis comparing the 3D-linear indices with some relevant descriptors suggests that the former encode a similar or greater amount of structural information than the latter. Finally, a search for the best regressions for congeneric databases in QSPR modeling was performed. The overall results demonstrate satisfactory behavior.
© 2015, Springer International Publishing Switzerland.

Item Optimum search strategies or novel 3D molecular descriptors: Is there a stalemate? (Bentham Science Publishers B.V., 2015)
Marrero-Ponce Y.; García-Jacas C.R.; Barigye S.J.; Valdés-Martiní J.R.; Rivera-Borroto O.M.; Pino-Urias R.W.; Cubillán, Néstor; Alvarado Y.J.; Le-Thi-Thu H.
The present manuscript describes a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS Duplex) based on algebraic bilinear, quadratic and linear forms on the kth two-tuple spatial-(dis)similarity matrix. Generalization schemes for the inter-atomic spatial distance using diverse (dis)similarity measures are discussed. In addition, normalization approaches for the two-tuple spatial-(dis)similarity matrix using simple-stochastic, double-stochastic and mutual probability schemes are introduced. Path and length cut-off constraints are used so that particular inter-atomic interactions can be taken into account in total or local-fragment indices. Also, to generalize the use of the linear combination of atom-level indices to yield global (molecular) definitions, a set of aggregation operators (invariants) is applied. A Shannon entropy-based variability study of the proposed 3D algebraic form-based indices and the DRAGON molecular descriptor families demonstrates superior performance for the former. A principal component analysis reveals that the novel indices codify structural information orthogonal to that captured by the DRAGON indices. Finally, a QSAR study of the binding affinity to the corticosteroid-binding globulin using Cramer’s steroid database is performed.
This study reveals that the QuBiLS-MIDAS Duplex approach yields performance statistics similar or superior to those of all the 3D-QSAR methods reported in the literature so far, even with fewer degrees of freedom, both when the 31 steroids are used as the training set and when Cramer’s database is split in the popular way into training [1-21] and test [22-31] sets. It is thus expected that this methodology provides useful tools for the diversity analysis of compound datasets and high-throughput screening structure–activity data. © 2015 Bentham Science Publishers.
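
The three abstracts above share a common pipeline: a Minkowski-generalized pairwise distance matrix, a simple- or double-stochastic normalization of that matrix, and an algebraic form weighted by atom or residue properties. The sketch below is only an illustration of those ideas, not the authors' implementation. All function names are hypothetical, the coordinates and weights are made-up values, and the double-stochastic step uses Sinkhorn-Knopp alternating scaling as an assumed stand-in, since the listing does not say how the papers compute it.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def minkowski_matrix(coords: Sequence[Sequence[float]], p: float = 2.0) -> Matrix:
    """Pairwise Minkowski distances of order p (p=1 Manhattan, p=2 Euclidean)."""
    n = len(coords)
    return [
        [
            sum(abs(a - b) ** p for a, b in zip(coords[i], coords[j])) ** (1.0 / p)
            for j in range(n)
        ]
        for i in range(n)
    ]


def simple_stochastic(m: Matrix) -> Matrix:
    """Divide each row by its sum so that every non-zero row adds up to 1."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def double_stochastic(m: Matrix, iterations: int = 500) -> Matrix:
    """Alternate row and column scaling (Sinkhorn-Knopp).

    Assumption: an illustrative stand-in for the papers' double-stochastic scheme.
    """
    out = [list(row) for row in m]
    n = len(out)
    for _ in range(iterations):
        out = simple_stochastic(out)  # rows sum to 1
        col_sums = [sum(out[i][j] for i in range(n)) for j in range(n)]
        out = [
            [out[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)
        ]  # columns sum to 1
    return out


def bilinear_form(x: Sequence[float], m: Matrix, y: Sequence[float]) -> float:
    """Evaluate b(x, y) = x^T M y for property vectors x and y."""
    return sum(x[i] * m[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


if __name__ == "__main__":
    # Hypothetical 3-atom system with made-up coordinates and property weights.
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    weights = [1.0, 0.5, 2.0]
    distances = minkowski_matrix(coords, p=2.0)
    print(bilinear_form(weights, simple_stochastic(distances), weights))
```

Passing the same vector as both `x` and `y` gives the quadratic form, and replacing one vector with all ones gives a linear-style index, which is one way to read the "bilinear, quadratic and linear forms" named in the third abstract.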