
dc.creator: Urias R.W.P.
dc.creator: Barigye S.J.
dc.creator: Marrero-Ponce Y.
dc.creator: García-Jacas C.R.
dc.creator: Valdes-Martiní J.R.
dc.creator: Perez-Gimenez F.
dc.date.accessioned: 2020-03-26T16:32:46Z
dc.date.available: 2020-03-26T16:32:46Z
dc.date.issued: 2015
dc.identifier.citation: Molecular Diversity; Vol. 19, No. 2; pp. 305-319
dc.identifier.issn: 13811991
dc.identifier.uri: https://hdl.handle.net/20.500.12585/9015
dc.description.abstract: The features and theoretical background of IMMAN (acronym for Information theory-based CheMoMetrics ANalysis), a new and free computational program for chemometric analysis, are presented. IMMAN is multi-platform software developed in the Java programming language, with a user-friendly graphical interface for computing a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, 10 for the unsupervised framework and 10 for the supervised framework. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. In addition, a generalization scheme for the previously defined differential Shannon’s entropy is discussed, and the Jeffreys information measure is introduced for supervised feature selection. Well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty, are also incorporated into the IMMAN software (http://mobiosd-hub.com/imman-soft/), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing, and provides single-parameter or ensemble (multi-criteria) ranking options. Consequently, this software is suitable for tasks such as dimensionality reduction, feature ranking, and comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between the IMMAN and WEKA feature selection tools using the Arcene dataset showed similar behavior. In addition, the use of IMMAN unsupervised feature selection methods is shown to improve the performance of both IMMAN and WEKA supervised algorithms. © 2015, Springer International Publishing Switzerland.
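The abstract names three supervised ranking parameters (information gain, gain ratio, and symmetrical uncertainty) computed after equal-interval discretization. The following is a minimal Python sketch of the textbook definitions of those quantities, not IMMAN's actual implementation, which this record does not show. The function names (`equal_interval_bins`, `entropy`, `supervised_scores`) and the default of two bins are assumptions made for illustration only.

```python
import math
from collections import Counter


def equal_interval_bins(values, n_bins):
    """Map each value to one of n_bins equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant feature: everything falls in one bin
    width = (hi - lo) / n_bins
    # The maximum value would index bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def supervised_scores(feature, classes, n_bins=2):
    """Return (information gain, gain ratio, symmetrical uncertainty).

    n_bins is a hypothetical default; the record does not state what IMMAN uses.
    """
    bins = equal_interval_bins(feature, n_bins)
    h_class = entropy(classes)
    h_feature = entropy(bins)
    h_joint = entropy(list(zip(bins, classes)))
    # Information gain equals the mutual information H(C) + H(X) - H(X, C).
    info_gain = h_class + h_feature - h_joint
    # Gain ratio normalizes by the entropy of the discretized feature.
    gain_ratio = info_gain / h_feature if h_feature > 0 else 0.0
    # Symmetrical uncertainty normalizes by the sum of both entropies.
    denom = h_class + h_feature
    sym_unc = 2.0 * info_gain / denom if denom > 0 else 0.0
    return info_gain, gain_ratio, sym_unc
```

Under these assumptions, a feature such as `[0.1, 0.2, 0.8, 0.9]` with classes `[0, 0, 1, 1]` separates the classes perfectly after two-bin discretization and scores 1.0 on all three measures, whereas `[0.1, 0.9, 0.2, 0.8]` with the same classes scores 0.0 on all three. Ranking features by any one of these scores gives a single-parameter ranking of the kind the abstract describes.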
dc.description.sponsorship: Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq
dc.format.medium: Electronic resource
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.publisher: Kluwer Academic Publishers
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84937517073&doi=10.1007%2fs11030-014-9565-z&partnerID=40&md5=bebd134ed45279902c02db40eaa3b28c
dc.title: IMMAN: free software for information theory-based chemometric analysis
datacite.rights: http://purl.org/coar/access_right/c_16ec
oaire.resourceType: http://purl.org/coar/resource_type/c_6501
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.driver: info:eu-repo/semantics/article
dc.type.hasversion: info:eu-repo/semantics/publishedVersion
dc.identifier.doi: 10.1007/s11030-014-9565-z
dc.subject.keywords: Chemometric analysis
dc.subject.keywords: Classification
dc.subject.keywords: Computational program
dc.subject.keywords: Feature selection
dc.subject.keywords: IMMAN
dc.subject.keywords: Information-theoretic function
dc.subject.keywords: Algorithm
dc.subject.keywords: Software
dc.subject.keywords: Theoretical model
dc.subject.keywords: Algorithms
dc.subject.keywords: Models, Theoretical
dc.rights.accessrights: info:eu-repo/semantics/restrictedAccess
dc.rights.cc: Attribution-NonCommercial 4.0 International
dc.identifier.instname: Universidad Tecnológica de Bolívar
dc.identifier.reponame: Repositorio UTB
dc.type.spa: Artículo
dc.identifier.orcid: 56497011800
dc.identifier.orcid: 55363486500
dc.identifier.orcid: 55665599200
dc.identifier.orcid: 56189852800
dc.identifier.orcid: 56191215400
dc.identifier.orcid: 6701762262


Files in this item

No files are associated with this item.


Universidad Tecnológica de Bolívar - 2017. Higher education institution subject to inspection and oversight by the Ministerio de Educación Nacional. Resolution No. 961 of 26 October 1970, through which the Gobernación de Bolívar grants legal status to the Universidad Tecnológica de Bolívar.