A novel approach to estimation of E-coli promoter gene sequences: Combining feature selection and least square support vector machine (FS_LSSVM)

dc.contributor.authorPolat, Kemal
dc.contributor.authorGuenes, Salih
dc.date.accessioned2020-03-26T17:16:57Z
dc.date.available2020-03-26T17:16:57Z
dc.date.issued2007
dc.departmentSelçuk Üniversitesien_US
dc.description.abstractIn this paper, we investigate the real-world task of recognizing biological concepts in DNA sequences. Promoters in strings that represent nucleotides (each one of A, G, T, or C) are recognized using a novel approach that combines feature selection (FS) with a least square support vector machine (LSSVM). The Escherichia coli promoter gene sequences dataset has 57 attributes and 106 samples, comprising 53 promoters and 53 non-promoters. The proposed system consists of two parts. First, the FS process reduces the dimensionality of the dataset from 57 attributes to 4. Second, the LSSVM classifier is run to estimate the E. coli promoter gene sequences. To evaluate the performance of the proposed system, we used the success rate, sensitivity and specificity analysis, 10-fold cross validation, and the confusion matrix. Whereas the LSSVM classifier alone obtained an 80% success rate under 10-fold cross validation, the proposed system obtained a 100% success rate under the same conditions. These results indicate that the proposed approach improves the success rate in recognizing promoters in strings that represent nucleotides. (C) 2007 Elsevier Inc. All rights reserved.en_US
dc.identifier.doi10.1016/j.amc.2007.02.033en_US
dc.identifier.endpage1582en_US
dc.identifier.issn0096-3003en_US
dc.identifier.issn1873-5649en_US
dc.identifier.issue2en_US
dc.identifier.scopusqualityQ1en_US
dc.identifier.startpage1574en_US
dc.identifier.urihttps://dx.doi.org/10.1016/j.amc.2007.02.033
dc.identifier.urihttps://hdl.handle.net/20.500.12395/21186
dc.identifier.volume190en_US
dc.identifier.wosWOS:000248400400060en_US
dc.identifier.wosqualityQ2en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.language.isoenen_US
dc.publisherELSEVIER SCIENCE INCen_US
dc.relation.ispartofAPPLIED MATHEMATICS AND COMPUTATIONen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.selcuk20240510_oaigen_US
dc.subjectE. coli promoter gene sequencesen_US
dc.subjectfeature selectionen_US
dc.subjectLSSVM classifieren_US
dc.subjectestimationen_US
dc.titleA novel approach to estimation of E-coli promoter gene sequences: Combining feature selection and least square support vector machine (FS_LSSVM)en_US
dc.typeArticleen_US
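
The abstract names four evaluation tools: success rate, sensitivity, specificity, and the confusion matrix, all under 10-fold cross validation. The following is a minimal sketch of how those quantities could be computed, not the authors' implementation. The function names, the label coding (1 for promoter, 0 for non-promoter), and the toy predictions are all assumptions made for illustration; the toy numbers are not results from the paper. The fold assignment is a simple unshuffled round-robin split, whereas a real experiment would typically shuffle or stratify the samples first.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn), treating 1 as 'promoter' and 0 as 'non-promoter'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def summarize(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Success rate, sensitivity and specificity from confusion-matrix counts."""
    return {
        "success_rate": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # fraction of true promoters recognized
        "specificity": tn / (tn + fp),   # fraction of non-promoters recognized
    }


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; fold f tests every k-th sample starting at f."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


# Toy labels and predictions, invented purely to exercise the functions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
counts = confusion_matrix(y_true, y_pred)
metrics = summarize(*counts)

# 106 samples, as stated in the abstract, split into 10 folds.
folds = list(k_fold_indices(106, k=10))
```

In a full experiment, a classifier would be trained on each `train` index list and scored on the matching `test` list, with the per-fold confusion counts summed before calling `summarize`.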
