Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data

Uzma, Al-Obeidat, F., Tubaishat, A., Shah, B. and Halim, Z. (2022) Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data. Neural Computing and Applications, 34(11), pp. 8309-8331. (doi: 10.1007/s00521-020-05101-4)

Text: 306711.pdf - Accepted Version (1MB)

Abstract

Cancer is a severe condition in which uncontrolled cell division produces a tumor that can spread to other tissues of the body. New medications and treatment methods are therefore in demand. Classification of microarray data plays a vital role in this effort, and selecting the relevant genes is an important step in that classification. This work presents Gene encoder, an unsupervised two-stage feature selection technique for classifying cancer samples. The first stage aggregates three filter methods: principal component analysis, correlation, and spectral-based feature selection. The second stage applies a genetic algorithm that evaluates each chromosome using autoencoder-based clustering. The resulting feature subset is used for the classification task. Three classifiers (support vector machine, k-nearest neighbors, and random forest) are used to avoid dependence on any one classifier. Six benchmark gene expression datasets are used for performance evaluation, and the method is compared with four state-of-the-art related algorithms. Three sets of experiments evaluate the proposed method: assessing the selected features through sample-based clustering, tuning the parameters, and selecting the better-performing classifier. The comparison is based on accuracy, recall, false positive rate, precision, F-measure, and entropy. The results suggest that the proposed method performs better than the compared algorithms.
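The comparison measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the binary labeling (1 standing for a cancer sample), the toy label vectors, and the size-weighted definition of clustering entropy are all assumptions made for illustration, and the paper may define entropy differently.

```python
import math
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix metrics for a binary task (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "precision": precision,
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


def clustering_entropy(labels, clusters):
    """Size-weighted entropy (bits) of true labels inside each cluster.

    Assumed definition: 0 means every cluster holds a single class.
    """
    n = len(labels)
    members = {}
    for label, cluster in zip(labels, clusters):
        members.setdefault(cluster, []).append(label)
    total = 0.0
    for cluster_labels in members.values():
        size = len(cluster_labels)
        counts = Counter(cluster_labels).values()
        h = -sum((c / size) * math.log2(c / size) for c in counts)
        total += (size / n) * h
    return total


# Toy labels, invented for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
entropy = clustering_entropy(y_true, y_pred)
```

Lower entropy indicates purer clusters, while the other five measures are derived from the same confusion matrix, so reporting them together shows whether a gain in recall comes at the cost of a higher false positive rate.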

Item Type: Articles
Additional Information: This work was sponsored by the GIK Institute graduate research fund under GA-F scheme.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Uzma, Dr Uzma
Authors: Uzma, Al-Obeidat, F., Tubaishat, A., Shah, B., and Halim, Z.
College/School: College of Science and Engineering > School of Engineering > Infrastructure and Environment
Journal Name: Neural Computing and Applications
Publisher: Springer
ISSN: 0941-0643
ISSN (Online): 1433-3058
Published Online: 14 June 2020
Copyright Holders: Copyright © Springer-Verlag London Ltd., part of Springer Nature 2020
First Published: First published in Neural Computing and Applications 34(11):8309-8331
Publisher Policy: Reproduced in accordance with the publisher copyright policy
