COMPARATIVE EVALUATION OF DEEP LEARNING AND TRADITIONAL MACHINE LEARNING CLASSIFIERS ON DUAE-REDUCED HIGH-DIMENSIONAL GENE EXPRESSION DATA FOR HEAD AND NECK SQUAMOUS CELL CARCINOMA

Aneela Nargis

Keywords:

Gene expression classification; HNSCC; deep under-complete autoencoder; WideResNet; machine learning; deep learning; transcriptomics; dimensionality reduction

Abstract

High-dimensional transcriptomic datasets present a major challenge for robust cancer classification because the number of measured genes substantially exceeds the number of available samples, making dimensionality reduction a critical preprocessing step before downstream predictive modeling. In this study, a Deep Under-complete Autoencoder (DUAE) was employed to compress high-dimensional Head and Neck Squamous Cell Carcinoma (HNSCC) gene expression data while preserving discriminative structure for classification. Gene expression data were obtained from The Cancer Genome Atlas (TCGA). Following preprocessing, the DUAE-reduced feature representation was evaluated using two groups of models: four traditional machine learning classifiers, namely Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Gradient Boosting Machine (GBM), and four deep learning architectures, namely WideResNet, DenseNet, VGG, and EfficientNet. Model performance was assessed using accuracy, area under the receiver operating characteristic curve (ROC-AUC), precision, recall, and F1-score. Among all evaluated models, WideResNet achieved the strongest overall performance, with an accuracy of 0.970, ROC-AUC of 0.990, precision of 0.960, recall of 0.950, and F1-score of 0.955. VGG ranked second and also showed strong, balanced classification performance. Among the traditional machine learning baselines, GBM and RF remained competitive, whereas SVM and KNN performed comparatively worse. DenseNet and EfficientNet demonstrated moderate predictive capability but did not match WideResNet or VGG. Overall, the findings indicate that DUAE-based feature compression preserved biologically relevant signal for downstream classification, while final predictive performance remained strongly dependent on classifier architecture. These results suggest that deep residual learning, particularly WideResNet, may offer substantial advantages for classifying DUAE-reduced gene expression data in HNSCC, and they support DUAE-driven representation learning as a practical and effective preprocessing strategy for high-dimensional genomic classification tasks.
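As a brief illustration of how the reported metrics relate to one another, the sketch below computes accuracy, precision, recall, and F1-score from confusion-matrix counts and checks that the F1-score reported for WideResNet is the harmonic mean of its reported precision and recall. It assumes a binary classification setting, which the abstract does not state explicitly; the function names and the example counts are hypothetical and are not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision and recall reported for WideResNet in the abstract.
wideresnet_f1 = f1_score(0.960, 0.950)
print(round(wideresnet_f1, 3))  # 0.955, matching the reported F1-score

# Hypothetical confusion counts, for illustration only (not study data).
example = metrics_from_counts(tp=95, fp=4, tn=96, fn=5)
print(example)
```

ROC-AUC is omitted here because it depends on the ranking of predicted scores rather than on a single confusion matrix.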
