A ROBUST DEEP LEARNING FRAMEWORK FOR PREDICTING ACADEMIC SUCCESS USING ADVANCED NEURAL ARCHITECTURES

Sundas Israr; Muhammad Sajid Maqbool; Dr. Israr Hanif; Muqadas Nadeem; Abdul Basit; Aiman Ali Batool

Keywords:

Educational Data Mining (EDM), Student Performance Prediction, Machine Learning, Deep Learning, Classification Algorithms, Principal Component Analysis (PCA)

Abstract

In recent years, data mining techniques have gained significant attention in educational institutions as a means of improving the quality of education and supporting academic decision-making. Accurate prediction of student academic performance helps identify students at risk of poor achievement and supports the development of effective educational strategies. Numerous studies have focused on predicting student performance at the higher education level, because academic success in earlier semesters strongly influences students’ later progress and retention. In semester-based educational systems, many students experience academic difficulties or fail to achieve satisfactory grades during the initial stages of higher education, so early prediction of student performance is essential for improving retention and academic outcomes. Educational Data Mining (EDM) provides techniques for extracting hidden patterns and useful knowledge from large volumes of educational data, and these insights can be used to predict students’ future academic success and support timely interventions. The primary objective of this research is to evaluate student performance using multiple classification techniques and to identify the model that achieves the highest predictive accuracy. The educational dataset used in this study is obtained from a Kaggle repository. The proposed methodology consists of several stages. First, the dataset is preprocessed by removing duplicate records and handling missing values through data imputation. Next, three classification algorithms are implemented using the Weka data mining tool: a Deep Learning-based Neural Network (NN) and two traditional machine learning techniques, Random Forest (RF) and Support Vector Machine (SVM).
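The preprocessing stage described above (duplicate removal followed by imputation) can be illustrated with a minimal sketch. This is not the Weka workflow used in the study: the function name `preprocess`, the use of `None` to mark a missing value, and the choice of column-mean imputation are all assumptions made for illustration, since the abstract does not state which imputation technique was applied.

```python
from statistics import mean


def preprocess(rows):
    """Drop exact duplicate records, then fill missing values with column means.

    Assumptions (not taken from the paper): every row is a list of numeric
    features of equal length, and a missing value is represented by None.
    """
    # Step 1: remove duplicate records while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))

    if not unique:
        return []

    # Step 2: mean imputation, computed per column over the observed values.
    n_cols = len(unique[0])
    for col in range(n_cols):
        observed = [r[col] for r in unique if r[col] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        fill = mean(observed) if observed else 0.0
        for r in unique:
            if r[col] is None:
                r[col] = fill
    return unique


# Hypothetical records: [attendance_rate, midterm_score]
records = [
    [0.9, 72.0],
    [0.9, 72.0],   # exact duplicate of the first record
    [0.6, None],   # missing midterm score
    [0.8, 64.0],
]
cleaned = preprocess(records)
```

Removing duplicates before imputing keeps repeated records from skewing the column means used to fill the gaps.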
To enhance feature quality and reduce dimensionality, Principal Component Analysis (PCA) is applied for feature extraction. The performance of all classification models is evaluated using the training–testing split validation available in the Weka environment. The models are assessed using standard evaluation metrics: Training Accuracy, Testing Accuracy, Precision, Recall, and F1-Score. Experimental results indicate that the Neural Network and Random Forest classifiers outperform the SVM model in terms of predictive accuracy and overall classification performance.
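The evaluation metrics named above follow standard definitions, which the sketch below computes from true and predicted labels. It assumes a binary labelling (for example, 1 for an at-risk student and 0 otherwise); the abstract does not specify the class structure, and the function name `binary_metrics` and the sample labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier.

    Assumption: labels are binary and `positive` marks the class of interest.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical held-out labels and predictions for eight students.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)
```

Applying the same function to predictions on the training portion and on the testing portion of the split gives the Training Accuracy and Testing Accuracy figures respectively.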