TY - JOUR
T1 - A Hybrid Machine Learning Approach for Predicting Student Performance Using Multi-class Educational Datasets
AU - Al-Tameemi, Ghaith
AU - Xue, James
AU - Hadi, Israa
AU - Ajit, Suraj
PY - 2024/7/8
Y1 - 2024/7/8
N2 - Prediction of students’ academic performance has garnered considerable interest, with many institutions seeking to enhance students’ performance and their quality of education. The integration of both unsupervised and supervised machine learning techniques has demonstrated significant efficacy in predicting student performance. This paper explores the application of different machine learning methods in predicting student academic performance. Initially, Principal Component Analysis (PCA) was utilised to reduce the dataset’s dimensionality, thereby improving its visualisation. Subsequently, K-Means clustering was employed to segregate students into distinct groups, reflective of their learning behaviors. Afterwards, the observed clusters were utilised for training classification models to address each student cluster individually. This approach was implemented in a case study involving an undergraduate science course at a North American University (NAU) and the Open University Learning Analytics Dataset (OULAD). Empirical findings indicate that the combined use of Feedforward Dense Network (FDN), Random Forest (RF), and Decision Tree (DT), specifically in their clustered forms, outperforms other classifiers in predicting student academic performance effectively.
AB - Prediction of students’ academic performance has garnered considerable interest, with many institutions seeking to enhance students’ performance and their quality of education. The integration of both unsupervised and supervised machine learning techniques has demonstrated significant efficacy in predicting student performance. This paper explores the application of different machine learning methods in predicting student academic performance. Initially, Principal Component Analysis (PCA) was utilised to reduce the dataset’s dimensionality, thereby improving its visualisation. Subsequently, K-Means clustering was employed to segregate students into distinct groups, reflective of their learning behaviors. Afterwards, the observed clusters were utilised for training classification models to address each student cluster individually. This approach was implemented in a case study involving an undergraduate science course at a North American University (NAU) and the Open University Learning Analytics Dataset (OULAD). Empirical findings indicate that the combined use of Feedforward Dense Network (FDN), Random Forest (RF), and Decision Tree (DT), specifically in their clustered forms, outperforms other classifiers in predicting student academic performance effectively.
KW - Hybrid Machine Learning
KW - Principal Component Analysis (PCA)
KW - K-Means Clustering
KW - Student Performance
UR - https://pure.northampton.ac.uk/en/publications/801663e4-78a8-4a71-8065-32f39675a734
U2 - 10.1016/j.procs.2024.06.108
DO - 10.1016/j.procs.2024.06.108
M3 - Article
SN - 1877-0509
VL - 238
SP - 888
EP - 895
JO - Procedia Computer Science
JF - Procedia Computer Science
ER -