A PREDICTIVE ANALYSIS OF STUDENT DROPOUTS IN IT HIGHER EDUCATION PROGRAMMES
U.G.N. Kumari*
Faculty of Information Technology, University of Moratuwa, Sri Lanka
Session: Technical Session E
Abstract
This study aims to identify the key attributes that contribute to student dropouts in Information and Communication Technology (ICT) courses offered by Higher Education Institutes, a significant issue in educational data mining. It explores distinct factors influencing dropout rates that have been underexplored in the existing literature. Data were collected from five batches of students enrolled in an Information Technology course at a government tertiary education institute in Sri Lanka. The collected data were pre-processed, and feature selection was carried out using Correlation-Based Feature Selection (CFS) to identify subsets of attributes closely linked to dropout outcomes. Commonly used classification algorithms, namely Decision Tree, K-Nearest Neighbor, Naïve Bayes, and Rule-Based approaches, were trained and evaluated using confusion matrix metrics. These models attained an accuracy of over 83.17% in relating dropout factors to dropout status, which is recorded as “Yes” or “No”. The J48 Decision Tree was the best-performing algorithm for this dataset and was therefore used for the predictive modeling of student profiles. The model’s performance was validated using a new dataset sourced from institutional records. The dropout prediction application was implemented using the Java WEKA API and achieved 92.61% accuracy in predicting student dropouts in ICT higher education across all educational streams. The study uncovers strong relationships between dropout factors and dropout status; in the Sri Lankan context, the most significant factors are perceived course quality, previous academic qualifications, previous ICT experience, Ordinary Level results, and English proficiency level.
This model can be used to predict student dropouts in ICT higher education, allowing early identification of at-risk students and facilitating targeted intervention strategies.
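As an illustration of the confusion matrix metrics mentioned above, the following minimal sketch shows how accuracy, precision, recall and F1 can be derived from the four cells of a two-class confusion matrix, with “Yes” (dropout) treated as the positive class. It is not the study's implementation: the class name DropoutMetrics and the counts in main are hypothetical and chosen only for illustration, and the code uses plain Java rather than the WEKA API.

```java
public class DropoutMetrics {

    // All methods take counts from a 2x2 confusion matrix where
    // "Yes" (dropout) is the positive class:
    //   tp = actual Yes predicted Yes, fn = actual Yes predicted No,
    //   fp = actual No predicted Yes,  tn = actual No predicted No.

    // Fraction of all students whose dropout status was predicted correctly.
    static double accuracy(int tp, int fn, int fp, int tn) {
        return (double) (tp + tn) / (tp + fn + fp + tn);
    }

    // Of the students predicted to drop out, the fraction who actually did.
    static double precision(int tp, int fp) {
        return (double) tp / (tp + fp);
    }

    // Of the students who actually dropped out, the fraction the model caught.
    static double recall(int tp, int fn) {
        return (double) tp / (tp + fn);
    }

    // Harmonic mean of precision and recall.
    static double f1(int tp, int fn, int fp) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Hypothetical counts for illustration only; not results from the study.
        int tp = 40, fn = 10, fp = 5, tn = 45;

        System.out.printf("Accuracy:  %.4f%n", accuracy(tp, fn, fp, tn));
        System.out.printf("Precision: %.4f%n", precision(tp, fp));
        System.out.printf("Recall:    %.4f%n", recall(tp, fn));
        System.out.printf("F1:        %.4f%n", f1(tp, fn, fp));
    }
}
```

For early identification of at-risk students, recall on the “Yes” class is often worth reporting alongside accuracy, since a missed dropout (a false negative) is usually the costlier error. The sketch does not guard against empty classes; a zero denominator yields NaN.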
Keywords: classification algorithms, data mining, dropouts
DOI: 10.64752/VMTF4884