Predicting student dropout risk using machine learning: A case study at the Technical University of Manabí
Keywords:
algorithms, CRISP-DM, data mining, random forest, student dropout
Abstract
Student dropout affects every higher education institution. This research therefore applies machine learning methods and algorithms to estimate the risk of dropout among students at the Technical University of Manabí. Demographic and academic data on 10,002 students from different majors were extracted from the Academic Management System. Following the CRISP-DM methodology, phases and processes were specified, taking as a sample one degree program from each faculty over the academic periods from May 2014 to February 2019. Inclusion criteria were then established to verify the regularity of the students who attended classes, and the Pearson correlation coefficient was used to measure the relationship between dropouts and non-regular students. This process yielded three scenarios, on which logistic regression, KNN, neural networks, support vector machines, and random forest were run. The students' academic and demographic information confirmed the correlation between dropout students and non-regular students, so irregularity served as an estimator of dropout risk at the institution. The best scenario covered the second, third, and fourth levels of study, with a margin of error of 0.05 under the evaluation metrics, a high correlation at each level, an area under the curve (AUC) of 0.95, and an F1 score of 0.95.
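The two quantities the abstract relies on, the Pearson correlation coefficient and the F1 score, can be illustrated with a minimal sketch. This is not the study's code: the function names `pearson_r` and `f1_score`, the per-period counts, and the labels are all hypothetical and chosen only to show how each value is computed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def f1_score(y_true, y_pred, positive=1):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts per academic period (NOT the study's data).
non_regular = [12, 18, 25, 31, 40]
dropouts = [5, 9, 11, 16, 19]
r = pearson_r(non_regular, dropouts)

# Hypothetical labels: 1 = dropped out, 0 = stayed enrolled.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
f1 = f1_score(y_true, y_pred)

print(f"Pearson r = {r:.3f}, F1 = {f1:.2f}")
```

A value of r close to 1 would indicate that periods with more non-regular students also tend to have more dropouts, which is the kind of relationship the abstract reports. The F1 score summarizes how well a classifier identifies the dropout class when precision and recall both matter.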
License
Copyright (c) 2025 Kasetsart University
This is an open access article under the CC BY-NC-ND license http://creativecommons.org/licenses/by-nc-nd/4.0/



