Analysis of Key Features Affecting the Effectiveness of English Language Learning Among Undergraduate Students Utilizing the Data Mining Techniques

Chidchanok Kaewpanitch
Sutep Tongngam


The objective of this research was to study and analyze key features affecting the effectiveness of English language learning among undergraduate students. The research used data mining techniques to compare the effectiveness and accuracy of two supervised machine learning algorithms, the Decision Tree and the Random Forest Classifier. The sample consisted of 299 students from the Faculty of Geoinformatics at Burapha University in academic years 2018-2020. A survey of these students collected 32 attributes for data mining and analysis. The content validity of the questionnaire, calculated with the Content Validity Index, was greater than 0.7. The performance of the Decision Tree and the Random Forest Classifier was evaluated using 10-fold cross-validation with a data ratio of 70:30. Accuracy, precision, and F1 score (the balance between precision and recall), presented in confusion matrix format, were used to interpret and analyze the results of the data models.
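As an illustration of how accuracy, precision, and F1 score follow from a confusion matrix, the sketch below computes them for a two-class case. It is a minimal example and not the study's code: the function names, the "pass"/"fail" labels, and the ten sample predictions are all hypothetical.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_scores(y_true, y_pred, positive):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels, invented for illustration only (not the study's data).
y_true = ["pass", "pass", "fail", "pass", "fail",
          "fail", "pass", "fail", "pass", "pass"]
y_pred = ["pass", "fail", "fail", "pass", "pass",
          "fail", "pass", "fail", "pass", "pass"]

counts = confusion_counts(y_true, y_pred, positive="pass")
result = classification_scores(y_true, y_pred, positive="pass")
print(counts, result)
```

With more than two performance classes, the same counts can be taken per class and averaged. Which averaging the study used is not stated here.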
The results indicated that the model developed with the Random Forest Classifier classified students' English performance more accurately than the Decision Tree model. The Random Forest Classifier achieved an accuracy of 85.14% in classifying students' performance from the attributes in the dataset, whereas the Decision Tree achieved an accuracy of 78.0% on the same dataset. The Random Forest Classifier also produced an attribute ranking and correlation coefficient metrics that measure the strength of the relationship between student grades and the other attributes. Enjoying learning English had a correlation of 0.1194 with students' English performance. Asking the teacher immediately when the content of a lesson was not understood had a correlation of 0.1149. Interest in and attention to English had a correlation of 0.0884. Studying in groups to review English lessons with friends when an exam was approaching had a correlation of 0.0864. Investing extra time in researching and studying English beyond the classroom had a correlation of 0.0747. Each of these values resulted from calculating the characteristics that affect English language learning in a university environment. This research can help improve undergraduate students' performance in learning English and can help professors understand how teaching methods affect student performance.
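To make the idea of ranking attributes by their correlation with grades concrete, the sketch below computes a Pearson correlation coefficient for each survey attribute and sorts the attributes by absolute strength. This is an assumption-laden illustration: the abstract does not state which correlation measure was used or how it was derived from the Random Forest model, and the attribute names, Likert-style responses, and grades are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical survey responses (1-5 scale) and grades, for illustration only.
survey = {
    "enjoys_learning_english": [5, 4, 2, 3, 5, 1],
    "asks_teacher_immediately": [4, 4, 1, 3, 5, 2],
    "studies_in_groups": [3, 5, 2, 2, 4, 3],
}
grades = [4, 3, 2, 3, 4, 1]

# Rank attributes from strongest to weakest absolute correlation with grades.
ranking = sorted(
    ((name, pearson(values, grades)) for name, values in survey.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: {r:.4f}")
```

A Random Forest can also rank attributes by impurity-based feature importance, which is a different quantity from a correlation coefficient. The sketch above covers only the correlation reading.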

