STUDENT ACADEMIC PERFORMANCE PREDICTION FRAMEWORK WITH FEATURE SELECTION AND IMBALANCED DATA HANDLING

Authors

  • Vivi Nur Wijayaningrum, Politeknik Negeri Malang, Indonesia
  • Annisa Puspa Kirana, Politeknik Negeri Malang, Indonesia
  • Ika Kusumaning Putri, Politeknik Negeri Malang, Indonesia

DOI:

https://doi.org/10.21107/kursor.v12i3.356

Keywords:

classification, drop out, Random Forest, SMOTE

Abstract

Various factors cause low student scores in practicum courses. If these factors are not identified, more students will drop out because of low scores, especially Vocational College students, who have no opportunity to improve their scores in a short semester. Students at risk of dropping out must be identified as early as possible because the number of dropouts can affect a university's accreditation value. In this study, student academic performance was predicted using a framework that combines imbalanced data handling with SMOTE, feature selection with Random Forest, and a Multi-Layer Perceptron (MLP) as the classification model. The MLP architecture consists of an input layer, two hidden layers with five neurons each, and an output layer with two neurons. Random Forest selected ten significant parameters from the 22 initial parameters, and these produced the most accurate predictions. In testing, the proposed framework achieved a best accuracy of 0.8889 and an F1-Score of 0.9032. These results indicate that the proposed framework can be used as an alternative for the Department to take action to prevent students from dropping out.
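
The imbalanced data handling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote`, the neighbour count `k`, the fixed seed, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration only; k and seed are assumed values.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features per sample (made-up values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points), "synthetic samples generated")
```

To balance a dataset, `n_new` would typically be set to the difference between the majority and minority class counts, and oversampling would be applied to the training split only so that synthetic samples do not leak into the test set.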

References

L. Bäulke, C. Grunschel, and M. Dresel, “Student dropout at university: a phase-orientated view on quitting studies and changing majors,” European Journal of Psychology of Education, pp. 1–24, 2021, doi: 10.1007/s10212-021-00557-x.

C. M. Fourie, “Risk factors associated with first-year students’ intention to drop out from a university in South Africa,” J Furth High Educ, vol. 44, no. 2, pp. 201–215, 2020, doi: 10.1080/0309877X.2018.1527023.

H. S. Park and S. J. Yoo, “Early Dropout Prediction in Online Learning of University using Machine Learning,” International Journal on Informatics Visualization, vol. 5, no. 4, pp. 347–353, 2021, doi: 10.30630/JOIV.5.4.732.

I. Alqudah, M. Barakat, S. M. Muflih, and A. Alqudah, “Undergraduates’ perceptions and attitudes towards online learning at Jordanian universities during COVID-19,” Interactive Learning Environments, pp. 1–18, 2021, doi: 10.1080/10494820.2021.2018617.

S. Bali and M. C. Liu, “Students’ perceptions toward online learning and face-to-face learning courses,” in Journal of Physics: Conference Series, 2018, p. 012094. doi: 10.1088/1742-6596/1108/1/012094.

M. Mather and A. Sarkans, “Student Perceptions of Online and Face-to-Face Learning,” International Journal of Curriculum and Instruction, vol. 10, no. 2, pp. 61–76, 2018.

F. Ferri, P. Grifoni, and T. Guzzo, “Online learning and emergency remote teaching: Opportunities and challenges in emergency situations,” Societies, vol. 10, no. 4, p. 86, 2020, doi: 10.3390/soc10040086.

A. Tayebi, J. Gomez, and C. Delgado, “Analysis on the Lack of Motivation and Dropout in Engineering Students in Spain,” IEEE Access, vol. 9, pp. 66253–66265, 2021, doi: 10.1109/ACCESS.2021.3076751.

J. Gabalán-Coello, A. L. Balcero-Molina, F. E. Vasquez Rizo, A. Martínez-González, and G. Fonseca-Grandón, “An Analysis of Accredited Colombian Universities, Based on Performance Variables Associated with Their Quality,” J Lat Educ, pp. 1–9, 2019, doi: 10.1080/15348431.2019.1665523.

E. Sneyers and K. De Witte, “The interaction between dropout, graduation rates and quality ratings in universities,” Journal of the Operational Research Society, vol. 68, no. 4, pp. 416–430, 2017, doi: 10.1057/jors.2016.15.

Q. Kabashi, I. Shabani, and N. Caka, “Analysis of the student dropout rate at the Faculty of Electrical and Computer Engineering of the University of Prishtina, Kosovo, from 2001 to 2015,” IEEE Access, vol. 10, pp. 68126–68137, 2022, doi: 10.1109/access.2022.3185620.

J. Jacqmin and M. Lefebvre, “The effect of international accreditations on students’ revealed preferences: Evidence from French Business schools,” Econ Educ Rev, vol. 85, p. 102192, 2021, doi: 10.1016/j.econedurev.2021.102192.

A. Acevedo-De-los-Ríos and D. R. Rondinel-Oviedo, “Impact, added value and relevance of an accreditation process on quality assurance in architectural higher education,” Quality in Higher Education, vol. 28, no. 2, pp. 186–204, 2022, doi: 10.1080/13538322.2021.1977482.

M. A. S. Mustapa, M. Ibrahim, and A. Yusoff, “Engaging Vocational College Students through Blended Learning: Improving Class Attendance and Participation,” Procedia Soc Behav Sci, vol. 204, pp. 127–135, 2015, doi: 10.1016/j.sbspro.2015.08.125.

R. Kadar, S. B. Mahlan, M. Shamsuddin, J. Othman, and N. A. Wahab, “Analysis of Factors Contributing to the Difficulties in Learning Computer Programming among Non-Computer Science Students,” in 2022 IEEE 12th Symposium on Computer Applications & Industrial Electronics (ISCAIE), IEEE, 2022, pp. 89–94. doi: 10.1109/ISCAIE54458.2022.9794546.

Y. P. Huang and Y. M. Huang, “Programming language learning supported by an accredited course strategy,” in 2013 IEEE 13th International Conference on Advanced Learning Technologies, IEEE, 2013, pp. 327–329. doi: 10.1109/ICALT.2013.101.

Z. Li, Z. Jie, and H. Daming, “Design and implementation of student programming profile-based teaching aids solution in introductory programming course,” in 2020 15th International Conference on Computer Science & Education (ICCSE), 2020, pp. 383–390. doi: 10.1109/ICCSE49874.2020.9201695.

A. Baist and A. S. Pamungkas, “Analysis of Student Difficulties in Computer Programming,” VOLT: Jurnal Ilmiah Pendidikan Teknik Elektro, vol. 2, no. 2, pp. 81–92, 2017, doi: 10.30870/volt.v2i2.2211.

L. Dombrovskaia, J. P. del Rio, and P. Rodríguez, “Prediction of student’s retention in first year of engineering program at a technological chilean university,” in 2020 39th International Conference of the Chilean Computer Science Society (SCCC), 2020, pp. 34–37. doi: 10.1109/SCCC51225.2020.9281195.

S. Sivakumar, S. Venkataraman, and R. Selvaraj, “Predictive modeling of student dropout indicators in educational data mining using improved decision tree,” Indian J Sci Technol, vol. 9, no. 4, pp. 1–5, 2016, doi: 10.17485/ijst/2016/v9i4/87032.

S. Roy and A. Garg, “Predicting academic performance of student using classification techniques,” in 2017 4th IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics (UPCON), IEEE, 2017, pp. 568–572. doi: 10.1109/UPCON.2017.8251112.

V. L. Miguéis, A. Freitas, P. J. V. Garcia, and A. Silva, “Early segmentation of students according to their academic performance: A predictive modelling approach,” Decis Support Syst, vol. 115, pp. 36–51, 2018, doi: 10.1016/j.dss.2018.09.001.

H. A. Mengash, “Using data mining techniques to predict student performance to support decision making in university admission systems,” IEEE Access, vol. 8, pp. 55462–55470, 2020, doi: 10.1109/ACCESS.2020.2981905.

H. P. Singh and H. N. Alhulail, “Predicting Student-Teachers Dropout Risk and Early Identification: A Four-Step Logistic Regression Approach,” IEEE Access, vol. 10, pp. 6470–6482, 2022, doi: 10.1109/ACCESS.2022.3141992.

Harwati, R. I. Virdyanawaty, and A. Mansur, “Drop out Estimation Students based on the Study Period: Comparison between Naïve Bayes and Support Vector Machines Algorithm Methods,” in IOP Conference Series: Materials Science and Engineering, 2016, p. 012039. doi: 10.1088/1757-899X/105/1/012039.

S. Mutrofin, A. M. Khalimi, E. Kurniawan, R. V. H. Ginardi, C. Fatichah, and Y. A. Sari, “Detection of Potentially Students Drop out of College in Case of Missing Value Using C4.5,” in 2019 International Conference on Sustainable Engineering and Creative Computing (ICSECC), IEEE, 2019, pp. 349–354. doi: 10.1109/ICSECC.2019.8907014.

M. Utari, B. Warsito, and R. Kusumaningrum, “Implementation of Data Mining for Drop-Out Prediction using Random Forest Method,” in 2020 8th International Conference on Information and Communication Technology (ICoICT), IEEE, 2020, pp. 1–5. doi: 10.1109/ICoICT49345.2020.9166276.

M. I. Sa’ad, Kusrini, and M. S. Mustafa, “Student Prediction of Drop out Using Extreme Learning Machine (ELM) Algorithm,” in 2020 2nd International Conference on Cybernetics and Intelligent System (ICORIS), IEEE, 2020, pp. 1–6. doi: 10.1109/ICORIS50180.2020.9320831.

I. M. S. Bimantara and I. M. Widiartha, “Optimization of K-Means Clustering Using Particle Swarm Optimization Algorithm for Grouping Traveler Reviews Data on Tripadvisor Sites,” Jurnal Ilmiah KURSOR, vol. 12, no. 1, pp. 1–10, 2023.

M. S. Pathan, A. Nag, M. M. Pathan, and S. Dev, “Analyzing the impact of feature selection on the accuracy of heart disease prediction,” Healthcare Analytics, vol. 2, p. 100060, 2022.

S. Adi, Y. Pristyanto, and A. Sunyoto, “The best features selection method and relevance variable for web phishing classification,” in 2019 International Conference on Information and Communications Technology (ICOIACT), IEEE, 2019, pp. 578–583. doi: 10.1109/ICOIACT46704.2019.8938566.

S. Wild and L. S. Heuling, “Student dropout and retention: An event history analysis among students in cooperative higher education,” Int J Educ Res, vol. 104, p. 101687, 2020, doi: 10.1016/j.ijer.2020.101687.

M. K. Morampudi, N. Gonthina, V. D. Reddy, and K. S. Rao, “Analyzing Student Performance in Programming Education Using Classification Techniques,” in 2022 International Conference on Advancements in Smart, Secure and Intelligent Computing (ASSIC), IEEE, 2022. doi: 10.1109/ASSIC55218.2022.10088377.

R. Anantama, H. Suyono, and M. Aswin, “Application of Cost-Sensitive Convolutional Neural Network for Pneumonia Detection,” Jurnal Ilmiah KURSOR, vol. 11, no. 3, pp. 101–108, 2022.

I. P. B. W. Brata and I. D. M. B. A. Darmawan, “Neural Network Backpropagation for Kendang Tunggal Tone Classification,” Jurnal Ilmiah KURSOR, vol. 11, no. 2, pp. 63–74, 2021.

H. A. Rosyid, A. Maulana, and U. Pujianto, “Can K-Nearest Neighbor Method Be Used To Predict Success in Indonesia State University Student Selection,” Jurnal Ilmiah KURSOR, vol. 9, no. 4, pp. 137–144, 2018.

V. N. Wijayaningrum, A. P. Kirana, I. K. Putri, and T. O. Satrio, “Prediction of Student Academic Performance in Practicum Courses Based on Activity Logs and Student Background,” in 2022 International Conference on Electrical and Information Technology, IEEE, 2022, pp. 366–371. doi: 10.1109/IEIT56384.2022.9967888.

F. Thiele, A. J. Windebank, and A. M. Siddiqui, “Motivation for using data-driven algorithms in research: A review of machine learning solutions for image analysis of micrographs in neuroscience,” Journal of Neuropathology and Experimental Neurology, vol. 82, no. 7, pp. 595–610, Jul. 2023, doi: 10.1093/jnen/nlad040.

L. Qiu, Y. Liu, Q. Hu, and Y. Liu, “Student dropout prediction in massive open online courses by convolutional neural networks,” Soft Comput, vol. 23, no. 20, pp. 10287–10301, Oct. 2019, doi: 10.1007/s00500-018-3581-3.

E. T. Lau, L. Sun, and Q. Yang, “Modelling, prediction and classification of student academic performance using artificial neural networks,” SN Appl Sci, vol. 1, no. 9, pp. 1–10, Sep. 2019, doi: 10.1007/s42452-019-0884-7.

S. C. Tsai, C. H. Chen, Y. T. Shiao, J. S. Ciou, and T. N. Wu, “Precision education with statistical learning and deep learning: a case study in Taiwan,” International Journal of Educational Technology in Higher Education, vol. 17, no. 12, pp. 1–13, Dec. 2020, doi: 10.1186/s41239-020-00186-2.

E. Ismanto, H. A. Ghani, N. I. M. Saleh, J. Al Amien, and R. Gunawan, “Recent systematic review on student performance prediction using backpropagation algorithms,” Telkomnika (Telecommunication Computing Electronics and Control), vol. 20, no. 3, pp. 597–606, 2022, doi: 10.12928/TELKOMNIKA.v20i3.21963.

V. N. Wijayaningrum, I. K. Putri, A. P. Kirana, M. R. Mubarok, D. M. Harahap, and B. R. Hamesha, “Analisis Performa Seleksi Atribut untuk Menentukan Potensi Mahasiswa Putus Studi [Performance Analysis of Attribute Selection to Determine the Potential of Students Drop Out],” Jurnal Informatika Polinema, vol. 9, no. 2, pp. 237–243, 2023.

C. Kaope and Y. Pristyanto, “The Effect of Class Imbalance Handling on Datasets Toward Classification Algorithm Performance,” MATRIK: Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, vol. 22, no. 2, pp. 227–238, Mar. 2023, doi: 10.30812/matrik.v22i2.2515.

D. J. Maulana, S. Saadah, and P. E. Yunanto, “Kmeans-SMOTE Integration for Handling Imbalance Data in Classifying Financial Distress Companies using SVM and Naïve Bayes,” Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), vol. 8, no. 1, pp. 54–61, 2024, doi: 10.29207/resti.v8i1.5150.

Published

2024-05-25

Section

Articles
