A Predictive Analysis Model for the Performance of Course and Training Institutions Using Naive Bayes and K-Means Clustering
Keywords:
Predictive Analysis, Naive Bayes, K-Means, Institutional Performance, Data Mining
Abstract
The performance of course and training institutions (LKP) is a crucial factor in determining the quality of non-formal education in Indonesia. Performance assessments are currently conducted manually using the assessment instrument of the National Accreditation Board for Non-Formal Education (BAN-PNF), a process that is time-consuming and prone to subjectivity. This research aims to develop a predictive analysis model for the performance of course and training institutions by combining the Naive Bayes and K-Means Clustering methods. K-Means Clustering groups institutions with similar characteristics across key variables such as trainers, infrastructure, curriculum, management, and graduate outcomes. The resulting cluster assignments are then used as additional features for the Naive Bayes classification model, which predicts a performance category (high, medium, or low). Testing on data from 150 institutions showed a predictive accuracy of 89.2%, with three main clusters representing high-, medium-, and low-performing institutions. The model has the potential to serve as a data-driven tool that lets governments and institutions evaluate performance quickly and objectively, and that can adapt to changes in training data.
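The two-stage idea described in the abstract (cluster first, then classify with the cluster assignment as an extra feature) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the data are synthetic scores generated for illustration only, and the five feature columns, the one-hot encoding of the cluster, the Gaussian form of Naive Bayes, the variance floor, and all function names are assumptions made here.

```python
import math
import random

def nearest(p, centroids):
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def add_cluster_feature(points, centroids):
    """Append a one-hot encoding of the assigned cluster to each feature vector."""
    out = []
    for p in points:
        j = nearest(p, centroids)
        out.append(list(p) + [1.0 if i == j else 0.0 for i in range(len(centroids))])
    return out

def fit_gaussian_nb(X, y, var_floor=1e-2):
    """Per class: log prior, feature means, feature variances (floored, assumed value)."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + var_floor
                     for col, m in zip(cols, means)]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior under the independence assumption."""
    def log_post(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances))
    return max(model, key=lambda label: log_post(model[label]))

def make_synthetic(n_per_class, rng):
    """Synthetic scores for five hypothetical variables (trainers, infrastructure,
    curriculum, management, graduate outcomes). NOT the study's data."""
    centers = {"high": 85.0, "medium": 65.0, "low": 45.0}
    X, y = [], []
    for label, c in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(c, 5.0) for _ in range(5)])
            y.append(label)
    return X, y

rng = random.Random(42)
X_train, y_train = make_synthetic(30, rng)
X_test, y_test = make_synthetic(10, rng)

# Stage 1: cluster the training institutions into three groups.
centroids = kmeans(X_train, k=3)
# Stage 2: train Naive Bayes on the original features plus the cluster feature.
model = fit_gaussian_nb(add_cluster_feature(X_train, centroids), y_train)
preds = [predict(model, x) for x in add_cluster_feature(X_test, centroids)]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The centroids are fitted on the training split only and then reused to assign clusters to held-out institutions, so the extra feature is built without looking at the test data. The accuracy printed here reflects the synthetic data and says nothing about the 89.2% reported in the study.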
License
Copyright (c) 2025 Muhammad Iqbal, Eko Budianto

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.




