Model Predictive Analysis of Performance in Training and Course Institutions Using Naive Bayes and K-Means Clustering

Authors

  • Muhammad Iqbal, Universitas Pembangunan Panca Budi
  • Eko Budianto, Universitas Pembangunan Panca Budi

Keywords:

Predictive Analysis, Naive Bayes, K-Means, Institutional Performance, Data Mining

Abstract

The performance of course and training institutions (Lembaga Kursus dan Pelatihan, LKP) is a key determinant of the quality of non-formal education in Indonesia. Performance is currently assessed manually with the assessment instrument of the National Accreditation Board for Non-Formal Education (BAN-PNF), a process that is time-consuming and prone to subjectivity. This research develops a predictive analysis model for LKP performance that combines K-Means Clustering with Naive Bayes classification. K-Means first groups institutions with similar characteristics across five key variables: trainers, infrastructure, curriculum, management, and graduate outcomes. The clustering results are then supplied as additional features to a Naive Bayes classifier, which predicts each institution's performance category (high, medium, or low). In tests on data from 150 institutions, the model achieved a prediction accuracy of 89.2%, and the clustering yielded three main clusters corresponding to high-, medium-, and low-performing institutions. The model has the potential to serve as a data-driven tool that lets governments and institutions evaluate performance quickly and objectively, and that can adapt as the training data change.
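The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic scores on a 0-100 scale, the feature names, the helper functions (`make_synthetic_institutions`, `kmeans`, `fit_nb`, `predict_nb`), the choice of k = 3, and the hold-out split are all hypothetical. The sketch also assumes a Gaussian likelihood for the five scores and a Laplace-smoothed categorical likelihood for the cluster id, since the abstract does not say how the cluster feature is modelled.

```python
import math
import random

# Hypothetical feature names taken from the five variables in the abstract.
FEATURES = ["trainers", "infrastructure", "curriculum", "management", "graduate_outcomes"]

def make_synthetic_institutions(n_per_class=50, seed=42):
    """Generate synthetic 0-100 scores. This is NOT the study's data."""
    rng = random.Random(seed)
    centers = {"low": 40.0, "medium": 65.0, "high": 85.0}
    scores, labels = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            scores.append([rng.gauss(center, 6.0) for _ in FEATURES])
            labels.append(label)
    return scores, labels

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids

def fit_nb(scores, clusters, labels, k):
    """Naive Bayes: Gaussian per score, smoothed categorical for the cluster id."""
    model = {}
    for cls in sorted(set(labels)):
        idx = [i for i, t in enumerate(labels) if t == cls]
        cols = list(zip(*(scores[i] for i in idx)))
        means = [sum(col) / len(idx) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(idx) + 1e-3
                     for col, m in zip(cols, means)]
        counts = [1 + sum(1 for i in idx if clusters[i] == c) for c in range(k)]
        cluster_logp = [math.log(n / sum(counts)) for n in counts]
        model[cls] = (math.log(len(idx) / len(labels)), means, variances, cluster_logp)
    return model

def predict_nb(model, score, cluster):
    """Return the class with the highest log-posterior."""
    def log_posterior(cls):
        log_prior, means, variances, cluster_logp = model[cls]
        gauss = sum(-0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                    for v, m, var in zip(score, means, variances))
        return log_prior + gauss + cluster_logp[cluster]
    return max(model, key=log_posterior)

# Stage 0: synthetic data and a simple hold-out split (every fifth row is held out).
scores, labels = make_synthetic_institutions()
test_idx = set(range(0, len(scores), 5))
train = [i for i in range(len(scores)) if i not in test_idx]

# Stage 1: cluster the training institutions.
centroids = kmeans([scores[i] for i in train], k=3)
train_clusters = [nearest(scores[i], centroids) for i in train]

# Stage 2: Naive Bayes on the scores plus the cluster id as an extra feature.
model = fit_nb([scores[i] for i in train], train_clusters,
               [labels[i] for i in train], k=3)
predictions = {i: predict_nb(model, scores[i], nearest(scores[i], centroids))
               for i in sorted(test_idx)}
accuracy = sum(predictions[i] == labels[i] for i in predictions) / len(predictions)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The centroids are fitted on the training rows only, and held-out institutions receive their cluster id by nearest-centroid assignment, so no information from the evaluation set reaches the clustering step. Any accuracy printed here reflects the synthetic data and is not comparable to the 89.2% reported above.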

Published

2025-10-27