Comparison of Elbow and Silhouette Methods in Optimizing K-Prototype Clustering for Customer Transactions

Dendy Arizki Kuswardana, Dwi Arman Prasetya, Trimono Trimono, I Gede Susrama Mas Diyasa

Abstract


This research presents a comparative analysis of the Elbow and Silhouette methods for identifying the ideal number of clusters when applying the K-Prototypes algorithm to customer grouping based on purchase transaction data. The K-Prototypes algorithm is employed because it can handle numerical and categorical data simultaneously. Customer purchase transaction data from a Point of Sale (POS) system is analyzed through preprocessing, feature transformation, and attribute segmentation stages before being clustered with the K-Prototypes algorithm. To identify the optimal cluster count, this study employs two methods: the Elbow method and the Silhouette method. The results indicate that the Elbow method produces 2 clusters with a model evaluation score of 0.6368, while the Silhouette method suggests 2 clusters with a slightly lower score of 0.6186. In terms of computational efficiency, the Elbow method also demonstrates a faster processing time. These results highlight the significance of choosing an appropriate method for identifying the ideal number of clusters, ensuring that it aligns with the specific goals of the analysis, whether that means emphasizing superior inter-cluster distinction or favoring a more parsimonious model configuration.
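The two selection criteria can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the toy transactions, the weight `gamma`, the evenly spaced seeding, and the helper names (`mixed_distance`, `k_prototypes`, `silhouette`) are all assumptions made for illustration. The sketch assumes the usual K-Prototypes dissimilarity (squared Euclidean distance on numerical fields plus `gamma` times the number of categorical mismatches), reads the elbow as the point of largest curvature in the cost curve, and computes the silhouette on the same mixed dissimilarity.

```python
from collections import Counter

def mixed_distance(a, b, n_num, gamma):
    """Squared Euclidean on the first n_num fields + gamma * categorical mismatches."""
    num = sum((x - y) ** 2 for x, y in zip(a[:n_num], b[:n_num]))
    cat = sum(x != y for x, y in zip(a[n_num:], b[n_num:]))
    return num + gamma * cat

def k_prototypes(rows, k, n_num, gamma=1.0, n_iter=20):
    """Tiny K-Prototypes: prototype = mean of numeric fields, mode of categorical ones."""
    step = len(rows) // k
    protos = [rows[i * step] for i in range(k)]  # assumed seeding: evenly spaced rows
    labels = []
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: mixed_distance(r, protos[c], n_num, gamma))
                      for r in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, l in zip(rows, labels) if l == c]
            if not members:
                continue  # keep the old prototype for an empty cluster
            means = [sum(col) / len(members) for col in zip(*(m[:n_num] for m in members))]
            modes = [Counter(col).most_common(1)[0][0]
                     for col in zip(*(m[n_num:] for m in members))]
            protos[c] = tuple(means + modes)
    cost = sum(mixed_distance(r, protos[l], n_num, gamma) for r, l in zip(rows, labels))
    return labels, cost

def silhouette(rows, labels, n_num, gamma=1.0):
    """Mean silhouette coefficient using the mixed dissimilarity."""
    scores = []
    for i, r in enumerate(rows):
        by_cluster = {}
        for j, other in enumerate(rows):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(
                    mixed_distance(r, other, n_num, gamma))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:  # singleton cluster or only one cluster
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy transactions: (scaled amount, scaled item count, payment, channel)
rows = [
    (1.0, 1.0, "cash", "dine-in"), (1.2, 0.9, "cash", "dine-in"),
    (0.9, 1.1, "cash", "dine-in"), (1.1, 1.0, "cash", "takeaway"),
    (5.0, 5.2, "card", "delivery"), (5.1, 4.9, "card", "delivery"),
    (4.9, 5.0, "card", "delivery"), (5.2, 5.1, "card", "takeaway"),
]
N_NUM = 2  # number of leading numerical fields in each row

results = {k: k_prototypes(rows, k, N_NUM) for k in range(1, 5)}
costs = {k: cost for k, (_, cost) in results.items()}

# Elbow: the k where the cost curve bends most (largest second difference).
best_k_elbow = max(range(2, 4), key=lambda k: costs[k - 1] - 2 * costs[k] + costs[k + 1])

# Silhouette: the k (>= 2) with the highest mean silhouette coefficient.
sil = {k: silhouette(rows, results[k][0], N_NUM) for k in range(2, 5)}
best_k_sil = max(sil, key=sil.get)

print("Elbow suggests k =", best_k_elbow)
print("Silhouette suggests k =", best_k_sil)
```

The Elbow criterion only needs the clustering cost for each candidate k, whereas the silhouette requires pairwise dissimilarities between all records. This difference in work is one plausible reason for a gap in processing time between the two methods, although the sketch does not measure it.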


Keywords


Elbow; Silhouette; K-Prototype; Clustering; Customer Transactions



DOI: https://doi.org/10.21107/edutic.v12i1.29744


Copyright (c) 2025 Dendy Arizki Kuswardana, Dwi Arman Prasetya, Trimono Trimono, I Gede Susrama Mas Diyasa

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.


 J. Ilm. Edutic is licensed under a Creative Commons Attribution 4.0 International License