The Ngoko Javanese Stemmer uses the Enhanced Confix Stripping Stemmer Method

Shevia Ilfa Melia, Jamiatus Sholihah, Dianatin Nisak, Intan Sukma Juniaristha, Ana Tsalitsatun Ni'mah

Abstract

Stemming is a vital step in text processing. Most existing stemmers target Indonesian and English, because most of the text that is processed is written in those two languages. Indonesia also has several regional languages, which are often used in learning, especially in local-content school subjects. Research on processing Javanese text, and Ngoko Javanese in particular, is therefore needed to support education practitioners. Previous work on Ngoko Javanese stemming relied on an affix removal stemmer (a rule-based approach), which often fails to return the correct root word, so a check against a Ngoko Javanese dictionary is needed to improve the root words obtained. This study develops a stemmer for Ngoko Javanese using the Enhanced Confix Stripping (ECS) method. The stemmer splits words according to the ECS algorithm and checks the results against a dictionary adapted to Ngoko Javanese. The resulting stemmer returns Ngoko Javanese root words with an accuracy of 97 percent.
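
The idea of dictionary-checked affix stripping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and omits most of the ECS rules (such as recoding of nasalized prefixes). The root list, the affix lists, and the function name `stem` are all assumptions chosen for illustration.

```python
# Toy root dictionary (hypothetical); a real system would load a full Ngoko lexicon.
ROOTS = {"tulis", "gawa", "jupuk", "pangan"}

# Illustrative affix lists only, not the rule tables used in the paper.
SUFFIXES = ["ake", "ane", "an", "ne", "e", "i"]
PREFIXES = ["dak", "tak", "kok", "di", "ka", "ke", "sa", "pa"]


def stem(word, roots=ROOTS):
    """Return a dictionary root for `word`, or `word` unchanged if none is found."""
    word = word.lower()
    if word in roots:
        return word

    # Suffix-stripped forms come first; the original word comes last, which
    # corresponds to restoring the suffix and retrying with prefixes only.
    bases = [
        word[: -len(s)]
        for s in SUFFIXES
        if word.endswith(s) and len(word) - len(s) >= 2
    ]
    bases.append(word)

    for base in bases:
        if base in roots:
            return base
        for p in PREFIXES:
            candidate = base[len(p):]
            if base.startswith(p) and candidate in roots:
                return candidate

    # No candidate passed the dictionary check, so leave the word untouched.
    return word


if __name__ == "__main__":
    for w in ["ditulis", "tulisane", "dijupuki", "panganan", "sekolah"]:
        print(w, "->", stem(w))
```

Every candidate is accepted only if it appears in the dictionary, so a word that cannot be reduced to a known root is returned unchanged rather than over-stemmed.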

Keywords

Javanese Ngoko; Stemming; Enhanced Confix Stripping

DOI

https://doi.org/10.21107/rekayasa.v16i1.19308

Copyright (c) 2023 Shevia Ilfa Melia, Jamiatus Sholihah, Dianatin Nisak, Intan Sukma Juniaristha, Ana Tsalitsatun Ni'mah

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.