• Willy Fernando, Institut Bisnis dan Teknologi Pelita Indonesia, Indonesia
• Deny Jollyta, Institut Bisnis dan Teknologi Pelita Indonesia, Indonesia
• Dadang Priyanto, Universitas Bumigora, Indonesia
• Dwi Oktarina, Institut Bisnis dan Teknologi Pelita Indonesia, Indonesia



Keywords: Categorization, Classification Algorithms, Confusion Matrix, Numerical Data


Problems with numerical data typically arise from a failure to comprehend the data and the outcomes of its processing. To give richer context and a deeper understanding of the facts, numerical data must be transformed into categories. However, changes to the data have a significant impact on the outcomes of the analysis. The purpose of this study is to determine how transforming numerical data into categories affects the models produced by classification algorithms. The dataset used in this study is the Maternal Health Risk dataset. Categorization follows formal arrangements and is also carried out with the Gini Index, which limits the number of instances of an attribute. Classification is performed with the Random Forest (RF), K-Nearest Neighbor (K-NN), and Support Vector Machine (SVM) algorithms to produce the models. The influence of the data modifications on each model is observed through the confusion matrix under five different data splits. The results suggest that changing numerical data to categorical data significantly improved the performance of the SVM model, from 76.92% to 80.77%, at a data split of 95/5.
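The abstract does not spell out how the Gini Index is applied during categorization. As a minimal sketch of one common approach (a CART-style binary cut, which is an assumption and not necessarily the authors' procedure), the code below picks the threshold on a numeric attribute that minimizes the weighted Gini impurity of the class labels and then maps each value to a category. The function names, the category labels, and the toy blood pressure values are all hypothetical and are not taken from the Maternal Health Risk dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best binary cut.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, inf) if the attribute has fewer than two distinct values.
    """
    pairs = list(zip(values, labels))
    n = len(pairs)
    distinct = sorted(set(values))
    best_threshold, best_score = None, float("inf")
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in pairs if v <= t]
        right = [y for v, y in pairs if v > t]
        # Impurity of the two sides, weighted by how many instances fall in each.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score


def categorize(values, threshold, low="low", high="high"):
    """Map each numeric value to one of two (hypothetical) category names."""
    return [low if v <= threshold else high for v in values]


# Hypothetical toy data: a numeric attribute and a binary risk label.
systolic_bp = [90, 100, 110, 120, 130, 140, 150, 160]
risk = ["low", "low", "low", "low", "high", "high", "high", "high"]

threshold, score = best_gini_threshold(systolic_bp, risk)
categories = categorize(systolic_bp, threshold)
```

On this toy data the classes separate cleanly, so the selected cut leaves both sides pure. Real attributes rarely separate this well, and a multi-category scheme would repeat the search within each resulting interval.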


