COMPARISON OF FUZZY SUBTRACTIVE CLUSTERING AND FUZZY C-MEANS

Multivariate statistical analysis often has to deal with high-dimensional data. One method for handling this is principal component analysis (PCA), a technique that reduces the dimension of data consisting of several dependent variables while retaining as much of the variance in the data as possible. PCA can be used to stabilize measurements in statistical analyses such as cluster analysis. Fuzzy clustering is a grouping method based on membership values, in which fuzzy sets serve as the weighting basis for grouping. In this study, the fuzzy clustering methods used are Fuzzy Subtractive Clustering (FSC) and Fuzzy C-Means (FCM), both with the Minkowski-Chebyshev combination distance. The purpose of this study is to compare the clustering results of FSC and FCM using the Davies-Bouldin Index (DBI) as the validity index. The resulting DBI values were 3.9164 for FSC and 1.6468 for FCM. Because a lower DBI indicates a better clustering, FCM gives better results than FSC on this data.


INTRODUCTION
A common problem in multivariate statistics is analyzing large data sets with many variables in order to obtain a smaller set of variables (dimension reduction) while preserving the information contained in the data. One popular method for dimension reduction in multivariate statistics is Principal Component Analysis (PCA) [1]. PCA reduces the dimension of data consisting of several dependent variables while retaining the variance of the data [2]. It does so by transforming the data into new, uncorrelated variables (principal components) that capture the maximum variance. According to Jolliffe [3], PCA can be used to stabilize measurements in statistical analyses such as regression analysis or cluster analysis.
Clustering is a classification process applied to a data set whose cluster labels are unknown [4]. A clustering method places data in the same cluster when they share a certain level of similarity. Hence, a good clustering has a high degree of similarity within each cluster and a high degree of dissimilarity between clusters [5].
Fuzzy clustering is a clustering method in which grouping is based on membership degrees and uses fuzzy sets as its weighting basis. Every data point has a degree of membership in every cluster, which means that a data point does not belong exclusively to a single cluster [6]. Several techniques can be used for fuzzy clustering, such as Fuzzy Subtractive Clustering and Fuzzy C-Means.
According to Kusumadewi and Purnomo [7], Fuzzy Subtractive Clustering (FSC) is a clustering method applied to a data set for which the number of clusters to be formed is not known in advance. The basic concept of FSC is to determine the data points that have the highest density. The data point with the highest number of neighboring data points is chosen as a cluster centroid. The density of the point chosen as the cluster centroid is then reduced, and the algorithm selects the next data point with the most neighbors as a new cluster centroid. Fuzzy C-Means (FCM) is a fuzzy clustering method in which the number of clusters to be formed is determined beforehand. FCM partitions the data into c clusters by minimizing the FCM objective function subject to the given constraints [8].
Several studies have applied FCM. Zhang et al. [9] used it for image clustering, and Qiao and Yang [10] applied FCM to the optimal scale function problem in combination with the dolphin swarm algorithm. In the medical field, Yepuganti, Saladi, and Narasimhulu [11] used FCM for tumor segmentation. The data in that study were magnetic resonance (MR) brain images, processed with the discrete wavelet transform and feature reduction. PCA and FCM were also applied by Upadhyay and Nagpal [12] to reduce the dimension of EEG data and to classify the EEG data into 3 classes, namely awake, SWS, and REM. In addition, Zhao et al. [13] studied AE signal classification using PCA and FCM.
The other fuzzy clustering method, FSC, has been applied in the economic field by Widodo [14] to build a prediction model for certain product brands, and by Chandar [15] to predict the stock market using subtractive clustering with a hybrid neuro-fuzzy approach. FSC has also been combined with wavelets to improve a low-cost, high-speed navigation system [16]. Benmouiza and Cheknane [17] used FSC in an ANFIS network for predicting hourly solar radiation. In addition, Chen and Li [18] used FSC in combination with PCA to analyze the service level of online taxis.
Both fuzzy clustering methods, FCM and FSC, need a distance to measure the degree of similarity within a data set. In this research, the distance used is the Minkowski-Chebyshev combination distance proposed by Rodrigues. Rodrigues [19] used this distance for classification with K-Nearest Neighbors (KNN) and showed that it provides high accuracy. The distance was also used by Surono and Putri [20] in one of the fuzzy clustering methods, FCM, in combination with PCA. Based on this earlier work, this research compares the clustering results of FCM and FSC when both use the Minkowski-Chebyshev combination distance.

Principal Component Analysis (PCA)
Suppose there are $m$ random variables $X_1, X_2, \dots, X_m$. The principal components are formed as linear combinations of these variables. There are two methods to determine the number of principal components.
First, through the proportion of total variance (PTV). If the first, second, or third components account for the majority of the total variance of the $m$ variables, namely 80% to 90%, then those components can replace the $m$ original variables without losing their main information [21].
Second, through the cumulative proportion of variance (PKV). If the first, second, or third components account for the majority of the total variance of the $m$ variables, namely 70% to 80%, then those components can replace the $m$ original variables without losing their main information [22].
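The cumulative-proportion rule can be illustrated with a short Python sketch. It assumes the eigenvalues of the covariance matrix are already available; the function names and the example eigenvalues are hypothetical and are not the values obtained in this study.

```python
def cumulative_variance(eigenvalues):
    """Cumulative proportion of variance for eigenvalues sorted in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    proportions = []
    for value in ordered:
        running += value
        proportions.append(running / total)
    return proportions


def components_needed(eigenvalues, threshold=0.70):
    """Smallest number of leading components whose cumulative proportion reaches the threshold."""
    for count, proportion in enumerate(cumulative_variance(eigenvalues), start=1):
        if proportion >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues for five variables (illustration only, not the study's data).
example_eigenvalues = [2.0, 1.0, 0.8, 0.7, 0.5]
print(components_needed(example_eigenvalues, threshold=0.70))  # 3
```

With these illustrative eigenvalues the cumulative proportions are 0.40, 0.60, 0.76, 0.90, and 1.00, so a 70% threshold keeps three components.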

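Minkowski-Chebyshev Combination Distance
The combination distance applied by Rodrigues [19] is the Minkowski-Chebyshev combination distance. It is defined as follows:
$$d_{(w_1, w_2, p)}(x, y) = w_1 \left( \sum_{i=1}^{n} |x_i - y_i|^{p} \right)^{1/p} + w_2 \max_{1 \le i \le n} |x_i - y_i|$$
where $x_i$ and $y_i$ are the $i$-th components of $x$ and $y$ in $n$ dimensions, $p$ is the order of the Minkowski distance, and $w_1$ and $w_2$ are the weights of the Minkowski and Chebyshev terms. If $w_1 > w_2$, the distance behaves more like the Minkowski distance, and vice versa.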
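A minimal Python sketch of this distance follows. It assumes the combination is the weighted sum of a Minkowski term of order p and a Chebyshev term, as proposed by Rodrigues [19]. The function name is hypothetical, and the default values w1 = 3, w2 = 2, p = 4 are the parameter values reported later in this paper.

```python
def minkowski_chebyshev(x, y, w1=3.0, w2=2.0, p=4.0):
    """Weighted sum of the Minkowski distance (order p) and the Chebyshev distance."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    minkowski = sum(d ** p for d in diffs) ** (1.0 / p)
    chebyshev = max(diffs)
    return w1 * minkowski + w2 * chebyshev


# With p = 2 the Minkowski term is the Euclidean distance: 1 * 5 + 1 * 4 = 9.
print(minkowski_chebyshev((0.0, 0.0), (3.0, 4.0), w1=1.0, w2=1.0, p=2.0))
```

Setting w2 = 0 recovers a scaled Minkowski distance and setting w1 = 0 recovers a scaled Chebyshev distance, which matches the statement that the larger weight determines which distance dominates.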

Fuzzy Subtractive Clustering (FSC)
According to Kusumadewi and Purnomo [7], Fuzzy Subtractive Clustering (FSC) is an unsupervised algorithm in which the number of clusters to be formed is not known in advance. The basic concept of this method is to find the data points that have a high density with respect to their neighboring points. The data point with the highest number of neighboring data points is chosen as a cluster centroid. The density of this newly formed cluster centroid is then reduced, and the algorithm chooses the next data point with the most neighbors as another cluster centroid. This process is repeated until all of the data points have been tested.
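The density-and-reduction idea described above can be sketched in Python. This is a minimal sketch that assumes the classical subtractive clustering potential with a Gaussian kernel and squared Euclidean distance. The function names, the constant 4 in the kernel, and the way the squash factor scales the radius are assumptions made for illustration; the variant used in this study, which relies on the Minkowski-Chebyshev distance, may differ.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def initial_potentials(data, radius):
    """Potential (density) of each point: high when many points lie within the radius."""
    alpha = 4.0 / radius ** 2
    return [sum(math.exp(-alpha * squared_distance(x, y)) for y in data) for x in data]


def reduce_potentials(data, potentials, centre_index, radius, squash=1.25):
    """Subtract the influence of the chosen centroid from every point's potential."""
    beta = 4.0 / (squash * radius) ** 2
    centre = data[centre_index]
    peak = potentials[centre_index]
    return [
        max(0.0, p - peak * math.exp(-beta * squared_distance(x, centre)))
        for x, p in zip(data, potentials)
    ]


# Hypothetical data: three points close together and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
potentials = initial_potentials(points, radius=1.0)
first = potentials.index(max(potentials))      # densest point becomes the first centroid
reduced = reduce_potentials(points, potentials, first, radius=1.0)
second = reduced.index(max(reduced))           # next candidate after the reduction
print(first, second)
```

After the reduction, points close to the first centroid lose almost all of their potential, so the next candidate comes from a different region of the data.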
Two comparison factors are used in FSC, namely the accept ratio and the reject ratio. Both are real numbers between 0 and 1. The accept ratio is the lower limit above which a data point that is a cluster centroid candidate is accepted as a cluster centroid. The reject ratio is the upper limit below which a candidate data point is not eligible to become a cluster centroid. Three conditions can occur during the iteration. In the first condition, ratio > accept ratio, and the data point is accepted as a new cluster centroid. In the second condition, reject ratio < ratio ≤ accept ratio, and the data point is accepted as a new cluster centroid only if it lies far enough from the existing cluster centroids, that is, if the sum of the ratio and the distance from this data point to the nearest existing cluster centroid is at least 1. If that sum is less than 1, the data point is not accepted as a new cluster centroid and is no longer considered (its potential is set to 0). In the third condition, ratio ≤ reject ratio, no remaining data point can be considered as a new cluster centroid, and the iteration is terminated.
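The three conditions can be written as a small decision function. This is a sketch under stated assumptions: `ratio` is the candidate's potential divided by the potential of the first centroid, `nearest_distance` is the distance to the nearest existing centroid already normalized by the cluster radius, and the function name and return labels are hypothetical. The default thresholds 0.8 and 0.2 are the accept and reject ratios reported later in this paper.

```python
def centroid_decision(ratio, nearest_distance, accept_ratio=0.8, reject_ratio=0.2):
    """Classify a centroid candidate as 'accept', 'reject', or 'stop'."""
    if ratio > accept_ratio:
        # Condition 1: potential is high enough to accept outright.
        return "accept"
    if ratio > reject_ratio:
        # Condition 2: accept only if the candidate is far enough from existing centroids.
        if ratio + nearest_distance >= 1.0:
            return "accept"
        return "reject"  # the caller sets this candidate's potential to 0
    # Condition 3: no remaining candidate is dense enough, so the search ends.
    return "stop"


print(centroid_decision(0.9, 0.0))   # accept
print(centroid_decision(0.5, 0.6))   # accept
print(centroid_decision(0.5, 0.3))   # reject
print(centroid_decision(0.1, 2.0))   # stop
```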

Fuzzy C-Means (FCM)
Fuzzy clustering is a grouping technique in which every data point can be a member of each cluster, with a different membership degree for each cluster. One such method is FCM. The technique was introduced by Bezdek in 1981 and partitions a data set into c clusters, where c is fixed in advance. FCM starts by determining the cluster centroids, which mark the average location of each cluster; in the initial condition these centroids are still inaccurate. Each data point has a membership degree for every cluster. The cluster centroids and the membership degrees of every data point are then updated repeatedly, so that the cluster centroids move toward the most suitable positions. This iteration is based on minimizing an objective function that describes the distance from each data point to the cluster centroids, weighted by the membership degree of that data point.
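The iteration can be sketched in plain Python as follows. This is a minimal sketch of the standard Bezdek formulation with squared Euclidean distance, not the exact procedure of this study: the function names, the random initialization, and the default parameters are assumptions, and swapping in another squared distance through `dist2` while keeping the weighted-mean centroid update would be a heuristic.

```python
import random


def squared_euclidean(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fcm(data, c=3, m=2.0, max_iter=100, tol=1e-5, dist2=squared_euclidean, seed=0):
    """Return (centroids, partition matrix, last objective value)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial partition matrix; every row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    previous = None
    for _ in range(max_iter):
        # Centroids are means of the data weighted by membership ** m.
        centroids = []
        for k in range(c):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            centroids.append(
                [sum(w * data[i][j] for i, w in enumerate(weights)) / total for j in range(dim)]
            )
        # Squared distances, floored to avoid division by zero.
        d2 = [[max(dist2(data[i], centroids[k]), 1e-12) for k in range(c)] for i in range(n)]
        # Objective evaluated with the current memberships and the new centroids.
        objective = sum(u[i][k] ** m * d2[i][k] for i in range(n) for k in range(c))
        # Membership update: closer centroids receive larger degrees.
        power = 1.0 / (m - 1.0)
        for i in range(n):
            inverse = [(1.0 / d2[i][k]) ** power for k in range(c)]
            total = sum(inverse)
            u[i] = [value / total for value in inverse]
        # Stop when the objective changes by less than the tolerance.
        if previous is not None and abs(objective - previous) < tol:
            break
        previous = objective
    return centroids, u, objective


# Hypothetical data: two well-separated groups of three points each.
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres, memberships, final_objective = fcm(sample, c=2)
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Each row of the returned partition matrix sums to 1, and assigning every point to its highest-membership cluster recovers the two groups in this toy example.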

Validity Index
One method for measuring cluster validity is the Davies-Bouldin Index (DBI). It rewards a large distance between clusters and a small distance between data points within the same cluster. If the distance between clusters is large, the similarity between different clusters is low, and the data within each cluster tend to have a high degree of similarity. Under DBI, the optimal clustering has the minimum DBI value (close to 0). The DBI is defined as follows [23]:
$$DBI = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{SSW_i + SSW_j}{SSB_{i,j}} \qquad (6)$$
where $k$ is the number of clusters, $SSW_i$ is the within-cluster dispersion of cluster $i$, and $SSB_{i,j}$ is the separation between the centroids of clusters $i$ and $j$.
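The Davies-Bouldin Index described above can be computed with a short Python sketch. It assumes that the within-cluster dispersion is the mean distance of the members to their centroid, that the separation is the distance between centroids, and that hard labels are available (for fuzzy results, the highest-membership cluster of each point). The function names are hypothetical, and Euclidean distance is used by default.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def davies_bouldin(data, labels, dist=euclidean):
    """Average, over clusters, of the worst (dispersion_i + dispersion_j) / separation_ij."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    centroids, dispersion = {}, {}
    for k in clusters:
        members = [x for x, label in zip(data, labels) if label == k]
        dim = len(members[0])
        centroids[k] = [sum(x[j] for x in members) / len(members) for j in range(dim)]
        dispersion[k] = sum(dist(x, centroids[k]) for x in members) / len(members)
    worst = []
    for i in clusters:
        worst.append(
            max(
                (dispersion[i] + dispersion[j]) / dist(centroids[i], centroids[j])
                for j in clusters
                if j != i
            )
        )
    return sum(worst) / len(clusters)


# Hypothetical data: two compact clusters whose centroids are 10 units apart.
toy_data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = [0, 0, 1, 1]
print(davies_bouldin(toy_data, toy_labels))  # 0.2
```

In the toy example each cluster has dispersion 1 and the centroids are 10 apart, so the index is (1 + 1) / 10 = 0.2. The sketch does not guard against two clusters with identical centroids, which would cause a division by zero.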

RESULTS AND ANALYSIS
The analysis in this study begins with PCA, which produces the principal components used as input for fuzzy clustering. The data are hypertension data obtained from a public health center in Yogyakarta. The data set consists of 100 observations on 5 variables, giving a data matrix of size 100 × 5. The 5 variables are age ($X_1$), gender ($X_2$), systolic pressure ($X_3$), diastolic pressure ($X_4$), and body weight ($X_5$). The data were processed in Jupyter Notebook using the Python programming language.
The PCA calculation starts with data normalization, which brings the data into a common range. The covariance matrix is then computed to determine whether the variables are related. PCA is applied to this covariance matrix to obtain the eigenvalues. The number of principal components is determined from the proportion of total variance and the cumulative proportion. The eigenvalues and variance proportions are shown in Table 1. Table 1 shows that the PKV value at the third component is 75.86%. Because this value exceeds 70%, the data set can be represented by 3 components. The eigenvalues indicate each principal component's contribution to the variance of all the original variables analyzed. The graph of the principal components and eigenvalues is shown in Figure 1. Analysis and interpretation of this graph is necessary before discussing the main results of this study.
Figure 2 visualizes the output of the 3 principal components, where the red, green, and blue dots represent the first, second, and third components respectively.
In this research, the FSC process uses a squash factor q = 1.25, which sets the radius of the neighborhood around a cluster centroid within which the potential of the data points is reduced. The accept ratio of 0.8 is the lower limit above which a candidate data point is accepted as a cluster centroid, and the reject ratio of 0.2 is the upper limit below which a candidate data point is not eligible to become a cluster centroid. The radius and distance parameters used in this research are $r = 1.22$, $w_1 = 3$, $w_2 = 2$, and $p = 4$. The first step of the calculation is to transform the data into fuzzy numbers using the exponential membership function of Rezaei and Rezaei [24]; the output is shown in Table 2.
Then, the data in Table 2 are processed using the FSC method, which yields the cluster centroids for the radius 1.22. In the centroid matrix, the number of rows is the number of clusters and the number of columns is the number of variables. The first, second, and third rows are the first, second, and third cluster centroids respectively. The membership value of each data point is then calculated using the membership equation of [7],
where $\mu_{ki}$ is the membership value of data $i$ in cluster $k$ and $x_{ij}$ is the value of data $i$ in variable $j$.
The membership values are shown in Table 3. In Table 3, the first data point tends to belong to cluster 2 because its highest membership degree is in cluster 2. The second data point tends to belong to cluster 1 because its highest membership degree is in cluster 1, and so on up to the 100th data point.
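The rule of assigning each data point to the cluster with its highest membership degree can be sketched as follows. The function name and the membership values are hypothetical and are not taken from Table 3.

```python
def assign_clusters(membership):
    """membership[i][k] is the degree of data point i in cluster k; returns 1-based labels."""
    return [row.index(max(row)) + 1 for row in membership]


# Hypothetical membership degrees for three data points and three clusters.
example_membership = [
    [0.2, 0.7, 0.1],  # highest degree in cluster 2
    [0.6, 0.3, 0.1],  # highest degree in cluster 1
    [0.1, 0.2, 0.7],  # highest degree in cluster 3
]
print(assign_clusters(example_membership))  # [2, 1, 3]
```

If two clusters tie for the highest degree, this sketch picks the one with the lower index; the paper does not state how ties are handled.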
Afterwards, the FCM process is carried out by minimizing the FCM objective function. The initial settings for this calculation are 3 clusters, a weighting exponent of 2, a maximum of 100 iterations, and an error tolerance of $10^{-5}$. The values of $w_1$, $w_2$, and $p$ are the same as those used for the FSC process, namely 3, 2, and 4. The calculation starts from an initial partition matrix. The first iteration then produces the first set of cluster centroids, and the objective function value obtained from the first iteration is 330.3307.
After that, the iteration continues until $t > \text{MaxIter}$ or $|J^{(t)}(U, V) - J^{(t-1)}(U, V)| < \varepsilon$, where $t$ is the iteration counter, $J$ is the objective function, and $\varepsilon$ is the error tolerance. When one of these conditions is met, the iteration is terminated. This process yields the final cluster centroids, together with the new partition matrix obtained at $t = 100$.
Furthermore, the clusters produced by the FSC and FCM processes are evaluated using equation (6). The SSW and SSB values are computed from the FSC result, and based on equation (6) the DBI values for the FCM and FSC methods are obtained, as shown in Table 3.

CONCLUSION
This research combines PCA and fuzzy clustering methods. PCA is applied for dimension reduction. In this research, PCA produced 3 new variables, which are the principal components of the original 5 variables. The results from PCA are then used as input for the FSC and FCM methods. The distance used for fuzzy clustering is the Minkowski-Chebyshev combination distance. The clusters are evaluated using DBI, where the DBI value for FSC is 3.9164 with PCA and 0.5998 without PCA.
Meanwhile, the DBI value for FCM is 1.6468 with PCA and 2.0406 without PCA. Therefore, in this case, when the FCM method is used, the data set should first be reduced using PCA, whereas when the FSC method is used, the DBI value is better without PCA.