Finding Best Possible Number of Clusters using K-Means Algorithm
K. Maheswari
Dr. K. Maheswari, Department of Computer Applications, Kalasalingam Academy of Research and Education College, Krishnankovil Virudhunagar (Tamil Nadu), India.
Manuscript received on 24 November 2019 | Revised Manuscript received on 18 December 2019 | Manuscript Published on 30 December 2019 | PP: 533-538 | Volume-9 Issue-1S4 December 2019 | Retrieval Number: A11191291S419/19©BEIESP | DOI: 10.35940/ijeat.A1119.1291S419
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Customers are assets for a business, and companies are investing more in customer relationship management. Retaining customers over a long period is difficult in today’s market. Online shopping is also increasing day by day. People are more inclined to visit popular websites, and they spend very little time choosing their products. Online shops are paying more attention to analyzing customer preferences, needs, and shopping behavior through data mining techniques. Proper classification is necessary for organizing such data. In this work, customers with the same buying behavior are grouped based on the features age and salary. The K-Means algorithm is applied with different K values to form clusters on both the original data and the normalized data. The within-cluster sum of squares (WSS) is calculated for both data sets at each number of clusters. A lower WSS is considered better, and the minimum WSS is achieved with the normalized data. Cluster validity is evaluated by the elbow, silhouette, and gap statistic methods to choose the optimal number of clusters. This work is implemented in R.
Keywords: Cluster, Customer Purchase, K-Means and WSS.
Scope of the Article: Clustering
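
The procedure summarized in the abstract (K-Means run for several K values on original and normalized data, with WSS recorded for each) can be illustrated with a minimal sketch. It is written in Python rather than the R used in the paper, it is not the author's implementation, and it makes several assumptions: the customer records are synthetic (the hypothetical helper make_customers draws ages and salaries around three made-up group centers), normalization is taken to be min-max scaling, and all function names and parameters are illustrative only.

```python
import random


def make_customers(seed=1):
    """Hypothetical (age, salary) records around three made-up group centers."""
    rng = random.Random(seed)
    groups = [(25, 30000), (40, 60000), (55, 90000)]
    return [(rng.gauss(age, 3), rng.gauss(salary, 4000))
            for age, salary in groups for _ in range(30)]


def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(row, lows, highs))
            for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wss(points, k, n_init=10, max_iter=100, seed=0):
    """Run Lloyd's K-Means n_init times and return the lowest WSS found."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)  # random initial centers
        for _ in range(max_iter):
            # Assignment step: attach each point to its nearest center.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
                clusters[nearest].append(p)
            # Update step: move each center to the mean of its members
            # (an empty cluster keeps its previous center).
            new_centers = [
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else centers[j]
                for j, members in enumerate(clusters)
            ]
            if new_centers == centers:
                break
            centers = new_centers
        # WSS: squared distance from every point to its nearest center.
        wss = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, wss)
    return best


customers = make_customers()
normalized = min_max_normalize(customers)

k_values = range(1, 7)
wss_raw = {k: kmeans_wss(customers, k) for k in k_values}
wss_norm = {k: kmeans_wss(normalized, k) for k in k_values}

for k in k_values:
    print(f"K={k}  WSS(original)={wss_raw[k]:.1f}  WSS(normalized)={wss_norm[k]:.4f}")
```

Plotting either WSS column against K gives the curve that the elbow method inspects: the K at which the decrease flattens is the candidate number of clusters. Because WSS is measured in the squared units of the features, the original and normalized values sit on different scales, so it is the shape of each curve across K that carries the information. The silhouette and gap statistic methods named in the abstract are not covered by this sketch.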