PNNCP- Parallel Nearest Neighbor Classification and Prediction for Big Data Application Based on Apache Spark and Machine Learning
Anilkumar Vishwanath Brahmane1, B. Chaitanya Krishna2

1Anilkumar V. Brahmane (Correspondence Author), Research Scholar, Department of Computer Science and Engineering, KL Deemed to be University, Vijaywada, A.P., India.
2Dr. B. Chaitanya Krishna, Professor, Department of Computer Science and Engineering, KL Deemed to be University, Vijaywada, A.P., India.
Manuscript received on September 22, 2019. | Revised Manuscript received on October 20, 2019. | Manuscript published on October 30, 2019. | PP: 2358-2365 | Volume-9 Issue-1, October 2019 | Retrieval Number: A1382109119/2019©BEIESP | DOI: 10.35940/ijeat.A1382.109119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license.

Abstract: Nowadays, Big Data applications such as social networking, healthcare, agriculture, banking, stock markets, education, and Facebook generate data at very high speed. The volume and velocity of Big Data play a vital role in the performance of Big Data applications, which can be affected by various parameters. Search speed, efficiency, and accuracy are some of the dominant parameters that affect the overall performance of any Big Data application. Because of the direct and indirect involvement of the 7Vs characteristics of Big Data, every Big Data application is expected to deliver high performance, and achieving high performance is the biggest challenge in the present changing scenario. In this paper we propose a parallel approach to speed up the search for the nearest neighbor node. The k-NN classifier is one of the most important and widely used methods for classification. We apply parallelism to k-NN to search for the next nearest neighbor. This neighbor node is then used to fill in missing values and to process large data streams. The classifier also updates and synchronizes outdated data streams. We use Apache Spark and distributed computation for faster evaluation.
Keywords: Parallel processing, Big Data, Machine Learning, Apache Spark.
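
The parallel nearest-neighbor search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use Apache Spark: it assumes a simple partition-then-merge scheme, with Python's standard-library thread pool standing in for distributed workers. The function names `local_top_k` and `parallel_knn_predict`, the Euclidean distance, and the toy data are all assumptions made for illustration. Each partition finds its own k nearest candidates, and the global k nearest are then selected from the merged candidates, which is the same map-and-reduce structure a Spark job could follow.

```python
import heapq
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_top_k(partition, query, k):
    """Return the k nearest (distance, label) pairs within one partition."""
    scored = ((math.dist(point, query), label) for point, label in partition)
    return heapq.nsmallest(k, scored)


def parallel_knn_predict(train, query, k=3, n_partitions=4):
    """Classify `query` by majority vote over its k nearest training points."""
    # Split the training set into partitions (round-robin), as a stand-in
    # for data distributed across cluster nodes.
    partitions = [train[i::n_partitions] for i in range(n_partitions)]
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        # "Map" step: each worker finds its local k nearest candidates.
        candidates = pool.map(lambda part: local_top_k(part, query, k), partitions)
        # "Reduce" step: the global k nearest must be among the local ones.
        nearest = heapq.nsmallest(k, (c for part in candidates for c in part))
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b"),
]
print(parallel_knn_predict(train, (0.3, 0.3), k=3))
print(parallel_knn_predict(train, (4.9, 5.0), k=3))
```

Threads in CPython do not give a real speedup for this CPU-bound work; the sketch only shows the structure of the computation. The merge step is correct because any point among the global k nearest is also among the k nearest of its own partition.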