Data Optimization using Apache Flink
Vikas S1, Thimmaraju S N2

1Vikas S, Assistant Professor, CSE Department, VTU PG Centre, Mysuru, Karnataka, India.
2Thimmaraju S N, Professor, CSE Department, VTU PG Centre, Mysuru, Karnataka, India
Manuscript received on July 12, 2019. | Revised Manuscript received on July 22, 2019. | Manuscript published on December 30, 2019. | PP: 137-142 | Volume-9 Issue-2, December, 2019. | Retrieval Number:  B3081129219/2019©BEIESP | DOI: 10.35940/ijeat.B3081.129219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license.

Abstract: MapReduce, Flink, and Spark have recently become popular for processing big data. Flink is an open-source Apache big data processing system for both batch and streaming data. Flink's query optimizer is built for batch processing of historical data and is based on approaches from parallel storage systems. The optimizer translates queries into jobs made up of different tasks, and such jobs are submitted regularly. Exploiting the similarities between these tasks should therefore prevent redundant computation. In this article, a multi-query optimization model for Flink is proposed and designed on top of the Flink software stack. It is intended as an add-on to Apache Flink that maximizes data sharing across multiple queries. The system exploits the data-sharing opportunities of selection operators to reduce overlap and duplication in multi-query data movement over the network. The research findings show that sharing selection operations across multiple queries over large data sets yields promising query execution times. The approach could therefore also be applied to stream processing to improve application performance over time periods.
Keywords: Big Data, Parallel Processing, Flink, Batch Processing, Selection Predicates.
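
The idea of sharing selection operators across queries, as described in the abstract, can be illustrated with a small sketch. The code below is plain Python, not Flink code and not the authors' implementation; the function names run_separately and run_shared, the record layout, and the three sample queries are all hypothetical and chosen only for illustration. It contrasts evaluating each query's selection predicate in its own pass over the input with evaluating all predicates in a single shared pass.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, int]
Predicate = Callable[[Record], bool]


def run_separately(
    records: List[Record], queries: Dict[str, Predicate]
) -> Tuple[Dict[str, List[Record]], int]:
    """Baseline: every query scans the full input on its own."""
    results: Dict[str, List[Record]] = {}
    scans = 0
    for name, predicate in queries.items():
        scans += 1  # one full pass over the input per query
        results[name] = [r for r in records if predicate(r)]
    return results, scans


def run_shared(
    records: List[Record], queries: Dict[str, Predicate]
) -> Tuple[Dict[str, List[Record]], int]:
    """Shared selection: one pass, all predicates checked per record."""
    results: Dict[str, List[Record]] = {name: [] for name in queries}
    for r in records:  # the input is read only once
        for name, predicate in queries.items():
            if predicate(r):
                results[name].append(r)
    return results, 1


# Hypothetical input and queries, for illustration only.
records = [{"id": i, "price": i * 10} for i in range(1, 11)]
queries = {
    "cheap": lambda r: r["price"] < 40,
    "mid": lambda r: 30 <= r["price"] <= 70,
    "expensive": lambda r: r["price"] > 80,
}

separate_results, separate_scans = run_separately(records, queries)
shared_results, shared_scans = run_shared(records, queries)
```

Both functions return the same per-query results; the shared version reads the input once instead of once per query. In a distributed setting the analogous saving would be in data read and moved over the network, which this single-process sketch does not model.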