Web Scraping in Finance using Python
Siddhant Vinayak Chanda¹, Arivoli A²

¹Siddhant Vinayak Chanda*, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore.
²Arivoli A, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore.

Manuscript received on April 11, 2020. | Revised Manuscript received on May 15, 2020. | Manuscript published on June 30, 2020. | PP: 255-262 | Volume-9 Issue-5, June 2020. | Retrieval Number: E9457069520/2020©BEIESP | DOI: 10.35940/ijeat.E9457.069520
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The objective of this paper is to highlight different ways to extract the financial statements (Balance Sheet, Income Statement, and Cash Flow) of different companies from Yahoo Finance and to present a model that provides an economical, reliable, and time-efficient tool for this purpose. It aims to aid business analysts who are not well versed in coding but need quantitative outputs to analyse, predict, and make market decisions, by automating the generation of financial data. A Python model scrapes the required data from Yahoo Finance and presents it concisely in the form of an Excel sheet. A web application is built using Python, with a minimal and simple user interface, to facilitate this process. The proposed method not only removes the chance of human error caused by manual data extraction but also improves analysts' overall productivity by drastically reducing the time it takes to generate the data, saving a substantial number of human hours for the consumer. We also discuss the importance of data mining and scraping technologies in the finance industry, which depends heavily on generated data to analyse and make decisions, along with different methods of scraping online data and the legal aspects of web scraping.
Keywords: Web Information Extraction, Web Data Mining, Information Retrieval, Web Scraping, Financial Statements Analysis
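
The pipeline summarised in the abstract (fetch a statement page, parse its table, export a spreadsheet) can be illustrated with a minimal sketch. This is not the authors' model. The markup in `SAMPLE_HTML` is hypothetical and stands in for a downloaded page; it is not Yahoo Finance's actual page structure. The helper names `TableParser`, `to_number`, `scrape_table`, and `to_csv` are likewise assumptions made for illustration. To stay within the Python standard library, the sketch writes CSV (which Excel opens) rather than a native Excel file.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: a stand-in for a financial-statement table.
# A real page would be downloaded first and will have a different structure.
SAMPLE_HTML = """
<table>
  <tr><th>Breakdown</th><th>2019</th><th>2018</th></tr>
  <tr><td>Total Revenue</td><td>1,200</td><td>1,000</td></tr>
  <tr><td>Net Income</td><td>300</td><td>(50)</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_number(text):
    """Convert '1,200' to 1200.0 and accounting-style '(50)' to -50.0."""
    cleaned = text.replace(",", "")
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    try:
        return float(cleaned)
    except ValueError:
        return text  # not numeric: return the original text unchanged


def scrape_table(html):
    """Parse the table; keep the header row and row labels as text."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [header] + [[row[0]] + [to_number(c) for c in row[1:]] for row in body]


def to_csv(rows):
    """Serialise rows as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


rows = scrape_table(SAMPLE_HTML)
print(to_csv(rows))
```

Converting the figures to numbers before export lets an analyst apply spreadsheet formulas to them directly; a native `.xlsx` export would require a third-party library and is left out of this sketch.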