News Story Retrieval Based on Textual Query
Namrata Dave1, Mehfuza S. Holia2

1Namrata Dave*, Computer Engineering Department, Gujarat Technological University, India.
2Dr. Mehfuza S. Holia, Assistant Professor, Electronics Department, Birla Vishwakarma Mahavidyalaya, India.
Manuscript received on January 26, 2020. | Revised Manuscript received on February 05, 2020. | Manuscript published on February 30, 2020. | PP: 2918-2922 | Volume-9 Issue-3, February 2020. | Retrieval Number:  C5264029320/2020©BEIESP | DOI: 10.35940/ijeat.C5264.029320
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: This paper presents news video retrieval using text queries for Gujarati-language news videos. Because broadcast video in India lacks metadata such as closed captions and transcriptions, retrieving videos based on text data is a non-trivial task for most Indian-language video. The key idea behind our approach is to retrieve a specific story based on a text query in a regional language. Broadcast video is segmented into shots representing small news stories. To represent each shot efficiently, we propose key frame extraction using singular value decomposition and the rank of a matrix. Text is extracted from the key frames for indexing. The extracted text is then processed with natural language processing steps such as tokenization, removal of punctuation and extra symbols, and stemming of words to their root forms. Because stemming and other text preprocessing methods are unavailable for the Gujarati language, we provide a basic stemming technique that reduces the dictionary size for efficient indexing of text data. The proposed system achieves 82.5 percent accuracy on the Gujarati news video dataset ETV.
Keywords: Key frame extraction, Gujarati OCR, stemming, video retrieval, text query
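
The abstract names singular value decomposition and matrix rank as the basis for key frame extraction but does not give the procedure. The following is a minimal sketch of one way such a step could look, not the authors' method. It assumes each frame is a grayscale image summarized by an intensity histogram, that the number of key frames per shot is taken from the effective rank of the stacked feature matrix, and that one frame is kept per dominant singular component. The function names, the histogram features, and the `rel_tol` threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def frame_features(frame, bins=16):
    """Normalized intensity histogram of one grayscale frame (2-D uint8 array)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def effective_rank(singular_values, rel_tol=0.1):
    """Count singular values that are at least rel_tol times the largest one."""
    if singular_values.size == 0 or singular_values[0] == 0:
        return 0
    return int(np.sum(singular_values >= rel_tol * singular_values[0]))

def select_key_frames(frames, bins=16, rel_tol=0.1):
    """Return sorted indices of candidate key frames for one shot.

    Assumption: rows of the feature matrix are per-frame histograms, and
    the effective rank of that matrix approximates how many visually
    distinct groups of frames the shot contains.
    """
    features = np.stack([frame_features(f, bins) for f in frames])
    u, s, _ = np.linalg.svd(features, full_matrices=False)
    k = effective_rank(s, rel_tol)
    # For each dominant component, keep the frame that loads most strongly on it.
    picks = {int(np.argmax(np.abs(u[:, i]))) for i in range(k)}
    return sorted(picks)

if __name__ == "__main__":
    # Synthetic shot: six dark frames followed by four bright frames.
    dark = np.full((8, 8), 10, dtype=np.uint8)
    bright = np.full((8, 8), 200, dtype=np.uint8)
    print(select_key_frames([dark] * 6 + [bright] * 4))
```

On the synthetic shot, the feature matrix has two distinct row patterns, so the sketch keeps one frame from the dark group and one from the bright group. Real broadcast frames would likely need richer features (for example block-wise or color histograms) and a tuned threshold.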