Marathi Text Summarization using Extractive Technique
Kirti Pankaj Kakde1, H. M. Padalikar2
1Mrs. Kirti Pankaj Kakde, Research Scholar, Department of Computer Application, IMED Bharati Vidyapeeth Deemed to be University, Pune (M.H), India.
2Dr. H. M. Padalikar, Department of Computer Application, IMED Bharati Vidyapeeth Deemed to be University, Pune (M.H), India.
Manuscript received on 29 May 2023 | Revised Manuscript received on 06 June 2023 | Manuscript Accepted on 15 June 2023 | Manuscript published on 30 June 2023 | PP: 99-105 | Volume-12 Issue-5, June 2023 | Retrieval Number: 100.1/ijeat.E42000612523 | DOI: 10.35940/ijeat.E4200.0612523
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Multilingualism plays a key role in India, where many people speak and understand more than one language. Marathi, one of the official languages in Maharashtra state, is widely used in sources such as newspapers and blogs. However, manually summarizing lengthy Marathi paragraphs or texts for easy comprehension is challenging, so automatic text summarization is needed to make large documents easy to read and understand. This paper focuses on single-document text summarization using Natural Language Processing (NLP), a subfield of Artificial Intelligence. Automatic text summarization extracts the relevant information from a document and presents it concisely. Information extraction is particularly useful when condensing a document of many sentences into three or four sentences. While English text summarization has been researched extensively, Marathi document summarization remains largely unexplored. This paper explores extractive text summarization techniques specifically for Marathi documents, using the LexRank algorithm, a graph-based technique, together with Gensim to generate informative summaries within word-limit constraints. Experiments on the IndicNLP Marathi news article dataset yielded 78% precision, 72% recall, and 75% F-measure with the frequency-based method, and 78% precision, 78% recall, and 78% F-measure with the LexRank algorithm.
Keywords: Artificial Intelligence, Automatic text summarization, Extractive text summarization, Natural Language Processing, Indic NLP.
Scope of the Article: Artificial Intelligence
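As an illustration of the frequency-based method and the F-measure figures mentioned in the abstract, the following is a minimal Python sketch. It is not the authors' implementation: the function names (split_sentences, tokenize, summarize, f_measure), the length-normalised scoring rule, and the absence of stop-word removal are all assumptions made for this example. Sentences are split on the Devanagari danda (।) as well as on ".", "!" and "?", and tokens are obtained by whitespace splitting so that Devanagari vowel signs stay attached to their words.

```python
import re
from collections import Counter

# Punctuation stripped from token edges; includes the Devanagari danda.
PUNCT = ".,!?;:\"'()।"

def split_sentences(text):
    """Split text after '.', '!', '?' or the danda when followed by whitespace."""
    parts = re.split(r"(?<=[.!?।])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Whitespace tokenisation (assumption: no stemming or stop-word removal)."""
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]

def summarize(text, n=3):
    """Return the n highest-scoring sentences in their original order.

    Assumed scoring rule: the sum of document-level word frequencies of a
    sentence's tokens, divided by the sentence length.
    """
    sentences = split_sentences(text)
    token_lists = [tokenize(s) for s in sentences]
    freq = Counter(t for tokens in token_lists for t in tokens)
    scores = [
        sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0
        for tokens in token_lists
    ]
    # Stable sort: ties keep document order.
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in sorted(ranked[:n])]

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under this definition, the harmonic mean of 0.78 precision and 0.72 recall is about 0.749, which matches the 75% F-measure reported for the frequency-based method once rounded.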