Automatic Pre-Processing of Marathi Text for Summarization
Apurva D. Dhawale1, Sonali B. Kulkarni2, Vaishali M. Kumbhakarna3
1Ms. Apurva D. Dhawale*, Department of Computer Science, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India.
2Dr. Sonali B. Kulkarni, Completed her Master of Science, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
3Ms. Vaishali M. Kumbhakarna, Completed Master of Science, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
Manuscript received on October 05, 2020. | Revised Manuscript received on October 10, 2020. | Manuscript published on October 30, 2020. | PP: 230-234 | Volume-10 Issue-1, October 2020. | Retrieval Number: 100.1/ijeat.A18031010120 | DOI: 10.35940/ijeat.A1803.1010120
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Text summarization is a technique in which a large original text is condensed into a shorter version without changing its overall meaning. Summarization has typically been studied for widely used foreign and regional languages, but little work has been reported for the Marathi language. As the amount of e-content on the web grows rapidly, users find it difficult to read newspaper articles and to extract and sort the different perspectives they contain. We focus on educational, political, and sports news for summarization, which will be helpful for students appearing for competitive exams. This paper explores pre-processing techniques for Marathi e-news articles.
Keywords: Text summarization, POS tagging, Pre-processing, LDA (Latent Dirichlet Allocation), LNS (Label Induction Grouping), SVM (Support Vector Machine)