Single Document Text Summarization of a Resource-Poor Language using an Unsupervised Technique
Gunadeep Chetia¹, Gopal Chandra Hazarika²
¹ Gunadeep Chetia*, Centre for Computer Science and Applications, Dibrugarh University, Dibrugarh, Assam, India.
² Gopal Chandra Hazarika, Centre for Computer Science and Applications, Dibrugarh University, Dibrugarh, Assam, India.
Manuscript received on September 20, 2019. | Revised Manuscript received on October 05, 2019. | Manuscript published on October 30, 2019. | PP: 6278-6281 | Volume-9 Issue-1, October 2019 | Retrieval Number: A2250109119/2019©BEIESP | DOI: 10.35940/ijeat.A2250.109119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Automatic text summarization of a resource-poor language is a challenging task. Unsupervised extractive techniques are often preferred for such languages because the resources that supervised methods require are scarce. Latent Semantic Analysis (LSA) is an unsupervised technique that automatically identifies semantically important sentences in a text document. Two LSA-based methods are evaluated on two datasets of a resource-poor language, applying Singular Value Decomposition (SVD) to different vector-space models. Performance is measured with ROUGE-L scores, obtained by comparing the system-generated summaries with human-written model summaries. Both methods perform better on shorter documents than on longer ones.
Keywords: Latent Semantic Analysis, Singular Value Decomposition, Text Summarization, Word-sentence Matrix.
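
The pipeline outlined in the abstract (word-sentence matrix, SVD-based sentence scoring, ROUGE-L evaluation) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not say which two LSA variants or which term weightings they used. The sketch assumes whitespace tokenization and raw term counts, approximates only the leading right singular vector by power iteration rather than computing a full SVD, and computes a simplified sentence-level ROUGE-L F1 from the longest common subsequence. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; a real system would need a
    # tokenizer suited to the target language.
    return text.lower().split()


def word_sentence_matrix(sentences):
    """Rows are vocabulary words, columns are sentences, cells are raw counts."""
    counts = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted(set().union(*counts))
    return [[c[w] for c in counts] for w in vocab]


def leading_sentence_weights(matrix, iterations=100):
    """Approximate the first right singular vector of A by power iteration on A^T A."""
    n = len(matrix[0])
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)]
            for i in range(n)]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def summarize(sentences, k=1):
    """Pick the k sentences with the largest weight in the leading concept."""
    weights = leading_sentence_weights(word_sentence_matrix(sentences))
    top = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # restore document order


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    c, r = tokenize(candidate), tokenize(reference)
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

For example, `summarize(["the cat sat on the mat", "the cat ate the fish", "dogs bark"], k=1)` selects the first sentence, since it carries the most weight in the dominant concept. Calling `rouge_l_f1` on that output and a human-written summary gives a score between 0 and 1. A fuller LSA summarizer would use several singular vectors, not just the leading one.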