Thematic Context Derivator Algorithm for Enhanced Context Vector Machine: eCVM
Vaibhav Khatavkar1, Makarand Velankar2, Parag Kulkarni3
1Vaibhav Khatavkar (Corresponding Author), Research Scholar, Department of Computer Engineering, College of Engineering, Pune, Shivajinagar, Pune, Maharashtra, India.
2Makarand Velankar, Assistant Professor, Department of IT, MKSSS’s Cummins College of Engineering, Kothrud, Pune, Maharashtra, India.
3Parag Kulkarni, Adjunct Professor, Department of Computer Engineering, College of Engineering, Pune, Shivajinagar, Pune, Maharashtra, India.
Manuscript received on November 22, 2019. | Revised Manuscript received on December 08, 2019. | Manuscript published on December 30, 2019. | PP: 4872-4877 | Volume-9 Issue-2, December, 2019. | Retrieval Number: B4564129219/2019©BEIESP | DOI: 10.35940/ijeat.B4564.129219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Natural Language Processing uses word embeddings to map words into vectors, and the context vector is one such technique. A context vector captures the importance of terms in a document corpus. It can be derived using various methods, such as neural networks, latent semantic analysis, and knowledge-base methods. This paper proposes a novel system, an enhanced context vector machine called eCVM, that determines context phrases and their importance. To build the context, eCVM uses latent semantic analysis, the existing context vector machine, dependency parsing, named entities, topics from Latent Dirichlet Allocation, and word classes such as nouns, adjectives, and verbs. eCVM uses the context vector and the PageRank algorithm to find the importance of each term in a document, and it is tested on the BBC news dataset. Results of eCVM are compared with the state of the art for context derivation. The proposed system shows improved performance over existing systems on standard evaluation parameters.
Keywords: Context Vector Machine, Natural Language Processing, PageRank, Named Entities.
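
The abstract mentions using PageRank to find the importance of a term in a document. As a rough illustration of that one idea only, and not the authors' eCVM pipeline, the sketch below ranks terms by running weighted PageRank over a sentence-level co-occurrence graph. The co-occurrence graph, the function names `build_cooccurrence_graph` and `pagerank`, the damping factor, and the toy sentences are all assumptions made for this example; the abstract does not say how eCVM builds its graph.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Link every pair of distinct terms that appear in the same sentence.

    Assumption: sentence-level co-occurrence stands in for whatever
    term relations eCVM actually derives.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for terms in sentences:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a][b] += 1.0  # edge weight = number of shared sentences
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration on an undirected graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    # Total edge weight leaving each node; every node in the graph
    # has at least one neighbour, so this is never zero.
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = sum(
                rank[m] * w / out_weight[m] for m, w in graph[n].items()
            )
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical pre-tokenised sentences, not taken from the BBC dataset.
    toy_sentences = [
        ["bank", "loan", "interest"],
        ["bank", "profit"],
        ["bank", "loan"],
        ["match", "goal"],
    ]
    scores = pagerank(build_cooccurrence_graph(toy_sentences))
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{term}: {score:.3f}")
```

Terms that co-occur with many other terms accumulate a higher score, which is the intuition behind using a graph-ranking step to weight context terms.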