Use of NLP Based Combined Features for Sentiment Classification
K.S. Kalaivani1, S. Kuppuswami2, C.S. Kanimozhiselvi3

1 K.S. Kalaivani, Department of CSE, Kongu Engineering College, Perundurai, India.
2 Prof. S. Kuppuswami, Principal, Kongu Engineering College, Perundurai, India.
3 C.S. Kanimozhiselvi, Department of CSE, Kongu Engineering College, Perundurai, India.
Manuscript received on September 16, 2019. | Revised Manuscript received on October 05, 2019. | Manuscript published on October 30, 2019. | PP: 621-626 | Volume-9 Issue-1, October 2019 | Retrieval Number: F8290088619/2019©BEIESP | DOI: 10.35940/ijeat.F8290.109119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (

Abstract: Sentiment analysis is the automatic detection of an author's belief or mood towards a given subject, as expressed in text. Extracting the opinion present in text requires natural language processing. In this paper, machine learning based document-level sentiment classification is performed on Amazon product reviews to classify them as positive or negative. Two NLP based feature extraction techniques (Word Relation based and POS based) are used to identify sentiment-bearing features. The features are extracted as basic features (unigrams, bigrams and trigrams) and as their combinations (unigrams+bigrams, unigrams+trigrams, unigrams+bigrams+trigrams). Feature selection techniques are used to identify the most informative features and to reduce the computational time of the classification algorithms. The performance of independent and combined feature sets is assessed using accuracy, precision, recall and F-measure. The experiments show that combined features outperform independent features with the Boolean Multinomial Naive Bayes (BMNB) classifier.
Keywords: Document-level Sentiment Classification, Information Gain, NLP based combined features, Weighted Frequency and Odds.
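
As an illustration of the basic and combined feature sets named in the abstract, the following minimal Python sketch builds Boolean (presence-only) n-gram features from a review. It is a simplification under stated assumptions: it uses plain contiguous word n-grams and whitespace tokenization, not the Word Relation based or POS based extraction used in the paper, and the names `ngrams` and `boolean_features` are hypothetical helpers introduced only for this example.

```python
from typing import Iterable, List, Set


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the contiguous n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def boolean_features(text: str, orders: Iterable[int] = (1,)) -> Set[str]:
    """Return the set of n-grams present in the text for the given orders.

    Using a set records only presence or absence of a feature, which is
    the Boolean representation assumed here; counts are discarded.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    tokens = text.lower().split()
    features: Set[str] = set()
    for n in orders:
        features.update(ngrams(tokens, n))
    return features


review = "not a good phone"
unigrams = boolean_features(review, orders=(1,))          # basic feature set
uni_plus_bi = boolean_features(review, orders=(1, 2))     # unigrams+bigrams
all_orders = boolean_features(review, orders=(1, 2, 3))   # unigrams+bigrams+trigrams
```

In this sketch a combined feature set is simply the union of the n-gram sets of the chosen orders, so the bigram "not a" and the trigram "not a good" keep context that the unigram "good" alone loses.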