Comparative Study of Multiple Machine Learning Algorithms for Students’ Performance Data for Job Placement in University
Athreya Shetty B1, Akram Pasha2, Amith Singh S3, Shreyas N I4, Adithya R Hande5
1Athreya Shetty B, Department of Computing and Information Technology, REVA University, Bangalore (Karnataka), India.
2Akram Pasha, Department of Computing and Information Technology, REVA University, Bangalore (Karnataka), India.
3Amith Singh S, Department of Computing and Information Technology, REVA University, Bangalore (Karnataka), India.
4Shreyas N I, Department of Computing and Information Technology, REVA University, Bangalore (Karnataka), India.
5Adithya R Hande, Department of Computing and Information Technology, REVA University, Bangalore (Karnataka), India.
Manuscript received on 05 June 2019 | Revised Manuscript received on 14 June 2019 | Manuscript Published on 29 June 2019 | PP: 182-187 | Volume-8 Issue-5S, May 2019 | Retrieval Number: E10380585S19/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: In the era of data evolution, many organizations have taken the lead in storing data in huge repositories. Analyzing that data poses challenges from the time the data is captured until insights are inferred from it. The accuracy of data analysis is of paramount importance, as many critical decisions depend entirely on its outcomes. In the literature, machine learning has been found to be the most effective and most preferred tool for in-memory data analytics. Universities collect statistical data about their students, but this data is mostly used only quantitatively or analyzed sparsely; early and accurate analysis of it could give the authorities insights that help raise the percentage of students placed in campus drives. This paper formulates the prediction of whether a student will be placed in a company as a binary classification problem. It then trains and empirically studies the following machine learning algorithms on the placement data: Logistic Regression, Naïve Bayes, Support Vector Machine, K-Nearest Neighbor, and Decision Tree. The classification models are built to predict the probability of a student being placed in a company based on the student’s academic scores, achievements, work experience (internship), and other relevant features. Such an analysis helps the university authorities plan dynamically to improve the prospects of students who are unlikely to be placed by companies participating in campus recruitment at the university. To improve these models and keep them from overfitting the training data, strategies such as K-Fold cross-validation are applied for various values of k. The selected models are also compared for their efficiency when combined with feature extraction techniques, namely the unsupervised PCA and the supervised LDA. In the experiments, the Decision Tree model with PCA and 10-fold cross-validation (k = 10) outperformed all the other models, producing an accuracy of 72.83% with satisfactory support and recall. The application focuses on the targeted group of students, to eventually improve the probability of students getting placed during campus recruitment drives held in the university.
Keywords: Data Mining, Data Analytics, Classification, Machine Learning, K-Fold Cross Validation, PCA, LDA.
Scope of the Article: Data Mining
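
The comparison described in the abstract can be sketched roughly as follows, assuming scikit-learn. This is an illustrative outline, not the authors' implementation: the data is a synthetic placeholder produced by make_classification (the placement dataset is not reproduced here), and the names make_models, make_reducers, and compare_models, the number of PCA components, and all hyperparameters are assumptions made for the example. Accuracies from this sketch therefore say nothing about the results reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the student placement data.
# y = 1 plays the role of "placed", y = 0 of "not placed".
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)


def make_models():
    """The five classifiers named in the abstract (default-ish settings)."""
    return {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Naive Bayes": GaussianNB(),
        "Support Vector Machine": SVC(),
        "K-Nearest Neighbor": KNeighborsClassifier(),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
    }


def make_reducers():
    """Feature extraction steps: PCA (unsupervised) and LDA (supervised)."""
    return {
        # ASSUMPTION: 5 components chosen arbitrarily for illustration.
        "PCA": PCA(n_components=5),
        # With two classes, LDA can yield at most one component.
        "LDA": LinearDiscriminantAnalysis(n_components=1),
    }


def compare_models(X, y, k_values=(5, 10)):
    """Mean cross-validated accuracy per (model, reducer, k) combination."""
    results = {}
    for k in k_values:
        for reducer_name, reducer in make_reducers().items():
            for model_name, model in make_models().items():
                # Scaling and feature extraction are fitted inside each fold,
                # so no information leaks from the held-out fold.
                pipeline = make_pipeline(StandardScaler(), reducer, model)
                # For classifiers, an integer cv uses stratified k folds.
                scores = cross_val_score(
                    pipeline, X, y, cv=k, scoring="accuracy"
                )
                results[(model_name, reducer_name, k)] = scores.mean()
    return results


results = compare_models(X, y)
for (model_name, reducer_name, k), accuracy in sorted(
    results.items(), key=lambda item: item[1], reverse=True
):
    print(f"{model_name:24s} {reducer_name:4s} k={k:<3d} accuracy={accuracy:.4f}")
```

Wrapping the scaler, the feature extraction step, and the classifier in a single pipeline keeps every fitted transformation inside the cross-validation loop, which is one way to guard against the overfitting concern raised in the abstract.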