Edu-APCCM: Automatic Programming Code Constructs Mining from Learning Content
Maitri Jhaveri¹, Jyoti Pareek²

¹ Dr. Maitri Jhaveri, Lecturer, Department of Computer Science, Gujarat University, India.
² Dr. Jyoti Pareek, Professor, Department of Computer Science, Gujarat University, India.

Manuscript received on February 01, 2020. | Revised Manuscript received on February 05, 2020. | Manuscript published on February 30, 2020. | PP: 474-480 | Volume-9 Issue-3, February, 2020. | Retrieval Number: C4835029320/2020©BEIESP | DOI: 10.35940/ijeat.C4835.029320
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The current education ecosystem is moving towards centralized, online blended learning, and online learning repositories have replaced traditional libraries. These repositories contain learning materials that can be located with the help of associated metadata. Associating metadata with the content elements (definition, program, example, figure, and table) of each individual learning concept (topic) in the learning material also improves search. A student who knows the prerequisites of a topic will find studying that topic more fruitful. The prerequisites of a computer science topic can be obtained from its explanation and from the programming code snippet used to implement it. This paper proposes a new metadata element, “code construct as a prerequisite of a code snippet”. For example, recursion and function calls are prerequisites for understanding a recursive module for binary tree traversal. The paper also proposes a framework to automatically identify, extract, and present the code constructs used in the code snippets included in computer science learning material. The resulting list of code constructs acts as the set of prerequisites for understanding the corresponding code snippet. A rule-based pattern mining approach, with a pattern set designed for the purpose, is used both to identify code snippets in the learning material and to identify code constructs within each snippet. The Natural Language Toolkit (NLTK) for Python is used to identify the code snippets. The algorithms are tested on programs written in C, C++, and Java. The accuracy and efficiency of the developed algorithms are checked against manual results provided by subject experts. An average F1 score of 92% is obtained.
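The rule-based construct mining described in the abstract can be illustrated with a minimal Python sketch. It assumes the code snippet has already been separated from the surrounding learning material. The pattern set, the construct names, and the helpers `mine_constructs` and `f1_score` are hypothetical choices made for this illustration; they are not the pattern set or algorithms of Edu-APCCM. The expert list in the example is likewise invented to show how mined constructs might be compared with a manual annotation.

```python
import re

# Hypothetical pattern set (an assumption for illustration, not the paper's rules).
KEYWORDS = {"if", "else", "for", "while", "switch", "return", "sizeof", "do", "catch"}

SIMPLE_PATTERNS = {
    "for loop": re.compile(r"\bfor\s*\("),
    "while loop": re.compile(r"\bwhile\s*\("),
    "if statement": re.compile(r"\bif\s*\("),
    "switch statement": re.compile(r"\bswitch\s*\("),
    "array": re.compile(r"\b[A-Za-z_]\w*\s*\["),
}

# "<type> <name>(<params>) {" in C-like code; group 1 is the function name.
FUNC_DEF = re.compile(r"\b[A-Za-z_]\w*[\s*]+([A-Za-z_]\w*)\s*\([^(){};]*\)\s*\{")
# Any identifier followed by "(".
CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")
# Comments and string literals, removed so that they cannot trigger a rule.
NOISE = re.compile(r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"', re.S)


def body_of(code, open_brace):
    """Return the text between the brace at open_brace and its matching close."""
    depth = 0
    for i in range(open_brace, len(code)):
        if code[i] == "{":
            depth += 1
        elif code[i] == "}":
            depth -= 1
            if depth == 0:
                return code[open_brace + 1:i]
    return code[open_brace + 1:]


def mine_constructs(snippet):
    """Return the set of construct names whose rules match the snippet."""
    code = NOISE.sub(" ", snippet)
    found = {name for name, pat in SIMPLE_PATTERNS.items() if pat.search(code)}
    def_positions = set()
    for m in FUNC_DEF.finditer(code):
        name = m.group(1)
        if name in KEYWORDS:  # e.g. "else if (...) {" is not a definition
            continue
        found.add("function definition")
        def_positions.add(m.start(1))
        # A function that calls itself inside its own body is recursive.
        body = body_of(code, m.end() - 1)
        if re.search(r"\b" + re.escape(name) + r"\s*\(", body):
            found.add("recursion")
    for m in CALL.finditer(code):
        if m.group(1) not in KEYWORDS and m.start(1) not in def_positions:
            found.add("function call")
            break
    return found


def f1_score(predicted, expected):
    """F1 of the mined construct set against a manually prepared set."""
    tp = len(predicted & expected)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(expected)
    return 2 * precision * recall / (precision + recall)


SNIPPET = """
void inorder(struct node *root) {
    if (root == NULL) return;
    inorder(root->left);
    printf("%d ", root->data);
    inorder(root->right);
}
"""

if __name__ == "__main__":
    mined = mine_constructs(SNIPPET)
    # Hypothetical expert annotation, invented for this example.
    expert = {"function definition", "function call", "recursion", "if statement"}
    print(sorted(mined))
    print(f1_score(mined, expert))
```

For the in-order traversal snippet, the sketch reports recursion and function call among the mined constructs, which matches the binary tree traversal example given in the abstract. Regular expressions are used here only because they are a compact way to express rules; a parser-based approach would be more robust for real learning material.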