Chapter extraction from research documents using Meta-Content Framework
Tripti Sharma1, Sarang Pitale2
1Tripti Sharma, Department of Computer Science, Chhatrapati Shivaji Institute of Technology, Durg, India.
2Sarang Pitale, Department of Information Technology, Bhilai Institute of Technology, Durg, India.
Manuscript received on May 17, 2012. | Revised Manuscript received on June 22, 2012. | Manuscript published on June 30, 2012. | PP: 337-339 | Volume-1 Issue-5, June 2012. | Retrieval Number: E0533061512/2012©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Automatic chapter extraction from electronic documents has long been of interest to researchers working on subjective answering systems. Researchers agree that chapter extraction is one of the key steps in generating model answers. This paper presents a framework for extracting chapter contents from research documents. The framework is implemented in Java using the iText library. It takes a research document in PDF format as input and outputs the chapter contents as simple HTML, so that they can be easily rendered in a web browser.
Keywords: PDF, Java, iText, HTML.
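
The abstract describes a pipeline from PDF input to simple HTML output but does not say how chapters are recognised. The following is a minimal sketch of the second half of such a pipeline only, under stated assumptions: the text is assumed to have already been pulled out of the PDF (for example with iText) into a list of lines, and a chapter heading is assumed to look like "1. Introduction" or "II. RELATED WORK". The class name ChapterToHtml, the heading pattern, and the helper methods are hypothetical illustrations, not the authors' implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ChapterToHtml {
    // Assumption: a heading is an Arabic or Roman numeral, a dot, then a title.
    // Real documents may need a different or font-based heuristic.
    private static final Pattern HEADING =
            Pattern.compile("^(\\d+|[IVXLC]+)\\.\\s+\\S.*$");

    // Escape the characters that would otherwise be read as HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Turn already-extracted text lines into simple HTML:
    // each heading becomes <h2>, each following line becomes <p>.
    static String toHtml(List<String> lines) {
        StringBuilder html = new StringBuilder("<html><body>\n");
        boolean inChapter = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (HEADING.matcher(line).matches()) {
                html.append("<h2>").append(escape(line)).append("</h2>\n");
                inChapter = true;
            } else if (inChapter) {
                html.append("<p>").append(escape(line)).append("</p>\n");
            }
            // Lines before the first heading (title, authors) are dropped.
        }
        return html.append("</body></html>\n").toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample input standing in for text extracted from a PDF.
        List<String> lines = Arrays.asList(
                "Some Paper Title",
                "1. Introduction",
                "Chapter extraction is a key step.",
                "2. Method",
                "The framework reads a PDF.");
        System.out.print(toHtml(lines));
    }
}
```

Keeping the heading test in a single pattern makes the heuristic easy to swap out, since heading conventions differ between journals.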