An Advanced Algorithm for Finding Tandem Repeats in DNA Sequencing based on Text Mining
Anjali Saini1, Poonam2

1Anjali Saini, Ph.D. in Computer Science & Engg., M. Tech., Assistant Professor of Computer Science, IP College for Women, Jhajjar, Haryana.
2Poonam, M. Tech., Department of Computer Science and Engg., NCU Gurugram.
Manuscript received on July 20, 2019. | Revised Manuscript received on August 10, 2019. | Manuscript published on August 30, 2019. | PP: 4058-4062 | Volume-8 Issue-6, August 2019. | Retrieval Number: F8607088619/2019©BEIESP | DOI: 10.35940/ijeat.F8607.088619
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: In this paper, a new searching technique based on pattern matching is proposed. In bioinformatics, finding Tandem Repeats (TR) in DNA sequences is a critical issue. Many pattern matching algorithms exist; KMP (Knuth Morris Pratt) is one of them, and it suffers from high runtime complexity and cost as the size of the data set increases. The main aim of the paper is to develop an algorithm that detects and identifies Tandem Repeats in a DNA sequence more efficiently. By introducing a two-dimensional matrix to narrow the search scope and optimize the problem, the Tandem Repeat finding algorithm makes the detection and identification process more efficient and effective, which improves the quality of the results. The theoretical analysis and experimental results show that the Tandem Repeat finding algorithm obtains equivalent results in less runtime. The algorithm outperforms KMP in determining results, and it also reduces the runtime cost, which is beneficial as DNA data sets grow larger.
Keywords: Tandem Repeat, DNA sequence, Pattern matching, KMP.
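
For orientation, the KMP baseline that the abstract compares against can be sketched as follows. This is a minimal illustration and not the authors' matrix-based algorithm, whose details the abstract does not give. The function names (`failure_table`, `kmp_find_all`, `tandem_repeats`), the `min_copies` parameter, and the definition of a tandem repeat as a run of back-to-back exact copies of a known motif are all assumptions made for this example.

```python
def failure_table(pattern: str) -> list[int]:
    """For each prefix of pattern, the length of its longest proper
    prefix that is also a suffix (the standard KMP failure function)."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find_all(text: str, pattern: str) -> list[int]:
    """Start indices of every occurrence of pattern in text (overlaps included)."""
    if not pattern:
        return []
    table = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = table[k - 1]  # keep scanning for further matches
    return hits


def tandem_repeats(text: str, motif: str, min_copies: int = 2) -> list[tuple[int, int]]:
    """(start, copies) for each maximal run of adjacent exact copies of motif.

    Hypothetical helper: it assumes the motif is known in advance.
    """
    starts = set(kmp_find_all(text, motif))
    m = len(motif)
    runs = []
    for s in sorted(starts):
        if s - m in starts:
            continue  # this occurrence continues an earlier run
        copies = 1
        while s + copies * m in starts:
            copies += 1
        if copies >= min_copies:
            runs.append((s, copies))
    return runs


if __name__ == "__main__":
    print(tandem_repeats("GGATATATCCATAT", "AT"))
```

The sketch searches for one known motif at a time. Finding repeats of unknown motifs would require repeating the search for every candidate motif, which is one way the cost can grow with the size of the DNA data.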