Research Article

Biological Motif Discovery Algorithm based on Mining Tree Structure

by Lounnas Bilal, Bouderah Brahim, Moussaoui Abdelouahab
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 69 - Number 4
Year of Publication: 2013
Authors: Lounnas Bilal, Bouderah Brahim, Moussaoui Abdelouahab
DOI: 10.5120/11833-7551

Lounnas Bilal, Bouderah Brahim, Moussaoui Abdelouahab. Biological Motif Discovery Algorithm based on Mining Tree Structure. International Journal of Computer Applications. 69, 4 (May 2013), 35-40. DOI=10.5120/11833-7551

@article{ 10.5120/11833-7551,
author = { Lounnas Bilal and Bouderah Brahim and Moussaoui Abdelouahab },
title = { Biological Motif Discovery Algorithm based on Mining Tree Structure },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 69 },
number = { 4 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 35-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume69/number4/11833-7551/ },
doi = { 10.5120/11833-7551 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:29:22.124986+05:30
%A Lounnas Bilal
%A Bouderah Brahim
%A Moussaoui Abdelouahab
%T Biological Motif Discovery Algorithm based on Mining Tree Structure
%J International Journal of Computer Applications
%@ 0975-8887
%V 69
%N 4
%P 35-40
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Nucleic acid and protein sequences contain different types of information (genes, RNA structures, active sites, regulatory structures, ...). This information can lead to useful biological knowledge, such as the function of a given protein sequence or the classification of proteins into families. In this paper we focus on the motifs present in nucleic acid sequences. Before going further, it is useful to review the concepts and terminology associated with this study. A motif is a short structural element that can be found in all members of a protein family. It contains conserved residues that are essential for function; these residues are not necessarily consecutive in the sequence, but they are close together in the 3D structure because they are involved in the same function (active site, binding site, ...). A pattern or profile, by contrast, is a degenerate sequence and/or a combination of different motifs that can be separated by variable regions. The objective is to develop a new algorithm based on mining tree structures in order to highlight segments of DNA, RNA, or amino acids that are likely to have a biological role.
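
The distinction between a motif and a pattern can be made concrete with a small example. The following is a minimal sketch in Python, not the tree-mining algorithm proposed in the paper: it assumes nucleotide motifs written with the standard IUPAC ambiguity codes and models each variable region as a run of arbitrary bases with a minimum and maximum length. The function names (`motif_to_regex`, `pattern_to_regex`, `find_pattern`), the example pattern, and the example sequence are all hypothetical and chosen only for illustration.

```python
import re

# IUPAC nucleotide ambiguity codes: one degenerate symbol stands for a set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def motif_to_regex(motif):
    """Translate one degenerate motif (IUPAC codes) into a regex fragment."""
    return "".join(
        IUPAC[s] if len(IUPAC[s]) == 1 else "[" + IUPAC[s] + "]"
        for s in motif
    )


def pattern_to_regex(motifs, gaps):
    """Join motifs with variable regions.

    gaps[i] is a (min_len, max_len) pair for the region between
    motifs[i] and motifs[i + 1] (an assumption of this sketch).
    """
    if len(gaps) != len(motifs) - 1:
        raise ValueError("need exactly one gap between each pair of motifs")
    parts = [motif_to_regex(motifs[0])]
    for (lo, hi), motif in zip(gaps, motifs[1:]):
        parts.append("[ACGT]{%d,%d}" % (lo, hi))  # variable region
        parts.append(motif_to_regex(motif))
    return re.compile("".join(parts))


def find_pattern(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping occurrence."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# Hypothetical pattern: motif TATAWA, then 2 to 4 arbitrary bases, then motif GGR.
pattern = pattern_to_regex(["TATAWA", "GGR"], [(2, 4)])
hits = find_pattern(pattern, "CCTATAAACGGGATTTATATACCCCGGACC")
print(hits)
```

Here the two motifs are fixed-length degenerate words, while the pattern as a whole tolerates a variable region between them, so the two occurrences found in the example sequence have different lengths. A regular expression is used only because it is the shortest way to show the idea; it says nothing about how the paper's tree-based method represents or searches for patterns.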

Index Terms

Computer Science
Information Sciences

Keywords

Motif matching, profiles, tree automata, pushdown automata, tree structure, tree mining