International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 75 - Number 4
Year of Publication: 2013
Authors: Subhankar Roy, Sunirmal Khatua
10.5120/13101-0399
Subhankar Roy, Sunirmal Khatua. Compression Algorithm for all Specified bases in Nucleic Acid Sequences. International Journal of Computer Applications. 75, 4 (August 2013), 29-34. DOI=10.5120/13101-0399
Organizations such as IT companies, colleges, and research groups regularly struggle to handle large data sets in many areas, for example biological research. These limitations also affect Internet searches that fetch data, business analysis, and similar tasks. What is needed is a generalized yet specialized class of compression algorithms that achieves the highest possible saving percentage on dissimilar data. This article considers the compression of biological data, namely single- and double-stranded DNA and single-stranded RNA. Biological data are less random than ordinary text data, meaning that the sequences contain more redundancy, and they also have special properties such as different types of repeats. One such repeat is the dinucleotide repeat, which occurs frequently in any sequence. The two proposed algorithms are based on this repeat and use a static fixed-length look-up table (LUT) to map between the input file and the output file.
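The general idea of a static fixed-length LUT over dinucleotides can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes an alphabet of only the four bases A, C, G, T, assigns each of the 16 possible dinucleotides a 4-bit code, and packs two codes into each output byte. The names ENCODE_LUT, DECODE_LUT, compress, and decompress are hypothetical.

```python
from itertools import product

BASES = "ACGT"  # assumption: only the four unambiguous DNA bases appear

# Static fixed-length LUT: each of the 16 dinucleotides maps to a 4-bit code.
ENCODE_LUT = {a + b: i for i, (a, b) in enumerate(product(BASES, repeat=2))}
DECODE_LUT = {code: pair for pair, code in ENCODE_LUT.items()}


def compress(seq):
    """Pack a sequence at 4 bits per dinucleotide; return (payload, original length)."""
    n = len(seq)
    # Pad with 'A' so the sequence splits into whole bytes (two dinucleotides each).
    padded = seq + "A" * (-n % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        high = ENCODE_LUT[padded[i:i + 2]]
        low = ENCODE_LUT[padded[i + 2:i + 4]]
        out.append((high << 4) | low)
    return bytes(out), n


def decompress(payload, n):
    """Invert compress(); n is the original length, used to drop the padding."""
    pairs = []
    for byte in payload:
        pairs.append(DECODE_LUT[byte >> 4])
        pairs.append(DECODE_LUT[byte & 0x0F])
    return "".join(pairs)[:n]


payload, length = compress("ACACACGT")
restored = decompress(payload, length)
```

In this sketch every four input characters become one output byte, so a sequence stored as one byte per base shrinks to about a quarter of its size. The original length is returned alongside the payload so that the padding can be removed on decompression. Handling RNA or additional base symbols would require a different table and is not covered here.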