International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 50 - Number 9
Year of Publication: 2012
Authors: Sandhya.n, R. Krishnan, D. R. Ramesh Babu
DOI: 10.5120/7798-0915
Sandhya.n, R. Krishnan, D. R. Ramesh Babu. A Language Independent Characterization of Document Image Noise in Historical Scripts. International Journal of Computer Applications. 50, 9 (July 2012), 11-18. DOI=10.5120/7798-0915
Digitization helps preserve historical documents. Because these documents have existed for a long time, they accumulate various types of noise. In this paper we analyze the types of noise that occur in printed and handwritten historical documents, based mainly on documents in Kannada (a language used in Karnataka, a southern state of India), and create a taxonomy of them. We characterize each noise type by its source, its effect on characters, and the challenges it poses for character recognition. We also catalogue the noise detection, removal, and restoration techniques reported in the literature for each of the prominent noise types, and identify areas in noise detection and removal that merit further research.