International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 155 - Number 2
Year of Publication: 2016
Authors: Siddharth Garimella |
DOI: 10.5120/ijca2016912261
Siddharth Garimella. Identification of Receipts in a Multi-receipt Image using Spectral Clustering. International Journal of Computer Applications. 155, 2 (Dec 2016), 14-18. DOI=10.5120/ijca2016912261
When expense reports are submitted, multiple receipts are often scanned onto a single page, and the scanned image is attached to the report so that the expenses can be reimbursed. These images are then verified manually to check the validity of the claimed expenses. This paper presents a method that isolates the individual receipt segments in such an image and applies Optical Character Recognition (OCR) to identify the receipt amounts, reducing validation time and effort. Each scanned image is processed to find the contours of all high-contrast objects in the receipts, including letters, and a minimum bounding rectangle (MBR) is computed for each contour. Spectral clustering then groups these MBRs into clusters that correspond to individual receipts. Each cluster is finally processed with OCR to aid the user with validation.
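The grouping step can be illustrated with a minimal sketch. The abstract does not state the affinity function, the number of clusters, or the library used, so everything below is an assumption made for illustration: MBRs are given as hypothetical `(x, y, width, height)` tuples, similarity is a Gaussian kernel on the distance between MBR centers with an arbitrary `sigma`, and only the two-receipt case is handled by splitting on the sign of the second eigenvector of the normalized affinity matrix. Contour detection and OCR are not shown.

```python
import math


def mbr_center(mbr):
    """Center of a rectangle given as (x, y, width, height)."""
    x, y, w, h = mbr
    return (x + w / 2.0, y + h / 2.0)


def spectral_bipartition(mbrs, sigma=100.0, iterations=300):
    """Split MBRs into two groups (labels 0 and 1) by spectral bipartitioning.

    Assumptions (not taken from the paper): Gaussian affinity on center
    distance, exactly two clusters, at least two rectangles, and a fixed
    start vector for the power iteration.
    """
    n = len(mbrs)
    centers = [mbr_center(m) for m in mbrs]

    # Affinity matrix: nearby rectangles get weights close to 1.
    W = [[math.exp(-((ax - bx) ** 2 + (ay - by) ** 2) / (2.0 * sigma ** 2))
          for (bx, by) in centers] for (ax, ay) in centers]
    deg = [sum(row) for row in W]

    # M = (I + D^-1/2 W D^-1/2) / 2 has eigenvalues in [0, 1], so power
    # iteration converges to the largest remaining eigenvalue.
    M = [[0.5 * W[i][j] / math.sqrt(deg[i] * deg[j]) + (0.5 if i == j else 0.0)
          for j in range(n)] for i in range(n)]

    # The leading eigenvector is known in closed form: sqrt(deg), normalized.
    total = math.sqrt(sum(deg))
    u1 = [math.sqrt(d) / total for d in deg]

    # Power iteration with deflation of u1 yields the second eigenvector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        proj = sum(a * b for a, b in zip(v, u1))
        v = [a - proj * b for a, b in zip(v, u1)]
        length = math.sqrt(sum(a * a for a in v))
        if length == 0.0:
            break  # degenerate input; leave v as is
        v = [a / length for a in v]

    # The sign pattern of the second eigenvector gives the two groups.
    return [1 if a >= 0 else 0 for a in v]


# Hypothetical MBRs: three text boxes on a left receipt, three on a right one.
example_mbrs = [
    (20, 30, 80, 12), (22, 50, 60, 12), (25, 70, 90, 12),
    (520, 40, 70, 12), (525, 60, 85, 12), (518, 80, 60, 12),
]
labels = spectral_bipartition(example_mbrs)
```

In this sketch the first three rectangles receive one label and the last three the other; which group is called 0 or 1 is arbitrary. A page with more than two receipts would need a multi-way variant (for example, k-means on several eigenvectors), and a suitable `sigma` would depend on scan resolution.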