Research Article

Extracting Images from the Web using Data Mining Technique

Published on May 2012 by Syed Thousif Hussain
National Conference on Advances in Computer Science and Applications (NCACSA 2012)
Foundation of Computer Science USA
NCACSA - Number 5
May 2012
Authors: Syed Thousif Hussain

Syed Thousif Hussain. Extracting Images from the Web using Data Mining Technique. National Conference on Advances in Computer Science and Applications (NCACSA 2012). NCACSA, 5 (May 2012), 21-24.

@article{
author = { Syed Thousif Hussain },
title = { Extracting Images from the Web using Data Mining Technique },
journal = { National Conference on Advances in Computer Science and Applications (NCACSA 2012) },
issue_date = { May 2012 },
volume = { NCACSA },
number = { 5 },
month = { May },
year = { 2012 },
issn = { 0975-8887 },
pages = { 21-24 },
numpages = 4,
url = { /proceedings/ncacsa/number5/6508-1033/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Advances in Computer Science and Applications (NCACSA 2012)
%A Syed Thousif Hussain
%T Extracting Images from the Web using Data Mining Technique
%J National Conference on Advances in Computer Science and Applications (NCACSA 2012)
%@ 0975-8887
%V NCACSA
%N 5
%P 21-24
%D 2012
%I International Journal of Computer Applications
Abstract

The objective of this work is to collect a large number of images for a specified object class. The approach employs text, metadata, and visual features to gather many high-quality images from the web. Candidate images are obtained by a text-based web search, and the resulting web pages and the images they contain are downloaded. The task is then to remove irrelevant images and re-rank the remainder. First, the image query results page is downloaded. Second, image URLs are extracted from the downloaded page and stored in a database, and the images are ranked using the surrounding text and metadata features. SVM and Naive Bayes classifiers are compared for this ranking. The top-ranked images are then used as training data to learn an SVM visual classifier that improves the re-ranking. The principal idea of the overall method is to combine text, metadata, and visual features to achieve completely automatic ranking of images.
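The first two stages described in the abstract (extracting image URLs from a downloaded page, then ranking them on text features) can be illustrated with a minimal Python sketch. This is not the author's implementation. The class and function names, the example URL, and the toy training texts are all hypothetical, and the `alt` attribute is used as a stand-in for the surrounding text and metadata features. A small Naive Bayes scorer with Laplace smoothing stands in for the text-based ranker; the SVM comparison and the visual re-ranking stage are not shown.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageExtractor(HTMLParser):
    """Collect each <img> URL with its alt text (assumed text feature)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []  # list of (absolute_url, alt_text)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src")
        if src:
            # Resolve relative image paths against the page URL.
            self.images.append((urljoin(self.base_url, src), attr.get("alt") or ""))


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextRanker:
    """Two-class multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def _log_joint(self, tokens, label):
        total = sum(self.word_counts[label].values())
        prior = self.doc_counts[label] / sum(self.doc_counts.values())
        score = math.log(prior)
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def score(self, text):
        """Log-odds that the text describes a relevant image."""
        tokens = tokenize(text)
        return self._log_joint(tokens, True) - self._log_joint(tokens, False)


def rank_images(images, ranker):
    """Sort (url, text) pairs from most to least likely relevant."""
    return sorted(images, key=lambda img: ranker.score(img[1]), reverse=True)


# Toy page and training texts, invented for illustration only.
page = """
<html><body>
  <img src="/img/logo.png" alt="company logo banner">
  <img src="img/p1.jpg" alt="emperor penguin on ice">
  <img src="http://example.com/img/p2.jpg" alt="penguin colony">
</body></html>
"""
extractor = ImageExtractor("http://example.com/gallery/")
extractor.feed(page)

ranker = NaiveBayesTextRanker()
ranker.fit(
    ["penguin bird antarctica ice", "penguin chick colony",
     "logo banner advert", "site icon button"],
    [True, True, False, False],
)
ranked = rank_images(extractor.images, ranker)
for url, alt in ranked:
    print(url, "|", alt)
```

In a fuller pipeline under the same assumptions, the top-ranked entries of `ranked` would supply the noisy training set for the visual classifier mentioned in the abstract.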

Index Terms

Computer Science
Information Sciences

Keywords

Image Retrieval, Object Recognition, Computer Vision, Weakly Supervised