Deep Web Mining: A Gold Mine

Research Article

Published in August 2011 by Tejaswini A. Bhosale, Priya B. Pandharbale

National Technical Symposium on Advancements in Computing Technologies

Foundation of Computer Science USA

NTSACT - Number 4

August 2011

Authors: Tejaswini A. Bhosale, Priya B. Pandharbale

Tejaswini A. Bhosale, Priya B. Pandharbale. Deep Web Mining: A Gold Mine. National Technical Symposium on Advancements in Computing Technologies. NTSACT, 4 (August 2011), 6-11.

@article{
author = { Tejaswini A. Bhosale and Priya B. Pandharbale },
title = { Deep Web Mining: A Gold Mine },
journal = { National Technical Symposium on Advancements in Computing Technologies },
issue_date = { August 2011 },
volume = { NTSACT },
number = { 4 },
month = { August },
year = { 2011 },
issn = { 0975-8887 },
pages = { 6-11 },
numpages = { 6 },
url = { /proceedings/ntsact/number4/3210-ntst031/ },
publisher = { Foundation of Computer Science (FCS), NY, USA },
address = { New York, USA }
}

%0 Proceeding Article
%1 National Technical Symposium on Advancements in Computing Technologies
%A Tejaswini A. Bhosale
%A Priya B. Pandharbale
%T Deep Web Mining: A Gold Mine
%J National Technical Symposium on Advancements in Computing Technologies
%@ 0975-8887
%V NTSACT
%N 4
%P 6-11
%D 2011
%I International Journal of Computer Applications

Abstract

The deep Web contains a far larger volume of valuable information than the surface Web. However, making use of this information requires substantial effort, because deep Web pages are generated for visualization, not for data exchange. Extracting structured data from them is therefore a challenging problem, owing to the intricate underlying structure of such pages. Extracting information from searchable Websites has consequently become a key step in Web information integration. We discuss some of the underlying problems and issues central to extending information retrieval systems.

Index Terms

Computer Science

Information Sciences

Keywords

Deep Web, Surface Web, Web-mining