Research Article

Review On Intelligent Crawling Web Forum

Published on December 2014 by Trupti D. Narkhede, P.m. Yawalkar
Innovations and Trends in Computer and Communication Engineering
Foundation of Computer Science USA
ITCCE - Number 3
December 2014
Authors: Trupti D. Narkhede, P.m. Yawalkar

Trupti D. Narkhede, P.m. Yawalkar. Review On Intelligent Crawling Web Forum. Innovations and Trends in Computer and Communication Engineering. ITCCE, 3 (December 2014), 13-16.

@article{
author = { Trupti D. Narkhede, P.m. Yawalkar },
title = { Review On Intelligent Crawling Web Forum },
journal = { Innovations and Trends in Computer and Communication Engineering },
issue_date = { December 2014 },
volume = { ITCCE },
number = { 3 },
month = { December },
year = { 2014 },
issn = 0975-8887,
pages = { 13-16 },
numpages = 4,
url = { /proceedings/itcce/number3/19055-2021/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 Innovations and Trends in Computer and Communication Engineering
%A Trupti D. Narkhede
%A P.m. Yawalkar
%T Review On Intelligent Crawling Web Forum
%J Innovations and Trends in Computer and Communication Engineering
%@ 0975-8887
%V ITCCE
%N 3
%P 13-16
%D 2014
%I International Journal of Computer Applications
Abstract

Internet forums are important services where users can request and exchange information with one another. FoCUS (Forum Crawler Under Supervision) is a web-scale forum crawler. A crawler traverses the World Wide Web in a systematic manner with the intention of gathering data or knowledge, following the hyperlinks in web pages to automatically download a partial snapshot of the Web. The goal of FoCUS is to crawl relevant forum content from the web. Although forums have different layouts or styles and are powered by different forum software packages, they always have similar implicit navigation paths, connected by specific URL types, that lead users from entry pages to thread pages. Based on this observation, the forum crawling problem is reduced to a URL-type recognition problem.
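
The idea of treating forum crawling as URL-type recognition can be illustrated with a minimal Python sketch. It assumes a made-up forum whose index, thread, and page-flipping URLs follow simple path shapes; the patterns, the `classify_url` and `urls_to_follow` helpers, and the example paths are all hypothetical and hand-written for illustration, whereas FoCUS is described as learning such patterns rather than hard-coding them.

```python
import re
from typing import Iterable, List, Optional

# Hypothetical URL patterns for an imaginary forum (assumption, not from the paper).
# Order matters: the more specific page-flipping pattern is checked first.
URL_TYPE_PATTERNS = [
    ("page-flipping", re.compile(r"^/(?:forum|thread)/\d+/page/\d+$")),
    ("index", re.compile(r"^/forum/\d+$")),
    ("thread", re.compile(r"^/thread/\d+$")),
]


def classify_url(path: str) -> Optional[str]:
    """Return the URL type of a path, or None if it matches no known type."""
    for url_type, pattern in URL_TYPE_PATTERNS:
        if pattern.match(path):
            return url_type
    return None


def urls_to_follow(paths: Iterable[str]) -> List[str]:
    """Keep only the links whose URL type lies on the entry-to-thread path."""
    return [p for p in paths if classify_url(p) is not None]


if __name__ == "__main__":
    links = ["/forum/12", "/thread/345", "/thread/345/page/2", "/user/9", "/login"]
    for link in links:
        print(link, "->", classify_url(link))
    print("follow:", urls_to_follow(links))
```

In this sketch a crawler would discard links such as user profiles or login pages and follow only index, thread, and page-flipping links, which is the sense in which recognizing URL types stands in for deciding what to crawl.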

References
  1. Jingtian Jiang, Xinying Song, Nenghai Yu, and Chin-Yew Lin, "FoCUS: Learning to Crawl Web Forums," IEEE Trans. Knowledge and Data Eng., vol. 25, no. 6, June 2013.
  2. Y. Zhai and B. Liu, "Structured Data Extraction from the Web Based on Partial Tree Alignment," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 12, pp. 1614-1628, Dec. 2006.
  3. "ForumMatrix," http://www.forummatrix.org/index.php, 2012.
  4. M. L. A. Vidal, A. S. Silva, E. S. Moura, and J. M. B. Cavalcanti, "Structure-Driven Crawler Generation by Example," Proc. 29th Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 292-299, 2006.
  5. R. Cai, J.-M. Yang, W. Lai, Y. Wang, and L. Zhang, "iRobot: An Intelligent Crawler for Web Forums," Proc. 17th Int'l Conf. World Wide Web, pp. 447-456, 2008.
  6. J.-M. Yang, R. Cai, Y. Wang, J. Zhu, L. Zhang, and W.-Y. Ma, "Incorporating Site-Level Knowledge to Extract Structured Data from Web Forums," Proc. 18th Int'l Conf. World Wide Web, pp. 181-190, 2009.
  7. K. Li, X. Q. Cheng, Y. Guo, and K. Zhang, "Crawling Dynamic Web Pages in WWW Forums," Computer Eng., vol. 33, no. 6, pp. 80-82, 2007.
Index Terms

Computer Science
Information Sciences

Keywords

EIT Path, Forum Crawling, ITF Regex, URL Type, Page Classification, Page Type.