Research Article

Web Page Categorization using Multilayer Perceptron with Reduced Features

by Kavitha S, Vijaya M S
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 65 - Number 1
Year of Publication: 2013
10.5120/10889-5786

Kavitha S, Vijaya M S. Web Page Categorization using Multilayer Perceptron with Reduced Features. International Journal of Computer Applications. 65, 1 (March 2013), 22-27. DOI=10.5120/10889-5786

@article{ 10.5120/10889-5786,
author = { Kavitha S and Vijaya M S },
title = { Web Page Categorization using Multilayer Perceptron with Reduced Features },
journal = { International Journal of Computer Applications },
issue_date = { March 2013 },
volume = { 65 },
number = { 1 },
month = { March },
year = { 2013 },
issn = { 0975-8887 },
pages = { 22-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume65/number1/10889-5786/ },
doi = { 10.5120/10889-5786 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:17:32.767587+05:30
%A Kavitha S
%A Vijaya M S
%T Web Page Categorization using Multilayer Perceptron with Reduced Features
%J International Journal of Computer Applications
%@ 0975-8887
%V 65
%N 1
%P 22-27
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The web is a huge repository of knowledge and hyperlinks. It serves a broad diversity of user communities and global information service centers, and the knowledge held in web pages grows rapidly every day. The sheer size of the web complicates web information retrieval, web content filtering and web structure mining. Hence, proper categorization of web pages is essential. This paper formulates web page categorization as a multi-class classification task and provides a solution using a supervised learning technique, the multilayer perceptron. The classification model is generated by learning features extracted from the HTML structure and the URL of each web page. Feature reduction techniques are applied to select optimum features, and a second model is learned from the reduced set. The multilayer perceptron models before and after feature reduction have been evaluated experimentally, and the model with reduced features is observed to perform well.
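
As a rough illustration of the pipeline outlined in the abstract, the following minimal sketch derives numeric features from a page's HTML structure and URL and then discards features that carry no information. The tracked tags, the URL measures, the `TagCounter`, `extract_features` and `drop_constant_features` names, and the constant-column filter are all assumptions made for this example; the abstract does not specify the paper's actual feature set or reduction techniques.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class TagCounter(HTMLParser):
    """Counts a few structural HTML tags (an assumed, illustrative feature set)."""

    TRACKED = ("a", "img", "h1", "p")

    def __init__(self):
        super().__init__()
        self.counts = {tag: 0 for tag in self.TRACKED}

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names before calling this hook.
        if tag in self.counts:
            self.counts[tag] += 1


def extract_features(html, url):
    """Return a fixed-order numeric feature vector for one page."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    path = urlparse(url).path
    path_depth = len([part for part in path.split("/") if part])
    return [
        counter.counts["a"],    # number of hyperlinks
        counter.counts["img"],  # number of images
        counter.counts["h1"],   # number of top-level headings
        counter.counts["p"],    # number of paragraphs
        len(url),               # URL length in characters
        path_depth,             # number of non-empty URL path segments
    ]


def drop_constant_features(rows):
    """Keep only columns whose value varies across pages.

    This is a simple stand-in for feature reduction, not the paper's method.
    """
    keep = [i for i in range(len(rows[0])) if len({row[i] for row in rows}) > 1]
    return keep, [[row[i] for i in keep] for row in rows]


# Two hypothetical pages used only to exercise the functions.
pages = [
    ('<h1>News</h1><p>Text</p><a href="/a">a</a>', "http://example.com/news/today"),
    ('<h1>Shop</h1><p>Item</p><p>Item</p><img src="x.png">', "http://example.com/shop"),
]
features = [extract_features(html, url) for html, url in pages]
kept_columns, reduced = drop_constant_features(features)
```

The reduced vectors, paired with category labels, could then be passed to any multilayer perceptron implementation for training, for example scikit-learn's `MLPClassifier`. Comparing a model trained on `features` with one trained on `reduced` would mirror the before-and-after evaluation the abstract describes.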

Index Terms

Computer Science
Information Sciences

Keywords

Categorization, Multilayer perceptron, Training, Web page