2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013)
Foundation of Computer Science USA
NCIPET - Number 13
March 2012
Authors: Sumedha S. Parshurame
Sumedha S. Parshurame. An evolved classification and clustering approach for the detection of web spam. 2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013). NCIPET, 13 (March 2012), 26-29.
Web spam denotes the manipulation of web pages with the sole intent of raising their position in search engine rankings. Because a better ranking directly increases the number of visits to a site, attackers use a variety of techniques to boost their pages to higher ranks. In the best case, web spam pages are a nuisance that earns their owners undeserved advertising revenue. In the worst case, these pages pose a threat to Internet users by hosting malicious content and launching drive-by attacks against unsuspecting victims. When successful, these drive-by attacks install malware on the victims' machines. In this paper, we introduce a clustering and classification approach to detect spam web pages in the list of results returned by a search engine. We first apply a K-nearest neighbor approach for clustering, and then apply K-means classification to those links to categorize each as either spam or non-spam.
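
As a rough illustration of combining clustering with classification for spam detection, the following minimal Python sketch groups a handful of pages and then labels an unseen page. It is not the paper's implementation. It uses the two algorithms in their conventional roles (K-means to group pages, K-nearest neighbor to assign a label), which may differ from the order described above. The two features (keyword density and share of outbound links), the toy values, and all function names are hypothetical assumptions made for this example.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Plain K-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        assign = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign, centroids


def knn_predict(train, labels, query, k=3):
    """Label a query by majority vote among its k nearest labelled pages."""
    nearest = sorted(range(len(train)),
                     key=lambda i: dist(train[i], query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical features per page: (keyword density, share of outbound links).
pages = [(0.05, 0.10), (0.08, 0.15), (0.06, 0.12),
         (0.60, 0.80), (0.55, 0.90), (0.65, 0.85)]
labels = ["non-spam"] * 3 + ["spam"] * 3  # assumed ground truth for the toy set

clusters, _ = kmeans(pages, k=2)
verdict = knn_predict(pages, labels, (0.58, 0.82))
print(clusters, verdict)
```

On this toy data the two groups are well separated, so the clustering recovers them and the unseen page is voted into the spam group. Real search results would need richer features and a labelled training set.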