International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 76 - Number 17
Year of Publication: 2013
Authors: Tanvir Ahmad |
10.5120/13343-0924 |
Tanvir Ahmad. Clustering Technique for Feature Segregation in Opinion Analysis. International Journal of Computer Applications. 76, 17 (August 2013), 43-49. DOI=10.5120/13343-0924
The World Wide Web (WWW) is a reservoir of an enormous amount of data, most of it embedded in unstructured text documents. E-commerce websites, social networking sites, and discussion forums have become common places for writing informal opinions about products and related information. A substantial amount of research has been directed towards mining these texts to determine the overall sentiment of users and to assign a grade to the products under discussion. Such grading systems help users form an informed opinion about the products they want to buy. Developers of opinion websites have adopted different techniques to give end users an overall sense of the content, such as numerical ratings on a predefined scale, star ratings, and the percentage of users who are satisfied or dissatisfied with a product. However, none of these methods segregates features on the basis of the opinions expressed about them, or clusters them into groups that give a general insight into the features grouped together. This paper presents a framework that first extracts the feature, modifier, and opinion from the dataset and then uses a clustering mechanism to divide the features into discrete clusters on the basis of users' opinions, such that intra-cluster similarity between features is high whereas inter-cluster similarity is very low.
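The two stages named in the abstract (extraction of feature, modifier, and opinion, followed by opinion-based clustering) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the scoring scheme, so everything below is an assumption made for illustration: the sample triples, the `POLARITY` and `INTENSIFIERS` lexicons, the helper names `feature_scores` and `kmeans_1d`, and the use of a simple one-dimensional k-means as a stand-in for the paper's clustering mechanism.

```python
from collections import defaultdict

# Hypothetical (feature, modifier, opinion) triples, as an extraction stage
# might emit them. These are illustrative and not data from the paper.
TRIPLES = [
    ("battery", "very", "good"),
    ("battery", "", "excellent"),
    ("screen", "", "good"),
    ("screen", "really", "great"),
    ("camera", "", "poor"),
    ("camera", "very", "bad"),
    ("speaker", "", "bad"),
]

# Assumed toy lexicons: opinion-word polarity and modifier weights.
POLARITY = {"excellent": 1.0, "great": 1.0, "good": 0.5, "poor": -0.5, "bad": -1.0}
INTENSIFIERS = {"very": 1.5, "really": 1.5}


def feature_scores(triples):
    """Average the modifier-weighted opinion polarity for each feature."""
    per_feature = defaultdict(list)
    for feature, modifier, opinion in triples:
        weight = INTENSIFIERS.get(modifier, 1.0)
        per_feature[feature].append(weight * POLARITY.get(opinion, 0.0))
    return {f: sum(v) / len(v) for f, v in per_feature.items()}


def kmeans_1d(scores, k=2, iterations=20):
    """Group features by opinion score with a plain 1-D k-means (an assumed stand-in)."""
    values = sorted(scores.values())
    # Deterministic initialisation: spread centroids across the sorted scores.
    centroids = [values[i * (len(values) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for feature, score in scores.items():
            nearest = min(range(k), key=lambda i: abs(score - centroids[i]))
            clusters[nearest].append(feature)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            sum(scores[f] for f in members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return clusters


if __name__ == "__main__":
    scores = feature_scores(TRIPLES)
    for index, members in enumerate(kmeans_1d(scores, k=2)):
        print(index, sorted(members))
```

With these toy inputs, features that attract negative opinions fall into one cluster and those that attract positive opinions into the other, which is the kind of high intra-cluster and low inter-cluster similarity the abstract describes.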