International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 159 - Number 1
Year of Publication: 2017
Authors: Abha Tewari, Pratik Sawant, Jai Samtani, Sanket Sawant, Gaurav Massand
DOI: 10.5120/ijca2017912209
Abha Tewari, Pratik Sawant, Jai Samtani, Sanket Sawant, Gaurav Massand. Multilabel Classification of Tweets. International Journal of Computer Applications. 159, 1 (Feb 2017), 1-4. DOI=10.5120/ijca2017912209
With the help of social networking sites, many news providers share their news headlines on microblogging sites such as Twitter. We propose a system that classifies tweets into groups and labels so that a user can identify tweets from a particular category. Tweets of 120 characters will be used for the analysis, extracted from a selection of active, verified Twitter accounts. Each tweet is first classified into one of two categories, spam or non-spam. Spam tweets are then further classified as advertisement, malicious, or URL links, while non-spam tweets are classified into six labels. The classified tweets are then used to train several machine learning techniques. The words of each tweet serve as features, and a feature vector is built for each tweet using the bag-of-words approach to create the instances. The data will be used to train SVM (Support Vector Machine), Naive Bayes, and k-nearest neighbor classifiers, and their efficiency will be compared.
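The first stage of the pipeline described above (bag-of-words features feeding a spam versus non-spam classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tweets in `TRAIN`, the function names, and the choice of add-one smoothing and `k=3` are all assumptions made for illustration. Only Naive Bayes and k-nearest neighbor are shown, written with the Python standard library; an SVM would normally come from a machine learning library and is omitted here.

```python
import math
from collections import Counter

# Hypothetical toy data (tweet text, label); not taken from the paper.
TRAIN = [
    ("win a free phone click now", "spam"),
    ("free prize click this link now", "spam"),
    ("cheap followers buy now free", "spam"),
    ("government announces new budget today", "non-spam"),
    ("team wins the final match today", "non-spam"),
    ("new budget boosts the markets", "non-spam"),
]

def tokenize(text):
    """Lowercase the tweet and split it on whitespace."""
    return text.lower().split()

def build_vocabulary(texts):
    """Sorted list of every distinct word seen in the training tweets."""
    return sorted({word for text in texts for word in tokenize(text)})

def bag_of_words(text, vocab):
    """Count vector over the vocabulary; unseen words are ignored."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]

def train_naive_bayes(vectors, labels):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    n_features = len(vectors[0])
    model = {}
    for label in set(labels):
        rows = [v for v, l in zip(vectors, labels) if l == label]
        word_totals = [sum(column) for column in zip(*rows)]
        total = sum(word_totals)
        log_prior = math.log(len(rows) / len(vectors))
        log_likelihood = [
            math.log((count + 1) / (total + n_features))
            for count in word_totals
        ]
        model[label] = (log_prior, log_likelihood)
    return model

def predict_naive_bayes(model, vector):
    """Return the label with the highest log posterior score."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(x * ll for x, ll in zip(vector, log_likelihood))
    return max(model, key=score)

def predict_knn(vectors, labels, vector, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    nearest = sorted(
        (math.dist(v, vector), l) for v, l in zip(vectors, labels)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

texts, labels = zip(*TRAIN)
vocab = build_vocabulary(texts)
vectors = [bag_of_words(text, vocab) for text in texts]
nb_model = train_naive_bayes(vectors, labels)

query = bag_of_words("click now for a free prize", vocab)
print(predict_naive_bayes(nb_model, query), predict_knn(vectors, labels, query))
```

The same feature vectors could be reused for the second-stage labels (the spam subtypes and the six non-spam labels) by swapping in the corresponding label set, which is what makes a side-by-side comparison of classifiers straightforward.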