Research Article

Analysis of Opinion Mining on Social Media Data Streams using Hadoop

by Padala S. Venkata Durga Gayatri, Archana Raghuvamshi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 155 - Number 6
Year of Publication: 2016
Authors: Padala S. Venkata Durga Gayatri, Archana Raghuvamshi
10.5120/ijca2016912336

Padala S. Venkata Durga Gayatri, Archana Raghuvamshi. Analysis of Opinion Mining on Social Media Data Streams using Hadoop. International Journal of Computer Applications. 155, 6 (Dec 2016), 45-49. DOI=10.5120/ijca2016912336

@article{ 10.5120/ijca2016912336,
author = { Padala S. Venkata Durga Gayatri, Archana Raghuvamshi },
title = { Analysis of Opinion Mining on Social Media Data Streams using Hadoop },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2016 },
volume = { 155 },
number = { 6 },
month = { Dec },
year = { 2016 },
issn = { 0975-8887 },
pages = { 45-49 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume155/number6/26613-2016912336/ },
doi = { 10.5120/ijca2016912336 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:00:36.082203+05:30
%A Padala S. Venkata Durga Gayatri
%A Archana Raghuvamshi
%T Analysis of Opinion Mining on Social Media Data Streams using Hadoop
%J International Journal of Computer Applications
%@ 0975-8887
%V 155
%N 6
%P 45-49
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Twitter is a social networking site that produces large volumes of data in the form of structured, semi-structured, and unstructured streams. Opinion mining over Twitter offers organizations a fast and effective way to monitor public sentiment towards their services. It predicts the polarity of words and classifies them as positive or negative, with the aim of identifying the attitudes and opinions expressed in any form or language. Bian et al. (2012) annotated a Twitter corpus focused on Adverse Drug Reactions (ADRs) with broad pharmacological coverage. Bingwei et al. (2013) evaluated the scalability of the Naive Bayes classifier (NBC) on large datasets without relying on a standard library. Skuza et al. (2015) estimated future stock prices by running the calculations in a distributed environment under the Map-Reduce programming model. Mohit et al. (2014) explain how the Map-Reduce paradigm can be applied to the existing Naive Bayes algorithm to handle a large number of tweets. All of these approaches are evaluated for accuracy on real-world datasets using the Hadoop File System. This paper presents a comparative analysis of the above methods.
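
As a rough illustration of how Naive Bayes training can be phrased as a map step and a reduce step, the following minimal Python sketch may help. It is not the implementation used in any of the surveyed papers: it runs in a single process rather than on Hadoop, and the function names, the whitespace tokenizer, the add-one smoothing, and the toy tweets are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy data: (label, tweet text) pairs invented for this sketch.
TOY_TWEETS = [
    ("positive", "great service love it"),
    ("positive", "love the fast delivery"),
    ("negative", "terrible service hate it"),
    ("negative", "slow delivery hate waiting"),
]

def map_tweet(label, text):
    """Map step: emit ((label, word), 1) for every token of one labelled tweet."""
    for word in text.lower().split():
        yield (label, word), 1

def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each (label, word) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals

def train(labelled_tweets):
    """Run map then reduce in-process; a cluster would distribute both steps."""
    pairs = [kv for label, text in labelled_tweets for kv in map_tweet(label, text)]
    word_counts = reduce_counts(pairs)
    doc_counts = Counter(label for label, _ in labelled_tweets)
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    vocab = {word for _, word in word_counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(doc_counts):
        label_total = sum(c for (l, _), c in word_counts.items() if l == label)
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            smoothed = (word_counts[(label, word)] + 1) / (label_total + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, doc_counts = train(TOY_TWEETS)
print(classify("love it", word_counts, doc_counts))
```

The only state the classifier needs is the per-class word counts, and those are plain sums over independent tweets, which is why the counting can be split across many mappers and merged by reducers.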

References
  1. T. Wilson, J. Wiebe, and P. Hoffmann, “Recognizing contextual polarity in phrase-level sentiment analysis,” in Proceedings of HLT and EMNLP. ACL, (2005), pp. 347–354.
  2. C. C. Tao, S. K. Kim, Y. A. Lin, Y. Y. Yu, G. Bradski, A. Y. Ng and Kunle Olukotun, “Map-reduce for machine learning on multicore”, In NIPS, vol. 6, (2006), pp. 281-288.
  3. B. Jiang, U. Topaloglu and F. Yu, “Towards large-scale twitter mining for drug-related adverse events”, In Proceedings of the 2012 international workshop on Smart health and wellbeing, ACM, (2012), pp. 25-32.
  4. K. Jiang and Y. Zheng, “Mining Twitter Data for Potential Drug Effects”, In Advanced Data Mining and Applications, Springer, (2013), pp. 434–443.
  5. M. Gamon, A. Aue, S. Corston-Oliver, and E. Ringger, “Pulse: Mining customer opinions from free text,” in Advances in Intelligent Data Analysis VI. Springer, 2005, pp. 121–132.
  6. U. Kang, D. H. Chau, and C. Faloutsos, “Mining large graphs: Algorithms, inference, and discoveries,” in Data Engineering (ICDE), 2011 IEEE 27th International Conference on, 2011, pp. 243–254.
  7. D. Pessemier and Martens “MovieTweetings: A Movie Reviews Dataset Collected From Twitter”, Ghent University, Ghent, Belgium, (2013).
  8. M. Thomas, B. Pang, and L. Lee, “Get out the vote: Determining support or opposition from congressional floor-debate transcripts,” in Proceedings of the 2006 conference on empirical methods in natural language processing. Association for Computational Linguistics, 2006, pp. 327–335.
  9. L. Bingwei, E. Blasch, Y. Chen, D. Shen and G. Chen, “Scalable Sentiment Classification for Big Data Analysis Using Naive Bayes Classifier”, In Big Data, 2013 IEEE International Conference on, IEEE, (2013), pp. 99-104.
  10. Twitter. Twitter Search API, available at https://dev.twitter.com/rest/public/search.
  11. S. Michal and A. Romanowski, “Sentiment analysis of Twitter data within big data distributed environment for stock prediction”, In Computer Science and Information Systems (FedCSIS), 2015 Federated Conference on, IEEE, (2015), pp. 1349-1354.
  12. T. White, “Hadoop: The Definitive Guide”, Third Edition, O'Reilly.
  13. Z. Malkani and E. Gillie, “Supervised Multi-Class Classification of Tweets”, (2012).
  14. T. Mohit, I. Gohokar, J. Sable, D. Paratwar and R. Wajgi, “Multi-Class Tweet Categorization Using Map Reduce Paradigm”, In International Journal of Computer Trends and Technology. (2014), pp. 78-81.
Index Terms

Computer Science
Information Sciences

Keywords

Twitter, social networking sites, Naive Bayes Classifier (NBC), Map-Reduce, Hadoop File System (HDFS).