International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 24
Year of Publication: 2024
Authors: Kshitij Sekhar Dutta
10.5120/ijca2024923700
Kshitij Sekhar Dutta. Data Extraction and Sentiment Analysis of Social Media. International Journal of Computer Applications. 186, 24 (Jun 2024), 23-28. DOI=10.5120/ijca2024923700
Social media has become synonymous with the internet. Social media platforms have evolved from simple forums where people posted photos and thoughts into bases from which people can launch successful businesses or become influencers, potentially earning millions. This paper applies Natural Language Processing, specifically Sentiment Analysis, to one of the most popular social media forums today, Reddit. Reddit has 73.1 million daily active users and 267.5 million weekly active users, and more than 100,000 active subreddits (sub-forums). The paper uses the Reddit APIs to build a crawler that scrapes data from Reddit and organizes it into a single data set, and then examines the structure of that data set. From this data set it analyses which topics were under discussion and what opinions users appeared to hold about them. The paper also investigates whether the amount and the type of emotion expressed are correlated. Finally, it highlights concerns with using Sentiment Analysis and describes some of its other real-life applications.
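As a rough illustration of the pipeline the abstract describes (collect posts, order them into one data set, then score sentiment per topic), the sketch below may help. It is not the paper's implementation: the posts are made-up sample records rather than data fetched through the Reddit APIs, the word lists are a hypothetical toy lexicon standing in for whatever sentiment model the paper uses, and every function name (`score`, `label`, `build_dataset`, `summarize`) is an assumption introduced here.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption, not the paper's sentiment model).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

# Made-up sample records standing in for posts returned by a crawler.
SAMPLE_POSTS = [
    {"subreddit": "technology", "title": "I love this great phone"},
    {"subreddit": "technology", "title": "This update is terrible and awful"},
    {"subreddit": "movies", "title": "The film opens on Friday"},
]


def score(text):
    """Return (positive words - negative words) / token count; 0.0 if empty."""
    tokens = [t.strip(".,!?;:\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def label(value):
    """Map a numeric score to a coarse sentiment label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def build_dataset(posts):
    """Order raw post records into one flat list of rows with sentiment."""
    rows = []
    for post in posts:
        value = score(post["title"])
        rows.append({
            "subreddit": post["subreddit"],
            "title": post["title"],
            "score": value,
            "label": label(value),
        })
    return rows


def summarize(rows):
    """Count sentiment labels per subreddit."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for row in rows:
        counts[row["subreddit"]][row["label"]] += 1
    return {name: dict(c) for name, c in counts.items()}


if __name__ == "__main__":
    for name, c in summarize(build_dataset(SAMPLE_POSTS)).items():
        print(name, c)
```

A lexicon count is used here only because it runs without external dependencies; a real study would more plausibly replace `score` with an established sentiment model and `SAMPLE_POSTS` with records retrieved from the Reddit APIs.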