Research Article

What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data

by Samhaa R. El-beltagy, Moustafa Ghanem, Heba Ezzat, Sourya Ezzat, Mohmmed Aboelhouda, Ahmed Gamal, Mohamed Elkalioby, Shady Alaa Issa
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 67 - Number 6
Year of Publication: 2013
DOI: 10.5120/11399-6712

Samhaa R. El-beltagy, Moustafa Ghanem, Heba Ezzat, Sourya Ezzat, Mohmmed Aboelhouda, Ahmed Gamal, Mohamed Elkalioby, Shady Alaa Issa. What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data. International Journal of Computer Applications. 67, 6 (April 2013), 21-28. DOI=10.5120/11399-6712

@article{ 10.5120/11399-6712,
author = { Samhaa R. El-beltagy and Moustafa Ghanem and Heba Ezzat and Sourya Ezzat and Mohmmed Aboelhouda and Ahmed Gamal and Mohamed Elkalioby and Shady Alaa Issa },
title = { What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data },
journal = { International Journal of Computer Applications },
issue_date = { April 2013 },
volume = { 67 },
number = { 6 },
month = { April },
year = { 2013 },
issn = { 0975-8887 },
pages = { 21-28 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume67/number6/11399-6712/ },
doi = { 10.5120/11399-6712 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:23:58.055957+05:30
%A Samhaa R. El-beltagy
%A Moustafa Ghanem
%A Heba Ezzat
%A Sourya Ezzat
%A Mohmmed Aboelhouda
%A Ahmed Gamal
%A Mohamed Elkalioby
%A Shady Alaa Issa
%T What the Masses Want: A Case Study in Knowledge Discovery from Politically Oriented Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 67
%N 6
%P 21-28
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper describes an approach to analyzing and categorizing a sizable dataset of politically oriented posts submitted to Egypt 2.0, a popular idea bank created following the Egyptian revolution. The aim of the analysis was to organize and present the data simply, so that decision makers and activists could hear the voice of the people during a critical six-week period in February and March 2011. The constraints faced in developing the approach included the absence of a classification scheme, the unavailability of training data, the need to assign more than one category, or label, to individual posts, and the need to complete the task in a short period of time. The goal of this paper is twofold: first, to present and evaluate the rapid development framework and algorithms used to organize the data; second, to document the challenges encountered both in developing the system and in analyzing the data, and to share our experience with the research community with the aim of identifying potentially interesting new research topics.

Index Terms

Computer Science
Information Sciences

Keywords

Text Mining, Text Analysis, Topic Categorization, Multi-Labeling