Research Article

Performance Evaluation and Estimation for Concept Drifting Data Stream Mining

by Veena Mittal, Indu Kashyap
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 180 - Number 42
Year of Publication: 2018
Authors: Veena Mittal, Indu Kashyap
DOI: 10.5120/ijca2018917105

Veena Mittal, Indu Kashyap. Performance Evaluation and Estimation for Concept Drifting Data Stream Mining. International Journal of Computer Applications 180, 42 (May 2018), 10-15. DOI=10.5120/ijca2018917105

@article{10.5120/ijca2018917105,
  author     = {Veena Mittal and Indu Kashyap},
  title      = {Performance Evaluation and Estimation for Concept Drifting Data Stream Mining},
  journal    = {International Journal of Computer Applications},
  issue_date = {May 2018},
  volume     = {180},
  number     = {42},
  month      = {May},
  year       = {2018},
  issn       = {0975-8887},
  pages      = {10-15},
  numpages   = {6},
  url        = {https://ijcaonline.org/archives/volume180/number42/29410-2018917105/},
  doi        = {10.5120/ijca2018917105},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
%0 Journal Article
%A Veena Mittal
%A Indu Kashyap
%T Performance Evaluation and Estimation for Concept Drifting Data Stream Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 180
%N 42
%P 10-15
%D 2018
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In machine learning, and particularly in classification, performance measures are essential for assessing and comparing classification methods. Classifier accuracy is one of the most commonly used of these measures. However, evaluating performance in the dynamic environments posed by concept drifting data streams requires additional considerations beyond those of classification in static environments. Furthermore, the training and testing strategies widely used for classifiers in static environments cannot be applied directly to concept drifting data stream mining: online learning demands one-pass incremental learning over large datasets, whereas static learning permits iterative passes over small datasets. This paper describes important performance measures and the training and testing strategies suited to online and incremental learning in the presence of concept drifting data streams. It also presents performance measures for the drift detection methods that many concept drifting data stream mining algorithms use as an explicit component.
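
One widely used training-and-testing strategy for streams of the kind discussed above is prequential (test-then-train) evaluation: each example is first used to test the model and only then to update it, so learning stays one-pass. The Python sketch below illustrates this strategy on a synthetic stream with an abrupt concept drift. The stream generator, the OnlineThresholdLearner, and the fading factor of 0.999 are illustrative assumptions for this sketch and are not taken from the paper.

# Minimal sketch of prequential (test-then-train) evaluation on a synthetic
# drifting stream. The learner, the stream, and the fading factor are
# illustrative assumptions, not the paper's method.
import random


def drifting_stream(n=2000, drift_at=1000, seed=7):
    """Yield (x, y) pairs; the decision boundary flips abruptly at `drift_at`."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0.0) if t < drift_at else int(x <= 0.0)  # abrupt drift
        yield x, y


class OnlineThresholdLearner:
    """Toy one-pass learner: keeps incremental counts of which sign rule
    (x > 0 or x <= 0) has matched the labels more often so far."""
    def __init__(self):
        self.agree = 0      # times the label matched int(x > 0)
        self.disagree = 0   # times the label matched int(x <= 0)

    def predict(self, x):
        positive_rule = self.agree >= self.disagree
        return int(x > 0.0) if positive_rule else int(x <= 0.0)

    def learn_one(self, x, y):
        if y == int(x > 0.0):
            self.agree += 1
        else:
            self.disagree += 1


def prequential_evaluation(stream, learner, fading=0.999):
    """Test-then-train: predict on each example before learning from it.
    Returns the overall accuracy and a fading-factor accuracy that
    exponentially discounts old examples."""
    correct = seen = 0
    faded_correct = faded_seen = 0.0
    for x, y in stream:
        hit = int(learner.predict(x) == y)   # test on the yet-unseen example
        learner.learn_one(x, y)              # then train on it (single pass)
        correct += hit
        seen += 1
        faded_correct = fading * faded_correct + hit
        faded_seen = fading * faded_seen + 1.0
    return correct / seen, faded_correct / faded_seen


if __name__ == "__main__":
    overall, faded = prequential_evaluation(drifting_stream(),
                                            OnlineThresholdLearner())
    print(f"overall prequential accuracy: {overall:.3f}")
    print(f"fading-factor accuracy:       {faded:.3f}")

Because the fading-factor estimate discounts old examples, it reflects the accuracy drop after the drift point much sooner than the overall prequential average, which is why such forgetting mechanisms are commonly paired with prequential evaluation in drifting environments.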

Index Terms

Computer Science
Information Sciences

Keywords

Data streams, dynamic environments, classifiers, ensemble learning, online methods.