Research Article

Mining Concept Drift from Data Streams by Unsupervised Learning

by E. Padmalatha, C. R. K. Reddy, Padmaja Rani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 117 - Number 15
Year of Publication: 2015
Authors: E. Padmalatha, C. R. K. Reddy, Padmaja Rani
10.5120/20632-3255

E. Padmalatha, C. R. K. Reddy, Padmaja Rani. Mining Concept Drift from Data Streams by Unsupervised Learning. International Journal of Computer Applications. 117, 15 (May 2015), 27-34. DOI=10.5120/20632-3255

@article{ 10.5120/20632-3255,
author = { E. Padmalatha and C. R. K. Reddy and Padmaja Rani },
title = { Mining Concept Drift from Data Streams by Unsupervised Learning },
journal = { International Journal of Computer Applications },
issue_date = { May 2015 },
volume = { 117 },
number = { 15 },
month = { May },
year = { 2015 },
issn = { 0975-8887 },
pages = { 27-34 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume117/number15/20632-3255/ },
doi = { 10.5120/20632-3255 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:00:21.291231+05:30
%A E. Padmalatha
%A C. R. K. Reddy
%A Padmaja Rani
%T Mining Concept Drift from Data Streams by Unsupervised Learning
%J International Journal of Computer Applications
%@ 0975-8887
%V 117
%N 15
%P 27-34
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is concerned with discovering previously unknown characteristics of the data held in databases, that is, with gaining knowledge (Knowledge Discovery in Databases) in order to extract more useful information. In real-time databases, which change constantly over time, there may come a point at which traditional data mining techniques are no longer adequate, because a previously unknown class label or new properties of the data must be taken into consideration. Thus, as time passes and new data enter the dataset, the model built by the data mining techniques may become less accurate. This phenomenon is known as concept drift. Concept drift refers to how the statistical properties of the target variable change over the course of time. The basic idea behind "Mining Concept Drift from Data Streams by Unsupervised Learning" is to detect the concept drift present in the data stream, which is used in the majority of web-based applications such as fraud detection and spam e-mail filtering. The approach taken here covers both an offline and an online setting, and can be easily merged with current web-based applications. Some examples of concept drift: in a fraud detection application, the target concept may be a binary attribute FRAUDULENT with values "yes" or "no" that indicates whether a given transaction is fraudulent. Or, in a weather prediction application, there may be several target concepts such as TEMPERATURE, PRESSURE, and HUMIDITY. Each of these target parameters changes over time, and our model should be able to accommodate these changes in the concept. To overcome the limitations of the existing offline, desktop-based processing for detecting concept drift, the aim here is to move the concept drift detection process to the cloud (web) and make it available to web-based applications as well.
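
As a rough illustration of what detecting drift without labels can look like, the sketch below compares the mean of a recent window of a numeric stream against a reference window and flags a drift point when the two differ by more than a chosen number of standard errors. This is not the method of the article, whose details are not given in the abstract; the function name detect_drift, the window size, and the threshold are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, pstdev


def detect_drift(stream, window=50, threshold=3.0):
    """Return the stream indices at which drift is flagged.

    Hypothetical window-based check (an assumption, not the article's
    method): drift is flagged when the mean of the most recent `window`
    values differs from the mean of a reference window by more than
    `threshold` standard errors of the reference window.
    """
    reference = []                  # values describing the current concept
    recent = deque(maxlen=window)   # sliding window of the newest values
    drift_points = []

    for i, x in enumerate(stream):
        # Phase 1: collect the reference window for the current concept.
        if len(reference) < window:
            reference.append(x)
            continue

        # Phase 2: slide the recent window and wait until it is full.
        recent.append(x)
        if len(recent) < window:
            continue

        ref_std = pstdev(reference) or 1e-12   # guard against zero spread
        std_err = ref_std / window ** 0.5
        z = abs(mean(recent) - mean(reference)) / std_err

        if z > threshold:
            drift_points.append(i)
            # Start over: learn the new concept from fresh data only.
            reference = []
            recent.clear()

    return drift_points


# Synthetic stream: values alternate 0, 1 and then shift to 5, 6 at index 200.
stream = [(i % 2) + (5 if i >= 200 else 0) for i in range(400)]
print(detect_drift(stream))
```

On this synthetic stream the check flags a single drift point shortly after the shift at index 200, and nothing on a stream that never changes. A real stream is noisy, so the threshold and window size would need tuning to balance detection delay against false alarms.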

Index Terms

Computer Science
Information Sciences

Keywords

Concept Drift, Data Mining, Data Stream