Research Article

DBCLUM: Density-based Clustering and Merging Algorithm

by Mohammad Fawzy, Amr Badr, Mostafa Reda, Ibrahim Farag
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 79 - Number 14
Year of Publication: 2013
Authors: Mohammad Fawzy, Amr Badr, Mostafa Reda, Ibrahim Farag
10.5120/13806-1732

Mohammad Fawzy, Amr Badr, Mostafa Reda, Ibrahim Farag. DBCLUM: Density-based Clustering and Merging Algorithm. International Journal of Computer Applications. 79, 14 (October 2013), 1-6. DOI=10.5120/13806-1732

@article{ 10.5120/13806-1732,
author = { Mohammad Fawzy and Amr Badr and Mostafa Reda and Ibrahim Farag },
title = { DBCLUM: Density-based Clustering and Merging Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { October 2013 },
volume = { 79 },
number = { 14 },
month = { October },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume79/number14/13806-1732/ },
doi = { 10.5120/13806-1732 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:52:57.377023+05:30
%A Mohammad Fawzy
%A Amr Badr
%A Mostafa Reda
%A Ibrahim Farag
%T DBCLUM: Density-based Clustering and Merging Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 79
%N 14
%P 1-6
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering is a primary method for database mining. The clustering process becomes very challenging when the data has different densities, sizes, or shapes, or contains noise and outliers. Many existing algorithms are designed to find clusters, but they fail to discover clusters of different shapes, densities, and sizes. This paper presents a new algorithm called DBCLUM, an extension of DBSCAN that discovers clusters based on density. DBSCAN can discover clusters with arbitrary shapes but fails to discover clusters of different densities or adjacent clusters. DBCLUM is developed to overcome these problems: it discovers clusters individually, then merges them if they are similar in density and joined. With this approach, DBCLUM can discover both different-density clusters and adjacent clusters. Experiments showed that DBCLUM discovers adjacent clusters and different-density clusters and is faster than DBSCAN, with a speedup ranging from 11% to 52%.
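
The "discover individually, then merge" idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how density similarity or joinedness is measured, so everything below is an assumption made for illustration and is not the authors' method: the density proxy (mean nearest-neighbour distance), the `join_dist` and `max_ratio` thresholds, and all function names are hypothetical. The sketch takes clusters that have already been discovered and merges pairs that are both close to each other and similar in density.

```python
from itertools import combinations
from math import dist


def mean_nn_distance(cluster):
    """Density proxy (assumption): mean distance from each point to its
    nearest neighbour in the same cluster. Needs at least 2 points."""
    return sum(
        min(dist(p, q) for j, q in enumerate(cluster) if j != i)
        for i, p in enumerate(cluster)
    ) / len(cluster)


def are_joined(a, b, join_dist):
    """Assumption: two clusters are 'joined' if any cross pair of points
    lies within join_dist of each other."""
    return any(dist(p, q) <= join_dist for p in a for q in b)


def are_density_similar(a, b, max_ratio):
    """Assumption: densities are 'similar' if the two proxies differ by
    at most a factor of max_ratio."""
    da, db = mean_nn_distance(a), mean_nn_distance(b)
    return max(da, db) <= max_ratio * min(da, db)


def merge_clusters(clusters, join_dist, max_ratio):
    """Merge every pair of input clusters that is both joined and
    density-similar, using union-find to group the results.
    Similarity is evaluated on the original clusters only."""
    parent = list(range(len(clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(clusters)), 2):
        if are_joined(clusters[i], clusters[j], join_dist) and \
                are_density_similar(clusters[i], clusters[j], max_ratio):
            parent[find(j)] = find(i)

    merged = {}
    for i, cluster in enumerate(clusters):
        merged.setdefault(find(i), []).extend(cluster)
    return list(merged.values())


# Two adjacent dense clusters and one adjacent sparse cluster (made-up data).
dense_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
dense_b = [(2, 0), (2, 1), (3, 0), (3, 1)]
sparse_c = [(4.5, 0), (4.5, 3), (7.5, 0), (7.5, 3)]

merged = merge_clusters([dense_a, dense_b, sparse_c], join_dist=1.5, max_ratio=1.5)
print([len(c) for c in merged])  # [8, 4]
```

In this toy input the two dense clusters are merged because they touch and have the same point spacing, while the sparse cluster stays separate even though it is adjacent, since its spacing is three times larger.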

References
  1. M. Ester, H.-P. Kriegel, J. Sander, and X. Xu (1996) "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise", in: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD'96), Portland, Oregon, pp. 226-231.
  2. P-N. Tan, M. Steinbach, V. Kumar (2005) "Introduction to Data Mining", Addison-Wesley.
  3. G. Karypis, E. H. Han, and V. Kumar (1999) "CHAMELEON: A hierarchical clustering algorithm using dynamic modeling," Computer, vol. 32, no. 8, pp. 68–75.
  4. M. Ankerst, M. Breunig, H.-P. Kriegel, and J. Sander (1999) "OPTICS: Ordering Points to Identify the Clustering Structure", in: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 49–60.
  5. Derya Birant, Alp Kut (2007) "ST-DBSCAN: An Algorithm for Clustering Spatial-temporal data", Data and Knowledge Engineering, pp. 208-221.
  6. SHOU Shui-geng, ZHOU Ao-ying, JIN Wen, FAN Ye and QIAN Wei-ning (2000) "A Fast DBSCAN Algorithm", Journal of Software, pp. 735-744.
  7. N. A. Yousri, M. S. Kamel, and M. A. Ismail (2009) "A distance-relatedness dynamic model for clustering high dimensional data of arbitrary shapes and densities," Pattern Recognition, pp. 1193-1209.
  8. J. MacQueen (1967) "Some methods for classification and analysis of multivariate observations", in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297.
  9. H. Vinod (1969) "Integer programming and the theory of grouping", Journal of the American Statistical Association 64 (326) 506–519.
  10. L. Kaufman and P. Rousseeuw (1990) "Finding Groups in Data: An Introduction to Cluster Analysis": Wiley.
  11. R. T. Ng, J. Han (1994) "Efficient and effective clustering methods for spatial data mining", in: Proceedings of 20th International Conference on Very Large Data Bases, Santiago, Chile, pp. 144–155.
  12. M. Ester, H.-P. Kriegel, J. Sander, M. Wimmer, X. Xu (1998) "Incremental clustering for mining in a data warehousing environment", in: Proceedings of International Conference on Very Large Databases (VLDB'98), New York, USA, pp. 323–333.
  13. E. Januzaj, H.-P. Kriegel, M. Pfeifle (2004) "Scalable density-based distributed clustering", in: Proceedings of PKDD, Pisa, Italy, Lecture Notes in Computer Science, 3202, Springer, pp. 231–244.
  14. S. Guha, R. Rastogi, K. Shim (1998) "CURE: an efficient clustering algorithm for large databases", in: Proceedings of the ACM SIGMOD International Conference on Management of Data, Seattle, WA, pp. 73–84.
  15. A. Hinneburg, D. A. Keim (1998) "An efficient approach to clustering in large multimedia databases with noise", in: Proceedings of 4th International Conference on Knowledge Discovery and Data Mining, New York City, NY, pp. 58–65.
  16. S. Guha, R. Rastogi, K. Shim (2000) "ROCK: A robust clustering algorithm for categorical attributes," Inf. Syst., vol. 25, no. 5, pp. 345–366.
  17. I. Witten, E. Frank, L. Trigg, M. Hall, G. Holmes, and S. Cunningham (1999) "Weka: Practical machine learning tools and techniques with java implementations".
Index Terms

Computer Science
Information Sciences

Keywords

Data mining, DBSCAN, Density-Based Clustering