Research Article

An Extended Density based Clustering Algorithm for Large Spatial 3D Data using Polyhedron Approach

by Hrishav Bakul Barua, Sauravjyoti Sarmah
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 58 - Number 2
Year of Publication: 2012
Authors: Hrishav Bakul Barua, Sauravjyoti Sarmah
10.5120/9252-3418

Hrishav Bakul Barua, Sauravjyoti Sarmah. An Extended Density based Clustering Algorithm for Large Spatial 3D Data using Polyhedron Approach. International Journal of Computer Applications. 58, 2 (November 2012), 4-15. DOI=10.5120/9252-3418

@article{ 10.5120/9252-3418,
author = { Hrishav Bakul Barua and Sauravjyoti Sarmah },
title = { An Extended Density based Clustering Algorithm for Large Spatial 3D Data using Polyhedron Approach },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 58 },
number = { 2 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 4-15 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume58/number2/9252-3418/ },
doi = { 10.5120/9252-3418 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:01:28.997031+05:30
%A Hrishav Bakul Barua
%A Sauravjyoti Sarmah
%T An Extended Density based Clustering Algorithm for Large Spatial 3D Data using Polyhedron Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 58
%N 2
%P 4-15
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Discovering meaningful patterns and trends in large datasets demands special attention nowadays, and one of the most prevalent and widely studied problems in this area is the accurate detection and formation of clusters. Previous work in this field does not address the clustering of 3D spatial datasets while minimizing the number of input parameters. The objective of this paper is to present a tetrahedron-density based clustering technique for large 3D datasets, named 3D-CATD (Three Dimensional-Clustering Algorithm using Tetrahedron Density), for efficient clustering of 3D spatial data. The algorithm can identify embedded clusters of arbitrary shape as well as multi-density clusters in large 3D spatial datasets. Clustering follows a polyhedron approach, in which the number of points inside each tetrahedron of a polyhedron (the tetrahedron density) is calculated using barycentric formulae for the tetrahedron. The tetrahedron is chosen because the data set can be partitioned more efficiently into tetrahedra than into any other 3D shape, owing to its smaller space dimension. The ratio of the number of points in two tetrahedra forms the basis of nested clustering of 3D data. Experimental results establish the superiority of the technique in terms of cluster quality and complexity.
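The tetrahedron-density idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' 3D-CATD implementation, which is not reproduced on this page. It only shows one standard way to test whether a point lies inside a tetrahedron using barycentric coordinates, count the points that pass the test, and compare the counts of two tetrahedra. All function names (`barycentric`, `contains`, `tetrahedron_density`, `density_ratio`), the tolerance `eps`, and the sample data are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Vec = Tuple[float, float, float]
Tetra = Tuple[Vec, Vec, Vec, Vec]


def _sub(p: Vec, q: Vec) -> Vec:
    """Component-wise difference p - q."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _det3(u: Vec, v: Vec, w: Vec) -> float:
    """Determinant of the 3x3 matrix whose columns are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))


def barycentric(p: Vec, tet: Tetra) -> Tuple[float, float, float, float]:
    """Barycentric weights of p with respect to the four vertices of tet.

    Solves p - a = l1*(b - a) + l2*(c - a) + l3*(d - a) by Cramer's rule.
    """
    a, b, c, d = tet
    e1, e2, e3, r = _sub(b, a), _sub(c, a), _sub(d, a), _sub(p, a)
    det = _det3(e1, e2, e3)
    if det == 0:
        raise ValueError("degenerate tetrahedron (zero volume)")
    l1 = _det3(r, e2, e3) / det
    l2 = _det3(e1, r, e3) / det
    l3 = _det3(e1, e2, r) / det
    return (1.0 - l1 - l2 - l3, l1, l2, l3)


def contains(tet: Tetra, p: Vec, eps: float = 1e-12) -> bool:
    """True if p is inside or on the boundary of tet (all weights >= 0)."""
    return all(w >= -eps for w in barycentric(p, tet))


def tetrahedron_density(points: Iterable[Vec], tet: Tetra) -> int:
    """Number of points that fall inside tet."""
    return sum(1 for p in points if contains(tet, p))


def density_ratio(points: Iterable[Vec], tet_a: Tetra, tet_b: Tetra) -> float:
    """Ratio of the point counts of two tetrahedra (hypothetical helper)."""
    pts = list(points)
    count_b = tetrahedron_density(pts, tet_b)
    if count_b == 0:
        raise ValueError("second tetrahedron contains no points")
    return tetrahedron_density(pts, tet_a) / count_b


# Illustrative data only: two nested tetrahedra and four sample points.
small: Tetra = ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
large: Tetra = ((0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2))
points = [(0.1, 0.1, 0.1), (0.25, 0.25, 0.25), (0.6, 0.6, 0.6), (1.0, 1.0, 1.0)]

print(tetrahedron_density(points, small), tetrahedron_density(points, large))
print(density_ratio(points, small, large))
```

How the paper turns such a ratio into a merge or split decision for nested clusters is not stated in the abstract, so the sketch stops at computing the ratio.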

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, Density-based, Density Confidence, Polyhedron approach, Tetrahedron-density