Research Article

Synchronous CR-OLAP Tool for Efficient Parallel Computing

by Mani Sarma Vittapu, Venkateswarlu Sunkari, Atoyoseph Abate
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 115 - Number 17
Year of Publication: 2015
10.5120/20240-2593

Mani Sarma Vittapu, Venkateswarlu Sunkari, Atoyoseph Abate. Synchronous CR-OLAP Tool for Efficient Parallel Computing. International Journal of Computer Applications. 115, 17 (April 2015), 1-13. DOI=10.5120/20240-2593

@article{ 10.5120/20240-2593,
author = { Mani Sarma Vittapu and Venkateswarlu Sunkari and Atoyoseph Abate },
title = { Synchronous CR-OLAP Tool for Efficient Parallel Computing },
journal = { International Journal of Computer Applications },
issue_date = { April 2015 },
volume = { 115 },
number = { 17 },
month = { April },
year = { 2015 },
issn = { 0975-8887 },
pages = { 1-13 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume115/number17/20240-2593/ },
doi = { 10.5120/20240-2593 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:55:03.856080+05:30
%A Mani Sarma Vittapu
%A Venkateswarlu Sunkari
%A Atoyoseph Abate
%T Synchronous CR-OLAP Tool for Efficient Parallel Computing
%J International Journal of Computer Applications
%@ 0975-8887
%V 115
%N 17
%P 1-13
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Real-time OLAP, or RTOLAP, is the capability to quickly retrieve, aggregate, analyze and present multidimensional cube data whenever the data in the relational data sources changes, without having to run heavy processing on the cube. A big advantage of real-time OLAP is that it calculates all relevant data and provides immediate output. One of the main characteristics of an RTOLAP system is that data is stored directly in main memory, or in an in-memory database, enabling quicker access to the data. Another factor affecting the speed of calculation is compression: data is compressed in such a way that it can be accessed much faster in its compressed form. Additionally, pre-calculated values are not stored, which avoids "data explosion". In contrast to queries for online transaction processing (OLTP) systems, which typically access only a small portion of a database, OLAP queries may need to aggregate large portions of a database, which often leads to performance issues. This paper introduces CR-OLAP, a cloud-based real-time OLAP system built on a new distributed index structure for OLAP, the distributed PDCR tree, that utilizes a cloud infrastructure consisting of (m+1) multicore processors. With increasing database size, CR-OLAP dynamically increases m to maintain performance. The distributed PDCR tree data structure supports multiple dimension hierarchies and efficient query processing on the sophisticated dimension hierarchies that are so central to OLAP systems. It is particularly efficient for complex OLAP queries that need to aggregate large portions of the data warehouse. The static data cube approach proposed by Gray et al. materializes all or a subset of the cuboids of the data cube in order to ensure adequate query performance.
Practitioners have called for some time for a real-time OLAP approach in which the OLAP system is updated instantaneously as new data arrives and always provides an up-to-date data warehouse for the decision support process. However, a major problem for real-time OLAP is its significant performance issues with large-scale data warehouses. The main aim of our research is to address these problems through the use of efficient parallel computing methods. This paper proposes a distributed data structure for real-time OLAP. To our knowledge, the resulting real-time OLAP system, which has been parallelized and optimized for contemporary multi-core architectures, allows multiple insert and multiple query transactions to be executed in parallel and in real time.
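
The architecture outlined in the abstract (one master plus m workers, in-memory storage, no pre-calculated aggregates, queries over dimension hierarchies, and m growing with the data) can be illustrated with a small Python sketch. This is not the distributed PDCR tree: each worker below keeps a flat in-memory list rather than a tree index, threads stand in for cloud nodes, and the names `Worker`, `Master`, `capacity`, and the sample dimensions `time` and `store` are hypothetical choices made only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class Worker:
    """Hypothetical worker node: keeps its shard of fact rows in memory."""

    def __init__(self):
        self._rows = []      # list of (coords, measure) pairs
        self._lock = Lock()  # lets inserts and queries interleave safely

    def insert(self, coords, measure):
        with self._lock:
            self._rows.append((coords, measure))

    def size(self):
        with self._lock:
            return len(self._rows)

    def aggregate(self, query):
        """Sum measures of rows whose hierarchy paths start with the query prefixes.

        Nothing is pre-calculated: the sum is computed from raw rows on each call.
        """
        with self._lock:
            return sum(
                measure
                for coords, measure in self._rows
                if all(coords[dim][:len(prefix)] == prefix
                       for dim, prefix in query.items())
            )


class Master:
    """Hypothetical master node: routes inserts and fans queries out to m workers."""

    def __init__(self, m=2, capacity=2):
        self.capacity = capacity  # assumed per-worker row limit before scaling out
        self.workers = [Worker() for _ in range(m)]
        self._lock = Lock()

    def insert(self, coords, measure):
        with self._lock:
            target = min(self.workers, key=Worker.size)
            if target.size() >= self.capacity:
                # Every worker is full: increase m by adding a worker.
                target = Worker()
                self.workers.append(target)
            target.insert(coords, measure)

    def query(self, query):
        with self._lock:
            workers = list(self.workers)
        # Each worker aggregates its own shard in parallel; the master sums the parts.
        with ThreadPoolExecutor(max_workers=len(workers)) as pool:
            return sum(pool.map(lambda w: w.aggregate(query), workers))


master = Master(m=2, capacity=2)
facts = [
    ({"time": ("2015", "04"), "store": ("EU", "DE")}, 10),
    ({"time": ("2015", "04"), "store": ("EU", "FR")}, 20),
    ({"time": ("2015", "05"), "store": ("EU", "DE")}, 5),
    ({"time": ("2014", "12"), "store": ("US", "NY")}, 7),
    ({"time": ("2015", "04"), "store": ("US", "NY")}, 1),
]
for coords, measure in facts:
    master.insert(coords, measure)

total_2015 = master.query({"time": ("2015",)})
eu_april_2015 = master.query({"time": ("2015", "04"), "store": ("EU",)})
```

Each query names a prefix of a dimension hierarchy (a year, or a year and month), so the same rows answer queries at any hierarchy level without stored aggregates. The sketch scales out only for new inserts and does not rebalance existing rows; a real system would need an index and a data migration strategy.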

Index Terms

Computer Science
Information Sciences

Keywords

RTOLAP, CROLAP, OLTP, MOLAP, PDCR, performance, latency