Research Article

A Review on Big Data Integration

Published in February 2015 by B. Arputhamary, L. Arockiam
Advanced Computing and Communication Techniques for High Performance Applications
Foundation of Computer Science USA
ICACCTHPA2014 - Number 5
February 2015
Authors: B. Arputhamary, L. Arockiam

B. Arputhamary, L. Arockiam. A Review on Big Data Integration. Advanced Computing and Communication Techniques for High Performance Applications. ICACCTHPA2014, 5 (February 2015), 21-26.

@article{
author = { B. Arputhamary and L. Arockiam },
title = { A Review on Big Data Integration },
journal = { Advanced Computing and Communication Techniques for High Performance Applications },
issue_date = { February 2015 },
volume = { ICACCTHPA2014 },
number = { 5 },
month = { February },
year = { 2015 },
issn = { 0975-8887 },
pages = { 21-26 },
numpages = 6,
url = { /proceedings/icaccthpa2014/number5/19464-6061/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 Advanced Computing and Communication Techniques for High Performance Applications
%A B. Arputhamary
%A L. Arockiam
%T A Review on Big Data Integration
%J Advanced Computing and Communication Techniques for High Performance Applications
%@ 0975-8887
%V ICACCTHPA2014
%N 5
%P 21-26
%D 2015
%I International Journal of Computer Applications
Abstract

Big Data technologies have become a current topic and a new "buzzword" in both science and industry. Data volumes have grown from terabytes to petabytes and are now measured in zettabytes. This growth increases the challenges of managing and manipulating data. Data integration is a central issue for large data sets, and it has traditionally been handled by Extract, Transform and Load (ETL) tools that feed Data Warehouses. Data warehousing is the process of transforming multiple data formats into a single format and consolidating the data in one place. Nowadays, data are generated from social networks, web server logs, sensors used to gather climate information, stock markets, e-mails, transaction records, web click streams, and so on. Most of these data are in unstructured or semi-structured form. Organizations are trying to find new solutions, such as ETL tools, to manage this situation. Existing data warehousing tools and techniques are inefficient at handling unstructured and semi-structured data. This paper presents the issues and challenges of data integration in the Big Data environment and techniques for big data integration. A new ETL framework is proposed, and open problems for future research on data integration in the Big Data environment are identified.
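The Extract, Transform and Load idea described in the abstract can be illustrated with a minimal sketch. This is not the framework proposed in the paper. The sample inputs, the target schema (`id`, `name`, `amount`) and all function names are hypothetical, chosen only to show a structured source and a semi-structured source being mapped to a single format and consolidated in one place.

```python
import csv
import io
import json

# Hypothetical inputs: the same kind of record arriving in two formats.
CSV_SOURCE = "id,name,amount\n1,Alice,10.5\n2,Bob,7\n"
JSON_SOURCE = '[{"id": 3, "customer": {"name": "Carol"}, "amount": "4.25"}]'


def extract_csv(text):
    """Extract: read structured rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: read semi-structured records from JSON text."""
    return json.loads(text)


def transform(record):
    """Transform: map either source layout onto one assumed target schema."""
    name = record.get("name") or record.get("customer", {}).get("name")
    return {
        "id": int(record["id"]),
        "name": name,
        "amount": float(record["amount"]),
    }


def load(records, warehouse):
    """Load: consolidate the unified records in one place, keyed by id."""
    for rec in records:
        warehouse[rec["id"]] = rec
    return warehouse


def run_etl():
    """Run the three stages over both hypothetical sources."""
    extracted = extract_csv(CSV_SOURCE) + extract_json(JSON_SOURCE)
    return load([transform(r) for r in extracted], {})


if __name__ == "__main__":
    for row in run_etl().values():
        print(row)
```

In this sketch an in-memory dictionary stands in for the warehouse. A real pipeline would write to a database or a distributed store, and the transform step is where most of the difficulty with unstructured and semi-structured sources would arise.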

Index Terms

Computer Science
Information Sciences

Keywords

Big Data, Integration, Data Warehouse, Hadoop