Research Article

Fast and Efficient Conflict Identification and Resolution in Huge Streaming Data

by S. Charles Britto, S. P. Victor
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 146 - Number 1
Year of Publication: 2016
Authors: S. Charles Britto, S. P. Victor
DOI: 10.5120/ijca2016910600

S. Charles Britto, S. P. Victor. Fast and Efficient Conflict Identification and Resolution in Huge Streaming Data. International Journal of Computer Applications. 146, 1 (Jul 2016), 10-15. DOI=10.5120/ijca2016910600

@article{ 10.5120/ijca2016910600,
author = { S. Charles Britto and S. P. Victor },
title = { Fast and Efficient Conflict Identification and Resolution in Huge Streaming Data },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2016 },
volume = { 146 },
number = { 1 },
month = { Jul },
year = { 2016 },
issn = { 0975-8887 },
pages = { 10-15 },
numpages = {6},
url = { https://ijcaonline.org/archives/volume146/number1/25361-2016910600/ },
doi = { 10.5120/ijca2016910600 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:49:05.725450+05:30
%A S. Charles Britto
%A S. P. Victor
%T Fast and Efficient Conflict Identification and Resolution in Huge Streaming Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 146
%N 1
%P 10-15
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The growth in data generation has made rich information widely available online. However, this data is stored in heterogeneous formats across many sources, and obtaining complete information requires drawing on all of them, so a data integration mechanism is needed. Integrating heterogeneous data, in turn, introduces conflicting values into the system. This paper presents a fast and efficient mechanism, built on Spark, for identifying and resolving conflicts in huge streaming data. A wrapper-based query formulation module constructs queries suited to each underlying data source. The retrieved data is converted to a structured format, similarity between data items is identified, and distributed conflict identification and resolution follow. In experiments on streaming data, the approach detected conflicts effectively and reduced running time from ~589 seconds to 10 seconds.
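
The pipeline outlined in the abstract (wrapping source records into a structured format, matching similar records, then identifying and resolving conflicts) can be illustrated with a small single-machine sketch. This is not the authors' implementation: the function names, the field maps, the 0.85 similarity threshold, the greedy grouping, and the majority-vote resolution rule are all assumptions chosen for illustration, and the abstract does not specify the similarity measure or resolution strategy actually used. Plain Python stands in for Spark here.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def to_structured(raw, field_map):
    """Hypothetical wrapper step: map a source-specific record onto a shared schema."""
    return {target: str(raw[source]).strip()
            for source, target in field_map.items() if source in raw}


def similar(a, b, threshold=0.85):
    """Assumed similarity test: case-insensitive string ratio above a threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def group_by_entity(records, key="name"):
    """Greedy grouping: attach each record to the first group whose key is similar."""
    groups = []
    for rec in records:
        for group in groups:
            if similar(group[0][key], rec[key]):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


def find_conflicts(group):
    """Return {attribute: distinct values} for attributes with more than one value."""
    values = defaultdict(set)
    for rec in group:
        for attr, val in rec.items():
            values[attr].add(val)
    return {attr: vals for attr, vals in values.items() if len(vals) > 1}


def resolve(group):
    """Assumed resolution rule: majority vote per attribute, ties to the first value seen."""
    resolved = {}
    for attr in {a for rec in group for a in rec}:
        counts = Counter(rec[attr] for rec in group if attr in rec)
        resolved[attr] = counts.most_common(1)[0][0]
    return resolved


# Example input: three sources describing one entity, with differing field names.
records = [
    to_structured({"title": "Acme Corp", "town": "Austin"}, {"title": "name", "town": "city"}),
    to_structured({"name": "ACME Corp", "city": "Austin"}, {"name": "name", "city": "city"}),
    to_structured({"name": "Acme Corp", "city": "Dallas"}, {"name": "name", "city": "city"}),
]
for group in group_by_entity(records):
    print(find_conflicts(group), resolve(group))
```

In a distributed setting, the grouping and per-group resolution would presumably be expressed as keyed operations over partitions of the stream; how the paper does this is not stated in the abstract.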

Index Terms

Computer Science
Information Sciences

Keywords

Conflict identification, conflict resolution, Spark, streaming data, wrappers