Research Article

Big Data Analytics for Concurrent Data Processing

by A Samydurai, C Vijayakumaran, G Kumaresan, B Muthusenthil
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 120 - Number 3
Year of Publication: 2015
Authors: A Samydurai, C Vijayakumaran, G Kumaresan, B Muthusenthil
10.5120/21211-3912

A Samydurai, C Vijayakumaran, G Kumaresan, B Muthusenthil. Big Data Analytics for Concurrent Data Processing. International Journal of Computer Applications. 120, 3 (June 2015), 36-41. DOI=10.5120/21211-3912

@article{ 10.5120/21211-3912,
author = { A Samydurai, C Vijayakumaran, G Kumaresan, B Muthusenthil },
title = { Big Data Analytics for Concurrent Data Processing },
journal = { International Journal of Computer Applications },
issue_date = { June 2015 },
volume = { 120 },
number = { 3 },
month = { June },
year = { 2015 },
issn = { 0975-8887 },
pages = { 36-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume120/number3/21211-3912/ },
doi = { 10.5120/21211-3912 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:05:19.515941+05:30
%A A Samydurai
%A C Vijayakumaran
%A G Kumaresan
%A B Muthusenthil
%T Big Data Analytics for Concurrent Data Processing
%J International Journal of Computer Applications
%@ 0975-8887
%V 120
%N 3
%P 36-41
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Big data describes an immense volume of structured and unstructured data that is difficult to process using traditional database techniques. The tremendous growth in data arrival rates, combined with the need to support a large number of user queries, creates complex problems for traditional structured databases. In this paper, the input file is assigned to a master that splits the work and controls the workflow across different workers. This reduces the fault tolerance issues that arise with nodes. The workers evaluate the intermediate files and data items. The processed data is then merged, and the required output, or an intermediate file, is returned to the user. The paper also gives the first solution for efficiently processing perpetual (continuous) text queries to address the above challenges. The solution indexes the streamed documents in main memory with a structure based on the principles of the inverted file, and processes document arrival and expiration events with an incremental threshold-based method.
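
The indexing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `StreamIndex` and `ContinuousTopK`, the count-based sliding window, and the plain term-frequency score are all assumptions chosen to keep the example short. The sketch keeps an in-memory inverted file over the documents currently in the window, admits an arriving document into a query's result only if its score beats the current threshold, and falls back to a full evaluation over the inverted lists only when an expiring document was part of the result.

```python
from collections import Counter, defaultdict, deque


class ContinuousTopK:
    """One registered query: the k best-scoring documents in the window."""

    def __init__(self, terms, k):
        self.terms = set(terms)
        self.k = k
        self.result = {}  # doc_id -> score, at most k entries

    def threshold(self):
        # Score an arriving document must exceed to enter the result.
        if len(self.result) < self.k:
            return 0.0
        return min(self.result.values())


class StreamIndex:
    """In-memory inverted file over a count-based sliding window (assumed)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()              # doc_ids in arrival order
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_terms = {}                # doc_id -> Counter of its terms
        self.queries = []

    def register(self, query):
        self.queries.append(query)
        self._recompute(query)

    def _recompute(self, query):
        # Full evaluation: walk the inverted lists of the query terms.
        scores = Counter()
        for term in query.terms:
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] += tf
        query.result = {d: float(s) for d, s in scores.most_common(query.k)}

    def arrive(self, doc_id, text):
        # Arrival event: expire the oldest document if the window is full.
        if len(self.window) == self.window_size:
            self.expire()
        counts = Counter(text.lower().split())
        self.doc_terms[doc_id] = counts
        self.window.append(doc_id)
        for term, tf in counts.items():
            self.postings[term][doc_id] = tf
        # Incremental step: only the new document is scored per query.
        for query in self.queries:
            score = float(sum(counts[t] for t in query.terms))
            if score > query.threshold():
                query.result[doc_id] = score
                if len(query.result) > query.k:
                    weakest = min(query.result, key=query.result.get)
                    del query.result[weakest]

    def expire(self):
        # Expiration event: drop the oldest document from the index.
        doc_id = self.window.popleft()
        for term in self.doc_terms.pop(doc_id):
            del self.postings[term][doc_id]
            if not self.postings[term]:
                del self.postings[term]
        # A query is re-evaluated only if it lost a result document.
        for query in self.queries:
            if doc_id in query.result:
                self._recompute(query)
```

To use the sketch, register a query with `register`, then feed documents through `arrive`; each query's `result` dictionary holds its current top-k document ids and scores. The threshold check means most arrivals and expirations touch a query's result in constant time, which is the motivation for an incremental method when both the document arrival rate and the number of queries are high.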

Index Terms

Computer Science
Information Sciences

Keywords

Structured and unstructured data; Fault Tolerance; Perpetual Text Queries; Structure Predicate; Inverted File; Threshold-Predicated Method