Research Article

Adaptiveness in Map-Reduce using MPI

Published in June 2015 by Ahmed H.i Lakadkutta, Pushpanjali M. Chouragade
National Conference on Recent Trends in Computer Science and Engineering
Foundation of Computer Science USA
MEDHA2015 - Number 2
June 2015
Authors: Ahmed H.i Lakadkutta, Pushpanjali M. Chouragade

Ahmed H.i Lakadkutta, Pushpanjali M. Chouragade. Adaptiveness in Map-Reduce using MPI. National Conference on Recent Trends in Computer Science and Engineering. MEDHA2015, 2 (June 2015), 14-19.

@article{ lakadkutta2015adaptiveness,
author = { Ahmed H.i Lakadkutta, Pushpanjali M. Chouragade },
title = { Adaptiveness in Map-Reduce using MPI },
journal = { National Conference on Recent Trends in Computer Science and Engineering },
issue_date = { June 2015 },
volume = { MEDHA2015 },
number = { 2 },
month = { June },
year = { 2015 },
issn = { 0975-8887 },
pages = { 14-19 },
numpages = { 6 },
url = { /proceedings/medha2015/number2/21433-8025/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Recent Trends in Computer Science and Engineering
%A Ahmed H.i Lakadkutta
%A Pushpanjali M. Chouragade
%T Adaptiveness in Map-Reduce using MPI
%J National Conference on Recent Trends in Computer Science and Engineering
%@ 0975-8887
%V MEDHA2015
%N 2
%P 14-19
%D 2015
%I International Journal of Computer Applications
Abstract

MapReduce is an emerging programming paradigm for data-parallel applications, proposed by Google to simplify large-scale data processing. A MapReduce implementation consists of a map function, which processes input key/value pairs to generate intermediate key/value pairs, and a reduce function, which merges and converts the intermediate key/value pairs into final results. The reduce function can only start processing after the map function has completed, so if the map function is slow for any reason, this dependency affects the whole running time. In the proposed technique, Message Passing Interface (MPI) strategies are used to implement MapReduce, which reduces the runtime and optimizes data exchange; MPI is also used for algorithmic parallelization. MapReduce over MPI combines the redistribution and reduction steps and moves them into the network. This paper presents overlapping MapReduce using MPI, an enhancement of the MapReduce programming model for fast data processing. The implementation runs the map and reduce functions concurrently, exchanging partial intermediate data between them in a pipelined fashion using MPI, while algorithmic parallelism is exploited alongside data parallelism to increase performance. In this way, MPI supports MapReduce applications more efficiently.
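
To make the overlapping idea concrete, the following is a minimal, self-contained sketch in C using MPI. It is not the implementation described in this paper; it only illustrates the general pattern of pipelining partial intermediate results from mapper ranks to reducer ranks so that reduction can begin before mapping has finished. The process split (the first half of the ranks map, the rest reduce), the fixed key space of 8 keys, the chunking, and the synthetic input are assumptions made purely for illustration.

/* Hypothetical sketch: overlapped (pipelined) MapReduce on MPI.
 * Ranks [0, nmappers) act as mappers; the remaining ranks act as reducers.
 * Each mapper ships the partial result of every input chunk as soon as it
 * is produced, so reducers start merging before mapping has finished.    */
#include <mpi.h>
#include <stdio.h>

#define CHUNKS   4   /* input chunks processed by each mapper                */
#define KEYS     8   /* intermediate key space; assume divisible by reducers */
#define TAG_DATA 1   /* message carries partial intermediate results         */
#define TAG_DONE 2   /* mapper signals it has no more chunks                 */

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int nmappers  = size / 2;            /* ranks [0, nmappers) map       */
    int nreducers = size - nmappers;     /* remaining ranks reduce        */
    int span      = KEYS / nreducers;    /* keys owned by each reducer    */

    if (rank < nmappers) {
        /* --- map: emit partial counts chunk by chunk (pipelined) --- */
        for (int c = 0; c < CHUNKS; c++) {
            long partial[KEYS] = {0};
            for (int i = 0; i < 1000; i++)            /* synthetic input  */
                partial[(rank + c + i) % KEYS]++;     /* "map" to a key   */
            /* shuffle: each reducer owns a contiguous slice of the key
               space; send its slice of this chunk immediately            */
            for (int r = 0; r < nreducers; r++)
                MPI_Send(&partial[r * span], span, MPI_LONG,
                         nmappers + r, TAG_DATA, MPI_COMM_WORLD);
        }
        /* tell every reducer this mapper is done */
        long dummy = 0;
        for (int r = 0; r < nreducers; r++)
            MPI_Send(&dummy, 0, MPI_LONG, nmappers + r, TAG_DONE,
                     MPI_COMM_WORLD);
    } else {
        /* --- reduce: merge partial results as they arrive --- */
        long totals[KEYS] = {0};   /* only the first span entries are used */
        long buf[KEYS];
        int finished = 0;
        while (finished < nmappers) {
            MPI_Status st;
            MPI_Recv(buf, span, MPI_LONG, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_DONE)
                finished++;
            else
                for (int k = 0; k < span; k++)        /* incremental merge */
                    totals[k] += buf[k];
        }
        printf("reducer %d: total for key %d = %ld\n",
               rank, (rank - nmappers) * span, totals[0]);
    }

    MPI_Finalize();
    return 0;
}

Assuming an MPI installation, this sketch could be compiled with mpicc and run with, e.g., mpirun -np 4 ./overlap, giving two mapper ranks and two reducer ranks; the key space size is chosen so that it divides evenly among the reducers in that configuration.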

References
  1. T. White, Hadoop: The Definitive Guide, 1st ed. O'Reilly, June 2009.
  2. F. Ahmad, S. Lee, M. Thottethodi, and T. N. Vijaykumar, "MapReduce with communication overlap," Technical Report, 2007.
  3. M. Elteir, H. Lin, and W.-c. Feng, "Enhancing MapReduce via asynchronous data processing," in Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on, Dec. 2010, pp. 397–405.
  4. T. Hoefler, A. Lumsdaine, and J. Dongarra, "Towards efficient MapReduce using MPI," in PVM/MPI, ser. Lecture Notes in Computer Science, M. Ropo, J. Westerholm, and J. Dongarra, Eds., vol. 5759. Springer, 2009, pp. 240–249.
  5. H. Mohamed and S. Marchand-Maillet, "Enhancing MapReduce using MPI and an optimized data exchange policy," in 2012 41st International Conference on Parallel Processing Workshops.
  6. Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. Commun. ACM 51(1) (2008).
  7. Lämmel, R.: Google's MapReduce programming model — Revisited. Sci. Comput. Program. 68(3) (2007).
  8. de Kruijf, M., Sankaralingam, K.: MapReduce for the CELL B.E. Architecture. IBM Journal of Research and Development 52(4) (2007).
  9. He, B., Fang, W., Luo, Q., Govindaraju, N. K., Wang, T.: Mars: a MapReduce framework on graphics processors. In: PACT '08: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, New York, NY, USA, ACM (2008) 260–269.
  10. Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G., Kozyrakis, C.: Evaluating MapReduce for Multi-core and Multiprocessor Systems. In: HPCA '07: Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, Washington, DC, USA, IEEE Computer Society (2007) 13–24.
  11. Langville, A. N., Meyer, C. D.: Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press (July 2006).
  12. Chu, C. T., Kim, S. K., Lin, Y. A., Yu, Y., Bradski, G. R., Ng, A. Y., Olukotun, K.: Map-Reduce for Machine Learning on Multicore. In Schölkopf, B., Platt, J. C., Hoffman, T., eds.: NIPS, MIT Press (2006) 281–288.
  13. Kimball, A., Michels-Slettvet, S., Bisciglia, C.: Cluster computing for web-scale data processing. SIGCSE Bull. 40(1) (2008) 116–120.
  14. Hadoop: http://hadoop.apache.org (2009).
  15. Pike, R., Dorward, S., Griesemer, R., Quinlan, S.: Interpreting the data: Parallel analysis with Sawzall. Scientific Programming 13(4) (2005) 277–298.
  16. Ghemawat, S., Gobioff, H., Leung, S. T.: The Google file system. SIGOPS Oper. Syst. Rev. 37(5) (2003) 29–43.
Index Terms

Computer Science
Information Sciences

Keywords

Hadoop, MapReduce Overlapping, MPI-MapReduce, Parallel MapReduce.