CFP last date
20 December 2024
Reseach Article

Map vs. Unordered Map: An Analysis on Large Datasets

by Akanksha Bindal, Prateek Narang, S. Indu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 127 - Number 2
Year of Publication: 2015
Authors: Akanksha Bindal, Prateek Narang, S. Indu
10.5120/ijca2015906322

Akanksha Bindal, Prateek Narang, S. Indu . Map vs. Unordered Map: An Analysis on Large Datasets. International Journal of Computer Applications. 127, 2 ( October 2015), 5-9. DOI=10.5120/ijca2015906322

@article{ 10.5120/ijca2015906322,
author = { Akanksha Bindal, Prateek Narang, S. Indu },
title = { Map vs. Unordered Map: An Analysis on Large Datasets },
journal = { International Journal of Computer Applications },
issue_date = { October 2015 },
volume = { 127 },
number = { 2 },
month = { October },
year = { 2015 },
issn = { 0975-8887 },
pages = { 5-9 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume127/number2/22699-2015906322/ },
doi = { 10.5120/ijca2015906322 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:18:48.259801+05:30
%A Akanksha Bindal
%A Prateek Narang
%A S. Indu
%T Map vs. Unordered Map: An Analysis on Large Datasets
%J International Journal of Computer Applications
%@ 0975-8887
%V 127
%N 2
%P 5-9
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In the current scenario, with the transpiring big data explosion, data sets are often too large to fit completely inside the computers´ internal memory. In efficient processes, speed is not an option, it is a must. Hence every alternative is explored to further enhance performance, by expanding in-place memory storage that enables more data to be resident in the memory, eliminating operation latency, and even deploying an in-memory database (IMDB) system where all the data can be kept in memory. However, the technique of in-memory data handling is still at an infant stage and not viable in the current scenario. To tackle this problem a hierarchical hashing scheme is discussed where only one component of a big data structure resides in the memory. In this paper two data structures are explored: 1) Map which is implemented as self-balancing binary search trees or more commonly Red Black Trees and 2) Unordered Map which is based on hashing with chaining technique. Serialization and deserialization operations are also performed to free the internal memory and preserve the data structure object for later use. Operations such as read, write are performed, along with documentation of the results and illustrations of visual representations of the two algorithmic data structures.

References
  1. Jeffrey Scott Vitter, Purdue University ”Dealing with massive data” http://www.cs.purdue.edu/homes/jsv/Papers/Vit.IO survey.pdf 2006.
  2. G. DeCandia, D. Hastorun, M. Jampani et al., Dynamo: Amazons highly available key-value store, OSR, 2007.
  3. Hao Zhang, Gang Chen,Kian-Lee Tan et al., ”In-Memory Big Data Management and Processing: A Survey”
  4. S. Robbins, Ram is the new disk, InfoQ News, Jun. 2008.
  5. J. Ousterhout, P. Agrawal, D. Erickson et al., The case for ramclouds: Scalable high-performance storage entirely in dram, OSR, 2010.
  6. F. Li, B. C. Ooi, M. T. O zsu, and S. Wu, Distributed data manage-ment using mapreduce,ACM Comput. Surv., 2014.
  7. Wikipedia-Wikipedia.org.
  8. Standard C++ Library reference: http://www.cplusplus.com/reference/unorderedmap/unordered map/ Cpp Reference Documentation
  9. The Boost C++ Libraries: http://theboostcpplibraries.com/ Boost C++ Documentation
  10. Standard C++ Library reference: http://www.cplusplus.com/reference/map/map/ Cpp Reference Documentation
  11. HP, Vertica systems, 2011. [Online]. Available: http://www.vertica. com
  12. D. J. DeWitt, R. H. Katz, F. Olken et al., Implementation techniques for main memory database systems, in SIGMOD 84, 1984.
  13. R. B. Hagmann, A crash recovery scheme for a memory-resident database system,TC, 1986.
  14. T. J. Lehman and M. J. Carey, A recovery algorithm for a high performance memory-resident database system, in SIGMOD 87, 1987.
  15. M. H. Eich, Mars: The design of a main memory database machine, in Database Machines and Knowledge Base Machines.Springer US, 1988.
  16. D. J. DeWitt and J. Gray, Parallel database systems: The future of high performance database systems, CACM, 1992.
  17. S. Wu, B. C. Ooi, and K.-L. Tan, Online aggregation, in Advanced Query Processing. Springer Berlin Heidelberg, 2013
  18. S. Wu, S. Jiang, B. C. Ooi, and K.-L. Tan, Distributed online aggregations, in PVLDB 09, 2009.
  19. T. J. Lehman and M. J. Carey, A study of index structures for main memory database management systems, in PVLDB 86, 1986.
  20. J. Rao and K. A. Ross, Cache conscious indexing for decision-support in main memory, in PVLDB 99, 1999.
  21. D. B. Lomet, S. Sengupta, and J. J. Levandoski, The bw-tree: A b-tree for new hardware platforms, in ICDE 13, 2013.
  22. T. Brown, F. Ellen, and E. Ruppert, A general technique for non-blocking trees, in PPoPP, pp. 329342, ACM, 2014.
  23. T. Brown, Personal homepage.
  24. Oracle, Concurrentskiplistmap, 2014.
  25. D. Dice, Y. Lev, M. Moir, D. Nussbaum, and M. Olszewski, Early experience with a commercial hardware transactional memory im-plementation, tech. rep., Sun Microsystems, Inc., 2009..
  26. A. C. Yao. On random 2-3 trees. Acta Informatica, 9, 159170, 1978.
  27. K.-Y. Whang and R. Krishnamurthy. Multilevel grid filesA dy-namic hierarchical multidimensional file structure. In Proceedings of the International Symposium on Database Systems for Advanced Applications, 449459. World Scientific Press, 1992.
  28. J. S. Vitter and P. Flajolet. Average-case analysis of algorithms and data structures. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, chapter 9, 431524
Index Terms

Computer Science
Information Sciences

Keywords

Map Unordered Map Boost C++ Serialization Hierarchical hashing Memory management