Research Article

Adaptive Zone-Aware Multi-bank on Chip last level L2 Cache Partitioning for Chip Multiprocessors

by Nitin Chaturvedi, Jithin P Thoma, S Gurunarayanan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 1
Year of Publication: 2010
DOI: 10.5120/1131-1482

Nitin Chaturvedi, Jithin P Thoma, S Gurunarayanan. Adaptive Zone-Aware Multi-bank on Chip last level L2 Cache Partitioning for Chip Multiprocessors. International Journal of Computer Applications. 7, 1 (September 2010), 19-23. DOI=10.5120/1131-1482

@article{ 10.5120/1131-1482,
author = { Nitin Chaturvedi and Jithin P Thoma and S Gurunarayanan },
title = { Adaptive Zone-Aware Multi-bank on Chip last level L2 Cache Partitioning for Chip Multiprocessors },
journal = { International Journal of Computer Applications },
issue_date = { September 2010 },
volume = { 7 },
number = { 1 },
month = { September },
year = { 2010 },
issn = { 0975-8887 },
pages = { 19-23 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume7/number1/1131-1482/ },
doi = { 10.5120/1131-1482 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:55:19.344430+05:30
%A Nitin Chaturvedi
%A Jithin P Thoma
%A S Gurunarayanan
%T Adaptive Zone-Aware Multi-bank on Chip last level L2 Cache Partitioning for Chip Multiprocessors
%J International Journal of Computer Applications
%@ 0975-8887
%V 7
%N 1
%P 19-23
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes a novel, efficient Non-Uniform Cache Architecture (NUCA) scheme for the Last-Level Cache (LLC) to reduce the average on-chip access latency and improve core isolation in Chip Multiprocessors (CMPs). The proposed architecture is expected to improve on earlier NUCA schemes such as S-NUCA, D-NUCA and SP-NUCA [9][10][5] in average access latency without a significant reduction in hit rate. The complete set of L2 banks is divided into zones. Each core belongs to the zone closest to it, so adjacent cores are grouped into the same zone. Each zone individually follows the SP-NUCA scheme [5] to maintain core isolation and to share common blocks. Blocks that must be shared by cores belonging to different zones, however, are replicated. The scheme scales better than SP-NUCA and bounds the maximum on-chip access latency to a lower value as the number of cores increases.
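The zoning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the row-major 4x4 core mesh, the rectangular 2x2 zones, the one-copy-per-zone reading of "replicated", and the names `zone_of` and `classify_block` are all assumptions made for illustration, since the abstract does not specify the layout or the replication policy.

```python
from __future__ import annotations


def zone_of(core: int, mesh_width: int = 4,
            zone_width: int = 2, zone_height: int = 2) -> tuple[int, int]:
    """Return the (zone_row, zone_col) tile holding a core.

    Assumes cores are numbered row-major on a 2D mesh and zones are
    rectangular tiles of adjacent cores (hypothetical layout).
    """
    row, col = divmod(core, mesh_width)
    return (row // zone_height, col // zone_width)


def classify_block(sharers: list[int], **layout: int) -> tuple[str, list[tuple[int, int]]]:
    """Classify a cache block by the cores that access it.

    - one core            -> "private" inside that core's zone
    - cores in one zone   -> "shared" inside that zone
    - cores in many zones -> "replicated" (assumed: one copy per zone)
    """
    cores = set(sharers)
    if not cores:
        raise ValueError("a block needs at least one accessing core")
    zones = sorted({zone_of(core, **layout) for core in cores})
    if len(zones) > 1:
        return "replicated", zones
    if len(cores) > 1:
        return "shared", zones
    return "private", zones


if __name__ == "__main__":
    print(classify_block([0]))      # one core
    print(classify_block([0, 5]))   # two cores, same zone
    print(classify_block([0, 10]))  # two cores, different zones
```

Under these assumptions, a request never has to leave the requester's zone to reach a copy of a block, which is one way to read the claim that the maximum on-chip access latency stays bounded as the number of cores grows.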

References
  1. B. M. Beckmann and D. A. Wood. Managing wire delay in large chip-multiprocessor caches. In 37th International Symposium on Microarchitecture, 2004.
  2. J. Chang and G. S. Sohi, “Cooperative Caching for Chip Multiprocessors”, ISCA, 2006.
  3. E. Herrero, J. Gonzalez and R. Canal, “Distributed Cooperative Caching”, PACT, 2008.
  4. H. Dybdahl and P. Stenstrom. An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors. In Proceedings of the 13th Annual International Symposium on High Performance Computer Architecture, 2007.
  5. J. Merino, V. Puente, P. Prieto and J. A. Gregorio, “SP-NUCA: A Cost Effective Dynamic Non-Uniform Cache Architecture”, ACM SIGARCH Computer Architecture News, Vol. 36, No. 2, May 2008.
  6. B. M. Beckmann, M. R. Marty and D. A. Wood, “ASR: Adaptive Selective Replication for CMP Caches”, MICRO, 2006.
  7. K. Kedzierski, M. Moreto, F. J. Cazorla and M. Valero, “Adapting Cache Partitioning Algorithms to Pseudo-LRU Replacement Policies”, IPDPS, 2010.
  8. M. K. Qureshi and Y. N. Patt, “Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches”, MICRO, 2006.
  9. M. Zhang and K. Asanovic, “Victim Replication: Maximizing Capacity while Hiding Wire Delay in Tiled Chip Multiprocessors”, ISCA, 2005.
  10. J. Huh, C. Kim, H. Shafi and L. Zhang, “A NUCA Substrate for Flexible CMP Cache Sharing”, ICS, 2005.
  11. C. Kim, D. Burger and S. W. Keckler, “An Adaptive, non-uniform cache structure for wire-delay dominated on-chip caches”, ASPLOS X, pp. 211-222, October 2002.
  12. Z. Chishti, M. D. Powell, and T. N. Vijaykumar. Distance associativity for high-performance energy-efficient non-uniform cache architectures. In 36th International Symposium on Microarchitecture, 2003.
  13. J. Huh, C. Kim, H. Shafi, L. Zhang, D. Burger, and S. W. Keckler. A NUCA substrate for flexible CMP cache sharing. In 19th ACM International Conference on Supercomputing, 2005.
  14. C. Kim, D. Burger, and S. W. Keckler. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. In 10th International Conference on Architectural Support for Programming Languages and Operating Systems, 2002.
  15. N. Muralimanohar and R. Balasubramonian. Interconnect design considerations for large NUCA caches. In 34th International Symposium on Computer Architecture, 2007.
  16. J. Chang and G. S. Sohi. Cooperative caching for chip multiprocessors. In 33rd International Symposium on Computer Architecture, 2006.
  17. C. Bienia, S. Kumar, and K. Li. PARSEC vs. SPLASH-2: A quantitative comparison of two multithreaded benchmark suites on chip-multiprocessors. In Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2008.
  18. C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In International Conference on Parallel Architectures and Compilation Techniques, 2008.
  19. P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A Full System Simulator Platform, volume 35-2, pages 50–58. Computer, 2002.
Index Terms

Computer Science
Information Sciences

Keywords

Chip Multiprocessor (CMP)
Non-Uniform Cache Architecture (NUCA)
Shared Last level Cache (LLC)