Research Article

Variability Analysis in the Power Appetite of GPGPU Applications

by Winnie Thomas, Rohin Daruwala
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 147 - Number 12
Year of Publication: 2016
Authors: Winnie Thomas, Rohin Daruwala
DOI: 10.5120/ijca2016911281

Winnie Thomas, Rohin Daruwala. Variability Analysis in the Power Appetite of GPGPU Applications. International Journal of Computer Applications. 147, 12 (Aug 2016), 28-34. DOI=10.5120/ijca2016911281

@article{ 10.5120/ijca2016911281,
author = { Winnie Thomas and Rohin Daruwala },
title = { Variability Analysis in the Power Appetite of GPGPU Applications },
journal = { International Journal of Computer Applications },
issue_date = { Aug 2016 },
volume = { 147 },
number = { 12 },
month = { Aug },
year = { 2016 },
issn = { 0975-8887 },
pages = { 28-34 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume147/number12/25708-2016911281/ },
doi = { 10.5120/ijca2016911281 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:51:47.285135+05:30
%A Winnie Thomas
%A Rohin Daruwala
%T Variability Analysis in the Power Appetite of GPGPU Applications
%J International Journal of Computer Applications
%@ 0975-8887
%V 147
%N 12
%P 28-34
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

To meet high-performance demands, GPGPUs are optimized for performance, even at the cost of high power consumption. This article presents the variation in the power appetite of GPGPU applications. It proposes a method to predict the characteristics of an application from the way power is consumed by the different components of the GPGPU. It is observed that certain components that are over-used by one application may be under-used by another. This makes it challenging for GPU architects to design a system that is favorable and balanced across different types of GPGPU applications. An architecture that improves power efficiency is needed, but for time-constrained real-time systems, performance cannot be compromised. This work is an attempt to provide valuable insights into designing a reconfigurable system that fulfills the demands of end users.

Index Terms

Computer Science
Information Sciences

Keywords

Variability analysis, power breakdown, cycle-accurate, frequency scaling