Research Article

CUDA's Mapped Memory to Get High Performance using GPU

Published on June 2015 by Tofik R. Kacchi, Pushpanjali Chauragade
National Conference on Recent Trends in Computer Science and Engineering
Foundation of Computer Science USA
MEDHA2015 - Number 1
June 2015

Tofik R. Kacchi, Pushpanjali Chauragade. CUDA's Mapped Memory to Get High Performance using GPU. National Conference on Recent Trends in Computer Science and Engineering. MEDHA2015, 1 (June 2015), 15-13.

@article{
author = { Tofik R. Kacchi, Pushpanjali Chauragade },
title = { CUDA's Mapped Memory to Get High Performance using GPU },
journal = { National Conference on Recent Trends in Computer Science and Engineering },
issue_date = { June 2015 },
volume = { MEDHA2015 },
number = { 1 },
month = { June },
year = { 2015 },
issn = 0975-8887,
pages = { 15-13 },
numpages = -1,
url = { /proceedings/medha2015/number1/21425-8007/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Recent Trends in Computer Science and Engineering
%A Tofik R. Kacchi
%A Pushpanjali Chauragade
%T CUDA's Mapped Memory to Get High Performance using GPU
%J National Conference on Recent Trends in Computer Science and Engineering
%@ 0975-8887
%V MEDHA2015
%N 1
%P 15-13
%D 2015
%I International Journal of Computer Applications
Abstract

The APIs provided by CUDA help programmers build high-performance applications on the GPU, but they do not support most I/O operations in device code. Here, the characteristics of CUDA's mapped memory are used to create a dynamic polling service model on the host that can satisfy most I/O functions, such as file reads and writes and "printf". The technique used to implement these I/O functions has some influence on the performance of the original applications. The functions respond quickly to users' I/O requests, and the "printf" implementation performs better than CUDA's own. They also give users an easy and effective real-time method for debugging their programs. These functions improve the productivity of converting legacy C/C++ code to CUDA and broaden CUDA's capabilities.
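
The polling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Mailbox` layout, the `post_requests` kernel, and the `service_requests` function are hypothetical names chosen for illustration, and the sketch assumes a single device thread and a single request slot. Only the CUDA runtime calls (`cudaHostAlloc` with `cudaHostAllocMapped`, `cudaHostGetDevicePointer`, `__threadfence_system`) are real API. The device writes a value into host memory that is mapped into its address space and raises a flag; the host polls the flag while the kernel is still running, prints the value on the device's behalf, and clears the flag.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical mailbox shared by host and device through mapped memory.
struct Mailbox {
    volatile int state;  // 0 = empty, 1 = request pending, 2 = kernel done
    volatile int value;  // payload the device asks the host to print
};

// Device side: post n requests, waiting for the host to service each one.
__global__ void post_requests(Mailbox *box, int n) {
    for (int i = 0; i < n; ++i) {
        box->value = i * i;          // write the payload first
        __threadfence_system();      // order the payload before the flag
        box->state = 1;              // signal the host
        while (box->state != 0) { }  // spin until the host acknowledges
    }
    box->state = 2;                  // tell the host to stop polling
}

// Host side: launch the kernel, then poll the mailbox while it runs.
// Returns the sum of the serviced values, or -1 on a CUDA error.
long service_requests(int n) {
    Mailbox *h_box = nullptr;
    Mailbox *d_box = nullptr;

    cudaSetDeviceFlags(cudaDeviceMapHost);  // needed on older, pre-UVA setups
    (void)cudaGetLastError();               // clear the error if a context is already active

    if (cudaHostAlloc((void **)&h_box, sizeof(Mailbox),
                      cudaHostAllocMapped) != cudaSuccess) {
        return -1;
    }
    h_box->state = 0;
    h_box->value = 0;
    if (cudaHostGetDevicePointer((void **)&d_box, h_box, 0) != cudaSuccess) {
        cudaFreeHost(h_box);
        return -1;
    }

    post_requests<<<1, 1>>>(d_box, n);      // launch is asynchronous
    if (cudaGetLastError() != cudaSuccess) {
        cudaFreeHost(h_box);
        return -1;
    }

    long sum = 0;
    while (h_box->state != 2) {             // polling service loop
        if (h_box->state == 1) {
            int v = h_box->value;
            printf("device request: %d\n", v);  // host-side I/O for the device
            sum += v;
            h_box->state = 0;               // acknowledge; device continues
        }
    }

    cudaDeviceSynchronize();
    cudaFreeHost(h_box);
    return sum;
}

int main() {
    long sum = service_requests(4);
    printf("sum of serviced values: %ld\n", sum);
    return sum < 0 ? 1 : 0;
}
```

The sketch relies on the kernel and the host loop running concurrently. On driver configurations that defer kernel launches until the host synchronizes, the loop may need an explicit flush, and a kernel that faults mid-run would leave the host spinning, so a production version would likely add a timeout or a stream status check. Supporting many threads would also require one slot per thread or a queue, which is outside what this sketch shows.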

Index Terms

Computer Science
Information Sciences

Keywords

CUDA's Introduction, Architecture, Input-output Functions, Mapping of Memory