CFP last date
20 January 2025
Reseach Article

Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach

by Mohsin Altaf Wani, S. M. K Quadri
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 69 - Number 3
Year of Publication: 2013
Authors: Mohsin Altaf Wani, S. M. K Quadri
10.5120/11822-7518

Mohsin Altaf Wani, S. M. K Quadri . Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach. International Journal of Computer Applications. 69, 3 ( May 2013), 18-21. DOI=10.5120/11822-7518

@article{ 10.5120/11822-7518,
author = { Mohsin Altaf Wani, S. M. K Quadri },
title = { Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 69 },
number = { 3 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 18-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume69/number3/11822-7518/ },
doi = { 10.5120/11822-7518 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:29:14.237216+05:30
%A Mohsin Altaf Wani
%A S. M. K Quadri
%T Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 69
%N 3
%P 18-21
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

GPUs (Graphics processing units) can be used for general purpose parallel computation. Developers can develop parallel programs running on GPUs using different computing architectures like CUDA or OpenCL. The Optimal Matrix Chain Multiplication problem is an optimization problem to find the optimal order for multiplying a chain of matrices. The optimal order of multiplication depends only on the dimensions of the matrices. It is known that this problem can be solved by dynamic programming technique using O(n3)-time complexity algorithm and a work space of size O(n2). The main contribution of this paper is to present a parallel implementation of this O(n3)-time algorithm on a GPU and to assess the amount of programming effort required to develop this parallel implementation when compared to a similar sequential implementation.

References
  1. Cormen, T. H. , Lieserson, C. E. , Rivest, R. L. 1990 Introduction to Algorithms. MIT Press
  2. Neapolitan, R and Naimipour, K. 2003 Foundations of Algorithms using C++ pseudocode.
  3. Nvidia Corp. 2011 Nvidia CUDA programming guide version 4. 1.
  4. Nvidia Corp. 2011 CUDA C Best Practices Guide version 4. 1.
  5. W. W. Hwu. 2011 GPU Computing Gems Emerald Edition. Morgan Kaufmann,.
  6. D. Man, K. Uda, Y. Ito, and K. Nakano. Dec. 2011 "A GPU implementation of computing euclidean distance map with efficient memory access," in Proc. of International Conference on Networking and Computing, , pp. 68–76.
  7. A. Uchida, Y. Ito, and K. Nakano. Dec. 2011 "Fast and accurate template matching using pixel rearrangement on the GPU," in Proc. of International Conference on Networking and Computing , pp. 153–159.
  8. AMD. 2011 Introduction to OpenCL programming.
Index Terms

Computer Science
Information Sciences

Keywords

Dynamic programming parallel algorithms GPU CUDA