Research Article

Optimization Method to Reduce Matrices Multiplications in the Context of CUDA

by Arezoo Khatibi, Omid Khatibi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 182 - Number 15
Year of Publication: 2018
Authors: Arezoo Khatibi, Omid Khatibi
10.5120/ijca2018917780

Arezoo Khatibi, Omid Khatibi. Optimization Method to Reduce Matrices Multiplications in the Context of CUDA. International Journal of Computer Applications. 182, 15 (Sep 2018), 5-7. DOI=10.5120/ijca2018917780

@article{ 10.5120/ijca2018917780,
author = { Arezoo Khatibi and Omid Khatibi },
title = { Optimization Method to Reduce Matrices Multiplications in the Context of CUDA },
journal = { International Journal of Computer Applications },
issue_date = { Sep 2018 },
volume = { 182 },
number = { 15 },
month = { Sep },
year = { 2018 },
issn = { 0975-8887 },
pages = { 5-7 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume182/number15/29936-2018917780/ },
doi = { 10.5120/ijca2018917780 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:15:01.743158+05:30
%A Arezoo Khatibi
%A Omid Khatibi
%T Optimization Method to Reduce Matrices Multiplications in the Context of CUDA
%J International Journal of Computer Applications
%@ 0975-8887
%V 182
%N 15
%P 5-7
%D 2018
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Parallel programming is an effective way to increase the processing speed of applications: the work is carried out simultaneously by multiple processors rather than by a single processor. We compare the number of calculations required for matrix chain multiplication in normal (sequential) mode with the number required in parallel mode. Because our program is written in CUDA, a widely used parallel programming platform, we first give a brief description of CUDA, then explain the essential mathematical notions, and finally compare the performance of the two programs.
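
As a rough illustration of what "number of calculations" means for a matrix chain, the sketch below counts scalar multiplications for a chain evaluated strictly left to right and for the best parenthesization found by the textbook dynamic-programming recurrence. It is not taken from the paper: the function names, the `dims` convention (matrix i has shape dims[i] x dims[i+1]), and the sample dimensions are assumptions made for illustration, and the sketch does not model how the authors' CUDA program distributes the work.

```python
def chain_cost_left_to_right(dims):
    """Scalar multiplications when the chain is multiplied strictly left to right.

    Hypothetical helper: matrix i is assumed to have shape dims[i] x dims[i + 1].
    """
    cost = 0
    rows = dims[0]  # the running product always keeps the first matrix's row count
    for k in range(1, len(dims) - 1):
        cost += rows * dims[k] * dims[k + 1]
    return cost


def chain_cost_optimal(dims):
    """Minimum scalar multiplications over all parenthesizations (standard DP)."""
    n = len(dims) - 1  # number of matrices in the chain
    if n <= 0:
        return 0
    # best[i][j] = cheapest cost of computing the product of matrices i..j
    best = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best[i][j] = min(
                best[i][k] + best[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return best[0][n - 1]


if __name__ == "__main__":
    dims = [50, 5, 100, 10]  # assumed shapes: 50x5, 5x100, 100x10
    print(chain_cost_left_to_right(dims))  # 75000
    print(chain_cost_optimal(dims))        # 7500
```

For these assumed shapes, grouping the last two matrices first cuts the count from 75000 to 7500 scalar multiplications, which is the kind of gap a comparison of calculation counts would expose before any parallel speedup is considered.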

Index Terms

Computer Science
Information Sciences

Keywords

CUDA, GPU, Parallel programming