Efficient Dynamic Multiple GPGPU Layer for OpenCV

Afshan Jafri

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 164 - Number 3

Year of Publication: 2017

Authors: Afshan Jafri

10.5120/ijca2017913604

Afshan Jafri. Efficient Dynamic Multiple GPGPU Layer for OpenCV. International Journal of Computer Applications. 164, 3 (Apr 2017), 42-48. DOI=10.5120/ijca2017913604

@article{ 10.5120/ijca2017913604,
author = { Afshan Jafri },
title = { Efficient Dynamic Multiple GPGPU Layer for OpenCV },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2017 },
volume = { 164 },
number = { 3 },
month = { Apr },
year = { 2017 },
issn = { 0975-8887 },
pages = { 42-48 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume164/number3/27467-2017913604/ },
doi = { 10.5120/ijca2017913604 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:10:18.899445+05:30
%A Afshan Jafri
%T Efficient Dynamic Multiple GPGPU Layer for OpenCV
%J International Journal of Computer Applications
%@ 0975-8887
%V 164
%N 3
%P 42-48
%D 2017
%I Foundation of Computer Science (FCS), NY, USA

Abstract

General-purpose graphics processing units (GPGPUs) provide a high-performance computing resource. CUDA (Compute Unified Device Architecture) and OpenCL (Open Computing Language) allow parallel programs to be written that use multiple central processing units (CPUs) and GPGPUs. The image processing library OpenCV (Open Source Computer Vision Library) could benefit greatly from using multiple GPGPUs in parallel; however, its CUDA implementation is restricted to a single GPGPU. This research develops an abstraction layer above the OpenCV single-GPU module that enables the use of multiple GPUs in a single instruction, multiple data (SIMD) architecture. In this approach, a controller (parent) thread generates worker threads that operate on several GPU devices and balances the workload across them; task allocation is dynamic for any number of GPUs. Experiments running bilateral filtering, color-to-gray conversion, fast Fourier transform, and convolution on homogeneous and heterogeneous sized images of scenery, objects, and faces indicate that: (1) threading halves GPU computation time relative to sequential operation; (2) tuned, statically load-balanced GPU threading reduces computation time by up to a fourth compared with CPU threading; and (3) the performance of dynamic load balancing approaches that of manually, iteratively balanced static operation.
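The controller/worker scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a shared task queue as the dynamic allocation mechanism, and the names `run_dynamic`, `fake_gray`, and `handled_by` are hypothetical. The per-image operation is a plain Python stand-in, so no GPU or OpenCV call is made; a real worker would bind its thread to one GPU device before processing.

```python
import queue
import threading


def run_dynamic(tasks, num_devices, process):
    """Controller: queue all tasks, start one worker per device, collect results."""
    work = queue.Queue()
    for index, task in enumerate(tasks):
        work.put((index, task))

    results = [None] * len(tasks)
    handled_by = [None] * len(tasks)  # which device processed each task

    def worker(device_id):
        # Assumption: a real worker would select its GPU device here.
        while True:
            try:
                index, task = work.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            # Each worker pulls the next task as soon as it is free, so faster
            # devices naturally take more tasks (dynamic allocation).
            results[index] = process(device_id, task)
            handled_by[index] = device_id

    threads = [threading.Thread(target=worker, args=(d,)) for d in range(num_devices)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, handled_by


def fake_gray(device_id, pixels):
    # Stand-in for a per-image GPU operation: average each RGB triple.
    return [sum(p) // 3 for p in pixels]


# Eight "images" of different sizes (heterogeneous workload), three devices.
images = [[(i, i + 1, i + 2)] * (i + 1) for i in range(8)]
results, handled_by = run_dynamic(images, num_devices=3, process=fake_gray)
```

Because workers pull tasks on demand rather than receiving a fixed share up front, no manual tuning of the split is needed, which is the property the abstract contrasts with static load balancing.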


Index Terms

Computer Science

Information Sciences

Keywords

GPGPU, OpenCV, SIMD, CUDA, OpenCL, Multiple GPU, Load Balancing, Threading