CFP last date
20 January 2025
Reseach Article

Experimental Evaluation of the Performance of Processing Stealing Technique: A Scalable Load Balancing Technique for a Dynamic Multiprocessor System

by O. O. Olakanmi, O. A. Fakolujo
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 83 - Number 3
Year of Publication: 2013
Authors: O. O. Olakanmi, O. A. Fakolujo
10.5120/14426-2568

O. O. Olakanmi, O. A. Fakolujo . Experimental Evaluation of the Performance of Processing Stealing Technique: A Scalable Load Balancing Technique for a Dynamic Multiprocessor System. International Journal of Computer Applications. 83, 3 ( December 2013), 7-13. DOI=10.5120/14426-2568

@article{ 10.5120/14426-2568,
author = { O. O. Olakanmi, O. A. Fakolujo },
title = { Experimental Evaluation of the Performance of Processing Stealing Technique: A Scalable Load Balancing Technique for a Dynamic Multiprocessor System },
journal = { International Journal of Computer Applications },
issue_date = { December 2013 },
volume = { 83 },
number = { 3 },
month = { December },
year = { 2013 },
issn = { 0975-8887 },
pages = { 7-13 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume83/number3/14426-2568/ },
doi = { 10.5120/14426-2568 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:58:24.818480+05:30
%A O. O. Olakanmi
%A O. A. Fakolujo
%T Experimental Evaluation of the Performance of Processing Stealing Technique: A Scalable Load Balancing Technique for a Dynamic Multiprocessor System
%J International Journal of Computer Applications
%@ 0975-8887
%V 83
%N 3
%P 7-13
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper reports preliminary experimental evaluation of a Processing Elements Stealing (PE-S) technique which was targeted as efficient and scalable load balancing technique for dynamically structured multiprocessor systems. The multiprocessor system is imagined as a dynamic cluster based multiprocessor. Each cluster of the multiprocessor system is a node in symmetric multiprocessor architecture and the number of Processing Element (PE) in each cluster is dynamically determined at runtime. The PE-S technique dynamically computes the configuration ratio using the number of threads in the dynamically assigned tasks to generate the new number of PE for each cluster. This new configuration ratio is thereafter used to balance the additional computational work generated by runtime instantiation of current workloads for each cluster. In this work, the efficiency of the PE-S was evaluated using memory traces of some tightly parallel applications where the amount of parallelism is parameterized. These traces were used as workloads on two different simulation setups; the first is a dynamic multiprocessor with PE-S while the other was also a dynamic multiprocessor but without PE-S. This is to evaluate the performance of the PE-S load balancing technique on the targeted multiprocessor. Also the efficiency of PE-S reconfigurations was compared with other possible reconfiguration ratios. The experimental results showed that the load balancing algorithm is efficient and scalable for balancing at least 100,000 instructions tasks and PE-S generated ratios are averagely better than any other reconfiguration ratios.

References
  1. Acar, U. , Blelloch, G. , & Blumofe, R. (2000). The Data Locality of Work stealing. Proceedings 12th ACM Symposium on Parallel Algorithms and Architecture (pp. Pp. 1-12). ACM.
  2. Alexandra, F. , Margo, S. , & Michael, S. (2007). Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler. Proceeding of 16th of International Conference on Parallel Architecture and Compilation Technique (PACT 2007). Brasov, Romania.
  3. Amir, M. R. , & Mohammad, A. V. (2008). A novel task scheduling in Multiprocessor Systems with Genetic Algorithm by Using Elitism stepping method.
  4. Belady, L. (1966). A study of replacement algorithms for virtual-storage computer. IBM Systems Journal , 78-101.
  5. Berger, J. , & Browne, J. (1999). Scalable Load Distribution and Load Balancing for Dynamic Parallel Programs. International Workshop on Cluster-Based Computing.
  6. Blumofe, R. , & Leiserson, C. (1994). Scheduling Multithreaded Computations by Work Stealing. Proceedings 35th IEEE Conference on Foundations of Computer Science, (pp. 356-368).
  7. Chou, T. C. , & Abraham, A. J. (1983). Load redistribution under failure in distributed systems. IEEE Trans. Computing , C-32 (9), 799-808.
  8. Chow, C. , & Kohler, W. (1979). Models for dynamic load balancing in a heterogenous multiple processor system. IEEE Trans. Computing , C-28 (5), 354-361.
  9. Gautam, G. , & Soo-Young, L. (1998). Dynamic Reconfiguration of a PMMLA for High Throughput Applications. Parallel and Distributed Processing Workshops Held in Conjunction with the 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing, 10IPPS/SPDP'98 (pp. Pp 1-6). Florida: Springer.
  10. Gene, E. J. , & Hwang, Y. S. (2003). An Efficient Algorithm for Perfect Load Balancing on Hypercube Multiprocessors. The Journal of Supercomputing , 25 (1), 5-15.
  11. Mitzenmatcher, M. (1998). Analysis of Load Stealing Models Based on Differential Equations. Proceedings of 10th ACM Symposium on Parallel Algorithms and Architectures (pp. 212-221). ACM.
  12. Neil, D. , & Wierman, A. (2011). On the Benefits of Work Stealing in Shared Memory Multiprocessors. Carnegie Mellon University.
  13. Nozar, T. , Nader, B. , Amir, H. K. , & Haitao, D. (2004). MaRS: A Macro-pipelined Reconfigurable System. ACM .
  14. Olakanmi, O. , & Fakolujo, O. (2012(a)). Design and Performance Analysis of Reconfigurable Hybridized Macro Pipeline Multiprocessor. International Journal of Ubiquitous Computing and Communication , 7, Pp. 17-24.
  15. Olakanmi, O. , & Fakolujo, O. (2012(b)). Load Balancing in the Macro Pipeline Multiprocessor System using Processing Element Stealing Technique. International Journal of Ubiquitous Computing and Communication , 7, Pp. 25-31.
  16. R, D. , & et, a. (2002). A Dynamically Reconfigurable Architecture Dealing with Future Mobile Telecommunications Constraints. Proceeding of Parallel and Distributed Processing Symposium, IPDPS, (pp. 156-163).
  17. Wall, D. W. (1993). Limits of Instruction-Level Parallelism. Digital Western Research Laboratory 93/6.
  18. Yi-Hsuan, L. , & Cheng, C. (n. d. ). A Modified Genetic Algorithm for Task Scheduling in Multiprocessor Systems.
  19. Zeljko, V. , Havard, E. , Palm, H. , & Carsten, G. (2009). Limits of Work-Stealing Scheduling. In E. Frachtenberg (Ed. ), 14th International WorkshopJSSPP, (pp. 280-300).
Index Terms

Computer Science
Information Sciences

Keywords

Load balancing multiprocessor parallel application work stealing and sharing processing element stealing