AutoScale-ML with HASA: A Docker-based Framework for Distributed AutoML Model Selection

Md. Attaur Rahman Sofi; Mohd. Yousuf

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 187 - Number 121

Year of Publication: 2026

Authors: Md. Attaur Rahman Sofi, Mohd. Yousuf

10.5120/ijca57e5a1e8f472

Md. Attaur Rahman Sofi, Mohd. Yousuf. AutoScale-ML with HASA: A Docker-based Framework for Distributed AutoML Model Selection. International Journal of Computer Applications. 187, 121 (Jun 2026), 8-14. DOI=10.5120/ijca57e5a1e8f472

@article{ 10.5120/ijca57e5a1e8f472,

author = { Md. Attaur Rahman Sofi and Mohd. Yousuf },

title = { AutoScale-ML with HASA: A Docker-based Framework for Distributed AutoML Model Selection },

journal = { International Journal of Computer Applications },

issue_date = { Jun 2026 },

volume = { 187 },

number = { 121 },

month = { Jun },

year = { 2026 },

issn = { 0975-8887 },

pages = { 8-14 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume187/number121/autoscale-ml-with-hasa-a-docker-based-framework-for-distributed-automl-model-selection/ },

doi = { 10.5120/ijca57e5a1e8f472 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2026-07-01T03:10:16.293161+05:30

%A Md. Attaur Rahman Sofi

%A Mohd. Yousuf

%T AutoScale-ML with HASA: A Docker-based Framework for Distributed AutoML Model Selection

%J International Journal of Computer Applications

%@ 0975-8887

%V 187

%N 121

%P 8-14

%D 2026

%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper presents AutoScale-ML with HASA, a hierarchical adaptive search framework for automated machine learning (AutoML) model selection. It is implemented in a simulated seven-node distributed computing environment: one master node and six independent Docker containers, each exposing a REST endpoint through Flask. Each worker trains a randomly assigned scikit-learn classifier, drawn from RandomForest, GradientBoosting, ExtraTrees, DecisionTree, and LogisticRegression, on a 50,000-sample synthetic classification dataset (50 features, 20 informative) generated with scikit-learn's make_classification. The worker then returns accuracy, training runtime, simulated network delay, and a composite score to a central master process. The master applies a three-phase Hierarchical Adaptive Search Algorithm (HASA): Phase 1 collects all six worker evaluations and retains the top four by composite score; Phase 2 re-ranks those four candidates and retains the top two; Phase 3 selects the single best model by maximum composite score. Experimental results, including per-model benchmarks, phase-by-phase HASA traces, a penalty coefficient sensitivity analysis, and network delay characterisation, indicate that the framework balances prediction accuracy against computational cost through runtime-aware hierarchical model selection. Evaluation across the five classifier families shows that the composite scoring function heavily penalises ensemble training times, often favouring lightweight models over higher-accuracy alternatives. The penalty coefficient α is therefore a critical configuration parameter that must be calibrated to the deployment context. The findings position the framework as a reproducible baseline for containerised AutoML experimentation.
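The three-phase selection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the composite scoring formula, so the form used here (accuracy minus α times the sum of runtime and delay) is an assumption made only for illustration. The names `Evaluation`, `composite_score`, `top_k`, `hasa_select`, and `SAMPLE` are hypothetical, and the sample numbers are invented, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """One worker's report to the master (field names are hypothetical)."""
    worker: str
    model: str
    accuracy: float
    runtime_s: float
    delay_s: float


def composite_score(ev, alpha):
    # ASSUMED form: accuracy minus a linear penalty on runtime plus delay.
    # The paper's actual scoring function may differ.
    return ev.accuracy - alpha * (ev.runtime_s + ev.delay_s)


def top_k(candidates, k, alpha):
    # Rank by composite score, highest first, and keep the first k.
    ranked = sorted(candidates, key=lambda ev: composite_score(ev, alpha),
                    reverse=True)
    return ranked[:k]


def hasa_select(evaluations, alpha):
    phase1 = top_k(evaluations, 4, alpha)   # Phase 1: six reports -> top four
    phase2 = top_k(phase1, 2, alpha)        # Phase 2: re-rank four -> top two
    best = max(phase2, key=lambda ev: composite_score(ev, alpha))  # Phase 3
    return phase1, phase2, best


# Invented illustrative values, NOT measurements from the paper.
SAMPLE = [
    Evaluation("w1", "RandomForest", 0.93, 40.0, 0.2),
    Evaluation("w2", "GradientBoosting", 0.94, 120.0, 0.1),
    Evaluation("w3", "ExtraTrees", 0.92, 25.0, 0.3),
    Evaluation("w4", "DecisionTree", 0.85, 3.0, 0.2),
    Evaluation("w5", "LogisticRegression", 0.83, 1.0, 0.1),
    Evaluation("w6", "RandomForest", 0.93, 45.0, 0.4),
]

if __name__ == "__main__":
    for alpha in (0.0, 0.01):
        _, _, best = hasa_select(SAMPLE, alpha)
        print(alpha, best.worker, best.model)
```

With these invented numbers, α = 0 selects the most accurate model (GradientBoosting), while α = 0.01 selects LogisticRegression. This mirrors the abstract's observation that the penalty coefficient can favour lightweight models over higher-accuracy alternatives. Because the scores in this sketch are fixed, Phase 2 only narrows the ranked list; a real implementation might re-evaluate the candidates before re-ranking.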

Index Terms

Computer Science

Information Sciences

Keywords

AutoML; Hierarchical Search; Flask REST; Docker Compose; Scikit-learn; Model Selection; Containerised ML; Distributed Computing; HASA; Composite Scoring; Penalty Coefficient