International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 36
Year of Publication: 2024
Authors: Sameh Abdulah, Walid Atwa
DOI: 10.5120/ijca2024923878
Sameh Abdulah, Walid Atwa. Robust Frequent Patterns Mining Algorithms on Parallel Systems. International Journal of Computer Applications. 186, 36 (Aug 2024), 24-30. DOI=10.5120/ijca2024923878
Data Mining (DM) algorithms are increasingly used to analyze the vast amounts of data generated in scientific fields, such as instrument and simulation data, and in areas such as social networks and financial transactions. The availability of High-Performance Computing (HPC) systems has made parallel implementations of these algorithms commonplace. However, these systems, which were designed with data movement constraints in mind, often experience faults in computing devices, resulting in permanent process or node failures. This paper presents fault-tolerant parallel algorithms that enable in-memory checkpointing and recovery for frequent pattern mining algorithms. Long-running data-intensive applications typically use the Message Passing Interface (MPI). We therefore tackle the challenge of fault tolerance in MPI-based applications by leveraging internal algorithm features and MPI one-sided communication. Although this paper focuses on the FP-Growth frequent pattern mining algorithm, we anticipate that the proposed approaches will serve as a foundation for designing fault-tolerant DM algorithms in general, given the effectiveness of the proposed implementations. Our evaluation demonstrates that MPI one-sided communication can provide robust support for efficient memory-based fault tolerance in parallel algorithms, even when compared to existing parallel programming models like Hadoop and Spark. To evaluate our fault-tolerant (FT) algorithms, we conduct tests on a large-scale InfiniBand cluster using several large datasets and up to 2K cores. The results show excellent checkpointing and recovery efficiency compared to the disk-based approach. Furthermore, we observe an average speed-up of 20X for the FP-Growth algorithm compared to Spark. This shows that a well-designed algorithm can surpass a solution based on a general fault-tolerant programming model.