| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 110 |
| Year of Publication: 2026 |
| Authors: Ali Diyaa, Engy Refaai, Alyaa Tamer, Soher Mohamed, Aya Adel Muhammed Hassan, Rana Ehab, Mohamed AbdelFattah |
| DOI: 10.5120/ijca28d82db3210b |
Ali Diyaa, Engy Refaai, Alyaa Tamer, Soher Mohamed, Aya Adel Muhammed Hassan, Rana Ehab, Mohamed AbdelFattah. Automatic Multi-Label Stuttering Detection from Speech using Attention-Enhanced Deep Neural Networks. International Journal of Computer Applications. 187, 110 (May 2026), 1-8. DOI=10.5120/ijca28d82db3210b
Stuttering is a speech disorder that interrupts the normal flow of speech. It appears as repetitions of sounds, syllables, or words; excessive prolongation of sounds; silent blocks in which no sound is produced despite the speaker’s best efforts to speak; and interjections. The speaker usually knows exactly what they want to say, but the speech muscles do not work properly. Stuttering affects almost 80 million people globally and can make everyday communication challenging and frustrating. If left untreated, it frequently causes problems with social connections and self-confidence. This work presents a hybrid deep learning system for automatically identifying stuttering disfluencies in speech recordings. The method combines convolutional neural networks (CNN) for local acoustic feature extraction, bidirectional long short-term memory (BiLSTM) layers, and an attention mechanism (AM). The acoustic features used in the model include thirteen Mel-frequency cepstral coefficients (MFCCs) and their first-order (delta) and second-order (delta-delta) derivatives. Evaluations on benchmark datasets, such as SEP-28K and FluencyBank, yield F1 scores of 97.3% to 98.9% for important disfluency types and accuracy between 97.0% and 98.2%; these results are comparable to human expert agreement.
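To illustrate the feature set mentioned above, the following minimal sketch shows one common way to derive delta and delta-delta coefficients from thirteen MFCCs per frame and stack them into a 39-dimensional vector. It is not the authors' implementation: the regression-based delta formula, the window half-width of 2, the edge-frame replication, and the names `delta` and `stack_features` are all assumptions made for illustration, and the input is synthetic rather than real speech.

```python
def delta(frames, width=2):
    """Regression-based delta over a list of per-frame coefficient lists.

    Assumed formula: d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2),
    with the first and last frames replicated at the edges.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(mfcc):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(mfcc)   # first-order (delta)
    d2 = delta(d1)     # second-order (delta-delta)
    return [c + a + b for c, a, b in zip(mfcc, d1, d2)]


# Synthetic input: 5 frames of 13 coefficients that rise linearly over time.
mfcc = [[float(t + k) for k in range(13)] for t in range(5)]
features = stack_features(mfcc)
print(len(features), len(features[0]))  # prints "5 39"
```

In a pipeline of the kind the abstract describes, per-frame vectors like these would form the input sequence to the CNN front end; the exact framing and normalization used in the paper are not specified here.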