A Deep Learning Approach for Urban Sound Classification

Sanjoy Barua; Tahmina Akter; Mahmud Abu Saleh Musa; Muhammad Anwarul Azim

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 185 - Number 24

Year of Publication: 2023

Authors: Sanjoy Barua, Tahmina Akter, Mahmud Abu Saleh Musa, Muhammad Anwarul Azim

10.5120/ijca2023922991

Sanjoy Barua, Tahmina Akter, Mahmud Abu Saleh Musa, Muhammad Anwarul Azim. A Deep Learning Approach for Urban Sound Classification. International Journal of Computer Applications. 185, 24 (Jul 2023), 8-14. DOI=10.5120/ijca2023922991

@article{10.5120/ijca2023922991,
  author = {Sanjoy Barua and Tahmina Akter and Mahmud Abu Saleh Musa and Muhammad Anwarul Azim},
  title = {A Deep Learning Approach for Urban Sound Classification},
  journal = {International Journal of Computer Applications},
  issue_date = {Jul 2023},
  volume = {185},
  number = {24},
  month = {Jul},
  year = {2023},
  issn = {0975-8887},
  pages = {8-14},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume185/number24/32838-2023922991/},
  doi = {10.5120/ijca2023922991},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T01:26:56.207716+05:30
%A Sanjoy Barua
%A Tahmina Akter
%A Mahmud Abu Saleh Musa
%A Muhammad Anwarul Azim
%T A Deep Learning Approach for Urban Sound Classification
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 24
%P 8-14
%D 2023
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Urban sound classification is the task of identifying the type of sound present in a recording, such as car horns, pedestrian footsteps, or construction noise. Accurate classification supports a variety of applications, including environmental monitoring, traffic management, and public safety. To address this problem, we experiment with five deep learning models: an ANN, a CNN, an RNN, a combined LSTM and GRU model, and a combined Bi-LSTM and Bi-GRU model. The models are trained and evaluated on the UrbanSound8K dataset, which consists of 8,732 labeled urban sound excerpts from 10 classes. The ANN achieved the highest accuracy, reaching 95% on the test set. These results demonstrate the effectiveness of deep learning for urban sound classification and suggest that, among the models tested, the ANN is the most suitable for this task. This work could benefit the many fields that rely on accurate identification of urban sounds.
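To make the classification setup concrete, the following is a minimal sketch of a feed-forward network (ANN) that maps one feature vector per clip to probabilities over the ten UrbanSound8K classes. It is not the architecture used in the paper, which the abstract does not specify: the 40-dimensional input, the single 64-unit hidden layer, the random weights, and the names `init_params`, `forward`, and `accuracy` are all assumptions made for illustration, and the input data is random rather than real audio features.

```python
import numpy as np

# The ten class labels of the UrbanSound8K dataset.
CLASSES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]

N_FEATURES = 40  # assumed: one 40-dimensional feature vector per clip
HIDDEN = 64      # assumed hidden-layer width


def init_params(rng, n_in=N_FEATURES, n_hidden=HIDDEN, n_out=len(CLASSES)):
    """Small random weights for a one-hidden-layer network (untrained)."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, x):
    """Return class probabilities for a batch x of shape (batch, n_in)."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU hidden layer
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # softmax over classes


def accuracy(probs, labels):
    """Fraction of rows whose most probable class equals the true label."""
    return float(np.mean(probs.argmax(axis=1) == labels))


rng = np.random.default_rng(0)
params = init_params(rng)
features = rng.normal(size=(8, N_FEATURES))     # stand-in data, not real audio
labels = rng.integers(0, len(CLASSES), size=8)  # stand-in labels
probs = forward(params, features)
print(probs.shape, accuracy(probs, labels))
```

Because the weights here are untrained, the printed accuracy is only chance-level; the sketch shows the shape of the computation, not the reported 95% result.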

Index Terms

Computer Science

Information Sciences

Keywords

Urban sound, environmental monitoring, deep learning, ANN, CNN, RNN, LSTM, LSTM plus GRU, Bi-LSTM plus Bi-GRU