International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 183 - Number 13
Year of Publication: 2021
Authors: Chetan Sharma, Rajdeep Singh
DOI: 10.5120/ijca2021921447
Chetan Sharma, Rajdeep Singh. A Performance Analysis of Face and Speech Recognition in the Video and Audio Stream using Machine Learning Classification Techniques. International Journal of Computer Applications. 183, 13 (Jul 2021), 41-46. DOI=10.5120/ijca2021921447
Biometric authentication is an emerging technology that uses biometric data to identify or recognize a person in security applications. A number of biometrics can be used in a person authentication system. Among the widely used biometrics, voice and face traits are the most promising for pervasive application in everyday life, because they can be obtained easily through unobtrusive, user-friendly procedures. The low-cost audio and visual capture sensors on smartphones, laptops, and tablets have made the advantages of voice and face biometrics stand out further compared with other traits. For a long time, acoustic information alone has been used with great success in speaker authentication applications. Meanwhile, the last decade or two has also witnessed great advances in face recognition technologies. Object detection and tracking is usually the first step in applications such as video surveillance. The main purpose of the static-camera face recognition and tracking system is to estimate speed and distance parameters. We propose a general method for detecting and tracking motion that is based on the visual system and uses the image difference algorithm. The person's voice is then recognized to obtain feedback from the corresponding person. The process focuses on detecting people on stage and then performs the voice signal processing. We propose a new person recognition technique that fuses face and voice; compared with single-biometric recognition, this technique can greatly improve recognition speed. The security system is developed using the Viola-Jones face detection algorithm. The proposed method uses the Local Binary Pattern (LBP) as a feature extraction technique to compute local features. Our project uses Mel Frequency Cepstral Coefficient (MFCC) extraction for speech recognition. The extracted features are used as input to the multi-SVM classifier to determine gender and identify individuals, and the results are displayed.
The new system can be used in various areas, such as identity verification and other potential commercial applications.
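As an illustration of two of the building blocks named in the abstract, the following is a minimal sketch of image differencing for motion detection and of the basic Local Binary Pattern (LBP) descriptor. It is not the authors' implementation: the abstract does not specify the LBP variant, the neighbourhood, or the difference threshold, so the 3x3 neighbourhood, the threshold of 25, and all function names (`frame_difference`, `lbp_code`, `lbp_histogram`) are assumptions chosen for clarity.

```python
from typing import List

# A grayscale image represented as rows of pixel intensities (0-255).
Image = List[List[int]]

# Offsets of the 8 neighbours in a 3x3 window, clockwise from the top-left.
# (Assumed ordering; the abstract does not specify one.)
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def frame_difference(prev: Image, curr: Image, threshold: int = 25) -> Image:
    """Return a binary motion mask: 1 where |curr - prev| exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def lbp_code(img: Image, r: int, c: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at row r, column c."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img: Image) -> List[int]:
    """Return the 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


if __name__ == "__main__":
    patch = [[10, 20, 30],
             [40, 50, 60],
             [70, 80, 90]]
    print(lbp_code(patch, 1, 1))
    print(frame_difference([[0, 100]], [[10, 200]]))
```

In a pipeline of the kind the abstract describes, a histogram such as the one returned by `lbp_histogram` would typically serve as the feature vector passed to the classifier, while the motion mask would restrict processing to regions that changed between frames.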