Research Article

Computer Vision Architecture using Fusion Technique

by Nidhi Srivastava, Harsh Dev
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 76 - Number 4
Year of Publication: 2013
Authors: Nidhi Srivastava, Harsh Dev
10.5120/13238-0674

Nidhi Srivastava, Harsh Dev. Computer Vision Architecture using Fusion Technique. International Journal of Computer Applications. 76, 4 (August 2013), 40-43. DOI=10.5120/13238-0674

@article{ 10.5120/13238-0674,
author = { Nidhi Srivastava and Harsh Dev },
title = { Computer Vision Architecture using Fusion Technique },
journal = { International Journal of Computer Applications },
issue_date = { August 2013 },
volume = { 76 },
number = { 4 },
month = { August },
year = { 2013 },
issn = { 0975-8887 },
pages = { 40-43 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume76/number4/13238-0674/ },
doi = { 10.5120/13238-0674 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:45:03.721050+05:30
%A Nidhi Srivastava
%A Harsh Dev
%T Computer Vision Architecture using Fusion Technique
%J International Journal of Computer Applications
%@ 0975-8887
%V 76
%N 4
%P 40-43
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Humans want to communicate with computers in the same way they communicate with other humans. Speech is the most natural and spontaneous form of communication. It is bimodal in nature: combining audio and visual information improves the speech recognition rate, especially under poor audio conditions. This paper proposes a novel computer vision architecture based on a fusion technique. The architecture fuses more than one modality using multiple agents. Two modalities are used here, audio and video. The audio part extracts the speech of a person, and the video part extracts the person's face and lip information. Separate agents process the individual modalities, and a fusion agent combines them for effective and efficient automatic speech recognition.
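The agent arrangement described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the fusion agent combines the modalities, so the reliability-weighted late fusion below is an assumption chosen for illustration, not the authors' method. All class names, fields, and scores (`ModalityResult`, `AudioAgent`, `VideoAgent`, `FusionAgent`, `reliability`) are hypothetical, and the feature extraction and recognition steps inside each agent are replaced by precomputed per-word scores.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ModalityResult:
    """Output of one modality agent (hypothetical structure)."""
    scores: Dict[str, float]  # candidate word -> non-negative score
    reliability: float        # 0..1, e.g. low for audio recorded in noise


class AudioAgent:
    """Stand-in for the agent that processes the speech signal."""

    def process(self, scores: Dict[str, float], reliability: float) -> ModalityResult:
        # A real agent would extract acoustic features and run a recognizer here.
        return ModalityResult(scores, reliability)


class VideoAgent:
    """Stand-in for the agent that processes face and lip information."""

    def process(self, scores: Dict[str, float], reliability: float) -> ModalityResult:
        # A real agent would detect the face, track the lips, and classify here.
        return ModalityResult(scores, reliability)


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores so they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {word: s / total for word, s in scores.items()}


class FusionAgent:
    """Combines both modalities with reliability-weighted late fusion (assumed)."""

    def fuse(self, audio: ModalityResult, video: ModalityResult) -> Tuple[str, Dict[str, float]]:
        total = audio.reliability + video.reliability
        if total <= 0:
            raise ValueError("at least one modality must be reliable")
        w_audio = audio.reliability / total
        w_video = video.reliability / total
        a = normalize(audio.scores)
        v = normalize(video.scores)
        # A word missing from one modality contributes a score of 0 for it.
        fused = {
            word: w_audio * a.get(word, 0.0) + w_video * v.get(word, 0.0)
            for word in set(a) | set(v)
        }
        best = max(fused, key=fused.get)
        return best, fused


# Noisy audio slightly prefers "night"; the lips clearly close for "might".
audio = AudioAgent().process({"night": 0.55, "might": 0.45}, reliability=0.3)
video = VideoAgent().process({"night": 0.20, "might": 0.80}, reliability=0.7)
best, fused = FusionAgent().fuse(audio, video)
print(best, round(fused[best], 3))
```

In this example the visual modality outweighs the unreliable audio, which mirrors the abstract's point that visual information helps most under poor audio conditions. Other fusion strategies, such as combining features before recognition, would fit the same agent layout.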

Index Terms

Computer Science
Information Sciences

Keywords

Computer vision, fusion agent, architecture, multimodal, multi-agent