Computer Vision Architecture using Fusion Technique

Nidhi Srivastava; Harsh Dev

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 76 - Number 4

Year of Publication: 2013

Authors: Nidhi Srivastava, Harsh Dev

10.5120/13238-0674

Nidhi Srivastava, Harsh Dev. Computer Vision Architecture using Fusion Technique. International Journal of Computer Applications. 76, 4 (August 2013), 40-43. DOI=10.5120/13238-0674

@article{ 10.5120/13238-0674,
author = { Nidhi Srivastava and Harsh Dev },
title = { Computer Vision Architecture using Fusion Technique },
journal = { International Journal of Computer Applications },
issue_date = { August 2013 },
volume = { 76 },
number = { 4 },
month = { August },
year = { 2013 },
issn = { 0975-8887 },
pages = { 40-43 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume76/number4/13238-0674/ },
doi = { 10.5120/13238-0674 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:45:03.721050+05:30
%A Nidhi Srivastava
%A Harsh Dev
%T Computer Vision Architecture using Fusion Technique
%J International Journal of Computer Applications
%@ 0975-8887
%V 76
%N 4
%P 40-43
%D 2013
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Humans want to communicate with computers in the same way they communicate with other humans. Speech is the most natural and spontaneous form of communication. It is bimodal in nature: combining audio and visual information improves the speech recognition rate, especially under poor audio conditions. This paper proposes a novel computer vision architecture using a fusion technique. The architecture fuses more than one modality using multiple agents. Two modalities are used here, audio and video. The audio part extracts the person's speech, and the video part extracts the person's face and lip information. Separate agents process each modality, and a fusion agent combines their outputs for effective and efficient automatic speech recognition.
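The agent arrangement described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the fusion agent combines the modalities, so the reliability-weighted sum of per-word scores used here is an assumption, as are all names (`ModalityAgent`, `FusionAgent`, `recognize`) and the example scores; the sketch is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict

Scores = Dict[str, float]  # candidate word -> recognizer confidence


@dataclass
class ModalityAgent:
    """Hypothetical agent wrapping the output of one modality's recognizer."""
    name: str
    reliability: float  # assumed weight, e.g. lowered for audio in noisy rooms

    def process(self, raw_scores: Scores) -> Scores:
        # Normalize so every agent reports a distribution over candidate words.
        total = sum(raw_scores.values())
        if total <= 0:
            raise ValueError(f"{self.name}: scores must sum to a positive value")
        return {label: score / total for label, score in raw_scores.items()}


class FusionAgent:
    """Assumed fusion rule: reliability-weighted average of agent outputs."""

    def fuse(self, outputs: Dict[str, Scores], weights: Dict[str, float]) -> Scores:
        weight_sum = sum(weights[name] for name in outputs)
        labels = set().union(*(scores.keys() for scores in outputs.values()))
        return {
            label: sum(weights[n] * outputs[n].get(label, 0.0) for n in outputs)
            / weight_sum
            for label in labels
        }

    def decide(self, fused: Scores) -> str:
        # Pick the candidate word with the highest fused score.
        return max(fused, key=fused.get)


def recognize(audio_raw: Scores, video_raw: Scores,
              audio_reliability: float = 0.5,
              video_reliability: float = 0.5) -> str:
    audio = ModalityAgent("audio", audio_reliability)
    video = ModalityAgent("video", video_reliability)
    outputs = {
        audio.name: audio.process(audio_raw),
        video.name: video.process(video_raw),
    }
    weights = {audio.name: audio.reliability, video.name: video.reliability}
    fusion = FusionAgent()
    return fusion.decide(fusion.fuse(outputs, weights))
```

With made-up scores such as `recognize({"bat": 0.55, "pat": 0.45}, {"bat": 0.2, "pat": 0.8})`, the lip-reading evidence outweighs the ambiguous audio under equal weights; raising `audio_reliability` shifts the decision back toward the audio agent's preference.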

Index Terms

Computer Science

Information Sciences

Keywords

Computer vision fusion agent architecture multimodal multi-agent