Research Article

Facial Expression and Visual Speech based Person Authentication

by S. Saravanan, S. Palanivel, M. Balasubramanian
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 100 - Number 6
Year of Publication: 2014
10.5120/17527-8097

S. Saravanan, S. Palanivel, and M. Balasubramanian. Facial Expression and Visual Speech based Person Authentication. International Journal of Computer Applications 100, 6 (August 2014), 8-15. DOI=10.5120/17527-8097

@article{ 10.5120/17527-8097,
author = { S. Saravanan, S. Palanivel, M. Balasubramanian },
title = { Facial Expression and Visual Speech based Person Authentication },
journal = { International Journal of Computer Applications },
issue_date = { August 2014 },
volume = { 100 },
number = { 6 },
month = { August },
year = { 2014 },
issn = { 0975-8887 },
pages = { 8-15 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume100/number6/17527-8097/ },
doi = { 10.5120/17527-8097 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A S. Saravanan
%A S. Palanivel
%A M. Balasubramanian
%T Facial Expression and Visual Speech based Person Authentication
%J International Journal of Computer Applications
%@ 0975-8887
%V 100
%N 6
%P 8-15
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Most person authentication systems fall short of perfect accuracy because of variations in face pose and illumination. A further problem in person authentication is the choice of source for feature generation. In this work, videos were recorded with pose variations under normal office lighting. Each person was captured in three situations: with a neutral face, with a smiling expression, and while speaking. A second recording session, identical to the first, was conducted after a time gap. This work employs a robust method to identify video frames that contain a single pose-free face and to extract the required number of frames from the video. The mouth region is then located automatically, and features are generated from it in a way that mitigates the effects of illumination variation. The features from the first session are used to train, and those from the second session to test, a neural network for person authentication. Among several neural network models, the autoassociative neural network is chosen for its ability to capture the distribution of the features. Person authentication performance is compared across features derived from the neutral face, the smiling expression, and visual speech, using the equal error rate as the comparison metric. The outcome of this work is that, when intensity-based feature vectors such as these are used for person authentication, visual speech is the most effective source, the neutral face is next, and the smiling expression performs worst.
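The modalities above are ranked by equal error rate (EER), the operating point where the false accept rate equals the false reject rate. A minimal sketch of how an EER can be estimated from match scores; the score arrays below are hypothetical illustrations, not the paper's data:

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Estimate the EER: the threshold where the false accept rate (FAR)
    equals the false reject rate (FRR). Scores are similarity scores,
    so higher means more likely genuine."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # FAR: fraction of impostor scores at or above the threshold.
    # FRR: fraction of genuine scores below the threshold.
    gaps = [abs(np.mean(impostor >= t) - np.mean(genuine < t))
            for t in thresholds]
    t = thresholds[int(np.argmin(gaps))]
    return (np.mean(impostor >= t) + np.mean(genuine < t)) / 2.0

# Hypothetical similarity scores for illustration only.
genuine = np.array([0.4, 0.6, 0.8])
impostor = np.array([0.2, 0.5, 0.7])
print(equal_error_rate(genuine, impostor))  # → 0.333…
```

A lower EER indicates better separation between genuine and impostor scores, which is how the neutral-face, smile, and visual-speech features are compared in this work.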

Index Terms

Computer Science
Information Sciences

Keywords

Autoassociative neural network, Automatic pose-free face detector, Facial expression, Person authentication, Visual speech.