Research Article

Review Paper on Video Content Analysis into Text Description

Published in December 2015 by Vandana D. Edke, Ramesh M. Kagalkar
National Conference on Advances in Computing
Foundation of Computer Science USA
NCAC2015 - Number 3
December 2015
Authors: Vandana D. Edke, Ramesh M. Kagalkar

Vandana D. Edke and Ramesh M. Kagalkar. Review Paper on Video Content Analysis into Text Description. National Conference on Advances in Computing. NCAC2015, 3 (December 2015), 24-28.

@article{
author = { Vandana D. Edke, Ramesh M. Kagalkar },
title = { Review Paper on Video Content Analysis into Text Description },
journal = { National Conference on Advances in Computing },
issue_date = { December 2015 },
volume = { NCAC2015 },
number = { 3 },
month = { December },
year = { 2015 },
issn = { 0975-8887 },
pages = { 24-28 },
numpages = { 5 },
url = { /proceedings/ncac2015/number3/23374-5040/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Advances in Computing
%A Vandana D. Edke
%A Ramesh M. Kagalkar
%T Review Paper on Video Content Analysis into Text Description
%J National Conference on Advances in Computing
%@ 0975-8887
%V NCAC2015
%N 3
%P 24-28
%D 2015
%I International Journal of Computer Applications
Abstract

This paper reviews approaches for analyzing video content and converting it into textual descriptions. Different researchers have applied different techniques to this problem; the stages addressed here are extracting frames from a video, comparing those frames, pattern matching, and generating the corresponding text description. The paper also provides a discussion, observations, and a comparison of the prior work surveyed. The reviewed approach combines the output of state-of-the-art object and activity detectors with "real-world" knowledge to select the most probable subject-verb-object triplet for describing a video. This knowledge, automatically mined from web-scale text corpora, supplies contextual information to the proposed selection algorithm and leads to a four-fold increase in activity identification. In contrast to the previous methods discussed in the literature survey, this approach can annotate arbitrary videos without requiring the expensive collection and annotation of a similar training video corpus.
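The triplet-selection step described above can be illustrated with a small sketch. All detector confidences and corpus counts below are invented illustrative values (the reviewed papers use real object/activity detectors and web-scale corpus statistics); the log-linear combination with weight `alpha` is likewise an assumption for illustration, not the exact formula from any surveyed paper.

```python
import math
from itertools import product

# Hypothetical vision-detector confidences for one video clip.
subject_scores = {"person": 0.9, "dog": 0.4}
verb_scores = {"ride": 0.5, "pet": 0.45}
object_scores = {"bicycle": 0.8, "ball": 0.3}

# Hypothetical co-occurrence counts mined from a text corpus:
# how often each (subject, verb, object) triplet appears in sentences.
corpus_counts = {
    ("person", "ride", "bicycle"): 120,
    ("dog", "pet", "ball"): 1,
}

def triplet_score(s, v, o, alpha=0.5):
    """Mix detector confidence with corpus plausibility (log-linear)."""
    vision = math.log(subject_scores[s] * verb_scores[v] * object_scores[o])
    language = math.log(1 + corpus_counts.get((s, v, o), 0))
    return alpha * vision + (1 - alpha) * language

# Pick the most probable triplet over all candidate combinations.
best = max(product(subject_scores, verb_scores, object_scores),
           key=lambda t: triplet_score(*t))
print(best)  # -> ('person', 'ride', 'bicycle')
```

Here the corpus term is what the abstract calls "real-world" knowledge: it rescues plausible triplets even when individual detector scores are noisy.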

References
  1. J. K. Aggarwal and S. Park, 'Human motion: Modeling and recognition of actions and interactions', in 3DPVT, (2004).
  2. Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning, 'Generating typed dependency parses from phrase structure parses', in LREC, (2006).
  3. David L. Chen and William B. Dolan, 'Collecting highly parallel data for paraphrase evaluation', in ACL, (2011).
  4. Timothée Cour, Chris Jordan, Eleni Miltsakaki, and Ben Taskar, 'Movie/script: Alignment and parsing of video and text transcription', in ECCV, (2008).
  5. David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda, 'English Gigaword second edition', in LDC, (2005).
  6. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Fei-Fei Li, 'ImageNet: A large-scale hierarchical image database', in CVPR, (2009).
  7. Pedro Domingos and Michael Pazzani, 'On the optimality of the simple Bayesian classifier under zero-one loss', ML, (1997).
  8. Alexei A. Efros, Alexander C. Berg, Greg Mori, and Jitendra Malik, 'Recognizing action at a distance', in ICCV, (2003).
  9. Mark Everingham, Luc J. Van Gool, Christopher K. I. Williams, John M. Winn, and Andrew Zisserman, 'The PASCAL visual object classes (VOC) challenge', IJCV, (2010).
  10. Mark Everingham, Josef Sivic, and Andrew Zisserman, '"Hello! My name is... Buffy" – automatic naming of characters in TV video', in BMVC, (2006).
  11. P. F. Felzenszwalb, R. B. Girshick, and D. McAllester, 'Discriminatively trained deformable part models, release 4', http://people.cs.uchicago.edu/pff/latent-release4/.
  12. Pedro F. Felzenszwalb, Ross B. Girshick, David A. McAllester, and Deva Ramanan, 'Object detection with discriminatively trained part-based models', IEEE Trans. Pattern Anal. Mach. Intell., (2010).
  13. Adrien Gaidon, Marcin Marszalek, and Cordelia Schmid, 'Mining visual actions from movies', in BMVC, (2009).
  14. Abhinav Gupta and Larry S. Davis, 'Objects in action: An approach for combining action understanding and object perception', in CVPR, (2007).
  15. Sonal Gupta and Raymond J. Mooney, 'Using closed captions as supervision for video activity recognition', in AAAI, (2010).
  16. Anthony Hoogs and A. G. Amitha Perera, 'Video activity recognition in the real world', in AAAI, (2008).
  17. Jay J. Jiang and David W. Conrath, 'Semantic similarity based on corpus statistics and lexical taxonomy', CoRR, (1997).
  18. Ivan Laptev, 'On space-time interest points', IJCV, (2005).
  19. Ivan Laptev, Marcin Marszalek, Cordelia Schmid, and Benjamin Rozenfeld, 'Learning realistic human actions from movies', in CVPR, (2008).
  20. Dong Xu and Shih-Fu Chang, 'Video event recognition using kernel methods with multilevel temporal alignment', IEEE Trans. Pattern Anal. Mach. Intell., Vol. 30, No. 11, 2008.
  21. Niveda Krishnamoorthy, Girish Malkarnenkar, Raymond Mooney, Kate Saenko, and Sergio Guadarrama, 'Generating natural-language video descriptions using text-mined knowledge', Association for the Advancement of Artificial Intelligence, 2013.
  22. T. Motwani and R. Mooney, 'Improving video activity recognition using object recognition and text mining', in European Conference on Artificial Intelligence (ECAI), 2012.
  23. Marcus Rohrbach, Wei Qiu, Ivan Titov, Stefan Thater, Manfred Pinkal, and Bernt Schiele, 'Translating video content to natural language', in Proc. IEEE International Conference on Computer Vision (ICCV), Dec. 2013.
  24. Muhammad Usman Ghani Khan and Yoshihiko Gotoh, 'Describing video contents in natural language', in Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data (Hybrid2012), EACL, 2012.
  25. Andrei Barbu, Alexander Bridge, Zachary Burchill, Dan Coroian, Sven Dickinson, Sanja Fidler, Aaron Michaux, and Sam Mussman, 'Video in sentence out', 2012.
  26. D. Ding, F. Metze, S. Rawat, P. Schulam, S. Burger, E. Younesian, L. Bao, M. Christel, and A. Hauptmann, 'Beyond audio and video retrieval: Towards multimedia summarization', in Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, 2012.
  27. M. Lee, A. Hakeem, N. Haering, and S. Zhu, 'SAVE: A framework for semantic annotation of visual events', in IEEE Computer Vision and Pattern Recognition Workshops (CVPR-W), 2008.
  28. J. Sivic, M. Everingham, and A. Zisserman, 'Who are you? – Learning person-specific classifiers from video', in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.
  29. S. Gupta and R. J. Mooney, 'Using closed captions as supervision for video activity recognition', in Proc. 24th AAAI Conf. Artificial Intelligence, pp. 1083-1088, July 2010.
  30. D. Xu and S.-F. Chang, 'Visual event recognition in news video using kernel methods with multi-level temporal alignment', in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2007.
  31. A. Hakeem and M. Shah, 'Learning, detection and representation of multi-agent events in videos', Artificial Intelligence Journal, 2007.
  32. A. Kojima, T. Tamura, and K. Fukunaga, 'Natural language description of human activities from video images based on concept hierarchy of actions', International Journal of Computer Vision, vol. 50, pp. 171-184, 2002.
  33. Douglas Ayers and Mubarak Shah, 'Monitoring human behavior from video taken in an office environment', Image and Vision Computing, 2001.
  34. Ramesh M. Kagalkar, Mrityunjaya V. Latte, and Basavaraj M. Kagalkar, 'Template matching method for localization of suspicious area and classification of benign or malignant tumors area in mammograms', International Journal on Computer Science and Information Technology (IJCECA), ISSN 0974-2034, Vol. 25, Issue 1, 2011.
  35. Ramesh M. Kagalkar, Mrityunjaya V. Latte, and Basavaraj M. Kagalkar, 'An improvement in stopping force level set based image segmentation', International Journal on Computer Science and Information Technology (IJCEIT), ISSN 0974-2034, Vol. 25, Issue 1, pp. 11-18, 2010.
  36. Mrunmayee Patil and Ramesh Kagalkar, 'An automatic approach for translating simple images into text descriptions and speech for visually impaired people', International Journal of Computer Applications (0975-8887), Volume 118, No. 3, May 2015.
  37. Mrunmayee Patil and Ramesh Kagalkar, 'A review on conversion of image to text as well as speech using edge detection and image segmentation', International Journal of Advance Research in Computer Science and Management Studies, Volume 2, Issue 11, November 2014.
Index Terms

Computer Science
Information Sciences

Keywords

Natural Language Generation, Concept Hierarchy, Semantic Primitive, Position/Posture Estimation of Human, Case Frame.