International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 177 - Number 20
Year of Publication: 2019
Authors: Syed Abdussami, Nagendraprasad S., Shivarajakumara K., Sanjeet Singh, A. Thyagarajamurthy
DOI: 10.5120/ijca2019919605
Syed Abdussami, Nagendraprasad S., Shivarajakumara K., Sanjeet Singh, A. Thyagarajamurthy. A Review on Action Recognition and Action Prediction of Human(s) using Deep Learning Approaches. International Journal of Computer Applications. 177, 20 (Nov 2019), 1-5. DOI=10.5120/ijca2019919605
Human action recognition and prediction are among the most active topics in computer vision today and make a formidable contribution to anomaly detection. Many researchers are working in this field, and many new algorithms have been proposed in recent decades. This paper reviews eight such approaches, each drawn from a separate research paper. Compared with their 2D counterparts for still-image recognition, 3D CNNs are considered less efficient owing to limitations such as the high training complexity of spatio-temporal fusion and a large memory cost. In the first paper reviewed, the authors therefore propose MiCT (Mixed Convolution Tube) for videos, which combines 2D and 3D CNNs appropriately and thereby reduces training time. In the second paper, the glimpse sequences in each frame correspond to interest points in the scene that are relevant to the classified activities. In contrast, the third paper presents a novel method that recognizes human action as the evolution of pose estimation maps. The fourth paper presents a model for long-term prediction of pedestrians from on-board observations. The fifth paper attempts to recognize human-rights-violation activities using deep convolutional neural networks. The sixth paper uses a convolutional LSTM to detect violent videos. The seventh paper introduces a Two-Stream Inflated 3D ConvNet (I3D) based on 2D ConvNet inflation. In the eighth paper, a new temporal transition layer (TTL), which models variable temporal convolution kernel depths, is embedded into a 3D CNN to form T3D (Temporal 3D ConvNets); transferring knowledge from a pre-trained 2D CNN to the 3D CNN reduces the number of training samples required.
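To make the 2D ConvNet inflation idea behind I3D concrete, the sketch below shows how a pretrained 2D convolution kernel can be expanded into a 3D kernel by repeating it along a new temporal axis and rescaling by the temporal depth, so that a temporally constant video yields the same activations as the original 2D filter. This is a minimal NumPy illustration of the general technique; the array shapes, function name, and usage are illustrative assumptions, not the reviewed authors' implementation.

```python
import numpy as np

def inflate_2d_kernel(kernel_2d, time_depth):
    """Inflate a 2D conv kernel of shape (out_c, in_c, kH, kW) into a
    3D kernel of shape (out_c, in_c, time_depth, kH, kW).

    The 2D weights are repeated along the new temporal axis and divided
    by time_depth, so a video made of identical frames produces the same
    response as the 2D filter on a single frame.
    """
    kernel_3d = np.repeat(kernel_2d[:, :, np.newaxis, :, :], time_depth, axis=2)
    return kernel_3d / float(time_depth)

# Illustrative usage: inflate a (hypothetical) pretrained 7x7 kernel
# with 3 input channels and 64 output channels to a 7-frame 3D kernel.
k2d = np.random.randn(64, 3, 7, 7).astype(np.float32)  # stand-in for pretrained 2D weights
k3d = inflate_2d_kernel(k2d, time_depth=7)
print(k3d.shape)  # (64, 3, 7, 7, 7)
```

Bootstrapping 3D filters from 2D weights in this way is one form of the 2D-to-3D knowledge transfer the abstract mentions for both I3D and T3D, since it lets the video network start from image-pretrained parameters instead of learning all spatio-temporal filters from scratch.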