International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 76 - Number 1
Year of Publication: 2013
Authors: Evans Miriti, Peter Waiganjo, Andrew Mwaura
DOI: 10.5120/13212-0593
Evans Miriti, Peter Waiganjo, Andrew Mwaura. Comparing Action as Input and Action as Output in a Reinforcement Learning Task. International Journal of Computer Applications. 76, 1 (August 2013), 24-28. DOI=10.5120/13212-0593
Generalization techniques enable a reinforcement learning agent to approximate the value of states it has not yet encountered. They also serve as memory-minimization mechanisms when the state space is so large that representing every state in computer memory is infeasible. Artificial Neural Networks are one commonly employed generalization technique, and various network structures have been proposed in the literature. In this study, two of the proposed structures were implemented in a robot navigation task and their performance compared. The results indicate that a network structure with an output node for each possible action is superior to one in which the selected action is fed as an input to the network and its value is produced by a single output node.
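The following is a minimal sketch (not the authors' implementation; the layer widths, state dimension, number of actions, and the use of PyTorch are all illustrative assumptions) contrasting the two value-network structures the paper compares: one with a separate output node per possible action, and one where the selected action is appended to the state input and a single output node gives its value.

import torch
import torch.nn as nn

STATE_DIM = 8      # assumed size of the state feature vector
NUM_ACTIONS = 4    # assumed number of discrete actions
HIDDEN = 16        # assumed hidden-layer width

# Structure 1: "action as output" -- one output node per possible action.
# A single forward pass yields an estimated value for every action.
action_as_output = nn.Sequential(
    nn.Linear(STATE_DIM, HIDDEN),
    nn.Sigmoid(),
    nn.Linear(HIDDEN, NUM_ACTIONS),
)

# Structure 2: "action as input" -- the selected action (one-hot encoded
# here) is concatenated with the state, and the single output node
# returns the value of that state-action pair.
action_as_input = nn.Sequential(
    nn.Linear(STATE_DIM + NUM_ACTIONS, HIDDEN),
    nn.Sigmoid(),
    nn.Linear(HIDDEN, 1),
)

state = torch.randn(1, STATE_DIM)

# Action-as-output: values for all actions in one pass; pick the greedy one.
q_all = action_as_output(state)          # shape (1, NUM_ACTIONS)
greedy_action = q_all.argmax(dim=1)

# Action-as-input: one forward pass per candidate action.
q_per_action = []
for a in range(NUM_ACTIONS):
    one_hot = torch.zeros(1, NUM_ACTIONS)
    one_hot[0, a] = 1.0
    q_per_action.append(action_as_input(torch.cat([state, one_hot], dim=1)))

Note the practical difference in action selection: the action-as-output structure evaluates all actions with a single forward pass, whereas the action-as-input structure requires one pass per candidate action.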