Dynamic Action Space Handling Method for Reinforcement Learning Models

Woo, Sangchul; Sung, Yunsick

doi:10.3745/JIPS.02.0146

Journal of Information Processing Systems

Volume 16 Issue 5 / Pages 1223-1230 / 2020 / 1976-913X (pISSN) / 2092-805X (eISSN)

Korea Information Processing Society (한국정보처리학회)

Woo, Sangchul (Dept. of Multimedia Engineering, Dongguk University);
Sung, Yunsick (Dept. of Multimedia Engineering, Dongguk University)

Received: 2020.07.28
Accepted: 2020.08.24
Published: 2020.10.31

Abstract

Recently, extensive studies have applied deep learning to reinforcement learning in order to address the state-space problem. If the state-space problem were solved, reinforcement learning would become applicable in various fields. For example, users could learn to dance with a dance-tutorial system by watching and imitating a virtual instructor, and reinforcement learning could be applied so that the instructor performs the optimal dance for the music. In this study, we propose a reinforcement learning method in which the action space is dynamically adjusted. Because actions that are not performed or are unlikely to be optimal are not learned, and no state space is allocated for them, the learning time is shortened and the state space is reduced. In an experiment, the proposed method shows results similar to those of traditional Q-learning even when its state space is reduced to approximately 0.33% of that of Q-learning. Consequently, the proposed method reduces the cost and time required for learning. Traditional Q-learning requires 6 million state spaces for learning 100,000 times, whereas the proposed method requires only 20,000. Because 20,000 state spaces are retrieved instead of 6 million, a higher winning rate can be achieved in a shorter period of time.
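The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a tabular Q-learning agent whose table allocates an entry only when a state-action pair is actually updated, and whose set of available actions is passed in per state. The class name `LazyQTable`, its methods, and the hyperparameter values are all hypothetical. The last lines check the 0.33% figure from the reported counts.

```python
import random


class LazyQTable:
    """Tabular Q-learning with no pre-allocated entries (illustrative only)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.q = {}             # state -> {action: value}, filled on demand
        self.rng = random.Random(seed)

    def value(self, state, action):
        # Unseen pairs read as 0.0 and do not create an entry.
        return self.q.get(state, {}).get(action, 0.0)

    def choose(self, state, legal_actions):
        # The action set is supplied per call, so it can differ by state.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.value(state, a))

    def update(self, state, action, reward, next_state, next_legal_actions):
        # Standard Q-learning update; an empty next action set counts as 0.0.
        best_next = max(
            (self.value(next_state, a) for a in next_legal_actions),
            default=0.0,
        )
        old = self.value(state, action)
        new = old + self.alpha * (reward + self.gamma * best_next - old)
        # Storage is allocated only here, for a pair that was performed.
        self.q.setdefault(state, {})[action] = new

    def size(self):
        # Number of state-action entries currently stored.
        return sum(len(actions) for actions in self.q.values())


# Ratio of the state-space sizes reported in the abstract, in percent.
reduction_percent = 20_000 / 6_000_000 * 100  # roughly 0.33
```

In this sketch, reading a value never grows the table, so actions that are never performed cost no memory. How the paper decides which actions are "unlikely to be optimal" is not stated in the abstract and is not modeled here.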
