Modeling and Simulation on One-vs-One Air Combat with Deep Reinforcement Learning

  • Received : 2018.12.12
  • Accepted : 2019.12.27
  • Published : 2020.03.31

Abstract

The use of artificial intelligence (AI) in engagements has been a key research topic in the defense field over the last decade. Pursuing this application requires a realistic simulation in which an AI engagement agent can be trained on a synthetic, yet realistic, battlefield. This paper is a case study of training an AI agent in an air-to-air dog-fight model with hardware-level realism. In particular, it models the pursuit of an opponent in a gun-only dog-fight, where the AI agent must decide on the pursuit style and its intensity. We developed a realistic hardware simulator and trained the agent with reinforcement learning. The training succeeded, yielding an agent that adopts a lead pursuit with a shortened engagement time and a high reward.
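
The abstract does not give the simulator's equations of motion or reward function, so the following Python sketch is only a minimal toy version of the setup it describes: a planar 1-vs-1 pursuit where the agent chooses a pursuit style (lag, pure, or lead) each step and is rewarded for reaching the gun envelope quickly. All constants, the dynamics, the action encoding, and the reward shaping are assumptions made for illustration, not the authors' model.

```python
import numpy as np

class PursuitEnv:
    """Planar 1-vs-1 gun-only pursuit toy model (illustrative assumptions only)."""

    DT = 0.1            # integration step [s] (assumed)
    OWN_SPEED = 250.0   # own-ship speed [m/s] (assumed)
    TGT_SPEED = 200.0   # target speed [m/s] (assumed)
    MAX_TURN = 0.1      # own-ship turn limit per step [rad] (assumed)
    GUN_RANGE = 500.0   # gun envelope [m] (assumed)

    def reset(self):
        self.own_pos = np.zeros(2)
        self.own_hdg = 0.0
        self.tgt_pos = np.array([2000.0, 500.0])
        self.tgt_hdg = np.pi / 2.0
        return self._obs()

    def _obs(self):
        rel = self.tgt_pos - self.own_pos
        rng = np.linalg.norm(rel)
        bearing = np.arctan2(rel[1], rel[0]) - self.own_hdg
        # range plus bearing encoded as (cos, sin) to avoid angle-wrap issues
        return np.array([rng, np.cos(bearing), np.sin(bearing)])

    def step(self, action):
        # action in [-1, 1]: -1 = lag, 0 = pure, +1 = lead pursuit (assumed encoding)
        rel = self.tgt_pos - self.own_pos
        los = np.arctan2(rel[1], rel[0])        # line-of-sight angle
        cmd_hdg = los + 0.3 * float(action)     # offset the nose ahead of / behind the LOS
        err = np.arctan2(np.sin(cmd_hdg - self.own_hdg),
                         np.cos(cmd_hdg - self.own_hdg))
        self.own_hdg += np.clip(err, -self.MAX_TURN, self.MAX_TURN)
        self.own_pos = self.own_pos + self.DT * self.OWN_SPEED * np.array(
            [np.cos(self.own_hdg), np.sin(self.own_hdg)])
        self.tgt_pos = self.tgt_pos + self.DT * self.TGT_SPEED * np.array(
            [np.cos(self.tgt_hdg), np.sin(self.tgt_hdg)])
        rng = np.linalg.norm(self.tgt_pos - self.own_pos)
        done = rng < self.GUN_RANGE
        # assumed shaping: per-step time penalty plus a terminal gun-envelope bonus,
        # so shorter engagements earn a higher return
        reward = -0.01 + (10.0 if done else 0.0)
        return self._obs(), reward, done

# usage: a random-policy rollout just to exercise the environment
env = PursuitEnv()
obs, total, done, t = env.reset(), 0.0, False, 0
while not done and t < 2000:
    obs, r, done = env.step(np.random.uniform(-1.0, 1.0))
    total += r
    t += 1
print(f"steps={t}, return={total:.2f}")
```

Because the pursuit decision is a continuous action, a deterministic policy-gradient method such as DDPG would be a natural fit for the training step the abstract summarizes; the random policy above is only a placeholder for whatever learner is plugged in.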

Keywords
