Distributed Deep Reinforcement Learning Method Using Profit Sharing for Learning Acceleration

Naoki Kodama, Taku Harada, Kazuteru Miyazaki

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Profit Sharing (PS), a reinforcement learning method that strongly reinforces successful experiences, has been shown to improve learning speed when combined with a deep Q-network (DQN). We expect a further improvement in learning speed by integrating PS-based learning with the Ape-X DQN, which offers state-of-the-art learning speed, rather than with the DQN. However, PS-based learning does not use replay memory, whereas the Ape-X DQN requires it because the exploration of the environment for collecting experiences and the training of the network proceed asynchronously. In this study, we propose Learning-accelerated Ape-X, which integrates the Ape-X DQN with PS-based learning through several improvements, including the use of replay memory. We show through numerical experiments that the proposed method achieves higher scores in Atari 2600 video games in a shorter time than the Ape-X DQN.
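As the abstract notes, PS assigns credit to the state-action pairs along a successful episode, while Ape-X actors write experiences into a shared replay memory that a learner samples asynchronously. Below is a minimal Python sketch of one way PS-style episodic credit could be attached to transitions before they enter a replay memory. The function and class names, the geometric decay parameter, and the plain FIFO buffer (Ape-X itself uses prioritized replay) are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import deque

def profit_sharing_returns(rewards, decay=0.5):
    """Assign PS-style credit: propagate reward backward through the
    episode with a geometric credit-assignment function (an assumption;
    PS commonly uses a geometrically decreasing function)."""
    returns = [0.0] * len(rewards)
    credit = 0.0
    for t in reversed(range(len(rewards))):
        credit = rewards[t] + decay * credit
        returns[t] = credit
    return returns

class ReplayMemory:
    """Minimal FIFO replay memory for illustration only; Ape-X uses a
    prioritized, distributed variant."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # The learner would call this asynchronously with the actors.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def store_episode(memory, episode, decay=0.5):
    """After an actor finishes an episode, compute PS credit and store
    each transition, with its credit attached, in the shared memory."""
    states, actions, rewards, next_states, dones = zip(*episode)
    ps = profit_sharing_returns(list(rewards), decay)
    for s, a, r, s2, d, g in zip(states, actions, rewards, next_states, dones, ps):
        memory.add((s, a, r, s2, d, g))  # g: PS credit, e.g. an extra target or priority signal
```

This sketch only shows the core idea of making PS-style credit compatible with experience that is stored for later, asynchronous sampling; how the PS term actually enters the loss and the priorities of the distributed Ape-X architecture is specified in the paper itself.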

Original language: English
Pages (from-to): 1188-1196
Number of pages: 9
Journal: IEEJ Transactions on Electrical and Electronic Engineering
Volume: 15
Issue number: 8
DOIs
Publication status: Published - 1 Aug 2020

Keywords

  • Atari
  • deep reinforcement learning
  • distributed learning
  • profit sharing
  • q-learning
