A Proposal for Reducing the Number of Trial-and-Error Searches for Deep Q-Networks Combined with Exploitation-Oriented Learning

Naoki Kodama, Kazuteru Miyazaki, Taku Harada

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Deep reinforcement learning, particularly in the form of deep Q-networks (DQNs), has attracted considerable attention. A DQN can attain superhuman performance, but it requires a large number of trial-and-error searches. Two approaches can reduce the number of searches required for learning convergence: multistep learning, and deep Q-networks with profit sharing (DQNwithPS), which optimizes a neural network using both DQN-based updates and profit sharing. Each, however, has its own disadvantage: multistep learning requires proper tuning of the prefetching (lookahead) parameter, and DQNwithPS suffers degraded learning performance because profit sharing does not consider the expected rewards of future episodes. In this paper, we propose a learning-accelerated DQN that combines multistep learning and DQNwithPS so that each method cancels the other's disadvantage: DQNwithPS alleviates the prefetching parameter tuning in multistep learning, while multistep learning mitigates the learning performance degradation in DQNwithPS. With this method, we aim to reduce the number of trial-and-error searches compared to a DQN and DQNwithPS, and to realize a manageable, fast learning method.
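The multistep learning the abstract refers to replaces the one-step TD target with an n-step return. A minimal sketch of that target is below; the function name, the tabular bootstrap value, and the example numbers are illustrative assumptions, not the authors' implementation:

```python
def n_step_target(rewards, bootstrap_q, gamma=0.99):
    """n-step (multistep) TD target: the discounted sum of the next n
    rewards plus a discounted bootstrap from the value estimate n steps
    ahead, e.g. max_a Q(s_{t+n}, a) in a Q-learning-style update."""
    target = bootstrap_q
    # Fold the rewards in from the far end so each step applies one
    # factor of gamma: r_t + gamma * (r_{t+1} + gamma * (... + gamma * Q)).
    for r in reversed(rewards):
        target = r + gamma * target
    return target

# Example: a 3-step lookahead with rewards [1, 0, 2] and a bootstrap
# estimate of 5 for the state reached after three steps.
print(n_step_target([1.0, 0.0, 2.0], 5.0, gamma=0.9))
```

The length of `rewards` is the prefetching parameter whose tuning the paper aims to ease: a longer lookahead propagates reward information faster but adds variance, which is the trade-off the combination with DQNwithPS is designed to soften.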

Original language: English
Title of host publication: Proceedings - 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018
Editors: M. Arif Wani, Moamar Sayed-Mouchaweh, Edwin Lughofer, Joao Gama, Mehmed Kantardzic
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 983-988
Number of pages: 6
ISBN (Electronic): 9781538668047
DOIs
Publication status: Published - 15 Jan 2019
Event: 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018 - Orlando, United States
Duration: 17 Dec 2018 - 20 Dec 2018

Publication series

Name: Proceedings - 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018

Conference

Conference: 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018
Country/Territory: United States
City: Orlando
Period: 17/12/18 - 20/12/18

Keywords

  • Deep Q-network
  • Deep reinforcement learning
  • Exploitation-oriented learning
  • Profit sharing
  • Q-learning
