# Deep Reinforcement Learning — Papers

Many recent advances in AI research stem from breakthroughs in deep reinforcement learning. This is a complex and varied field, but Junhyuk Oh at the University of Michigan has compiled a great list of papers. The list, which originally appeared on GitHub, is sorted by date, with the most recent papers appearing first.

## All Papers

- Deep Reinforcement Learning with an Unbounded Action Space, J. He et al., *arXiv*, 2015.
- Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., *arXiv*, 2015.
- Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., *arXiv*, 2015.
- Generating Text with Deep Reinforcement Learning, H. Guo, *arXiv*, 2015.
- ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., *arXiv*, 2015.
- Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, *arXiv*, 2015.
- Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., *arXiv*, 2015.
- Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., *arXiv*, 2015.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., *arXiv*, 2015.
- Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., *EMNLP*, 2015.
- Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, *arXiv*, 2015.
- Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., *NIPS*, 2015.
- Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., *arXiv*, 2015.
- Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, *arXiv*, 2015.
- Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., *arXiv*, 2015.
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., *arXiv*, 2015.
- Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., *arXiv*, 2015.
- End-to-End Training of Deep Visuomotor Policies, S. Levine et al., *arXiv*, 2015.
- DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., *RSS*, 2015.
- Universal Value Function Approximators, T. Schaul et al., *ICML*, 2015.
- Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., *ICML Workshop*, 2015.
- Trust Region Policy Optimization, J. Schulman et al., *ICML*, 2015.
- Human-level control through deep reinforcement learning, V. Mnih et al., *Nature*, 2015.
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., *NIPS*, 2014.
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., *NIPS Workshop*, 2013.

## Q-learning

- Deep Reinforcement Learning with an Unbounded Action Space, J. He et al., *arXiv*, 2015.
- Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., *arXiv*, 2015.
- Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., *arXiv*, 2015.
- Generating Text with Deep Reinforcement Learning, H. Guo, *arXiv*, 2015.
- Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., *arXiv*, 2015.
- Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., *arXiv*, 2015.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., *arXiv*, 2015.
- Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., *EMNLP*, 2015.
- Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., *NIPS*, 2015.
- Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, *arXiv*, 2015.
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., *arXiv*, 2015.
- Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., *ICML Workshop*, 2015.
- Human-level control through deep reinforcement learning, V. Mnih et al., *Nature*, 2015.
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., *NIPS Workshop*, 2013.

## Policy Gradient

- ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., *arXiv*, 2015.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., *arXiv*, 2015.
- End-to-End Training of Deep Visuomotor Policies, S. Levine et al., *arXiv*, 2015.
- Trust Region Policy Optimization, J. Schulman et al., *ICML*, 2015.

## Discrete Control

- Deep Reinforcement Learning with an Unbounded Action Space, J. He et al., *arXiv*, 2015.
- Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., *arXiv*, 2015.
- Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., *arXiv*, 2015.
- Generating Text with Deep Reinforcement Learning, H. Guo, *arXiv*, 2015.
- ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., *arXiv*, 2015.
- Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, *arXiv*, 2015.
- Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., *arXiv*, 2015.
- Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., *arXiv*, 2015.
- Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., *EMNLP*, 2015.
- Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, *arXiv*, 2015.
- Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., *NIPS*, 2015.
- Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, *arXiv*, 2015.
- Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., *arXiv*, 2015.
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., *arXiv*, 2015.
- Universal Value Function Approximators, T. Schaul et al., *ICML*, 2015.
- Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., *ICML Workshop*, 2015.
- Human-level control through deep reinforcement learning, V. Mnih et al., *Nature*, 2015.
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., *NIPS*, 2014.
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., *NIPS Workshop*, 2013.

## Continuous Control

- Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, *arXiv*, 2015.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., *arXiv*, 2015.
- Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., *arXiv*, 2015.
- End-to-End Training of Deep Visuomotor Policies, S. Levine et al., *arXiv*, 2015.
- DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., *RSS*, 2015.
- Trust Region Policy Optimization, J. Schulman et al., *ICML*, 2015.

## Text Domain

- Deep Reinforcement Learning with an Unbounded Action Space, J. He et al., *arXiv*, 2015.
- Generating Text with Deep Reinforcement Learning, H. Guo, *arXiv*, 2015.
- Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., *EMNLP*, 2015.
- Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., *arXiv*, 2015.

## Visual Domain

- Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., *arXiv*, 2015.
- Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., *arXiv*, 2015.
- Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, *arXiv*, 2015.
- Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., *arXiv*, 2015.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., *arXiv*, 2015.
- Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, *arXiv*, 2015.
- Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., *NIPS*, 2015.
- Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, *arXiv*, 2015.
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., *arXiv*, 2015.
- End-to-End Training of Deep Visuomotor Policies, S. Levine et al., *arXiv*, 2015.
- Universal Value Function Approximators, T. Schaul et al., *ICML*, 2015.
- Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., *ICML Workshop*, 2015.
- Trust Region Policy Optimization, J. Schulman et al., *ICML*, 2015.
- Human-level control through deep reinforcement learning, V. Mnih et al., *Nature*, 2015.
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., *NIPS*, 2014.
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., *NIPS Workshop*, 2013.

## Robotics

- Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., *arXiv*, 2015.
- Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., *arXiv*, 2015.
- End-to-End Training of Deep Visuomotor Policies, S. Levine et al., *arXiv*, 2015.
- DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., *RSS*, 2015.
- Trust Region Policy Optimization, J. Schulman et al., *ICML*, 2015.

## Games

- Deep Reinforcement Learning with an Unbounded Action Space, J. He et al., *arXiv*, 2015.
- Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., *arXiv*, 2015.
- Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, *arXiv*, 2015.
- Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., *arXiv*, 2015.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., *arXiv*, 2015.
- Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., *EMNLP*, 2015.
- Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, *arXiv*, 2015.
- Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., *NIPS*, 2015.
- Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, *arXiv*, 2015.
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., *arXiv*, 2015.
- Universal Value Function Approximators, T. Schaul et al., *ICML*, 2015.
- Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., *ICML Workshop*, 2015.
- Trust Region Policy Optimization, J. Schulman et al., *ICML*, 2015.
- Human-level control through deep reinforcement learning, V. Mnih et al., *Nature*, 2015.
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., *NIPS*, 2014.
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., *NIPS Workshop*, 2013.

## Monte-Carlo Tree Search

- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., *NIPS*, 2014.

## Inverse Reinforcement Learning

- Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., *arXiv*, 2015.

## Transfer Learning

- ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., *arXiv*, 2015.

## Improving Exploration

- Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., *NIPS*, 2015.
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., *arXiv*, 2015.

Josh.ai is an artificial intelligence agent for your home. If you’re interested in learning more, visit us at **https://josh.ai**.