Deep Q-learning (DQN) – Experience Replay

To perform experience replay, we store the agent’s experiences e_t = (s_t, a_t, r_t, s_{t+1}) in a replay memory.

Learning updates are then applied to a random sample of these stored transitions, rather than to only the most recent transition.

* This removes correlations in the observation sequence and smooths changes in the data distribution.

* The iterative update adjusts Q toward target values that are only periodically updated, further reducing correlations with the target.
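The storage-and-sampling mechanism above can be sketched as a simple replay buffer. This is a minimal illustration, not DQN's full training loop; the class and method names (`ReplayBuffer`, `store`, `sample`) are my own choices, not from the source:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size memory of transitions e_t = (s_t, a_t, r_t, s_{t+1})."""

    def __init__(self, capacity):
        # deque with maxlen discards the oldest transition once full
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal correlations
        # present in the raw observation sequence
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

During training, each minibatch drawn via `sample` mixes transitions from many different points in the agent's history, which is what smooths the data distribution the updates see.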
