Q-learning – Model-free reinforcement learning algorithm

On: July 6, 2019

In: AI, Algorithms, Deep Machine Learning, Reinforcement Learning

Tagged: action, agent, circumstances, expected, maximizes, optimal, policy, reward, sense, state, steps, successive

Q-learning learns a policy that tells an agent which action to take under which circumstances.

Q-learning learns a policy that is optimal in the sense that it maximizes the expected value of the total reward over any and all successive steps, starting from the current state.
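To make the idea of maximizing expected total reward concrete, here is a minimal sketch of tabular Q-learning. The environment is a hypothetical one invented for illustration, not something from this post: a short corridor where the agent starts at the left end and receives a reward of 1 on reaching the right end. The names `q_learning` and `greedy_policy`, and the values chosen for `alpha` (learning rate), `gamma` (discount factor), and `epsilon` (exploration rate), are likewise assumptions.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Actions are 0 (left) and 1 (right).
    Entering the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Deterministic transition of the toy environment.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update:
            # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            # The terminal state has value 0 by definition.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (
                reward + gamma * best_next - q[state][action]
            )
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` should return action 1 (right) for every non-terminal state, since moving right is the shortest route to the reward. The update uses only observed transitions and rewards and never consults a model of the environment, which is what "model-free" in the title refers to.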



יאיר שנער

Yair Shinar