Q-Learning

In-depth explanation

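Q-learning learns a Q-function, Q(s, a), that estimates the expected cumulative (typically discounted) future reward for taking action a in state s and acting optimally afterward. It is model-free: it needs no model of the environment's transition or reward dynamics. It is also off-policy: it can learn about the greedy policy from experience generated by a different behavior policy, such as an exploratory one. Deep Q-Networks (DQN) combine Q-learning with neural networks to handle large state spaces, and reached human-level performance on many Atari games.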
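To make the update rule concrete, here is a minimal sketch of tabular Q-learning. The environment is a hypothetical one-dimensional corridor invented for illustration (it is not from any library), and the function name `q_learning` and the values of `alpha`, `gamma`, and `epsilon` are assumptions. The key line is the update toward `reward + gamma * max(Q[next_state])`, which uses the best next action regardless of what the exploratory behavior policy does next.

```python
import random


def q_learning(n_states=5, n_episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Reaching the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(n_episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy behavior policy; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Hypothetical deterministic dynamics (wall at state 0).
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Off-policy target: value of the best next action,
            # or 0 if the episode has ended.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            td_error = reward + gamma * best_next - q[state][action]
            q[state][action] += alpha * td_error

            state = next_state
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(q_learning()):
        print(f"state {s}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

Note that the agent never consults a model of the corridor when updating; it only uses the observed transition (state, action, reward, next state), which is what "model-free" means here.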

Examples

Atari game playing

Resource allocation

Related terms

Reinforcement Learning

More in Reinforcement Learning

Policy Gradient

Policy Gradient methods are a class of algorithms in reinforcement learning that optimize the policy directly by using the gradient of the expected reward with respect to the policy parameters.
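As a rough illustration of optimizing a policy directly, the sketch below applies a REINFORCE-style update to a softmax policy on a hypothetical two-armed bandit. The arm rewards, learning rate, and function names (`softmax`, `reinforce_bandit`) are assumptions chosen for the example. Each step nudges the policy parameters along `reward * grad log pi(action)`, which is a sample estimate of the gradient of expected reward.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_rewards=(0.2, 1.0), n_steps=2000, lr=0.1, seed=0):
    """REINFORCE-style updates on a hypothetical bandit with fixed rewards."""
    rng = random.Random(seed)
    theta = [0.0 for _ in arm_rewards]  # policy parameters, one per arm
    arms = list(range(len(theta)))

    for _ in range(n_steps):
        probs = softmax(theta)
        action = rng.choices(arms, weights=probs)[0]
        reward = arm_rewards[action]

        # For a softmax policy, d log pi(action) / d theta[k]
        # equals 1[k == action] - probs[k].
        for k in arms:
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * reward * grad_log

    return softmax(theta)


if __name__ == "__main__":
    print(reinforce_bandit())  # probability mass should shift to arm 1
```

Unlike Q-learning, no value table is learned here; the parameters of the policy itself are adjusted.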

Reinforcement Learning

A type of machine learning in which an agent learns to make decisions by taking actions and receiving rewards or penalties.
