What is reinforcement learning?
Reinforcement Learning (RL) is a subfield of Machine Learning. The idea behind reinforcement learning is that an agent (an AI) learns from the environment by interacting with it (through trial-and-error) and receiving rewards (positive or negative) as feedback for performing actions. Thus, the learning process in RL is similar to that of humans and animals.
What are the essential components of RL?
The essential components of RL, and how they work together, are easiest to explain with an example. In our example, a robot has to walk from a starting point A to a destination point B.
Because of the trial-and-error approach, the RL process resembles a loop (see figure): the agent performs an action, the environment responds with a reward or a punishment, and the agent adjusts its behavior. By repeating this loop many times, the agent (the program) learns to steer the robot from starting point A to destination point B more and more efficiently.
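The loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method behind any particular robot: it assumes the path from A to B is a corridor of six positions, that the robot can only step left or right, and that learning is done with tabular Q-learning (one common RL algorithm, chosen here for brevity). The names `step`, `train`, and `greedy_path`, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions made for this example.

```python
import random

N_POSITIONS = 6                    # assumed corridor: position 0 is A, position 5 is B
START, GOAL = 0, N_POSITIONS - 1
ACTIONS = (-1, +1)                 # step left or step right


def step(position, action):
    """Environment: apply the action and return (new_position, reward, done)."""
    new_position = min(max(position + action, 0), GOAL)
    if new_position == GOAL:
        return new_position, 10.0, True    # reward for arriving at B
    return new_position, -1.0, False       # small punishment for every extra step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn a table of action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_POSITIONS) for a in ACTIONS}
    for _ in range(episodes):
        position, done = START, False
        while not done:
            # Trial and error: mostly pick the best known action, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(position, a)])
            new_position, reward, done = step(position, action)
            # Feedback: move the estimate toward reward + discounted future value.
            best_next = 0.0 if done else max(q[(new_position, a)] for a in ACTIONS)
            q[(position, action)] += alpha * (
                reward + gamma * best_next - q[(position, action)]
            )
            position = new_position
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned values from A without exploring and record the path."""
    position, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(position, a)])
        position, _, done = step(position, action)
        path.append(position)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

After training, the greedy path should lead straight from A to B, because every detour costs an extra punishment of -1. A real walking robot has a far larger state and action space, so a table like `q` would normally be replaced by a function approximator such as a neural network.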
Probably the best-known application of reinforcement learning is DeepMind's AlphaGo program, which in 2015 became the first computer program to beat a professional Go player. Go is a complex strategy board game for two players that originated in China. AlphaZero, the improved version of AlphaGo, was not only able to defeat its predecessor AlphaGo but also generalized to other games. For example, AlphaZero beat Stockfish, the strongest chess engine at that time.
Sources (translated): Towards Data Science and Medium