
What is Machine Reinforcement Learning? What is the principle?

2022-08-11 03:51:00 【Program Yuan Keke】

Reinforcement learning (RL), also called evaluative learning, is an important machine learning method. It has many applications in fields such as intelligent robot control and analysis and prediction.

So what is reinforcement learning?

Reinforcement learning is an intelligent system's learning of a mapping from environment to behavior, with the aim of maximizing the value of a reward (reinforcement) signal function. It differs from supervised learning in connectionist learning mainly in the teacher signal. In reinforcement learning, the reinforcement signal provided by the environment is an evaluation (usually a scalar) of how good the generated action was, rather than an instruction telling the reinforcement learning system (RLS) how to produce the correct action. Because the external environment provides so little information, the RLS must learn from its own experience. In this way, the RLS acquires knowledge in an action-evaluation environment and improves its action plan to suit the environment.

In layman's terms, think of a child who is confused while learning. If the teacher finds that the child's method or thinking is correct, the teacher gives positive feedback (reward or encouragement); otherwise, the teacher gives negative feedback (a lesson or punishment). This draws out the child's potential and strengthens the child's ability to learn independently, so that the child relies on his or her own effort to keep learning and exploring, and eventually finds the correct method or idea for adapting to a changing external environment.

Reinforcement learning is different from traditional machine learning. It does not receive a label immediately; it receives only feedback (a reward or a penalty). In that sense, reinforcement learning can be described as a kind of supervised learning with delayed labels. Reinforcement learning developed from theories such as animal learning and parameter-perturbation adaptive control.

Principles of reinforcement learning:

If a behavioral strategy of the agent leads to a positive reward (reinforcement signal) from the environment, the agent's tendency to use that strategy in the future is strengthened. The agent's goal is to discover, in each discrete state, the optimal policy that maximizes the expected sum of discounted rewards.
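The "sum of discounted rewards" above can be made concrete with a small calculation. The following is a minimal sketch, not something taken from the article: the function name `discounted_return`, the discount factor `gamma`, and the sample reward list are all assumptions chosen for illustration. Each reward is weighted by `gamma` raised to the number of steps the agent has to wait for it, so later rewards count for less.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum(gamma**t * rewards[t]) over all time steps t.

    `gamma` (the discount factor) is an illustrative assumption;
    the article does not give a value.
    """
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Rewards received at steps 0, 1, 2 of one hypothetical episode.
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # 1.0 + 0.5*0.0 + 0.25*2.0 = 1.5
```

The "expected" part of the objective would then be an average of this quantity over many episodes, which this sketch does not attempt.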

Reinforcement learning treats learning as a process of trial and evaluation. The agent selects an action and applies it to the environment. After the environment accepts the action, its state changes, and it generates a reinforcement signal (reward or punishment) that is fed back to the agent. The agent then selects the next action according to the reinforcement signal and the current state of the environment. The selection principle is to increase the probability of receiving positive reinforcement (reward). The selected action affects not only the immediate reinforcement value, but also the state of the environment at the next moment and the final reinforcement value.
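The loop described above (select an action, let the environment change state, receive a reinforcement signal, select again) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the five-cell corridor environment, the `step` and `train` functions, and all parameter values are invented here, and tabular Q-learning is used as one common way to strengthen rewarded actions. The article itself does not name a specific algorithm.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)    # move one cell left or one cell right


def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reinforcement signal: a single scalar
    return next_state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; q[s][a] estimates the discounted reward of a in s."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Mostly pick the action with the higher estimate; explore at
            # random with probability epsilon or when the estimates tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # The action affects the immediate reward and the next state,
            # so the target includes the discounted value of that state.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


q = train()
# Greedy action in each non-goal cell after training.
policy = [ACTIONS[0] if left > right else ACTIONS[1] for left, right in q[:-1]]
print(policy)  # [1, 1, 1, 1]: move right in every cell
```

Only the final step into the goal cell is rewarded, yet the earlier "move right" actions are strengthened as well, which is the sense in which an action influences the final reinforcement value and not just the immediate one.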

If the gradient of the reinforcement signal R with respect to the action A (∂R/∂A) were known, a supervised learning algorithm could be used directly. However, because the relationship between the reinforcement signal R and the action A generated by the agent has no explicit functional form, this gradient cannot be obtained. Therefore, a reinforcement learning system needs some kind of random unit, with which the agent searches the space of possible actions and finds the correct action.
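The role of the random unit can be illustrated with a simple random search. This is a rough sketch under stated assumptions, not a method given in the article: the quadratic `reward` function is an invented stand-in for an environment whose formula the agent cannot see, and `random_search`, `noise`, and `steps` are hypothetical names and values. The agent never computes a gradient; it perturbs its current action at random and keeps the perturbation only when the scalar reward improves.

```python
import random


def reward(action):
    """Black-box environment (invented for illustration).

    The agent observes only the returned scalar, never this formula,
    so it has no access to the gradient of reward w.r.t. action.
    """
    return -(action - 3.0) ** 2


def random_search(steps=2000, noise=0.5, seed=0):
    """Search the action space by random perturbation, keeping improvements."""
    rng = random.Random(seed)
    best_action = 0.0
    best_reward = reward(best_action)
    for _ in range(steps):
        # The "random unit": propose a nearby action drawn at random.
        candidate = best_action + rng.gauss(0.0, noise)
        r = reward(candidate)
        if r > best_reward:          # evaluation only, no gradient
            best_action, best_reward = candidate, r
    return best_action


print(random_search())  # expected to land close to 3.0, the best action here
```

Practical reinforcement learning systems use more refined exploration than this, but the principle is the same: randomness supplies the candidate actions, and the reinforcement signal decides which ones to keep.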
