Deep Learning - Principles of Neural Networks 2
2022-08-09 06:01:00 【qq_43479892】
Principles of Neural Networks
A neural network can be divided into an input layer, hidden layers, and an output layer, and its training into forward propagation and backward propagation. Here we fix conventions for how data is represented at each layer of the network, demonstrated with a two-layer neural network model.
We define A[0] as the input-layer data, A[1] as the hidden-layer data, and A[2] as the output-layer data. The input A[0] is usually a matrix of shape (m, n), where m is the number of features and n is the number of samples. For a layer L with a units, whose previous layer L-1 has b units, we can derive that W[L] is a matrix of shape (a, b), b[L] is a matrix of shape (a, 1) (broadcast across the samples), the input A[L-1] is a matrix of shape (b, n), and Z[L] and A[L] are matrices of shape (a, n).
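As a quick illustration of these shape conventions, here is a minimal NumPy sketch (the sizes m, n, h below are hypothetical, chosen only for the example):

```python
import numpy as np

# Minimal sketch of the shape conventions above (illustrative sizes).
# n samples, each with m features; a hidden layer of h units.
m, n, h = 4, 10, 5

A0 = np.random.randn(m, n)  # input layer A[0]: (features, samples)
W1 = np.random.randn(h, m)  # W[1]: (units in layer 1, units in layer 0)
b1 = np.zeros((h, 1))       # b[1]: (a, 1), broadcast across the n samples
W2 = np.random.randn(1, h)  # W[2]: (units in layer 2, units in layer 1)
b2 = np.zeros((1, 1))       # b[2]

Z1 = W1 @ A0 + b1           # Z[1]: (h, n)
print(Z1.shape)             # (5, 10)
```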
Forward propagation
Forward propagation is relatively simple: each layer can be regarded as a separate logistic regression, except that the activation function may differ from layer to layer; usually ReLU or tanh is used as the activation function.
The specific calculation process is as follows:
$$Z^{[i]} = W^{[i]} A^{[i-1]} + b^{[i]}$$
$$A^{[i]} = \sigma(Z^{[i]})$$
$$Z^{[i+1]} = W^{[i+1]} A^{[i]} + b^{[i+1]}$$
$$A^{[i+1]} = \sigma(Z^{[i+1]})$$
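A minimal NumPy sketch of this forward pass for the two-layer case, assuming ReLU in the hidden layer and sigmoid at the output (the `params`/`cache` dictionary layout is an illustrative convention, not from the original post):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(A0, params):
    """Forward pass for a two-layer network, following the equations above.
    `params` holds W1, b1, W2, b2 with the shapes defined earlier."""
    Z1 = params["W1"] @ A0 + params["b1"]  # Z[1] = W[1] A[0] + b[1]
    A1 = relu(Z1)                          # hidden layer activation
    Z2 = params["W2"] @ A1 + params["b2"]  # Z[2] = W[2] A[1] + b[2]
    A2 = sigmoid(Z2)                       # output layer activation
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache
```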
Backpropagation

The essence of backpropagation is to obtain the partial derivatives of W and b for each layer via the chain rule, and then use these gradients to update the parameters so that the cost function moves toward a minimum. Assuming the neural network has L layers and the activation function of the last layer is the sigmoid:
$$A^{[L]} = \mathrm{sigmoid}(Z^{[L]}) = \frac{1}{1 + e^{-Z^{[L]}}}$$

the cost function can be written as:
$$J(W, b) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(A^{[L]}_i) + (1 - y_i) \log(1 - A^{[L]}_i) \right]$$

(summing over the n samples, per the conventions above). So for the last layer we can calculate:
$$\frac{dJ}{dA^{[L]}} = -\frac{1}{n} \left[ \frac{Y}{A^{[L]}} - \frac{1 - Y}{1 - A^{[L]}} \right]$$

What is computed here is element-wise division, yielding a matrix of the same shape.
Next, calculate the derivative with respect to Z[L]. Assuming the activation function is σ(z):

$$\frac{dJ}{dZ^{[L]}} = \frac{dJ}{dA^{[L]}} \cdot \sigma'(Z^{[L]})$$

where σ′ is the derivative of the activation function evaluated at Z[L]. Since Z[L] = W[L] A[L-1] + b[L], dJ/dW[L] and dJ/db[L] can be calculated according to the chain rule:
$$\frac{dJ}{dW^{[L]}} = \frac{dJ}{dZ^{[L]}} \cdot \frac{dZ^{[L]}}{dW^{[L]}} = \frac{dJ}{dZ^{[L]}} \cdot (A^{[L-1]})^{T}$$
$$\frac{dJ}{db^{[L]}} = \frac{dJ}{dZ^{[L]}}$$

(for db[L], summing over the sample axis). dJ/dA[L-1] can also be calculated according to the chain rule:
$$\frac{dJ}{dA^{[L-1]}} = (W^{[L]})^{T} \cdot \frac{dJ}{dZ^{[L]}}$$

Then loop backward layer by layer to obtain the gradients of W and b for each layer, which are used to update the parameters.
The overall backpropagation process is roughly as shown in the formulas above; individual symbols may be imprecise, but the idea is as described.
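To make the loop concrete, here is a NumPy sketch of the backward pass for the same two-layer network, paired with a plain gradient-descent update. It uses the standard simplification that, for a sigmoid output with the cross-entropy cost above, dJ/dZ[2] reduces to (A2 − Y)/n; transposes and sums over the sample axis are written out explicitly where the post's notation is abbreviated:

```python
import numpy as np

def backward(A0, Y, params, cache):
    """Backward pass matching the derivation above (a sketch, using the
    params/cache layout from the forward-pass example)."""
    n = A0.shape[1]                       # number of samples
    A1, A2 = cache["A1"], cache["A2"]

    dZ2 = (A2 - Y) / n                    # dJ/dZ[2] for sigmoid + cross-entropy
    dW2 = dZ2 @ A1.T                      # dJ/dW[2] = dJ/dZ[2] @ A[1].T
    db2 = dZ2.sum(axis=1, keepdims=True)  # dJ/db[2], summed over samples
    dA1 = params["W2"].T @ dZ2            # dJ/dA[1] = W[2].T @ dJ/dZ[2]
    dZ1 = dA1 * (cache["Z1"] > 0)         # ReLU derivative: 1 where z > 0
    dW1 = dZ1 @ A0.T
    db1 = dZ1.sum(axis=1, keepdims=True)
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}

def update(params, grads, lr=0.01):
    # One gradient-descent step: theta <- theta - lr * dJ/dtheta
    for k in ("W1", "b1", "W2", "b2"):
        params[k] -= lr * grads["d" + k]
    return params
```

Repeating forward, backward, and update in a loop is the extremum-seeking process the post describes: each iteration nudges every W and b in the direction that decreases J.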