
1.1 PyTorch and neural networks

2022-04-23 07:21:00 【sunshinecxm_ BJTU】

Chapter 1: PyTorch and neural networks

1.1 Introduction to PyTorch

1.1.2 PyTorch tensors

1.1.3 PyTorch automatic differentiation

1.1.4 Computation graph

Automatic gradient calculation seems magical, but it is not magic.
The principle behind it is worth understanding in depth, because this knowledge will help us build larger networks.
Take a look at this very simple network. It is not even a neural network, just a series of calculations.
[Figure: a simple network in which $x$ feeds $y$, and $y$ feeds $z$]
In the diagram above, the input $x$ is used to calculate $y$, and $y$ is then used to calculate the output $z$.
Suppose $y$ and $z$ are calculated as follows:
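
The defining equations are not reproduced in this text. One choice that is consistent with the gradient $4x$ obtained later in this section, assumed here purely for illustration, is:

```latex
y = x^2, \qquad z = 2y + 3
```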

If we want to know how the output $z$ changes with $x$, we need to know the gradient $dz/dx$. Let's work it out step by step.
[Figure: step-by-step derivation of $dz/dx$]
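
Under the assumed definitions $y = x^2$ and $z = 2y + 3$ (an illustrative choice that matches the result $4x$, not something this text confirms), the steps can be sketched as:

```latex
\begin{aligned}
\frac{dz}{dx} &= \frac{dz}{dy} \cdot \frac{dy}{dx} \\
              &= 2 \cdot 2x \\
              &= 4x
\end{aligned}
```
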
The first line is the chain rule of calculus, which is very important to us.
We have just worked out that the rate at which $z$ changes with $x$ can be expressed as $4x$. If $x = 3.5$, then $dz/dx = 4 \times 3.5 = 14$.

Once $y$ is defined in terms of $x$, and $z$ is defined in terms of $y$, PyTorch links these tensors into a graph that shows how they are connected. This graph is called the computation graph.

In our case, the computation graph might look like this:
[Figure: computation graph linking $x$, $y$ and $z$, with backward arrows for the gradients]
We can see how $y$ is calculated from $x$, and how $z$ is calculated from $y$. In addition, PyTorch adds backward arrows that express how $y$ changes with $x$ and how $z$ changes with $y$. These are the gradients, which are used to update the neural network during training. PyTorch does the calculus for us, so we do not need to work it out ourselves.

To work out how $z$ changes with $x$, we combine all the gradients along the path from $z$ back through $y$ to $x$. This is the chain rule of calculus.
At this point PyTorch has built only the forward computation graph. We need to call the backward() function to make PyTorch calculate the backward gradients.

The gradient $dz/dx$ is stored inside the tensor $x$ as x.grad.

Note that the gradient stored inside the tensor $x$ is taken with respect to $z$. This is because we asked PyTorch to work backward from $z$ by calling z.backward(). Therefore x.grad is $dz/dx$, not $dy/dx$.
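
The steps just described can be sketched in PyTorch as follows. The definitions `y = x * x` and `z = 2 * y + 3` are an assumption, chosen because they reproduce the gradient $4x$ worked out earlier; this text does not state them.

```python
import torch

# Input tensor; requires_grad=True asks PyTorch to track operations on it.
x = torch.tensor(3.5, requires_grad=True)

# Assumed relationships (illustrative, not given in the text).
y = x * x        # y = x^2
z = 2 * y + 3    # z = 2y + 3

# Only the forward graph exists so far; backward() computes the gradients
# by working back from z.
z.backward()

# x.grad holds dz/dx evaluated at x = 3.5, that is 4 * 3.5.
print(x.grad)  # tensor(14.)
```

Because `backward()` was called on `z`, the stored value is $dz/dx$. Calling `y.backward()` instead on a freshly built graph would store $dy/dx$.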

Most useful neural networks contain many nodes, each with several links coming into it and several links going out of it. Let's look at a simple example in which nodes have multiple incoming links.
[Figure: network in which inputs $a$ and $b$ both feed $x$ and $y$, which in turn feed the output $z$]
As we can see, the inputs $a$ and $b$ both influence $x$ and $y$, and the output $z$ is calculated from $x$ and $y$.
The relationships between these nodes are as follows.
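
The relationships themselves are not reproduced in this text. The following set is assumed for illustration: its $a$ terms match the path gradients $2 \times 2$ and $3 \times 10a$ used later in this section, while the $b$ terms are an arbitrary choice.

```latex
x = 2a + 3b, \qquad y = 5a^2 + 3b^3, \qquad z = 2x + 3y
```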

We calculate the gradient in the same way .
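
As a sketch, assuming the illustrative relationships $x = 2a + 3b$, $y = 5a^2 + 3b^3$ and $z = 2x + 3y$ (only the $a$ terms are implied by this text), the gradient on each link would be:

```latex
\frac{\partial x}{\partial a} = 2, \quad
\frac{\partial x}{\partial b} = 3, \quad
\frac{\partial y}{\partial a} = 10a, \quad
\frac{\partial y}{\partial b} = 9b^2, \quad
\frac{\partial z}{\partial x} = 2, \quad
\frac{\partial z}{\partial y} = 3
```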

Next, we add this information to the computation graph.
[Figure: computation graph for $a$, $b$, $x$, $y$ and $z$, annotated with the gradients]
Now we can easily work out the gradient $dz/da$ by following the path from $z$ back to $a$. In fact, there are two paths from $z$ to $a$, one through $x$ and the other through $y$, and we simply add the expressions for the two paths. This makes sense, because both paths from $a$ to $z$ affect the value of $z$. It also agrees with the $dz/da$ we would obtain from the chain rule of calculus.

$dz/da = dz/dx \times dx/da + dz/dy \times dy/da$
The first path goes through $x$ and contributes $2 \times 2$; the second path goes through $y$ and contributes $3 \times 10a$. Therefore the rate at which $z$ changes with $a$ is $4 + 30a$.
If $a$ is 2, then $dz/da$ is $4 + 30 \times 2 = 64$.
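
The value 64 can be sanity-checked numerically without any calculus. The sketch below nudges $a$ slightly and measures how much $z$ moves. The functions `z_of` and `central_difference` are hypothetical helpers, and the relationships $x = 2a + 3b$, $y = 5a^2 + 3b^3$, $z = 2x + 3y$ and the value $b = 1$ are assumptions (only the $a$ terms are implied by the path gradients above).

```python
def z_of(a, b):
    """Assumed network: x = 2a + 3b, y = 5a^2 + 3b^3, z = 2x + 3y."""
    x = 2 * a + 3 * b
    y = 5 * a**2 + 3 * b**3
    return 2 * x + 3 * y


def central_difference(f, a, b, h=1e-5):
    """Approximate dz/da by nudging a in both directions while holding b fixed."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)


analytic = 4 + 30 * 2.0                              # the formula 4 + 30a at a = 2
numeric = central_difference(z_of, a=2.0, b=1.0)     # b = 1.0 is an arbitrary choice

print(analytic, numeric)
```

The two numbers should agree to several decimal places. The choice of $b$ does not matter here, because $dz/da$ does not depend on $b$ for these relationships.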

Let's check whether PyTorch gives the same value. First, we define the relationships that PyTorch needs in order to build the computation graph.

Next, we trigger the gradient calculation and query the value stored inside the tensor $a$.
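
A minimal sketch of both steps, assuming the relationships $x = 2a + 3b$, $y = 5a^2 + 3b^3$ and $z = 2x + 3y$. Only the $a$ terms are implied by the path gradients in this section; the $b$ terms and the value $b = 1$ are arbitrary choices made for illustration.

```python
import torch

# Inputs; requires_grad=True makes PyTorch track operations on them.
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)  # assumed value, not from the text

# Assumed relationships; defining them builds the forward computation graph.
x = 2 * a + 3 * b
y = 5 * a * a + 3 * b * b * b
z = 2 * x + 3 * y

# Trigger the backward gradient calculation, starting from z.
z.backward()

# a.grad sums both paths: (2 * 2) through x plus (3 * 10a) through y.
print(a.grad)  # tensor(64.)
```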

Useful neural networks are usually much larger than this small one. However, PyTorch builds the computation graph and calculates the gradients backward along the paths in exactly the same way.

1.1.5 Key points

The Colab service lets us run Python code on Google's servers. Colab uses Python notebooks, so all we need is a web browser.
PyTorch is a leading Python machine learning framework. Like numpy, it lets us work with arrays of numbers. It also provides a rich set of tools and functions that make machine learning easier.
In PyTorch, the basic unit of data is the tensor. A tensor can be a multidimensional array, a simple two-dimensional matrix, a one-dimensional list, or a single value.
The main feature of PyTorch is that it can automatically calculate the gradient of a function. Calculating gradients is the key to training neural networks. To do this, PyTorch needs to build a computation graph, which contains multiple tensors and the relationships between them. In code, this happens automatically whenever we define one tensor in terms of another.
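
The last two points can be illustrated with a short sketch. The variable names and values are illustrative only; `dim()`, `requires_grad` and `grad_fn` are standard attributes of PyTorch tensors.

```python
import torch

# Tensors of different shapes.
scalar = torch.tensor(3.5)                        # a single value
vector = torch.tensor([1.0, 2.0, 3.0])            # a one-dimensional list
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # a two-dimensional matrix
print(scalar.dim(), vector.dim(), matrix.dim())   # 0 1 2

# Defining one tensor in terms of another extends the computation graph.
x = torch.tensor(3.5, requires_grad=True)
y = x * x

print(y.requires_grad)        # True: y is tracked because x is tracked
print(y.grad_fn is not None)  # True: y remembers the operation that produced it
```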

Copyright notice
This article was written by [sunshinecxm_ BJTU]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230610529584.html