(I) Tensors
Tensors are a special data structure, similar to arrays and matrices. In PyTorch, tensors are used to encode the inputs and outputs of a model.
import torch
import numpy as np
1. Tensor initialization
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)                 # directly from data
np_array = np.array(data)
x_np = torch.from_numpy(np_array)           # from a NumPy array
x_ones = torch.ones_like(x_data)            # from another tensor, keeping its shape and dtype
print(f'ones tensor:\n{x_ones}\n')
x_rands = torch.rand_like(x_data, dtype=torch.float)   # overrides the dtype of x_data
print(f'random tensor:\n{x_rands}\n')
ones tensor:
tensor([[1, 1],
        [1, 1]])
random tensor:
tensor([[0.3272, 0.3049],
        [0.3315, 0.8603]])
shape is a tuple describing the tensor's dimensions:
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
print(rand_tensor)
print(ones_tensor)
print(zeros_tensor)
tensor([[0.3955, 0.7930, 0.1733],
        [0.3849, 0.5444, 0.3754]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
2. Tensor attributes
Tensor attributes describe the shape, datatype, and device (where the tensor is stored).
tensor = torch.rand(3, 4)
print(tensor.shape, '\n', tensor.dtype, '\n', tensor.device)
torch.Size([3, 4])
torch.float32
cpu
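By default, tensors are created on the CPU. A minimal sketch for moving a tensor to the GPU, assuming a CUDA-capable installation:
# Move the tensor to the GPU when one is available (assumption: CUDA build of PyTorch)
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
print(tensor.device)   # cuda:0 if the move happened, cpu otherwise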
3. Tensor operations
Transposing, indexing, slicing, mathematical operations, linear algebra, random sampling, and more.
tensor = torch.ones(4, 4)
tensor[:, 1] = 0          # NumPy-style indexing and slicing: zero out the second column
print(tensor)
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
t1 = torch.cat([tensor, tensor, tensor], dim=1)   # concatenate along dimension 1
t1
tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
tensor.mul(tensor)        # element-wise product
tensor * tensor           # equivalent syntax
tensor.matmul(tensor.T)   # matrix multiplication
tensor @ tensor.T         # equivalent syntax
tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])
print(tensor)
tensor.add_(4)   # in-place operation, marked by the trailing underscore
print(tensor)
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
tensor([[5., 4., 5., 5.],
        [5., 4., 5., 5.],
        [5., 4., 5., 5.],
        [5., 4., 5., 5.]])
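Single-element results, for example from aggregating all entries, can be converted to a Python number with .item(); a small check using the tensor above:
agg = tensor.sum()      # aggregate all entries into a one-element tensor
agg_item = agg.item()   # convert it to a Python float
print(agg_item, type(agg_item))   # 76.0 <class 'float'> for the tensor above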
4. Bridge with NumPy
Tensors on the CPU and NumPy arrays can share the same underlying memory, so changing one changes the other.
t = torch.ones(5)
print(f't:{t}')
n = t.numpy()
print(f'n:{n}')
t:tensor([1., 1., 1., 1., 1.])
n:[1. 1. 1. 1. 1.]
t.add_(1)   # an in-place change to the tensor is reflected in the NumPy array
print(t)
print(n)
tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]
n = np.ones(5)
t = torch.from_numpy(n)
np.add(n, 1, out=n)   # changing the NumPy array changes the tensor as well
print(t)
print(n)
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
[2. 2. 2. 2. 2.]
(II) torch.autograd
torch.autograd is PyTorch's automatic differentiation engine that powers neural network training.
1. Usage in PyTorch
import ssl
ssl._create_default_https_context = ssl._create_unverified_context   # workaround for certificate errors when downloading the pretrained weights
import torch, torchvision
model = torchvision.models.resnet18(pretrained=True)   # a pretrained 18-layer ResNet
data = torch.rand(1, 3, 64, 64)                        # one random 3-channel 64x64 "image"
labels = torch.rand(1, 1000)                           # random target values
prediction = model(data)                               # forward pass
loss = (prediction - labels).sum()
loss.backward()                                        # backward pass: autograd stores each parameter's gradient in .grad
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
optim.step()                                           # gradient descent: update each parameter from its .grad
2. Differentiation in autograd
import torch
a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)
Q = 3*a**3 - b**2
external_grad = torch.tensor([1., 1.])   # needed because Q is a vector: the gradient of Q w.r.t. itself
Q.backward(gradient=external_grad)
print(a.grad)
print(b.grad)
tensor([36., 81.])
tensor([-12., -8.])
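Analytically, Q = 3a^3 - b^2 gives dQ/da = 9a^2 and dQ/db = -2b, which for a = [2., 3.] and b = [6., 4.] are [36., 81.] and [-12., -8.], matching the output above. A quick check:
print(9*a**2 == a.grad)   # tensor([True, True])
print(-2*b == b.grad)     # tensor([True, True])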
3. Computational graph
autograd keeps a record of all data (tensors) and all executed operations in a DAG (directed acyclic graph) made of Function objects.
1. Forward pass: computes the result tensor and records each operation's gradient function in the DAG (leaves to root).
2. Backward pass: computes the gradient of each parameter with the chain rule and stores it in tensor.grad (root to leaves).
x = torch.rand(5, 5)
y = torch.rand(5, 5)
z = torch.rand((5, 5), requires_grad=True)
a = x + y    # no input requires a gradient, so a is excluded from the DAG
print(a.requires_grad)
b = x + z    # at least one input requires a gradient, so b is tracked
print(b.requires_grad)
False
True
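The gradient function recorded during the forward pass can be inspected through grad_fn (a small check using the tensors above):
print(a.grad_fn)   # None: a is not part of the DAG
print(b.grad_fn)   # an AddBackward0 node recorded for the addition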
Frozen parameters: parameters whose gradients are not computed, which reduces computation; when finetuning, we typically freeze most of a pretrained model and train only the newly added layers.
from torch import nn, optim
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False       # freeze every parameter
model.fc = nn.Linear(512, 10)         # replace the classifier; the new layer's parameters require gradients by default
optimizer = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)   # only model.fc is actually updated
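A quick way to confirm that only the new classifier remains trainable (a small check on the model above):
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)   # expected: ['fc.weight', 'fc.bias']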
(III) Neural networks
Neural networks are built with the torch.nn package.
A typical training procedure for a neural network:
1. Define the network (it has some learnable parameters/weights)
2. Iterate over a dataset of inputs
3. Process the input through the network
4. Compute the loss function
5. Propagate gradients back into the network's parameters
6. Update the weights, typically with a simple rule: weight = weight - learning_rate * gradient (see the consolidated sketch below)
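A minimal sketch putting these six steps together; the Net class is assumed to be the one defined in "1. Define the network" below, and the one-item dataloader is a stand-in for a real DataLoader:
import torch
from torch import nn, optim

net = Net()                                          # step 1: define the network
criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)
dataloader = [(torch.randn(1, 1, 32, 32), torch.randn(1, 10))]   # stand-in for a real dataset
for input, target in dataloader:                     # step 2: iterate over input data
    optimizer.zero_grad()                            # clear old gradients
    output = net(input)                              # step 3: process the input through the network
    loss = criterion(output, target)                 # step 4: compute the loss
    loss.backward()                                  # step 5: backpropagate the gradients
    optimizer.step()                                 # step 6: update the weights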
1. Define the network
(1) Containers:
nn.Module: Base class for all neural network modules; models should subclass it
(2) Convolution layers:
nn.Conv2d: Applies a 2D convolution over an input signal composed of several input planes
(3) Linear layers:
nn.Linear: Applies a linear transformation to the incoming data (y = wx + b)
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 convolution kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # affine layers; 16 * 5 * 5 comes from the 16 feature maps of size 5x5 after pooling
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # a single number is enough when the window is square
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1)   # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
You only need to define the forward function; the backward function (where gradients are computed) is then defined automatically by autograd.
The learnable parameters of a model are returned by net.parameters().
params = list(net.parameters())
print(len(params))
print(params[0].size())   # conv1's weights
10
torch.Size([6, 1, 5, 5])
input = torch.randn(1, 1, 32, 32)   # a random 32x32 input: batch of 1, single channel
out = net(input)
print(out)
tensor([[ 0.0735, -0.0377,  0.1258, -0.0828, -0.0173, -0.0726, -0.0875, -0.0256,
         -0.0797,  0.0959]], grad_fn=<AddmmBackward0>)
Zero the gradient buffers of all parameters and backpropagate with random gradients:
net.zero_grad()
out.backward(torch.randn(1, 10))
torch.nn only supports mini-batches. The entire torch.nn package only supports inputs that are a mini-batch of samples, not a single sample. For example, nn.Conv2d takes a 4D tensor of nSamples x nChannels x Height x Width. If you have a single sample, use input.unsqueeze(0) to add a fake batch dimension.
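For instance, a single-channel 32x32 image can be given a fake batch dimension like this (a small sketch reusing the net defined above):
single = torch.randn(1, 32, 32)   # one sample: channels x height x width
batched = single.unsqueeze(0)     # add a fake batch dimension
print(batched.shape)              # torch.Size([1, 1, 32, 32])
out = net(batched)                # now acceptable to nn.Conv2d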
Classes seen so far:
torch.Tensor: a multi-dimensional array that supports autograd operations such as backward() and holds the tensor's gradient
nn.Module: a neural network module; a convenient way of encapsulating parameters
nn.Parameter: a kind of tensor that is automatically registered as a parameter when assigned as an attribute of a Module
autograd.Function: implements the forward and backward definitions of an autograd operation; every Tensor operation creates at least one Function node, which connects to the functions that created the Tensor and encodes its history
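A small illustration of the nn.Parameter behavior (the Scale module here is a made-up example):
class Scale(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3))   # assigned as an attribute, so registered as a parameter
        self.offset = torch.zeros(3)                # a plain tensor attribute is not registered

m = Scale()
print([name for name, _ in m.named_parameters()])   # ['weight']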
2. Loss function
A loss function takes the (output, target) pair as input and computes a value estimating how far the output is from the target. The nn package provides several different loss functions; a simple one is nn.MSELoss, which computes the mean squared error.
output = net(input)
target = torch.randn(10)      # a dummy target, for example
target = target.view(1, -1)   # reshape it to match the output
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)
tensor(0.4356, grad_fn=<MseLossBackward0>)
Following loss backwards through its .grad_fn attribute reveals the graph of computations. When loss.backward() is called, the whole graph is differentiated, and all tensors in the graph with requires_grad=True accumulate the gradient into their .grad tensor.
print(loss.grad_fn)
print(loss.grad_fn.next_functions[0][0])
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])
<MseLossBackward0 object at 0x7fef4965df10>
<AddmmBackward0 object at 0x7fef4965d3a0>
<AccumulateGrad object at 0x7fef4965df10>
3. Backprop
To backpropagate the error, all we need is loss.backward(). Existing gradients must be cleared first, otherwise the new gradients are accumulated into them.
net.zero_grad()   # zero the gradient buffers of all parameters
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
loss.backward()
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([ 0.0124,  0.0051, -0.0029, -0.0088,  0.0048,  0.0012])
4. Update the weights
The simplest update rule used in practice is stochastic gradient descent (SGD):
weight = weight - learning_rate * gradient
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)   # manual SGD step in plain Python
However, when training neural networks you often want other update rules such as SGD, Nesterov-SGD, Adam, RMSProp, and so on. The torch.optim package implements all of these methods.
import torch.optim as optim
optimizer = optim.SGD(net.parameters(), lr=0.01)   # create the optimizer
optimizer.zero_grad()              # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()                   # performs the update
print(net.conv1.bias.grad)
tensor([ 0.0119, 0.0050, -0.0034, -0.0109, 0.0049, -0.0009])