
Eat-pytorch-3-hierarchy

1. Introduction

1.1 Preface

This series of posts records my notes from the Heywhale community activity "Eat That PyTorch in 20 Days" (《20天吃掉那只PyTorch》); this is the third post in the series, covering the hierarchy of Pytorch. The companion project has 2.8K stars on Github. While working through it, you can also read Part I, "Fundamentals of Deep Learning", of the book *Deep Learning with Python* for reference.

*Deep Learning with Python* was written by Francois Chollet, the creator of Keras. It assumes no prior machine-learning knowledge, uses Keras to demonstrate deep-learning best practices through rich examples, and is easy to follow: the whole book contains not a single mathematical formula and focuses on building the reader's intuition for deep learning.

Part I of *Deep Learning with Python* consists of the following 4 chapters, which a reader can expect to finish within 20 hours:

  1. What is deep learning
  2. The mathematical building blocks of neural networks
  3. Getting started with neural networks
  4. Fundamentals of machine learning

The outline of this blog series is as follows:

  • 1. PyTorch's modeling workflow
  • 2. PyTorch's core concepts
  • 3. PyTorch's hierarchy
  • 4. PyTorch's low-level API
  • 5. PyTorch's mid-level API
  • 6. PyTorch's high-level API

Finally, all of the data used in this post is provided; readers can download it from the link below:

Download Now

1.2 The hierarchy of Pytorch

In this chapter we introduce the 5 different levels of the Pytorch hierarchy:

  • the hardware layer
  • the kernel layer
  • the low-level API
  • the mid-level API
  • the high-level API (torchkeras)

Using linear regression and a DNN binary classification model as examples, we contrast what implementing a model looks like at the different levels.

From low to high, Pytorch's hierarchy can be divided into the following five levels (a sketch contrasting them follows this list):

  1. The bottom level is the hardware layer: Pytorch supports adding CPUs and GPUs to the compute resource pool.

  2. The second level is the kernel, implemented in C++.

  3. The third level consists of operators implemented in Python: low-level API instructions wrapping the C++ kernel, mainly covering tensor operation operators, automatic differentiation, and variable management.

    For example torch.tensor, torch.cat, torch.autograd.grad, nn.Module. If the model were a house, the third-level API would be the bricks of the model.

  4. The fourth level consists of model components implemented in Python, which wrap the low-level API into functions: model layers, loss functions, optimizers, data pipelines, and so on.

    For example torch.nn.Linear, torch.nn.BCELoss, torch.optim.Adam, torch.utils.data.DataLoader. If the model were a house, the fourth-level API would be the walls of the model.

  5. The fifth level is the model interface implemented in Python. Pytorch has no official high-level API, so to make training easier the author mimics the model interface of keras and wraps Pytorch into the high-level model interface torchkeras.Model in fewer than 300 lines of code. If the model were a house, the fifth-level API would be the model itself, i.e. the house of the model.
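To make the mapping concrete, here is a minimal sketch (my addition, not from the book) touching one representative API from each software level:

import torch
from torch import nn

# level 3 (bricks): tensor operations and automatic differentiation
x = torch.tensor([[1.0, 2.0]], requires_grad=True)
y = (x ** 2).sum()
(grad,) = torch.autograd.grad(y, x)  # tensor([[2., 4.]])

# level 4 (walls): a model layer and an optimizer
layer = nn.Linear(2, 1)
opt = torch.optim.SGD(layer.parameters(), lr=0.01)

# level 5 (the house): torchkeras.Model / torchkeras.LightModel, see section 4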

2. Low-level API demonstration

The examples below use Pytorch's low-level API to implement a linear regression model and a DNN binary classification model. The low-level API mainly consists of tensor operations, the computation graph, and automatic differentiation.
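Before the full examples, a minimal sketch (my addition) of these three ingredients working together: tensor operations record a computation graph, and automatic differentiation walks it backwards:

import torch

w = torch.tensor([[2.0]], requires_grad=True)
x = torch.tensor([[3.0]])

loss = ((x @ w - 1.0) ** 2).mean()  # tensor ops build the computation graph
loss.backward()                     # automatic differentiation
print(w.grad)                       # d(loss)/dw = 2*(x*w - 1)*x = tensor([[30.]])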

2.1 Linear regression

2.1.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn

# number of samples
n = 400

# generate a test dataset
X = 10*torch.rand([n,2])-5.0  # torch.rand draws from a uniform distribution
w0 = torch.tensor([[2.0],[-3.0]])
b0 = torch.tensor([[10.0]])
Y = X@w0 + b0 + torch.normal(0.0, 2.0, size=[n,1])  # @ is matrix multiplication; add Gaussian noise
  • Data visualization

    # data visualization

    %matplotlib inline
    %config InlineBackend.figure_format = 'svg'

    plt.figure(figsize = (12,5))
    ax1 = plt.subplot(121)
    ax1.scatter(X[:,0].numpy(),Y[:,0].numpy(), c = "b",label = "samples")
    ax1.legend()
    plt.xlabel("x1")
    plt.ylabel("y",rotation = 0)

    ax2 = plt.subplot(122)
    ax2.scatter(X[:,1].numpy(),Y[:,0].numpy(), c = "g",label = "samples")
    ax2.legend()
    plt.xlabel("x2")
    plt.ylabel("y",rotation = 0)
    plt.show()

    Results:

  • Build a data pipeline iterator

    # build a data pipeline iterator
    def data_iter(features, labels, batch_size=8):
        num_examples = len(features)
        indices = list(range(num_examples))
        np.random.shuffle(indices)  # samples are read in random order
        for i in range(0, num_examples, batch_size):
            indexs = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])
            yield features.index_select(0, indexs), labels.index_select(0, indexs)

    # test the data pipeline
    batch_size = 8
    (features,labels) = next(data_iter(X,Y,batch_size))
    print(features)
    print(labels)

Results:

    tensor([[ 1.9428, -0.7624],
    [-2.5625, 0.4411],
    [-0.7651, -3.8922],
    [ 3.1022, -2.6201],
    [-1.0578, -2.6963],
    [-1.9720, 3.8035],
    [-3.4711, -2.4106],
    [-0.6102, 2.6127]])
    tensor([[11.7814],
    [ 6.0209],
    [23.4428],
    [22.5369],
    [17.8275],
    [-8.7643],
    [ 7.5050],
    [ 0.5841]])

2.2 Model

2.2.1 Define model

# define the model
class LinearRegression:

    def __init__(self):
        self.w = torch.randn_like(w0,requires_grad=True)
        self.b = torch.zeros_like(b0,requires_grad=True)

    # forward propagation
    def forward(self,x):
        return x@self.w + self.b

    # loss function
    def loss_func(self,y_pred,y_true):
        return torch.mean((y_pred - y_true)**2/2)

model = LinearRegression()
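Since the loss is torch.mean((y_pred - y_true)**2 / 2), the 1/2 cancels the 2 from differentiating the square, so the analytic gradient with respect to w is X.t() @ (y_pred - Y) / n. A quick check of autograd against that formula (my addition, not in the original notes; run it before training and reset the gradients afterwards):

y_pred = model.forward(X)
model.loss_func(y_pred, Y).backward()
grad_analytic = X.t() @ (y_pred.detach() - Y) / n
print(torch.allclose(model.w.grad, grad_analytic, atol=1e-4))  # True

model.w.grad = None  # reset so training below starts from clean gradients
model.b.grad = None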

2.2.2 Training model

def train_step(model, features, labels):

    predictions = model.forward(features)
    loss = model.loss_func(predictions,labels)

    # backpropagate to compute gradients
    loss.backward()

    # use torch.no_grad() to keep the update out of the autograd graph;
    # operating on model.w.data achieves the same effect
    with torch.no_grad():
        # gradient descent update
        model.w -= 0.001*model.w.grad
        model.b -= 0.001*model.b.grad

        # zero the gradients
        model.w.grad.zero_()
        model.b.grad.zero_()
    return loss
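The comment above notes that operating on model.w.data is an alternative to torch.no_grad(). A sketch of that variant (my addition): assignments through .data are invisible to autograd, so no context manager is needed.

# equivalent update without torch.no_grad()
model.w.data -= 0.001 * model.w.grad
model.b.data -= 0.001 * model.b.grad
model.w.grad.zero_()
model.b.grad.zero_()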
  • Test the effect of train_step

    batch_size = 10
    (features,labels) = next(data_iter(X,Y,batch_size))
    train_step(model,features,labels)

    Results:

    tensor(68.6391, grad_fn=<MeanBackward0>)

  • Train model

    def train_model(model,epochs):
        for epoch in range(1,epochs+1):
            for features, labels in data_iter(X,Y,10):
                loss = train_step(model,features,labels)

            if epoch%200==0:
                print("epoch =",epoch,"loss = ",loss.item())
                print("model.w =",model.w.data)
                print("model.b =",model.b.data)

    train_model(model,epochs = 1000)

    Results:

    epoch = 200 loss =  3.2508397102355957
    model.w = tensor([[ 2.0401],
    [-2.9877]])
    model.b = tensor([[9.9169]])
    epoch = 400 loss = 3.0016872882843018
    model.w = tensor([[ 2.0435],
    [-2.9855]])
    model.b = tensor([[9.9173]])
    epoch = 600 loss = 2.7006335258483887
    model.w = tensor([[ 2.0418],
    [-2.9843]])
    model.b = tensor([[9.9174]])
    epoch = 800 loss = 1.280609369277954
    model.w = tensor([[ 2.0416],
    [-2.9869]])
    model.b = tensor([[9.9169]])
    epoch = 1000 loss = 2.169107675552368
    model.w = tensor([[ 2.0420],
    [-2.9852]])
    model.b = tensor([[9.9170]])
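    A quick sanity check on these numbers (my addition): the noise added to Y is drawn from N(0, 2.0^2), so even the true parameters w0, b0 leave an irreducible average loss of sigma^2 / 2 = 2.0 under this half-squared loss, which is why the printed losses hover around 2-3 instead of falling to zero.

    # irreducible loss floor for loss = (y_pred - y)^2 / 2 with noise std 2.0
    sigma = 2.0
    print("loss floor ~", sigma ** 2 / 2)  # 2.0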

2.2.3 Visualization

# visualize the results

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0].numpy(),Y[:,0].numpy(), c = "b",label = "samples")
ax1.plot(X[:,0].numpy(),(model.w[0].data*X[:,0]+model.b[0].data).numpy(),"-r",linewidth = 5.0,label = "model")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1].numpy(),Y[:,0].numpy(), c = "g",label = "samples")
ax2.plot(X[:,1].numpy(),(model.w[1].data*X[:,1]+model.b[0].data).numpy(),"-r",linewidth = 5.0,label = "model")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)

plt.show()

Results:

2.3 DNN binary classification model

2.3.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# number of positive and negative samples
n_positive,n_negative = 2000,2000

# generate positive samples on a small ring
r_p = 5.0 + torch.normal(0.0,1.0,size = [n_positive,1])
theta_p = 2*np.pi*torch.rand([n_positive,1])
Xp = torch.cat([r_p*torch.cos(theta_p),r_p*torch.sin(theta_p)],axis = 1)
Yp = torch.ones_like(r_p)

# generate negative samples on a large ring
r_n = 8.0 + torch.normal(0.0,1.0,size = [n_negative,1])
theta_n = 2*np.pi*torch.rand([n_negative,1])
Xn = torch.cat([r_n*torch.cos(theta_n),r_n*torch.sin(theta_n)],axis = 1)
Yn = torch.zeros_like(r_n)

# assemble the samples
X = torch.cat([Xp,Xn],axis = 0)
Y = torch.cat([Yp,Yn],axis = 0)

# visualization
plt.figure(figsize = (6,6))
plt.scatter(Xp[:,0].numpy(),Xp[:,1].numpy(),c = "r")
plt.scatter(Xn[:,0].numpy(),Xn[:,1].numpy(),c = "g")
plt.legend(["positive","negative"]);

Results:

  • Build a data pipeline iterator

    # build a data pipeline iterator
    def data_iter(features, labels, batch_size=8):
        num_examples = len(features)
        indices = list(range(num_examples))
        np.random.shuffle(indices)  # samples are read in random order
        for i in range(0, num_examples, batch_size):
            indexs = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])
            yield features.index_select(0, indexs), labels.index_select(0, indexs)

    # test the data pipeline
    batch_size = 8
    (features,labels) = next(data_iter(X,Y,batch_size))
    print(features)
    print(labels)

    Results:

    tensor([[ 6.3216, -2.6834],
    [ 2.4433, 4.4928],
    [ 8.5585, 3.0958],
    [-1.0328, 3.3381],
    [-4.6885, -0.1144],
    [ 8.7589, -3.4486],
    [ 0.4830, 3.6482],
    [ 4.9465, 0.3443]])
    tensor([[0.],
    [1.],
    [0.],
    [1.],
    [1.],
    [0.],
    [1.],
    [1.]])

2.3.2 Define model

In this example we use nn.Module to organize the model parameters.

class DNNModel(nn.Module):
    def __init__(self):
        super(DNNModel, self).__init__()
        self.w1 = nn.Parameter(torch.randn(2,4))
        self.b1 = nn.Parameter(torch.zeros(1,4))
        self.w2 = nn.Parameter(torch.randn(4,8))
        self.b2 = nn.Parameter(torch.zeros(1,8))
        self.w3 = nn.Parameter(torch.randn(8,1))
        self.b3 = nn.Parameter(torch.zeros(1,1))

    # forward propagation
    def forward(self,x):
        x = torch.relu(x@self.w1 + self.b1)
        x = torch.relu(x@self.w2 + self.b2)
        y = torch.sigmoid(x@self.w3 + self.b3)
        return y

    # loss function (binary cross entropy)
    def loss_func(self,y_pred,y_true):
        # clamp predictions to [1e-7, 1 - 1e-7] to avoid log(0)
        eps = 1e-7
        y_pred = torch.clamp(y_pred,eps,1.0-eps)
        bce = - y_true*torch.log(y_pred) - (1-y_true)*torch.log(1-y_pred)
        return torch.mean(bce)

    # evaluation metric (accuracy)
    def metric_func(self,y_pred,y_true):
        y_pred = torch.where(y_pred>0.5,torch.ones_like(y_pred,dtype = torch.float32),
                             torch.zeros_like(y_pred,dtype = torch.float32))
        # both tensors are 0/1 here, so |y_true - y_pred| is 1 exactly on the errors
        acc = torch.mean(1-torch.abs(y_true-y_pred))
        return acc

model = DNNModel()
  • Test the model structure

    # test the model structure
    batch_size = 10
    (features,labels) = next(data_iter(X,Y,batch_size))

    predictions = model(features)

    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    print("init loss:", loss.item())
    print("init metric:", metric.item())

    Results:

    init loss: 7.446216583251953
    init metric: 0.5362008810043335

    len(list(model.parameters()))

    Results:

    6
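    The count of 6 is exactly w1, b1, w2, b2, w3, b3: wrapping a tensor in nn.Parameter registers it on the module automatically. One way to see this (my addition):

    for name, p in model.named_parameters():
        print(name, tuple(p.shape))  # w1 (2, 4), b1 (1, 4), ..., b3 (1, 1)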

2.3.3 Training model

def train_step(model, features, labels):

    # forward pass to compute the loss
    predictions = model.forward(features)
    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    # backpropagate to compute gradients
    loss.backward()

    # gradient descent update
    for param in model.parameters():
        # note: reassign param.data so the update is not recorded by autograd
        param.data = (param.data - 0.01*param.grad.data)

    # zero the gradients
    model.zero_grad()

    return loss.item(),metric.item()

def train_model(model,epochs):
    for epoch in range(1,epochs+1):
        loss_list,metric_list = [],[]
        for features, labels in data_iter(X,Y,20):
            lossi,metrici = train_step(model,features,labels)
            loss_list.append(lossi)
            metric_list.append(metrici)
        loss = np.mean(loss_list)
        metric = np.mean(metric_list)

        if epoch%100==0:
            print("epoch =",epoch,"loss = ",loss,"metric = ",metric)

train_model(model,epochs = 1000)

Results:

epoch = 100 loss =  0.1934373697731644 metric =  0.9207499933242798
epoch = 200 loss =  0.18901969484053552 metric =  0.918999993801117
epoch = 300 loss =  0.18451461097225547 metric =  0.9247499924898147
epoch = 400 loss =  0.18301934767514466 metric =  0.9247499933838844
epoch = 500 loss =  0.18300161071121693 metric =  0.9274999922513962
epoch = 600 loss =  0.18265636594966053 metric =  0.9219999933242797
epoch = 700 loss =  0.18221229410730302 metric =  0.9239999923110008
epoch = 800 loss =  0.1817048901133239 metric =  0.922749992609024
epoch = 900 loss =  0.18160937033127994 metric =  0.9259999924898148
epoch = 1000 loss =  0.1799963693227619 metric =  0.9282499927282334
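Is roughly 92.8% good? A back-of-the-envelope check (my addition): class membership depends only on the radius, with r ~ N(5, 1) for positives and r ~ N(8, 1) for negatives, so the best possible decision rule thresholds at r = 6.5 and achieves accuracy Phi(1.5), about 93.3%. The model is close to that ceiling.

import math

# Bayes accuracy for two unit-variance Gaussians with means 5 and 8
bayes_acc = 0.5 * (1 + math.erf(1.5 / math.sqrt(2)))  # Phi(1.5)
print(bayes_acc)  # ~0.9332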

2.3.4 Visualization

# visualize the results
fig, (ax1,ax2) = plt.subplots(nrows=1,ncols=2,figsize = (12,5))
ax1.scatter(Xp[:,0],Xp[:,1], c="r")
ax1.scatter(Xn[:,0],Xn[:,1],c = "g")
ax1.legend(["positive","negative"]);
ax1.set_title("y_true");

Xp_pred = X[torch.squeeze(model.forward(X)>=0.5)]
Xn_pred = X[torch.squeeze(model.forward(X)<0.5)]

ax2.scatter(Xp_pred[:,0],Xp_pred[:,1],c = "r")
ax2.scatter(Xn_pred[:,0],Xn_pred[:,1],c = "g")
ax2.legend(["positive","negative"]);
ax2.set_title("y_pred");

Results:

3. Mid-level API demonstration

The examples below use Pytorch's mid-level API to implement a linear regression model and a DNN binary classification model.

Pytorch's mid-level API mainly includes (see the sketch after this list):

  • model layers;
  • loss functions;
  • optimizers;
  • data pipelines, etc.
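A minimal sketch (my addition; the names X_demo, Y_demo, dl_demo are throwaway placeholders) mapping each bullet to one representative class; the sections below use exactly these pieces:

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

X_demo = torch.rand(100, 2)
Y_demo = torch.rand(100, 1)

layer = nn.Linear(2, 1)                               # model layers
loss_fn = nn.MSELoss()                                # loss functions
opt = torch.optim.SGD(layer.parameters(), lr=0.01)    # optimizers
dl_demo = DataLoader(TensorDataset(X_demo, Y_demo),
                     batch_size=10, shuffle=True)     # data pipeline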

3.1 Linear regression

3.1.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset

# number of samples
n = 400

# generate a test dataset
X = 10*torch.rand([n,2])-5.0  # torch.rand draws from a uniform distribution
w0 = torch.tensor([[2.0],[-3.0]])
b0 = torch.tensor([[10.0]])
Y = X@w0 + b0 + torch.normal(0.0, 2.0, size=[n,1])  # @ is matrix multiplication; add Gaussian noise
  • Visualization

    # data visualization

    %matplotlib inline
    %config InlineBackend.figure_format = 'svg'

    plt.figure(figsize = (12,5))
    ax1 = plt.subplot(121)
    ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
    ax1.legend()
    plt.xlabel("x1")
    plt.ylabel("y",rotation = 0)

    ax2 = plt.subplot(122)
    ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
    ax2.legend()
    plt.xlabel("x2")
    plt.ylabel("y",rotation = 0)
    plt.show()

    Results:

  • Data pipeline

    # build the input data pipeline
    ds = TensorDataset(X,Y)
    dl = DataLoader(ds,batch_size = 10,shuffle=True,num_workers=2)

3.2 Model

3.2.1 Define model

model = nn.Linear(2,1)  # linear layer

model.loss_func = nn.MSELoss()
model.optimizer = torch.optim.SGD(model.parameters(),lr = 0.01)

3.2.2 Training model

  • Train step

    def train_step(model, features, labels):

        predictions = model(features)
        loss = model.loss_func(predictions,labels)
        loss.backward()
        model.optimizer.step()
        model.optimizer.zero_grad()
        return loss.item()

    # test the effect of train_step
    features,labels = next(iter(dl))
    train_step(model,features,labels)

    Results:

    415.08831787109375

  • Train model

    def train_model(model,epochs):
        for epoch in range(1,epochs+1):
            for features, labels in dl:
                loss = train_step(model,features,labels)
            if epoch%50==0:
                w = model.state_dict()["weight"]
                b = model.state_dict()["bias"]
                print("epoch =",epoch,"loss = ",loss)
                print("w =",w)
                print("b =",b)

    train_model(model,epochs = 200)

    Results:

    epoch = 50 loss =  4.598311901092529
    w = tensor([[ 1.9602, -2.9793]])
    b = tensor([10.1778])
    epoch = 100 loss = 3.397813320159912
    w = tensor([[ 2.0284, -2.9681]])
    b = tensor([10.2230])
    epoch = 150 loss = 1.588686227798462
    w = tensor([[ 1.9387, -2.9690]])
    b = tensor([10.1770])
    epoch = 200 loss = 4.254576206207275
    w = tensor([[ 1.8670, -3.1228]])
    b = tensor([10.2100])

3.2.3 Visualization

w,b = model.state_dict()["weight"],model.state_dict()["bias"]

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
ax1.plot(X[:,0],w[0,0]*X[:,0]+b[0],"-r",linewidth = 5.0,label = "model")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
ax2.plot(X[:,1],w[0,1]*X[:,1]+b[0],"-r",linewidth = 5.0,label = "model")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)

plt.show()

Results:

3.3 DNN binary classification model

3.3.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# number of positive and negative samples
n_positive,n_negative = 2000,2000

# generate positive samples on a small ring
r_p = 5.0 + torch.normal(0.0,1.0,size = [n_positive,1])
theta_p = 2*np.pi*torch.rand([n_positive,1])
Xp = torch.cat([r_p*torch.cos(theta_p),r_p*torch.sin(theta_p)],axis = 1)
Yp = torch.ones_like(r_p)

# generate negative samples on a large ring
r_n = 8.0 + torch.normal(0.0,1.0,size = [n_negative,1])
theta_n = 2*np.pi*torch.rand([n_negative,1])
Xn = torch.cat([r_n*torch.cos(theta_n),r_n*torch.sin(theta_n)],axis = 1)
Yn = torch.zeros_like(r_n)

# assemble the samples
X = torch.cat([Xp,Xn],axis = 0)
Y = torch.cat([Yp,Yn],axis = 0)

# visualization
plt.figure(figsize = (6,6))
plt.scatter(Xp[:,0],Xp[:,1],c = "r")
plt.scatter(Xn[:,0],Xn[:,1],c = "g")
plt.legend(["positive","negative"]);

Results:

  • Pipeline
# build the input data pipeline
ds = TensorDataset(X,Y)
dl = DataLoader(ds,batch_size = 10,shuffle=True,num_workers=2)

3.3.2 Define model

class DNNModel(nn.Module):
    def __init__(self):
        super(DNNModel, self).__init__()
        self.fc1 = nn.Linear(2,4)
        self.fc2 = nn.Linear(4,8)
        self.fc3 = nn.Linear(8,1)

    # forward propagation
    def forward(self,x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        y = nn.Sigmoid()(self.fc3(x))
        return y

    # loss function
    def loss_func(self,y_pred,y_true):
        return nn.BCELoss()(y_pred,y_true)

    # evaluation metric (accuracy)
    def metric_func(self,y_pred,y_true):
        y_pred = torch.where(y_pred>0.5,torch.ones_like(y_pred,dtype = torch.float32),
                             torch.zeros_like(y_pred,dtype = torch.float32))
        acc = torch.mean(1-torch.abs(y_true-y_pred))
        return acc

    # optimizer
    # note: as a @property this constructs a fresh Adam on every access,
    # so optimizer state (e.g. moment estimates) is not carried between steps
    @property
    def optimizer(self):
        return torch.optim.Adam(self.parameters(),lr = 0.001)

model = DNNModel()
  • Test pipeline

    # test the model structure
    (features,labels) = next(iter(dl))
    predictions = model(features)

    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    print("init loss:",loss.item())
    print("init metric:",metric.item())

    Results:

    init loss: 0.8217536807060242
    init metric: 0.6000000238418579

3.3.3 Training model

  • Train step

    def train_step(model, features, labels):

        # forward pass to compute the loss
        predictions = model(features)
        loss = model.loss_func(predictions,labels)
        metric = model.metric_func(predictions,labels)

        # backpropagate to compute gradients
        loss.backward()

        # update model parameters
        model.optimizer.step()
        model.optimizer.zero_grad()

        return loss.item(),metric.item()

    # test the effect of train_step
    features,labels = next(iter(dl))
    train_step(model,features,labels)

    Results:

    (1.027471899986267, 0.4000000059604645)

  • Train model

    def train_model(model,epochs):
        for epoch in range(1,epochs+1):
            loss_list,metric_list = [],[]
            for features, labels in dl:
                lossi,metrici = train_step(model,features,labels)
                loss_list.append(lossi)
                metric_list.append(metrici)
            loss = np.mean(loss_list)
            metric = np.mean(metric_list)

            if epoch%100==0:
                print("epoch =",epoch,"loss = ",loss,"metric = ",metric)

    train_model(model,epochs = 300)

    Results:

    epoch = 100 loss =  0.2738241909684248 metric =  0.9302499929070472
    epoch = 200 loss = 0.27702247152624065 metric = 0.9312499925494194
    epoch = 300 loss = 0.27914922587944946 metric = 0.9309999929368495

3.3.4 Visualization

# visualize the results
fig, (ax1,ax2) = plt.subplots(nrows=1,ncols=2,figsize = (12,5))
ax1.scatter(Xp[:,0],Xp[:,1], c="r")
ax1.scatter(Xn[:,0],Xn[:,1],c = "g")
ax1.legend(["positive","negative"]);
ax1.set_title("y_true");

Xp_pred = X[torch.squeeze(model.forward(X)>=0.5)]
Xn_pred = X[torch.squeeze(model.forward(X)<0.5)]

ax2.scatter(Xp_pred[:,0],Xp_pred[:,1],c = "r")
ax2.scatter(Xn_pred[:,0],Xn_pred[:,1],c = "g")
ax2.legend(["positive","negative"]);
ax2.set_title("y_pred");

Results:

4. High-level API demonstration

Pytorch has no official high-level API; users generally have to implement their own training loop, validation loop, and prediction loop.

The torchkeras.Model class wraps Pytorch's nn.Module following the design of tf.keras.Model: it implements fit, validate, predict, and summary methods, effectively serving as a user-defined high-level API. Later in this chapter we use it to implement the linear regression model.

In addition, the torchkeras.LightModel class borrows functionality from pytorch_lightning and offers another implementation of a Keras-like interface. Later in this chapter we use it to implement the DNN binary classification model.
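The resulting workflow is Keras-like. A minimal sketch of the torchkeras.Model interface (my addition, mirroring the compile/fit calls used below; the Tiny class and demo names are throwaway examples, and the exact signatures are those shown later in this section):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from torchkeras import Model

class Tiny(Model):
    def __init__(self):
        super(Tiny, self).__init__()
        self.fc = nn.Linear(2, 1)
    def forward(self, x):
        return self.fc(x)

model_demo = Tiny()
model_demo.compile(loss_func=nn.MSELoss(),
                   optimizer=torch.optim.Adam(model_demo.parameters(), lr=0.01))
dl_demo = DataLoader(TensorDataset(torch.rand(64, 2), torch.rand(64, 1)), batch_size=8)
dfhistory_demo = model_demo.fit(2, dl_train=dl_demo)  # Keras-style training loop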

4.1 Linear regression

4.1.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset

# number of samples
n = 400

# generate a test dataset
X = 10*torch.rand([n,2])-5.0  # torch.rand draws from a uniform distribution
w0 = torch.tensor([[2.0],[-3.0]])
b0 = torch.tensor([[10.0]])
Y = X@w0 + b0 + torch.normal(0.0, 2.0, size=[n,1])  # @ is matrix multiplication; add Gaussian noise
  • Visualization

    # data visualization

    %matplotlib inline
    %config InlineBackend.figure_format = 'svg'

    plt.figure(figsize = (12,5))
    ax1 = plt.subplot(121)
    ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
    ax1.legend()
    plt.xlabel("x1")
    plt.ylabel("y",rotation = 0)

    ax2 = plt.subplot(122)
    ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
    ax2.legend()
    plt.xlabel("x2")
    plt.ylabel("y",rotation = 0)
    plt.show()

    Results:

  • Data pipeline

    # build the input data pipeline
    ds = TensorDataset(X,Y)
    ds_train,ds_valid = torch.utils.data.random_split(ds,[int(400*0.7),400-int(400*0.7)])
    dl_train = DataLoader(ds_train,batch_size = 10,shuffle=True,num_workers=2)
    dl_valid = DataLoader(ds_valid,batch_size = 10,num_workers=2)

4.2 Model

4.2.1 Define model

# define the model by subclassing torchkeras.Model
from torchkeras import Model
class LinearRegression(Model):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.fc = nn.Linear(2,1)

    def forward(self,x):
        return self.fc(x)

model = LinearRegression()

4.2.2 Training model

# train with the fit method

def mean_absolute_error(y_pred,y_true):
    return torch.mean(torch.abs(y_pred-y_true))

def mean_absolute_percent_error(y_pred,y_true):
    absolute_percent_error = (torch.abs(y_pred-y_true)+1e-7)/(torch.abs(y_true)+1e-7)
    return torch.mean(absolute_percent_error)

model.compile(loss_func = nn.MSELoss(),
              optimizer= torch.optim.Adam(model.parameters(),lr = 0.01),
              metrics_dict={"mae":mean_absolute_error,"mape":mean_absolute_percent_error})

dfhistory = model.fit(200, dl_train = dl_train, dl_val = dl_valid,log_step_freq = 20)

Results:

Start Training ...

================================================================================2022-02-06 22:48:10
{'step': 20, 'loss': 208.126, 'mae': 11.994, 'mape': 1.195}

 +-------+---------+--------+-------+----------+---------+----------+
| epoch |   loss  |  mae   |  mape | val_loss | val_mae | val_mape |
+-------+---------+--------+-------+----------+---------+----------+
|   1   | 201.175 | 11.695 | 1.269 | 195.057  |  11.834 |  1.065   |
+-------+---------+--------+-------+----------+---------+----------+

...

 +-------+-------+-------+-------+----------+---------+----------+
| epoch |  loss |  mae  |  mape | val_loss | val_mae | val_mape |
+-------+-------+-------+-------+----------+---------+----------+
|   20  | 39.91 | 5.993 | 1.649 |  42.392  |  6.193  |  1.032   |
+-------+-------+-------+-------+----------+---------+----------+

================================================================================2022-02-06 22:49:56
Finished Training...

4.2.3 Visualization

w,b = model.state_dict()["fc.weight"],model.state_dict()["fc.bias"]

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
ax1.plot(X[:,0],w[0,0]*X[:,0]+b[0],"-r",linewidth = 5.0,label = "model")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
ax2.plot(X[:,1],w[0,1]*X[:,1]+b[0],"-r",linewidth = 5.0,label = "model")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)

plt.show()

Results:

4.2.4 Evaluation

dfhistory.tail()

Results:

         loss       mae      mape   val_loss   val_mae  val_mape
15  51.618867  6.840317  1.773152  54.423827  7.038455  1.124349
16  48.355738  6.618555  1.744567  51.134396  6.821975  1.102371
17  45.444238  6.420669  1.726280  47.896852  6.605719  1.086570
18  42.519069  6.199411  1.682794  45.115399  6.398358  1.055073
19  39.909953  5.992503  1.649152  42.391730  6.192853  1.031992
import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_'+metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric, 'val_'+metric])
    plt.show()

plot_metric(dfhistory,"loss")

Results:

plot_metric(dfhistory,"mape")

Results:

# evaluate
model.evaluate(dl_valid)

Results:

{'val_loss': 42.391730308532715,
 'val_mae': 6.19285261631012,
 'val_mape': 1.0319924702246983}

4.2.5 Predict

# predict
dl = DataLoader(TensorDataset(X))
model.predict(dl)[0:10]

Results:

tensor([[  8.9128],
        [  9.5116],
        [ 12.2481],
        [  0.1308],
        [ 16.1116],
        [-17.9351],
        [-14.6407],
        [  2.9675],
        [ 10.9686],
        [ 14.8227]])
  • Predict on the validation data

    # predict
    model.predict(dl_valid)[0:10]

    Results:

    tensor([[ -4.9393],
            [-12.2253],
            [  3.5050],
            [  6.6128],
            [  2.7707],
            [  0.7076],
            [ -6.2700],
            [ -8.4491],
            [ -7.4038],
            [ 10.0306]])

4.3 DNN binary classification model

4.3.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset
import torchkeras
import pytorch_lightning as pl
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# number of positive and negative samples
n_positive,n_negative = 2000,2000

# generate positive samples on a small ring
r_p = 5.0 + torch.normal(0.0,1.0,size = [n_positive,1])
theta_p = 2*np.pi*torch.rand([n_positive,1])
Xp = torch.cat([r_p*torch.cos(theta_p),r_p*torch.sin(theta_p)],axis = 1)
Yp = torch.ones_like(r_p)

# generate negative samples on a large ring
r_n = 8.0 + torch.normal(0.0,1.0,size = [n_negative,1])
theta_n = 2*np.pi*torch.rand([n_negative,1])
Xn = torch.cat([r_n*torch.cos(theta_n),r_n*torch.sin(theta_n)],axis = 1)
Yn = torch.zeros_like(r_n)

# assemble the samples
X = torch.cat([Xp,Xn],axis = 0)
Y = torch.cat([Yp,Yn],axis = 0)

# visualization
plt.figure(figsize = (6,6))
plt.scatter(Xp[:,0],Xp[:,1],c = "r")
plt.scatter(Xn[:,0],Xn[:,1],c = "g")
plt.legend(["positive","negative"]);

Results:

  • Dataloader

    ds = TensorDataset(X,Y)

    ds_train,ds_valid = torch.utils.data.random_split(ds,[int(len(ds)*0.7),len(ds)-int(len(ds)*0.7)])
    dl_train = DataLoader(ds_train,batch_size = 100,shuffle=True,num_workers=2)
    dl_valid = DataLoader(ds_valid,batch_size = 100,num_workers=2)

4.3.2 Define model

import torchmetrics as metrics

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2,4)
        self.fc2 = nn.Linear(4,8)
        self.fc3 = nn.Linear(8,1)

    def forward(self,x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        y = nn.Sigmoid()(self.fc3(x))
        return y

class Model(torchkeras.LightModel):

    # loss, and optional metrics
    def shared_step(self,batch)->dict:
        x, y = batch
        prediction = self(x)
        loss = nn.BCELoss()(prediction,y)
        preds = torch.where(prediction>0.5,torch.ones_like(prediction),torch.zeros_like(prediction))
        acc = metrics.functional.accuracy(preds.int(), y.int())
        # attention: there must be a key named "loss" in the returned dict
        dic = {"loss":loss,"acc":acc}
        return dic

    # optimizer, and optional lr_scheduler
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-2)
        # note: gamma=0.0001 shrinks the learning rate by four orders of
        # magnitude every 10 epochs, which effectively freezes training early
        lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.0001)
        return {"optimizer":optimizer,"lr_scheduler":lr_scheduler}

pl.seed_everything(1234)
net = Net()
model = Model(net)

torchkeras.summary(model,input_shape =(2,))

Results:

Global seed set to 1234

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1                    [-1, 4]              12
            Linear-2                    [-1, 8]              40
            Linear-3                    [-1, 1]               9
================================================================
Total params: 61
Trainable params: 61
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.000008
Forward/backward pass size (MB): 0.000099
Params size (MB): 0.000233
Estimated Total Size (MB): 0.000340
----------------------------------------------------------------

4.3.3 Training model

Note: on a machine without a GPU, the code below can raise a RuntimeError:

RuntimeError: DataLoader worker (pid(s) 6088, 19424) exited unexpectedly

Deleting the gpus=0 argument avoids this error.
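In my experience this worker crash typically comes from multi-process data loading (num_workers=2) on Windows or inside notebooks, so another workaround (my addition, an assumption rather than the author's fix) is to rebuild the loaders single-process:

from torch.utils.data import DataLoader

# num_workers=0 loads data in the main process, avoiding worker crashes
dl_train = DataLoader(ds_train, batch_size=100, shuffle=True, num_workers=0)
dl_valid = DataLoader(ds_valid, batch_size=100, num_workers=0)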

ckpt_cb = pl.callbacks.ModelCheckpoint(monitor='val_loss')

# set gpus=0 to use cpu
# set gpus=1 to use 1 gpu
# set gpus=2 to use 2 gpus
# set gpus=-1 to use all gpus
# you can also set gpus=[0,1] to use the given gpus
# you can even set tpu_cores=2 to use two tpus

trainer = pl.Trainer(max_epochs=100, callbacks=[ckpt_cb])

trainer.fit(model,dl_train,dl_valid)

Results:

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs

| Name | Type | Params
------------------------------
0 | net | Net | 61
------------------------------
61 Trainable params
0 Non-trainable params
61 Total params
0.000 Total estimated model params size (MB)

Validation sanity check: 0it [00:00, ?it/s]

Global seed set to 1234

================================================================================2022-02-07 09:45:32
epoch = 0
{'val_loss': 0.6725655794143677, 'val_acc': 0.5399999618530273}

Training: 0it [00:00, ?it/s]

Validating: 0it [00:00, ?it/s]

================================================================================2022-02-07 09:48:22
epoch = 0
{'val_loss': 0.6592584252357483, 'val_acc': 0.5483332872390747}
{'loss': 0.679371178150177, 'acc': 0.5324999690055847}

...

Validating: 0it [00:00, ?it/s]

================================================================================2022-02-07 10:16:49
epoch = 99
{'val_loss': 0.20280574262142181, 'val_acc': 0.9183333516120911}
{'loss': 0.20242063701152802, 'acc': 0.9210714101791382}

4.3.4 Visualization

# visualize the results
fig, (ax1,ax2) = plt.subplots(nrows=1,ncols=2,figsize = (12,5))
ax1.scatter(Xp[:,0],Xp[:,1], c="r")
ax1.scatter(Xn[:,0],Xn[:,1],c = "g")
ax1.legend(["positive","negative"]);
ax1.set_title("y_true");

Xp_pred = X[torch.squeeze(model.forward(X)>=0.5)]
Xn_pred = X[torch.squeeze(model.forward(X)<0.5)]

ax2.scatter(Xp_pred[:,0],Xp_pred[:,1],c = "r")
ax2.scatter(Xn_pred[:,0],Xn_pred[:,1],c = "g")
ax2.legend(["positive","negative"]);
ax2.set_title("y_pred");

Results:

4.3.5 Evaluation

import pandas as pd

history = model.history
dfhistory = pd.DataFrame(history)
dfhistory

Results:

    val_loss   val_acc      loss       acc  epoch
0   0.659258  0.548333  0.679371  0.532500      0
1   0.633105  0.712500  0.653128  0.617500      1
2   0.560715  0.705833  0.603827  0.702857      2
3   0.468437  0.794167  0.533967  0.737143      3
4   0.345662  0.820000  0.427476  0.795357      4
..       ...       ...       ...       ...    ...
95  0.202806  0.918333  0.202421  0.921071     95
96  0.202806  0.918333  0.202421  0.921071     96
97  0.202806  0.918333  0.202421  0.921071     97
98  0.202806  0.918333  0.202421  0.921071     98
99  0.202806  0.918333  0.202421  0.921071     99

100 rows × 5 columns
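The identical rows from epoch 95 onward are consistent with configure_optimizers above: StepLR multiplies the learning rate by gamma=0.0001 every 10 epochs, so after a couple of decays the updates become vanishingly small and the metrics freeze. A quick check (my addition):

# learning rate after k StepLR decays, starting from lr=1e-2
for k in range(4):
    print(k, 1e-2 * 1e-4 ** k)  # 0.01, 1e-06, 1e-10, 1e-14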

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_'+metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric, 'val_'+metric])
    plt.show()

plot_metric(dfhistory,"loss")

Results:

plot_metric(dfhistory,"acc")

Results:

results = trainer.test(model, test_dataloaders=dl_valid, verbose = False)
print(results[0])

Results:

Testing: 0it [00:00, ?it/s]

{'test_loss': 0.20280574262142181, 'test_acc': 0.9183333516120911}

4.3.6 Predict

def predict(model,dl):
    model.eval()
    prediction = torch.cat([model.forward(t[0].to(model.device)) for t in dl])
    result = torch.where(prediction>0.5,torch.ones_like(prediction),torch.zeros_like(prediction))
    return(result.data)

result = predict(model,dl_valid)

result

Results:

tensor([[1.],
        [1.],
        [0.],
        ...,
        [0.],
        [0.],
        [1.]])
-------------This blog is over! Thanks for reading-------------