Building a Model with PyTorch

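1. What Is a Model?

Before building the model, obtain a device object so that the model and its variables can later be moved between memories. The two device names used here are 'cuda' and 'cpu'. You normally need to set the device explicitly only when a GPU is available, so that variables can be moved from main memory to GPU memory when needed. Without a GPU you can skip this step: PyTorch keeps all parameters in main memory by default.

The implementation is as follows: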

# In the layer definitions, the layers grouped under one nn.Sequential are treated as a single layer
import torch
import numpy as np
import matplotlib.pyplot as plt
# Import the nn and optim (optimizer) modules
from torch import nn, optim
from torchsummary import summary
from keras.datasets import mnist
from keras.utils import to_categorical

# Custom class that subclasses nn.Module
class TorchModelTest(nn.Module):
    def __init__(self, device):
        super().__init__()
        # Sequential manages all the layers
        self.layer1 = nn.Sequential(nn.Flatten(), nn.Linear(28*28, 512), nn.ReLU())
        self.layer2 = nn.Sequential(nn.Linear(512, 512), nn.ReLU())
        self.layer3 = nn.Sequential(nn.Linear(512, 512), nn.ReLU())
        self.layer4 = nn.Sequential(nn.Linear(512, 10), nn.Softmax(dim=-1))
        # Move the parameters to the chosen device (the argument is otherwise unused)
        self.to(device)

Define the device:

device = torch.device('cpu')

With the GPU build of PyTorch, the statement is instead:

device = torch.device('cuda')
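Rather than hard-coding one of the two names, a common pattern is to choose the device at run time. The following is a minimal sketch, assuming only that torch is installed; the tensor names x and x_on_device are illustrative and do not appear in the original code.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Tensors are created in main memory by default...
x = torch.zeros(2, 3)
# ...and .to(device) returns a tensor on the target device
# (the same tensor if it is already there).
x_on_device = x.to(device)
print(x.device, x_on_device.device)
```

The same device object can then be passed to both the model and the data, so the rest of the script does not need to change between machines.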

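1.1 Defining the model optimizer

Unlike TensorFlow, PyTorch needs the parameters to be optimized passed in when the optimizer is defined, here self.parameters(), which yields all parameters currently registered on the model. The optimizer stores references to those parameter tensors rather than copies, so its in-place updates are the model's updates. It only collects the parameters that exist at the moment it is constructed, however, so define the optimizer after all the layers; a layer added later is not optimized unless it is registered with add_param_group(). The optimizer can also be defined outside the class by passing in model.parameters(). Here we define stochastic gradient descent (SGD).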

        # Inside __init__, after all layers have been defined
        self.opt = optim.SGD(self.parameters(), lr=0.01)
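Which parameters an optimizer will actually update can be checked through its param_groups attribute. The sketch below uses a hypothetical Demo class (not part of the article's model) that deliberately creates the optimizer before its last layer; in this setup the optimizer holds only the parameters that existed when it was constructed.

```python
import torch
from torch import nn, optim

class Demo(nn.Module):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()
        self.first = nn.Linear(4, 4)
        # The optimizer is created here, so it only sees self.first.
        self.opt = optim.SGD(self.parameters(), lr=0.01)
        # This layer is registered on the module but not in the optimizer.
        self.late = nn.Linear(4, 2)

demo = Demo()
n_model = len(list(demo.parameters()))                        # weights and biases of both layers
n_opt = sum(len(g['params']) for g in demo.opt.param_groups)  # only those of self.first
print(n_model, n_opt)

# add_param_group registers the late layer's parameters explicitly.
demo.opt.add_param_group({'params': list(demo.late.parameters())})
n_opt_after = sum(len(g['params']) for g in demo.opt.param_groups)
print(n_opt_after)
```

Creating the optimizer as the last statement of __init__ avoids the need for add_param_group altogether.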

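1.2 Defining forward propagation

This plays the same role as call in TensorFlow: calling the model instance, as in model(inputs), runs this function.

    def forward(self, inputs):
        x = self.layer1(inputs)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        return x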
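To see the dispatch from a call on the instance to forward, here is a small sketch with a hypothetical TinyNet class standing in for the full model; the input is random data shaped like a batch of three 28x28 single-channel images.

```python
import torch
from torch import nn

class TinyNet(nn.Module):  # hypothetical stand-in for the article's model
    def __init__(self):
        super().__init__()
        self.layer = nn.Sequential(
            nn.Flatten(), nn.Linear(28 * 28, 10), nn.Softmax(dim=-1)
        )

    def forward(self, inputs):
        return self.layer(inputs)

net = TinyNet()
batch = torch.randn(3, 1, 28, 28)   # 3 fake single-channel 28x28 images
probs = net(batch)                  # nn.Module.__call__ dispatches to forward
print(probs.shape)                  # torch.Size([3, 10])
```

Because the last layer is a softmax, each row of probs sums to 1.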

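1.3 Cross-entropy

The loss function is built into the model. It computes the cross-entropy between the predictions and the true (one-hot) labels.

    def get_loss(self, true_labels, predicts):
        loss = -true_labels * torch.log(predicts)
        loss = torch.mean(loss)
        return loss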
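For comparison, the sketch below evaluates the same formula on made-up logits and class indices and checks it against PyTorch's built-in torch.nn.functional.cross_entropy. Since torch.mean averages over every element (samples times classes) while the built-in averages over samples only, the two are expected to differ by a factor equal to the number of classes.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes = 10
logits = torch.randn(5, num_classes)       # made-up scores for 5 samples
targets = torch.tensor([3, 0, 9, 1, 4])    # made-up class indices

predicts = torch.softmax(logits, dim=-1)   # what a final Softmax layer would output
true_labels = F.one_hot(targets, num_classes).float()

# Loss as written in this section: mean over every element of -y * log(p).
manual = torch.mean(-true_labels * torch.log(predicts))

# Built-in cross-entropy takes raw logits and averages over samples only.
builtin = F.cross_entropy(logits, targets)

print(manual.item(), (builtin / num_classes).item())
```

The constant factor only rescales the gradient, so it behaves like a smaller learning rate rather than a different objective.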

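1.4 Optimization

In PyTorch, each parameter's gradient is stored on the parameter itself and can be inspected with tensor.grad. Every time backward() is called on a loss, PyTorch adds the gradients of the trainable parameters with respect to that loss to whatever gradients are already stored. If you do not want this accumulation, the gradients must be cleared first; because the trainable parameters have already been passed to the optimizer, calling zero_grad() on the optimizer clears them. Accumulation is useful when there is not enough memory to compute the gradient of a whole batch at once: split the batch into parts, compute the loss for each part and call backward() on it, and the stored gradient ends up being that of the whole batch. Once the gradients are ready, call the optimizer's step(), which performs one optimization step on the trainable parameters based on their gradients.

    # Note: this name shadows nn.Module.train(mode), so model.eval() would fail on this class
    def train(self, imgs, labels):
        predicts = self(imgs)
        loss = self.get_loss(labels, predicts)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()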
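The accumulation behaviour described above can be checked on a toy problem. This is a sketch under stated assumptions: a single nn.Linear layer, random data, and a mean-squared-error loss, none of which come from the article. Each chunk's loss is scaled by its share of the batch so that the accumulated gradient matches the full-batch one.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)
model = nn.Linear(4, 1)
opt = optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(8, 4), torch.randn(8, 1)

# Reference: gradient of the mean squared error over the full batch.
opt.zero_grad()
((model(x) - y) ** 2).mean().backward()
full_grad = model.weight.grad.clone()

# Accumulation: two chunks of 4, each loss scaled by its share of the batch.
opt.zero_grad()
for xc, yc in zip(x.chunk(2), y.chunk(2)):
    chunk_loss = ((model(xc) - yc) ** 2).mean() * (len(xc) / len(x))
    chunk_loss.backward()          # adds to .grad instead of overwriting it
accum_grad = model.weight.grad.clone()

opt.step()                         # one update using the accumulated gradient
print(torch.allclose(full_grad, accum_grad, atol=1e-6))
```

Without the scaling factor, the accumulated gradient would be the sum of the per-chunk mean gradients, which is larger than the full-batch gradient by the number of chunks.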

1.5 Displaying the model with torchsummary

model = TorchModelTest(device)
summary(model, (1, 28, 28), 3, device='cpu')
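If torchsummary is not installed, the total parameter count can be cross-checked with plain PyTorch. The sketch below rebuilds the same layer sizes as a standalone nn.Sequential (an illustrative stand-in, not the class above) and sums the element counts of its parameters.

```python
import torch
from torch import nn

# Same layer sizes as the model above, rebuilt standalone for illustration.
layers = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10), nn.Softmax(dim=-1),
)
total = sum(p.numel() for p in layers.parameters())
trainable = sum(p.numel() for p in layers.parameters() if p.requires_grad)
print(total, trainable)   # 932362 932362
```

The figure breaks down as 401,920 parameters for the first linear layer, 262,656 for each of the two hidden layers, and 5,130 for the output layer.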

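2. Model Training and Visualization

With the model defined, the next step is to train it. I find the MNIST dataset that ships with PyTorch inconvenient to use, so this example uses the one bundled with Keras and defines a generator that serves the data in batches. The accuracy is printed once every 50 iterations.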

def get_data(device, is_train=True, batch=1024, num=10000):
    train_data, test_data = mnist.load_data()
    if is_train:
        imgs, labels = train_data
    else:
        imgs, labels = test_data
    # Scale pixels to [-1, 1] and add a channel axis
    imgs = (imgs / 255 * 2 - 1)[:, np.newaxis, ...]
    labels = to_categorical(labels, 10)
    imgs = torch.tensor(imgs, dtype=torch.float32).to(device)
    labels = torch.tensor(labels, dtype=torch.float32).to(device)
    i = 0
    while True:
        i += batch
        if i > num:
            i = batch
        yield imgs[i - batch : i], labels[i - batch : i]

train_dg = get_data(device, True, batch=4096, num=60000)
test_dg = get_data(device, False, batch=5000, num=10000)
model = TorchModelTest(device)
summary(model, (1, 28, 28), 3, device='cpu')
ACCs = []
import time
start = time.time()
for j in range(20000):
    # Training
    imgs, labels = next(train_dg)
    model.train(imgs, labels)
    # Validation
    img, label = next(test_dg)
    predicts = model(img)
    acc = 1 - torch.count_nonzero(torch.argmax(predicts, dim=1) - torch.argmax(label, dim=1)) / label.shape[0]
    if j % 50 == 0:
        t = time.time() - start
        start = time.time()
        ACCs.append(acc.cpu().numpy())
        print(j, t, 'ACC: ', acc)
# Plot
x = np.linspace(0, len(ACCs), len(ACCs))
plt.plot(x, ACCs)
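The wrap-around logic of the data generator is easier to follow on a tiny example. The sketch below isolates it in a hypothetical helper, cycle_batches, applied to a plain list of ten made-up samples instead of MNIST tensors.

```python
def cycle_batches(data, batch, num):
    """Yield consecutive slices of `data`, restarting once a slice would pass `num`."""
    i = 0
    while True:
        i += batch
        if i > num:
            i = batch      # wrap around to the first batch
        yield data[i - batch:i]

samples = list(range(10))          # 10 made-up samples
gen = cycle_batches(samples, batch=4, num=10)
first_three = [next(gen) for _ in range(3)]
print(first_three)   # [[0, 1, 2, 3], [4, 5, 6, 7], [0, 1, 2, 3]]
```

Note that samples 8 and 9 are never served, because a partial final batch is skipped. By the same arithmetic, with batch=4096 and num=60000 the generator cycles through 14 full batches (57,344 images) and never yields the remaining 2,656 training images, whereas batch=5000 with num=10000 covers the test set exactly.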