PyTorch Learning Notes: Saving and Loading Network Models
This is an old chestnut of a problem: I needed it yesterday and still couldn't do it, so I went back through the tutorials. Sorted now.
One thing still open: I want to visualize the training results with pyplot, but I haven't worked out the dimensionality reduction of the data (e.g. the MNIST tensors) or the moving of data to and from the GPU (there's a rough sketch of one approach after file 2 below).
Most tutorials on saving and loading models just copy from each other and never explain the overall structure of the code.
So to be clear here: when you load the model again, the network class definition must be present again (you can simply paste it over).
It's split into two files. I've only done simple loading, since that's enough for now; individual parameters, freezing layers, and so on can wait.
The ckpt file here stores the model's graph plus all of its weights and state. The official docs recommend saving only the state_dict; saving the full model pickles everything and is slower. I'm saving the whole thing anyway, since it's only a small LSTM.
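For reference, the state_dict route the docs recommend would look roughly like this (a minimal sketch against the `RNN` class defined below; the filename `TestSave_state.ckpt` is just an example):

```python
# Save only the learned parameters (the officially recommended approach)
torch.save(model.state_dict(), 'TestSave_state.ckpt')

# To load: rebuild the network first, then restore the weights into it
model = RNN(input_size, hidden_size, num_layers, num_classes).to(device)
model.load_state_dict(torch.load('TestSave_state.ckpt'))
model.eval()
```

Either way, a checkpoint written on GPU can be loaded on a CPU-only machine by passing `map_location='cpu'` (or `map_location=device`) to `torch.load`.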
Create file 1, reusing yesterday's LSTM model:
```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyperparameters
sequence_length = 28
input_size = 28
hidden_size = 128
num_layers = 2
num_classes = 10
batch_size = 100
num_epochs = 2
learning_rate = 0.01

# MNIST dataset and loaders
train_dataset = torchvision.datasets.MNIST(root='./data/', train=True,
                                           transform=transforms.ToTensor(), download=True)
test_dataset = torchvision.datasets.MNIST(root='./data/', train=False,
                                          transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # Initial hidden and cell states, zeroed for each batch
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        out, _ = self.lstm(x, (h0, c0))
        # Classify on the hidden state of the last time step
        out = self.fc(out[:, -1, :])
        return out

model = RNN(input_size, hidden_size, num_layers, num_classes).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Treat each 28x28 image as a sequence of 28 rows
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))

# Test the model
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))

# Save the entire model (graph + weights), not just the state_dict
torch.save(model, 'TestSave.ckpt')

# Test Accuracy of the model on the 10000 test images: 97.37 %
```
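One caveat with saving the whole model this way: `torch.save` pickles the entire module, so a later `torch.load` needs the same `RNN` class definition to be in scope, which is exactly why file 2 repeats it below.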
At this point you'll see the saved model in the project root. Now let's load it and run the same test data through it, to check whether the two accuracies match.
Create file 2, which loads the model and parameters we just saved:
```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt  # for the visualization I still want to add

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyperparameters (must match file 1)
sequence_length = 28
input_size = 28
hidden_size = 128
num_layers = 2
num_classes = 10
batch_size = 100

# Only the test set is needed here
test_dataset = torchvision.datasets.MNIST(root='./data/', train=False,
                                          transform=transforms.ToTensor())
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

# Earlier attempt at pulling the raw test tensors out for plotting
# (Variable/volatile are deprecated; kept for reference):
# test_x = Variable(torch.unsqueeze(test_dataset.test_data, dim=1), volatile=True).type(torch.FloatTensor) / 255
# test_y = test_dataset.test_labels

# The class definition must be present (pasted over from file 1) so that
# torch.load can unpickle the saved model; no fresh instance is needed.
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        out, _ = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])
        return out

# Load the whole saved model and switch to evaluation mode
model = torch.load('TestSave.ckpt')
model.eval()

# print(model.state_dict())
# for k, v in enumerate(model.state_dict()):
#     print(k, v)

with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))

# Test Accuracy of the model on the 10000 test images: 97.31 %
```
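Since pyplot is already imported in file 2, here's a rough sketch of the visualization step I mentioned at the top and hadn't sorted out. The key detail is that anything living on the GPU has to come back via `.cpu()` before matplotlib/numpy can touch it. This is an untested sketch appended after the accuracy check, not part of the two files above:

```python
# Grab one batch from the test loader and predict on it
images, labels = next(iter(test_loader))
with torch.no_grad():
    outputs = model(images.reshape(-1, sequence_length, input_size).to(device))
    _, predicted = torch.max(outputs, 1)

# Plot the first few digits with their predicted labels
fig, axes = plt.subplots(1, 5, figsize=(10, 2))
for i, ax in enumerate(axes):
    # each image is 1x28x28; squeeze off the channel dim for imshow,
    # and .cpu() is a safe no-op here since DataLoader tensors are on CPU
    ax.imshow(images[i].squeeze().cpu().numpy(), cmap='gray')
    ax.set_title('pred: {}'.format(predicted[i].item()))
    ax.axis('off')
plt.show()
```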
Both runs land at about 97.3%, so the loaded model matches the original. That's enough for now; on to the next topic.