==============================================================
We'll be using this dataset of 102 flower categories; you can see a few examples below.
The project is broken down into multiple steps:
- Load and preprocess the image dataset
- Train the image classifier on your dataset
- Use the trained classifier to predict image content
We'll lead you through each part which you'll implement in Python.
When you've completed this project, you'll have an application that can be trained on any set of labeled images. Here, your network will learn about flowers and end up as a command-line application. But what you do with your new skills depends on your imagination and the effort you put into building a dataset. For example, imagine an app where you take a picture of a car and it tells you the make and model, then looks up information about it. Go build your own dataset and make something new.
First up is importing the packages you'll need. It's good practice to keep all the imports at the beginning of your code. As you work through this notebook and find you need to import a package, make sure to add the import up here.
In [9]:
# Imports here
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch import optim
import seaborn as sns
import torch.nn.functional as F
from torchvision import datasets, transforms, models
from utils import active_session
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
Here you'll use torchvision to load the data (documentation). The data should be included alongside this notebook; otherwise, you can download it here.
The dataset is split into three parts: training, validation, and testing. For the training set, you'll want to apply transformations such as random scaling, cropping, and flipping. This will help the network generalize, leading to better performance. You'll also need to make sure the input data is resized to 224x224 pixels, as required by the pre-trained networks.
The validation and testing sets are used to measure the model's performance on data it hasn't seen yet. For this you don't want any scaling or rotation transformations, but you'll need to resize then crop the images to the appropriate size.
The pre-trained networks you'll use were trained on the ImageNet dataset, where each color channel was normalized separately. For all three sets you'll need to normalize the means and standard deviations of the images to what the network expects. For the means, it's [0.485, 0.456, 0.406], and for the standard deviations, [0.229, 0.224, 0.225], calculated from the ImageNet images. These values will shift each color channel to be centered at 0 with a standard deviation of 1.
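Concretely, Normalize applies (value - mean) / std to each channel. A tiny worked check, where the 0.8 value is just an illustrative red-channel intensity:
value = 0.8                              # a red-channel intensity after ToTensor scales pixels to 0-1
normalized = (value - 0.485) / 0.229     # subtract the channel mean, divide by its std
print(normalized)                        # ≈ 1.376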
In [10]:
data_dir = 'flowers'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'
test_dir = data_dir + '/test'
In [12]:
# TODO: Define your transforms for the training, validation, and testing sets
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406],
                                                            [0.229, 0.224, 0.225])])

test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406],
                                                           [0.229, 0.224, 0.225])])

validation_transforms = transforms.Compose([transforms.Resize([224, 224]),
                                            transforms.ToTensor(),
                                            transforms.Normalize([0.485, 0.456, 0.406],
                                                                 [0.229, 0.224, 0.225])])

# TODO: Load the datasets with ImageFolder
train_dataset = datasets.ImageFolder(train_dir, transform=train_transforms)
test_dataset = datasets.ImageFolder(test_dir, transform=test_transforms)
validation_dataset = datasets.ImageFolder(valid_dir, transform=validation_transforms)

# TODO: Using the image datasets and the transforms, define the dataloaders
train_data_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_data_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64)
validation_data_loader = torch.utils.data.DataLoader(validation_dataset, batch_size=64)
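As a quick optional sanity check (not required by the project), you can pull one batch from the training loader and confirm the tensor shapes match what the pre-trained networks expect:
# One batch should contain 64 images of shape 3 x 224 x 224 plus 64 labels
images, labels = next(iter(train_data_loader))
print(images.shape)   # expected: torch.Size([64, 3, 224, 224])
print(labels.shape)   # expected: torch.Size([64])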
You'll also need to load in a mapping from category label to category name. You can find this in the file cat_to_name.json. It's a JSON object which you can read in with the json module. This will give you a dictionary mapping the integer-encoded categories to the actual names of the flowers.
In [11]:
import json
with open('cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)
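A quick optional check of the mapping: there should be 102 entries, and note that the keys are strings (the folder names), not integers:
print(len(cat_to_name))    # 102 flower categories
print(cat_to_name['1'])    # look up a category by its string key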
==============================================================================
Now that the data is ready, it's time to build and train the classifier.
As usual, you should use one of the pretrained models from torchvision.models to get the image features. Build and train a new feed-forward classifier using those features.
We're going to leave this part up to you. Refer to the rubric for guidance on successfully completing this section. Things you'll need to do:
- Load a pre-trained network (If you need a starting point, the VGG networks work great and are straightforward to use)
- Define a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
- Train the classifier layers with backpropagation, using the pre-trained network to get the features
- Track the loss and accuracy on the validation set to determine the best hyperparameters
We've left a cell open for you below, but use as many as you need. Our advice is to break the problem up into smaller parts you can run separately. Check that each part is doing what you expect, then move on to the next. You'll likely find that as you work through each part, you'll need to go back and modify your previous code. This is totally normal!
When training, make sure you're updating only the weights of the feed-forward network. You should be able to get the validation accuracy above 70% if you build everything right. Make sure to try different hyperparameters (learning rate, units in the classifier, epochs, etc.) to find the best model. Save those hyperparameters to use as default values in the next part of the project.
One last important tip if you're using the workspace to run your code: to avoid having your workspace disconnect during the long-running tasks in this notebook, please read the page earlier in this lesson, Intro to GPU Workspaces, about keeping your session active. You'll want to include code from the workspace_utils.py module.
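The pattern is simply to wrap any long-running loop in the active_session context manager, as the training cell below does. A minimal sketch (this notebook imports active_session from utils; the lesson's module is named workspace_utils.py):
from utils import active_session

with active_session():
    # long-running work goes here, e.g. the training loop below
    ...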
Note for Workspace users: If your network is over 1 GB when saved as a checkpoint, there might be issues with saving backups in your workspace. Typically this happens with wide dense layers after the convolutional layers. If your saved checkpoint is larger than 1 GB (you can open a terminal and check with ls -lh), you should reduce the size of your hidden layers and train again.
In [43]:
# TODO: Build and train your network
model = models.vgg19(pretrained=True)

# Freeze the pre-trained feature extractor so only the new classifier is trained
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier with a new feed-forward network for the 102 flower classes
classifier = nn.Sequential(nn.Linear(25088, 4096),
                           nn.ReLU(),
                           nn.Dropout(0.4),
                           nn.Linear(4096, 102),
                           nn.LogSoftmax(dim=1))
model.classifier = classifier

criterion = nn.NLLLoss()
# Only the classifier parameters are optimized; the frozen features are left alone
optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)

device = torch.device("cuda")  # training assumes a GPU is available
model.to(device)
Out[43]:
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace)
(16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(17): ReLU(inplace)
(18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace)
(23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(24): ReLU(inplace)
(25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(26): ReLU(inplace)
(27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace)
(30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(31): ReLU(inplace)
(32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(33): ReLU(inplace)
(34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(35): ReLU(inplace)
(36): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.4)
(3): Linear(in_features=4096, out_features=102, bias=True)
(4): LogSoftmax()
)
)
In [44]:
epochs = 2
steps = 0
train_loss = 0
print_interval = 5

with active_session():
    for epoch in range(epochs):
        for inputs, labels in train_data_loader:
            steps += 1
            inputs, labels = inputs.to(device), labels.to(device)

            optimizer.zero_grad()
            logps = model.forward(inputs)
            loss = criterion(logps, labels)
            loss.backward()
            optimizer.step()

            train_loss += loss.item()

            # Every few steps, evaluate on the validation set
            if steps % print_interval == 0:
                test_loss = 0
                accuracy = 0
                model.eval()
                with torch.no_grad():
                    for inputs, labels in validation_data_loader:
                        inputs, labels = inputs.to(device), labels.to(device)
                        log_test_ps = model.forward(inputs)
                        batch_loss = criterion(log_test_ps, labels)
                        test_loss += batch_loss.item()

                        # Accuracy: compare the top-1 predicted class against the labels
                        ps = torch.exp(log_test_ps)
                        top_p, top_class = ps.topk(1, dim=1)
                        equals = top_class == labels.view(*top_class.shape)
                        accuracy += torch.mean(equals.type(torch.cuda.FloatTensor)).item()

                print(f"Steps: {steps} Epoch {epoch+1}/{epochs}.. "
                      f"Train loss: {train_loss/print_interval:.3f}.. "
                      f"Validation loss: {test_loss/len(validation_data_loader):.3f}.. "
                      f"Validation accuracy: {accuracy/len(validation_data_loader):.3f}")

                train_loss = 0
                model.train()
Steps: 5 Epoch 1/2.. Train loss: 10.856.. Validation loss: 11.561.. Validation accuracy: 0.041
Steps: 10 Epoch 1/2.. Train loss: 10.738.. Validation loss: 8.053.. Validation accuracy: 0.139
Steps: 15 Epoch 1/2.. Train loss: 7.406.. Validation loss: 4.877.. Validation accuracy: 0.139
Steps: 20 Epoch 1/2.. Train loss: 4.492.. Validation loss: 3.606.. Validation accuracy: 0.246
Steps: 25 Epoch 1/2.. Train loss: 3.650.. Validation loss: 2.996.. Validation accuracy: 0.336
Steps: 30 Epoch 1/2.. Train loss: 3.098.. Validation loss: 2.694.. Validation accuracy: 0.390
Steps: 35 Epoch 1/2.. Train loss: 2.912.. Validation loss: 2.448.. Validation accuracy: 0.425
Steps: 40 Epoch 1/2.. Train loss: 2.888.. Validation loss: 2.095.. Validation accuracy: 0.502
Steps: 45 Epoch 1/2.. Train loss: 2.449.. Validation loss: 1.899.. Validation accuracy: 0.525
Steps: 50 Epoch 1/2.. Train loss: 2.282.. Validation loss: 1.805.. Validation accuracy: 0.565
Steps: 55 Epoch 1/2.. Train loss: 2.157.. Validation loss: 1.633.. Validation accuracy: 0.592
Steps: 60 Epoch 1/2.. Train loss: 2.022.. Validation loss: 1.524.. Validation accuracy: 0.613
Steps: 65 Epoch 1/2.. Train loss: 2.084.. Validation loss: 1.426.. Validation accuracy: 0.648
Steps: 70 Epoch 1/2.. Train loss: 1.993.. Validation loss: 1.441.. Validation accuracy: 0.621
Steps: 75 Epoch 1/2.. Train loss: 2.030.. Validation loss: 1.411.. Validation accuracy: 0.661
Steps: 80 Epoch 1/2.. Train loss: 2.022.. Validation loss: 1.427.. Validation accuracy: 0.644
Steps: 85 Epoch 1/2.. Train loss: 2.022.. Validation loss: 1.224.. Validation accuracy: 0.668
Steps: 90 Epoch 1/2.. Train loss: 1.884.. Validation loss: 1.147.. Validation accuracy: 0.692
Steps: 95 Epoch 1/2.. Train loss: 1.614.. Validation loss: 1.165.. Validation accuracy: 0.695
Steps: 100 Epoch 1/2.. Train loss: 1.882.. Validation loss: 1.179.. Validation accuracy: 0.687
Steps: 105 Epoch 2/2.. Train loss: 1.571.. Validation loss: 1.178.. Validation accuracy: 0.694
Steps: 110 Epoch 2/2.. Train loss: 1.710.. Validation loss: 1.191.. Validation accuracy: 0.684
Steps: 115 Epoch 2/2.. Train loss: 1.525.. Validation loss: 1.094.. Validation accuracy: 0.727
Steps: 120 Epoch 2/2.. Train loss: 1.554.. Validation loss: 1.163.. Validation accuracy: 0.687
Steps: 125 Epoch 2/2.. Train loss: 1.680.. Validation loss: 1.128.. Validation accuracy: 0.709
Steps: 130 Epoch 2/2.. Train loss: 1.418.. Validation loss: 0.997.. Validation accuracy: 0.741
Steps: 135 Epoch 2/2.. Train loss: 1.340.. Validation loss: 0.940.. Validation accuracy: 0.753
Steps: 140 Epoch 2/2.. Train loss: 1.252.. Validation loss: 0.950.. Validation accuracy: 0.746
Steps: 145 Epoch 2/2.. Train loss: 1.480.. Validation loss: 0.992.. Validation accuracy: 0.713
Steps: 150 Epoch 2/2.. Train loss: 1.467.. Validation loss: 1.052.. Validation accuracy: 0.717
Steps: 155 Epoch 2/2.. Train loss: 1.373.. Validation loss: 1.032.. Validation accuracy: 0.707
Steps: 160 Epoch 2/2.. Train loss: 1.419.. Validation loss: 0.990.. Validation accuracy: 0.736
Steps: 165 Epoch 2/2.. Train loss: 1.409.. Validation loss: 0.915.. Validation accuracy: 0.749
Steps: 170 Epoch 2/2.. Train loss: 1.414.. Validation loss: 0.935.. Validation accuracy: 0.749
Steps: 175 Epoch 2/2.. Train loss: 1.296.. Validation loss: 0.811.. Validation accuracy: 0.778
Steps: 180 Epoch 2/2.. Train loss: 1.318.. Validation loss: 0.923.. Validation accuracy: 0.760
Steps: 185 Epoch 2/2.. Train loss: 1.242.. Validation loss: 0.968.. Validation accuracy: 0.729
Steps: 190 Epoch 2/2.. Train loss: 1.437.. Validation loss: 0.969.. Validation accuracy: 0.738
Steps: 195 Epoch 2/2.. Train loss: 1.521.. Validation loss: 0.868.. Validation accuracy: 0.759
Steps: 200 Epoch 2/2.. Train loss: 1.439.. Validation loss: 0.852.. Validation accuracy: 0.761
Steps: 205 Epoch 2/2.. Train loss: 1.413.. Validation loss: 0.795.. Validation accuracy: 0.776
It's good practice to test your trained network on test data, images the network has never seen either in training or validation. This will give you a good estimate for the model's performance on completely new images. Run the test images through the network and measure the accuracy, the same way you did validation. You should be able to reach around 70% accuracy on the test set if the model has been trained well.
In [45]:
# TODO: Do validation on the test set
model.eval()
accuracy = 0
with torch.no_grad():
    for inputs, labels in test_data_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        log_test_ps = model.forward(inputs)
        ps = torch.exp(log_test_ps)
        top_p, top_class = ps.topk(1, dim=1)
        equals = top_class == labels.view(*top_class.shape)
        accuracy += torch.mean(equals.type(torch.cuda.FloatTensor)).item()
print(f"Test accuracy: {accuracy/len(test_data_loader):.3f}")
Test accuracy: 0.806
Now that your network is trained, save the model so you can load it later for making predictions. You probably want to save other things, such as the mapping of classes to indices, which you get from one of the image datasets: image_datasets['train'].class_to_idx. You can attach this to the model as an attribute, which makes inference easier later on.
model.class_to_idx = image_datasets['train'].class_to_idx
Remember that you'll want to completely rebuild the model later so you can use it for inference. Make sure to include any information you need in the checkpoint. If you want to load the model and keep training, you'll want to save the number of epochs as well as the optimizer state, optimizer.state_dict. You'll likely want to use this trained model in the next part of the project, so it's best to save it now.
In [61]:
# TODO: Save the checkpoint
model.cpu()
model.class_to_idx = train_dataset.class_to_idx
checkpoint = {'arch': 'vgg19',
              'state_dict': model.state_dict(),
              'class_to_idx': model.class_to_idx,
              'epochs': 2,
              'optimizer_state_dict': optimizer.state_dict(),  # call state_dict() so the actual dict is saved, not the method
              'classifier': model.classifier}
torch.save(checkpoint, 'checkpoint.pth')
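As a follow-up to the earlier note about the 1 GB checkpoint limit, you can also check the saved file's size from Python instead of ls -lh (an optional check; the exact size depends on your classifier):
import os

size_mb = os.path.getsize('checkpoint.pth') / 1e6
print(f"checkpoint.pth is {size_mb:.0f} MB")   # should stay well under the 1 GB workspace limit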
At this point it's good to write a function that can load a checkpoint and rebuild the model. That way you can come back to this project and keep working on it without having to retrain the network.
In [19]:
# TODO: Write a function that loads a checkpoint and rebuilds the model
def load_checkpoint(file_path):
    checkpoint = torch.load(file_path)
    return checkpoint

def load_model(checkpoint):
    if checkpoint:
        model = models.vgg19(pretrained=True)
        model.classifier = checkpoint['classifier']
        model.load_state_dict(checkpoint['state_dict'])
        model.class_to_idx = checkpoint['class_to_idx']
        for param in model.parameters():
            param.requires_grad = False
    else:
        print("Model can not be loaded")
        return
    return model
In [20]:
checkpoint = load_checkpoint('checkpoint.pth')
model = load_model(checkpoint)
model
Downloading: "https://download.pytorch.org/models/vgg19-dcbb9e9d.pth" to /root/.torch/models/vgg19-dcbb9e9d.pth
100%|██████████| 574673361/574673361 [00:09<00:00, 58826544.53it/s]
Out[20]:
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace)
(16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(17): ReLU(inplace)
(18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace)
(23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(24): ReLU(inplace)
(25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(26): ReLU(inplace)
(27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace)
(30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(31): ReLU(inplace)
(32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(33): ReLU(inplace)
(34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(35): ReLU(inplace)
(36): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.4)
(3): Linear(in_features=4096, out_features=102, bias=True)
(4): LogSoftmax()
)
)
==============================================================
Now you'll write a function to use a trained network for inference. That is, you'll pass an image into the network and predict the class of the flower in the image. Write a function called predict that takes an image and a model, then returns the top K most likely classes along with the probabilities. It should look like:
probs, classes = predict(image_path, model)
print(probs)
print(classes)
> [ 0.01558163 0.01541934 0.01452626 0.01443549 0.01407339]
> ['70', '3', '45', '62', '55']
First you'll need to handle processing the input image such that it can be used in your network.
You'll want to use PIL to load the image (documentation). It's best to write a function that preprocesses the image so it can be used as input for the model. This function should process the images in the same manner used for training.
First, resize the images so the shortest side is 256 pixels, keeping the aspect ratio. This can be done with the thumbnail or resize methods. Then you'll need to crop out the center 224x224 portion of the image.
Color channels of images are typically encoded as integers 0-255, but the model expects floats 0-1. You'll need to convert the values. It's easiest with a Numpy array, which you can get from a PIL image like so: np_image = np.array(pil_image).
As before, the network expects the images to be normalized in a specific way. For the means, it's [0.485, 0.456, 0.406] and for the standard deviations [0.229, 0.224, 0.225]. You'll want to subtract the means from each color channel, then divide by the standard deviation.
And finally, PyTorch expects the color channel to be the first dimension, but it's the third dimension in the PIL image and Numpy array. You can reorder dimensions using ndarray.transpose. The color channel needs to be first, and the order of the other two dimensions should be retained.
In [21]:
from PIL import Image
import numpy as np
from IPython.display import display
def process_image(image_path):
    MAX_SIZE = 256
    ratio = 0
    img = Image.open(image_path)
    width, height = img.size
    print("first size: %s %s" % (img.size))

    # Resize so the shortest side is 256 pixels, keeping the aspect ratio
    if width > height:
        ratio = MAX_SIZE / float(height)
        height = MAX_SIZE
        width = int(float(width) * float(ratio))
    else:
        ratio = MAX_SIZE / float(width)
        width = MAX_SIZE
        height = int(float(height) * float(ratio))
    print("second size: %s %s" % (width, height))
    img = img.resize((width, height))
    print(img.size)

    # Crop out the center 224x224 portion of the image
    left_margin = (img.width - 224) / 2
    up_margin = (img.height - 224) / 2
    right_margin = left_margin + 224
    bottom_margin = up_margin + 224
    # left, up, right, bottom
    img = img.crop((left_margin, up_margin, right_margin, bottom_margin))

    # Scale to floats 0-1, normalize with the ImageNet means and stds,
    # then move the color channel to the first dimension
    np_img = np.array(img) / 255
    mean = np.array([0.485, 0.456, 0.406])  # provided mean
    std = np.array([0.229, 0.224, 0.225])   # provided std
    np_img = (np_img - mean) / std
    np_img = np_img.transpose((2, 0, 1))
    return torch.from_numpy(np_img).type(torch.FloatTensor)
In [22]:
img_path = 'flowers/test/100/image_07896.jpg'
np_img = process_image(img_path)
first size: 603 500
second size: 308 256
(308, 256)
To check your work, the function below converts a PyTorch tensor and displays it in the notebook. If your process_image function works, running the output through this function should return the original image (except for the cropped-out portions).
In [43]:
def imshow(image, ax=None, title=None):
    """Imshow for Tensor."""
    if ax is None:
        fig, ax = plt.subplots()

    # PyTorch tensors assume the color channel is the first dimension
    # but matplotlib assumes it's the third dimension
    image = image.numpy().transpose((1, 2, 0))

    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean

    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)

    ax.imshow(image)

    return ax
In [44]:
imshow(np_img)
Out[44]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f6931629ef0>
Once you can get images in the correct format, it's time to write a function for making predictions with your model. A common practice is to predict the top 5 or so (usually called top-$K$) most probable classes. You'll want to calculate the class probabilities, then find the $K$ largest values.
To get the top $K$ largest values in a tensor, use x.topk(k). This method returns both the highest k probabilities and the indices of those probabilities corresponding to the classes. You need to convert from these indices to the actual class labels using class_to_idx, which hopefully you added to the model, or from the ImageFolder you used to load the data. Make sure to invert the dictionary so you get a mapping from index to class as well.
Again, this method should take a path to an image and a model checkpoint, then return the probabilities and classes.
probs, classes = predict(image_path, model)
print(probs)
print(classes)
> [ 0.01558163 0.01541934 0.01452626 0.01443549 0.01407339]
> ['70', '3', '45', '62', '55']
In [45]:
def predict(image_path, model, topk=5):
    model.eval()
    np_img = process_image(image_path)
    # Add a batch dimension so the tensor shape matches what the model expects
    np_img.unsqueeze_(0)
    with torch.no_grad():
        output = model.forward(np_img)
        ps = torch.exp(output)
        top_k_prob, top_k_class = ps.topk(topk, dim=1)
    return top_k_prob, top_k_class
In [46]:
def classToNames(class_to_idx, top_class):
    # Invert class_to_idx so indices map back to class labels,
    # then look up the flower names in cat_to_name
    idx_to_class = {val: key for key, val in class_to_idx.items()}
    top_labels = [idx_to_class[lab] for lab in top_class[0].numpy()]
    top_flowers = [cat_to_name[lab] for lab in top_labels]
    return top_flowers
In [50]:
probs, classes = predict(img_path, model)
probs, classes
top_flower_names = classToNames(model.class_to_idx,classes)
print(probs)
print(classes)
print(top_flower_names)
first size: 603 500
second size: 308 256
(308, 256)
tensor([[ 0.8379, 0.1457, 0.0052, 0.0051, 0.0038]])
tensor([[ 2, 52, 47, 6, 71]])
['blanket flower', 'sunflower', 'english marigold', "colt's foot", 'gazania']
Now that you can use a trained model for predictions, check to make sure it makes sense. Even if the testing accuracy is high, it's always good to check that there aren't obvious bugs. Use matplotlib to plot the probabilities for the top 5 classes as a bar graph, along with the input image. It should look like this:
You can convert from the class integer encoding to actual flower names with the cat_to_name.json file (it should have been loaded earlier in the notebook). To show a PyTorch tensor as an image, use the imshow function defined above.
In [51]:
import seaborn as sns
def view_classify(img_path, probs, flowers):
    plt.figure(figsize=(6, 10))
    ax = plt.subplot(2, 1, 1)

    # Set up title from the category number in the image path
    flower_num = img_path.split('/')[2]
    title_ = cat_to_name[flower_num]

    # Plot flower
    img = process_image(img_path)
    imshow(img, ax, title=title_)

    # Plot bar chart of the top class probabilities
    plt.subplot(2, 1, 2)
    sns.barplot(x=probs, y=flowers, color=sns.color_palette()[0])
    plt.show()
In [52]:
view_classify(img_path,probs[0].numpy(),top_flower_names)
first size: 603 500
second size: 308 256
(308, 256)