Your Deep Learning Journey - Deep Learning Is Not Just for Image Classification - 《The fastai book》

Deep Learning Is Not Just for Image Classification

Deep Learning Is Not Just for Image Classification

Deep learning’s effectiveness for classifying images has been widely discussed in recent years, even showing superhuman results on complex tasks like recognizing malignant tumors in CT scans. But it can do a lot more than this, as we will show here.

For instance, let’s talk about something that is critically important for autonomous vehicles: localizing objects in a picture. If a self-driving car doesn’t know where a pedestrian is, then it doesn’t know how to avoid one! Creating a model that can recognize the content of every individual pixel in an image is called segmentation. Here is how we can train a segmentation model with fastai, using a subset of the Camvid dataset from the paper “Semantic Object Classes in Video: A High-Definition Ground Truth Database” by Gabruel J. Brostow, Julien Fauqueur, and Roberto Cipolla:

In [ ]:

path = untar_data(URLs.CAMVID_TINY)
dls = SegmentationDataLoaders.from_label_func(
    path, bs=8, fnames = get_image_files(path/"images"),
    label_func = lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes = np.loadtxt(path/'codes.txt', dtype=str)
)
learn = unet_learner(dls, resnet34)
learn.fine_tune(8)

epoch	train_loss	valid_loss	time
0	2.641862	2.140568	00:02

epoch	train_loss	valid_loss	time
0	1.624964	1.464210	00:02
1	1.454148	1.284032	00:02
2	1.342955	1.048562	00:02
3	1.199765	0.852787	00:02
4	1.078090	0.838206	00:02
5	0.975496	0.746806	00:02
6	0.892793	0.725384	00:02
7	0.827645	0.726778	00:02

We are not even going to walk through this code line by line, because it is nearly identical to our previous example! (Although we will be doing a deep dive into segmentation models in <>, along with all of the other models that we are briefly introducing in this chapter, and many, many more.)

We can visualize how well it achieved its task, by asking the model to color-code each pixel of an image. As you can see, it nearly perfectly classifies every pixel in every object. For instance, notice that all of the cars are overlaid with the same color and all of the trees are overlaid with the same color (in each pair of images, the lefthand image is the ground truth label and the right is the prediction from the model):

In [ ]:

learn.show_results(max_n=6, figsize=(7,8))

One other area where deep learning has dramatically improved in the last couple of years is natural language processing (NLP). Computers can now generate text, translate automatically from one language to another, analyze comments, label words in sentences, and much more. Here is all of the code necessary to train a model that can classify the sentiment of a movie review better than anything that existed in the world just five years ago:

In [ ]:

from fastai.text.all import *
dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test')
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)

epoch	train_loss	valid_loss	accuracy	time
0	0.878776	0.748753	0.500400	01:27

epoch	train_loss	valid_loss	accuracy	time
0	0.679118	0.674778	0.584040	02:45
1	0.653671	0.670396	0.618040	02:55
2	0.598665	0.551815	0.718920	05:28
3	0.556812	0.507450	0.752480	03:11