Deep Learning Is Not Just for Image Classification

Deep learning’s effectiveness for classifying images has been widely discussed in recent years, even showing superhuman results on complex tasks like recognizing malignant tumors in CT scans. But it can do a lot more than this, as we will show here.

For instance, let’s talk about something that is critically important for autonomous vehicles: localizing objects in a picture. If a self-driving car doesn’t know where a pedestrian is, then it doesn’t know how to avoid one! Creating a model that can recognize the content of every individual pixel in an image is called segmentation. Here is how we can train a segmentation model with fastai, using a subset of the Camvid dataset from the paper “Semantic Object Classes in Video: A High-Definition Ground Truth Database” by Gabruel J. Brostow, Julien Fauqueur, and Roberto Cipolla:

In [ ]:

  1. path = untar_data(URLs.CAMVID_TINY)
  2. dls = SegmentationDataLoaders.from_label_func(
  3. path, bs=8, fnames = get_image_files(path/"images"),
  4. label_func = lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
  5. codes = np.loadtxt(path/'codes.txt', dtype=str)
  6. )
  7. learn = unet_learner(dls, resnet34)
  8. learn.fine_tune(8)
epochtrain_lossvalid_losstime
02.6418622.14056800:02
epochtrain_lossvalid_losstime
01.6249641.46421000:02
11.4541481.28403200:02
21.3429551.04856200:02
31.1997650.85278700:02
41.0780900.83820600:02
50.9754960.74680600:02
60.8927930.72538400:02
70.8276450.72677800:02

We are not even going to walk through this code line by line, because it is nearly identical to our previous example! (Although we will be doing a deep dive into segmentation models in <>, along with all of the other models that we are briefly introducing in this chapter, and many, many more.)

We can visualize how well it achieved its task, by asking the model to color-code each pixel of an image. As you can see, it nearly perfectly classifies every pixel in every object. For instance, notice that all of the cars are overlaid with the same color and all of the trees are overlaid with the same color (in each pair of images, the lefthand image is the ground truth label and the right is the prediction from the model):

In [ ]:

  1. learn.show_results(max_n=6, figsize=(7,8))

Deep Learning Is Not Just for Image Classification - 图1

One other area where deep learning has dramatically improved in the last couple of years is natural language processing (NLP). Computers can now generate text, translate automatically from one language to another, analyze comments, label words in sentences, and much more. Here is all of the code necessary to train a model that can classify the sentiment of a movie review better than anything that existed in the world just five years ago:

In [ ]:

  1. from fastai.text.all import *
  2. dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test')
  3. learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
  4. learn.fine_tune(4, 1e-2)
epochtrain_lossvalid_lossaccuracytime
00.8787760.7487530.50040001:27
epochtrain_lossvalid_lossaccuracytime
00.6791180.6747780.58404002:45
10.6536710.6703960.61804002:55
20.5986650.5518150.71892005:28
30.5568120.5074500.75248003:11