Learner for the vision applications


All the functions necessary to build a Learner suitable for transfer learning in computer vision.


The most important functions of this module are cnn_learner and unet_learner. They will help you define a Learner using a pretrained model. See the vision tutorial for examples of use.

Cut a pretrained model

By default, the fastai library cuts a pretrained model at the pooling layer. This function helps detect it.

has_pool_type[source]

has_pool_type(m)

Return True if m is a pooling layer or has one in its children

m = nn.Sequential(nn.AdaptiveAvgPool2d(5), nn.Linear(2,3), nn.Conv2d(2,3,1), nn.MaxPool3d(5))
assert has_pool_type(m)
test_eq([has_pool_type(m_) for m_ in m.children()], [True,False,False,True])

create_body[source]

create_body(arch, n_in=3, pretrained=True, cut=None)

Cut off the body of a typically pretrained arch as determined by cut

cut can either be an integer, in which case we cut the model at the corresponding layer, or a function, in which case this function returns cut(model). If cut is None, it defaults to cutting at the first layer that contains some pooling.

tst = lambda pretrained: nn.Sequential(nn.Conv2d(3,5,3), nn.BatchNorm2d(5), nn.AvgPool2d(1), nn.Linear(3,4))
m = create_body(tst)
test_eq(len(m), 2)
m = create_body(tst, cut=3)
test_eq(len(m), 3)
m = create_body(tst, cut=noop)
test_eq(len(m), 4)
for n in range(1,5):
    m = create_body(tst, n_in=n)
    test_eq(_get_first_layer(m)[0].in_channels, n)
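The number of features the body ends with is what you will pass as nf to create_head below; one way to check it is fastai's num_features_model (an illustrative sketch):

body = create_body(models.resnet18, pretrained=False)
test_eq(num_features_model(body), 512)  # a resnet18 body ends with 512 channels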

Head and model

create_head[source]

create_head(nf, n_out, lin_ftrs=None, ps=0.5, concat_pool=True, first_bn=True, bn_final=False, lin_first=False, y_range=None)

Model head that takes nf features, runs through lin_ftrs, and outputs n_out classes.

The head begins with fastai’s AdaptiveConcatPool2d if concat_pool=True; otherwise, it uses traditional average pooling. It then uses a Flatten layer before going on with blocks of BatchNorm, Dropout and Linear layers (if lin_first=True, those are Linear, BatchNorm, Dropout).

Those blocks start at nf, then go through every element of lin_ftrs (defaults to [512]) and end at n_out. ps is a list of probabilities used for the dropouts (if you only pass one value, it will use half that value, then that value as many times as necessary).

If first_bn=True, a BatchNorm layer is added just after the pooling operations. If bn_final=True, a final BatchNorm layer is added. If y_range is passed, the function adds a SigmoidRange to that range.

tst = create_head(5, 10)
tst

Sequential(
  (0): AdaptiveConcatPool2d(
    (ap): AdaptiveAvgPool2d(output_size=1)
    (mp): AdaptiveMaxPool2d(output_size=1)
  )
  (1): Flatten(full=False)
  (2): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): Dropout(p=0.25, inplace=False)
  (4): Linear(in_features=10, out_features=512, bias=False)
  (5): ReLU(inplace=True)
  (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (7): Dropout(p=0.5, inplace=False)
  (8): Linear(in_features=512, out_features=10, bias=False)
)
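The keyword arguments change that structure accordingly. For instance, a quick (illustrative) check that a single ps value is split in two and that y_range appends a SigmoidRange as the last layer:

tst = create_head(5, 10, lin_ftrs=[256], ps=0.1, y_range=(0,1))
assert isinstance(tst[-1], SigmoidRange)
test_eq([l.p for l in tst if isinstance(l, nn.Dropout)], [0.05, 0.1])  # half the single ps value, then the value itself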

default_split[source]

default_split(m)

Default split of a model between body and head

To do transfer learning, you need to pass a splitter to Learner. This should be a function taking the model and returning a collection of parameter groups, e.g. a list of lists of parameters.
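For reference, a custom splitter could look like this minimal sketch (the name my_splitter is just illustrative); it mirrors what default_split does, returning two parameter groups:

def my_splitter(m):
    # Two groups: the body (first child) and everything after it (the head)
    return L(m[0], m[1:]).map(params)

Passing it as Learner(..., splitter=my_splitter) makes freeze, unfreeze and discriminative learning rates operate on those groups.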

create_cnn_model[source]

create_cnn_model(arch, n_out, pretrained=True, cut=None, n_in=3, init=kaiming_normal_, custom_head=None, concat_pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None)

Create custom convnet architecture

The model is cut according to cut and it may be pretrained, in which case the proper set of weights is downloaded and then loaded. init is applied to the head of the model, which is either created by create_head (with lin_ftrs, ps, concat_pool, bn_final, lin_first and y_range) or is custom_head.

tst = create_cnn_model(models.resnet18, 10, True)
tst = create_cnn_model(models.resnet18, 10, True, n_in=1)
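If you need full control over the head, you can pass a custom_head that replaces the one create_head would build. A minimal sketch (the 512 below assumes a resnet18 body, whose last block outputs 512 channels):

head = nn.Sequential(nn.AdaptiveAvgPool2d(1), Flatten(), nn.Linear(512, 10))
tst = create_cnn_model(models.resnet18, 10, True, custom_head=head)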
pets = DataBlock(blocks=(ImageBlock, CategoryBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(),
                 get_y=RegexLabeller(pat = r'/([^/]+)_\d+.jpg$'))
dls = pets.dataloaders(untar_data(URLs.PETS)/"images", item_tfms=RandomResizedCrop(300, min_scale=0.5), bs=64,
                       batch_tfms=[*aug_transforms(size=224)])

Learner convenience functions

cnn_learner[source]

cnn_learner(dls, arch, normalize=True, n_out=None, pretrained=True, config=None, loss_func=None, opt_func=Adam, lr=0.001, splitter=None, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95), cut=None, n_in=3, init=kaiming_normal_, custom_head=None, concat_pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None)

Build a convnet style learner from dls and arch

The model is built from arch using the number of final activations inferred from dls if possible (otherwise pass a value to n_out). It might be pretrained and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a cut or a splitter).

If normalize and pretrained are True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won’t ever forget to normalize your data in transfer learning.

All other arguments are passed to Learner.

path = untar_data(URLs.PETS)
fnames = get_image_files(path/"images")
pat = r'^(.*)_\d+.jpg$'
dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224))

learn = cnn_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(), ps=0.25)
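Since the model is pretrained, the Learner starts frozen: only the head's parameter group is trained at first. A typical next step would be:

learn.fine_tune(1)  # one epoch on the head alone, then unfreeze and train the whole model for one more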

create_unet_model[source]

create_unet_model(arch, n_out, img_size, pretrained=True, cut=None, n_in=3, blur=False, blur_final=True, self_attention=False, y_range=None, last_cross=True, bottle=False, act_cls=ReLU, init=kaiming_normal_, norm_type=None)

Create custom unet architecture

tst = create_unet_model(models.resnet18, 10, (24,24), True, n_in=1)
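As an illustrative sanity check, the resulting model should map an input of img_size back to the same spatial size, with n_out output channels:

x = torch.randn(2, 1, 24, 24)           # batch of two single-channel 24x24 images, matching n_in and img_size
test_eq(tst(x).shape, (2, 10, 24, 24))  # n_out=10 channels at the input resolution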

unet_learner[source]

unet_learner(dls, arch, normalize=True, n_out=None, pretrained=True, config=None, loss_func=None, opt_func=Adam, lr=0.001, splitter=None, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95), cut=None, n_in=3, blur=False, blur_final=True, self_attention=False, y_range=None, last_cross=True, bottle=False, act_cls=ReLU, init=kaiming_normal_, norm_type=None)

Build a unet learner from dls and arch

The model is built from arch using the number of final filters inferred from dls if possible (otherwise pass a value to n_out). It might be pretrained and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a cut or a splitter).

If normalize and pretrained are True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won’t ever forget to normalize your data in transfer learning.

All other arguments are passed to Learner.

path = untar_data(URLs.CAMVID_TINY)
fnames = get_image_files(path/'images')
def label_func(x): return path/'labels'/f'{x.stem}_P{x.suffix}'
codes = np.loadtxt(path/'codes.txt', dtype=str)
dls = SegmentationDataLoaders.from_label_func(path, fnames, label_func, codes=codes)

learn = unet_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(axis=1), y_range=(0,1))
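As with cnn_learner, the returned Learner is frozen and ready for the usual routine, for instance:

learn.fine_tune(1)           # train the head first, then the whole unet
learn.show_results(max_n=4)  # show predicted masks next to targets for a few validation images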
