All the functions necessary to build a `Learner` suitable for transfer learning in computer vision

The most important functions of this module are `vision_learner` and `unet_learner`: they will help you define a `Learner` using a pretrained model. See the vision tutorial for examples of use.

Cut a pretrained model

By default, the fastai library cuts a pretrained model at the pooling layer. This function helps detect where that is.

has_pool_type[source]

has_pool_type(m)

Return True if m is a pooling layer or has one in its children

m = nn.Sequential(nn.AdaptiveAvgPool2d(5), nn.Linear(2,3), nn.Conv2d(2,3,1), nn.MaxPool3d(5))
assert has_pool_type(m)
test_eq([has_pool_type(m_) for m_ in m.children()], [True,False,False,True])

cut_model[source]

cut_model(model, cut)

Cut an instantiated model
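For instance, when `cut` is an integer, only the first `cut` children of the model are kept; a small sketch:

m = nn.Sequential(nn.Conv2d(3,5,3), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten())
test_eq(len(cut_model(m, 2)), 2)  # only the Conv2d and ReLU layers are kept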

create_body[source]

create_body(arch, n_in=3, pretrained=True, cut=None)

Cut off the body of a typically pretrained arch as determined by cut

`cut` can either be an integer, in which case the model is cut at the corresponding layer, or a function, in which case this function returns `cut(model)`. If `cut` is `None`, the model is cut at the first layer that contains some pooling.

tst = lambda pretrained : nn.Sequential(nn.Conv2d(3,5,3), nn.BatchNorm2d(5), nn.AvgPool2d(1), nn.Linear(3,4))
m = create_body(tst)
test_eq(len(m), 2)

m = create_body(tst, cut=3)
test_eq(len(m), 3)

m = create_body(tst, cut=noop)
test_eq(len(m), 4)

for n in range(1,5):    
    m = create_body(tst, n_in=n)
    test_eq(_get_first_layer(m)[0].in_channels, n)

Head and model

create_head[source]

create_head(nf, n_out, lin_ftrs=None, ps=0.5, pool=True, concat_pool=True, first_bn=True, bn_final=False, lin_first=False, y_range=None)

Model head that takes nf features, runs through lin_ftrs, and out n_out classes.

The head begins with fastai's `AdaptiveConcatPool2d` if `concat_pool=True`; otherwise, it uses traditional average pooling. It is followed by a `Flatten` layer and then blocks of `BatchNorm`, `Dropout` and `Linear` layers (if `lin_first=True`, the order is `Linear`, `BatchNorm`, `Dropout`).

The sizes of those blocks go from `nf` through every element of `lin_ftrs` (defaults to `[512]`) and end at `n_out`. `ps` is a list of probabilities used for the dropouts (if you pass a single value, half of it is used for the first dropout and the full value for the remaining ones).

If `first_bn=True`, a `BatchNorm` layer is added just after the pooling operations. If `bn_final=True`, a final `BatchNorm` layer is added. If `y_range` is passed, the function adds a `SigmoidRange` mapping the outputs to that range.

tst = create_head(5, 10)
tst
Sequential(
  (0): AdaptiveConcatPool2d(
    (ap): AdaptiveAvgPool2d(output_size=1)
    (mp): AdaptiveMaxPool2d(output_size=1)
  )
  (1): Flatten(full=False)
  (2): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): Dropout(p=0.25, inplace=False)
  (4): Linear(in_features=10, out_features=512, bias=False)
  (5): ReLU(inplace=True)
  (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (7): Dropout(p=0.5, inplace=False)
  (8): Linear(in_features=512, out_features=10, bias=False)
)
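As a sketch of the other options, the head below uses a single hidden layer of 128 features, a smaller dropout, and a `SigmoidRange` constraining the single output to (0,5) (the exact values are only illustrative):

tst = create_head(5, 1, lin_ftrs=[128], ps=0.1, y_range=(0,5))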

default_split[source]

default_split(m)

Default split of a model between body and head

To do transfer learning, you need to pass a splitter to `Learner`. This should be a function taking the model and returning a collection of parameter groups, e.g. a list of lists of parameters.
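As a minimal sketch, assuming the model is an `nn.Sequential` of body and head (as produced by `create_vision_model`), a splitter with two parameter groups could look like this (`my_splitter` is a hypothetical name, not part of the library):

def my_splitter(model):
    # first group: body parameters, second group: head parameters
    return [list(model[0].parameters()), list(model[1].parameters())]

It would then be passed to `vision_learner` or `Learner` as `splitter=my_splitter`.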

add_head[source]

add_head(body, nf, n_out, init=kaiming_normal_, head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None)

Add a head to a vision body
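A small sketch of the intended flow, using `num_features_model` to count the output features of the body (512 for a resnet18 body):

body = create_body(models.resnet18, pretrained=False)
model = add_head(body, num_features_model(body), n_out=10)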

create_vision_model[source]

create_vision_model(arch, n_out, pretrained=True, cut=None, n_in=3, init=kaiming_normal_, custom_head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None)

Create custom vision architecture

The model is cut according to `cut` and may be pretrained, in which case the proper set of weights is downloaded and then loaded. `init` is applied to the head of the model, which is either created by `create_head` (with `lin_ftrs`, `ps`, `concat_pool`, `bn_final`, `lin_first` and `y_range`) or is `custom_head`.

tst = create_vision_model(models.resnet18, 10, True)
tst = create_vision_model(models.resnet18, 10, True, n_in=1)
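A `custom_head` replaces the default head entirely; a minimal sketch where the 512 input features assume a resnet18 body:

head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 10))
tst = create_vision_model(models.resnet18, 10, pretrained=False, custom_head=head)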

class TimmBody[source]

TimmBody(arch:str, pretrained:bool=True, cut=None, n_in:int=3, **kwargs) :: Module

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call `to()`, etc.

Note: as per the example above, an `__init__()` call to the parent class must be made before assignment on the child.

The `training` attribute (bool) indicates whether this module is in training or evaluation mode.

create_timm_model[source]

create_timm_model(arch:str, n_out, cut=None, pretrained=True, n_in=3, init=kaiming_normal_, custom_head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None, **kwargs)

Create custom architecture using arch, n_in and n_out from the timm library

Learner convenience functions

vision_learner[source]

vision_learner(dls, arch, normalize=True, n_out=None, pretrained=True, loss_func=None, opt_func=Adam, lr=0.001, splitter=None, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95), cut=None, init=kaiming_normal_, custom_head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None, n_in=3)

Build a vision learner from dls and arch

| Argument | Type | Default | Details |
|---|---|---|---|
| dls | | | |
| normalize | bool | True | |
| loss_func | | None | |
| opt_func | function | Adam | |
| lr | float | 0.001 | |
| splitter | | None | |
| cbs | | None | |
| metrics | | None | |
| path | | None | learner args |
| model_dir | str | models | |
| wd | | None | |
| wd_bn_bias | bool | False | |
| train_bn | bool | True | |
| moms | tuple | (0.95, 0.85, 0.95) | |
| pool | bool | True | model & head args |

Valid keyword arguments:

| Argument | Type | Default | Details |
|---|---|---|---|
| arch | | | Argument passed to create_vision_model |
| n_out | | None | Argument passed to create_vision_model |
| pretrained | bool | True | Argument passed to create_vision_model |
| cut | | None | Argument passed to create_vision_model |
| init | function | kaiming_normal_ | Argument passed to create_vision_model |
| custom_head | | None | Argument passed to create_vision_model |
| concat_pool | bool | True | Argument passed to create_vision_model |
| lin_ftrs | | None | Argument passed to create_vision_model |
| ps | float | 0.5 | Argument passed to create_vision_model |
| first_bn | bool | True | Argument passed to create_vision_model |
| bn_final | bool | False | Argument passed to create_vision_model |
| lin_first | bool | False | Argument passed to create_vision_model |
| y_range | | None | Argument passed to create_vision_model |
| n_in | int | 3 | Argument passed to create_vision_model |

The model is built from `arch` using the number of final activations inferred from `dls` if possible (otherwise, pass a value to `n_out`). It may be pretrained, and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a `cut` or a `splitter`).

If normalize and pretrained are True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won't ever forget to normalize your data in transfer learning.

All other arguments are passed to Learner.

path = untar_data(URLs.PETS)
fnames = get_image_files(path/"images")
pat = r'^(.*)_\d+\.jpg$'
dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224))
learn = vision_learner(dls, models.resnet18, loss_func=CrossEntropyLossFlat(), ps=0.25)
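Other `Learner` arguments, such as `metrics`, pass straight through; a small sketch reusing the `dls` above:

learn = vision_learner(dls, models.resnet18, metrics=error_rate)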

If you pass a str to arch, then a TIMM model will be created:

learn = vision_learner(dls, 'vit_tiny_patch16_224', loss_func=CrossEntropyLossFlat(), ps=0.25)

create_unet_model[source]

create_unet_model(arch, n_out, img_size, pretrained=True, cut=None, n_in=3, blur=False, blur_final=True, self_attention=False, y_range=None, last_cross=True, bottle=False, act_cls=ReLU, init=kaiming_normal_, norm_type=None)

Create custom unet architecture

tst = create_unet_model(models.resnet18, 10, (24,24), True, n_in=1)

unet_learner[source]

unet_learner(dls, arch, normalize=True, n_out=None, pretrained=True, config=None, loss_func=None, opt_func=Adam, lr=0.001, splitter=None, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95), cut=None, n_in=3, blur=False, blur_final=True, self_attention=False, y_range=None, last_cross=True, bottle=False, act_cls=ReLU, init=kaiming_normal_, norm_type=None)

Build a unet learner from dls and arch

| Argument | Type | Default | Details |
|---|---|---|---|
| dls | | | |
| normalize | bool | True | |
| config | | None | |
| loss_func | | None | |
| opt_func | function | Adam | |
| lr | float | 0.001 | |
| splitter | | None | |
| cbs | | None | |
| metrics | | None | |
| path | | None | learner args |
| model_dir | str | models | |
| wd | | None | |
| wd_bn_bias | bool | False | |
| train_bn | bool | True | |
| moms | tuple | (0.95, 0.85, 0.95) | |

Valid keyword arguments:

| Argument | Type | Default | Details |
|---|---|---|---|
| arch | | | Argument passed to create_unet_model |
| n_out | | None | Argument passed to create_unet_model |
| pretrained | bool | True | Argument passed to create_unet_model |
| cut | | None | Argument passed to create_unet_model |
| n_in | int | 3 | Argument passed to create_unet_model |
| blur | bool | False | Argument passed to create_unet_model |
| blur_final | bool | True | Argument passed to create_unet_model |
| self_attention | bool | False | Argument passed to create_unet_model |
| y_range | | None | Argument passed to create_unet_model |
| last_cross | bool | True | Argument passed to create_unet_model |
| bottle | bool | False | Argument passed to create_unet_model |
| act_cls | type | ReLU | Argument passed to create_unet_model |
| init | function | kaiming_normal_ | Argument passed to create_unet_model |
| norm_type | | None | Argument passed to create_unet_model |

The model is built from `arch` using the number of final filters inferred from `dls` if possible (otherwise, pass a value to `n_out`). It may be pretrained, and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a `cut` or a `splitter`).

If normalize and pretrained are True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won't ever forget to normalize your data in transfer learning.

All other arguments are passed to Learner.

path = untar_data(URLs.CAMVID_TINY)
fnames = get_image_files(path/'images')
def label_func(x): return path/'labels'/f'{x.stem}_P{x.suffix}'
codes = np.loadtxt(path/'codes.txt', dtype=str)
    
dls = SegmentationDataLoaders.from_label_func(path, fnames, label_func, codes=codes)
learn = unet_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(axis=1), y_range=(0,1))
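The unet-specific options in the signature (for instance `blur`, `self_attention` or `act_cls`) are forwarded to `create_unet_model`; a small sketch reusing the segmentation `dls` above:

learn = unet_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(axis=1), self_attention=True, blur=True)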

create_cnn_model[source]

create_cnn_model(*args, **kwargs)

Deprecated name for create_vision_model -- do not use

cnn_learner[source]

cnn_learner(*args, **kwargs)

Deprecated name for vision_learner -- do not use