Vision learner
Learner suitable for transfer learning in computer vision
The most important functions of this module are vision_learner and unet_learner. They will help you define a Learner using a pretrained model. See the vision tutorial for examples of use.
Cut a pretrained model
By default, the fastai library cuts a pretrained model at the pooling layer. This function helps detect it.
has_pool_type
has_pool_type (m)
Return True if m is a pooling layer or has one in its children

m = nn.Sequential(nn.AdaptiveAvgPool2d(5), nn.Linear(2,3), nn.Conv2d(2,3,1), nn.MaxPool3d(5))
assert has_pool_type(m)
test_eq([has_pool_type(m_) for m_ in m.children()], [True,False,False,True])
cut_model
cut_model (model, cut)
Cut an instantiated model
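For instance, with an integer cut, the layers before that index are kept; a minimal sketch (the toy model below is made up for illustration):

m = nn.Sequential(nn.Conv2d(3,5,3), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(5,2))
test_eq(len(cut_model(m, 3)), 3)  # keeps the Conv2d, ReLU and AdaptiveAvgPool2d layers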
create_body
create_body (model, n_in=3, pretrained=True, cut=None)
Cut off the body of a typically pretrained model, as determined by cut
cut can either be an integer, in which case the model is cut at the corresponding layer, or a function, in which case this function returns cut(model). If cut is None, it defaults to cutting at the first layer that contains some pooling.
def tst(): return nn.Sequential(nn.Conv2d(3,5,3), nn.BatchNorm2d(5), nn.AvgPool2d(1), nn.Linear(3,4))

m = create_body(tst())
test_eq(len(m), 2)

m = create_body(tst(), cut=3)
test_eq(len(m), 3)

m = create_body(tst(), cut=noop)
test_eq(len(m), 4)

for n in range(1,5):
    m = create_body(tst(), n_in=n)
    test_eq(_get_first_layer(m)[0].in_channels, n)
Head and model
create_head
create_head (nf, n_out, lin_ftrs=None, ps=0.5, pool=True, concat_pool=True, first_bn=True, bn_final=False, lin_first=False, y_range=None)
Model head that takes nf features, runs through lin_ftrs, and outputs n_out classes.
The head begins with fastai’s AdaptiveConcatPool2d if concat_pool=True; otherwise, it uses traditional average pooling. Then it uses a Flatten layer before going on to blocks of BatchNorm, Dropout and Linear layers (if lin_first=True, those are Linear, BatchNorm, Dropout).
Those blocks start at nf, then go through every element of lin_ftrs (defaults to [512]), and end at n_out. ps is a list of probabilities used for the dropouts (if you only pass one value, it will use half that value, then that value as many times as necessary).
If first_bn=True, a BatchNorm is added just after the pooling operations. If bn_final=True, a final BatchNorm layer is added. If y_range is passed, the function adds a SigmoidRange to that range.
tst = create_head(5, 10)
tst
Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): fastai.layers.Flatten(full=False)
(2): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25, inplace=False)
(4): Linear(in_features=10, out_features=512, bias=False)
(5): ReLU(inplace=True)
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.5, inplace=False)
(8): Linear(in_features=512, out_features=10, bias=False)
)
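You can also pass your own lin_ftrs and ps; a small sketch of the sizing rule described above (the values are chosen only for illustration):

tst = create_head(5, 10, lin_ftrs=[128, 64], ps=0.5)
# linear sizes: 10 (5 doubled by concat pooling) -> 128 -> 64 -> 10
# dropout probabilities: 0.25, 0.25, 0.5 (half the single ps, repeated, then ps itself)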
#TODO: refactor, i.e. something like this?
# class ModelSplitter():
#     def __init__(self, idx): self.idx = idx
#     def split(self, m): return L(m[:self.idx], m[self.idx:]).map(params)
#     def __call__(self,): return {'cut':self.idx, 'split':self.split}
default_split
default_split (m)
Default split of a model between body and head
To do transfer learning, you need to pass a splitter to Learner. This should be a function taking the model and returning a collection of parameter groups, e.g. a list of lists of parameters.
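As an illustration, a minimal splitter for a two-part (body, head) model could look like the sketch below; this is an assumption-laden example, not necessarily the exact fastai implementation:

def body_head_splitter(m):
    # group 0: body parameters (typically frozen at first), group 1: head parameters
    return L(m[0], m[1]).map(params)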
add_head
add_head (body, nf, n_out, init=<function kaiming_normal_>, head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None)
Add a head to a vision body
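For example, you could assemble a model by hand from a body and a generated head; a hedged sketch assuming a resnet18 body that ends with 512 features:

body = create_body(models.resnet18(), pretrained=False)
model = add_head(body, nf=512, n_out=10)  # create_head doubles nf when concat_pool=True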
create_vision_model
create_vision_model (arch, n_out, pretrained=True, weights=None, cut=None, n_in=3, init=<function kaiming_normal_>, custom_head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None)
Create custom vision architecture
The model is cut according to cut and it may be pretrained, in which case, the proper set of weights is downloaded then loaded. init is applied to the head of the model, which is either created by create_head (with lin_ftrs, ps, concat_pool, bn_final, lin_first and y_range) or is custom_head.
tst = create_vision_model(models.resnet18, 10, True)
tst = create_vision_model(models.resnet18, 10, True, n_in=1)
TimmBody
TimmBody (model, pretrained:bool=True, cut=None, n_in:int=3)
*Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and their parameters will be converted too when you call to(), etc.

Note: as per the example above, an __init__() call to the parent class must be made before assignment on the child.

training (bool): whether this module is in training or evaluation mode.*
create_timm_model
create_timm_model (arch, n_out, cut=None, pretrained=True, n_in=3, init=<function kaiming_normal_>, custom_head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None, **kwargs)
Create custom architecture using arch, n_in and n_out from the timm library
# make sure that timm models can be scripted:
tst, _ = create_timm_model('resnet34', 1)
scripted = torch.jit.script(tst)
assert scripted, "model could not be converted to TorchScript"
Learner convenience functions
vision_learner
vision_learner (dls, arch, normalize=True, n_out=None, pretrained=True, weights=None, loss_func=None, opt_func=<function Adam>, lr=0.001, splitter=None, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95), cut=None, init=<function kaiming_normal_>, custom_head=None, concat_pool=True, pool=True, lin_ftrs=None, ps=0.5, first_bn=True, bn_final=False, lin_first=False, y_range=None, n_in=3)
Build a vision learner from dls and arch
| | Type | Default | Details |
|---|---|---|---|
| dls | | | |
| arch | | | |
| normalize | bool | True | |
| n_out | NoneType | None | |
| pretrained | bool | True | |
| weights | NoneType | None | |
| loss_func | NoneType | None | |
| opt_func | function | Adam | |
| lr | float | 0.001 | |
| splitter | NoneType | None | |
| cbs | NoneType | None | |
| metrics | NoneType | None | |
| path | NoneType | None | learner args |
| model_dir | str | models | |
| wd | NoneType | None | |
| wd_bn_bias | bool | False | |
| train_bn | bool | True | |
| moms | tuple | (0.95, 0.85, 0.95) | |
| cut | NoneType | None | |
| init | function | kaiming_normal_ | |
| custom_head | NoneType | None | |
| concat_pool | bool | True | |
| pool | bool | True | model & head args |
| lin_ftrs | NoneType | None | |
| ps | float | 0.5 | |
| first_bn | bool | True | |
| bn_final | bool | False | |
| lin_first | bool | False | |
| y_range | NoneType | None | |
| n_in | int | 3 | |
The model is built from arch using the number of final activations inferred from dls if possible (otherwise pass a value to n_out). It might be pretrained and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a cut or a splitter).
If normalize and pretrained are True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won’t ever forget to normalize your data in transfer learning.
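If you’d rather handle normalization yourself, you can opt out and add the transform manually; a minimal sketch, reusing the path, fnames and pat from the Pets example below and assuming ImageNet statistics suit your data:

dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224),
                                    batch_tfms=Normalize.from_stats(*imagenet_stats))
learn = vision_learner(dls, models.resnet18, normalize=False)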
All other arguments are passed to Learner.
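For instance, metrics, wd or cbs go straight through to Learner; a short sketch (values for illustration only, with dls as built below):

learn = vision_learner(dls, models.resnet18, metrics=accuracy, wd=1e-2)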
Starting with version 0.13, TorchVision supports multiple pretrained weights for the same model architecture. The vision_learner default of pretrained=True, weights=None will use the architecture’s default weights, which are currently IMAGENET1K_V2. If you are using an older version of TorchVision or creating a timm model, setting weights will have no effect.
from torchvision.models import ResNet50_Weights

# Legacy weights with accuracy 76.130%
vision_learner(models.resnet50, pretrained=True, weights=ResNet50_Weights.IMAGENET1K_V1, ...)

# New weights with accuracy 80.858%. Strings are also supported.
vision_learner(models.resnet50, pretrained=True, weights='IMAGENET1K_V2', ...)

# Best available weights (currently an alias for IMAGENET1K_V2).
# Default weights if vision_learner weights isn't set.
vision_learner(models.resnet50, pretrained=True, weights=ResNet50_Weights.DEFAULT, ...)

# No weights - random initialization
vision_learner(models.resnet50, pretrained=False, weights=None, ...)
The example above shows how to use the new TorchVision 0.13 multi-weight API with vision_learner.
path = untar_data(URLs.PETS)
fnames = get_image_files(path/"images")
pat = r'^(.*)_\d+.jpg$'
dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224))

learn = vision_learner(dls, models.resnet18, loss_func=CrossEntropyLossFlat(), ps=0.25)
If you pass a str to arch, then a timm model will be created:

dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224))
learn = vision_learner(dls, 'convnext_tiny', loss_func=CrossEntropyLossFlat(), ps=0.25)
create_unet_model
create_unet_model (arch, n_out, img_size, pretrained=True, weights=None, cut=None, n_in=3, blur=False, blur_final=True, self_attention=False, y_range=None, last_cross=True, bottle=False, act_cls=<class 'torch.nn.modules.activation.ReLU'>, init=<function kaiming_normal_>, norm_type=None)
Create custom unet architecture
tst = create_unet_model(models.resnet18, 10, (24,24), True, n_in=1)
unet_learner
unet_learner (dls, arch, normalize=True, n_out=None, pretrained=True, weights=None, config=None, loss_func=None, opt_func=<function Adam>, lr=0.001, splitter=None, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95), cut=None, n_in=3, blur=False, blur_final=True, self_attention=False, y_range=None, last_cross=True, bottle=False, act_cls=<class 'torch.nn.modules.activation.ReLU'>, init=<function kaiming_normal_>, norm_type=None)
Build a unet learner from dls and arch
| | Type | Default | Details |
|---|---|---|---|
| dls | | | |
| arch | | | |
| normalize | bool | True | |
| n_out | NoneType | None | |
| pretrained | bool | True | |
| weights | NoneType | None | |
| config | NoneType | None | |
| loss_func | NoneType | None | |
| opt_func | function | Adam | |
| lr | float | 0.001 | |
| splitter | NoneType | None | |
| cbs | NoneType | None | |
| metrics | NoneType | None | |
| path | NoneType | None | learner args |
| model_dir | str | models | |
| wd | NoneType | None | |
| wd_bn_bias | bool | False | |
| train_bn | bool | True | |
| moms | tuple | (0.95, 0.85, 0.95) | |
| cut | NoneType | None | |
| n_in | int | 3 | |
| blur | bool | False | |
| blur_final | bool | True | |
| self_attention | bool | False | |
| y_range | NoneType | None | |
| last_cross | bool | True | |
| bottle | bool | False | |
| act_cls | type | ReLU | |
| init | function | kaiming_normal_ | |
| norm_type | NoneType | None | |
The model is built from arch using the number of final filters inferred from dls if possible (otherwise pass a value to n_out). It might be pretrained and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a cut or a splitter).
If normalize and pretrained are True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won’t ever forget to normalize your data in transfer learning.
All other arguments are passed to Learner.
unet_learner also supports TorchVision’s new multi-weight API via weights. See vision_learner for more details.
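For instance, a hedged sketch mirroring the vision_learner weights examples above (using the segmentation dls built below):

learn = unet_learner(dls, models.resnet50, weights=ResNet50_Weights.IMAGENET1K_V1)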
path = untar_data(URLs.CAMVID_TINY)
fnames = get_image_files(path/'images')
def label_func(x): return path/'labels'/f'{x.stem}_P{x.suffix}'
codes = np.loadtxt(path/'codes.txt', dtype=str)
dls = SegmentationDataLoaders.from_label_func(path, fnames, label_func, codes=codes)

learn = unet_learner(dls, models.resnet34, loss_func=CrossEntropyLossFlat(axis=1), y_range=(0,1))
create_cnn_model
create_cnn_model (*args, **kwargs)
Deprecated name for create_vision_model – do not use
cnn_learner
cnn_learner (*args, **kwargs)
Deprecated name for vision_learner – do not use