Provides essential functions for building and modifying `Model` architectures.

Model Layers

This module contains many layer classes that we might be interested in using in our models. These layers complement the default PyTorch layers, which we can of course also use directly.

Custom fastai modules

class AdaptiveConcatPool2d[source]

AdaptiveConcatPool2d(`sz`:Optional[int]=`None`) :: Module

Layer that concats AdaptiveAvgPool2d and AdaptiveMaxPool2d.

The output will be 2*sz, or just 2 if sz is None.

The AdaptiveConcatPool2d module applies adaptive average pooling and adaptive max pooling and concatenates the two results. We use it because it gives the model the information from both pooling methods, which tends to improve performance. The technique is called adaptive because it lets us choose the output dimensions we want, instead of having to pick input dimensions that produce a desired output size.
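
As a quick check of the shapes involved, here is a minimal sketch (assuming fastai v1 is installed; the batch size and channel counts are just illustrative) that passes a random tensor through AdaptiveConcatPool2d(1). The channel dimension doubles because the average-pooled and max-pooled features are concatenated:

import torch
from fastai.layers import AdaptiveConcatPool2d

pool = AdaptiveConcatPool2d(1)       # pool to a 1x1 spatial output
x = torch.randn(64, 16, 8, 8)        # dummy batch: 64 images, 16 channels, 8x8
print(pool(x).shape)                 # torch.Size([64, 32, 1, 1]): twice the input channels (max + avg)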

Let's try training with Adaptive Max Pooling first, then with Adaptive Average Pooling, and finally with the concatenation of them both, to see how they compare in performance.

We will first define a simple_cnn using Adaptive Max Pooling by changing the source code a bit.

path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)
def simple_cnn_max(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv_layer(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(nn.AdaptiveMaxPool2d(1), Flatten()))
    return nn.Sequential(*layers)
model = simple_cnn_max((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(1)
Total time: 00:02

epoch train_loss valid_loss accuracy
1 0.104029 0.073855 0.982336

Now let's try with Adaptive Average Pooling.

def simple_cnn_avg(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv_layer(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(nn.AdaptiveAvgPool2d(1), Flatten()))
    return nn.Sequential(*layers)
model = simple_cnn_avg((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(1)
Total time: 00:02

epoch train_loss valid_loss accuracy
1 0.326732 0.269070 0.967125

Finally we will try with the concatenation of them both, AdaptiveConcatPool2d. We will see that, compared to Adaptive Average Pooling alone, it increases our accuracy and decreases our loss considerably!

def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
               strides:Collection[int]=None) -> nn.Sequential:
    "CNN with `conv2d_relu` layers defined by `actns`, `kernel_szs` and `strides`"
    nl = len(actns)-1
    kernel_szs = ifnone(kernel_szs, [3]*nl)
    strides    = ifnone(strides   , [2]*nl)
    layers = [conv_layer(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
        for i in range(len(strides))]
    layers.append(nn.Sequential(AdaptiveConcatPool2d(1), Flatten()))
    return nn.Sequential(*layers)
model = simple_cnn((3,16,16,2))
learner = Learner(data, model, metrics=[accuracy])
learner.fit(1)
Total time: 00:02

epoch train_loss valid_loss accuracy
1 0.172055 0.113439 0.975957

class Lambda[source]

Lambda(`func`:LambdaFunc) :: Module

An easy way to create a pytorch layer for a simple func.

This is very useful when we want to use functions as layers inside a Sequential object. Say, for example, we want to apply a log_softmax loss and we need to change the shape of our output batches to be able to use this loss; we can add a layer that applies the necessary change in shape by calling:

Lambda(lambda x: x.view(x.size(0),-1))

Let's see an example of how the shape of our output can change when we add this layer.

model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break
torch.Size([64, 10, 1, 1])
model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Lambda(lambda x: x.view(x.size(0),-1))
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break
torch.Size([64, 10])

class Flatten[source]

Flatten(`full`:bool=`False`) :: Module

Flatten x to a single dimension, often used at the end of a model. Pass full=True to flatten to a rank-1 tensor.

The Lambda layer we built above is actually implemented in our library as Flatten. We can see that it returns the same size when we run it.

model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Flatten(),
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break
torch.Size([64, 10])

PoolFlatten[source]

PoolFlatten() → Sequential

Apply nn.AdaptiveAvgPool2d to x and then flatten the result.

We can combine these two final layers (AdaptiveAvgPool2d and Flatten) by using PoolFlatten.

model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    PoolFlatten()
)

model.cuda()

for xb, yb in data.train_dl:
    out = model(xb)
    print(out.size())
    break
torch.Size([64, 10])

Another use of the Lambda layer is to resize batches with ResizeBatch, when we have a layer that expects an input shape different from what comes out of the previous one.

ResizeBatch[source]

ResizeBatch(`*size`:int) → Tensor

Layer that resizes x to size, good for connecting mismatched layers.

a = torch.tensor([[1., -1.], [1., -1.]])
print(a)
tensor([[ 1., -1.],
        [ 1., -1.]])
out = ResizeBatch(4)
print(out(a))
tensor([[ 1., -1.,  1., -1.]])

class Debugger[source]

Debugger() :: Module

A module to debug inside a model.

The debugger module allows us to peek inside a network while it's training and see in detail what is going on. We can see inputs, outputs and sizes at any point in the network.

For instance, if you run the following:

model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    Debugger(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

model.cuda()

learner = Learner(data, model, metrics=[accuracy])
learner.fit(5)

... you'll see something like this:

/home/ubuntu/fastai/fastai/layers.py(74)forward()
     72     def forward(self,x:Tensor) -> Tensor:
     73         set_trace()
---> 74         return x
     75 
     76 class StdUpsample(nn.Module):

ipdb>

class PixelShuffle_ICNR[source]

PixelShuffle_ICNR(`ni`:int, `nf`:int=`None`, `scale`:int=`2`, `blur`:bool=`False`, `norm_type`=``, `leaky`:float=`None`) :: Module

Upsample by scale from ni filters to nf (default ni), using nn.PixelShuffle, icnr init, and weight_norm.
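
As an illustration, the following minimal sketch (the shapes and filter counts are chosen arbitrarily) upsamples a feature map by scale=2 while going from ni=32 to nf=16 filters:

import torch
from fastai.layers import PixelShuffle_ICNR

upsample = PixelShuffle_ICNR(32, 16, scale=2)   # 32 -> 16 filters, spatial size doubled
x = torch.randn(2, 32, 8, 8)
print(upsample(x).shape)                        # torch.Size([2, 16, 16, 16])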

class MergeLayer[source]

MergeLayer(`dense`:bool=`False`) :: Module

Merge a shortcut with the result of the module by adding them, or concatenating them if dense=True.

class PartialLayer[source]

PartialLayer(`func`, `**kwargs`) :: Module

Layer that applies partial(func, **kwargs).

class SigmoidRange[source]

SigmoidRange(`low`, `high`) :: Module

Sigmoid module with range (low, high)
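
This is handy when a regression target lives in a known interval, for instance ratings between 0 and 5. A minimal sketch (the interval is just an example):

import torch
from fastai.layers import SigmoidRange

squash = SigmoidRange(0, 5)            # outputs constrained to the interval (0, 5)
x = torch.tensor([-10., 0., 10.])
print(squash(x))                       # approximately tensor([0.0000, 2.5000, 5.0000])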

class SequentialEx[source]

SequentialEx(`layers`) :: Module

Like nn.Sequential, but with ModuleList semantics, and can access module input
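
Because SequentialEx keeps a reference to the original input of the block, it can be combined with MergeLayer to build a residual block without writing a new module (this is essentially what res_block below does). A minimal sketch, with arbitrary channel counts:

import torch
from fastai.layers import SequentialEx, MergeLayer, conv_layer

# Two convolutions plus a shortcut: MergeLayer adds the block's input back to its output.
res_blk = SequentialEx(conv_layer(16, 16), conv_layer(16, 16), MergeLayer())
x = torch.randn(2, 16, 8, 8)
print(res_blk(x).shape)    # torch.Size([2, 16, 8, 8]); MergeLayer(dense=True) would concatenate to 32 channels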

class SelfAttention[source]

SelfAttention(`n_channels`:int) :: Module

Self attention layer for 2d.
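
The layer is shape-preserving, so it can be inserted between convolutional layers (conv_layer can also add one for you with self_attention=True). A minimal sketch with arbitrary sizes:

import torch
from fastai.layers import SelfAttention

attn = SelfAttention(16)           # n_channels of the incoming feature map
x = torch.randn(2, 16, 8, 8)
print(attn(x).shape)               # torch.Size([2, 16, 8, 8]): same shape as the input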

Loss functions

class FlattenedLoss[source]

FlattenedLoss(`func`, `*args`, `axis`:int=`-1`, `floatify`:bool=`False`, `is_2d`:bool=`True`, `**kwargs`)

Same as func, but flattens input and target.

Create an instance of func with args and kwargs. When passing an output and target, it

  • puts axis first in the output and target with a transpose
  • casts the target to float if floatify=True
  • flattens the output to two dimensions if is_2d, otherwise to one dimension, and flattens the target to one dimension
  • applies the instance of func.
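
For example, CrossEntropyFlat(axis=1) can score a segmentation-style output of shape (batch, classes, H, W) against an integer mask of shape (batch, H, W). A minimal sketch, with illustrative shapes:

import torch
from fastai.layers import CrossEntropyFlat

loss_func = CrossEntropyFlat(axis=1)            # move the class axis last, then flatten
output = torch.randn(4, 10, 8, 8)               # (batch, n_classes, H, W) raw scores
target = torch.randint(0, 10, (4, 8, 8))        # (batch, H, W) integer class mask
print(loss_func(output, target))                # a scalar loss tensor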

BCEFlat[source]

BCEFlat(`*args`, `axis`:int=`-1`, `floatify`:bool=`True`, `**kwargs`)

Same as nn.BCELoss, but flattens input and target.

BCEWithLogitsFlat[source]

BCEWithLogitsFlat(`*args`, `axis`:int=`-1`, `floatify`:bool=`True`, `**kwargs`)

Same as nn.BCEWithLogitsLoss, but flattens input and target.
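
This is the usual choice for multi-label problems, where the output and the 0/1 target have the same shape and the target needs to be cast to float (which floatify=True does for us). A minimal sketch with illustrative shapes:

import torch
from fastai.layers import BCEWithLogitsFlat

loss_func = BCEWithLogitsFlat()
output = torch.randn(4, 10)                 # raw scores for 10 independent labels
target = torch.randint(0, 2, (4, 10))       # integer 0/1 targets; floatify casts them to float
print(loss_func(output, target))            # a scalar loss tensor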

CrossEntropyFlat[source]

CrossEntropyFlat(`*args`, `axis`:int=`-1`, `**kwargs`)

Same as nn.CrossEntropyLoss, but flattens input and target.

MSELossFlat[source]

MSELossFlat(`*args`, `axis`:int=`-1`, `floatify`:bool=`True`, `**kwargs`)

Same as nn.MSELoss, but flattens input and target.

class NoopLoss[source]

NoopLoss() :: Module

Just returns the mean of the output.

class WassersteinLoss[source]

WassersteinLoss() :: Module

For WGAN.

Helper functions to create modules

bn_drop_lin[source]

bn_drop_lin(`n_in`:int, `n_out`:int, `bn`:bool=`True`, `p`:float=`0.0`, `actn`:Optional[Module]=`None`)

The bn_drop_lin function returns a sequence of batch normalization, dropout and a linear layer. This custom layer is usually used at the end of a model.

n_in represents the size of the input, n_out the size of the output, bn whether we want batch norm or not, p how much dropout, and actn is an optional parameter to add an activation function at the end.
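
For instance, a small classification head could be assembled like this (a minimal sketch; bn_drop_lin returns a plain list of modules here, so we unpack it into nn.Sequential):

import torch
from torch import nn
from fastai.layers import bn_drop_lin

# BatchNorm1d(512) -> Dropout(p=0.5) -> Linear(512, 10), with no final activation
head = nn.Sequential(*bn_drop_lin(512, 10, bn=True, p=0.5, actn=None))
x = torch.randn(4, 512)
print(head(x).shape)        # torch.Size([4, 10])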

conv2d[source]

conv2d(`ni`:int, `nf`:int, `ks`:int=`3`, `stride`:int=`1`, `padding`:int=`None`, `bias`=`False`, `init`:LayerFunc=`'kaiming_normal_'`) → Conv2d

Create and initialize nn.Conv2d layer. padding defaults to ks//2.

conv2d_trans[source]

conv2d_trans(`ni`:int, `nf`:int, `ks`:int=`2`, `stride`:int=`2`, `padding`:int=`0`, `bias`=`False`) → ConvTranspose2d

Create nn.ConvTranspose2d layer.

conv_layer[source]

conv_layer(`ni`:int, `nf`:int, `ks`:int=`3`, `stride`:int=`1`, `padding`:int=`None`, `bias`:bool=`None`, `is_1d`:bool=`False`, `norm_type`:Optional[NormType]=``, `use_activ`:bool=`True`, `leaky`:float=`None`, `transpose`:bool=`False`, `init`:Callable=`'kaiming_normal_'`, `self_attention`:bool=`False`)

The conv_layer function returns a sequence of nn.Conv2d, BatchNorm and a ReLU or leaky ReLU activation function.

ni represents the size of the input, nf the size of the output, ks the kernel size, and stride the stride with which we want to apply the convolutions. bias decides if the convolution has a bias or not (if None, it defaults to True unless we are using batch norm). norm_type selects the type of normalization (or None). If leaky is None, the activation is a standard ReLU, otherwise it's a LeakyReLU of slope leaky. Finally, if transpose=True, the convolution is replaced by a ConvTranspose2d.
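
As a quick sketch of what this builds (channel counts and input size are arbitrary), a stride-2 conv_layer halves the spatial resolution while changing the number of channels:

import torch
from fastai.layers import conv_layer

layer = conv_layer(3, 16, ks=3, stride=2)   # Conv2d(3, 16, 3, stride=2, padding=1) + ReLU + BatchNorm2d
x = torch.randn(2, 3, 32, 32)
print(layer(x).shape)                       # torch.Size([2, 16, 16, 16])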

embedding[source]

embedding(`ni`:int, `nf`:int) → Module

Create an embedding layer with input size ni and output size nf.

relu[source]

relu(`inplace`:bool=`False`, `leaky`:float=`None`)

Return a relu activation, maybe leaky and inplace.

res_block[source]

res_block(`nf`, `dense`:bool=`False`, `norm_type`:Optional[NormType]=``, `bottle`:bool=`False`, `**kwargs`)

Resnet block of nf features.
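
A res_block preserves the spatial size, and with the default additive shortcut it also keeps the number of channels; with dense=True the shortcut is concatenated instead, doubling them. A minimal sketch with arbitrary sizes:

import torch
from fastai.layers import res_block

block = res_block(16)                    # additive shortcut: output keeps 16 channels
dense = res_block(16, dense=True)        # dense shortcut: the input is concatenated instead
x = torch.randn(2, 16, 8, 8)
print(block(x).shape, dense(x).shape)    # torch.Size([2, 16, 8, 8]) torch.Size([2, 32, 8, 8])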

sigmoid_range[source]

sigmoid_range(`x`, `low`, `high`)

Sigmoid function with range (low, high)

simple_cnn[source]

simple_cnn(`actns`:Collection[int], `kernel_szs`:Collection[int]=`None`, `strides`:Collection[int]=`None`, `bn`=`False`) → Sequential

CNN with conv_layer defined by actns, kernel_szs and strides, plus batchnorm if bn.

Initialization of modules

batchnorm_2d[source]

batchnorm_2d(`nf`:int, `norm_type`:NormType=``)

A batchnorm2d layer with nf features initialized depending on norm_type.

icnr[source]

icnr(`x`, `scale`=`2`, `init`=`'kaiming_normal_'`)

ICNR init of x, with scale and init function.
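
icnr is typically applied to the weight of the 1×1 convolution that feeds an nn.PixelShuffle, as PixelShuffle_ICNR does above; it fills the weight tensor in place so that the upsampled output starts out free of checkerboard artifacts. A minimal sketch (the channel counts are illustrative):

import torch
from torch import nn
from fastai.layers import icnr

scale = 2
conv = nn.Conv2d(16, 16 * scale**2, kernel_size=1)   # the conv feeding nn.PixelShuffle(scale)
icnr(conv.weight, scale=scale)                       # ICNR init, done in place on the weight
print(conv.weight.shape)                             # torch.Size([64, 16, 1, 1]): shape unchanged, values re-initialized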

trunc_normal_[source]

trunc_normal_(`x`:Tensor, `mean`:float=`0.0`, `std`:float=`1.0`) → Tensor

Truncated normal initialization.

`NormType`

Enum = [Batch, BatchZero, Weight, Spectral]

An enumeration.