# Layers


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Basic manipulations and resize

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L26"
target="_blank" style="float:right; font-size:smaller">source</a>

### module

``` python

def module(
    flds:VAR_POSITIONAL, defaults:VAR_KEYWORD
):

```

*Decorator to create an `nn.Module` using `f` as `forward` method*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L51"
target="_blank" style="float:right; font-size:smaller">source</a>

### Identity

``` python

def Identity(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Do nothing at all*

``` python
test_eq(Identity()(1), 1)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L57"
target="_blank" style="float:right; font-size:smaller">source</a>

### Lambda

``` python

def Lambda(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*An easy way to create a pytorch layer for a simple `func`*

``` python
def _add2(x): return x+2
tst = Lambda(_add2)
x = torch.randn(10,20)
test_eq(tst(x), x+2)
tst2 = pickle.loads(pickle.dumps(tst))
test_eq(tst2(x), x+2)
tst
```

    Lambda(func=<function _add2>)

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L62"
target="_blank" style="float:right; font-size:smaller">source</a>

### PartialLambda

``` python

def PartialLambda(
    func, kwargs:VAR_KEYWORD
):

```

*Layer that applies `partial(func, **kwargs)`*

``` python
def test_func(a,b=2): return a+b
tst = PartialLambda(test_func, b=5)
test_eq(tst(x), x+5)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L73"
target="_blank" style="float:right; font-size:smaller">source</a>

### Flatten

``` python

def Flatten(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Flatten `x` to a single dimension, e.g. at end of a model. `full` for
rank-1 tensor*

``` python
tst = Flatten()
x = torch.randn(10,5,4)
test_eq(tst(x).shape, [10,20])
tst = Flatten(full=True)
test_eq(tst(x).shape, [200])
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L79"
target="_blank" style="float:right; font-size:smaller">source</a>

### ToTensorBase

``` python

def ToTensorBase(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Convert x to TensorBase class*

``` python
ttb = ToTensorBase()
timg = TensorImage(torch.rand(1,3,32,32))
test_eq(type(ttb(timg)), TensorBase)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L84"
target="_blank" style="float:right; font-size:smaller">source</a>

### View

``` python

def View(
    size:VAR_POSITIONAL
):

```

*Reshape `x` to `size`*

``` python
tst = View(10,5,4)
test_eq(tst(x).shape, [10,5,4])
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L90"
target="_blank" style="float:right; font-size:smaller">source</a>

### ResizeBatch

``` python

def ResizeBatch(
    size:VAR_POSITIONAL
):

```

*Reshape `x` to `size`, keeping batch dim the same size*

``` python
tst = ResizeBatch(5,4)
test_eq(tst(x).shape, [10,5,4])
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L97"
target="_blank" style="float:right; font-size:smaller">source</a>

### Debugger

``` python

def Debugger(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*A module to debug inside a model.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L103"
target="_blank" style="float:right; font-size:smaller">source</a>

### sigmoid_range

``` python

def sigmoid_range(
    x, low, high
):

```

*Sigmoid function with range `(low, high)`*

``` python
test = tensor([-10.,0.,10.])
assert torch.allclose(sigmoid_range(test, -1,  2), tensor([-1.,0.5, 2.]), atol=1e-4, rtol=1e-4)
assert torch.allclose(sigmoid_range(test, -5, -1), tensor([-5.,-3.,-1.]), atol=1e-4, rtol=1e-4)
assert torch.allclose(sigmoid_range(test,  2,  4), tensor([2.,  3., 4.]), atol=1e-4, rtol=1e-4)
```
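
The values checked above are consistent with a standard sigmoid rescaled from `(0, 1)` to `(low, high)`. Here is a minimal pure-Python sketch of that mapping, assuming the formula `sigmoid(x) * (high - low) + low`; the helper name `scaled_sigmoid` is hypothetical and not part of fastai.

```python
import math

def scaled_sigmoid(x: float, low: float, high: float) -> float:
    """Hypothetical helper: squash x into the open interval (low, high)."""
    s = 1.0 / (1.0 + math.exp(-x))   # standard sigmoid, in (0, 1)
    return s * (high - low) + low    # stretch and shift to (low, high)

# The midpoint of the range is reached at x = 0; large |x| approaches the bounds.
mid = scaled_sigmoid(0.0, -1.0, 2.0)
near_low = scaled_sigmoid(-10.0, -1.0, 2.0)
near_high = scaled_sigmoid(10.0, -1.0, 2.0)
```

The bounds themselves are never reached, which is why the tests above compare with a tolerance.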

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L109"
target="_blank" style="float:right; font-size:smaller">source</a>

### SigmoidRange

``` python

def SigmoidRange(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Sigmoid module with range `(low, high)`*

``` python
tst = SigmoidRange(-1, 2)
assert torch.allclose(tst(test), tensor([-1.,0.5, 2.]), atol=1e-4, rtol=1e-4)
```

## Pooling layers

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L114"
target="_blank" style="float:right; font-size:smaller">source</a>

### AdaptiveConcatPool1d

``` python

def AdaptiveConcatPool1d(
    size:NoneType=None
):

```

*Layer that concats `AdaptiveAvgPool1d` and `AdaptiveMaxPool1d`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L123"
target="_blank" style="float:right; font-size:smaller">source</a>

### AdaptiveConcatPool2d

``` python

def AdaptiveConcatPool2d(
    size:NoneType=None
):

```

*Layer that concats `AdaptiveAvgPool2d` and `AdaptiveMaxPool2d`*

If the input is `bs x nf x h x h`, the output will be
`bs x 2*nf x 1 x 1` if no size is passed, or `bs x 2*nf x size x size`
otherwise.

``` python
tst = AdaptiveConcatPool2d()
x = torch.randn(10,5,4,4)
test_eq(tst(x).shape, [10,10,1,1])
max1 = torch.max(x,    dim=2, keepdim=True)[0]
maxp = torch.max(max1, dim=3, keepdim=True)[0]
test_eq(tst(x)[:,:5], maxp)
test_eq(tst(x)[:,5:], x.mean(dim=[2,3], keepdim=True))
tst = AdaptiveConcatPool2d(2)
test_eq(tst(x).shape, [10,10,2,2])
```
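
The behaviour tested above can be reproduced in plain PyTorch. This is a sketch under assumptions rather than fastai's implementation: the helper name `concat_pool2d` is hypothetical, and the channel order (max first, then average) is taken from the checks above.

```python
import torch
import torch.nn.functional as F

def concat_pool2d(x: torch.Tensor, size: int = 1) -> torch.Tensor:
    """Hypothetical helper: adaptive max-pool and avg-pool, concatenated on channels."""
    mx = F.adaptive_max_pool2d(x, size)   # (bs, nf, size, size)
    av = F.adaptive_avg_pool2d(x, size)   # (bs, nf, size, size)
    return torch.cat([mx, av], dim=1)     # (bs, 2*nf, size, size)

x = torch.randn(10, 5, 4, 4)
out = concat_pool2d(x)
out2 = concat_pool2d(x, 2)
```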

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L132"
target="_blank" style="float:right; font-size:smaller">source</a>

### PoolType

``` python

def PoolType(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Initialize self. See help(type(self)) for accurate signature.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L135"
target="_blank" style="float:right; font-size:smaller">source</a>

### adaptive_pool

``` python

def adaptive_pool(
    pool_type
):

```

*Call self as a function.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L139"
target="_blank" style="float:right; font-size:smaller">source</a>

### PoolFlatten

``` python

def PoolFlatten(
    pool_type:str='Avg'
):

```

*Combine `nn.AdaptiveAvgPool2d` and
[`Flatten`](https://docs.fast.ai/layers.html#flatten).*

``` python
tst = PoolFlatten()
test_eq(tst(x).shape, [10,5])
test_eq(tst(x), x.mean(dim=[2,3]))
```

## BatchNorm layers

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L158"
target="_blank" style="float:right; font-size:smaller">source</a>

### BatchNorm

``` python

def BatchNorm(
    nf, ndim:int=2, norm_type:NormType=<NormType.Batch: 1>, eps:float=1e-05, momentum:float | None=0.1,
    affine:bool=True, track_running_stats:bool=True, device:NoneType=None, dtype:NoneType=None, bias:bool=True
):

```

*BatchNorm layer with `nf` features and `ndim` initialized depending on
`norm_type`.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L164"
target="_blank" style="float:right; font-size:smaller">source</a>

### InstanceNorm

``` python

def InstanceNorm(
    nf, ndim:int=2, norm_type:NormType=<NormType.Instance: 5>, affine:bool=True, eps:float=1e-05, momentum:float=0.1,
    track_running_stats:bool=False, device:NoneType=None, dtype:NoneType=None,
    bias:bool=True, # for backward compatibility
):

```

*InstanceNorm layer with `nf` features and `ndim` initialized depending
on `norm_type`.*

`kwargs` are passed to `nn.BatchNorm` and can be `eps`, `momentum`,
`affine` and `track_running_stats`.

``` python
tst = BatchNorm(15)
assert isinstance(tst, nn.BatchNorm2d)
test_eq(tst.weight, torch.ones(15))
tst = BatchNorm(15, norm_type=NormType.BatchZero)
test_eq(tst.weight, torch.zeros(15))
tst = BatchNorm(15, ndim=1)
assert isinstance(tst, nn.BatchNorm1d)
tst = BatchNorm(15, ndim=3)
assert isinstance(tst, nn.BatchNorm3d)
```

``` python
tst = InstanceNorm(15)
assert isinstance(tst, nn.InstanceNorm2d)
test_eq(tst.weight, torch.ones(15))
tst = InstanceNorm(15, norm_type=NormType.InstanceZero)
test_eq(tst.weight, torch.zeros(15))
tst = InstanceNorm(15, ndim=1)
assert isinstance(tst, nn.InstanceNorm1d)
tst = InstanceNorm(15, ndim=3)
assert isinstance(tst, nn.InstanceNorm3d)
```

If `affine` is `False`, the weight should be `None`.

``` python
test_eq(BatchNorm(15, affine=False).weight, None)
test_eq(InstanceNorm(15, affine=False).weight, None)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L169"
target="_blank" style="float:right; font-size:smaller">source</a>

### BatchNorm1dFlat

``` python

def BatchNorm1dFlat(
    num_features:int, eps:float=1e-05, momentum:float | None=0.1, affine:bool=True, track_running_stats:bool=True,
    device:NoneType=None, dtype:NoneType=None, bias:bool=True
)->None:

```

*`nn.BatchNorm1d`, but first flattens leading dimensions*

``` python
tst = BatchNorm1dFlat(15)
x = torch.randn(32, 64, 15)
y = tst(x)
mean = x.mean(dim=[0,1])
test_close(tst.running_mean, 0*0.9 + mean*0.1)
var = (x-mean).pow(2).mean(dim=[0,1])
test_close(tst.running_var, 1*0.9 + var*0.1, eps=1e-4)
test_close(y, (x-mean)/torch.sqrt(var+1e-5) * tst.weight + tst.bias, eps=1e-4)
```
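
One way to picture "flattens leading dimensions" is to collapse every dimension except the last into a single batch dimension, normalize, and restore the shape. The class below is a hypothetical sketch in plain PyTorch, not fastai's code.

```python
import torch
import torch.nn as nn

class FlatBatchNorm1d(nn.BatchNorm1d):
    """Hypothetical sketch: collapse leading dims, normalize, restore the shape."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.dim() == 2:
            return super().forward(x)
        *lead, n_feat = x.shape
        flat = x.contiguous().view(-1, n_feat)        # (prod(lead), n_feat)
        return super().forward(flat).view(*lead, n_feat)

bn = FlatBatchNorm1d(15)
x = torch.randn(32, 64, 15)
y = bn(x)   # training mode: statistics are taken over the 32*64 rows
```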

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L178"
target="_blank" style="float:right; font-size:smaller">source</a>

### LinBnDrop

``` python

def LinBnDrop(
    n_in, n_out, bn:bool=True, p:float=0.0, act:NoneType=None, lin_first:bool=False
):

```

*Module grouping `BatchNorm1d`, `Dropout` and `Linear` layers*

The [`BatchNorm`](https://docs.fast.ai/layers.html#batchnorm) layer is
skipped if `bn=False`, as is the dropout if `p=0.`. Optionally, you can
add an activation after the linear layer with `act`.

``` python
tst = LinBnDrop(10, 20)
mods = list(tst.children())
test_eq(len(mods), 2)
assert isinstance(mods[0], nn.BatchNorm1d)
assert isinstance(mods[1], nn.Linear)

tst = LinBnDrop(10, 20, p=0.1)
mods = list(tst.children())
test_eq(len(mods), 3)
assert isinstance(mods[0], nn.BatchNorm1d)
assert isinstance(mods[1], nn.Dropout)
assert isinstance(mods[2], nn.Linear)

tst = LinBnDrop(10, 20, act=nn.ReLU(), lin_first=True)
mods = list(tst.children())
test_eq(len(mods), 3)
assert isinstance(mods[0], nn.Linear)
assert isinstance(mods[1], nn.ReLU)
assert isinstance(mods[2], nn.BatchNorm1d)

tst = LinBnDrop(10, 20, bn=False)
mods = list(tst.children())
test_eq(len(mods), 1)
assert isinstance(mods[0], nn.Linear)
```

## Inits

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L189"
target="_blank" style="float:right; font-size:smaller">source</a>

### sigmoid

``` python

def sigmoid(
    input, eps:float=1e-07
):

```

*Same as `torch.sigmoid`, plus clamping to `(eps,1-eps)`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L194"
target="_blank" style="float:right; font-size:smaller">source</a>

### sigmoid\_

``` python

def sigmoid_(
    input, eps:float=1e-07
):

```

*Same as `torch.sigmoid_`, plus clamping to `(eps,1-eps)`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L202"
target="_blank" style="float:right; font-size:smaller">source</a>

### vleaky_relu

``` python

def vleaky_relu(
    input, inplace:bool=True
):

```

*`F.leaky_relu` with 0.3 slope*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L215"
target="_blank" style="float:right; font-size:smaller">source</a>

### init_default

``` python

def init_default(
    m, func:function=kaiming_normal_
):

```

*Initialize `m` weights with `func` and set `bias` to 0.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L222"
target="_blank" style="float:right; font-size:smaller">source</a>

### init_linear

``` python

def init_linear(
    m, act_func:NoneType=None, init:str='auto', bias_std:float=0.01
):

```

*Call self as a function.*

## Convolutions

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L242"
target="_blank" style="float:right; font-size:smaller">source</a>

### ConvLayer

``` python

def ConvLayer(
    ni, nf, ks:int=3, stride:int=1, padding:NoneType=None, bias:NoneType=None, ndim:int=2,
    norm_type:NormType=<NormType.Batch: 1>, bn_1st:bool=True, act_cls:type=ReLU, transpose:bool=False,
    init:str='auto', xtra:NoneType=None, bias_std:float=0.01, dilation:Union=1, groups:int=1,
    padding_mode:Literal='zeros', device:NoneType=None, dtype:NoneType=None
):

```

*Create a sequence of convolutional (`ni` to `nf`), ReLU (if
`use_activ`) and `norm_type` layers.*

The convolution uses `ks` (kernel size), `stride`, `padding` and `bias`.
`padding` defaults to the appropriate value (`(ks-1)//2` if it’s not a
transposed conv). `bias` defaults to `True` if the `norm_type` is
`Spectral` or `Weight`, and to `False` if it’s `Batch` or `BatchZero`.
Note that if you don’t want any normalization, you should pass
`norm_type=None`.

This defines a conv layer with `ndim` (1, 2 or 3) that will be a
ConvTranspose if `transpose=True`. `act_cls` is the class of the
activation function to use (instantiated inside). Pass `act_cls=None` if
you don’t want an activation function. If you quickly want to change your
default activation, you can change the value of `defaults.activation`.

`init` is used to initialize the weights (the biases are initialized to
0) and `xtra` is an optional layer to add at the end.

``` python
tst = ConvLayer(16, 32)
mods = list(tst.children())
test_eq(len(mods), 3)
test_eq(mods[1].weight, torch.ones(32))
test_eq(mods[0].padding, (1,1))
```

``` python
x = torch.randn(64, 16, 8, 8)#.cuda()
```

``` python
#Padding is selected to make the shape the same if stride=1
test_eq(tst(x).shape, [64,32,8,8])
```

``` python
#Padding is selected to make the shape half if stride=2
tst = ConvLayer(16, 32, stride=2)
test_eq(tst(x).shape, [64,32,4,4])
```

``` python
#But you can always pass your own padding if you want
tst = ConvLayer(16, 32, padding=0)
test_eq(tst(x).shape, [64,32,6,6])
```
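
The shapes checked above follow from the usual output-size formula for a non-transposed, undilated convolution. The sketch below is plain arithmetic; the helper name `conv_out_size` is hypothetical, and the default padding is the `(ks-1)//2` value described earlier.

```python
def conv_out_size(n: int, ks: int = 3, stride: int = 1, padding=None) -> int:
    """Output length along one spatial axis of a non-transposed, undilated conv."""
    if padding is None:
        padding = (ks - 1) // 2              # default padding described above
    return (n + 2 * padding - ks) // stride + 1

same = conv_out_size(8)                      # stride 1, default padding
half = conv_out_size(8, stride=2)            # stride 2, default padding
no_pad = conv_out_size(8, padding=0)         # explicit padding of 0
```

With an 8x8 input and `ks=3`, this gives 8, 4 and 6 respectively, matching the three tests.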

``` python
#No bias by default for Batch NormType
assert mods[0].bias is None
#But can be overridden with `bias=True`
tst = ConvLayer(16, 32, bias=True)
assert first(tst.children()).bias is not None
#For no norm, or spectral/weight, bias is True by default
for t in [None, NormType.Spectral, NormType.Weight]:
    tst = ConvLayer(16, 32, norm_type=t)
    assert first(tst.children()).bias is not None
```

``` python
#Various n_dim/transpose
tst = ConvLayer(16, 32, ndim=3)
assert isinstance(list(tst.children())[0], nn.Conv3d)
tst = ConvLayer(16, 32, ndim=1, transpose=True)
assert isinstance(list(tst.children())[0], nn.ConvTranspose1d)
```

``` python
#No activation/leaky
tst = ConvLayer(16, 32, ndim=3, act_cls=None)
mods = list(tst.children())
test_eq(len(mods), 2)
tst = ConvLayer(16, 32, ndim=3, act_cls=partial(nn.LeakyReLU, negative_slope=0.1))
mods = list(tst.children())
test_eq(len(mods), 3)
assert isinstance(mods[2], nn.LeakyReLU)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L268"
target="_blank" style="float:right; font-size:smaller">source</a>

### AdaptiveAvgPool

``` python

def AdaptiveAvgPool(
    sz:int=1, ndim:int=2
):

```

*nn.AdaptiveAvgPool layer for `ndim`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L274"
target="_blank" style="float:right; font-size:smaller">source</a>

### MaxPool

``` python

def MaxPool(
    ks:int=2, stride:NoneType=None, padding:int=0, ndim:int=2, ceil_mode:bool=False
):

```

*nn.MaxPool layer for `ndim`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L280"
target="_blank" style="float:right; font-size:smaller">source</a>

### AvgPool

``` python

def AvgPool(
    ks:int=2, stride:NoneType=None, padding:int=0, ndim:int=2, ceil_mode:bool=False
):

```

*nn.AvgPool layer for `ndim`*

## Embeddings

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L286"
target="_blank" style="float:right; font-size:smaller">source</a>

### trunc_normal\_

``` python

def trunc_normal_(
    x, mean:float=0.0, std:float=1.0
):

```

*Truncated normal initialization (approximation)*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L292"
target="_blank" style="float:right; font-size:smaller">source</a>

### Embedding

``` python

def Embedding(
    ni, nf, std:float=0.01
):

```

*Embedding layer with truncated normal initialization*

Truncated normal initialization bounds the distribution to avoid large
values. For a given standard deviation `std`, the bounds are roughly
`-2*std` and `2*std`.

``` python
std = 0.02
tst = Embedding(10, 30, std)
assert tst.weight.min() > -2*std
assert tst.weight.max() < 2*std
test_close(tst.weight.mean(), 0, 1e-2)
test_close(tst.weight.std(), std, 0.1)
```
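
A simple way to approximate such a truncation, shown here only as a sketch, is to fold standard-normal samples into `(-2, 2)` with a floating-point remainder and then rescale. The function name `approx_trunc_normal_` is hypothetical, and whether fastai uses exactly this approach is an assumption.

```python
import torch

def approx_trunc_normal_(x: torch.Tensor, mean: float = 0.0, std: float = 1.0) -> torch.Tensor:
    """Hypothetical in-place approximation of a normal truncated at 2 standard deviations."""
    # fmod_(2) folds standard-normal samples into (-2, 2); then scale and shift.
    return x.normal_().fmod_(2).mul_(std).add_(mean)

std = 0.02
w = approx_trunc_normal_(torch.empty(10, 30), std=std)
```

Folding is not a true truncated normal (samples beyond two standard deviations are wrapped rather than rejected), which is why the result is only approximate.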

## Self attention

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L299"
target="_blank" style="float:right; font-size:smaller">source</a>

### SelfAttention

``` python

def SelfAttention(
    n_channels
):

```

*Self attention layer for `n_channels`.*

Self-attention layer as introduced in [Self-Attention Generative
Adversarial Networks](https://arxiv.org/abs/1805.08318).

Initially, the layer leaves its input unchanged. This is controlled by a
trainable parameter named `gamma`, as we return `x + gamma * out`.

``` python
tst = SelfAttention(16)
x = torch.randn(32, 16, 8, 8)
test_eq(tst(x),x)
```

Then during training `gamma` will probably change since it’s a trainable
parameter. Let’s see what happens when it gets a nonzero value.

``` python
tst.gamma.data.fill_(1.)
y = tst(x)
test_eq(y.shape, [32,16,8,8])
```

The attention mechanism requires three matrix multiplications (here
represented by 1x1 convs). The multiplications are done on the channel
level (the second dimension in our tensor) and we flatten the feature
map (which is 8x8 here). As in the paper, we denote the results of those
multiplications by `f`, `g` and `h`.

``` python
q,k,v = tst.query[0].weight.data,tst.key[0].weight.data,tst.value[0].weight.data
test_eq([q.shape, k.shape, v.shape], [[2, 16, 1], [2, 16, 1], [16, 16, 1]])
f,g,h = map(lambda m: x.view(32, 16, 64).transpose(1,2) @ m.squeeze().t(), [q,k,v])
test_eq([f.shape, g.shape, h.shape], [[32,64,2], [32,64,2], [32,64,16]])
```

The key part of the attention layer is to compute attention weights for
each location in the feature map (here 8x8 = 64). Those are positive
numbers that sum to 1 and tell the model to pay attention to this or
that part of the picture. We take the product of `f` and the transpose
of `g` (to get something of size bs by 64 by 64), then apply a softmax
on the first dimension after the batch (`dim=1`) to get the positive
numbers that sum up to 1. The result can then be multiplied with `h`
transposed to get an output of size bs by channels by 64, which can then
be viewed as an output the same size as the original input.

The final result is then `x + gamma * out` as we saw before.

``` python
beta = F.softmax(torch.bmm(f, g.transpose(1,2)), dim=1)
test_eq(beta.shape, [32, 64, 64])
out = torch.bmm(h.transpose(1,2), beta)
test_eq(out.shape, [32, 16, 64])
test_close(y, x + out.view(32, 16, 8, 8), eps=1e-4)
```
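
The walk-through above depends on the `tst` layer and on earlier cells. As a self-contained sketch, the same attention arithmetic can be run on random projections; the sizes and the names `bs`, `ch` and `n` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

bs, ch, n = 4, 16, 64                     # batch, channels, flattened 8x8 locations
f = torch.randn(bs, n, ch // 8)           # stand-in for the query projections
g = torch.randn(bs, n, ch // 8)           # stand-in for the key projections
h = torch.randn(bs, n, ch)                # stand-in for the value projections

beta = F.softmax(torch.bmm(f, g.transpose(1, 2)), dim=1)   # (bs, n, n)
out = torch.bmm(h.transpose(1, 2), beta)                   # (bs, ch, n)
col_sums = beta.sum(dim=1)                                 # totals along the softmax dim
```

Because the softmax runs over `dim=1`, each column of `beta` sums to 1, so every output location is a weighted average of the value vectors.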

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L318"
target="_blank" style="float:right; font-size:smaller">source</a>

### PooledSelfAttention2d

``` python

def PooledSelfAttention2d(
    n_channels
):

```

*Pooled self attention layer for 2d.*

Self-attention layer used in the [Big GAN
paper](https://arxiv.org/abs/1809.11096).

It uses the same attention as in
[`SelfAttention`](https://docs.fast.ai/layers.html#selfattention) but
adds a max pooling of stride 2 before computing the matrices `g` and
`h`: the attention is computed over the 2x2 max-pooled windows rather
than over every location of the feature map. There is also a final
matrix product applied to the output, before returning `gamma * out + x`.
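
A shape-level sketch of this idea in plain PyTorch is given below. It is not fastai's implementation: the layer names, the channel reductions (`ch // 8` and `ch // 2`) and the fixed `gamma` are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

bs, ch, hgt, wid = 2, 16, 8, 8
x = torch.randn(bs, ch, hgt, wid)

# Hypothetical 1x1 projections; the channel reductions are assumptions.
query, key = nn.Conv2d(ch, ch // 8, 1), nn.Conv2d(ch, ch // 8, 1)
value, proj = nn.Conv2d(ch, ch // 2, 1), nn.Conv2d(ch // 2, ch, 1)
gamma = torch.tensor(1.0)                                   # trainable in a real layer

f = query(x).view(bs, ch // 8, hgt * wid)                   # all 64 locations
g = F.max_pool2d(key(x), 2).view(bs, ch // 8, -1)           # 16 pooled locations
h = F.max_pool2d(value(x), 2).view(bs, ch // 2, -1)         # 16 pooled locations

beta = F.softmax(torch.bmm(f.transpose(1, 2), g), dim=-1)   # (bs, 64, 16)
out = torch.bmm(h, beta.transpose(1, 2)).view(bs, ch // 2, hgt, wid)
y = gamma * proj(out) + x                                   # final product, then residual
```

Pooling the key and value maps shrinks the attention matrix from 64x64 to 64x16 in this example, which is the main saving of the pooled variant.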

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L347"
target="_blank" style="float:right; font-size:smaller">source</a>

### SimpleSelfAttention

``` python

def SimpleSelfAttention(
    n_in:int, ks:int=1, sym:bool=False
):

```

*Same as `nn.Module`, but no need for subclasses to call
`super().__init__`*

## PixelShuffle

PixelShuffle was introduced in [this
article](https://arxiv.org/pdf/1609.05158.pdf) to avoid checkerboard
artifacts when upsampling images. If we want an output with `ch_out`
filters, we use a convolution with `ch_out * (r**2)` filters, where `r`
is the upsampling factor. Then we reorganize those filters like in the
picture below:

<img src="images/pixelshuffle.png" alt="Pixelshuffle" width="800" />
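
The reorganization can be tried directly with PyTorch's `nn.PixelShuffle`. The sketch below uses illustrative sizes (`ch_in`, `ch_out` and `r` are assumptions) and only checks the shapes and the channel-to-pixel layout.

```python
import torch
import torch.nn as nn

ch_in, ch_out, r = 16, 8, 2
conv = nn.Conv2d(ch_in, ch_out * r**2, kernel_size=1)   # r**2 filters per output channel
shuffle = nn.PixelShuffle(r)                            # rearrange them into an r-times larger grid

x = torch.randn(4, ch_in, 8, 8)
features = conv(x)               # (4, ch_out * r**2, 8, 8)
upsampled = shuffle(features)    # (4, ch_out, 8 * r, 8 * r)
```

Each group of `r**2` consecutive channels fills one `r` by `r` window of the output, so channel 1 at location `(0, 0)` ends up at pixel `(0, 1)` of output channel 0.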

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L369"
target="_blank" style="float:right; font-size:smaller">source</a>

### icnr_init

``` python

def icnr_init(
    x, scale:int=2, init:function=kaiming_normal_
):

```

*ICNR init of `x`, with `scale` and `init` function*

ICNR init was introduced in [this
article](https://arxiv.org/abs/1707.02937). It suggests initializing
the convolution that will be used in PixelShuffle so that each of the
`r**2` channels gets the same weight (so that in the picture above, the 9
colors in a 3 by 3 window are initially the same).

<div>

> **Note**
>
> This is done on the first dimension because PyTorch stores the weights
> of a convolutional layer in this format: `ch_out x ch_in x ks x ks`.

</div>

``` python
tst = torch.randn(16*4, 32, 1, 1)
tst = icnr_init(tst)
for i in range(0,16*4,4):
    test_eq(tst[i],tst[i+1])
    test_eq(tst[i],tst[i+2])
    test_eq(tst[i],tst[i+3])
```
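
The property tested above (each group of `scale**2` consecutive output filters is identical) can be obtained by initializing one kernel per group and repeating it. The function below is a hypothetical sketch in plain PyTorch, not fastai's `icnr_init`.

```python
import torch
import torch.nn as nn

def icnr_like_(weight: torch.Tensor, scale: int = 2, init=nn.init.kaiming_normal_) -> torch.Tensor:
    """Hypothetical sketch: give each group of scale**2 output filters the same values."""
    ch_out, ch_in, kh, kw = weight.shape
    base = init(torch.zeros(ch_out // scale**2, ch_in, kh, kw))   # one kernel per group
    with torch.no_grad():
        weight.copy_(base.repeat_interleave(scale**2, dim=0))     # repeat it scale**2 times
    return weight

w = icnr_like_(torch.empty(16 * 4, 32, 1, 1))
```

The repetition happens along dimension 0 because that is the output-channel dimension of a convolution weight, as the note above explains.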

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L379"
target="_blank" style="float:right; font-size:smaller">source</a>

### PixelShuffle_ICNR

``` python

def PixelShuffle_ICNR(
    ni, nf:NoneType=None, scale:int=2, blur:bool=False, norm_type:NormType=<NormType.Weight: 3>, act_cls:type=ReLU
):

```

*Upsample by `scale` from `ni` filters to `nf` (default `ni`), using
`nn.PixelShuffle`.*

The convolutional layer is initialized with
[`icnr_init`](https://docs.fast.ai/layers.html#icnr_init) and passed
`act_cls` and `norm_type` (the default of weight normalization seemed to
be what’s best for super-resolution problems, in our experiments).

The `blur` option comes from [Super-Resolution using Convolutional
Neural Networks without Any Checkerboard
Artifacts](https://arxiv.org/abs/1806.02658) where the authors add a
little bit of blur to completely get rid of checkerboard artifacts.

``` python
psfl = PixelShuffle_ICNR(16)
x = torch.randn(64, 16, 8, 8)
y = psfl(x)
test_eq(y.shape, [64, 16, 16, 16])
#ICNR init makes every 2x2 window (stride 2) have the same elements
for i in range(0,16,2):
    for j in range(0,16,2):
        test_eq(y[:,:,i,j],y[:,:,i+1,j])
        test_eq(y[:,:,i,j],y[:,:,i  ,j+1])
        test_eq(y[:,:,i,j],y[:,:,i+1,j+1])
```

``` python
psfl = PixelShuffle_ICNR(16, norm_type=None)
x = torch.randn(64, 16, 8, 8)
y = psfl(x)
test_eq(y.shape, [64, 16, 16, 16])
#ICNR init makes every 2x2 window (stride 2) have the same elements
for i in range(0,16,2):
    for j in range(0,16,2):
        test_eq(y[:,:,i,j],y[:,:,i+1,j])
        test_eq(y[:,:,i,j],y[:,:,i  ,j+1])
        test_eq(y[:,:,i,j],y[:,:,i+1,j+1])
```

``` python
psfl = PixelShuffle_ICNR(16, norm_type=NormType.Spectral)
x = torch.randn(64, 16, 8, 8)
y = psfl(x)
test_eq(y.shape, [64, 16, 16, 16])
#ICNR init makes every 2x2 window (stride 2) have the same elements
for i in range(0,16,2):
    for j in range(0,16,2):
        test_eq(y[:,:,i,j],y[:,:,i+1,j])
        test_eq(y[:,:,i,j],y[:,:,i  ,j+1])
        test_eq(y[:,:,i,j],y[:,:,i+1,j+1])
```

## Sequential extensions

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L395"
target="_blank" style="float:right; font-size:smaller">source</a>

### sequential

``` python

def sequential(
    args:VAR_POSITIONAL
):

```

*Create an `nn.Sequential`, wrapping items with
[`Lambda`](https://docs.fast.ai/layers.html#lambda) if needed*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L404"
target="_blank" style="float:right; font-size:smaller">source</a>

### SequentialEx

``` python

def SequentialEx(
    layers:VAR_POSITIONAL
):

```

*Like `nn.Sequential`, but with ModuleList semantics, and can access
module input*

This is useful for writing layers that need to remember the input (like
a resnet block) in a sequential way.
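
To illustrate what "can access module input" might look like, here is a minimal sketch in plain PyTorch. The classes `TinySequentialEx` and `AddShortcut` are hypothetical, and storing the input as an `orig` attribute on the running tensor is an assumption made for this example.

```python
import torch
import torch.nn as nn

class TinySequentialEx(nn.Module):
    """Hypothetical sketch: run layers in order, exposing the block input as `.orig`."""
    def __init__(self, *layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        res = x
        for layer in self.layers:
            res.orig = x            # let the layer see the block's original input
            nxt = layer(res)
            res.orig = None         # drop the reference once the layer has run
            res = nxt
        return res

class AddShortcut(nn.Module):
    """Hypothetical merge layer: add the stored input to the current activation."""
    def forward(self, x):
        return x + x.orig

lin = nn.Linear(8, 8)
block = TinySequentialEx(lin, nn.ReLU(), AddShortcut())
x = torch.randn(4, 8)
y = block(x)   # equals x + relu(lin(x))
```

Clearing `orig` after each layer avoids keeping extra references to the input alive on intermediate tensors.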

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L424"
target="_blank" style="float:right; font-size:smaller">source</a>

### MergeLayer

``` python

def MergeLayer(
    dense:bool=False
):

```

*Merge a shortcut with the result of the module by adding them or
concatenating them if `dense=True`.*

``` python
res_block = SequentialEx(ConvLayer(16, 16), ConvLayer(16,16))
res_block.append(MergeLayer()) # just to test append - normally it would be in init params
x = torch.randn(32, 16, 8, 8)
y = res_block(x)
test_eq(y.shape, [32, 16, 8, 8])
test_eq(y, x + res_block[1](res_block[0](x)))
```

``` python
x = TensorBase(torch.randn(32, 16, 8, 8))
y = res_block(x)
test_is(y.orig, None)
```

## Concat

Equivalent to `keras.layers.Concatenate`, it concatenates the outputs of
a ModuleList over a given dimension (by default the filter dimension).

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L430"
target="_blank" style="float:right; font-size:smaller">source</a>

### Cat

``` python

def Cat(
    layers, dim:int=1
):

```

*Concatenate layers outputs over a given dim*

``` python
layers = [ConvLayer(2,4), ConvLayer(2,4), ConvLayer(2,4)] 
x = torch.rand(1,2,8,8) 
cat = Cat(layers) 
test_eq(cat(x).shape, [1,12,8,8]) 
test_eq(cat(x), torch.cat([l(x) for l in layers], dim=1))
```

## Ready-to-go models

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L438"
target="_blank" style="float:right; font-size:smaller">source</a>

### SimpleCNN

``` python

def SimpleCNN(
    filters, kernel_szs:NoneType=None, strides:NoneType=None, bn:bool=True
):

```

*Create a simple CNN with `filters`.*

The model is a succession of convolutional layers from
`(filters[0],filters[1])` to `(filters[n-2],filters[n-1])` (if `n` is
the length of the `filters` list) followed by a
[`PoolFlatten`](https://docs.fast.ai/layers.html#poolflatten).
`kernel_szs` and `strides` default to a list of 3s and a list of 2s. If
`bn=True` the convolutional layers are successions of
conv-relu-batchnorm, otherwise conv-relu.

``` python
tst = SimpleCNN([8,16,32])
mods = list(tst.children())
test_eq(len(mods), 3)
test_eq([[m[0].in_channels, m[0].out_channels] for m in mods[:2]], [[8,16], [16,32]])
```

Test kernel sizes

``` python
tst = SimpleCNN([8,16,32], kernel_szs=[1,3])
mods = list(tst.children())
test_eq([m[0].kernel_size for m in mods[:2]], [(1,1), (3,3)])
```

Test strides

``` python
tst = SimpleCNN([8,16,32], strides=[1,2])
mods = list(tst.children())
test_eq([m[0].stride for m in mods[:2]], [(1,1),(2,2)])
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L450"
target="_blank" style="float:right; font-size:smaller">source</a>

### ProdLayer

``` python

def ProdLayer(
    
):

```

*Merge a shortcut with the result of the module by multiplying them.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L458"
target="_blank" style="float:right; font-size:smaller">source</a>

### SEModule

``` python

def SEModule(
    ch, reduction, act_cls:type=ReLU
):

```

*Call self as a function.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L466"
target="_blank" style="float:right; font-size:smaller">source</a>

### ResBlock

``` python

def ResBlock(
    expansion, ni, nf, stride:int=1, groups:int=1, reduction:NoneType=None, nh1:NoneType=None, nh2:NoneType=None,
    dw:bool=False, g2:int=1, sa:bool=False, sym:bool=False, norm_type:NormType=<NormType.Batch: 1>,
    act_cls:type=ReLU, ndim:int=2, ks:int=3, pool:function=AvgPool, pool_first:bool=True, padding:NoneType=None,
    bias:NoneType=None, bn_1st:bool=True, transpose:bool=False, init:str='auto', xtra:NoneType=None,
    bias_std:float=0.01, dilation:Union=1, padding_mode:Literal='zeros', device:NoneType=None, dtype:NoneType=None
):

```

*Resnet block from `ni` to `nh` with `stride`*

This is a resnet block (normal or bottleneck depending on `expansion`, 1
for the normal block and 4 for the traditional bottleneck) that
implements the tweaks from [Bag of Tricks for Image Classification with
Convolutional Neural Networks](https://arxiv.org/abs/1812.01187). In
particular, the last batchnorm layer (if that is the selected
`norm_type`) is initialized with a weight (or gamma) of zero to
facilitate the flow from the beginning to the end of the network. It
also implements optional [Squeeze and
Excitation](https://arxiv.org/abs/1709.01507) and grouped convs for
[ResNeXT](https://arxiv.org/abs/1611.05431) and similar models (use
`dw=True` for depthwise convs).

The `kwargs` are passed to
[`ConvLayer`](https://docs.fast.ai/layers.html#convlayer) along with
`norm_type`.
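
To make the role of `expansion` more concrete, here is a rough sketch of how channel widths might be laid out in a normal versus a bottleneck block. The helper `res_block_conv_plan` is hypothetical, and the layout is an assumption based on the usual ResNet design (block input and output widths scaled by `expansion`, hidden widths defaulting to `nf`), not a copy of the library source.

```python
def res_block_conv_plan(expansion, ni, nf, ks=3, nh1=None, nh2=None):
    """Hypothetical helper: (in_channels, out_channels, kernel_size) per conv in the main path."""
    # Assumption: hidden widths fall back to nf when left unset.
    if nh2 is None: nh2 = nf
    if nh1 is None: nh1 = nh2
    # Assumption: block input and output widths are scaled by the expansion factor.
    ni, nf = ni * expansion, nf * expansion
    if expansion == 1:
        # Normal block: two ks x ks convolutions.
        return [(ni, nh2, ks), (nh2, nf, ks)]
    # Bottleneck: 1x1 reduce, ks x ks, then 1x1 expand.
    return [(ni, nh1, 1), (nh1, nh2, ks), (nh2, nf, 1)]

print(res_block_conv_plan(1, 64, 64))   # [(64, 64, 3), (64, 64, 3)]
print(res_block_conv_plan(4, 64, 128))  # [(256, 128, 1), (128, 128, 3), (128, 512, 1)]
```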

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L497"
target="_blank" style="float:right; font-size:smaller">source</a>

### SEBlock

``` python

def SEBlock(
    expansion, ni, nf, groups:int=1, reduction:int=16, stride:int=1, kwargs:VAR_KEYWORD
):

```

*Call self as a function.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L501"
target="_blank" style="float:right; font-size:smaller">source</a>

### SEResNeXtBlock

``` python

def SEResNeXtBlock(
    expansion, ni, nf, groups:int=32, reduction:int=16, stride:int=1, base_width:int=4, kwargs:VAR_KEYWORD
):

```

*Call self as a function.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L506"
target="_blank" style="float:right; font-size:smaller">source</a>

### SeparableBlock

``` python

def SeparableBlock(
    expansion, ni, nf, reduction:int=16, stride:int=1, base_width:int=4, kwargs:VAR_KEYWORD
):

```

*Call self as a function.*

## Time Distributed Layer

Equivalent to the Keras
[`TimeDistributed`](https://docs.fast.ai/layers.html#timedistributed)
layer: it applies a PyTorch
[`Module`](https://docs.fast.ai/torch_core.html#module) over an axis.

``` python
bs, seq_len = 2, 5
x, y = torch.rand(bs,seq_len,3,2,2), torch.rand(bs,seq_len,3,2,2)
```

``` python
tconv = TimeDistributed(nn.Conv2d(3,4,1))
test_eq(tconv(x).shape, (2,5,4,2,2))
tconv.low_mem=True
test_eq(tconv(x).shape, (2,5,4,2,2))
```

``` python
class Mod(Module):
    def __init__(self):
        self.conv = nn.Conv2d(3,4,1)
    def forward(self, x, y):
        return self.conv(x) + self.conv(y)
tmod = TimeDistributed(Mod())
```

``` python
out = tmod(x,y)
test_eq(out.shape, (2,5,4,2,2))
tmod.low_mem=True
out_low_mem = tmod(x,y)
test_eq(out_low_mem.shape, (2,5,4,2,2))
test_eq(out, out_low_mem)
```

``` python
class Mod2(Module):
    def __init__(self):
        self.conv = nn.Conv2d(3,4,1)
    def forward(self, x, y):
        return self.conv(x), self.conv(y)
tmod2 = TimeDistributed(Mod2())
```

``` python
out = tmod2(x,y)
test_eq(len(out), 2)
test_eq(out[0].shape, (2,5,4,2,2))
tmod2.low_mem=True
out_low_mem = tmod2(x,y)
test_eq(out_low_mem[0].shape, (2,5,4,2,2))
test_eq(out, out_low_mem)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L515"
target="_blank" style="float:right; font-size:smaller">source</a>

### TimeDistributed

``` python

def TimeDistributed(
    module, low_mem:bool=False, tdim:int=1
):

```

*Applies [`module`](https://docs.fast.ai/layers.html#module) over `tdim`
identically for each step, use `low_mem` to compute one at a time.*

This module is equivalent to [Keras TimeDistributed
Layer](https://keras.io/api/layers/recurrent_layers/time_distributed/).
This wrapper applies a layer to every temporal slice of an
input. By default, the time axis (`tdim`) is assumed to be axis 1
(the one right after the batch axis). A typical usage would be to encode a
sequence of images using an image encoder.

The `forward` function of
[`TimeDistributed`](https://docs.fast.ai/layers.html#timedistributed)
supports `*args` and `**kwargs`, but only `args` are split and
passed to the underlying module independently for each timestep;
`kwargs` are passed as they are. This is useful when a module
takes multiple inputs: put every tensor that needs splitting in `args`
and every argument that does not need splitting in `kwargs`.
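
The split between positional and keyword arguments can be illustrated with a toy, tensor-free sketch. The names `apply_per_timestep` and `scaled_sum` are hypothetical; nested lists indexed as `arg[batch][time]` stand in for tensors, and the loop processes one timestep at a time rather than folding timesteps into the batch.

```python
def apply_per_timestep(fn, *args, **kwargs):
    """Toy stand-in: each positional arg is indexed as arg[batch][time]."""
    bs, seq_len = len(args[0]), len(args[0][0])
    per_step = []
    for t in range(seq_len):
        # Slice every positional argument at timestep t: one item per batch element.
        slices = [[a[b][t] for b in range(bs)] for a in args]
        # Keyword arguments are forwarded unchanged at every step.
        per_step.append(fn(*slices, **kwargs))
    # Re-stack so the result is indexed as out[batch][time].
    return [[per_step[t][b] for t in range(seq_len)] for b in range(bs)]

def scaled_sum(xs, ys, scale=1):
    """Toy 'module' taking two batched inputs and one non-split option."""
    return [scale * (x + y) for x, y in zip(xs, ys)]

x = [[1, 2, 3], [4, 5, 6]]
y = [[10, 20, 30], [40, 50, 60]]
out = apply_per_timestep(scaled_sum, x, y, scale=2)
print(out)  # [[22, 44, 66], [88, 110, 132]]
```

Here `x` and `y` are sliced along the time axis, while `scale` reaches `scaled_sum` untouched at every step.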

> This module is heavy on memory, as it will try to pass multiple
> timesteps at the same time on the batch dimension. If you get
> out-of-memory errors, try first reducing your batch size by the number
> of timesteps.

``` python
from fastai.vision.all import *
```

``` python
encoder = create_body(resnet18())
```

A resnet18 body encodes its input into a feature map with 512 channels;
height and width are each divided by 32.

``` python
time_resnet = TimeDistributed(encoder)
```

A synthetic batch of 2 image sequences of length 5, with shape
`(bs, seq_len, ch, w, h)`:

``` python
image_sequence = torch.rand(2, 5, 3, 64, 64)
```

``` python
time_resnet(image_sequence).shape
```

    torch.Size([2, 5, 512, 2, 2])

This way, one can encode a sequence of images into feature space. There
is also a `low_mem_forward` that passes images one at a time to reduce
GPU memory consumption.

``` python
time_resnet.low_mem_forward(image_sequence).shape
```

    torch.Size([2, 5, 512, 2, 2])
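
The shapes in this example follow from simple arithmetic, sketched here with hypothetical helper names. It assumes the spatial size is a multiple of 32 and that, in the default forward, timesteps are stacked onto the batch dimension as the memory note above describes.

```python
def encoded_shape(bs, seq_len, height, width, n_channels=512, reduction=32):
    """Expected output shape for an encoder that shrinks height and width by `reduction`."""
    return (bs, seq_len, n_channels, height // reduction, width // reduction)

def folded_batch_size(bs, seq_len):
    """Number of images the wrapped module sees at once when timesteps join the batch axis."""
    return bs * seq_len

print(encoded_shape(2, 5, 64, 64))  # (2, 5, 512, 2, 2)
print(folded_batch_size(2, 5))      # 10
```

The second number is why memory use grows with sequence length: the encoder effectively runs on a batch of `bs * seq_len` images.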

## Swish and Mish

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L575"
target="_blank" style="float:right; font-size:smaller">source</a>

### swish

``` python

def swish(
    x, inplace:bool=False
):

```

*Call self as a function.*
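
Swish is commonly defined as `x * sigmoid(x)`. The scalar sketch below assumes that standard definition and uses only the `math` module; `swish_scalar` is a hypothetical name, not the tensor implementation documented here.

```python
import math

def swish_scalar(x):
    """Scalar Swish: x multiplied by the logistic sigmoid of x."""
    sigmoid = 1.0 / (1.0 + math.exp(-x))
    return x * sigmoid

print(swish_scalar(0.0))            # 0.0
print(round(swish_scalar(1.0), 4))  # 0.7311
```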

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L578"
target="_blank" style="float:right; font-size:smaller">source</a>

### SwishJit

``` python

def SwishJit(
    
):

```

*Same as `nn.Module`, but no need for subclasses to call
`super().__init__`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L591"
target="_blank" style="float:right; font-size:smaller">source</a>

### MishJitAutoFn

``` python

def MishJitAutoFn(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Base class to create custom `autograd.Function`.*

To create a custom `autograd.Function`, subclass this class and
implement the `forward` and `backward` static methods. Then,
to use your custom op in the forward pass, call the class method
`apply`. Do not call `forward` directly.

To ensure correctness and best performance, make sure you are calling
the correct methods on `ctx` and validating your backward function using
`torch.autograd.gradcheck`.

See the PyTorch notes on extending `torch.autograd` for more details on
how to use this class.

Example:

    >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_AUTOGRAD)
    >>> class Exp(Function):
    >>>     @staticmethod
    >>>     def forward(ctx, i):
    >>>         result = i.exp()
    >>>         ctx.save_for_backward(result)
    >>>         return result
    >>>
    >>>     @staticmethod
    >>>     def backward(ctx, grad_output):
    >>>         result, = ctx.saved_tensors
    >>>         return grad_output * result
    >>>
    >>> # Use it by calling the apply method:
    >>> # xdoctest: +SKIP
    >>> output = Exp.apply(input)
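
In the same spirit as the gradient check recommended above, a hand-written backward for Mish can be validated against finite differences. This is a scalar, pure-Python sketch that assumes the usual definition `mish(x) = x * tanh(softplus(x))`. All function names are hypothetical, and it does not call `torch.autograd.gradcheck` itself.

```python
import math

def softplus(x):
    return math.log1p(math.exp(x))

def mish_scalar(x):
    """Scalar Mish under the assumed definition x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))

def mish_grad(x):
    """Analytic derivative: tanh(sp) + x * sigmoid(x) * (1 - tanh(sp)^2)."""
    sig = 1.0 / (1.0 + math.exp(-x))
    tsp = math.tanh(softplus(x))
    return tsp + x * sig * (1.0 - tsp * tsp)

def numeric_grad(f, x, eps=1e-6):
    """Central finite difference, used as the reference value."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert math.isclose(mish_grad(x), numeric_grad(mish_scalar, x),
                        rel_tol=1e-5, abs_tol=1e-7)
print("analytic and numeric gradients agree")
```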

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L603"
target="_blank" style="float:right; font-size:smaller">source</a>

### mish

``` python

def mish(
    x, inplace:bool=False
):

```

*Call self as a function.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L606"
target="_blank" style="float:right; font-size:smaller">source</a>

### MishJit

``` python

def MishJit(
    
):

```

*Same as `nn.Module`, but no need for subclasses to call
`super().__init__`*

## Helper functions for submodules

It’s easy to get the list of all parameters of a given model. When
you want all submodules (like linear/conv layers) without forgetting
lone parameters, the following class wraps those parameters in fake
modules.

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L617"
target="_blank" style="float:right; font-size:smaller">source</a>

### ParameterModule

``` python

def ParameterModule(
    p
):

```

*Register a lone parameter `p` in a module.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L623"
target="_blank" style="float:right; font-size:smaller">source</a>

### children_and_parameters

``` python

def children_and_parameters(
    m
):

```

*Return the children of `m` and its direct parameters not registered in
modules.*

``` python
class TstModule(Module):
    def __init__(self): self.a,self.lin = nn.Parameter(torch.randn(1)),nn.Linear(5,10)

tst = TstModule()
children = children_and_parameters(tst)
test_eq(len(children), 2)
test_eq(children[0], tst.lin)
assert isinstance(children[1], ParameterModule)
test_eq(children[1].val, tst.a)
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L632"
target="_blank" style="float:right; font-size:smaller">source</a>

### has_children

``` python

def has_children(
    m
):

```

*Call self as a function.*

``` python
class A(Module): pass
assert not has_children(A())
assert has_children(TstModule())
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L638"
target="_blank" style="float:right; font-size:smaller">source</a>

### flatten_model

``` python

def flatten_model(
    m
):

```

*Return the list of all submodules and parameters of `m`*

``` python
tst = nn.Sequential(TstModule(), TstModule())
children = flatten_model(tst)
test_eq(len(children), 4)
assert isinstance(children[1], ParameterModule)
assert isinstance(children[3], ParameterModule)
```
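
The recursion behind this kind of flattening can be sketched without PyTorch. The `Toy*` classes and functions below are hypothetical stand-ins; they assume a module with no child modules is treated as a leaf, and that lone parameters of a non-leaf module are wrapped so they appear in the flat list.

```python
class ToyModule:
    """Stand-in for an nn.Module: holds child modules and lone parameters."""
    def __init__(self, children=(), params=()):
        self.children, self.params = list(children), list(params)

class ToyParameterModule:
    """Stand-in for ParameterModule: wraps one lone parameter."""
    def __init__(self, p): self.val = p

def toy_children_and_parameters(m):
    # Children first, then each direct parameter wrapped in its own module.
    return m.children + [ToyParameterModule(p) for p in m.params]

def toy_has_children(m):
    return isinstance(m, ToyModule) and len(m.children) > 0

def toy_flatten_model(m):
    # Leaves are returned as-is; everything else is expanded recursively.
    if not toy_has_children(m): return [m]
    flat = []
    for child in toy_children_and_parameters(m):
        flat.extend(toy_flatten_model(child))
    return flat

def make_tst():
    # Mirrors TstModule: one leaf layer plus one lone parameter "a".
    return ToyModule(children=[ToyModule(params=["weight", "bias"])], params=["a"])

model = ToyModule(children=[make_tst(), make_tst()])
flat = toy_flatten_model(model)
print([type(m).__name__ for m in flat])
# ['ToyModule', 'ToyParameterModule', 'ToyModule', 'ToyParameterModule']
```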

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L643"
target="_blank" style="float:right; font-size:smaller">source</a>

### NoneReduce

``` python

def NoneReduce(
    loss_func
):

```

*A context manager to evaluate `loss_func` with none reduce.*

``` python
x,y = torch.randn(5),torch.randn(5)
loss_fn = nn.MSELoss()
with NoneReduce(loss_fn) as loss_func:
    loss = loss_func(x,y)
test_eq(loss.shape, [5])
test_eq(loss_fn.reduction, 'mean')

loss_fn = F.mse_loss
with NoneReduce(loss_fn) as loss_func:
    loss = loss_func(x,y)
test_eq(loss.shape, [5])
test_eq(loss_fn, F.mse_loss)
```
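
The behaviour tested above (per-item losses inside the `with` block, original reduction restored afterwards) can be sketched with the standard library. Everything here is a hypothetical illustration: `none_reduce`, `ToyMSE`, and `toy_mse` are toy names, and the approach of switching a `reduction` attribute or binding `reduction="none"` with `partial` is an assumption about one reasonable design.

```python
from contextlib import contextmanager
from functools import partial

@contextmanager
def none_reduce(loss_func):
    if hasattr(loss_func, "reduction"):
        # Object-style loss: switch the attribute, then restore it on exit.
        previous = loss_func.reduction
        loss_func.reduction = "none"
        try:
            yield loss_func
        finally:
            loss_func.reduction = previous
    else:
        # Function-style loss: bind the keyword without touching the original.
        yield partial(loss_func, reduction="none")

def toy_mse(xs, ys, reduction="mean"):
    errs = [(x - y) ** 2 for x, y in zip(xs, ys)]
    return errs if reduction == "none" else sum(errs) / len(errs)

class ToyMSE:
    def __init__(self, reduction="mean"): self.reduction = reduction
    def __call__(self, xs, ys): return toy_mse(xs, ys, reduction=self.reduction)

loss_fn = ToyMSE()
with none_reduce(loss_fn) as f:
    per_item = f([1.0, 2.0], [0.0, 4.0])
print(per_item)           # [1.0, 4.0]
print(loss_fn.reduction)  # mean
```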

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/layers.py#L658"
target="_blank" style="float:right; font-size:smaller">source</a>

### in_channels

``` python

def in_channels(
    m
):

```

*Return the shape of the first weight layer in `m`.*

``` python
test_eq(in_channels(nn.Sequential(nn.Conv2d(5,4,3), nn.Conv2d(4,3,3))), 5)
test_eq(in_channels(nn.Sequential(nn.AvgPool2d(4), nn.Conv2d(4,3,3))), 4)
test_eq(in_channels(nn.Sequential(BatchNorm(4), nn.Conv2d(4,3,3))), 4)
test_eq(in_channels(nn.Sequential(InstanceNorm(4), nn.Conv2d(4,3,3))), 4)
test_eq(in_channels(nn.Sequential(InstanceNorm(4, affine=False), nn.Conv2d(4,3,3))), 4)
test_fail(lambda : in_channels(nn.Sequential(nn.AvgPool2d(4))))
```
