# Training callbacks


------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/callback/training.py#L16"
target="_blank" style="float:right; font-size:smaller">source</a>

### ShortEpochCallback

``` python

def ShortEpochCallback(
    pct:float=0.01, short_valid:bool=True
):

```

*Fit just `pct` of an epoch, then stop*

``` python
learn = synth_learner()
learn.fit(1, cbs=ShortEpochCallback())
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td></td>
<td></td>
<td>00:00</td>
</tr>
</tbody>
</table>

``` python
learn = synth_learner()
learn.fit(1, cbs=ShortEpochCallback(short_valid=False))
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td></td>
<td>8.432135</td>
<td>00:00</td>
</tr>
</tbody>
</table>
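
The stopping rule can be illustrated without fastai. The sketch below is a minimal, hypothetical model of it: it assumes the check runs after every batch and compares a zero-based batch index, divided by the number of batches, against `pct`. The function name `batches_before_stop` is made up for illustration and is not part of the library.

```python
def batches_before_stop(n_iter: int, pct: float = 0.01) -> int:
    """Count how many batches run before the (assumed) check stops the loop."""
    ran = 0
    for i in range(n_iter):        # i plays the role of a zero-based batch index
        ran += 1                   # the batch itself is processed first
        if i / n_iter >= pct:      # assumed check, performed after the batch
            break                  # stands in for cancelling the rest of the epoch
    return ran

print(batches_before_stop(10))          # default pct on a 10-batch epoch
print(batches_before_stop(10, pct=0.5)) # stop roughly halfway through
```

Under these assumptions at least one batch always runs, and a 10-batch epoch with the default `pct=0.01` stops after its second batch.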

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/callback/training.py#L25"
target="_blank" style="float:right; font-size:smaller">source</a>

### GradientAccumulation

``` python

def GradientAccumulation(
    n_acc:int=32
):

```

*Accumulate gradients before updating weights*

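`n_acc` is the number of samples to accumulate before each weight
update. When it is larger than the total number of samples seen during
training, no update is ever applied, so the parameters (and therefore
the validation loss) don’t change at all: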

``` python
learn = synth_learner()
learn.fit(1, lr=0.01, cbs=GradientAccumulation(n_acc=1000))
# ensure valid_loss didn't change
assert learn.recorder.values[-1][1] == learn.recorder.values[0][1]
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>20.987558</td>
<td>26.849480</td>
<td>00:00</td>
</tr>
</tbody>
</table>
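
To see why a very large `n_acc` prevents any update, here is a minimal sketch of the counting logic in plain Python. It assumes that the callback adds each batch's size to a counter, lets the optimizer step only once the counter reaches `n_acc`, and then resets it. The helper `count_optimizer_steps` and the 10 batches of 16 samples are illustrative assumptions, not part of fastai.

```python
def count_optimizer_steps(batch_sizes, n_acc=32):
    """Return how many weight updates a run over `batch_sizes` would trigger."""
    seen, steps = 0, 0
    for bs in batch_sizes:
        seen += bs              # gradients from this batch are accumulated
        if seen < n_acc:
            continue            # not enough samples yet: skip the update
        steps += 1              # enough samples: update the weights once
        seen = 0                # and start a new accumulation window
    return steps

batches = [16] * 10             # illustrative: 10 batches of 16 samples

print(count_optimizer_steps(batches, n_acc=32))    # one update every 2 batches
print(count_optimizer_steps(batches, n_acc=1000))  # threshold is never reached
```

With these numbers the first call reports 5 updates and the second reports 0, which matches the unchanged validation loss in the example above.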

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/callback/training.py#L39"
target="_blank" style="float:right; font-size:smaller">source</a>

### GradientClip

``` python

def GradientClip(
    max_norm:float=1.0, norm_type:float=2.0
):

```

*Clip norm of gradients*

Normally, if we use a learning rate that is too high, training
diverges. This happens even with mixed precision training, which avoids
infinities through dynamic loss scaling but still diverges:

``` python
fp16 = MixedPrecision()
```

``` python
set_seed(99)
learn = synth_learner(lr=1.1, cuda=True)
learn.fit(3, cbs=fp16)
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>38.214138</td>
<td>25.269005</td>
<td>00:00</td>
</tr>
<tr>
<td>1</td>
<td>377.145508</td>
<td>890.010376</td>
<td>00:00</td>
</tr>
<tr>
<td>2</td>
<td>839.392883</td>
<td>9965.747070</td>
<td>00:00</td>
</tr>
</tbody>
</table>

By adding the
[`GradientClip`](https://docs.fast.ai/callback.training.html#gradientclip)
callback, the norm of the gradients (of type `norm_type`, default: 2) is
clipped to at most `max_norm` (default: 1) using
`nn.utils.clip_grad_norm_`, which can avoid loss divergence:

``` python
set_seed(99)
learn = synth_learner(lr=1.1, cuda=True)
learn.fit(3, cbs=[GradientClip,fp16])
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>2.039428</td>
<td>2.372177</td>
<td>00:00</td>
</tr>
<tr>
<td>1</td>
<td>1.402425</td>
<td>0.300728</td>
<td>00:00</td>
</tr>
<tr>
<td>2</td>
<td>1.013548</td>
<td>0.332610</td>
<td>00:00</td>
</tr>
</tbody>
</table>
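
The clipping step itself is simple arithmetic. The sketch below reproduces the idea behind norm clipping in plain Python, so it runs without PyTorch: compute one norm over all gradients together, then rescale every gradient by the same factor if that norm exceeds `max_norm`. The function name `clip_by_global_norm` and the small `eps` guard against division by zero are assumptions made for this illustration.

```python
import math

def clip_by_global_norm(grads, max_norm=1.0, norm_type=2.0, eps=1e-6):
    """Scale all gradients so that their joint norm is at most `max_norm`.

    `grads` is a list of per-parameter gradient lists (plain floats).
    Returns the rescaled gradients and the norm measured before clipping.
    """
    flat = [g for group in grads for g in group]
    if math.isinf(norm_type):
        total_norm = max(abs(g) for g in flat)
    else:
        total_norm = sum(abs(g) ** norm_type for g in flat) ** (1.0 / norm_type)
    # Never scale gradients up: the coefficient is capped at 1.
    coef = min(1.0, max_norm / (total_norm + eps))
    return [[g * coef for g in group] for group in grads], total_norm

# Two "parameters" whose gradients have a joint L2 norm of 5.
clipped, norm_before = clip_by_global_norm([[3.0], [4.0]])
norm_after = math.sqrt(sum(g * g for group in clipped for g in group))
print(norm_before, norm_after)
```

Because every gradient is multiplied by the same coefficient, the direction of the update is preserved and only its length is limited, which is why a learning rate that would otherwise diverge can still make progress.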

## BnFreeze

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/callback/training.py#L56"
target="_blank" style="float:right; font-size:smaller">source</a>

### BnFreeze

``` python

def BnFreeze(
    after_create:NoneType=None, before_fit:NoneType=None, before_epoch:NoneType=None, before_train:NoneType=None,
    before_batch:NoneType=None, after_pred:NoneType=None, after_loss:NoneType=None, before_backward:NoneType=None,
    after_cancel_backward:NoneType=None, after_backward:NoneType=None, before_step:NoneType=None,
    after_cancel_step:NoneType=None, after_step:NoneType=None, after_cancel_batch:NoneType=None,
    after_batch:NoneType=None, after_cancel_train:NoneType=None, after_train:NoneType=None,
    before_validate:NoneType=None, after_cancel_validate:NoneType=None, after_validate:NoneType=None,
    after_cancel_epoch:NoneType=None, after_epoch:NoneType=None, after_cancel_fit:NoneType=None,
    after_fit:NoneType=None
):

```

*Basic class handling tweaks of the training loop by changing a
[`Learner`](https://docs.fast.ai/learner.html#learner) in various
events*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/callback/training.py#L48"
target="_blank" style="float:right; font-size:smaller">source</a>

### set_bn_eval

``` python

def set_bn_eval(
    m:Module, use_eval:bool=True
)->None:

```

*Set bn layers in eval mode for all recursive children of `m`.*

[`BnFreeze`](https://docs.fast.ai/callback.training.html#bnfreeze) is
useful when you’d like to train two separate models that have a common
feature extractor / body. The only part of the model that’s different is
the head that you attach for transfer learning.

[`Learner.freeze()`](https://docs.fast.ai/learner.html#learner.freeze)
doesn’t suffice here, as the
[`BatchNorm`](https://docs.fast.ai/layers.html#batchnorm) layers are
trainable by default and the running mean and variance of batches are
still tracked. For feature extractors to fully match, you need to set
`train_bn=False` and also freeze these statistics, which is precisely
the function of
[`BnFreeze`](https://docs.fast.ai/callback.training.html#bnfreeze).
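
The distinction between frozen weights and frozen statistics can be shown with a toy model. The class below is a hypothetical, pure-Python stand-in for a batch norm layer (it is not the PyTorch implementation): in train mode each forward pass nudges the running statistics towards the current batch, with no optimizer involved, while in eval mode they are left alone. The momentum of 0.1 follows PyTorch's default, and using the biased batch variance for the running estimate is a simplification.

```python
class ToyBatchNorm:
    """Toy 1-D batch norm over a list of floats (not the PyTorch layer)."""
    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum, self.eps = momentum, eps
        self.running_mean, self.running_var = 0.0, 1.0
        self.training = True

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)

    def forward(self, batch):
        if self.training:
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Train mode: running statistics drift towards the batch statistics,
            # even though no gradient or optimizer step is involved.
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Eval mode: normalize with the stored statistics, leave them untouched.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]

bn = ToyBatchNorm()
bn.forward([1.0, 2.0, 3.0])          # train mode: running_mean moves
after_train = bn.running_mean
bn.eval().forward([10.0, 10.0])      # eval mode: running_mean stays put
after_eval = bn.running_mean
print(after_train, after_eval)
```

In this toy, switching the layer to eval mode is the only thing that stops the statistics from moving, which is the behaviour `set_bn_eval` is described as applying to every batch norm layer of a model.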

``` python
path = untar_data(URLs.MNIST_TINY)
dls  = ImageDataLoaders.from_folder(path, valid_pct=0.2)
```

We first demonstrate the mismatch of the running stats when using only
`train_bn=False`, by creating a
[`Learner`](https://docs.fast.ai/learner.html#learner)…:

``` python
learn1 = vision_learner(deepcopy(dls), resnet18, pretrained=True, train_bn=False)
```

…and grab the first
[`BatchNorm`](https://docs.fast.ai/layers.html#batchnorm) layer, and
store its running mean:

``` python
m = learn1.model[0][1].running_mean.clone()
```

After training for one epoch, you can see that the running mean has changed:

``` python
learn1.fit(1, lr=0.02)
test_ne(to_detach(learn1.model[0][1].running_mean), m)
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1.148303</td>
<td>0.739404</td>
<td>00:12</td>
</tr>
</tbody>
</table>

When we use the
[`BnFreeze`](https://docs.fast.ai/callback.training.html#bnfreeze)
callback, the running statistics will not be changed during training.
This is often important for getting good results from transfer learning.

``` python
learn1 = vision_learner(deepcopy(dls), resnet18, pretrained=True, train_bn=False, cbs=BnFreeze)
m = learn1.model[0][1].running_mean.detach().clone()
learn1.fit(1, lr=0.02)
test_eq(to_detach(learn1.model[0][1].running_mean), m)
```

<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: left;">
<th data-quarto-table-cell-role="th">epoch</th>
<th data-quarto-table-cell-role="th">train_loss</th>
<th data-quarto-table-cell-role="th">valid_loss</th>
<th data-quarto-table-cell-role="th">time</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0.478594</td>
<td>0.270772</td>
<td>00:10</td>
</tr>
</tbody>
</table>
