# Core vision


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

``` python
from fastai.data.external import *
```

## Helpers

``` python
im = Image.open(TEST_IMAGE).resize((30,20))
```

------------------------------------------------------------------------

### Image.n_px

``` python

def n_px(
    x:Image
):

```

*Call self as a function.*

``` python
test_eq(im.n_px, 30*20)
```

------------------------------------------------------------------------

### Image.shape

``` python

def shape(
    x:Image
):

```

*Call self as a function.*

``` python
test_eq(im.shape, (20,30))
```

------------------------------------------------------------------------

### Image.aspect

``` python

def aspect(
    x:Image
):

```

*Call self as a function.*

``` python
test_eq(im.aspect, 30/20)
```

------------------------------------------------------------------------

### Image.reshape

``` python

def reshape(
    x:Image, h, w, resample:int=0
):

```

*`resize` `x` to `(w,h)`*

------------------------------------------------------------------------

### Image.reshape

``` python

def reshape(
    x:Image, h, w, resample:int=0
):

```

*`resize` `x` to `(w,h)`*

``` python
test_eq(im.reshape(12,10).shape, (12,10))
```

------------------------------------------------------------------------

### Image.to_bytes_format

``` python

def to_bytes_format(
    im:Image, format:str='png'
):

```

*Convert to bytes, default to PNG format*

------------------------------------------------------------------------

### Image.to_bytes_format

``` python

def to_bytes_format(
    im:Image, format:str='png'
):

```

*Convert to bytes, default to PNG format*

------------------------------------------------------------------------

### Image.to_thumb

``` python

def to_thumb(
    h, w:NoneType=None
):

```

*Same as `thumbnail`, but uses a copy*

------------------------------------------------------------------------

### Image.to_thumb

``` python

def to_thumb(
    h, w:NoneType=None
):

```

*Same as `thumbnail`, but uses a copy*

------------------------------------------------------------------------

### Image.resize_max

``` python

def resize_max(
    x:Image, resample:int=0, max_px:NoneType=None, max_h:NoneType=None, max_w:NoneType=None
):

```

*`resize` `x` to `max_px`, or `max_h`, or `max_w`*

``` python
test_eq(im.resize_max(max_px=20*30).shape, (20,30))
test_eq(im.resize_max(max_px=300).n_px, 294)
test_eq(im.resize_max(max_px=500, max_h=10, max_w=20).shape, (10,15))
test_eq(im.resize_max(max_h=14, max_w=15).shape, (10,15))
test_eq(im.resize_max(max_px=300, max_h=10, max_w=25).shape, (10,15))
```

------------------------------------------------------------------------

### Image.resize_max

``` python

def resize_max(
    x:Image, resample:int=0, max_px:NoneType=None, max_h:NoneType=None, max_w:NoneType=None
):

```

*`resize` `x` to `max_px`, or `max_h`, or `max_w`*

## Basic types

This section regroups the basic types used in vision with the transform
that create objects of those types.

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L91"
target="_blank" style="float:right; font-size:smaller">source</a>

### to_image

``` python

def to_image(
    x
):

```

*Convert a tensor or array to a PIL int8 Image*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L99"
target="_blank" style="float:right; font-size:smaller">source</a>

### load_image

``` python

def load_image(
    fn, mode:NoneType=None
):

```

*Open and load a `PIL.Image` and convert to `mode`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L107"
target="_blank" style="float:right; font-size:smaller">source</a>

### image2tensor

``` python

def image2tensor(
    img
):

```

*Transform image to byte tensor in `c*h*w` dim order.*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L114"
target="_blank" style="float:right; font-size:smaller">source</a>

### PILBase

``` python

def PILBase(
    
)->None:

```

*Base class for a Pillow `Image` that can show itself and convert to a
Tensor*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L120"
target="_blank" style="float:right; font-size:smaller">source</a>

### PILBase.create

``` python

def create(
    fn:pathlib.Path | str | torch.Tensor | numpy.ndarray | bytes | PIL.Image.Image, kwargs:VAR_KEYWORD
):

```

*Return an Image from `fn`*

Images passed to
[`PILBase`](https://docs.fast.ai/vision.core.html#pilbase) or inherited
classes’ `create` as a PyTorch `Tensor`, NumPy `ndarray`, or Pillow
`Image` must already be in the correct Pillow image format. For example,
`uint8`, and RGB or BW for
[`PILImage`](https://docs.fast.ai/vision.core.html#pilimage) or
[`PILImageBW`](https://docs.fast.ai/vision.core.html#pilimagebw),
respectively.

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L130"
target="_blank" style="float:right; font-size:smaller">source</a>

### PILBase.show

``` python

def show(
    ctx:NoneType=None, kwargs:VAR_KEYWORD
):

```

*Show image using `merge(self._show_args, kwargs)`*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L137"
target="_blank" style="float:right; font-size:smaller">source</a>

### PILImage

``` python

def PILImage(
    
)->None:

```

*A RGB Pillow `Image` that can show itself and converts to
[`TensorImage`](https://docs.fast.ai/torch_core.html#tensorimage)*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L142"
target="_blank" style="float:right; font-size:smaller">source</a>

### PILImageBW

``` python

def PILImageBW(
    
)->None:

```

*A BW Pillow `Image` that can show itself and converts to
[`TensorImageBW`](https://docs.fast.ai/torch_core.html#tensorimagebw)*

``` python
im = PILImage.create(TEST_IMAGE)
test_eq(type(im), PILImage)
test_eq(im.mode, 'RGB')
test_eq(str(im), 'PILImage mode=RGB size=1200x803')
```

``` python
im2 = PILImage.create(im)
test_eq(type(im2), PILImage)
test_eq(im2.mode, 'RGB')
test_eq(str(im2), 'PILImage mode=RGB size=1200x803')
```

``` python
im.resize((64,64))
```

![](07_vision.core_files/figure-commonmark/cell-30-output-1.png)

``` python
ax = im.show(figsize=(1,1))
```

![](07_vision.core_files/figure-commonmark/cell-31-output-1.png)

``` python
test_fig_exists(ax)
```

``` python
timg = TensorImage(image2tensor(im))
tpil = PILImage.create(timg)
```

``` python
tpil.resize((64,64))
```

![](07_vision.core_files/figure-commonmark/cell-34-output-1.png)

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L147"
target="_blank" style="float:right; font-size:smaller">source</a>

### PILMask

``` python

def PILMask(
    
)->None:

```

*A Pillow `Image` Mask that can show itself and converts to
[`TensorMask`](https://docs.fast.ai/torch_core.html#tensormask)*

``` python
im = PILMask.create(TEST_IMAGE)
test_eq(type(im), PILMask)
test_eq(im.mode, 'L')
test_eq(str(im), 'PILMask mode=L size=1200x803')
```

### Images

``` python
mnist = untar_data(URLs.MNIST_TINY)
fns = get_image_files(mnist)
mnist_fn = TEST_IMAGE_BW
```

``` python
timg = Transform(PILImageBW.create)
mnist_img = timg(mnist_fn)
test_eq(mnist_img.size, (28,28))
assert isinstance(mnist_img, PILImageBW)
mnist_img
```

![](07_vision.core_files/figure-commonmark/cell-38-output-1.png)

### Segmentation masks

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L157"
target="_blank" style="float:right; font-size:smaller">source</a>

### AddMaskCodes

``` python

def AddMaskCodes(
    codes:NoneType=None
):

```

*Add the code metadata to a
[`TensorMask`](https://docs.fast.ai/torch_core.html#tensormask)*

``` python
camvid = untar_data(URLs.CAMVID_TINY)
fns = get_image_files(camvid/'images')
cam_fn = fns[0]
mask_fn = camvid/'labels'/f'{cam_fn.stem}_P{cam_fn.suffix}'
```

``` python
cam_img = PILImage.create(cam_fn)
test_eq(cam_img.size, (128,96))
tmask = Transform(PILMask.create)
mask = tmask(mask_fn)
test_eq(type(mask), PILMask)
test_eq(mask.size, (128,96))
```

``` python
_,axs = plt.subplots(1,3, figsize=(12,3))
cam_img.show(ctx=axs[0], title='image')
mask.show(alpha=1, ctx=axs[1], vmin=1, vmax=30, title='mask')
cam_img.show(ctx=axs[2], title='superimposed')
mask.show(ctx=axs[2], vmin=1, vmax=30);
```

![](07_vision.core_files/figure-commonmark/cell-42-output-1.png)

### Points

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L168"
target="_blank" style="float:right; font-size:smaller">source</a>

### TensorPoint

``` python

def TensorPoint(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Basic type for points in an image*

Points are expected to come as an array/tensor of shape `(n,2)` or as a
list of lists with two elements. Unless you change the defaults in
[`PointScaler`](https://docs.fast.ai/vision.core.html#pointscaler) (see
later on), coordinates should go from 0 to width/height, with the first
one being the column index (so from 0 to width) and the second one being
the row index (so from 0 to height).

<div>

> **Note**
>
> This is different from the usual indexing convention for arrays in
> numpy or in PyTorch, but it’s the way points are expected by
> matplotlib or the internal functions in PyTorch like `F.grid_sample`.

</div>

``` python
pnt_img = TensorImage(mnist_img.resize((28,35)))
pnts = np.array([[0,0], [0,35], [28,0], [28,35], [9, 17]])
tfm = Transform(TensorPoint.create)
tpnts = tfm(pnts)
test_eq(tpnts.shape, [5,2])
test_eq(tpnts.dtype, torch.float32)
```

``` python
ctx = pnt_img.show(figsize=(1,1), cmap='Greys')
tpnts.show(ctx=ctx);
```

![](07_vision.core_files/figure-commonmark/cell-45-output-1.png)

### Bounding boxes

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L189"
target="_blank" style="float:right; font-size:smaller">source</a>

### get_annotations

``` python

def get_annotations(
    fname, prefix:NoneType=None
):

```

*Open a COCO style json in `fname` and returns the lists of filenames
(with maybe `prefix`) and labelled bboxes.*

Test
[`get_annotations`](https://docs.fast.ai/vision.core.html#get_annotations)
on the coco_tiny dataset against both image filenames and bounding box
labels.

``` python
coco = untar_data(URLs.COCO_TINY)
test_images, test_lbl_bbox = get_annotations(coco/'train.json')
annotations = json.load(open(coco/'train.json'))
categories, images, annots = map(lambda x:L(x),annotations.values())

test_eq(test_images, images.attrgot('file_name'))

def bbox_lbls(file_name):
    img = images.filter(lambda img:img['file_name']==file_name)[0]
    bbs = annots.filter(lambda a:a['image_id'] == img['id'])
    i2o = {k['id']:k['name'] for k in categories}
    lbls = [i2o[cat] for cat in bbs.attrgot('category_id')]
    bboxes = [[bb[0],bb[1], bb[0]+bb[2], bb[1]+bb[3]] for bb in bbs.attrgot('bbox')]
    return [bboxes, lbls]

for idx in random.sample(range(len(images)),5): 
    test_eq(test_lbl_bbox[idx], bbox_lbls(test_images[idx]))
```

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L220"
target="_blank" style="float:right; font-size:smaller">source</a>

### TensorBBox

``` python

def TensorBBox(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

```

*Basic type for a tensor of bounding boxes in an image*

Bounding boxes are expected to come as tuple with an array/tensor of
shape `(n,4)` or as a list of lists with four elements and a list of
corresponding labels. Unless you change the defaults in
[`PointScaler`](https://docs.fast.ai/vision.core.html#pointscaler) (see
later on), coordinates for each bounding box should go from 0 to
width/height, with the following convention: x1, y1, x2, y2 where
(x1,y1) is your top-left corner and (x2,y2) is your bottom-right corner.

<div>

> **Note**
>
> We use the same convention as for points with x going from 0 to width
> and y going from 0 to height.

</div>

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L231"
target="_blank" style="float:right; font-size:smaller">source</a>

### LabeledBBox

``` python

def LabeledBBox(
    items:NoneType=None, rest:VAR_POSITIONAL, use_list:bool=False, match:NoneType=None
):

```

*Basic type for a list of bounding boxes in an image*

``` python
coco = untar_data(URLs.COCO_TINY)
images, lbl_bbox = get_annotations(coco/'train.json')
idx=2
coco_fn,bbox = coco/'train'/images[idx],lbl_bbox[idx]
coco_img = timg(coco_fn)
```

``` python
tbbox = LabeledBBox(TensorBBox(bbox[0]), bbox[1])
ctx = coco_img.show(figsize=(3,3), cmap='Greys')
tbbox.show(ctx=ctx);
```

![](07_vision.core_files/figure-commonmark/cell-51-output-1.png)

## Basic Transforms

Unless specifically mentioned, all the following transforms can be used
as single-item transforms (in one of the list in the `tfms` you pass to
a `TfmdDS` or a `Datasource`) or tuple transforms (in the `tuple_tfms`
you pass to a `TfmdDS` or a `Datasource`). The safest way that will work
across applications is to always use them as `tuple_tfms`. For instance,
if you have points or bounding boxes as targets and use
[`Resize`](https://docs.fast.ai/vision.augment.html#resize) as a
single-item transform, when you get to
[`PointScaler`](https://docs.fast.ai/vision.core.html#pointscaler)
(which is a tuple transform) you won’t have the correct size of the
image to properly scale your points.

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/data/transforms.py#LNone"
target="_blank" style="float:right; font-size:smaller">source</a>

### ToTensor

``` python

def ToTensor(
    enc:NoneType=None, dec:NoneType=None, split_idx:NoneType=None, order:NoneType=None
):

```

*Convert item to appropriate tensor class*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/data/transforms.py#LNone"
target="_blank" style="float:right; font-size:smaller">source</a>

### ToTensor

``` python

def ToTensor(
    enc:NoneType=None, dec:NoneType=None, split_idx:NoneType=None, order:NoneType=None
):

```

*Convert item to appropriate tensor class*

Any data augmentation transform that runs on PIL Images must be run
before this transform.

``` python
tfm = ToTensor()
print(tfm)
print(type(mnist_img))
print(type(tfm(mnist_img)))
```

    ToTensor(enc:2,dec:0)
    <class '__main__.PILImageBW'>
    <class 'fastai.torch_core.TensorImageBW'>

``` python
tfm = ToTensor()
test_eq(tfm(mnist_img).shape, (1,28,28))
test_eq(type(tfm(mnist_img)), TensorImageBW)
test_eq(tfm(mask).shape, (96,128))
test_eq(type(tfm(mask)), TensorMask)
```

Let’s confirm we can pipeline this with `PILImage.create`.

``` python
pipe_img = Pipeline([PILImageBW.create, ToTensor()])
img = pipe_img(mnist_fn)
test_eq(type(img), TensorImageBW)
pipe_img.show(img, figsize=(1,1));
```

![](07_vision.core_files/figure-commonmark/cell-56-output-1.png)

``` python
def _cam_lbl(x): return mask_fn
cam_tds = Datasets([cam_fn], [[PILImage.create, ToTensor()], [_cam_lbl, PILMask.create, ToTensor()]])
show_at(cam_tds, 0);
```

![](07_vision.core_files/figure-commonmark/cell-57-output-1.png)

To work with data augmentation, and in particular the `grid_sample`
method, points need to be represented with coordinates going from -1 to
1 (-1 being top or left, 1 bottom or right), which will be done unless
you pass `do_scale=False`. We also need to make sure they are following
our convention of points being x,y coordinates, so pass along
`y_first=True` if you have your data in an y,x format to add a flip.

<div>

> **Warning**
>
> This transform needs to run on the tuple level, before any transform
> that changes the image size.

</div>

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L260"
target="_blank" style="float:right; font-size:smaller">source</a>

### PointScaler

``` python

def PointScaler(
    do_scale:bool=True, y_first:bool=False
):

```

*Scale a tensor representing points*

To work with data augmentation, and in particular the `grid_sample`
method, points need to be represented with coordinates going from -1 to
1 (-1 being top or left, 1 bottom or right), which will be done unless
you pass `do_scale=False`. We also need to make sure they are following
our convention of points being x,y coordinates, so pass along
`y_first=True` if you have your data in an y,x format to add a flip.

<div>

> **Note**
>
> This transform automatically grabs the sizes of the images it sees
> before a <code>TensorPoint</code> object and embeds it in them. For
> this to work, those images need to be before any points in the order
> of your final tuple. If you don’t have such images, you need to embed
> the size of the corresponding image when creating a
> <code>TensorPoint</code> by passing it with `sz=...`.

</div>

``` python
def _pnt_lbl(x): return TensorPoint.create(pnts)
def _pnt_open(fn): return PILImage(PILImage.create(fn).resize((28,35)))
pnt_tds = Datasets([mnist_fn], [_pnt_open, [_pnt_lbl]])
pnt_tdl = TfmdDL(pnt_tds, bs=1, after_item=[PointScaler(), ToTensor()])
```

``` python
test_eq(pnt_tdl.after_item.c, 10)
```

``` python
x,y = pnt_tdl.one_batch()
#Scaling and flipping properly done
#NB: we added a point earlier at (9,17); formula below scales to (-1,1) coords
test_close(y[0], tensor([[-1., -1.], [-1.,  1.], [1.,  -1.], [1., 1.], [9/14-1, 17/17.5-1]]))
a,b = pnt_tdl.decode_batch((x,y))[0]
test_eq(b, tensor(pnts).float())
#Check types
test_eq(type(x), TensorImage)
test_eq(type(y), TensorPoint)
test_eq(type(a), TensorImage)
test_eq(type(b), TensorPoint)
test_eq(b.img_size, (28,35)) #Automatically picked the size of the input
```

``` python
pnt_tdl.show_batch(figsize=(2,2), cmap='Greys');
```

![](07_vision.core_files/figure-commonmark/cell-62-output-1.png)

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L281"
target="_blank" style="float:right; font-size:smaller">source</a>

### BBoxLabeler

``` python

def BBoxLabeler(
    enc:NoneType=None, dec:NoneType=None, split_idx:NoneType=None, order:NoneType=None
):

```

*Delegates (`__call__`,`decode`,`setup`) to
(<code>encodes</code>,<code>decodes</code>,<code>setups</code>) if
`split_idx` matches*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/data/transforms.py#LNone"
target="_blank" style="float:right; font-size:smaller">source</a>

### MultiCategorize

``` python

def MultiCategorize(
    vocab:NoneType=None, add_na:bool=False
):

```

*Reversible transform of multi-category strings to `vocab` id*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L260"
target="_blank" style="float:right; font-size:smaller">source</a>

### PointScaler

``` python

def PointScaler(
    do_scale:bool=True, y_first:bool=False
):

```

*Scale a tensor representing points*

------------------------------------------------------------------------

<a
href="https://github.com/fastai/fastai/blob/main/fastai/vision/core.py#L260"
target="_blank" style="float:right; font-size:smaller">source</a>

### PointScaler

``` python

def PointScaler(
    do_scale:bool=True, y_first:bool=False
):

```

*Scale a tensor representing points*

``` python
def _coco_bb(x):  return TensorBBox.create(bbox[0])
def _coco_lbl(x): return bbox[1]

coco_tds = Datasets([coco_fn], [PILImage.create, [_coco_bb], [_coco_lbl, MultiCategorize(add_na=True)]], n_inp=1)
coco_tdl = TfmdDL(coco_tds, bs=1, after_item=[BBoxLabeler(), PointScaler(), ToTensor()])
```

``` python
Categorize(add_na=True)
```

    Categorize -- {'vocab': None, 'sort': True, 'add_na': True}
    (enc:1,dec:1)

``` python
coco_tds.tfms
```

    (#3) [Pipeline: PILBase.create,Pipeline: _coco_bb,Pipeline: _coco_lbl -> MultiCategorize -- {'vocab': None, 'sort': True, 'add_na': True}]

``` python
x,y,z
```

    (PILImage mode=RGB size=128x128,
     TensorBBox([[-0.9011, -0.4606,  0.1416,  0.6764],
                 [ 0.2000, -0.2405,  1.0000,  0.9102],
                 [ 0.4909, -0.9325,  0.9284, -0.5011]]),
     TensorMultiCategory([1, 1, 1]))

``` python
x,y,z = coco_tdl.one_batch()
test_close(y[0], -1+tensor(bbox[0])/64)
test_eq(z[0], tensor([1,1,1]))
a,b,c = coco_tdl.decode_batch((x,y,z))[0]
test_close(b, tensor(bbox[0]).float())
test_eq(c.bbox, b)
test_eq(c.lbl, bbox[1])

#Check types
test_eq(type(x), TensorImage)
test_eq(type(y), TensorBBox)
test_eq(type(z), TensorMultiCategory)
test_eq(type(a), TensorImage)
test_eq(type(b), TensorBBox)
test_eq(type(c), LabeledBBox)
test_eq(y.img_size, (128,128))
```

``` python
coco_tdl.show_batch();
```

![](07_vision.core_files/figure-commonmark/cell-72-output-1.png)
