Basic dataset for computer vision and helper function to get a DataBunch

Computer vision data

This module contains the classes that define datasets handling Image objects and their transformations. As usual, we'll start with a quick overview before we get into the detailed API docs.

Quickly get your data ready for training

To get you started as easily as possible, fastai provides two helper functions to create a DataBunch object that you can directly use for training a classifier. To demonstrate them you'll first need to download and untar the file by executing the following cell. This will create a data folder containing an MNIST subset in data/mnist_sample.

path = untar_data(URLs.MNIST_SAMPLE); path
PosixPath('/home/ubuntu/.fastai/data/mnist_sample')

There are a number of ways to create an ImageDataBunch. One common approach is to use Imagenet-style folders (see further down the page for details) with ImageDataBunch.from_folder:

tfms = get_transforms(do_flip=False)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=24)

Here the datasets are automatically created from the Imagenet-style folder structure. The parameters specified are:

  • the transforms to apply to the images in ds_tfms (here with do_flip=False because we don't want to flip numbers),
  • the target size of our pictures (here 24).

As with all DataBunch usage, a train_dl and a valid_dl are created that are of the type PyTorch DataLoader.

If you want to have a look at a few images inside a batch, you can use ImageDataBunch.show_batch. The rows argument sets the number of rows (and columns) of the grid to display.

data.show_batch(rows=3, figsize=(5,5))

The second way to define the data for a classifier requires a structure like this:

path\
  train\
  test\
  labels.csv

where the labels.csv file defines the label(s) of each image in the training set. This is the format you will need to use when each image can have multiple labels. It also works with single labels:

pd.read_csv(path/'labels.csv').head()
name label
0 train/3/7463.png 0
1 train/3/21102.png 0
2 train/3/31559.png 0
3 train/3/46882.png 0
4 train/3/26209.png 0

You can then use ImageDataBunch.from_csv:

data = ImageDataBunch.from_csv(path, ds_tfms=tfms, size=28)
data.show_batch(rows=3, figsize=(5,5))

An example of multi-label classification can be downloaded with the following cell. It's a sample of the planet dataset.

planet = untar_data(URLs.PLANET_SAMPLE)

If we open the labels file, we see that each image has one or more tags, separated by a space.

df = pd.read_csv(planet/'labels.csv')
df.head()
image_name tags
0 train_21983 partly_cloudy primary
1 train_9516 clear cultivation primary water
2 train_12664 haze primary
3 train_36960 clear primary
4 train_5302 haze primary road
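Internally, passing sep=' ' tells fastai to split the tags column into one list of labels per image. A minimal sketch of that split using only the standard csv module (the file contents below are made up to mirror the table above):

```python
import csv
import io

# A made-up snippet mirroring the planet labels.csv layout.
raw = """image_name,tags
train_21983,partly_cloudy primary
train_9516,clear cultivation primary water
"""

# Split the space-separated tags column into a list of labels per image.
labels = {row["image_name"]: row["tags"].split(" ")
          for row in csv.DictReader(io.StringIO(raw))}
```

Each image then ends up with a list of classes rather than a single one, which is what makes this a multi-label problem.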
data = ImageDataBunch.from_csv(planet, folder='train', size=128, suffix='.jpg', sep=' ',
    ds_tfms=get_transforms(flip_vert=True, max_lighting=0.1, max_zoom=1.05, max_warp=0.))

The show_batch method will then show all the labels that correspond to each image.

data.show_batch(rows=3, figsize=(10,8), ds_type=DatasetType.Valid)

You can find more ways to build an ImageDataBunch without the factory methods in data_block.

class ImageDataBunch[source]

ImageDataBunch(train_dl:DataLoader, valid_dl:DataLoader, test_dl:Optional[DataLoader]=None, device:device=None, tfms:Optional[Collection[Callable]]=None, path:PathOrStr='.', collate_fn:Callable='data_collate') :: DataBunch

Factory methods

Normally we'll use one of the convenience wrappers below. However, these wrappers all accept kwargs that are passed to the general DataBunch.create method (like bs, num_workers...).

If you quickly want to get an ImageDataBunch and train a model, you should process your data to have it in one of the formats the following functions handle.

from_folder[source]

from_folder(path:PathOrStr, train:PathOrStr='train', valid:PathOrStr='valid', valid_pct=None, classes:Collection=None, kwargs:Any) → ImageDataBunch

Create from imagenet style dataset in path with train,valid,test subfolders (or provide valid_pct).

"Imagenet-style" datasets look something like this (note that the test folder is optional):

path\
  train\
    clas1\
    clas2\
    ...
  valid\
    clas1\
    clas2\
    ...
  test\

For example:

data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=24)

Note that this (and all factory methods in this section) pass any kwargs to ImageDataBunch.create.

from_csv[source]

from_csv(path:PathOrStr, folder:PathOrStr='.', sep=None, csv_labels:PathOrStr='labels.csv', valid_pct:float=0.2, fn_col:int=0, label_col:int=1, suffix:str='', header:Union[int, str, NoneType]='infer', kwargs:Any) → ImageDataBunch

Create from a csv file.

Create ImageDataBunch from path by splitting the data in folder and labelled in a file csv_labels between a training and validation set. Use valid_pct to indicate the percentage of the total images for the validation set. An optional test folder contains unlabelled data and suffix contains an optional suffix to add to the filenames in csv_labels (such as '.jpg'). For example:
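The valid_pct split can be pictured as a random shuffle followed by a slice. A rough sketch of the idea (not fastai's exact implementation; the seed handling here is just for reproducibility of the sketch):

```python
import random

def random_split(fnames, valid_pct=0.2, seed=42):
    "Shuffle `fnames` and set aside `valid_pct` of them for validation."
    fnames = list(fnames)
    random.Random(seed).shuffle(fnames)
    cut = int(valid_pct * len(fnames))
    return fnames[cut:], fnames[:cut]  # (train, valid)

train, valid = random_split([f"img_{i}.png" for i in range(10)], valid_pct=0.2)
```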

data = ImageDataBunch.from_csv(path, ds_tfms=tfms, size=24);

from_df[source]

from_df(path:PathOrStr, df:DataFrame, folder:PathOrStr='.', sep=None, valid_pct:float=0.2, fn_col:Union[int, Collection[int], str, StrList]=0, label_col:Union[int, Collection[int], str, StrList]=1, suffix:str='', kwargs:Any) → ImageDataBunch

Create from a DataFrame.

Same as ImageDataBunch.from_csv, but passing in a DataFrame instead of a csv file. E.g.:

df = pd.read_csv(path/'labels.csv', header='infer')
df.head()
name label
0 train/3/7463.png 0
1 train/3/21102.png 0
2 train/3/31559.png 0
3 train/3/46882.png 0
4 train/3/26209.png 0
data = ImageDataBunch.from_df(path, df, ds_tfms=tfms, size=24)

Different datasets are labeled in many different ways. The following methods can help extract the labels from the dataset in a wide variety of situations. The way they are built in fastai is constructive: there are methods which do a lot for you but apply in specific circumstances and there are methods which do less for you but give you more flexibility.

In this case the hierarchy is:

  1. ImageDataBunch.from_name_re: Gets the labels from the filenames using a regular expression
  2. ImageDataBunch.from_name_func: Gets the labels from the filenames using any function
  3. ImageDataBunch.from_lists: Labels need to be provided as an input in a list

from_name_re[source]

from_name_re(path:PathOrStr, fnames:FilePathList, pat:str, valid_pct:float=0.2, kwargs)

Creates an ImageDataBunch from fnames, applying a regular expression (containing one re group) to the file names to get the labels, and setting aside valid_pct of the data for validation. As with ImageDataBunch.from_csv, an optional test folder may contain unlabelled data.

Our previously created dataframe contains the labels in the filenames so we can leverage it to test this new method. ImageDataBunch.from_name_re needs the exact path of each file so we will append the data path to each filename before creating our ImageDataBunch object.

fn_paths = [path/name for name in df['name']]; fn_paths[:2]
[PosixPath('/home/ubuntu/.fastai/data/mnist_sample/train/3/7463.png'),
 PosixPath('/home/ubuntu/.fastai/data/mnist_sample/train/3/21102.png')]
pat = r"/(\d)/\d+\.png$"
data = ImageDataBunch.from_name_re(path, fn_paths, pat=pat, ds_tfms=tfms, size=24)
data.classes
['3', '7']
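You can sanity-check a labelling regex with Python's re module before handing it to from_name_re; the pattern must contain exactly one group, and whatever that group captures becomes the label:

```python
import re

# Same pattern as above: the group captures the parent folder name, i.e. the class.
pat = r"/(\d)/\d+\.png$"
fname = "/home/ubuntu/.fastai/data/mnist_sample/train/3/7463.png"

label = re.search(pat, fname).group(1)
```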

from_name_func[source]

from_name_func(path:PathOrStr, fnames:FilePathList, label_func:Callable, valid_pct:float=0.2, kwargs)

Works in the same way as ImageDataBunch.from_name_re, but instead of a regular expression it expects a function that will determine how to extract the labels from the filenames. (Note that from_name_re uses this function in its implementation).

To test it we could build a function with our previous regex. Let's try another, similar approach to show that the labels can be obtained in a different way.

def get_labels(file_path): return '3' if '/3/' in str(file_path) else '7'
data = ImageDataBunch.from_name_func(path, fn_paths, label_func=get_labels, ds_tfms=tfms, size=24)
data.classes
['3', '7']

from_lists[source]

from_lists(path:PathOrStr, fnames:FilePathList, labels:StrList, valid_pct:float=0.2, kwargs)

The most flexible factory function; pass in a list of labels that correspond to each of the filenames in fnames.

To show an example we have to build the labels list outside our ImageDataBunch object and give it as an argument when we call from_lists. Let's use our previously created function to create our labels list.

labels_ls = list(map(get_labels, fn_paths))
data = ImageDataBunch.from_lists(path, fn_paths, labels=labels_ls, ds_tfms=tfms, size=24)
data.classes
['3', '7']

single_from_classes[source]

single_from_classes(path:PathOrStr, classes:StrList, tfms:Union[Callable, Collection[Callable]]=None, kwargs)

Create an empty ImageDataBunch in path with classes. Typically used for inference.

Methods

show_batch[source]

show_batch(rows:int=None, ds_type:DatasetType=<DatasetType.Train: 1>, kwargs)

Show a batch of data in ds_type on a few rows.

Create a rows by rows grid of images from dataset ds_type for a figsize figure. This function works for all types of computer vision data (see data_block for more examples).

Once you have your ImageDataBunch, you can have a quick look at your data by using this:

data.show_batch(rows=3, figsize=(6,6))

labels_to_csv[source]

labels_to_csv(dest:str)

Save file names and labels in data as CSV to file name dest.


In the next two methods we will use a new dataset, CIFAR. This is because the second method gets per-channel statistics for our dataset, and we want those statistics to differ between channels. If we were to use MNIST, they would be the same for every channel: white pixels are [255,255,255] and black pixels are [0,0,0] (or, in normalized form, [1,1,1] and [0,0,0]), so there is no variance between channels.

path = untar_data(URLs.CIFAR); path
PosixPath('/home/ubuntu/.fastai/data/cifar10')

channel_view[source]

channel_view(x:Tensor) → Tensor

Make channel the first axis of x and flatten remaining axes

data = ImageDataBunch.from_folder(path, ds_tfms=tfms, valid='test', size=24)
def channel_view(x:Tensor)->Tensor:
    "Make channel the first axis of `x` and flatten remaining axes"
    return x.transpose(0,1).contiguous().view(x.shape[1],-1) 

This function takes a tensor and flattens all dimensions except the channels, which it keeps as the first axis. This function is used to feed ImageDataBunch.batch_stats so that it can get the pixel statistics of a whole batch.

Let's take as an example the dimensions of our CIFAR batches: 128, 3, 24, 24.

t = torch.Tensor(128, 3, 24, 24)
t.size()
torch.Size([128, 3, 24, 24])
tensor = channel_view(t)
tensor.size()
torch.Size([3, 73728])
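The same reshaping can be sketched in NumPy to check the shapes: for a (128, 3, 24, 24) batch we expect a (3, 73728) result, since 128 * 24 * 24 = 73728, and from that flattened view a per-channel statistic is a single reduction along axis 1.

```python
import numpy as np

def channel_view_np(x):
    "Move the channel axis first and flatten everything else, as channel_view does."
    return np.swapaxes(x, 0, 1).reshape(x.shape[1], -1)

batch = np.random.rand(128, 3, 24, 24)
flat = channel_view_np(batch)   # one row of 73728 pixel values per channel
means = flat.mean(axis=1)       # one mean per channel
```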

batch_stats[source]

batch_stats(funcs:Collection[Callable]=None) → Tensor

Grab a batch of data and call reduction function func per channel

Gets the statistics of each channel of a batch of data. If no functions are specified, the defaults are the mean and standard deviation.

data.batch_stats()
[tensor([0.5115, 0.5096, 0.4376]), tensor([0.2202, 0.2218, 0.2428])]

normalize[source]

normalize(stats:Collection[Tensor]=None)

Add normalize transform using stats (defaults to DataBunch.batch_stats)

Adds the normalize transform to the set of transforms associated with the data. The fastai library provides imagenet_stats, cifar_stats and mnist_stats, so we can add normalization easily for any of these datasets. Let's see an example with our current dataset, CIFAR.

data.normalize(cifar_stats)
ImageDataBunch;
Train: LabelList
y: CategoryList (50000 items)
['bird' 'bird' 'bird' 'bird' ... 'dog' 'dog' 'dog' 'dog']
Path: .
x: ImageItemList (50000 items)
[PosixPath('/home/ubuntu/.fastai/data/cifar10/train/bird/31882_bird.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/bird/5417_bird.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/bird/13477_bird.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/bird/27823_bird.png') ...
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/dog/37310_dog.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/dog/43790_dog.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/dog/5503_dog.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/train/dog/2260_dog.png')]
Path: /home/ubuntu/.fastai/data/cifar10;
Valid: LabelList
y: CategoryList (10000 items)
['bird' 'bird' 'bird' 'bird' ... 'dog' 'dog' 'dog' 'dog']
Path: .
x: ImageItemList (10000 items)
[PosixPath('/home/ubuntu/.fastai/data/cifar10/test/bird/8371_bird.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/bird/113_bird.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/bird/5665_bird.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/bird/4858_bird.png') ...
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/dog/5222_dog.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/dog/6674_dog.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/dog/9551_dog.png')
 PosixPath('/home/ubuntu/.fastai/data/cifar10/test/dog/2650_dog.png')]
Path: /home/ubuntu/.fastai/data/cifar10;
Test: None
data.batch_stats()
[tensor([ 0.0832,  0.1136, -0.0359]), tensor([0.8914, 0.9128, 0.9304])]

Data normalization

You may also want to normalize your data, which can be done by using the following functions.

normalize[source]

normalize(x:Tensor, mean:FloatTensor, std:FloatTensor) → Tensor

Normalize x with mean and std.

denormalize[source]

denormalize(x:Tensor, mean:FloatTensor, std:FloatTensor) → Tensor

Denormalize x with mean and std.

normalize_funcs[source]

normalize_funcs(mean:FloatTensor, std:FloatTensor) → Tuple[Callable, Callable]

Create normalize and denormalize functions using mean and std. device determines which device they are stored on, and do_y whether the target should also be normalized.
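The normalize/denormalize pair is just (x - mean) / std and its inverse, applied per channel. A sketch in NumPy (the stats below are commonly quoted CIFAR-10 channel statistics, used here purely for illustration), broadcasting mean and std over the spatial axes:

```python
import numpy as np

def normalize(x, mean, std):
    "Per-channel (x - mean) / std, broadcasting over height and width."
    return (x - mean[:, None, None]) / std[:, None, None]

def denormalize(x, mean, std):
    "Inverse of `normalize`."
    return x * std[:, None, None] + mean[:, None, None]

mean = np.array([0.491, 0.482, 0.447])   # illustrative CIFAR-10 channel means
std  = np.array([0.247, 0.243, 0.261])   # illustrative CIFAR-10 channel stds
img  = np.random.rand(3, 24, 24)
roundtrip = denormalize(normalize(img, mean, std), mean, std)
```

Applying one function after the other recovers the original values (up to floating point error), which is exactly why plots of normalized data can be denormalized for display.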

On MNIST the mean and std are 0.1307 and 0.3081 respectively. If you're using a pretrained model, you'll need to use the normalization that was used to train that model. The imagenet norm and denorm functions are stored as constants inside the library, named imagenet_norm and imagenet_denorm. If you're training a model on CIFAR-10, you can also use cifar_norm and cifar_denorm.

You may sometimes see warnings about clipping input data when plotting normalized data. That's because even though the data is automatically denormalized when plotting, floating point errors can push some values slightly out of the correct range. You can safely ignore these warnings in this case.

data = ImageDataBunch.from_folder(untar_data(URLs.MNIST_SAMPLE),
                                  ds_tfms=tfms, size=24)
data.normalize()
data.show_batch(rows=3, figsize=(6,6))
show_doc(get_annotations)

get_annotations[source]

get_annotations(fname, prefix=None)

Open a COCO-style json file fname and return the lists of filenames (optionally with prefix prepended) and labelled bboxes.
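A simplified sketch of what parsing a COCO-style json involves, using only the standard json module (the structure below is a minimal made-up example, not the full COCO schema):

```python
import json

# Minimal made-up COCO-style annotations.
coco = json.loads("""{
  "images": [{"id": 1, "file_name": "img_001.jpg"}],
  "annotations": [{"image_id": 1, "bbox": [10, 20, 30, 40], "category_id": 2}],
  "categories": [{"id": 2, "name": "dog"}]
}""")

id2fname = {im["id"]: im["file_name"] for im in coco["images"]}
id2cat   = {c["id"]: c["name"] for c in coco["categories"]}
# Collect (filename, bbox, label) triples from the annotations.
boxes = [(id2fname[a["image_id"]], a["bbox"], id2cat[a["category_id"]])
         for a in coco["annotations"]]
```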

To use this dataset and collate samples into batches, you'll need the following function:

show_doc(bb_pad_collate)

bb_pad_collate[source]

bb_pad_collate(samples:BatchSamples, pad_idx:int=0) → Tuple[FloatTensor, Tuple[LongTensor, LongTensor]]

Function that collects samples of labelled bboxes and adds padding with pad_idx.
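Padding is needed because different images in a batch can have different numbers of boxes, while a batch tensor needs every row to have the same length. A rough sketch of the idea on the label side (not fastai's exact implementation, which also pads the box coordinates):

```python
def pad_labels(label_lists, pad_idx=0):
    "Pad each list of box labels to the length of the longest one with `pad_idx`."
    max_len = max(len(ls) for ls in label_lists)
    return [ls + [pad_idx] * (max_len - len(ls)) for ls in label_lists]

# Three images with 2, 1 and 3 boxes respectively.
padded = pad_labels([[3, 7], [5], [1, 2, 4]])
```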

Finally, to apply transformations to the Image objects in a Dataset, we use the classes below.

ItemList specific to vision

The vision application adds a few subclasses of ItemList specific to images.

class ImageItemList[source]

ImageItemList(items:Iterator, create_func:Callable=None, path:PathOrStr='.', label_cls:Callable=None, xtra:Any=None, processor:PreProcessor=None) :: ItemList

Create an ItemList in path from filenames in items. create_func will default to open_image. label_cls can be specified for the labels, xtra contains any extra information (usually in the form of a dataframe) and processor is applied to the ItemList after splitting and labelling.

from_folder[source]

from_folder(path:PathOrStr='.', create_func:Callable='open_image', extensions:StrList={'.png', '.xpm', '.jpeg', '.bmp', '.jpg', '.svg', '.ras', '.pgm', '.pnm', '.xbm', '.tiff', '.xwd', '.gif', '.ief', '.ico', '.pbm', '.tif', '.rgb', '.ppm', '.jpe'}, kwargs) → ItemList

Get the list of files in path that have an image suffix. recurse determines if we search subfolders.

from_df[source]

from_df(df:DataFrame, path:PathOrStr, create_func:Callable='open_image', col:Union[int, Collection[int], str, StrList]=0, folder:PathOrStr='.', suffix:str='') → ItemList

Get the filenames in col of df, prepending path/folder and appending suffix to each. create_func is used to open the images.

class ObjectCategoryList[source]

ObjectCategoryList(items:Iterator, classes:Collection=None, kwargs) :: CategoryList

class ObjectItemList[source]

ObjectItemList(items:Iterator, create_func:Callable=None, path:PathOrStr='.', label_cls:Callable=None, xtra:Any=None, processor:PreProcessor=None) :: ImageItemList

class SegmentationItemList[source]

SegmentationItemList(items:Iterator, create_func:Callable=None, path:PathOrStr='.', label_cls:Callable=None, xtra:Any=None, processor:PreProcessor=None) :: ImageItemList

class SegmentationLabelList[source]

SegmentationLabelList(items:Iterator, classes:Collection=None, kwargs) :: ImageItemList

class PointsItemList[source]

PointsItemList(items:Iterator, create_func:Callable=None, path:PathOrStr='.', label_cls:Callable=None, xtra:Any=None, processor:PreProcessor=None) :: ItemList

Building your own dataset

This module also contains a few helper functions to allow you to build your own dataset for image classification.

download_images[source]

download_images(urls:StrList, dest:PathOrStr, max_pics:int=1000, max_workers:int=8, timeout=4)

Download images listed in a text file urls to path dest, downloading at most max_pics of them.

verify_images[source]

verify_images(path:PathOrStr, delete:bool=True, max_workers:int=4, max_size:Union[int, Tuple[int, int]]=None, dest:PathOrStr='.', n_channels:int=3, interp=2, ext:str=None, img_format:str=None, resume:bool=None, kwargs)

Check that each image in path exists, can be opened and has n_channels. If n_channels is 3, it will try to convert the image to RGB. If delete, remove any image that fails these checks. If resume, skip images already present in dest. If max_size is specified, resize the image, preserving its aspect ratio, so that both sides are at most max_size, using interp. The result is stored in dest; ext forces an extension type, and img_format and kwargs are passed to PIL.Image.save. Uses max_workers CPUs.
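For the int form of max_size, the aspect-preserving target dimensions can be sketched as a single scale ratio applied to both sides (a hedged illustration of the idea, not the library's exact code):

```python
def resize_dims(w, h, max_size):
    "Scale (w, h) by one ratio so both sides fit within `max_size`; never upscale."
    ratio = max(w / max_size, h / max_size, 1)
    return int(w / ratio), int(h / ratio)

dims = resize_dims(1600, 1200, 800)   # both sides halved, ratio preserved
```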