This module contains the classes that define datasets handling Image objects and their transformations. As usual, we'll start with a quick overview before we get into the detailed API docs.
To get you started as easily as possible, fastai provides two helper functions to create a
DataBunch object that you can directly use for training a classifier. To demonstrate them, you'll first need to download and untar the file by executing the following cell. This will create a data folder containing an MNIST subset in path.
path = untar_data(URLs.MNIST_SAMPLE); path
tfms = get_transforms(do_flip=False)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=24)
Here the datasets will be automatically created in the structure of Imagenet-style folders. The parameters specified are:
- the transforms to apply to the images in ds_tfms (here with do_flip=False because we don't want to flip numbers),
- the target size of our pictures (here 24).
If you want to have a look at a few images inside a batch, you can use show_batch. The
rows argument is the number of rows and columns to display.
The second way to define the data for a classifier requires a structure like this:
path\
  train\
  test\
  labels.csv
where the labels.csv file defines the label(s) of each image in the training set. This is the format you will need to use when each image can have multiple labels. It also works with single labels:
data = ImageDataBunch.from_csv(path, ds_tfms=tfms, size=28)
An example of multi-label classification can be downloaded with the following cell. It's a sample of the planet dataset.
planet = untar_data(URLs.PLANET_SAMPLE)
If we open the labels file, we see that each image has one or more tags, separated by spaces.
df = pd.read_csv(planet/'labels.csv')
df.head()
|   | image_name | tags |
|---|------------|------|
| 1 | train_9516 | clear cultivation primary water |
| 4 | train_5302 | haze primary road |
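Passing sep=' ' tells from_csv to split each entry of the tags column into individual labels; conceptually it is just a string split, as this plain-Python sketch shows:

```python
# Splitting a space-separated tag string into individual labels,
# which is what sep=' ' asks from_csv to do for each row.
tags = "clear cultivation primary water"
labels = tags.split(" ")
print(labels)  # ['clear', 'cultivation', 'primary', 'water']
```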
data = ImageDataBunch.from_csv(planet, folder='train', size=128, suffix='.jpg', sep=' ', ds_tfms=get_transforms(flip_vert=True, max_lighting=0.1, max_zoom=1.05, max_warp=0.))
The show_batch method will then show all the labels that correspond to each image.
data.show_batch(rows=3, figsize=(10,8), ds_type=DatasetType.Valid)
An ImageDataBunch is a DataBunch suitable for computer vision.
This is the same initialization as a regular
DataBunch, so you probably don't want to use this directly, but one of the factory methods instead.
If you want to quickly get an
ImageDataBunch and train a model, you should process your data to have it in one of the formats the following functions handle.
from_folder(path:PathOrStr, train:PathOrStr=`'train'`, valid:PathOrStr=`'valid'`, valid_pct=`None`, classes:Collection=`None`, **kwargs) → ImageDataBunch
Create an ImageDataBunch from an imagenet-style dataset in path with train, valid and
test subfolders (or provide valid_pct to split the training set).
"Imagenet-style" datasets look something like this (note that the test folder is optional):
path\
  train\
    clas1\
    clas2\
    ...
  valid\
    clas1\
    clas2\
    ...
  test\
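To make the layout concrete, here is a small stand-alone sketch (plain Python, no fastai needed) that creates such a tree in a temporary directory and recovers the class names from the train subfolders, which is essentially what from_folder does:

```python
import tempfile
from pathlib import Path

# Build a minimal Imagenet-style tree: root/{train,valid}/{3,7}/
root = Path(tempfile.mkdtemp())
for split in ("train", "valid"):
    for cls in ("3", "7"):
        (root / split / cls).mkdir(parents=True)

# Class names are simply the subfolder names under train/.
classes = sorted(d.name for d in (root / "train").iterdir() if d.is_dir())
print(classes)  # ['3', '7']
```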
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=24)
from_csv(path:PathOrStr, folder:PathOrStr=`'.'`, sep=`None`, csv_labels:PathOrStr=`'labels.csv'`, valid_pct:float=`0.2`, test:PathOrStr=`None`, suffix:str=`None`, **kwargs) → ImageDataBunch
Create an ImageDataBunch from path by splitting the images in
folder between a training and validation set, using the labels in the file
csv_labels. Use
valid_pct to indicate the percentage of the total images to use for the validation set. An optional
test folder contains unlabelled data, and
suffix is an optional suffix to add to the filenames in
csv_labels (such as '.jpg').
data = ImageDataBunch.from_csv(path, ds_tfms=tfms, size=24);
from_df(path:PathOrStr, df:DataFrame, folder:PathOrStr=`'.'`, sep=`None`, valid_pct:float=`0.2`, **kwargs) → ImageDataBunch

Create an ImageDataBunch from a DataFrame df.
df = pd.read_csv(path/'labels.csv', header='infer')
df.head()
data = ImageDataBunch.from_df(path, df, ds_tfms=tfms, size=24)
Different datasets are labeled in many different ways. The following methods can help extract the labels from the dataset in a wide variety of situations. The way they are built in fastai is layered: some methods do a lot for you but apply only in specific circumstances, while others do less for you but give you more flexibility.
In this case the hierarchy, from most specific to most flexible, is: ImageDataBunch.from_name_re, ImageDataBunch.from_name_func, ImageDataBunch.from_lists.
Create an ImageDataBunch from a list of file names
fnames, applying a regular expression (containing one re group) to the file names to get the labels, and putting aside
valid_pct of the images for the validation set. As with
ImageDataBunch.from_csv, an optional
test folder contains unlabelled data.
Our previously created dataframe contains the labels in the filenames, so we can leverage it to test this new method.
ImageDataBunch.from_name_re needs the exact path of each file, so we will append the data path to each filename before creating our ImageDataBunch.
fn_paths = [path/name for name in df['name']]; fn_paths[:2]
pat = r"/(\d)/\d+\.png$"
data = ImageDataBunch.from_name_re(path, fn_paths, pat=pat, ds_tfms=tfms, size=24)
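On a single path, the pattern works like this (a plain re sketch; the file name below is made up for illustration):

```python
import re

# The single group (\d) captures the parent folder name, which is
# the digit label in the MNIST sample layout.
pat = r"/(\d)/\d+\.png$"
fname = "/home/user/.fastai/data/mnist_sample/train/3/7463.png"
label = re.search(pat, fname).group(1)
print(label)  # 3
```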
This works in the same way as
ImageDataBunch.from_name_re, but instead of a regular expression it expects a function that will determine how to extract the labels from the filenames. (Note that
from_name_re uses this function in its implementation.)
To test it we could build a function with our previous regex. Let's try another, similar approach to show that the labels can be obtained in a different way.
def get_labels(file_path): return '3' if '/3/' in str(file_path) else '7'
data = ImageDataBunch.from_name_func(path, fn_paths, label_func=get_labels, ds_tfms=tfms, size=24)
data.classes
This is the most flexible factory function; pass in a list of
labels that correspond to each of the filenames in fnames.
To show an example, we have to build the labels list outside our
ImageDataBunch object and give it as an argument when we call
from_lists. Let's use our previously created function to create our labels list.
labels_ls = list(map(get_labels, fn_paths))
data = ImageDataBunch.from_lists(path, fn_paths, labels=labels_ls, ds_tfms=tfms, size=24)
data.classes
Create an ImageDataBunch from datasets, with a
collate_fn and a potential test folder.
ds_tfms is a tuple of two lists of transforms to be applied to the training and the validation (plus, optionally, test) set;
tfms are the transforms to apply to the DataLoader;
size is the target image size, and
kwargs are passed to the transforms for data augmentation.
In the next two methods we will use a new dataset, CIFAR. This is because the second method will get the statistics for our dataset and we want to be able to show different statistics per channel. If we were to use MNIST, these statistics would be the same for every channel. White pixels are [255,255,255] and black pixels are [0,0,0] (or in normalized form [1,1,1] and [0,0,0]) so there is no variance between channels.
path = untar_data(URLs.CIFAR); path
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, valid='test', size=24)
def channel_view(x:Tensor)->Tensor:
    "Make channel the first axis of `x` and flatten remaining axes"
    return x.transpose(0,1).contiguous().view(x.shape[1],-1)
This function takes a tensor and flattens all dimensions except the channels, which it keeps as the first axis. It is used to feed
ImageDataBunch.batch_stats so that it can get the pixel statistics of a whole batch.
Let's take as an example the dimensions of our batches: 128, 3, 24, 24.
t = torch.Tensor(128, 3, 24, 24)
t.size()
torch.Size([128, 3, 24, 24])
tensor = channel_view(t)
data.batch_stats()
[tensor([0.4928, 0.4767, 0.4671]), tensor([0.2677, 0.2631, 0.2630])]
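The same kind of per-channel statistics can be reproduced with plain NumPy (an illustrative sketch, not the library code): move the channel axis to the front, flatten everything else, and take the mean and std along the flattened axis.

```python
import numpy as np

# A fake batch shaped like ours: (batch, channels, height, width).
batch = np.random.rand(128, 3, 24, 24)

# Channel first, everything else flattened: (3, 128*24*24).
flat = batch.transpose(1, 0, 2, 3).reshape(3, -1)
mean, std = flat.mean(axis=1), flat.std(axis=1)
print(flat.shape)  # (3, 73728)
```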
In the fastai library we have imagenet_stats, cifar_stats and
mnist_stats, so we can add normalization easily with any of these datasets. Let's see an example with our dataset of choice: CIFAR.
ImageDataBunch;

Train: LabelList
y: CategoryList (50000 items)
[Category truck, Category truck, Category truck, Category truck, Category truck]...
Path: /home/ubuntu/.fastai/data/cifar10
x: ImageItemList (50000 items)
[Image (3, 32, 32), Image (3, 32, 32), Image (3, 32, 32), Image (3, 32, 32), Image (3, 32, 32)]...
Path: /home/ubuntu/.fastai/data/cifar10;

Valid: LabelList
y: CategoryList (10000 items)
[Category truck, Category truck, Category truck, Category truck, Category truck]...
Path: /home/ubuntu/.fastai/data/cifar10
x: ImageItemList (10000 items)
[Image (3, 32, 32), Image (3, 32, 32), Image (3, 32, 32), Image (3, 32, 32), Image (3, 32, 32)]...
Path: /home/ubuntu/.fastai/data/cifar10;

Test: None
[tensor([ 0.0074, -0.0219, 0.0769]), tensor([1.0836, 1.0829, 1.0078])]
You may also want to normalize your data, which can be done by using the following functions.
Create normalize/denormalize functions using mean and
std; you can also specify do_y and the device.
On MNIST the mean and std are 0.1307 and 0.3081 respectively (looked up on Google). If you're using a pretrained model, you'll need to use the normalization that was used to train the model. The imagenet norm and denorm functions are stored as constants inside the library, named imagenet_norm and
imagenet_denorm. If you're training a model on CIFAR-10, you can also use cifar_norm and cifar_denorm.
You may sometimes see warnings about clipping input data when plotting normalized data. That's because even though the data is automatically denormalized when plotting, floating point errors may push some values slightly out of the correct range. You can safely ignore these warnings in this case.
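Under the hood, normalization is just a per-channel affine transform, x_norm = (x - mean) / std, and denormalization inverts it. A stand-alone sketch using the MNIST statistics quoted above:

```python
import numpy as np

mean, std = 0.1307, 0.3081      # MNIST statistics from above
x = np.array([0.0, 0.5, 1.0])   # example pixel values in [0, 1]

x_norm = (x - mean) / std       # normalize
x_back = x_norm * std + mean    # denormalize recovers the input
print(np.allclose(x_back, x))   # True
```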
data = ImageDataBunch.from_folder(untar_data(URLs.MNIST_SAMPLE), ds_tfms=tfms, size=24)
data.normalize()
data.show_batch(rows=3, figsize=(6,6))
To use this dataset and collate samples into batches, you'll need the following function:

A function that collects samples of labelled bboxes and adds padding with pad_idx.
Get the filenames in
df, prepending
path/folder to them and appending
suffix at the end.
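That filename assembly can be sketched in plain Python with pathlib (the column values here are made up for illustration):

```python
from pathlib import Path

# Prepend path/folder to each name from the DataFrame column and append suffix.
path, folder, suffix = Path("data"), "train", ".jpg"
names = ["img_001", "img_002"]  # hypothetical values from the csv column
fnames = [path / folder / (name + suffix) for name in names]
print(fnames[0])                # data/train/img_001.jpg (on POSIX)
```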
This module also contains a few helper functions to allow you to build your own dataset for image classification.
Download the images listed in the text file
urls to path
dest, at most
max_pics of them.
Check that the images in
path aren't broken, optionally resize them, and copy the result to dest.
It will check that every image in this folder can be opened and has
n_channels channels. If
n_channels is 3, it will try to convert each image to RGB. If
delete=True, images that fail this check are removed. If
resume, images that already exist in dest are skipped. If
max_size is specified, each image is resized (keeping its aspect ratio) so that both sides are at most max_size, using the interpolation method
interp. The result is stored in dest,
ext forces an extension type, and
kwargs are passed to PIL.Image.save. Use max_workers CPUs.
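The max_size rule (shrink, preserving the aspect ratio, so that both sides fit) can be sketched as follows; fit_to_max is a hypothetical helper for illustration, not a fastai function:

```python
def fit_to_max(width, height, max_size):
    """Scale (width, height) down, keeping the aspect ratio,
    so that both sides are at most max_size."""
    ratio = min(max_size / width, max_size / height, 1.0)
    return int(width * ratio), int(height * ratio)

print(fit_to_max(1600, 1200, 800))  # (800, 600)
print(fit_to_max(100, 50, 800))     # (100, 50) - already small enough
```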