From bb49075107b5a5d804f552e5b4fc34a4bfe4393c Mon Sep 17 00:00:00 2001 From: Ralph Kube Date: Thu, 2 Feb 2023 09:58:49 -0500 Subject: [PATCH] Documenting how to implement a DataLoader for image augmentation --- docs/src/index.md | 144 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/docs/src/index.md b/docs/src/index.md index 2129a00..fe70020 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -58,6 +58,150 @@ copied. In fact, while `x` and `y` are materialized arrays, all the rest are data views. +A common task when training Convolutional Neural Networks for image representations +is to apply random augmentations to the training data. These augmentations are often +operations such as flipping the image or applying Gaussian Blur. This example shows +how to lazily apply such transformations at the time where the batch is loaded using +[`Augmentor.jl`](https://evizero.github.io/Augmentor.jl/stable/). +When training the model, one commonly iterates over mini-batches of data and applies the +augmentations batch-wise. Here we show how `MLUtils.jl` allows to implement this using a +custom dataset. + +First, we import the packages we are using. Besides `MLUtils`, we are using `Random` for random +number generation, `Augmentor` for the augmentations, and `ImageCore` to convert numerical arrays +into images. For more details on using images in the Julia eco-system see the +[JuliaImages documentation](https://juliaimages.org/stable/tutorials/quickstart/). + +```julia +using MLUtils +using Random +using Augmentor +using ImageCore +``` + +The first step is to define a custom [type](https://docs.julialang.org/en/v1/manual/types/) that defines our dataset: + +```julia +struct my_dset{T} + data_arr::T + trf +end +``` + +The structure takes a type parameter `T`, for numerical image data this could be `Array{Float32}(4)`. +That is, we specify that the numerical base type is `Float32`. The four dimensions correspond to +width, height, channels, and number of observations. The field `trf` stores the transformation we will +apply to the images. No type parameter is provided here, which allows to be more general for the type of +transformations we will apply. + +The data we operate on is a 4-dimensional numerical array, that represents a large collection of color images: + +```julia +num_samples = 100 +num_channels = 3 +width = height = 28 +d = randn(Float32, width, height, num_channels, num_samples) +``` + +Now we can define a composition of transformations we wish to apply to the data. In this example we +compose a horizontal, vertical flip or no operation, followed by a gaussian blur. A complete +list of available augmentations in `Augmentor.jl` is provided [here](https://evizero.github.io/Augmentor.jl/stable/operations/). + +```julia +pl = FlipX() * FlipY() * NoOp() |> GaussianBlur(3:2:5, 1f0:1f-1:2f0) +``` + +With the data and transformation in place, we can instantiate the dataset + +```julia +ds = my_dset(d, pl) +``` + +To instantiate a `DataLoader`to iterate over this simple dataset we need to implement custom +`numobs` and `getobs` methods: + +```julia +function MLUtils.getobs(dset::my_dset, ix::Int) + obs = dset.data_arr[:, :, :, ix] # Fetch a single observation from the dataset + obs_c = colorview(RGB, permutedims(obs, (3, 1, 2))) # Convert it into an image so that the transformation can be applied to it + obs_trf = augment(obs_c, dset.trf) # Apply the augmentations + permutedims(channelview(obs_trf), (2, 3, 1)) # Convert the augmented observation into numerical data +end + +MLUtils.numobs(data::my_dset) = size(data.data_arr)[end] +``` + +The `numobs` function just returns the number of samples in the dataset. Which is just the extend of the +last dimnension of the data array field of `my_dset`. The `getobs` function takes the dataset and an +integer index as input and return the augmented array. Internally, we first fetch a single observation +from the dataset. Then we convert it into an image, apply the augmentation, and convert the augmented +observation back into a numerical type. + +With these methods implemented, we can now construct a `DataLoader` and iterate over the dataset. +The augmentations will be applied lazily at the time a observation is accessed. + +```julia +loader = DataLoader(ds, batchsize=-1) + +for (ix, obs) ∈ enumerate(loader) + @show ix, size(obs) +end +``` + +Now we focus on batching. In practice we want to train on a batch of multiple images. +`MLUtils.jl` provides `BatchView` that allows to fetch batches of images at a time. +To make `BatchView` work on our dataset it needs to implement the data container +interface as described in `ObsView`. In particular, we need to implement a +`getobs` and `getobs!` method that fetch multiple observations. + +The difference between `getobs!` and `getobs` is that `getobs!` returns multiple +observations in a pre-allocated buffer. We can therefore implement `getobs!` first +and let `getobs` allocate a buffer and just call `getobs!`. + +```julia +function MLUtils.getobs!(buffer, dset, ix::AbstractVector) + batch = dset.data_arr[:, :, :, ix] # Load selected observations + batch_img = colorview(RGB, permutedims(batch, (3, 1, 2, 4))) # Convert to image + augmentbatch!(CPUThreads(), buffer, batch_img, dset.trf) # Augment entire batch + permutedims(channelview(buffer), (2, 3, 1, 4)) # Convert augmented batch to numerical type +end + +function MLUtils.getobs(dset::my_dset, ix::AbstractVector) + # Get the size of the dataset array, sans the number of batches. THat is defined by + # the length of the index vector + + batch_dim = [size(ds.data_arr)[[3, 1, 2]]..., length(ix)] + buffer = colorview(RGB, zeros(eltype(dset.data_arr), batch_dim...)) + MLUtils.getobs!(buffer, dset, ix) +end +``` + +`getobs!` takes as input the pre-allocated buffer, the dataset, and vector of +indices that specify the desired observations to fetch. The function then +copies the specified observations into a buffer and converts the datatype of the +buffer into an image type. Then, the entire batch is augmented and the result stored +in the pre-alloacted buffer. The first argument `CPUThreads()` in the call to +`augmentbatch!` allows individual augmentations to be performed in parallel. +The result is converted back into a numerical array which is then returned by the function. + + +The `getobs` method is essentially a wrapper around `getobs!` but also allocates +a buffer. The number of observations requested in each iteration of the `DataLoader` +can vary when there is a remainder for dividing number of total observations by the +batch size. + +With these methods implemented, we can now lazily apply random augmentations to each +batch of the dataset: + +```julia +loader_batch = DataLoader(ds, batchsize=27, shuffle=true) +for (ix, bobs) ∈ enumerate(loader_batch) + @show ix, size(bobs) +end +``` + + + ## Related Packages `MLUtils.jl` brings together functionalities previously found in [LearnBase.jl](https://github.com/JuliaML/LearnBase.jl) , [MLDataPattern.jl](https://github.com/JuliaML/MLDataPattern.jl) and [MLLabelUtils.jl](https://github.com/JuliaML/MLLabelUtils.jl). These packages are now discontinued.