EXPERIMENTS WITH IMAGE AUGMENTATION FOR SEGMENTATION
V. Martinek
Overview
● 2 datasets
● 4 models
● 20 augmentation techniques (Geometric, Filters, Erasing, Colors)
● Measuring impact on test set metrics
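As a rough illustration of how such techniques apply to segmentation data, the sketch below uses NumPy to show one geometric, one color, and one erasing transform. The function names are hypothetical and are not taken from the experiments. The point it illustrates is that geometric transforms must be applied to the image and its label mask together, while color transforms change only the image. Leaving the mask untouched when erasing is an assumption made here, not something the slides state.

```python
import numpy as np

def horizontal_flip(image, mask):
    """Geometric: flip image and mask together so labels stay aligned."""
    return image[:, ::-1].copy(), mask[:, ::-1].copy()

def adjust_brightness(image, mask, factor):
    """Color: scale pixel intensities; the mask is returned unchanged."""
    out = np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
    return out, mask

def random_erase(image, mask, size, rng):
    """Erasing: zero out a random square patch of the image only (assumption)."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    out = image.copy()
    out[top:top + size, left:left + size] = 0
    return out, mask

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)  # toy RGB image
mask = rng.integers(0, 3, size=(4, 6))                        # class id per pixel

flipped_image, flipped_mask = horizontal_flip(image, mask)
bright_image, bright_mask = adjust_brightness(image, mask, 1.5)
erased_image, erased_mask = random_erase(image, mask, 2, rng)
```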
CamVid
● Images captured from a driving car
● 32 distinct labels (pedestrian, road, car, sidewalk, fence...)
● 370-100-230 split
Drone
● Images taken from a drone flying in urban areas
● 23 distinct labels (water, roof, person, grass...)
● 240-80-80 split
Segmentation
● Assign a class to each pixel
● Each pixel gets a probability vector over the classes
● Network output has shape Height × Width × Classes, with a softmax over the classes at each pixel
● Input is the same resolution as output
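The output format described above can be sketched in NumPy as follows. The array sizes and the random logits are placeholders chosen only for illustration. The softmax runs over the class axis, so each pixel ends up with its own probability vector, and the predicted label map is the per-pixel argmax.

```python
import numpy as np

def pixelwise_softmax(logits):
    """Softmax over the last (class) axis, applied independently per pixel."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

height, width, num_classes = 4, 5, 3          # illustrative sizes
rng = np.random.default_rng(0)
logits = rng.normal(size=(height, width, num_classes))

probs = pixelwise_softmax(logits)             # shape (H, W, C), sums to 1 per pixel
label_map = probs.argmax(axis=-1)             # shape (H, W), one class id per pixel
```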
U-Net
● Encoder and decoder (contracting and expanding path)
● During expansion, feature maps are concatenated with the corresponding outputs of the contracting path
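The concatenation step can be illustrated with a shape-only NumPy sketch. It deliberately omits the convolutions of a real U-Net and uses hypothetical helper names. It only shows that upsampled decoder features and the stored encoder features of the same resolution are joined along the channel axis.

```python
import numpy as np

def downsample(x):
    """2x2 max pooling on an (H, W, C) array; H and W must be even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling on an (H, W, C) array."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
encoder_features = rng.normal(size=(8, 8, 16))   # kept from the contracting path
bottleneck = downsample(encoder_features)        # (4, 4, 16)
decoder_features = upsample(bottleneck)          # (8, 8, 16)

# Skip connection: concatenate along the channel axis.
merged = np.concatenate([decoder_features, encoder_features], axis=-1)  # (8, 8, 32)
```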
Models
● Inception, MobileNet, ResNet, VGG
● Used as the encoder part of the U-Net
● Encoder pre-trained on ImageNet and frozen
Metrics
Dice loss: derived from the Dice coefficient, which is equivalent to the F1 score (weighted)
Mean IoU (macro)
Both in the range [0, 1]
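A minimal sketch of both metrics on hard label maps follows. It rests on two assumptions that the slides do not confirm: "weighted" is read as weighting each class by its pixel count in the ground truth, and "macro" as an unweighted mean over the classes that occur. A Dice loss used for training would normally be computed on the soft probabilities instead of hard labels.

```python
import numpy as np

def per_class_counts(pred, target, num_classes):
    """True positives, false positives and false negatives for each class."""
    classes = range(num_classes)
    tp = np.array([np.sum((pred == c) & (target == c)) for c in classes])
    fp = np.array([np.sum((pred == c) & (target != c)) for c in classes])
    fn = np.array([np.sum((pred != c) & (target == c)) for c in classes])
    return tp, fp, fn

def weighted_dice(pred, target, num_classes):
    """Per-class Dice (F1), weighted by class support (assumed weighting)."""
    tp, fp, fn = per_class_counts(pred, target, num_classes)
    denom = 2 * tp + fp + fn
    dice = np.divide(2 * tp, denom, out=np.zeros(num_classes), where=denom > 0)
    support = tp + fn
    return float((dice * support).sum() / support.sum())

def macro_iou(pred, target, num_classes):
    """Unweighted mean IoU over classes present in pred or target."""
    tp, fp, fn = per_class_counts(pred, target, num_classes)
    union = tp + fp + fn
    present = union > 0
    return float((tp[present] / union[present]).mean())

target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])

dice = weighted_dice(pred, target, num_classes=3)
dice_loss = 1.0 - dice
mean_iou = macro_iou(pred, target, num_classes=3)
```

Both scores reach 1 for a perfect prediction, so the corresponding Dice loss is 0 in that case.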
Results without augmentation
Most of the models performed reasonably well
ResNet on CamVid failed to learn
Geometric
Filters
Erasing
Color space
CamVid dataset
Drone dataset
Conclusions
Some reasonable improvements (motion blur, elastic transform, edge enhancement…)
High variance
More Models/Datasets?
Finish