EXPERIMENTS WITH IMAGE AUGMENTATION FOR SEGMENTATION
V. Martinek

Overview
● 2 datasets
● 4 models
● 20 augmentation techniques (geometric, filters, erasing, colors)
● Measuring the impact on test-set metrics

CamVid
● Images captured from a driving car
● 32 distinct labels (pedestrian, road, car, sidewalk, fence...)
● 370-100-230 split

Drone
● Images taken from a drone flying in urban areas
● 23 distinct labels (water, roof, person, grass ...)
● 240-80-80 split

Segmentation
● Assign a class to each pixel
● Each output pixel is a probability vector over the classes
● Network output is Height × Width × Classes, with softmax activation on each pixel
● Input has the same resolution as the output

U-Net
● Encoder and decoder (contracting and expanding paths)
● During expansion, feature maps are concatenated with the results from the contracting path

Models
● Inception, MobileNet, ResNet, VGG
● Used as the encoder part of the U-Net
● Encoder pre-trained on ImageNet and frozen

Metrics
● Dice loss: essentially the F1 score (weighted)
● Mean IoU (macro)
● Both in the range [0, 1]

No-aug models results
● Most of the models performed reasonably well
● ResNet on CamVid failed to learn

[Result slides, figures not preserved: No-aug models results; Geometric; Filters; Erasing; Color space; CamVid dataset; Drone dataset]

Conclusions
● Some reasonable improvements (motion blur, elastic, enhance edges…)
● Big variance
● More models/datasets?
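
As a supplementary sketch of the Segmentation and Metrics slides, the following dependency-free Python shows how a Height × Width × Classes output can be turned into a label map (softmax per pixel, then argmax) and how a Dice score and a macro mean IoU might be computed from it. All function names are hypothetical. Reading "(Weighted)" as averaging per-class Dice by ground-truth pixel count is an assumption, as is skipping classes that appear in neither map; the deck does not specify either. The Dice loss used for training would be one minus the score computed here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_labels(logit_map):
    """Convert an H x W x C nested list of logits into an H x W label map."""
    labels = []
    for row in logit_map:
        out_row = []
        for pixel in row:
            probs = softmax(pixel)
            out_row.append(probs.index(max(probs)))  # argmax over classes
        labels.append(out_row)
    return labels


def per_class_counts(y_true, y_pred, num_classes):
    """Count true positives, false positives and false negatives per class."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == p:
                tp[t] += 1
            else:
                fp[p] += 1  # predicted class p where it does not belong
                fn[t] += 1  # missed a pixel of true class t
    return tp, fp, fn


def mean_iou(y_true, y_pred, num_classes):
    """Macro mean IoU: unweighted average of per-class IoU.

    Assumption: classes absent from both maps are skipped.
    """
    tp, fp, fn = per_class_counts(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:
            ious.append(tp[c] / denom)
    return sum(ious) / len(ious) if ious else 0.0


def weighted_dice(y_true, y_pred, num_classes):
    """Per-class Dice (F1) averaged with ground-truth pixel counts as weights.

    Assumption: this is one plausible reading of "weighted" on the slide.
    """
    tp, fp, fn = per_class_counts(y_true, y_pred, num_classes)
    score, total = 0.0, 0
    for c in range(num_classes):
        support = tp[c] + fn[c]  # ground-truth pixels of class c
        if support > 0:
            score += support * (2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]))
            total += support
    return score / total if total else 0.0


# Toy 2 x 2 example with two classes; one pixel is misclassified.
y_true = [[0, 0], [1, 1]]
y_pred = [[0, 1], [1, 1]]
print(round(mean_iou(y_true, y_pred, 2), 4))
print(round(weighted_dice(y_true, y_pred, 2), 4))
```

Both scores lie in [0, 1], matching the range stated on the Metrics slide. A macro average treats every class equally, so rare classes influence mean IoU as much as frequent ones, whereas the weighted Dice is dominated by classes covering many pixels.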