Improvements of Face Detection and Recognition

Vladimír Míč, Jan Sedmidubský, Michal Batko, Pavel Zezula
Bc. Vladimír Míč, 21.10.2013

Outline
- Basic terms
- Face detection and recognition
- Quality evaluation
- Face retrieval system
- Original approach
- Face detection improvement
- Face recognition improvement
- Future work

Face detection problem
- The goal is to place an ellipse where a face appears in the image
- Recall: how many of the real faces were detected? (100 %)
- Precision: how many of the detected faces are real faces? (66.7 %)

Face recognition problem
- In general: the ability to answer the question “Is it the same person in these two images?”
- In practice: sorting faces by their similarity to a query
- Example: 21 photos of person “00003”, 10-NN query
- Recall = 4 / 21 = 19 %, Precision = 4 / 10 = 40 %

  Distance  Image
  0         00003_941121_rd.jpg
  8932      00003_941121_hr.jpg
  9145      00003_941121_qr.jpg
  9167      00003_941121_rc.jpg
  9277      00750_941201_hr.jpg
  9281      00765_941201_hr.jpg
  9282      00972_960627_qr.jpg
  9283      00695_941121_qr.jpg
  9285      00772_941201_re.jpg
  9286      00750_941205_hr.jpg

Face retrieval system
- Detection and recognition of human faces
- Static images only
- Web application
- Real-time system
- Huge number of photos
- Demo with 1 000 000 pictures
- Low-quality images
- Result: the most similar photos for an arbitrary query within a few seconds

Original approach: assumptions
- Extractor for MPEG7 descriptors
  - Extraction from an arbitrary picture
  - Satisfies the metric properties
- More sophisticated software (software B)
  - Higher quality than MPEG7
  - No constraints (e.g., about metric properties)
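The recall and precision figures of the 10-NN example on the face recognition slide can be checked with a few lines of Python. This is only a sketch: it assumes that the person ID is the part of the file name before the first underscore, and the helper names person_id and precision_recall are made up for this illustration.

```python
# Ranked answer of the 10-NN query, copied from the example table.
results = [
    "00003_941121_rd.jpg", "00003_941121_hr.jpg", "00003_941121_qr.jpg",
    "00003_941121_rc.jpg", "00750_941201_hr.jpg", "00765_941201_hr.jpg",
    "00972_960627_qr.jpg", "00695_941121_qr.jpg", "00772_941201_re.jpg",
    "00750_941205_hr.jpg",
]


def person_id(file_name: str) -> str:
    # Assumption: the prefix before the first underscore identifies the person.
    return file_name.split("_", 1)[0]


def precision_recall(ranked, query_person, total_relevant):
    """Return (precision, recall) of a ranked answer for one query person."""
    hits = sum(1 for name in ranked if person_id(name) == query_person)
    return hits / len(ranked), hits / total_relevant


# 21 photos of person "00003" exist in the collection (from the slide).
precision, recall = precision_recall(results, "00003", 21)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```

Four of the ten returned images belong to person “00003”, which gives the 40 % precision and 4 / 21 = 19 % recall stated in the example.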
Original approach: performance boost
- Face detection using software B
- Face crops from images
- MPEG7 descriptors from crops
- Index over MPEG7 descriptors
- Candidate selection using the index
- Re-ranking of the candidates according to the recognition results of software B

Face Detection Improvement

New approach
- Performed by 3 independent pieces of software
  - OpenCV (makes MPEG7 descriptors)
  - Luxand
  - Neurotechnology (Verilook)
- Extraction made in parallel
- Detection: agreement of at least 2 out of 3 pieces of software
- Precision nearly 100 %

Face detection results
- Dataset 1: 1260 small faces, low quality
- Dataset 2: 66 big faces, high quality

  Name                         Recall 1   Precision 1   Recall 2   Precision 2
  OpenCV (OCV)                 55 %       89 %          92 %       86 %
  Luxand                       63 %       83 %          95 %       94 %
  Neurotechnology (Verilook)   73 %***    84 %          100 %      96 %
  Aggregated extractor         62 %       98 %          97 %       100 %

Face Recognition Improvement

Aggregated face descriptor
- The aggregated face descriptor encapsulates
  - MPEG7 descriptor
  - Luxand descriptor
  - Verilook descriptor
- The MPEG7 descriptor is always present
  - It can be added via a crop made according to the Luxand or Verilook descriptor
- One of the Luxand and Verilook descriptors may be missing
  - Not both

Face recognition principle
- Based on a distance function
- How to measure the distance between two aggregated descriptors?
  - Missing encapsulated descriptors cannot participate in the distance calculation
  - Distances are computed between the encapsulated objects
  - For each pair of aggregated descriptors, 1 to 3 distances can be obtained
- How to combine several distances to get the best recognition function?
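The structure of the aggregated descriptor and the "1 to 3 distances" rule can be sketched as follows. This is an illustration only, not the presented system: the class and function names are hypothetical, descriptors are modelled as plain float vectors, and a Euclidean distance stands in for the real MPEG7, Luxand and Verilook distance functions, which the slides do not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

Vector = Sequence[float]  # assumption: a descriptor is a plain float vector


@dataclass(frozen=True)
class AggregatedDescriptor:
    mpeg7: Vector                      # always present
    luxand: Optional[Vector] = None    # may be missing
    verilook: Optional[Vector] = None  # may be missing, but not both

    def __post_init__(self):
        if self.luxand is None and self.verilook is None:
            raise ValueError("Luxand and Verilook descriptors cannot both be missing")


def euclidean(a: Vector, b: Vector) -> float:
    # Placeholder distance; the real per-software distances differ.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def partial_distances(
    a: AggregatedDescriptor,
    b: AggregatedDescriptor,
    distance: Callable[[Vector, Vector], float] = euclidean,
) -> Dict[str, float]:
    """Distances between encapsulated descriptors present in both a and b."""
    out: Dict[str, float] = {}
    for name in ("mpeg7", "luxand", "verilook"):
        da, db = getattr(a, name), getattr(b, name)
        if da is not None and db is not None:
            out[name] = distance(da, db)
    return out


a = AggregatedDescriptor(mpeg7=[0.0, 0.0], luxand=[1.0])
b = AggregatedDescriptor(mpeg7=[3.0, 4.0], verilook=[2.0])
print(partial_distances(a, b))  # only the MPEG7 distance is available here
```

Because the MPEG7 descriptor is always present, every pair yields at least one distance. A pair in which one face has only the Luxand descriptor and the other only the Verilook descriptor yields exactly one.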
Distance quality measurements
[Figure: four plots of recall and precision (0 % to 100 %) against the distance threshold, one per distance function: MPEG7 (thresholds 80 to 180), Luxand (0.01 to 0.61), Verilook (8900 to 9332), and aggregated (0 to 1)]

Normalization of the distances between encapsulated objects
- Normalization according to the precision
- The normalized distance lies within [0, 1]
- An original distance d at which the precision was p is normalized to the new distance 1 - p
- A fairly accurate approximation is used
- The aggregated distance is equal to the minimum of the normalized partial distances

Face recognition summary
- Dataset 1: 753 small images, low quality
- Dataset 3: 11 338 big images, high quality

  Name                         Recall 1 at    Recall 1 at    Recall 3 at    Recall 3 at
                               85 % prec.     95 % prec.     85 % prec.     95 % prec.
  OpenCV (OCV)                 24 %           14 %           8 %            3 %
  Luxand                       23 %           16 %           14 %           Not possible
  Neurotechnology (Verilook)   12 %           11 %           53 %           51 %
  Aggregated extractor         31 %           24 %           54 %           51 %

Future work
- Index built on aggregated descriptors using the MPEG7 distance function
  - Candidate selection
- Re-ranking according to the aggregated distance function
- The prototype system is expected to run in a few days :-)

Thank you for your attention
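As a supplement to the normalization slide, the mapping from an original distance d to 1 - p and the minimum-based aggregation can be sketched as below. The slides say only that "a fairly accurate approximation" is used; linear interpolation between calibration points is an assumption made here, as is the premise that precision does not increase with the distance threshold. The function names and the calibration numbers are hypothetical and are not measured values from the presentation.

```python
from bisect import bisect_right
from typing import Callable, Dict, Sequence


def make_normalizer(
    thresholds: Sequence[float], precisions: Sequence[float]
) -> Callable[[float], float]:
    """Build d -> 1 - p(d), with p interpolated linearly between points.

    thresholds must be sorted in ascending order; precisions are fractions
    in [0, 1] measured at those thresholds.
    """
    def normalize(d: float) -> float:
        if d <= thresholds[0]:
            p = precisions[0]
        elif d >= thresholds[-1]:
            p = precisions[-1]
        else:
            i = bisect_right(thresholds, d)  # thresholds[i-1] <= d < thresholds[i]
            t0, t1 = thresholds[i - 1], thresholds[i]
            p0, p1 = precisions[i - 1], precisions[i]
            p = p0 + (p1 - p0) * (d - t0) / (t1 - t0)
        return 1.0 - p

    return normalize


def aggregated_distance(
    partial: Dict[str, float], normalizers: Dict[str, Callable[[float], float]]
) -> float:
    """Minimum of the normalized partial distances that are available."""
    return min(normalizers[name](d) for name, d in partial.items())


# Hypothetical calibration points, for illustration only.
normalizers = {
    "mpeg7": make_normalizer([80.0, 130.0, 180.0], [1.0, 0.5, 0.0]),
    "luxand": make_normalizer([0.0, 0.3, 0.6], [1.0, 0.8, 0.2]),
}
print(aggregated_distance({"mpeg7": 105.0, "luxand": 0.15}, normalizers))
```

Taking the minimum means that a pair of faces counts as close as soon as any single encapsulated descriptor is confident about the match, which is one way to read the gain of the aggregated extractor in the summary table.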