Person Tagging in Still Images by Fusing Face and Full-body Detections
Vlastislav Dohnal (dohnal@fi.muni.cz) and Alexander Matecny
DISA Lab – disa.fi.muni.cz
Masaryk University, Brno, Czech Republic
February 27, 2014, MMEDIA, Nice, France

Outline
  Motivation
  System architecture
  Figure recognition
  Experimental results
  Conclusions

Motivation
  Application: person tagging in personal photo albums
  State of the art:
    Face recognition is used
    Figure/body detectors exist
      Based on detecting the person's head, or on a full-body/torso detector
      But typically applied to video signals
  Aim: tag people whose faces are not detected
    Tag such people using figure recognition, i.e. a figure detector applied to still images

Framework for Person Recognition

Prototype Implementation
  Detection module
    Face detector: Luxand SDK
    Figure detector: Edgelets [1]
  Extraction module
    Face descriptor: Luxand SDK
    Figure descriptor:
      Obtain clothing patch region
      MPEG-7 visual descriptor
  Clustering module
    Objective function based on the maximum distance between cluster members

[1] B. Wu and R. Nevatia: Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. ICCV 2005.

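
The clustering module above is described only by its objective (the maximum distance between cluster members). One way to read that is as greedy complete-linkage grouping; the sketch below follows that reading. The merge strategy, the function names, and the `dist` and `threshold` parameters are assumptions made for illustration, not the prototype's actual algorithm.

```python
from itertools import combinations
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def max_intra_distance(members: Sequence[T], dist: Callable[[T, T], float]) -> float:
    """Largest pairwise distance among members (0.0 for a single member)."""
    return max((dist(a, b) for a, b in combinations(members, 2)), default=0.0)


def cluster_by_max_distance(
    items: Sequence[T], dist: Callable[[T, T], float], threshold: float
) -> List[List[T]]:
    """Greedy agglomerative grouping (an assumed strategy).

    Start with one cluster per item and repeatedly merge the pair of clusters
    whose union has the smallest maximum internal distance, as long as that
    distance stays below `threshold`.
    """
    clusters: List[List[T]] = [[x] for x in items]
    while len(clusters) > 1:
        best = None  # (max distance of the union, index i, index j)
        for i, j in combinations(range(len(clusters)), 2):
            d = max_intra_distance(clusters[i] + clusters[j], dist)
            if d < threshold and (best is None or d < best[0]):
                best = (d, i, j)
        if best is None:
            break  # no admissible merge is left
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Toy one-dimensional "descriptors" compared by absolute difference.
    print(cluster_by_max_distance([0, 10, 15, 100, 105], lambda a, b: abs(a - b), 20))
```

In a real setting `dist` would compare face or clothing-patch descriptors, and `threshold` would be tuned separately for each descriptor type.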
Figure Detection: Edgelets Detect edges Match edgelets (lines, arks, symmetrical pairs) Group edgelets to identify a body part By AdaBoost February 27, 2014MMEDIA, Nice, France6 Result of Sobel operator Quantized orientation Edgelets line ark pair Figure Detection: Merging & Patch Merge detections 𝑎𝑟𝑒𝑎 𝑅 𝑏𝑜𝑑𝑦1∩𝑅 𝑏𝑜𝑑𝑦2 𝑚𝑖𝑛 𝑎𝑟𝑒𝑎 𝑅 𝑏𝑜𝑑𝑦1 , 𝑎𝑟𝑒𝑎 𝑅 𝑏𝑜𝑑𝑦2 ≥ 0.72 Clothing patch Defined by shrinking the figure detection region Width [0.30; 0.70], height [0.32; 0.58] February 27, 2014MMEDIA, Nice, France7 Figure Detection: Quality Figure detector training Positives: 914 person images from MIT dataset Negatives: 1,886 images from INRIA Holidays dataset Figure detector quality Tested on ETH dataset 1,201 person figures in 196 images, i.e. ground truth is defined 817 people correctly detected (68%) 1,943 false detections (i.e. 29.6% precision) February 27, 2014MMEDIA, Nice, France8 Figure Descriptor MPEG7 standard descriptors Extracted from clothing patch region Color structure (CS), scalable color (SC), edge histogram (EH), and their combination February 27, 2014MMEDIA, Nice, France9 Figure Descriptor: Recognition Figure recognition Tested on ETH dataset Descriptors: Color structure (CS), scalable color (SC), edge histogram (EH), and their combinations February 27, 2014MMEDIA, Nice, France10 Clustering Detections Face clustering Similar faces form a cluster (distance < 0.14) Figure clustering Similar clothing patches (distance < 1.28) Face and figure correspondence  𝑎𝑟𝑒𝑎 𝑅 𝑓𝑎𝑐𝑒∩𝑅 𝑡𝑜𝑝_𝑏𝑜𝑑𝑦 𝑚𝑖𝑛 𝑎𝑟𝑒𝑎 𝑅 𝑓𝑎𝑐𝑒 ,𝑎𝑟𝑒𝑎 𝑅 𝑡𝑜𝑝_𝑏𝑜𝑑𝑦 ≥ 0.10 𝑅𝑡𝑜𝑝_𝑏𝑜𝑑𝑦= top third of the figure region Image file IDs are respected Two different detections in the same file cannot be merged February 27, 2014MMEDIA, Nice, France11 Experimental Evaluation Whole system tested on a small personal album 24 photos, 76 figures, 54 faces, 26 distinct person identities (11 of them repeated) Detection results: 7 correct faces (13%) 54 correct figures (71%) 24 false-posit. 
figs Clustering results: February 27, 2014MMEDIA, Nice, France12 More results at http://disa.fi.muni.cz/mmedia2014/ 6 - ok (14%) 3 - different people (7%) 7 - mix of false/true positives 27 - only one detection (63%) Conclusions Framework for combining face and figure detection for person identification A preliminary implementation Identification using body clothing patches Future plans Prepare independent head and upper-body detectors Improve merging detection regions Use personal photos only to train the detectors Test other figure detectors Test better clothing patch extractor (e.g. based on segmentation) February 27, 2014MMEDIA, Nice, France13 Experimental Evaluation February 27, 2014MMEDIA, Nice, France14 One cluster of different people Clothing patches are very similar Experimental Evaluation Different clusters of ungrouped detections February 27, 2014MMEDIA, Nice, France15 Experimental Evaluation A cluster with false and true positive detections Clothing patches are similar February 27, 2014MMEDIA, Nice, France16