Content-Based Annotation and Classification Framework: A General Multi-Purpose Approach
Michal Batko, Jan Botorek, Petra Budikova, Pavel Zezula
Laboratory of Data Intensive Systems and Applications
Faculty of Informatics, Masaryk University, Brno
IDEAS 2013

Outline
§Motivation
  §Why annotations?
  §State-of-the-art in multimedia annotation
§General annotation model
  §Global architecture
  §Application to selected tasks
  §Specification of components
§Web image annotation
  §Current implementation
  §Experimental evaluation
§Conclusions and future research directions

Motivation
„Image is worth a thousand words.“
[Image: a photo shown with several possible annotations: "Yellow flower"; "Flower, yellow, dandelion, detail, close-up, nature, plant, beautiful"; "Taraxacum officinale"; "The first dandelion that bloomed this year in front of the White House."; "nature"; "dandelion"]

Why do we need the thousand words?
§Keyword-based image retrieval
  §Popular and intuitive
  §Needs pictures with text metadata, we do not want to create them manually
§Information seeking: “What is in the photo I just took?”
  §Tourist information / Plant identification / …
  §Impaired users
§Classification tasks
  §Scientific data (medicine, astronomy, chemistry, …)
  §Improper content identification
  §Personal image gallery
§Data summarization: “What images are on this computer?”
§Not only images!
  §Sound, video, …
Several dimensions of the annotation problem
§Input
  §Image / Image and seed keyword / Image and text / Text
§Type of information needed
  §Identification / Detection / Categorization
§Vocabulary
  §Unlimited vocabulary / Controlled vocabulary
§Form of annotation required
  §Sentence / Set of keywords / All relevant categories / A single category / Localization in a taxonomy
§Interactivity
  §Online / offline annotation
§Easy tasks: identify a single relevant category from a short list
§Difficult tasks: wide (unlimited) vocabulary, “all relevant needed”, online processing, very little or no input text

State-of-the-art text-extraction techniques
§Pure text-based
  §Analyze the text on a surrounding web page
§Content-based / Content- and text-based
  §Mainly exploit visual properties (+ text when available)
§Content-based annotation scenario:
  §Basic annotation
    §Model-based: train a model for each concept in vocabulary
    §Search-based: kNN search in annotated collection
  §Annotation refinement
    §Statistical
    §Ontology-based
    §Secondary kNN search
    §…

Existing approaches – summary
§Model-based techniques:
  §(+) Specialized classifiers can achieve high precision
  §(+) Fast processing
  §(-) Training feasible only for a limited number of concepts, high-quality training data needed
§Search-based techniques:
  §(+) Can exploit vast amounts of annotated data available online
  §(+) No training needed, no limitation of vocabulary
  §(-) Costly processing when large datasets need to be searched
  §(-) Content-based similarity measures often not precise enough
§Summary of state-of-the-art:
  §Mostly specialized solutions for a specific type of application
  §Reasonable results only for simple tasks

Our approach
§Facts
  §Experiments show that state-of-the-art solutions are not very successful for complex problems
  §Psychological research suggests hierarchical
annotation
§Our vision:
  §Broad-domain annotation is a complex process, needs to be modeled as such
    §Multiple processing phases
    §Modular design
    §Hierarchic annotation
    §Combine multiple knowledge sources
    §User in the loop
  §The same infrastructure can be used for different applications (annotation, classification, …)
    §The principal components are the same
    §Easy evaluation, comparisons

General annotation model
[Diagram labels: input information + inferred information (e.g. "flower, ..."); annotation forming (expansion, transformation, reduction); resources; output (intermediate or final result, e.g. "Flower, nature, ..."); new query/relevance feedback from the user (e.g. "flower")]

General annotation model (cont.)
§Framework components
  §Query
    §Image / image + text / (text)
  §Knowledge sources
    §Annotated image collection, WordNet, ontologies, internet, …, user
  §Annotation-record
    §Query + candidate keywords, weights, any other knowledge
  §Processor modules
    §Expander, transformer, reducer
  §Evaluation scenarios
§Properties
  §Clear structure, modularity
  §Can be adapted to various annotation/classification tasks
  §Supports extensive experiments, comparison of techniques
[Diagram: a transformer, drawing on a knowledge source, turns an annotation-record of (word, NULL) pairs into one of (word, weight) pairs]

Simple examples
§Basic search-based annotation
§Simple model-based annotation
[Diagrams: (1) Query + annotated image collection, simple annotation, annotation-record of (word, NULL) pairs; (2) Query + annotated image collection and WordNet, simple annotation, annotation-record of (word, NULL) pairs]

Advanced example: Hierarchic image annotation
[Diagram labels: Query; knowledge sources (image collections, web, dictionaries, Wikipedia, …; WordNet, specialized ontologies); semantic weight transformer; basic level category; result or next level category; relevance feedback. Example keyword sets shown: "animals, outdoor"; "animals, outdoor, penguins, whales, snow"; "animals, nature, outdoor, snow, penguins, group, standing"]

Processing modules
§“The brain of the annotation process”
§Expanders
  §Provide candidate keywords
  §Visual-based nearest-neighbor search
    §Similarity measured by MPEG-7 global descriptors
    §Metric search provided by efficient M-index structure
    §Knowledge source: annotated image collection
  §Face detection software
    §Luxand FaceSDK
    §Commercial library for detection and recognition of faces
    §Depending on the number of faces detected, people-related concepts are added to the annotation-record

Processing modules (cont.)
§Transformers
  §Adjust weights of candidate keywords
  §Basic weight transformer
    §Frequency of a keyword in the descriptions of similar images
    §Similarity score of each image with the particular keyword
    §Knowledge source: descriptions of similar images
  §Semantic transformer
    §Uses WordNet hierarchies to cluster related words
    §Keyword weight increased proportionately to the size of the containing cluster
    §Knowledge source: WordNet
§Reducers
  §Remove unsuitable candidates
  §Syntactic cleaner
    §Stopword removal, translation, spell-correction
    §Knowledge sources: WordNet, dictionaries, Wikipedia

Web image annotation problem
§Task specification
  §“Given an image, provide the K most relevant keywords that describe the content of this image.”
§Use case
  §A professional photographer uploading images to a photo-selling site needs to provide accompanying keywords to enable text search
§Basic solution
  §Budikova, Batko, Zezula: Online image annotation. SISAP 2011.
[Diagram: Query + annotated image collection yield an annotation-record of (word, NULL) pairs; a weight transformer (word frequency) turns it into an annotation-record of (word, weight) pairs]

Web image annotation problem (cont.)
[Diagram label: Weight transformer (WordNet relationships)]
§A more complex solution
[Diagram labels: Query; annotated image collection; WordNet; Dictionary, Wikipedia, WordNet; weight transformer (word frequency); a chain of annotation-records going from (word, NULL) pairs to (word, weight) pairs. Example output: Flight, fly, airplane, sky, aircraft, plane, aviation]

Web image annotation – evaluation
§Methods under comparison
  §Original search-based annotation
  §Cleaned keywords
  §Boosting by distance
  §Clustering by WordNet meaning
  §Face detector boosting
  §Face detector enrichment
§Evaluation methodology
  §160 test queries
  §Categories easy/medium/difficult
  §20 best keywords requested
  §Result relevance evaluation:
    §User-provided (result relevance assessments)
    §Automatic (comparison to image description provided by author)

Web image annotation – evaluation (cont.)
Easy query: entertainment, art, sparkling, event, enjoyment, show, display, air, celebration, festival, flash, level, fireworks, cracker, explosion, fire, excitement, firecracker, light, bang
Medium query: blossom, location, plant, bird, food, trees, natural, citrus, flowers, generic, antique, destinations, nature, recreation, tree, foliage, botany, fruit, determination, flower
Difficult query: form, station, antique, interior, frame, bookcase, indoors, group, animal, antiques, snack, person, construction, food, chinese, study, wood, architecture, dynasty, building
Query images:
http://mufin.fi.muni.cz/profimedia/images/0000262717
http://mufin.fi.muni.cz/profimedia/images/0069993882
http://mufin.fi.muni.cz/profimedia/images/0035537162

Web image annotation – evaluation (cont.)
[Figure: Precision, user evaluation: highly relevant keywords (dark) + relevant keywords (light), by query type]
[Figure: Processing costs]

Conclusions
§Image annotation remains a challenging task
  §Broad domains, interactive applications, lack of training data, …
§Our contributions
  §General annotation model & implementation framework
  §Implementation & evaluation of several processing components
  §Improved annotation tool
    §http://disa.fi.muni.cz/prototype-applications/image-annotation/
§Future work
  §Refinement of semantic analysis
  §Development of new components, hierarchic annotation processing
  §Relevance feedback strategies for image annotation

More experimental results

Message
§Broad-domain image annotation is a highly topical, challenging task
§Existing methods provide useful tools, but are not able to solve the problem in its full complexity
§We propose to approach the problem in a novel way, relying on iterative annotation tuning
§We presented a modular annotation framework that supports a variety of annotation and classification tasks
§We already have working annotation software, and we evaluated the usefulness of several processing modules experimentally.
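
Illustrative sketch (not part of the original slides): the expander, transformer and reducer chain from the "Processing modules" and "Web image annotation problem" slides could look roughly as follows in Python. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `AnnotationRecord`, `Neighbor`, `expand`, `transform`, `reduce_record` and `annotate` are hypothetical, the kNN search result is passed in as plain data instead of being computed with MPEG-7 descriptors and the M-index, and weighting a keyword by the summed similarity of the images that carry it is only one plausible reading of the basic weight transformer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for one kNN search result from an annotated image
# collection: (similarity score, keywords describing the similar image).
# Assumption: a higher score means a more similar image.
Neighbor = Tuple[float, List[str]]

# Tiny illustrative stopword list; a real syntactic cleaner would use
# dictionaries, WordNet, etc.
STOPWORDS = {"the", "a", "of", "in"}


@dataclass
class AnnotationRecord:
    """Candidate keywords with weights; None plays the role of NULL."""
    weights: Dict[str, Optional[float]] = field(default_factory=dict)


def expand(record: AnnotationRecord, neighbors: List[Neighbor]) -> AnnotationRecord:
    """Expander: add every keyword of a similar image as an unweighted candidate."""
    for _, keywords in neighbors:
        for word in keywords:
            record.weights.setdefault(word.lower(), None)
    return record


def transform(record: AnnotationRecord, neighbors: List[Neighbor]) -> AnnotationRecord:
    """Transformer: weight = sum of similarity scores of images carrying the word."""
    for word in record.weights:
        record.weights[word] = sum(
            score
            for score, keywords in neighbors
            if word in {k.lower() for k in keywords}
        )
    return record


def reduce_record(record: AnnotationRecord, k: int) -> List[str]:
    """Reducer: drop stopwords and unweighted words, keep the k heaviest keywords."""
    kept = {
        word: weight
        for word, weight in record.weights.items()
        if word not in STOPWORDS and weight is not None
    }
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def annotate(neighbors: List[Neighbor], k: int) -> List[str]:
    """Run expander, transformer and reducer in sequence."""
    record = AnnotationRecord()
    expand(record, neighbors)
    transform(record, neighbors)
    return reduce_record(record, k)
```

Calling `annotate(neighbors, k)` with a list of (score, keywords) pairs returns the K selected keywords. Because each stage only reads and updates the shared annotation-record, further modules (for example a WordNet-based semantic transformer) could be inserted between the stages without changing the others, which is the modularity the framework slides emphasize.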