Similarity Search for Multimedia Data
Pavel Zezula, DISA, FI Masaryk University, Brno, Czech Republic

Outline of the talk
•Similarity in our lives
•Principles of the metric similarity search technology
 –effectiveness – metric similarity features
 –efficiency – indexing structures
•Examples of multimedia applications:
 –Image search and annotation
 –Processing streams of similarity queries
 –Searching in motion capture data
 –Accessing protein databases
•Future research directions

Similarity in Real Life
Quotations from the social psychology literature:
•Any event in the history of an organism is, in a sense, unique.
•Recognition, learning, and judgment presuppose an ability to categorize stimuli and classify situations by similarity.
•Similarity (proximity, resemblance, communality, representativeness, psychological distance, ...) is fundamental to theories of perception, learning, judgment, etc.
•Similarity is subjective and context-dependent.

Real-life Similarity
•Are they similar? (a series of example image pairs)
•Prototypicality or centrality – similarity is not symmetric.
•Context/data/environment dependence – circumstances alter similarities.

What We Learned at School
•GEOMETRY: Two polygons are similar to each other if:
 1)their corresponding angles are congruent
  •∠A = ∠E; ∠B = ∠F; ∠C = ∠G; ∠D = ∠H, and
 2)the lengths of their corresponding sides are proportional
  •AB/EF = BC/FG = CD/GH = DA/HE

Similarity & Geometry
•If one polygon is similar to a second polygon, and the second polygon is similar to a third polygon, then the first polygon is also similar to the third polygon.
•In any case: two geometric figures are either similar or they are not similar at all.

Contemporary Networked Media – the digital data view
•Almost everything that we see, read, hear, write, measure, or observe can be digital.
•Users autonomously contribute to the production of global media, and the growth is exponential.
•Sites like Flickr, YouTube, and Facebook host user-contributed content for a variety of events.
•The elements of networked media are related by numerous multi-faceted links of similarity.
•The majority of current data is unstructured – possibly only structured on display.

Challenge
•The networked media database is getting close to the human “fact-base” – the gap between the physical and the digital world has blurred.
•Similarity data management is needed to connect, search, filter, merge, relate, rank, cluster, classify, identify, or categorize objects across various collections.
•WHY? It is the similarity which is revealing in the world.

Iterative and Interactive Nature of Contemporary Searching
•When we search, our next actions are reactions to the stimuli of previous search results.
•What we find is changing what we seek.
•In any case, search must be: fast, simple, and relevant.

Metric Space: A Geometric Model of Similarity
•Metric space: M = (D,d)
 –D – domain
 –distance function d(x,y)
•∀x,y,z ∈ D:
 •d(x,y) ≥ 0 – non-negativity
 •d(x,y) = 0 ⇔ x = y – identity
 •d(x,y) = d(y,x) – symmetry
 •d(x,y) ≤ d(x,z) + d(z,y) – triangle inequality

Examples of Distance Functions
•Lp Minkowski distance (for vectors): Lp(x,y) = (Σi |xi − yi|^p)^(1/p)
 •L1 – city-block distance
 •L2 – Euclidean distance
 •L∞ – maximum (infinity) distance
•Edit distance (for strings)
 •minimal number of insertions, deletions and substitutions
 •d(‘application’, ‘applet’) = 6
•Jaccard’s coefficient (for sets A, B): d(A,B) = 1 − |A ∩ B| / |A ∪ B|

Examples of Distance Functions (cont.)
•Mahalanobis distance
 –for vectors with correlated dimensions
•Hausdorff distance
 –for sets whose elements are related by another distance
•Earth mover’s distance
 –primarily for histograms (sets of weighted features)
•and many others
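To make the distance functions above concrete, here is a minimal Python sketch (not part of the original slides) of the Minkowski, edit and Jaccard distances; the sample vectors and strings are illustrative only.

```python
# Minimal sketch: a few of the metric distance functions listed above.
import math

def minkowski(x, y, p):
    """L_p distance; p=1 city-block, p=2 Euclidean, p=float('inf') maximum."""
    if p == float('inf'):
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def edit_distance(s, t):
    """Minimal number of insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for sets A, B."""
    return 1.0 - len(a & b) / len(a | b)

if __name__ == "__main__":
    x, y, z = (1.0, 2.0), (4.0, 6.0), (0.0, 5.0)
    assert minkowski(x, y, 2) <= minkowski(x, z, 2) + minkowski(z, y, 2)  # triangle inequality
    print(minkowski(x, y, 1), minkowski(x, y, 2), minkowski(x, y, float('inf')))
    print(edit_distance("application", "applet"))            # 6, as on the slide
    print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))
```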
Content-Based Search
•Content-based search in images: a query image is compared against an image base by its visual content (illustrated on a sample image collection).

Extracting Features
•Each image is transformed from the image level (R, G, B pixels) to the feature level – a descriptor used for comparison.

MPEG-7
•Multimedia content description standard (~2000)
•Global feature descriptors:
 –color, shape, texture, …
 –one high-dimensional vector per image and feature
 –Minkowski distances used

Visual Similarity
•Local feature descriptors – SIFT, SURF, etc.
•Invariant to image scaling, small viewpoint changes, rotation, noise, and illumination.
•Visual similarity is assessed by finding correspondences between the local features of two images.

Biometrics: Fingerprint
•Minutiae detection:
 –detect ridges (endings and branchings)
 –a fingerprint is represented as a sequence of minutiae
  •P = ( (r1,e1,θ1), …, (rm,em,θm) )
  •point in polar coordinates (r,e) and direction θ
  •points in the sequence are ordered by increasing radial angle (e)
•Matching of two sequences:
 –align the input sequence with the database one
 –compute a weighted edit distance
  •w_ins = w_del = 620
  •w_repl ∈ [0; 26] – depending on the similarity of the two minutiae

Multiple Visual Aspects
•(illustrated on an example image)

Contemporary Approaches to Feature Extraction
•Neural network technology:
 –Convolutional Neural Networks (CNN)
 –Recurrent Neural Networks (RNN)
•Typical pipeline: a classified dataset is split into training and validation data; the network is trained (fine-tuned) on the training data and validated on the validation data, yielding a neural network model.

Convolutional Neural Networks
•Convolutional Neural Network (CNN) – AlexNet
•The last layer has 1,000 output categories.
•The output of any layer can be used as a feature.
 –AlexNet contains 5 convolutional layers and 3 fully connected layers. ReLU is applied after every convolutional and fully connected layer. Dropout is applied before the first and the second fully connected layer.
 –A convolutional layer is a feature detector that automatically learns to filter out unneeded information from the input by using convolutional kernels.
 –Pooling layers compute the max or average value of a particular feature over a region of the input data (downsizing the input), help to detect objects in unusual places, and reduce memory size.
 –The network has 62.3 million parameters (and 650,000 neurons) and needs 1.1 billion computation units in a forward pass. The convolutional layers account for about 6% of all the parameters but consume about 95% of the computation.
 –It takes five to six days to train on two GTX 580 3GB GPUs.
 –The network uses ReLU instead of Tanh to add non-linearity; this accelerates training about 6 times at the same accuracy.
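A minimal sketch of the “output of any layer can be used as a feature” idea, assuming PyTorch and torchvision are installed (the AlexNet_Weights API requires a recent torchvision); the file names in the usage comment are placeholders, and this is only an illustration, not the extraction pipeline used in the experiments reported later.

```python
# Minimal sketch: use the output of AlexNet's penultimate layer as a
# 4096-dimensional feature vector and compare images by Euclidean (L2) distance.
import torch
import torchvision
from PIL import Image

weights = torchvision.models.AlexNet_Weights.DEFAULT
model = torchvision.models.alexnet(weights=weights)
model.classifier = torch.nn.Sequential(*list(model.classifier.children())[:-1])  # drop the 1000-class layer
model.eval()
preprocess = weights.transforms()  # resizing, cropping and normalization expected by the model

def extract_feature(path: str) -> torch.Tensor:
    """Return a 4096-dim descriptor for the image stored at `path` (placeholder path)."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model(img).squeeze(0)

# Usage (file names are placeholders):
# f1, f2 = extract_feature("query.jpg"), extract_feature("candidate.jpg")
# print(torch.dist(f1, f2, p=2))   # smaller distance ~ visually more similar
```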
Recurrent Neural Networks
•Recurrent Neural Networks (RNN)
•Long Short-Term Memory (LSTM) networks:
 –learn when data should be remembered and when it should be thrown away
 –provide operations for reading, writing and resetting the memory
 –well suited to learn from experience to classify, process and predict time series when there are very long time lags of unknown size between important events

Deep Learning Summary
•It is no magic – just statistics in a black box, but exceptionally effective at learning patterns, provided a powerful computational infrastructure (GPU cards) can be applied.
•It can be used not only for classification but is also able to provide content-preserving feature vectors.
•When compared by an Lp distance, such vectors give good-quality similarity estimates.
•Machines that learn to represent the world from experience.

Similarity Search Problem
•For X ⊆ D in a metric space M, pre-process X so that similarity queries are executed efficiently.
•Implementation problems:
 –how to partition the data to reduce the search space
 –how to ask questions – definition of queries
 –how to execute queries – to achieve performance
•The challenge: in a metric space, no total ordering exists!

Basic Partitioning Principles
•Given a set X ⊆ D in M = (D,d), two basic partitioning principles have been defined:
 –ball partitioning
 –generalized hyper-plane partitioning
•Note: some special cases, such as Euclidean or supermetric spaces, are more tractable.

Ball Partitioning
•Inner set: { x ∈ X | d(p,x) ≤ dm }
•Outer set: { x ∈ X | d(p,x) > dm }

Generalized Hyper-plane Partitioning
•{ x ∈ X | d(p1,x) ≤ d(p2,x) }
•{ x ∈ X | d(p1,x) > d(p2,x) }

Similarity Range Query
•range query
 –R(q,r) = { x ∈ X | d(q,x) ≤ r }
•… all museums up to 2 km from my hotel …

Nearest Neighbor Query
•the nearest neighbor query
 –NN(q) = x, x ∈ X, ∀y ∈ X: d(q,x) ≤ d(q,y)
•k-nearest neighbor query
 –k-NN(q,k) = A
 –A ⊆ X, |A| = k
 –∀x ∈ A, ∀y ∈ X – A: d(q,x) ≤ d(q,y)
•… five closest museums to my hotel …

Reverse Nearest Neighbor Query
•… all hotels with a specific museum as the nearest cultural heritage site …
•Example of 2-RNN: objects o4, o5, and o6 have q among their two nearest neighbors.

Similarity Joins
•similarity join of two data sets X, Y:
 –J(X,Y,μ) = { (x,y) ∈ X × Y | d(x,y) ≤ μ }
•similarity self join ⇔ X = Y
•… pairs of hotels and museums which are a five-minute walk apart …

Pivot Filtering
•Idea: given R(q,r), use the triangle inequality for pruning.
•All distances between the objects and a pivot p are known.
•Prune object o ∈ X if either of the following holds:
 –d(p,o) < d(p,q) – r
 –d(p,o) > d(p,q) + r
•A generalized concept of the object-pivot constraint.

Pivot Filtering (cont.)
•Filtering with two pivots: only the objects in the intersection region (dark blue in the figure) have to be checked.
•Effectiveness is improved by using more pivots.

Pivot Filtering (summary)
•Given a metric space M = (D,d) and a set of pivots P = { p1, p2, p3, …, pn },
 we define a mapping function Ψ: (D,d) → (Rⁿ, L∞) as follows:
 Ψ(o) = (d(o,p1), …, d(o,pn))
 Then we can bound the distance d(q,o) from below:
 L∞(Ψ(o), Ψ(q)) ≤ d(q,o)
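A minimal sketch of pivot filtering as summarized above: an object is pruned whenever the L∞ lower bound in the pivot space already exceeds the query radius, so the expensive distance is never computed for it. The toy 2-D data and the pivot choice are illustrative assumptions.

```python
# Minimal sketch: pivot filtering for a range query R(q, r).
# Psi(o) = (d(o,p1), ..., d(o,pn)); since L_inf(Psi(o), Psi(q)) <= d(q,o),
# any object with L_inf(Psi(o), Psi(q)) > r can be discarded safely.
import math

def euclidean(x, y):
    return math.dist(x, y)   # the "expensive" metric distance d

def psi(o, pivots, d=euclidean):
    return [d(o, p) for p in pivots]

def range_query(q, r, data, pivots, d=euclidean):
    """Return all o in data with d(q, o) <= r, plus the number of d evaluations."""
    psi_q = psi(q, pivots, d)
    psi_data = {o: psi(o, pivots, d) for o in data}   # assumed precomputed at indexing time
    result, computed = [], 0
    for o, psi_o in psi_data.items():
        lower_bound = max(abs(a - b) for a, b in zip(psi_o, psi_q))  # L_inf in the pivot space
        if lower_bound > r:
            continue                      # safely pruned, d(q, o) not evaluated
        computed += 1
        if d(q, o) <= r:
            result.append(o)
    return result, computed

# Usage on toy 2-D points (pivots chosen arbitrarily for illustration):
data = [(0, 0), (1, 1), (5, 5), (9, 1), (2, 8), (6, 7)]
pivots = [(0, 0), (10, 10)]
print(range_query((1, 0), 2.0, data, pivots))
```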
Angles Formed by Triplets of Descriptors
•Any 3 metric objects can be isometrically embedded into the 2D Euclidean plane to form a triangle.
•The angles within the triangle are then defined (by the cosine rule).
•Angles formed by real-life descriptors come from a limited range.

The M-tree [Ciaccia, Patella, Zezula, VLDB 1997]
•1) Paged organization
•2) Dynamic
•3) Suitable for arbitrary metric spaces
•4) I/O and CPU optimization – computing d can be time-consuming

The M-tree Idea
•Depending on the metric, the “shape” of the index regions changes:
 –L2 (Euclidean), L1 (city-block), L∞ (max-metric), weighted Euclidean, quadratic form.

M-tree: Example
•(example tree over objects o1–o11: internal entries store a pivot, its covering radius and the distance to the parent pivot; leaf entries store the objects and their distances to the parent pivot)

M-tree: Range Search
•Given R(q,r):
•Traverse the tree in a depth-first manner.
•In an internal node, for each entry ⟨p, rc, d(p,pp), ptr⟩:
 –prune the subtree if |d(q,pp) – d(p,pp)| – rc > r
 –an application of the pivot-pivot constraint

M-tree: Range Search (cont.)
•If not discarded, compute d(q,p) and
 –prune the subtree if d(q,p) – rc > r
 –an application of the range-pivot constraint
•All non-pruned entries are searched recursively.

D-Index [Dohnal, Gennaro, Zezula, MTA 2002]
•A multi-level hashing structure; in the example: 4 separable buckets at the first level, 2 separable buckets at the second level, and an exclusion bucket of the whole structure.
•Insertion places each object into exactly one bucket; a range search accesses only the buckets that cannot be excluded for the query region (illustrated in figures).

Pivot Permutation Approach
•Assume a set of n pivots {p1, p2, …, pn}.
•Given an object x in X, order the pivots according to d(x, pi).
•Let Πx be a permutation on the set of pivot indexes {1, …, n} such that Πx(j) is the index of the j-th closest pivot to x
 –e.g., Πx(1) is the index of the closest pivot to x
 –pΠx(j) is the j-th closest pivot to x
•Πx is denoted as the Pivot Permutation (PP) with respect to x.

Pivot Permutation Approach (cont.)
•Can be seen as a recursive Voronoi partitioning to level l.
•Cell C(i1, …, il) contains the objects x for which:
 Πx(1) = i1, Πx(2) = i2, …, Πx(l) = il

Metric Sketches for Fast Searching and Filtering
•Transformation of any metric space (D,d) to the Hamming space:
 –each descriptor o is transformed into a bit-string sketch sk(o) of length λ
 –the Hamming distance counts the number of differing bits – very efficient, a hardware instruction
•Sketches compared by the Hamming distance should approximate the similarity relationships of the original metric.
•Typical sketch lengths are 32 – 320 bits.

Sketching Transformation
•A generalized hyper-plane partitioning (GHP) over a pair of pivots sets one bit of sk(o).
•Further GHP instances set the remaining bits, e.g., bits μ and ν of sk(o).

Properties and Application
•Good sketches (choosing the pivots):
 –balanced splits
 –low correlation between bits
•Application – even better suited for filtering than for searching.
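A minimal sketch of GHP-based sketching and Hamming-distance filtering as described above. The random pivot pairs and the candidate-set size are illustrative assumptions; a real system would select pivot pairs that give balanced, low-correlated bits.

```python
# Minimal sketch: bit-string sketches built by generalized hyper-plane
# partitioning (GHP) and used as a Hamming-distance filter. Each bit i is
# defined by a pivot pair (a_i, b_i): the bit is 1 iff the object is closer
# to a_i than to b_i.
import math
import random

def d(x, y):
    return math.dist(x, y)          # the original (expensive) metric distance

def make_sketch(o, pivot_pairs):
    bits = 0
    for a, b in pivot_pairs:
        bits = (bits << 1) | (1 if d(o, a) < d(o, b) else 0)
    return bits

def hamming(s1, s2):
    return bin(s1 ^ s2).count("1")  # number of differing bits

random.seed(0)
data = [(random.random(), random.random()) for _ in range(1000)]
pivot_pairs = [(random.choice(data), random.choice(data)) for _ in range(64)]  # 64-bit sketches
sketches = {o: make_sketch(o, pivot_pairs) for o in data}

q = (0.5, 0.5)
sk_q = make_sketch(q, pivot_pairs)
# Filtering: keep only the 50 objects with the smallest Hamming distance to the
# query sketch, then re-rank this candidate set by the original metric distance.
candidates = sorted(data, key=lambda o: hamming(sketches[o], sk_q))[:50]
print(sorted(candidates, key=lambda o: d(q, o))[:10])
```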
Textbooks on Metric Searching Technology
•Hanan Samet: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006.
•P. Zezula, G. Amato, V. Dohnal, and M. Batko: Similarity Search: The Metric Space Approach. Springer, 2005.
 –Teaching material: http://www.nmis.isti.cnr.it/amato/similarity-search-book/

Infrastructure Independence: MESSIF
•Metric Similarity Search Implementation Framework
•Metric space (D,d):
 –vectors: Lp and quadratic-form distances
 –strings: (weighted) edit and protein sequence distances
•Operations: insert, delete, range query, k-NN query, incremental k-NN
•Storage: volatile and persistent memory
•Centralized and distributed index structures, communication network, performance statistics

Similarity Search Demos
•20M images: http://disa.fi.muni.cz/demos/profiset-decaf/
•Fashion: http://disa.fi.muni.cz/twenga/
•Image annotation: http://disa.fi.muni.cz/annotation-ui/
•Fingerprints: http://disa.fi.muni.cz/fingerprints/
•Time series: http://disa.fi.muni.cz/subseq/
•Multi-modal person identification: http://disa.fi.muni.cz/mmpi

Similarity Search in Collections of Faces
•Preprocessing: face detection with several technologies → merge of the detected faces → face description with several technologies (feature vectors) → fused features saved into the DB and indexed by one technique.
•Retrieval: query image → face detection → feature extraction → candidate filtering using the index → candidate faces ranked by the fused features.

Fused Face Detection and Face Matching
•Fused face detection:
 –faces detected by more technologies are taken into account
 –showcase: 3 technologies, compliance of at least two
 –recall / precision (%): OpenCV 55 / 89, Luxand 64 / 83, VeriLook 73 / 83, compliance of at least 2: 64 / 96
•Fused face matching:
 –characteristic features from more technologies are available for each face
 –the similarity of two faces evaluated by each technology is normalized into the interval [0, 1]
 –the normalized value expresses the probability that the faces belong to the same person
 –the highest probability is used to determine the similarity of the faces

Face Matching Results, Relevance Feedback
•The user may improve the results by marking correctly found faces in several iterations.
•(example: a query face with the retrieved faces after the 1st and 2nd feedback iteration)

Search-based Image Annotation
§Keyword-based image retrieval:
 §popular and intuitive
 §needs pictures with text metadata
 §manual annotation is expensive → need for automatic image annotation
§We already have a strong tool – similarity search:
 §for any input image, we can retrieve visually similar images
 §metadata of the similar images can be used to describe the original image
 §→ search-based image annotation
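A minimal sketch of the first step of search-based annotation: candidate keywords are collected from the most similar annotated images and scored by their visual distances. The 1/(1+d) weighting and the pairing of distances with keyword sets are illustrative assumptions; the pipeline described next refines such candidates with a semantic network and ConceptRank.

```python
# Minimal sketch: score candidate keywords gathered from the k most similar
# annotated images; closer images contribute more to a keyword's score.
from collections import defaultdict

def candidate_keywords(similar_images, top=10):
    """similar_images: list of (distance, [keywords]) for the retrieved images."""
    scores = defaultdict(float)
    for dist, keywords in similar_images:
        weight = 1.0 / (1.0 + dist)          # illustrative weighting choice
        for kw in keywords:
            scores[kw.lower()] += weight
    total = sum(scores.values()) or 1.0
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(kw, s / total) for kw, s in ranked[:top]]   # normalized to probabilities

# Usage with keywords and distances similar to the example on the next slide
# (the pairing of distances and keyword sets is illustrative):
retrieved = [
    (0.2, ["yellow", "bloom", "pretty"]),
    (0.5, ["meadow", "outdoors", "dandelion"]),
    (0.6, ["Mary's garden", "summer"]),
]
print(candidate_keywords(retrieved, top=5))
```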
Search-based Annotation Principles
•Content-based image retrieval over an annotated image collection returns similar annotated images with keywords (e.g., yellow, bloom, pretty; meadow, outdoors, dandelion; Mary’s garden, summer) and distances (d = 0.2, 0.5, 0.6).
•Candidate keyword processing with semantic resources yields the final candidate keywords with probabilities, e.g.: plant 0.3, flower 0.3, garden 0.15, sun 0.05, human 0.1, park 0.1.

Example
•Candidate keywords after CBIR: church, architecture, travel, europe, building, religion, germany, buildings, north, churches, christianity, america, religious, exterior, st, historic, world, tourism, united, usa, …
1.Retrieve 100 similar images from the Profiset collection.
2.Merge their keywords and compute frequencies.
3.Build the semantic network using WordNet.
4.Compute the ConceptRank.
5.Apply postprocessing & return the 20 most probable keywords.
•ConceptRank scores: building (2.53), structure (2.41), LANDSCAPE (2.10), BUILDINGS (1.87), OBJECT (1.84), NATURE (1.78), place_of_worship (1.75), church (1.74), Europe (1.68), religion (1.64), continent (1.51), …
•Final keywords: building, structure, church, religion, continent, group, travel, island, sky, architecture, tower, person, belief, locations, chapel, christianity, tourism, regions, country, district
•Semantic network: 4 relationships – hypernym (dog → animal), hyponym (animal → dog), meronym (leaf → tree), holonym (tree → leaf); 270 network nodes, 471 edges.

Annotations in Use
§Participation in the ImageCLEF 2014 Scalable Annotation Challenge
§2nd place, mean average precision of annotation approx. 60%
§Web demo & Mozilla add-on
§Image annotation: http://disa.fi.muni.cz/annotation-ui/

Similarity Search in Streams
▪Two basic approaches to explore data:
 ▪store, pre-process and search later – database processing
 ▪process (filter) continuously – stream processing
▪Examples of stream processing applications:
 ▪surveillance cameras and event detection
 ▪mail streams and spam filters
 ▪publish/subscribe applications

Stream Processing Scenarios
▪Stream: a potentially infinite sequence of data items (d1, d2, …) – tuples, images, frames, etc.
▪Basic scenarios:
 ▪data items processed immediately, possibly skipping data items → minimize delay – e.g., event detection
 ▪process everything as fast as possible, delays are possible → maximize throughput – our focus
▪Motivating examples with similarity searching:
 ▪image annotation – annotate a stream of images collected by a web crawler
 ▪publish/subscribe applications – categorize a stream of documents by similarity searching

Processing Streams of Query Objects
▪Typical large-scale similarity search approach:
 ▪partitioned data stored on a disk
 ▪partition reads from the disk form the bottleneck
▪Idea: similar queries need similar sets of partitions → save disk accesses.
▪Buffer: memory used for reordering (clustering) the queries.
▪Cache: memory containing previously read data partitions.
▪Pipeline: query stream → buffer → metric index (disk + cache) → results. (A simplified code sketch of this scheme follows the experiment results below.)

Experiment Results
▪100,000 processed 10-NN queries; DB: 1 mil. MPEG-7 descriptors; buffer capacity: 8,000 queries; cache size: 40,000 objects (4% of the DB).
▪100,000 processed 10-NN queries; DB: 10 mil. MPEG-7 descriptors; buffer capacity: 10,000 queries; cache size: 90,000 objects (0.9% of the DB).
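A minimal simulation (not the actual implementation behind the results above) of the buffer-and-cache idea: grid cells stand in for data partitions, a bounded cache holds recently read partitions, and greedily reordering the buffered queries by proximity increases cache hits and saves simulated disk reads. The grid partitioning, FIFO-style cache and greedy reordering are simplifying assumptions.

```python
# Minimal sketch: why reordering a buffer of similar queries saves partition reads.
import math
import random
from collections import OrderedDict

random.seed(1)
CELL = 0.1                                   # a "partition" = grid cell of size CELL x CELL

def needed_partitions(q, r):
    """Ids of grid cells intersecting the query ball (conservative bounding box)."""
    x0, x1 = int((q[0] - r) // CELL), int((q[0] + r) // CELL)
    y0, y1 = int((q[1] - r) // CELL), int((q[1] + r) // CELL)
    return {(i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)}

def process(queries, cache_capacity):
    cache, disk_reads = OrderedDict(), 0
    for q in queries:
        for part in needed_partitions(q, 0.05):
            if part in cache:
                cache.move_to_end(part)      # cache hit
            else:
                disk_reads += 1              # simulated partition read from disk
                cache[part] = True
                if len(cache) > cache_capacity:
                    cache.popitem(last=False)
    return disk_reads

def reorder(buffer):
    """Greedy chaining: always continue with the closest not-yet-processed query.
    (A real system would reorder within a bounded buffer, not the whole stream.)"""
    ordered, rest = [buffer[0]], buffer[1:]
    while rest:
        nxt = min(rest, key=lambda q: math.dist(q, ordered[-1]))
        rest.remove(nxt)
        ordered.append(nxt)
    return ordered

stream = [(random.random(), random.random()) for _ in range(2000)]
print("arrival order :", process(stream, cache_capacity=20))
print("buffer reorder:", process(reorder(stream), cache_capacity=20))
```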
Continuous kNN Join
▪A time window over the stream (past → future) is continuously joined with the users’ preferences (query objects), e.g., a 3-NN join.
▪Task: efficiently evaluate the kNN join at any time.
▪Challenge: efficient join computation with a lot of query objects and a lot of stream objects in the time window.
▪General approach: update the join result continuously.
▪Problem: a lot of time-consuming metric distance computations.
▪Approximation technique proposal:
 ▪sketch-based filter
 ▪Hamming distances

Experiments
▪Dataset: Caffe descriptors (4,096-dimensional vectors)
▪1 step: process a new stream object, delete the oldest stream object
▪Each experiment: 1,000 steps, 10-NN join, 200,000 stream objects in the time window
▪Sketch-based approach vs. no sketches: sketches are nearly 13 times faster with almost precise results (less than 0.5% missed objects).

What Is Mocap
•Motion capture data: a digital representation of motion given by series of coordinates of body joints in space-time.
•Complex multi-dimensional spatio-temporal data (3D space, 31 joints, 120 frames per second).
•Visualized by a simplified human skeleton (stick figure); the coordinates of the joints are stored as float numbers.
•1 minute of such motion data ≈ 669,600 float numbers.

Our Approach – Motion Images
•Every single-frame joint configuration is normalized (by centering and rotating) and then transformed into an RGB stripe image while fully preserving the skeleton configuration.
•We focus on body joints, not on a volumetric model or skeleton bones.
•(figure: why normalization is important)

Motion Images + Caffe
1)Effective transformation from (dynamic) motion capture data into (static) images.
2)Extraction of a fixed-size feature vector using content-based image descriptors (a 4,096-dimensional Caffe vector).
3)Indexing for fast and scalable search.
•(example: a “hop-one-leg” motion → motion image → Caffe vector → nearest neighbors retrieved with their distances)
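A minimal sketch of the motion-image transformation described above, assuming NumPy; rotation normalization is omitted for brevity, the coordinate-to-color mapping is an illustrative choice, and random data stands in for a real capture.

```python
# Minimal sketch: turning a mocap sequence into a "motion image".
# Input: array of shape (frames, joints, 3) with 3-D joint coordinates.
# Each frame is centered on a reference joint and x, y, z are mapped to
# R, G, B, producing an image with one row per joint and one column per frame.
import numpy as np

def motion_image(sequence, root_joint=0):
    """sequence: float array (frames, joints, 3) -> uint8 image (joints, frames, 3)."""
    seq = np.asarray(sequence, dtype=np.float64)
    seq = seq - seq[:, root_joint:root_joint + 1, :]      # center every frame on the root joint
    lo, hi = seq.min(), seq.max()
    normalized = (seq - lo) / (hi - lo + 1e-12)            # scale coordinates into [0, 1]
    image = np.rint(normalized * 255).astype(np.uint8)     # x, y, z -> R, G, B
    return image.transpose(1, 0, 2)                        # rows = joints, columns = frames

# Usage on random data standing in for a real capture (31 joints, 120 fps, 1 s):
frames = np.cumsum(np.random.randn(120, 31, 3) * 0.01, axis=0)
img = motion_image(frames)
print(img.shape)        # (31, 120, 3) - ready to be fed to a CNN feature extractor
```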
Demos
•Subsequence search demo: http://disa.fi.muni.cz/mocap-demo/
•Action recognition demo: http://disa.fi.muni.cz/mocap-action-recognition/
•Gait recognition demo: http://disa.fi.muni.cz/mmpi

Speed Climbing
•Olympic sport in Tokyo 2021
•Speed climbing in numbers:
 –15 m vertical wall with a 5° overhang
 –20 big hand holds, 11 small foot holds (standardized layout)
 –5.48 s world record (≈ 10 km/h)
•Speed climbing is suitable for computer-aided motion analysis based on 2D skeleton sequences.

Application: Speed Climbing Analysis
•Performance comparison of two climbers (or of a climber with the world record):
 –green chart – time advantage (who is winning)
 –blue chart – speed difference (who is faster)
 –color clusters – the biggest delay points, where hands and feet stayed for a longer time

Similarity Search in Protein Chains
•Each protein consists of 1 or more subparts – protein chains.
•Approx. 500,000 chains are known – Protein Data Bank (PDB).
•3D models of protein chains are used to define their pairwise similarity.
 ◦The similarity evaluation time strongly depends on the size of the compared chains.
•Model of a protein chain: balls ≈ atoms, sticks ≈ bonds between atoms, green ribbon ≈ a simplification of the main atoms.

SISAP (Similarity Search and Applications)
International conference series (http://sisap.org/):
•2008 Cancun, Mexico; 2009 Prague, Czechia; 2010 Istanbul, Turkey; 2011 Lipari, Italy; 2012 Toronto, Canada; 2013 A Coruña, Spain; 2014 Los Cabos, Mexico; 2015 Glasgow, UK; 2016 Tokyo, Japan; 2017 Munich, Germany; 2018 Lima, Peru; 2019 Newark, NJ, USA; 2020 and 2021 virtual events; 2022 Bologna, Italy.

DISA at Masaryk University
•Laboratory of Data Intensive Systems and Applications: http://disa.fi.muni.cz

Our Vision – Future Research Challenges
•Challenge No. 1 (adaptability): respecting the continuously changing environment – the distance metric, the size of the searched collection, as well as the up-to-date collection of known samples – the search indexing mechanisms must adapt continuously.
•Challenge No. 2 (explainability): respecting the application domain – e.g., human skeleton data – provide explanation tools that can be requested on demand.
•Similarity cracks the code of explainable AI.