Convolutional networks

Convolutional networks – architecture
Notation:
- X is a set of input neurons, Y a set of output neurons, Z a set of all neurons (X, Y ⊆ Z).
- Individual neurons are denoted by indices i, j, etc.
- ξ_j is the inner potential of the neuron j after the computation stops.
- y_j is the output of the neuron j after the computation stops (define y_0 = 1 as the value of the formal unit input).
- w_ji is the weight of the connection from i to j (in particular, w_j0 is the weight of the connection from the formal unit input, i.e. w_j0 = −b_j where b_j is the bias of the neuron j).
- j← is the set of all i such that j is adjacent from i (i.e. there is an arc to j from i).
- j→ is the set of all i such that j is adjacent to i (i.e. there is an arc from j to i).
- [ji] is the set of all connections (i.e. pairs of neurons) sharing the weight w_ji.

Visualization methods
- Visualize weights.
- Visualize the most "important" inputs for a given class.
- Visualize the effect of input perturbations on the output.
- Construct a local "interpretable" model.

AlexNet – filters of the first convolutional layer
64 filters of depth 3 (RGB). The RGB channels of each filter are combined into one RGB image of size 11×11×3.

Maximizing the input
Assume a trained model giving a score for each class given an input image. Denote by y_i(I) the value of the output neuron i on an input image I. Maximize

  y_i(I) − λ‖I‖₂²

over all images I. A maximizing image is computed using gradient ascent (i.e. gradient descent on the negated objective). It gives the most "representative" image for the class of neuron i.

Maximizing the input – example

Image-specific saliency maps
Let us fix an output neuron i and an image I0. Rank the pixels of I0 based on their influence on the value y_i(I0).
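The gradient-ascent procedure for maximizing y_i(I) − λ‖I‖₂² can be sketched at toy scale. A real setting would backpropagate through a trained CNN (e.g. with an autodiff framework); here a fixed linear scorer stands in for y_i so the gradient has a closed form, and the values of w, λ, and the step size are illustrative choices, not values from the slides.

```python
import numpy as np

# Gradient ascent on y_i(I) - lam * ||I||_2^2. The "score" y_i(I) = w . I
# is a hypothetical stand-in for a trained network's output neuron,
# chosen so the gradient is available in closed form.

rng = np.random.default_rng(0)
w = rng.normal(size=64)            # stand-in class-score direction
lam, eta, steps = 0.1, 0.5, 200    # regularization, step size, iterations

def objective(I):
    return w @ I - lam * np.sum(I ** 2)   # y_i(I) - lam * ||I||_2^2

I = np.zeros(64)                   # start from a blank "image"
for _ in range(steps):
    grad = w - 2 * lam * I         # d/dI [ w.I - lam * ||I||_2^2 ]
    I = I + eta * grad             # ascent step

# For this concave quadratic the exact maximizer is w / (2 * lam).
```

For this quadratic stand-in the iteration converges to the exact maximizer w/(2λ); with a real network the objective is non-concave, so the result depends on the starting image and the step size.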
Note that we can approximate y_i locally around I0 by the linear part of its Taylor series:

  y_i(I) ≈ y_i(I0) + wᵀ(I − I0) = wᵀI + (y_i(I0) − wᵀI0),  where w = ∂y_i/∂I (I0)

Heuristic: the magnitude of the derivative indicates which pixels need to be changed the least to affect the score the most.

Saliency maps – example
Quite noisy; the signal is spread out and does not tell much about the perception of the owl.

SmoothGrad
Average several saliency maps computed from noisy copies of the input.

Occlusion
Systematically cover parts of the input image. Observe the effect on the output value. Find the regions with the largest effect.

Occlusion – example

LIME – for images
Let us fix an image I0 to be explained. Outline:
- Consider superpixels of I0 as interpretable components.
- Construct a linear model approximating the network around the image I0, with weights corresponding to the superpixels.
- Select the superpixels whose weights have large magnitude as the important ones.

Superpixels as explainable components
Denote by P_1, …, P_ℓ all superpixels of I0. Consider binary vectors x = (x_1, …, x_ℓ) ∈ {0, 1}^ℓ. Each such vector x determines a "subimage" I0[x] of I0 obtained by removing all P_k with x_k = 0.

LIME
Let us fix an output neuron i and denote by y_i(I) the value of i for a given input image I. Given an image I0 to be interpreted, consider the following training set:

  T = {(x_1, y_i(I0[x_1])), …, (x_p, y_i(I0[x_p]))}

Here x_h = (x_{h1}, …, x_{hℓ}) are (some) binary vectors of {0, 1}^ℓ, e.g. randomly selected.
Train a linear model (ADALINE) with weights w_1, …, w_ℓ on T, minimizing the mean-squared error (plus a regularization term keeping the number of non-zero weights as small as possible).
Intuitively, the linear model approximates the network on the "subimages" of I0 obtained by removing unimportant superpixels. Inspect the weights (magnitude and sign).

LIME – example
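The LIME procedure above can be run at toy scale as a concrete illustration. The "network" below is a hypothetical stand-in black box, the four "superpixels" are fixed quadrants of an 8×8 image rather than a real segmentation, and a small ridge term replaces the sparsity-inducing regularizer from the slides; all names and constants are illustrative, not from the text.

```python
import numpy as np

# Toy LIME: explain a stand-in black-box scorer on an 8x8 "image" whose
# 4 "superpixels" are the 4x4 quadrants (a real pipeline would use a
# segmentation such as SLIC and a sparsity regularizer, not ridge).

rng = np.random.default_rng(1)
I0 = rng.uniform(size=(8, 8))            # image to be explained

# Quadrants as superpixel masks P_1..P_4.
masks = [np.zeros((8, 8), dtype=bool) for _ in range(4)]
masks[0][:4, :4] = masks[1][:4, 4:] = masks[2][4:, :4] = masks[3][4:, 4:] = True

def score(I):
    # Stand-in black box: responds only to the top-left quadrant,
    # so superpixel P_1 should come out as the important one.
    return I[:4, :4].sum()

def subimage(x):
    # I0[x]: remove (zero out) every superpixel P_k with x_k = 0.
    I = I0.copy()
    for k, m in enumerate(masks):
        if x[k] == 0:
            I[m] = 0.0
    return I

# Training set T = {(x_h, y_i(I0[x_h]))} with random binary vectors x_h.
X = rng.integers(0, 2, size=(64, 4)).astype(float)
y = np.array([score(subimage(x)) for x in X])

# Linear model fit by least squares with a small ridge term.
A = np.hstack([X, np.ones((64, 1))])     # last column: bias
w = np.linalg.solve(A.T @ A + 1e-6 * np.eye(5), A.T @ y)

print(int(np.argmax(np.abs(w[:4]))))     # prints 0: P_1 has the largest weight
```

Because the stand-in scorer reads only the top-left quadrant, the fitted weight of P_1 dominates and the remaining weights are near zero, which is exactly the "inspect the weights" step of the slides.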