Image analysis III & 3D Reconstruction C9940 3-Dimensional Transmission Electron Microscopy S1007 Doing structural biology with the electron microscope March 23, 2015 Outline Image analysis III Still more Fourier transforms Convolution Step function Power spectrum Friedel's Law More orientation alignment More interpolation Classification 3D Reconstruction Principles Reference-based alignment RCT Current events: Convolutions Molecule = f(x) lattice = l(x) Set a molecule down at every lattice point. Notation: f(x)*l(x) From David DeRosier Convolution of a molecule with a lattice generates a crystal. lattice = l(x) (http://www.photos-public-domain.com) Set a molecule down at every lattice point. http://www.symbolicmessengers.com Molecule = f(x) http://en.wikipedia.org Convolution of a molecule with a lattice generates a crystal. Notation: f(x)*l(x) Cross-correlation vs. convolution Complex conjugate: If a Fourier coefficient F(X) has the form: a + bi The complex conjugate F*(X) has the form: a - bi Cross-correlation: F*(X) G(X) Convolution: F(X) G(X) original 2D power spectrum G(X) CTF 1D profile f(x) F(X) F(X) G(X) f(x) g(x) G(X) g(x) Point spread function g(x) zoomed An ideal point spread function would be an infinitely-sharp point. Defocus groups Defocus groups Step function revisited Fourier transforms: plot of step function The higher the spatial frequencies (i.e., higher resolution) that are included, the more faithful the representation of the original function will be. http://cnx.org The power spectrum is the a real (as opposed to complex) map of the amplitudes of the Fourier transform Image f(x) Image g(x) F.T. F*(X) (complex conjugate) F.T. G(X) x = CCF The position of the peak gives us the shifts that give the best match, e.g., (8,-6). (8,-6) It's more difficult to plot a 2D F.T. showing both amplitude and phase. Fourier transform of a 2D crystal h k Amp Phase 0 0 500 0 1 -1 40 45 1 0 50 5 1 1 30 5 2 -2 2 54 2 -1 4 57 Powerspectrum QUESTION: Why did I not list the Fourier data where h was negative? h k Amp Phase 0 0 500 0 1 -1 40 45 1 0 50 5 1 1 30 5 2 -2 2 54 2 -1 4 57 If the complex part of f(x) is zero, then F(-X) = F*(X) where * indicates the complex conjugate. USE: Thus, centrosymmetrically related reflections have the same amplitude but opposite phases (Friedel’s law). When calculating a transform of an image, one only has to calculate half of it. The other half is related by Friedel’s law. Friedel's Law From David DeRosier AmplitudePhase 2D Fourier transform of a helix More orientational alignment Orientation alignment Image 1 Image 2 We take a series of rings from each image, unravel them, and compute a series of 1D cross-correlation functions. Shifts along these unraveled CCFs is equivalent to a rotation in Cartesian space. Orientation alignment Image 1 Image 2 radius 1 radius 2 radius 3 radius 4 radius 1 radius 2 radius 3 radius 4 0 360 0 360 Noiseadded Reference image 0 3600 360 Polar representation Orientation alignment Image 1 Image 2 radius 1 radius 2 radius 3 radius 4 radius 1 radius 2 radius 3 radius 4 0 360 0 360 Orientation alignment radius 1 radius 2 radius 3 radius 4 0 360 0 360 Orientation alignment: After rotation Another strategy for translation and orientation alignment Set of all new shifts of up to 2 pixels Set of all shifts of up to 1 pixel Translational and orientation alignment are interdependent Shifts of (0, +/-1, +/-2) pixels results in 25 orientation searches. The power spectrum is translationally invariant. If we shift the object in real space, the power spectrum is unchanged. Cross-correlation function (CCF) Image f(x) Image g(x) F.T. F*(X) (complex conjugate) F.T. G(X) x = CCF The position of the peak gives us the shifts that give the best match, e.g., (8,-6). (8,-6) Cross-correlation function (CCF) Image f(x) Image g(x) F.T. F*(X) (complex conjugate) F.T. G(X) x = Series of 1D CCFs Problems: 1. The power spectrum of a roughly round object is roughly round. 2. The amplitudes fall off quickly, so you don't have many rings of useful data. More interpolation Bammes... Chiu (2012) J. Struct. Biol. Shifts 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Suppose we shift the image in x & y. The new pixels will be weighted averages of the old pixels. original Δx=Δy=0.05px Δx=Δy=0.10px Δx=Δy=0.15px Δx=Δy=0.20px Δx=Δy=0.25px Δx=Δy=0.30px Δx=Δy=0.35px Δx=Δy=0.40px Δx=Δy=0.45px Effect of shifts 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 1 2 43 5 6 87 9 10 1211 13 14 1615 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 1 2 43 5 6 87 9 10 1211 13 14 1615 Questions 1) If the pixel size is 3 Å/px, what is the Nyquist frequency? ANSWER: 1/6Å 2) If we oversample/upscale the image by a factor of 1.5X, what is the new pixel size? ANSWER: 2 Å/px 3) What will be the new Nyquist frequency in the oversampled image? ANSWER: 1/4Å White noise Power spectrum Upscaled origin Nyquist frequency spatial frequencyspatial frequency 1/6Å origin Old Nyquist frequency spatial frequencyspatial frequency 1/6Å New Nyquist frequency 1/4Å Image Power spectrum Power spectrum profile OriginalShiftedby(0.5,0.5)Interpolated + Shifted 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 1 2 43 5 6 87 9 10 1211 13 14 1615 1 2 43 5 6 87 9 10 1211 13 14 1615 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 Image Power spectrum Power spectrum profile OriginalRotatedby45ºInterpolated + Rotated Conclusion: You can do a little better by oversampling. Bammes... Chiu (2012) J. Struct. Biol. Classification 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Multivariate data analysis (MDA) 1 2 3 4 9 10 11 125 6 7 8 13 14 15 16 Multivariate data analysis (MDA), or Multivariate statistical analysis (MSA) Our 16-pixel image can be reorganized into a 16-coordinate vector. MDA: An example 8 classes of faces, 64x64 pixels With noise added From http://spider.wadsworth.org/spider_doc/spider/docs/techs/classification/tutorial.html Average: Principal component analysis (PCA) or Correspondence analysis (CA) For a 4096-pixel image, we will have a 4096x4096 covariance matrix. Row-reduction of the covariance matrix gives us “eigenvectors.” • The eigenvectors describe correlated variations in the data. • The eigenvectors have 64 elements and can be converted back into images, called “eigenimages.” • The first eigenvectors will account for the most variation. The later eigenvectors may only describe noise. • Linear combinations of these images will give us approximations of the classes that make up the data. Eigenimages Reconstituted images Linear combinations of these images will give us approximations of the classes that make up the data. Average Eigenimage #1 Eigenimage #2 Eigenimage #3 c0 + c1 + c2 + c3 + ... Another example: worm hemoglobin Phantom images of worm hemoglobin PCA of worm hemoglobin Average: +c0 -c0 +c1 +c2 +c3 +c4 +c5 -c1 -c2 -c3 -c4 -c5 1 2 3 4 9 10 11 125 6 7 8 13 14 15 16 Classification How do we categorize/classify the images? K-means classification Diday's method of moving centers Diday's method of moving centers Diday's method of moving centers Diday's method of moving centers Dendrogram Dendrogram 1 2 3 4 9 10 11 125 6 7 8 13 14 15 16 Hierarchical Ascendant Classification Hierarchical Ascendant Classification All images are represented. The dendrogram will be too heavily branched to interpret without truncation. Binary-tree viewer 3D Reconstruction What information do we need for 3D reconstruction? 1. different orientations 2. known orientations 3. many particles Baumeister et al. (1999), Trends in Cell Biol., 9: 81-5. good missing views sparse sampling sparse sampling + missing views What happens when we're missing views? Your sample isn't guaranteed to adopt different orientations, in which case you many need to explicitly tilt the microscope stage. (more later...) Why do we need orientation? aligned images 1-4 of 4096 total unaligned images 1-4 of 4096 total This is a simple 2D case, but the effects are analogous in 3D. n=1 n=4 n=16 n=256 n=1024 n=4096 Signal-to-noise ratio increases with √n What happens as we include more particles? What information do we need for 3D reconstruction? 1. different orientations 2. known orientations 3. many particles I have all of this information. Now what? There are two general categories of 3D reconstruction 1. Real space 2. Fourier space Reconstruction in real space We are going to reconstruct a 2D object from 1D projections. The principle is the similar to, but simpler than, reconstructing a 3D object from 2D projections. Projection of our 2D object Reconstruction is the inversion of projection Reconstruction is the inversion of projection Reconstruction is the inversion of projection Reconstruction is the inversion of projection Reconstruction is the inversion of projection Baumeister et al. (1999), Trends in Cell Biol., 9: 81-5. good missing views sparse sampling sparse sampling + missing views What happens when we're missing views? Simultaneous Iterative Reconstruction Technique The reconstruction doesn't agree well with the projections. What can we do? (one) ANSWER: Simultaneous Iterative Reconstruction Technique Simultaneous Iterative Reconstruction Technique The idea: You compute re-projections of your model. Compare the re-projections to your experimental data. There will be differences. You weight the differences by a fudge factor, λ. You adjust the model by the difference weighted by λ. Repeat. Simultaneous Iterative Reconstruction Technique Here, the differences (which will be down-weighted by λ) are the ripples in the background. If we didn't down-weight by λ, we would over compensate, and would amplify noise. ModelExperimental projection Reconstruction in Fourier space Projection theorem (or Central Section Theorem) A central section through the 3D Fourier transform is the Fourier transform of the projection in that direction. Projection theorem (or Central Section Theorem) The disadvantage is that you have To resample your central sections from polar coordinates to Cartesian space, i.e. interpolate. There are new methods to better Interpolate in Fourier space. Reference-based alignment (or projection-matching) Reference-based alignment Step 1: Generation of projections of the reference. From Penczek et al. (1994), Ultramicroscopy 53: 251-70. You will record the direction of projection (the Euler angles), such that if you encounter an experimental image that resembles a reference projection, you will assign that reference projection's Euler angles to the experimental image. Assumption: reference is similar enough to the sample that it can be used to determine orientation. The model (The extra features helped determine handedness in noisy reconstructions.) Reference-based alignment Steps: 1. Compare the experimental image to all of the reference projections. 2. Find the reference projection with which the experimental image matches best. 3. Assign the Euler angles of that reference projection to the experimental image. From Penczek et al. (1994), Ultramicroscopy 53: 251-70. Random conical tilt Binary-tree viewer Random-conical tilt: Filling the missing cone Filling the missing cone + = + = Reconstruction Distribution of orientation From Nicolas Boisset If there are multiple preferred orientations, or if there is symmetry that fills the missing cone, you can cover all orientations. Top view Side view 3D classification Classification: Multi-reference alignment vs. Maximum likelihood (ML3D) Multi-reference alignment: • Possible conformations must be known. • The combination of parameters (shift, rotation, class) is chosen from the highest correlation value. ML3D • Possible conformations are not known. • The probability of the occurrence of the parameters (shift, rotation, class) is maximized. Seeding ML3D classification There will be slight differences in the reconstructions. We will iteratively maximize the likelihood of a particle belonging to a particular class. images We split the data set into K classes at random. There isn't an unambiguous 3D structure if there's only one view. John O'Brien, 1991, The New Yorker What information do we need for 3D reconstruction? 1. different orientations 2. known orientations 3. many particles 4. identical particles