Image analysis III & 3D Reconstruction
C9940 3-Dimensional Transmission Electron Microscopy
S1007 Doing structural biology with the electron microscope
April 11, 2016

Outline
Image analysis III
  More on last week's material
  Dependence of SNR on √N
  Oversampling
  Classification
3D Reconstruction
  Principles
  Tomography
  Reference-based alignment
  Common lines
  RCT
  CTF-correction
  3D classification

Random walks: Why signal-to-noise improves with √N

The “Drunkard's walk”
Let's conduct an experiment.
[Figure: number line from -4 to +4]
We're going to assume that each step is random and independent of previous steps.
[Figure: the drunkard's position on the number line at t=1 through t=6]

The teetotaler's walk
[Figure: the teetotaler's position on the number line at t=1 through t=4]

Expectation value
The expected distance that “noise” travels increases with √N. However, that is not as fast as the distance that “signal” travels, which increases with N. Thus, as we collect more data, the SNR increases by N/√N = √N.

Random walks: more information
Expectation values and how they relate to resolution criteria

Review: How do we evaluate the quality of a reconstruction?
We split the data set into halves and compare them.
[Figure: images split into an “odd” reconstruction and an “even” reconstruction]

Review: Fourier Shell Correlation (FSC)
[Figure: term 1 from Reconstruction 1 compared with term 2 from Reconstruction 2]
Properties:
- Fourier terms have amplitude + phase.
- Correlation values range from -1 to +1.
- Noise should give an average of 0.
- The comparison is done as a function of spatial frequency (or “resolution”).

Review: Fourier Shell Correlation curve
[Figure: FSC curve with expectation value of noise]
Why does σ vary with spatial frequency?
With small N, behavior is more unpredictable.
One resolution criterion was to compare the FSC to, say, 3*σ.
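
The FSC comparison reviewed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation of any particular package: the function name `fsc`, the cubic-volume assumption, and the integer-radius shell binning are choices made here. The 3σ curve assumes σ ≈ 1/√N, with N the number of Fourier terms in a shell; published variants of the criterion apply additional factors.

```python
import numpy as np

def fsc(vol1, vol2):
    """Fourier Shell Correlation between two cubic volumes (hypothetical helper).

    Returns the correlation per shell and the number of Fourier terms per shell.
    """
    n = vol1.shape[0]
    f1 = np.fft.fftn(vol1)
    f2 = np.fft.fftn(vol2)
    # Integer frequency index along each axis, then the radius of every Fourier term.
    freq = np.fft.fftfreq(n) * n
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2  # shells out to (just below) the Nyquist radius
    curve = np.zeros(n_shells)
    counts = np.zeros(n_shells, dtype=int)
    for s in range(n_shells):
        mask = shell == s
        num = np.real(np.sum(f1[mask] * np.conj(f2[mask])))
        den = np.sqrt(np.sum(np.abs(f1[mask]) ** 2) * np.sum(np.abs(f2[mask]) ** 2))
        curve[s] = num / den if den > 0 else 0.0
        counts[s] = mask.sum()
    return curve, counts

# Two volumes of pure, independent noise: the FSC should scatter around 0,
# with more scatter in the small (low-frequency) shells.
rng = np.random.default_rng(0)
half1 = rng.normal(size=(16, 16, 16))
half2 = rng.normal(size=(16, 16, 16))
curve, counts = fsc(half1, half2)
three_sigma = 3.0 / np.sqrt(counts)  # assumption: sigma ~ 1/sqrt(N) per shell
```

Because the number of Fourier terms per shell grows with radius, `three_sigma` falls with spatial frequency, which is one way to see why σ varies along the curve.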
BUT: The σ value describes the behavior of unaligned noise.

Review: model bias
[Figure: panels for N = 128, 256, 512, 1024, and 2048, alongside the original]
Model bias can yield false correlations. A false correlation in real space is equivalent to a false correlation in Fourier space.

Refinement: classical and “gold standard”
[Figure: images split into an “odd” reconstruction and an “even” reconstruction]
OLD STRATEGY: merge & refine orientations
“GOLD STANDARD”: refinement 1, refinement 2

Different resolution criteria
FSC=0.5, FSC=0.143, FSC=0.333

Sampling: Oversampling an already-sampled image

Shifts: worst-case scenario
[Figure: 4x4 grid of pixels numbered 1-16, shown twice]

Effect of shifts
[Figure: original image and copies shifted by Δx=Δy=0.05 px through 0.45 px in steps of 0.05 px]

Oversampling
[Figure: 4x4 grid (pixels 1-16) resampled onto a 6x6 grid (pixels 1-36)]
Worst-case scenario after oversampling

White noise
[Figure: power spectrum of a 200 px white-noise image, origin to Nyquist frequency at 1/6 Å; after upscaling to 300 px, the old Nyquist frequency stays at 1/6 Å and the new Nyquist frequency is 1/4 Å]
[Figure: image, power spectrum, and power spectrum profile for: original; shifted by (0.5, 0.5); upscaled + shifted]
[Figure: image, power spectrum, and power spectrum profile for: original; rotated by 45º; upscaled + rotated]

You can do a little better by oversampling.
Bammes... Chiu (2012) J. Struct. Biol.
Oversampling: Conclusion

Classification
Reiteration of the problem
[Figure: 8 classes of faces, 64x64 pixels; the same images with noise added]
Before we can average the data, we first should find homogeneous subsets.

Multivariate data analysis (MDA), or multivariate statistical analysis (MSA)
[Figure: 4x4 image with pixels numbered 1-16, and the average]
Our 16-pixel image can be reorganized into a 16-coordinate vector.

MDA: Reconstituted images
Linear combinations of these images will give us approximations of the images that make up the data.
c0 × Average + c1 × Eigenimage #1 + c2 × Eigenimage #2 + c3 × Eigenimage #3 + ...

Phantom images of worm hemoglobin

MDA of worm hemoglobin
[Figure: average, and reconstituted images for +c0 through +c5 and -c0 through -c5]

Classification
How do we categorize/classify the images?

K-means classification
A number K of images are chosen as seeds.
BAD: Some clusters may be overrepresented/underrepresented.

Diday's method of moving centers
We will note the images that always “travel” together, and will call them a class.

Dendrogram
Hierarchical ascendant classification
[Figure: dendrogram with the “images” at its leaves]
All images are represented.
The dendrogram will be too heavily branched to interpret without truncation.

Binary-tree viewer
BAD: Information about the height of the branch is lost.
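
The MDA and K-means steps described above can be sketched with NumPy alone. This is an illustrative outline under stated assumptions, not the procedure of any specific package: the names `eigenimages` and `kmeans` are made up here, the eigenimages are obtained from a singular value decomposition of the mean-subtracted image vectors, and the K-means seeds are simply K randomly chosen images.

```python
import numpy as np

def eigenimages(stack, n_factors=3):
    """stack has shape (n_images, ny, nx). Hypothetical helper.

    Returns the average image, the first n_factors eigenimages,
    and each image's coefficients along those eigenimages.
    """
    n, ny, nx = stack.shape
    vectors = stack.reshape(n, ny * nx)   # each image becomes one long vector
    average = vectors.mean(axis=0)
    centered = vectors - average
    # Rows of vt are orthonormal eigenimages, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eig = vt[:n_factors]
    coeffs = centered @ eig.T             # c1, c2, ... for every image
    return average.reshape(ny, nx), eig.reshape(n_factors, ny, nx), coeffs

def kmeans(points, k, n_iter=50, seed=0):
    """Plain K-means on the coefficient vectors; seeds are K random points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)      # assign each image to the nearest center
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)  # move the center
    return labels

# Reconstituting image i from the average and its coefficients:
#   approx_i = average + sum_j coeffs[i, j] * eigenimage_j
```

Rerunning `kmeans` with different `seed` values and noting which images always end up together is one simple way to imitate the "travel together" idea behind the method of moving centers.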
John O'Brien, 1991, The New Yorker
How do you go from 2D to 3D?

What information do we need for 3D reconstruction?
1. different orientations
2. known orientations
3. many particles

What happens when we're missing views?
[Figure: good; missing views; sparse sampling; sparse sampling + missing views]
Baumeister et al. (1999), Trends in Cell Biol., 9: 81-5.
Your sample isn't guaranteed to adopt different orientations, in which case you may need to explicitly tilt the microscope stage. (more later...)

I have all of this information. Now what?
There are two general categories of 3D reconstruction
1. Real space
2. Fourier space

Reconstruction in real space
We are going to reconstruct a 2D object from 1D projections. The principle is similar to, but simpler than, reconstructing a 3D object from 2D projections.

Projection of our 2D object
Now, project in several directions

Reconstruction is the inversion of projection
The reconstruction doesn't agree well with the projections. What can we do?
(one) ANSWER: Simultaneous Iterative Reconstruction Technique

Simultaneous Iterative Reconstruction Technique
The idea:
1. You compute re-projections of your model.
2. Compare the re-projections to your experimental data. There will be differences.
3. You weight the differences by a fudge factor, λ.
4. You adjust the model by the difference weighted by λ.
5. Repeat.
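
The SIRT loop above can be sketched for the 2D-object, 1D-projection case. The following is a toy version under assumptions made here: projections are taken only along rows, columns, and the two diagonal directions of the pixel grid (so no interpolation is needed), the names `projection_matrix` and `sirt` are hypothetical, and the update is normalized by ray length and by the number of rays through each pixel.

```python
import numpy as np

def projection_matrix(n):
    """Matrix whose rows are rays through an n x n image (hypothetical helper).

    Four projection directions: along rows, columns, and both diagonals.
    """
    rows, cols = np.indices((n, n))
    ray_ids = [rows, cols, rows + cols, rows - cols + n - 1]
    blocks = []
    for ids in ray_ids:
        n_rays = int(ids.max()) + 1
        block = np.zeros((n_rays, n * n))
        block[ids.ravel(), np.arange(n * n)] = 1.0  # pixel p lies on ray ids[p]
        blocks.append(block)
    return np.vstack(blocks)

def sirt(A, measured, lam=0.5, n_iter=200):
    """Iteratively adjust the model so its re-projections match the data."""
    ray_lengths = A.sum(axis=1)     # pixels per ray
    rays_per_pixel = A.sum(axis=0)  # rays through each pixel
    model = np.zeros(A.shape[1])
    for _ in range(n_iter):
        reprojection = A @ model                               # step 1
        difference = (measured - reprojection) / ray_lengths   # step 2
        correction = (A.T @ difference) / rays_per_pixel       # back-project
        model += lam * correction                              # steps 3 and 4
    return model

# Toy example: a small square "object" and its four exact projections.
phantom = np.zeros((8, 8))
phantom[2:5, 3:6] = 1.0
A = projection_matrix(8)
measured = A @ phantom.ravel()
reconstruction = sirt(A, measured).reshape(8, 8)
```

With noisy data, a smaller `lam` trades slower convergence for less noise amplification, which is the role the slides give to λ.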
Simultaneous Iterative Reconstruction Technique
[Figure: model and experimental projection]
Here, the differences (which will be down-weighted by λ) are the ripples in the background.
If we didn't down-weight by λ, we would overcompensate, and would amplify noise.

Reconstruction in Fourier space

Projection theorem (or Central Section Theorem)
A central section through the 3D Fourier transform is the Fourier transform of the projection in that direction.
The disadvantage is that you have to resample your central sections from polar coordinates to Cartesian space, i.e. interpolate.
There are new methods to better interpolate in Fourier space.

Converting from polar to Cartesian coordinates
[Figure: polar sampling points on X and Y axes]
SOLUTION: The sampling density falls off as 1/radius, so a simple weighting scheme is to weight each Fourier term in proportion to its radius: r* weighting, or “r-weighted backprojection”

Going from 2D to 3D
If you know the orientation angles for each image, you can compute a back-projection.
Adapted from Pawel Penczek
How do we determine the last two Euler angles?

Parameters required for 3D reconstruction
Two translational:
- Δx
- Δy
Three orientational (Euler angles):
- phi (about z axis)
- theta (about y)
- psi (about new z)
[Figure annotations: “These are determined in 2D.” “These are determined in 3D.”]
http://www.wadsworth.org

Going from 2D to 3D
[Figure: projections labeled with angles (Φ1, θ1) through (Φ7, θ7)]

Tomography
We have:
- known orientations
- different views
From Ken Downing

BUT...
[Figure: “bubbling”; panels labeled 10, 20, 30, 40, and 40 e-/Å²]
Baker et al. (1999) Microbiol. Mol. Biol. Rev. 63: 862
We are destroying the sample as we image it.
What happens when we image the sample?
From Ken Downing

Consequences of repeated exposure
- Accumulated beam damage
- If number of views is limited, then distortions

Solution: “Single-particle reconstruction”
If we have many identical molecules, and if we can determine the orientations, we can use one exposure per molecule and use these images in the reconstruction.
BUT: Unlike in the tomographic case, we don't know how the orientations between the different images are related.

Reference-based alignment
Step 1: Generation of projections of the reference.
From Penczek et al. (1994), Ultramicroscopy 53: 251-70.
You will record the direction of projection (the Euler angles), such that if you encounter an experimental image that resembles a reference projection, you will assign that reference projection's Euler angles to the experimental image.
Assumption: reference is similar enough to the sample that it can be used to determine orientation.
The model (The extra features helped determine handedness in noisy reconstructions.)

Reference-based alignment
Steps:
1. Compare the experimental image to all of the reference projections.
2. Find the reference projection with which the experimental image matches best.
3. Assign the Euler angles of that reference projection to the experimental image.
From Penczek et al. (1994), Ultramicroscopy 53: 251-70.
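
The three steps above amount to a best-match search, sketched below. This is a simplified illustration: the function names are hypothetical, the similarity measure is a normalized cross-correlation coefficient, and the search over in-plane rotation and shift that a real alignment would also perform is left out.

```python
import numpy as np

def normalized_cc(a, b):
    """Normalized cross-correlation coefficient of two same-sized images."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

def assign_euler_angles(image, reference_projections, reference_angles):
    """Steps 1-3: compare to every reference, keep the best, copy its angles.

    reference_angles[i] holds the Euler angles recorded when
    reference_projections[i] was generated from the reference.
    """
    scores = [normalized_cc(image, ref) for ref in reference_projections]
    best = int(np.argmax(scores))
    return reference_angles[best], scores[best]

# Toy example with random "projections" standing in for real ones.
rng = np.random.default_rng(0)
references = rng.normal(size=(5, 16, 16))
angles = [(0.0, 0.0), (0.0, 30.0), (90.0, 30.0), (0.0, 60.0), (90.0, 60.0)]
experimental = references[3] + 0.5 * rng.normal(size=(16, 16))
assigned, score = assign_euler_angles(experimental, references, angles)
```

The returned score can also be kept per particle, for example to discard images that match no reference projection well.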
Common lines (or Angular Reconstitution)
Summary:
- A central section through the 3D Fourier transform is the Fourier transform of the projection in that direction.
- Two central sections will intersect along a line through the origin of the 3D Fourier transform.
- With two central sections, there is still one degree of freedom to relate the orientations, but a third projection (i.e., central section) will fix the relative orientations of all three.
Frank, J. (2006) 3D Electron Microscopy of Macromolecular Assemblies
From Steve Fuller

Common lines: Problems
- Noise can lead to incorrect angles
- Symmetry helps
- Handedness cannot be determined without additional information (tilting, α-helices)
- Assumes conformational homogeneity

Random-conical tilt: Determination of Euler angles
Two images are taken: one at 0° and one tilted at an angle of 45°.
This scenario describes a worst case, when there is exactly one orientation in the 0º image. Since the in-plane angle varies, in the tilted image, we have different views available.
From Nicolas Boisset
[Figure: 0° and 45° images with particles numbered 1-8]
Radermacher, M., Wagenknecht, T., Verschoor, A. & Frank, J. Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit of Escherichia coli. J Microsc 146, 113-36 (1987).
From Nicolas Boisset

Random-conical tilt: Geometry
[Figure: particles numbered 1-8 arranged on a cone]
One problem though: We can't tilt the stage all the way to 90 degrees.

Review: Projection theorem
Representation of the distribution of views, if we display a plane perpendicular to each projection direction

Random-conical tilt: The “missing cone”
The missing information, in the shape of a cone, elongates features in the direction of the cone's axis.
From Nicolas Boisset

Random-conical tilt: Filling the missing cone
[Figure: reconstructions and their distributions of orientations, combined (+, =)]
If there are multiple preferred orientations, or if there is symmetry that fills the missing cone, you can cover all orientations.
From Nicolas Boisset

Phantom images of worm hemoglobin
We compute a separate reconstruction for each class.
IF the classes simply correspond to different orientations, you can combine them, and boost the signal-to-noise.

Helicase G40P
If the classes correspond to different conformations, then you have to keep them as separate reconstructions.

More properties of Fourier transforms: Convolutions

Why might two images in a data set look different?
different sample         -> better biochemistry
different magnification  -> better microscopy
different illumination   -> normalization
different orientations   -> determine angles
different defocus        -> CTF correction
different conformations  -> Classification

Molecule: g(x)
Lattice: f(x)
Set a molecule down at every lattice point.
Notation: f(x)•g(x)
Convolution of a molecule with a lattice generates a crystal.
Adapted from David DeRosier
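
Setting a molecule down at every lattice point can be tried numerically in one dimension. The sketch below relies on the convolution theorem (a convolution in real space is a product of the two Fourier transforms); the array sizes, the three-pixel "molecule", and the function name `convolve_fft` are arbitrary choices made for illustration, and the convolution is circular because the discrete Fourier transform is periodic.

```python
import numpy as np

def convolve_fft(f, g):
    """Circular convolution of two real 1D arrays via their Fourier transforms."""
    return np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

# Lattice f(x): a point every 8 pixels.
lattice = np.zeros(32)
lattice[::8] = 1.0

# Molecule g(x): a small three-pixel motif at the origin.
molecule = np.zeros(32)
molecule[:3] = [1.0, 2.0, 1.0]

# Crystal: one copy of the molecule at every lattice point.
crystal = convolve_fft(lattice, molecule)
```

Printing `np.round(crystal, 3)` shows the motif repeated at each lattice position.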
Convolution in real life
Lattice: f(x) (http://www.photos-public-domain.com)
Set a molecule down at every lattice point. (http://www.symbolicmessengers.com)
Molecule: g(x) (http://en.wikipedia.org)
Notation: f(x)•g(x)

Cross-correlation vs. convolution
Complex conjugate: If a Fourier coefficient F(X) has the form a + bi, the complex conjugate F*(X) has the form a - bi.
Cross-correlation: F*(X) G(X)
Convolution: F(X) G(X)
Remember:
f(x), g(x) are real-space functions
F(X), G(X) are Fourier-space functions

[Figure: original f(x) and its 2D power spectrum F(X); CTF G(X) with its 1D profile; the product F(X) G(X), corresponding to f(x)•g(x); point spread function g(x), also shown zoomed]
An ideal point spread function would be an infinitely-sharp point.

[Figure: power-spectrum profiles]
Red: Power-spectrum profile calculated from experimental image
Green: Fitted, theoretical power-spectrum profile
Blue: Phase-only correction profile

Defocus groups: CTF correction in 3D
- Assign micrographs to defocus groups
- Separate reconstruction for each defocus group

CTF-correction of micrographs in 2D
- CTF-correct each micrograph

Why might two images in a data set look different?
different molecule       -> better biochemistry
different magnification  -> better microscopy
different illumination   -> normalization
different orientations   -> determine angles
different defocus        -> CTF correction
different conformations  -> Classification

Classification: Reference-based classification vs. Maximum likelihood (ML3D)
Reference-based classification:
• Possible conformations must be known.
• The combination of parameters (shift, rotation, class) is chosen from the highest correlation value.
• Possible reference bias
ML3D:
• Possible conformations are not known.
• The probability of the occurrence of the parameters (shift, rotation, class) is maximized.
• Random, data-dependent
RELION is a variation of maximum likelihood.

Seeding ML3D classification
[Figure: images]
We split the data set into K classes at random.
There will be slight differences in the reconstructions.
We will iteratively maximize the likelihood of a particle belonging to a particular class.
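
The seeding-and-iteration idea can be illustrated with a heavily simplified stand-in. The sketch below is not ML3D or RELION: it classifies already-aligned 2D images (as flattened vectors), ignores shifts and rotations, performs no 3D reconstruction, and assumes Gaussian noise with a known standard deviation `sigma`. The function name `ml_classify` is hypothetical.

```python
import numpy as np

def ml_classify(images, k, sigma, n_iter=30, seed=0):
    """Toy maximum-likelihood classification of aligned image vectors.

    images: array of shape (n_images, n_pixels).
    Returns class-membership probabilities (n_images, k) and class averages.
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    # Seeding: split the data set into K classes at random.
    labels = rng.permutation(n) % k
    prob = np.eye(k)[labels]  # start from hard memberships
    refs = None
    for _ in range(n_iter):
        # Class averages, each image weighted by its membership probability.
        weights = prob.sum(axis=0) + 1e-12
        refs = (prob.T @ images) / weights[:, None]
        # Log-likelihood of each image under each class (Gaussian noise model).
        d2 = ((images[:, None, :] - refs[None, :, :]) ** 2).sum(axis=2)
        log_p = -d2 / (2.0 * sigma**2)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        prob = np.exp(log_p)
        prob /= prob.sum(axis=1, keepdims=True)
    return prob, refs
```

Right after seeding, the class averages differ only slightly; those slight differences are what the iterations amplify. Each image contributes to every class average in proportion to its probability, rather than being assigned outright to the best-correlating reference.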