Image analysis III & 3D Reconstruction
C9940 3-Dimensional Transmission Electron Microscopy
S1007 Doing structural biology with the electron microscope
April 11, 2016
Outline
Image analysis III
More on last week's material
Dependence of SNR on √N
Oversampling
Classification
3D Reconstruction
Principles
Tomography
Reference-based alignment
Common lines
RCT
CTF-correction
3D classification
Random walks:
Why signal-to-noise improves with √N
The “Drunkard's walk”
Let's conduct an experiment.
[Figure: number line from -4 to +4]
We're going to assume that each step is random and independent of previous steps.
[Figure: position of the walker on the number line (-4 to +4) at t = 1 through t = 6]
The teetotaler's walk
[Figure: position of the walker on the number line (-4 to +4) at t = 1 through t = 4]
Expectation value
The expected distance that “noise” travels increases as √N.
The distance that “signal” travels increases faster, as N.
Thus, as we collect more data, the SNR increases as N/√N = √N.
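The √N behavior can be checked numerically. The following is a minimal sketch, assuming an unbiased walk with steps of ±1; the function names and the number of simulated walkers are illustrative choices, not part of the lecture.

```python
import math
import random

def rms_displacement(n_steps, n_walkers=2000, seed=0):
    """Root-mean-square end position of unbiased +/-1 random walks."""
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(n_walkers):
        # Each step is random and independent of the previous steps.
        position = sum(rng.choice((-1, 1)) for _ in range(n_steps))
        total_sq += position ** 2
    return math.sqrt(total_sq / n_walkers)

def snr_gain(n_steps):
    """Signal travels n_steps; noise travels about sqrt(n_steps)."""
    signal = n_steps                # the teetotaler: every step in one direction
    noise = math.sqrt(n_steps)      # expected RMS distance of the drunkard
    return signal / noise

for n in (4, 16, 64, 256):
    # Columns: N, simulated RMS distance, sqrt(N), N / sqrt(N)
    print(n, round(rms_displacement(n), 2), round(math.sqrt(n), 2), snr_gain(n))
```

The simulated RMS distance should track √N closely, while the last column grows as √N.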
Random walks: more information
Expectation values
and how they relate to resolution criteria
images
“odd” reconstruction
“even” reconstruction
We split the data set into halves and compare them.
Review:
How do we evaluate the quality of a reconstruction?
Review: Fourier Shell Correlation (FSC)
Properties:
- Fourier terms have amplitude + phase.
- Correlation values range from -1 to +1.
- Noise should give an average of 0.
- The comparison is done as a function of spatial frequency (or “resolution”)
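For reference, the FSC for one shell is usually written as below. The symbols are chosen here for illustration: F_1 and F_2 are the Fourier transforms of the two reconstructions, and every sum runs over the Fourier terms k_i that fall in the shell at spatial frequency k.

```latex
\mathrm{FSC}(k) =
  \frac{\sum_{\mathbf{k}_i \in \text{shell } k} F_1(\mathbf{k}_i)\, F_2^{*}(\mathbf{k}_i)}
       {\sqrt{\sum_{\mathbf{k}_i \in \text{shell } k} \left|F_1(\mathbf{k}_i)\right|^{2}
              \;\sum_{\mathbf{k}_i \in \text{shell } k} \left|F_2(\mathbf{k}_i)\right|^{2}}}
```

The denominator normalizes the amplitudes, which is why the values stay between -1 and +1.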
[Figure: Fourier terms from Reconstruction 1 (term 1) and Reconstruction 2 (term 2)]
Review: Fourier Shell Correlation curve
FSC curve with expectation value of noise
Why does σ vary with spatial frequency?
With small N, behavior is more unpredictable
One resolution criterion was to compare the FSC to, say, 3*σ.
BUT:
The σ value describes the behavior of unaligned noise.
Review: model bias
[Figure panels: N = 128, N = 256, N = 512, N = 1024, N = 2048, and the original]
Model bias can yield false correlations.
A false correlation in real space is equivalent to a false correlation in Fourier space.
images
“odd” reconstruction
“even” reconstruction
Refinement: classical and “gold standard”
OLD STRATEGY: merge the “odd” and “even” halves & refine orientations
“GOLD STANDARD”: refine the two halves independently (refinement 1, refinement 2)
Different resolution criteria
FSC=0.5
FSC=0.143
FSC=0.333
Sampling:
Oversampling an already-sampled image
Shifts: worst-case scenario
[Figure: a 4x4 grid of pixels numbered 1-16, shown twice. Panels: original, then shifts of Δx = Δy = 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, and 0.45 px]
Effect of shifts
[Figure: a 4x4 pixel grid (pixels 1-16) and a 6x6 pixel grid (pixels 1-36)]
Oversampling
[Figure: 4x4 pixel grid (pixels 1-16)]
Worst-case scenario after oversampling
[Figure: white-noise image (200 px) and its power spectrum, with the Nyquist frequency at 1/6 Å; upscaled image (300 px) and its power spectrum, with the old Nyquist frequency at 1/6 Å and the new Nyquist frequency at 1/4 Å. Horizontal axis: spatial frequency, starting at the origin.]
[Figure: image, power spectrum, and power-spectrum profile for three cases: original, shifted by (0.5, 0.5), and upscaled + shifted]
[Figure: image, power spectrum, and power-spectrum profile for three cases: original, rotated by 45°, and upscaled + rotated]
You can do a little better by oversampling.
Bammes... Chiu (2012) J. Struct. Biol.
Oversampling: Conclusion
Classification
Reiteration of the problem
8 classes of faces, 64x64 pixels
With noise added
Before we can average the data, we first should find homogeneous subsets.
Average:
[Figure: a 4x4 image with pixels numbered 1-16, unrolled row by row into a single vector: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
Multivariate data analysis (MDA), or
Multivariate statistical analysis (MSA)
Our 16-pixel image can be reorganized into a 16-coordinate vector.
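The reorganization of an image into a coordinate vector can be sketched as follows. This is an illustrative example, assuming the image is stored as a list of rows; the function names are hypothetical.

```python
def image_to_vector(image):
    """Unroll a 2D image (list of rows) into one coordinate vector, row by row."""
    return [pixel for row in image for pixel in row]

def vector_to_image(vector, width):
    """Fold a coordinate vector back into rows of the given width."""
    return [vector[i:i + width] for i in range(0, len(vector), width)]

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

vector = image_to_vector(image)   # one point in a 16-dimensional space
print(len(vector), vector)
```

Each image then becomes a single point in a space with one axis per pixel, which is the representation that MDA operates on.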
MDA: Reconstituted images
Linear combinations of these images will give us approximations of the images that make up the data:
image ≈ c0 (Average) + c1 (Eigenimage #1) + c2 (Eigenimage #2) + c3 (Eigenimage #3) + ...
Phantom images of worm hemoglobin
MDA of worm hemoglobin
Average:
[Figure: images reconstituted with positive coefficients (+c0 through +c5) and negative coefficients (-c0 through -c5)]
Classification
How do we categorize/classify the images?
K-means classification
A number K of images are chosen as seeds.
BAD: Some clusters may be overrepresented/underrepresented.
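A minimal sketch of K-means follows, assuming each image has already been unrolled into a coordinate vector. The function names and the tiny 4-pixel "images" are made up for illustration. The seeds are passed in explicitly, which makes it easy to see how the result depends on the choice of seeds.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two coordinate vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(images, seed_indices, n_iter=20):
    """Cluster image vectors, using the images at seed_indices as the K seeds."""
    centers = [list(images[i]) for i in seed_indices]
    k = len(centers)
    labels = [0] * len(images)
    for _ in range(n_iter):
        # Assign every image to its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(img, centers[c]))
                  for img in images]
        # Move each center to the average of its current members.
        for c in range(k):
            members = [img for img, lab in zip(images, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Six hypothetical 4-pixel images: three of one kind, three of another.
images = [
    [0.9, 1.1, 0.0, 0.1],
    [1.0, 0.9, 0.1, 0.0],
    [1.1, 1.0, 0.0, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.1],
    [0.0, 0.0, 1.1, 1.0],
]
labels, centers = k_means(images, seed_indices=(0, 3))
print(labels)
```

With noisier data, seeds drawn from the same cluster can lead to a different partition, which is one motivation for repeating the classification with different seeds.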
Diday's method of moving centers
We will note the images that always “travel” together, and will call them a class.
Dendrogram
Hierarchical ascendant classification
“Images”
Hierarchical Ascendant Classification
All images are represented.
The dendrogram will be too heavily branched to interpret without truncation.
Binary-tree viewer
BAD: Information about the height of the branch is lost.
John O'Brien, 1991, The New Yorker
How do you go from 2D to 3D?
What information do we need for 3D reconstruction?
1. different orientations
2. known orientations
3. many particles
Baumeister et al. (1999), Trends in Cell Biol., 9: 81-5.
[Figure panels: good; missing views; sparse sampling; sparse sampling + missing views]
What happens when we're missing views?
Your sample isn't guaranteed to adopt different orientations,
in which case you may need to explicitly tilt the microscope stage.
(more later...)
What information do we need for 3D reconstruction?
1. different orientations
2. known orientations
3. many particles
I have all of this information.
Now what?
There are two general categories of 3D reconstruction
1. Real space
2. Fourier space
Reconstruction in real space
We are going to reconstruct a 2D object from 1D projections.
The principle is similar to, but simpler than, reconstructing
a 3D object from 2D projections.
Projection of our 2D object
Now, project in several directions
Reconstruction is the inversion of projection
The reconstruction doesn't agree well with the projections.
What can we do?
(one) ANSWER:
Simultaneous Iterative Reconstruction Technique
The idea:
You compute re-projections of your model.
Compare the re-projections to your experimental data.
There will be differences.
You weight the differences by a fudge factor, λ.
You adjust the model by the difference weighted by λ.
Repeat.
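The loop above can be sketched for the 2D-from-1D case. This is a simplified illustration, not a production SIRT: it assumes only two projection directions (sums along rows and along columns), and the names `project`, `sirt`, and `lam` are chosen here for the example.

```python
def project(model):
    """1D projections of a 2D model: sums along rows and sums along columns."""
    rows = [sum(r) for r in model]
    cols = [sum(c) for c in zip(*model)]
    return rows, cols

def sirt(exp_rows, exp_cols, n_iter=50, lam=0.5):
    """Iteratively adjust a model until its re-projections match the data."""
    n_r, n_c = len(exp_rows), len(exp_cols)
    model = [[0.0] * n_c for _ in range(n_r)]
    for _ in range(n_iter):
        rows, cols = project(model)                        # re-project the model
        d_rows = [e - p for e, p in zip(exp_rows, rows)]   # compare with the data
        d_cols = [e - p for e, p in zip(exp_cols, cols)]
        for i in range(n_r):
            for j in range(n_c):
                # Back-project the differences, spread evenly along each ray
                # and averaged over the two directions.
                correction = 0.5 * (d_rows[i] / n_c + d_cols[j] / n_r)
                model[i][j] += lam * correction            # down-weight by lambda
    return model

# Hypothetical test object and its "experimental" projections.
obj = [[0, 1, 0],
       [1, 2, 1],
       [0, 1, 0]]
exp_rows, exp_cols = project(obj)
model = sirt(exp_rows, exp_cols)
new_rows, new_cols = project(model)
print([round(v, 3) for v in new_rows], [round(v, 3) for v in new_cols])
```

With only two directions, the model reproduces the projections but is not unique, which echoes the need for many different views.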
Simultaneous Iterative Reconstruction Technique
Here, the differences (which will be down-weighted by λ)
are the ripples in the background.
If we didn't down-weight by λ, we would overcompensate,
and would amplify noise.
[Figure panels: model; experimental projection]
Reconstruction in Fourier space
Projection theorem
(or Central Section Theorem)
A central section through the 3D Fourier transform is the Fourier transform of the projection in that direction.
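The theorem can be verified numerically in the lower-dimensional analogue, where a 1D projection of a 2D image corresponds to a central line of the 2D Fourier transform. The sketch below uses a direct (slow) discrete Fourier transform written out by hand; the small test image is arbitrary.

```python
import cmath

def dft_1d(signal):
    """Direct 1D discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]

def dft_2d(image):
    """2D DFT: transform along rows, then along columns. Result is indexed [ky][kx]."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(col) for col in zip(*rows)]   # indexed [kx][ky]
    return [list(r) for r in zip(*cols)]         # back to [ky][kx]

image = [[0, 1, 2, 0],
         [3, 5, 1, 0],
         [0, 2, 4, 1],
         [1, 0, 0, 2]]

# Project along y (sum each column), then Fourier transform the projection.
projection = [sum(col) for col in zip(*image)]
projection_ft = dft_1d(projection)

# Central line of the 2D transform perpendicular to the projection direction.
central_section = dft_2d(image)[0]

max_diff = max(abs(a - b) for a, b in zip(central_section, projection_ft))
print(max_diff)
```

The two results agree to within rounding error.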
The disadvantage is that you have to resample your central sections from polar coordinates onto a Cartesian grid, i.e., interpolate.
There are newer methods that interpolate better in Fourier space.
Converting from polar to Cartesian coordinates
[Figure: polar sampling points on Cartesian X and Y axes]
SOLUTION:
A simple weighting scheme is to weight each Fourier coefficient by its radius, compensating for the denser sampling near the origin:
r* weighting, or “r-weighted backprojection”
Adapted from Pawel Penczek
If you know the orientation angles for each image,
you can compute a back-projection.
Going from 2D to 3D
How do we determine the last two Euler angles?
Parameters required for 3D reconstruction
Two translational:
- Δx
- Δy
Three orientational (Euler angles):
- phi (about z axis)
- theta (about y)
- psi (about new z)
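The three Euler angles can be combined into one rotation matrix. The sketch below assumes a ZYZ convention composed as Rz(phi) · Ry(theta) · Rz(psi); sign and ordering conventions differ between software packages, so treat this as an illustration rather than any package's definition. The helper names are hypothetical.

```python
import math

def rot_z(angle_deg):
    """Rotation by angle_deg about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle_deg):
    """Rotation by angle_deg about the y axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(phi, theta, psi):
    """Assumed convention: phi about z, theta about the new y, psi about the new z."""
    return matmul(matmul(rot_z(phi), rot_y(theta)), rot_z(psi))

def projection_direction(phi, theta, psi):
    """Where the rotation sends the z axis (third column of the matrix)."""
    r = euler_zyz(phi, theta, psi)
    return [r[0][2], r[1][2], r[2][2]]

d1 = projection_direction(30, 60, 0)
d2 = projection_direction(30, 60, 75)
print([round(v, 4) for v in d1], [round(v, 4) for v in d2])
```

In this convention the direction depends only on phi and theta, and psi acts as a rotation within the plane of the image.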
http://www.wadsworth.org
These are determined in 2D.
These are determined in 3D.
[Figure: projections labeled with their orientation angles, (Φ1, θ1) through (Φ7, θ7)]
From Ken Downing
BUT...
Tomography
We have:
- known orientations
- different views
[Figure: “bubbling” of the sample at doses from 10 to 40 e⁻/Å²]
Baker et al. (1999) Microbiol. Mol. Biol. Rev. 63: 862
We are destroying the sample as we image it.
What happens when we image the sample?
From Ken Downing
Solution:
If we have many identical molecules,
and if we can determine the orientations,
we can use one exposure per molecule
and use these images in the reconstruction.
Consequences of repeated exposure
- Accumulated beam damage
- If the number of views is limited, then distortions
“Single-particle reconstruction”
If we have many identical molecules,
and if we can determine the orientations,
we can use one exposure per molecule
and use these images in the reconstruction.
BUT:
Unlike in the tomographic case,
we don't know how the orientations
of the different images are related.
Reference-based alignment
Step 1: Generation of projections of the reference.
From Penczek et al. (1994), Ultramicroscopy 53: 251-70.
You will record the direction of projection (the Euler angles), such that
if you encounter an experimental image that resembles a reference projection,
you will assign that reference projection's Euler angles to the experimental image.
Assumption: reference is similar enough to the sample that it can be used to determine orientation.
The model
(The extra features helped determine handedness in noisy reconstructions.)
Reference-based alignment
Steps:
1. Compare the experimental image to all of the reference projections.
2. Find the reference projection with which the experimental image matches best.
3. Assign the Euler angles of that reference projection to the experimental image.
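The three steps above can be sketched as follows. This is a simplified illustration: it assumes the images are already centered and rotated in-plane, compares flattened pixel vectors with a normalized correlation coefficient, and uses made-up reference projections and Euler angles.

```python
import math

def normalized_correlation(a, b):
    """Correlation coefficient between two flattened images (range -1 to +1)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def assign_euler_angles(image, references):
    """references: list of (euler_angles, projection) pairs.

    Returns the Euler angles of the best-matching reference and its score.
    """
    best_angles, best_score = None, -2.0
    for angles, projection in references:
        score = normalized_correlation(image, projection)   # step 1: compare
        if score > best_score:                               # step 2: keep the best
            best_angles, best_score = angles, score
    return best_angles, best_score                           # step 3: assign

# Hypothetical reference projections (flattened) with their Euler angles.
references = [
    ((0, 0, 0),   [0, 1, 1, 0, 0, 1, 1, 0]),
    ((0, 90, 0),  [1, 1, 1, 1, 0, 0, 0, 0]),
    ((90, 90, 0), [1, 0, 0, 1, 1, 0, 0, 1]),
]
# A noisy experimental image that resembles the second reference.
experimental = [0.9, 1.2, 0.8, 1.1, 0.2, -0.1, 0.1, 0.0]

angles, score = assign_euler_angles(experimental, references)
print(angles, round(score, 3))
```

A real search would also scan shifts and in-plane rotations for every reference, which is omitted here.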
From Penczek et al. (1994), Ultramicroscopy 53: 251-70.
Common lines
(or Angular Reconstitution)
Summary:
- A central section through the 3D Fourier transform is the Fourier transform of the projection in that direction.
- Two central sections will intersect along a line through the origin of the 3D Fourier transform.
- With two central sections, there is still one degree of freedom to relate the orientations, but a third projection (i.e., central section) will fix the relative orientations of all three.
Frank, J. (2006) 3D Electron Microscopy of Macromolecular Assemblies
From Steve Fuller
Common lines: Problems
- Noise can lead to incorrect angles (symmetry helps)
- Handedness cannot be determined without additional information (tilting, α-helices)
- Assumes conformational homogeneity
From Nicolas Boisset
This scenario describes a worst case, when there is exactly one orientation in the 0° image. Since the in-plane angle varies, we have different views available in the tilted image.
Random-conical tilt:
Determination of Euler angles
Two images are taken: one at 0° and one tilted at an angle of 45°.
[Figure: particles 1-8 in the 0° image and in the 45° image]
Radermacher, M., Wagenknecht, T., Verschoor, A. & Frank, J. Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit of Escherichia coli. J Microsc 146, 113-36 (1987).
From Nicolas Boisset
[Figure: particles 1-8]
Random-conical tilt: Geometry
One problem though:
We can't tilt the stage all the way to 90 degrees.
Review:
Projection theorem
Representation of the distribution of views, if we display a plane perpendicular to each projection direction
The missing information, in the shape of a cone, elongates features in the direction of the cone's axis.
From Nicolas Boisset
Random-conical tilt:
The “missing cone”
Random-conical tilt:
Filling the missing cone
[Figure: reconstructions and their distributions of orientations, combined pairwise (+ =)]
From Nicolas Boisset
If there are multiple preferred orientations, or if there is symmetry
that fills the missing cone, you can cover all orientations.
Phantom images of worm hemoglobin
We compute a separate reconstruction for each class
IF the classes simply correspond to different orientations,
you can combine them, and boost the signal-to-noise.
Helicase G40P
If the classes correspond to different conformations,
then you have to keep them as separate reconstructions.
More properties of Fourier transforms:
Convolutions
Why might two images in a data set look different?
different sample → better biochemistry
different magnification → better microscopy
different illumination → normalization
different orientations → determine angles
different defocus → CTF correction
different conformations → classification
Lattice: f(x)
Molecule: g(x)
Set a molecule down at every lattice point.
Notation: f(x)•g(x)
Convolution of a molecule with a lattice generates a crystal.
Adapted from David DeRosier
lattice: f(x)
http://www.photos-public-domain.com
Set a molecule down at every
lattice point.
http://www.symbolicmessengers.com
Molecule: g(x)
http://en.wikipedia.org
Convolution in real life
Notation: f(x)•g(x)
Cross-correlation vs. convolution
Complex conjugate:
If a Fourier coefficient F(X) has the form: a + bi
The complex conjugate F*(X) has the form: a - bi
Cross-correlation: F*(X) G(X)
Convolution: F(X) G(X)
Remember:
f(x), g(x) are real-space functions
F(X), G(X) are Fourier-space functions
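Both relations can be checked on a small 1D example. The sketch below assumes circular (periodic) signals and a hand-written discrete Fourier transform; the "lattice" and "molecule" arrays are made up for illustration.

```python
import cmath

def dft(signal, inverse=False):
    """Direct discrete Fourier transform (or its inverse)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [sum(signal[x] * cmath.exp(sign * 2j * cmath.pi * k * x / n) for x in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolution(f, g):
    """Real-space convolution: set a copy of g down at every point of f."""
    n = len(f)
    return [sum(f[m] * g[(x - m) % n] for m in range(n)) for x in range(n)]

def circular_cross_correlation(f, g):
    """Real-space cross-correlation of f with g, as a function of the shift x."""
    n = len(f)
    return [sum(f[m] * g[(m + x) % n] for m in range(n)) for x in range(n)]

f = [0, 1, 0, 0, 1, 0, 0, 0]   # "lattice": points at positions 1 and 4
g = [2, 1, 0, 0, 0, 0, 0, 0]   # "molecule"
F, G = dft(f), dft(g)

# Convolution: F(X) G(X).  Cross-correlation: F*(X) G(X).
conv_fourier = [v.real for v in dft([a * b for a, b in zip(F, G)], inverse=True)]
corr_fourier = [v.real for v in dft([a.conjugate() * b for a, b in zip(F, G)], inverse=True)]

print([round(v, 6) for v in conv_fourier])
print(circular_convolution(f, g))
```

The convolution places one copy of the molecule at each lattice point, and the Fourier-space products reproduce the real-space sums.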
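[Figure: the original image f(x) and its transform F(X); the CTF G(X), shown as a 2D power spectrum and a 1D profile; the product F(X) G(X) and the corresponding image f(x)•g(x); and g(x), the point spread function, also shown zoomed]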
An ideal point spread function would be an infinitely-sharp point.
Red: Power-spectrum profile calculated from experimental image
Green: Fitted, theoretical power-spectrum profile
Blue: Phase-only correction profile
Defocus groups: CTF correction in 3D
Assign micrographs to defocus groups
Separate reconstruction for each defocus group
CTF-correction of micrographs in 2D
CTF-correct each micrograph
Why might two images in a data set look different?
different molecule → better biochemistry
different magnification → better microscopy
different illumination → normalization
different orientations → determine angles
different defocus → CTF correction
different conformations → classification
Classification:
Reference-based classification vs.
Maximum likelihood (ML3D)
Reference-based classification:
• Possible conformations must be known.
• The combination of parameters (shift, rotation, class) is chosen from the highest correlation value.
• Possible reference bias
ML3D:
• Possible conformations are not known.
• The probability of the occurrence of the parameters (shift, rotation, class) is maximized.
• Random, data-dependent
RELION implements a variation of maximum likelihood.
Seeding ML3D classification
We split the data set into K classes at random.
There will be slight differences in the reconstructions.
We will iteratively maximize the likelihood of a
particle belonging to a particular class.
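The random seeding step can be sketched as below. This covers only the initial split, not the likelihood optimization itself, and the function name `seed_classes` is a hypothetical one chosen for the example.

```python
import random

def seed_classes(n_images, k, seed=0):
    """Split image indices 0..n_images-1 into k random, near-equal classes."""
    rng = random.Random(seed)
    indices = list(range(n_images))
    rng.shuffle(indices)
    # Deal the shuffled indices out to the k classes in turn.
    return [sorted(indices[c::k]) for c in range(k)]

classes = seed_classes(n_images=10, k=3)
for number, members in enumerate(classes, start=1):
    print("class", number, members)
```

Each random subset would then be reconstructed separately, giving the slightly different starting references.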