Interpretation of cryo-EM maps
Jiri Novaček
Content
- symmetries
- map validation
- map interpretation
- model building
- map improvement
Symmetries
- regular assemblies of protein oligomers are common in nature
- oligomeric protein structures obey certain rules
—► no mirror symmetry
- understanding symmetry rules may prevent incorrect interpretation of the data
- presence of symmetry generally facilitates determination of the density map
C3 symmetry
C4 symmetry
C4 tetramer
C22 symmetry
D4 octamer
(Xu et al., Curr Opin Struct Biol 2019)
Symmetries
Projection Theorem, Euler angles
- A central section through the 3D Fourier transform is the Fourier transform of the projection in that direction
- Images for all possible projection directions are required to obtain structure with homogeneous resolution in all directions
- Euler angles φ and θ cover ranges of (0° - 360°) and (-90° - +90°)
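The projection theorem can be checked numerically. The following is a minimal sketch on a random toy volume (the array size and the choice of z as projection direction are assumptions for illustration): the k_z = 0 plane of the 3D Fourier transform should match the 2D Fourier transform of the projection along z.

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.random((16, 16, 16))          # toy 3D density, axes (x, y, z)

# Real-space projection along z
projection = volume.sum(axis=2)

# Central section of the 3D Fourier transform perpendicular to z (k_z = 0)
central_section = np.fft.fftn(volume)[:, :, 0]

# 2D Fourier transform of the projection
projection_ft = np.fft.fft2(projection)

# The two arrays agree up to floating-point rounding
max_difference = np.abs(central_section - projection_ft).max()
print(max_difference)
```

Other projection directions correspond to tilted central sections, which is why views covering all directions are needed to fill the 3D transform evenly.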
Symmetries
Rotational (cyclic) symmetries
- one symmetry axis (molecules are usually oriented with the symmetry axis along z)
- Asymmetric unit - the smallest portion of the angular space from which the symmetry operations fill the entire angular space
- C1 - the trivial case, no symmetry: φ (0° - 360°), θ (-90° - +90°)
- C2 - φ (0° - 180°), θ (-90° - +90°)
- C3 - φ (0° - 120°), θ (-90° - +90°)
- C4 - φ (0° - 90°), θ (-90° - +90°)
- C6 - φ (0° - 60°), θ (-90° - +90°)
Symmetries
Dihedral symmetries
- one n-fold rotational axis and n two-fold axes perpendicular to it
- Asymmetric unit
- D2 - φ (0° - 180°), θ (0° - +90°)
- D5 - φ (0° - 72°), θ (0° - +90°)
- D7 - φ (0° - ~51°), θ (0° - +90°)
[Figure: D2 (4 subunits, point group 222) and D4 (point group 422)]
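The angular ranges listed for the cyclic and dihedral groups follow one pattern: φ spans 360°/n, and θ spans the full range for Cn but only the upper half for Dn. A minimal sketch, assuming the symmetry is given as a string such as "C4" or "D7" (the function name is hypothetical, not taken from any package):

```python
def asymmetric_unit(symmetry):
    """Return ((phi_min, phi_max), (theta_min, theta_max)) in degrees.

    `symmetry` is a string such as "C1", "C4" or "D7".
    """
    kind, order = symmetry[0].upper(), int(symmetry[1:])
    if kind not in ("C", "D") or order < 1:
        raise ValueError(f"unsupported symmetry: {symmetry!r}")
    # The n-fold axis along z divides the phi range by n
    phi_range = (0.0, 360.0 / order)
    # The perpendicular two-fold axes of Dn additionally halve the theta range
    theta_range = (-90.0, 90.0) if kind == "C" else (0.0, 90.0)
    return phi_range, theta_range


for name in ("C1", "C4", "D2", "D5", "D7"):
    print(name, asymmetric_unit(name))
```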
Symmetries
Platonic symmetries
- faces, edges, and corners are related by symmetry operations
- tetrahedral - 4 3-fold axes and 3 2-fold axes
- octahedral - 3 4-fold axes of symmetry, 4 3-fold axes of symmetry, and 6 2-fold axes
- icosahedral - 6 5-fold, 10 3-fold and 15 2-fold axes
EMAN2
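The axis counts listed for the Platonic symmetries determine the number of rotations in each group, and therefore the number of asymmetric units: every n-fold axis contributes n - 1 rotations other than the identity. A short sketch of that count (the dictionary layout is an illustrative choice):

```python
# {fold: number of axes} for each Platonic point group, as listed above
PLATONIC_AXES = {
    "tetrahedral": {3: 4, 2: 3},
    "octahedral": {4: 3, 3: 4, 2: 6},
    "icosahedral": {5: 6, 3: 10, 2: 15},
}


def rotation_count(axes):
    """Identity plus (fold - 1) non-trivial rotations per symmetry axis."""
    return 1 + sum(count * (fold - 1) for fold, count in axes.items())


for name, axes in PLATONIC_AXES.items():
    print(name, rotation_count(axes))
```

The result is 12, 24 and 60 symmetry-related copies of the asymmetric unit for tetrahedral, octahedral and icosahedral assemblies.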
Symmetries
Helical symmetry
- A single view contains all the necessary info for 3D reconstruction
- 2D surface lattice rolled into 3D
- 3D reconstruction approaches:
- Fourier-Bessel analysis
- Iterative Helical Real-Space Reconstruction (IHRSR)
[Figure: helical lattice and its Fourier transform (FT)]
Fourier-Bessel analysis:
- small inaccuracies in indexing lead to incorrect structure
- requires strict helical symmetry
- requires flat straight helices
- laborious
IHRSR:
- requires fairly good estimate of the cylinder diameter, rise, and twist
- can cope with heterogeneous data
- manages to reconstruct weakly diffracting filaments (where layer lines are not visible)
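Rise and twist fully define the helical symmetry operator: each successive subunit is rotated by the twist about the helical axis and shifted by the rise along it. A minimal sketch, assuming the helical axis is z and using made-up parameter values (the function name is hypothetical):

```python
import numpy as np


def helical_copies(point, rise, twist_deg, n_copies):
    """Apply the helical operator 0 .. n_copies-1 times to a point (x, y, z).

    rise is in the same length unit as the point, twist_deg in degrees.
    """
    x, y, z = point
    copies = []
    for i in range(n_copies):
        angle = np.deg2rad(i * twist_deg)
        copies.append((
            x * np.cos(angle) - y * np.sin(angle),   # rotation about z
            x * np.sin(angle) + y * np.cos(angle),
            z + i * rise,                            # translation along z
        ))
    return np.array(copies)


# Illustrative values only: 20 Å radius, 5 Å rise, 90° twist
subunits = helical_copies((20.0, 0.0, 0.0), rise=5.0, twist_deg=90.0, n_copies=4)
print(subunits.round(3))
```

In an IHRSR-style procedure, the same operator would be applied to the whole map and the copies averaged, which is why reasonable starting estimates of rise and twist matter.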
[Figure: IHRSR workflow (Behrmann et al., J Struct Biol 2012)]
- 2D templates: systematically generated reference projections
- projection matching -> orientation parameters: shifts and Euler angles
- 3D reconstruction (optional point symmetry) -> structure without helical symmetry
- symmetry search -> helical symmetry parameters
- symmetrization -> low-pass filtered and symmetrized helical structure
- converged? no: next iteration
[Figure: FSC curve plotted against resolution (Å)]
$$ \mathrm{FSC} = \frac{\sum F_1 F_2^*}{\sqrt{\sum |F_1|^2 \sum |F_2|^2}} $$
FSC    2FSC/(1+FSC)   C_REF   Phase error   (S/N)^(1/2)
0.50   0.67           0.82    35°           1.00
0.33   0.50           0.71    45°           0.71
0.14   0.25           0.50    60°           0.41
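The FSC and the derived C_REF column can be reproduced in a few lines. This is a sketch under simplifying assumptions (cubic maps, integer-radius shells, no masking); the function names are illustrative and not taken from any cryo-EM package. C_REF is computed as sqrt(2 FSC / (1 + FSC)), which matches the tabulated values.

```python
import numpy as np


def fsc_curve(half_map_1, half_map_2):
    """Fourier shell correlation per shell for two cubic maps of equal shape."""
    n = half_map_1.shape[0]
    f1 = np.fft.fftn(half_map_1)
    f2 = np.fft.fftn(half_map_2)
    freq = np.fft.fftfreq(n) * n                       # integer frequency indices
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int).ravel()
    # Sum cross term and power terms within each shell
    cross = np.bincount(shell, weights=(f1 * np.conj(f2)).real.ravel())
    power_1 = np.bincount(shell, weights=(np.abs(f1) ** 2).ravel())
    power_2 = np.bincount(shell, weights=(np.abs(f2) ** 2).ravel())
    n_shells = n // 2                                  # keep shells up to Nyquist
    return cross[:n_shells] / np.sqrt(power_1[:n_shells] * power_2[:n_shells])


def c_ref(fsc):
    """Estimated correlation of the full map with a perfect reference."""
    return np.sqrt(2.0 * fsc / (1.0 + fsc))


rng = np.random.default_rng(0)
signal = rng.random((16, 16, 16))
half_1 = signal + 0.5 * rng.standard_normal(signal.shape)   # signal + independent noise
half_2 = signal + 0.5 * rng.standard_normal(signal.shape)
print(fsc_curve(half_1, half_2).round(2))
print(c_ref(np.array([0.5, 0.333, 0.143])).round(2))
```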
Map validation
- observed features of the map should be consistent with the resolution assessment
- visibility of expected structural features
- helices visible at 8 Å
- strands separated at 4.8 Å
- side chains visible at resolutions better than 4 Å
Map validation
- experimental - cryo-ET
Map validation
- Steps:
- map is correct at low resolution
- spurious noise features are not present (noise overfitting, over-refinement)
- FSC curve has a proper shape
- resolution estimate corresponds to the observed structural features
- acquisition of complementary data to confirm the model (e.g. at low resolution)
Map interpretation
Visualization tools
- Chimera/ChimeraX
- Coot
- PyMOL
- VMD
- Amira (commercial)
Segmentation
- identify the boundaries of map regions that represent different structural components
- component structures can be positioned into the identified segments
- what can be segmented as a separate component depends on the map resolution
- manual segmentation | automated segmentation | knowledge-based segmentation
Map interpretation
[Flowchart: choosing an interpretation strategy]
- known component structure: fit known structures
- otherwise: fold assignment from sequence
  - template found? yes: homology modelling
  - template found? no: template-free modelling
- resolution?
  - 10-20 Å: segmentation, rigid body fit
  - 4-10 Å: secondary structure assignment, sequence assignment
  - <4 Å: model building
- fit different from map: multiple conformations, real-space methods, MD-based methods
Map interpretation
Structure determination from amino acid sequence
[Figure: from amino acid sequence (GFCHIKAYTRLIMVG.) to structure]
- Template-free: ab initio (de novo) prediction, fragment assembly, evolutionary couplings
- Template-based: threading, comparative (homology) modelling
- AlphaFold 2
- Programs: MODELLER, SWISS-MODEL, Phyre2, RaptorX, I-TASSER, Rosetta, EVfold
Map interpretation
Fold recognition from density
- 4.5-10 Å: secondary structure detection
Baker et al. Structure 2007
- 4.5 Å and better: de novo Cα tracing and model building
[Figure: density and the corresponding Pathwalker model]
Baker et al. Structure 2012
- programs: SSEhunter, SSEtracer, Ematch, Pathwalker, Coot, Buccaneer, EM-fold, Rosetta, Phenix, ARP/wARP, ModelAngelo
(Villa & Lasker, Curr Opin Struct Biol 2014; Cassidy et al., Curr Opin Microbiology 2018)
Map interpretation
Density fitting
- manual fitting
- positioning of the atomic structure into the cryo-EM density using visualization programs
- usually efficient (the human brain is good at pattern recognition)
- direct feedback
- good for initial placement of the component into the map
- high level of subjectivity may lead to errors
- depends on contour level at which the map is visualized
- conformational rearrangements cannot be modelled
Map interpretation
Density fitting
- automated fitting
- requires common representation of both the structure and the density map
- measure of the quality of the fit
- optimization protocol for fit improvement
[Figure: density map + atomic component structure -> component representation and placement -> optimisation based on goodness-of-fit]
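The three ingredients of automated fitting can be illustrated on a toy example. This sketch assumes unit voxel size, point atoms blurred with isotropic Gaussians as the common representation, and a normalised cross-correlation coefficient as the goodness-of-fit measure; real fitting programs use more elaborate scores and search strategies. All names and coordinates are made up.

```python
import numpy as np


def atoms_to_density(coords, shape, sigma=1.5):
    """Blur point atoms into a density grid so structure and map share one representation."""
    grid = np.indices(shape).reshape(3, -1).T          # voxel coordinates, (n_voxels, 3)
    density = np.zeros(len(grid))
    for atom in coords:
        dist2 = np.sum((grid - atom) ** 2, axis=1)
        density += np.exp(-dist2 / (2.0 * sigma**2))
    return density.reshape(shape)


def cross_correlation(map_a, map_b):
    """Normalised cross-correlation coefficient between two maps of equal shape."""
    a = map_a - map_a.mean()
    b = map_b - map_b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))


shape = (16, 16, 16)
atoms = np.array([[8.0, 8.0, 6.0], [8.0, 8.0, 10.0], [10.0, 8.0, 10.0]])
target_map = atoms_to_density(atoms, shape)            # stands in for the experimental map

score_correct = cross_correlation(target_map, atoms_to_density(atoms, shape))
score_shifted = cross_correlation(
    target_map, atoms_to_density(atoms + np.array([3.0, 0.0, 0.0]), shape)
)
print(score_correct, score_shifted)
```

An optimisation protocol would move the component (translations and rotations) to maximise this score.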
Map interpretation
Problems of density fitting
- limited resolution
- many local optima with similar numerical values at low resolution
- local resolution, noise, scaling, filtering, masking
- blurring of the atomic structure
- possible remedies: better resolution, improved scoring for goodness-of-fit, coarse-graining (change of representation), fit/model validation
[Figure: correct fit vs. fit flipped by 180°]
Map interpretation
Problems of density fitting
- conformational variability
- conformations observed in density maps often deviate from those of the fitted atomic models
- dynamics
- crystal packing effects
- errors in structure prediction
—► allow for the conformational changes during model fitting process = flexible fitting
Map interpretation
Model refinement
- without any restraints a model may fit well with a high score in near-atomic to low resolution density
- such a model will, however, not have standard protein geometry: backbone torsions (Ramachandran diagram), peptide planarity, chirality (trans/cis), bond lengths and angles, side chain torsions / rotamers
- refinement methods try to maintain standard geometry while fitting the model into the density map. The geometry restraints reduce the number of degrees of freedom.
- map density contributes as an additional penalty in the scoring function
Programs: MDFF, Refmac, Rosetta, Coot, Phenix, Isolde, iMODFIT
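The idea of combining geometry restraints with a density term can be written as a single target function. The sketch below is a toy illustration, not the scoring function of any of the listed programs: it assumes a Cα-only chain with an ideal consecutive Cα distance of 3.8 Å, unit voxel size, nearest-voxel density lookup, and a hypothetical weight between the two terms.

```python
import numpy as np


def geometry_penalty(coords, ideal_bond=3.8, sigma=0.05):
    """Sum of squared, sigma-scaled deviations of consecutive distances from the ideal."""
    bonds = np.linalg.norm(np.diff(coords, axis=0), axis=1)
    return float(np.sum(((bonds - ideal_bond) / sigma) ** 2))


def density_penalty(coords, density):
    """Negative sum of map values at the voxels nearest to the atoms."""
    i, j, k = np.rint(coords).astype(int).T
    return float(-np.sum(density[i, j, k]))


def refinement_target(coords, density, weight=1.0):
    """Geometry term plus weighted density term; lower is better."""
    return geometry_penalty(coords) + weight * density_penalty(coords, density)


density = np.zeros((8, 8, 8))
density[2, 2, 2] = 1.0                                  # toy map with a single peak

ideal = np.array([[2.0, 2.0, 2.0], [5.8, 2.0, 2.0]])     # 3.8 Å apart
stretched = np.array([[2.0, 2.0, 2.0], [6.8, 2.0, 2.0]]) # 4.8 Å apart
print(refinement_target(ideal, density), refinement_target(stretched, density))
```

The weight controls how strongly the map can pull the model away from ideal geometry, which becomes more important as resolution decreases.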
Map interpretation
Model validation
- Model fit
- Model geometry: peptide planarity, backbone torsions (Ramachandran), bond lengths, bond angles, side chain rotamers
- MolProbity: http://molprobity.biochem.duke.edu/
- WHAT_CHECK: http://swift.cmbi.ru.nl/gv/whatcheck/
- PROCHECK: http://www.ebi.ac.uk/thornton-srv/software/PROCHECK/
Map improvement
- in order to facilitate map interpretation, the data processing should correct for the imperfections of the imaging system to the highest possible level
- these imperfections comprise:
- higher-order aberrations of the microscope optical system
- sample drift and distortions caused by the interaction of electrons with matter
- the effect is most pronounced at high spatial frequencies (high resolution) -> parameter optimization and additional data processing primarily improve the quality of high-resolution maps (<4.5 Å resolution)
- the effect at medium and low resolution (>8 Å) is limited, and additional data processing usually does not improve the map
Map improvement
Electron lens aberrations
- objective lens of the transmission electron microscope is really bad
2.2: Description of aberration constants to 6th order
A0        Lateral image shift
A1        Two-fold astigmatism
C1        Defocus
A2        Three-fold astigmatism
B2        Axial coma
A3        Four-fold astigmatism
S3        Axial star aberration
C3 = Cs   Spherical aberration
A4        Five-fold astigmatism
D4        Three-lobe aberration
B4        Fourth-order axial coma
A5        Six-fold astigmatism
S5        Fifth-order star aberration
C5        Fifth-order spherical aberration
R5        Fifth-order rosette aberration
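$$ B(k) = \exp\left(-i\,\frac{2\pi}{\lambda}\,W(k)\right) $$
$$
\begin{aligned}
W(k) = \Re\Big\{ & A_0 \lambda k^{*}
 + \tfrac{1}{2} A_1 \lambda^2 k^{*2} + \tfrac{1}{2} C_1 \lambda^2 k^{*} k \\
 & + \tfrac{1}{3} A_2 \lambda^3 k^{*3} + \tfrac{1}{3} B_2 \lambda^3 k^{*2} k \\
 & + \tfrac{1}{4} A_3 \lambda^4 k^{*4} + \tfrac{1}{4} S_3 \lambda^4 k^{*3} k + \tfrac{1}{4} C_3 \lambda^4 k^{*2} k^{2} \\
 & + \tfrac{1}{5} A_4 \lambda^5 k^{*5} + \tfrac{1}{5} D_4 \lambda^5 k^{*4} k + \tfrac{1}{5} B_4 \lambda^5 k^{*3} k^{2} \\
 & + \tfrac{1}{6} A_5 \lambda^6 k^{*6} + \tfrac{1}{6} S_5 \lambda^6 k^{*4} k^{2} + \tfrac{1}{6} C_5 \lambda^6 k^{*3} k^{3} + \tfrac{1}{6} R_5 \lambda^6 k^{*5} k \Big\}
\end{aligned}
$$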
Map improvement
Zernike polynomials
- complete set of orthogonal functions
- Zernike transform analogous to Fourier transform
- can be used to visualize lens aberrations
- the aberrations can be corrected by introducing additional lenses into the microscope or in software during image processing
Frits Zernike, 1953 Nobel Prize in Physics, inventor of phase contrast microscopy
[Figure: Zernike polynomials]
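The orthogonality of the Zernike polynomials can be verified numerically for the radial parts. The sketch below uses the standard closed-form expression for the radial polynomial R_n^m and checks that R_2^0 (2ρ² - 1, the defocus-like term) and R_4^0 (6ρ⁴ - 6ρ² + 1, the spherical-aberration-like term) are orthogonal on the unit disk; the grid size is an arbitrary choice.

```python
from math import factorial

import numpy as np


def zernike_radial(n, m, rho):
    """Radial Zernike polynomial R_n^m(rho) for 0 <= rho <= 1."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("require |m| <= n and n - |m| even")
    rho = np.asarray(rho, dtype=float)
    result = np.zeros_like(rho)
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + m) // 2 - k)
                    * factorial((n - m) // 2 - k)))
        result += coeff * rho ** (n - 2 * k)
    return result


# Midpoint-rule integration over the radius, with the area weight rho
n_points = 2000
rho = (np.arange(n_points) + 0.5) / n_points
overlap = np.sum(zernike_radial(2, 0, rho) * zernike_radial(4, 0, rho) * rho) / n_points
print(overlap)
```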
Map improvement
Lens aberrations
[Figure: astigmatism: original, compromise, horizontal focus, vertical focus]
Map improvement
Lens aberrations
- certain level of underfocus is necessary during cryo-EM data collection
—► corrected during CTF correction
- astigmatism can be eliminated to high extent by proper microscope alignment
- defocus and astigmatism are the only aberrations relevant for the quality of medium- and low-resolution maps
- correct estimation of CTF parameters (defocus, astigmatism)
—► quality control - goodness of fit
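How defocus enters the CTF can be sketched with a one-dimensional model. Sign conventions differ between programs; the version below assumes underfocus is positive, ignores astigmatism and envelope functions, and uses a hypothetical amplitude contrast of 0.07 and an electron wavelength of about 0.0197 Å (300 kV).

```python
import numpy as np


def ctf_1d(k, defocus, cs, wavelength, amplitude_contrast=0.07):
    """Simplified CTF.

    k in 1/Å; defocus (underfocus positive), cs and wavelength in Å.
    """
    # Phase shift from defocus (k^2 term) and spherical aberration (k^4 term)
    chi = np.pi * wavelength * defocus * k**2 - 0.5 * np.pi * cs * wavelength**3 * k**4
    w = amplitude_contrast
    return -np.sqrt(1.0 - w**2) * np.sin(chi) - w * np.cos(chi)


k = np.linspace(0.0, 0.3, 7)                      # spatial frequencies up to ~3.3 Å
print(ctf_1d(k, defocus=10000.0, cs=2.7e7, wavelength=0.0197).round(3))

# Without Cs and amplitude contrast, the first zero lies at k = 1 / sqrt(wavelength * defocus)
first_zero = 1.0 / np.sqrt(0.0197 * 10000.0)
print(1.0 / first_zero)                           # corresponding resolution in Å
```

Fitting such a curve to the power spectrum of a micrograph, and judging how well the predicted zeros match the observed Thon rings, is the goodness-of-fit check mentioned above.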
[Figure: defocus and astigmatism]
Map improvement
Lens aberrations
- spherical aberration: the phase shift depends on the fourth power of the spatial frequency
- the lens is stronger off axis; the best focus lies in the plane of least confusion
- Cs is considered constant for a given microscope; further optimization in software is possible
[Figure: spherical aberration of a lens, Cs = 0 vs. Cs ≠ 0; disk diameter = Cs β³ in the plane of least confusion and 2 Cs β³ in the Gaussian image plane]
Map improvement
Sample distortions during imaging
- local motion different in distinct parts of the image
- alignparts_lmbfgs (Rubinstein & Brubaker, 2015, JSB 192, 188-95): compare particle in each frame to sum of frames [improved version in cryoSPARC ver 2]
- Relion Polishing (Scheres, 2014, eLife 3:e03665): compare particle in each frame to map [improved version with Alignparts-like smoothing in Bayesian polishing]
- MotionCor2 (Zheng...Agard, 2017, Nat Meth 14, 331-2): compare patch from each frame to sum of frames
[Figure: particle trajectories; axis: x-position (pixels)]
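All three approaches rest on estimating how far a frame (or particle, or patch) has moved relative to a reference. A minimal sketch of that building block, assuming integer shifts, periodic boundaries and a noise-free toy image; the programs above add sub-pixel accuracy, smoothing of trajectories and dose weighting on top of it.

```python
import numpy as np


def estimate_shift(frame, reference):
    """Integer (row, col) shift that maps `reference` onto `frame`.

    Uses the peak of the FFT-based cross-correlation.
    """
    cc = np.fft.ifft2(np.fft.fft2(frame) * np.conj(np.fft.fft2(reference))).real
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Peaks beyond half the box size correspond to negative shifts (wrap-around)
    return tuple(int(p) if p <= s // 2 else int(p) - s for p, s in zip(peak, cc.shape))


rng = np.random.default_rng(1)
reference = rng.random((64, 64))                       # stands in for the sum of frames
frame = np.roll(reference, shift=(3, -2), axis=(0, 1)) # frame displaced by drift
print(estimate_shift(frame, reference))
```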
Map improvement
Sample distortions during imaging
- the information in each frame is damped by different B-factor due to distinct effects during data collection
- compensation for local motion (per particle) + per frame amplitude weighting with corresponding B-factor => particle polishing
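Per-frame amplitude weighting can be sketched as follows. The sketch assumes each frame is damped by exp(-B k²/4) with its own B-factor (larger B means stronger damping) and that the weights are normalised over frames at every spatial frequency; the B-factor values are invented for illustration.

```python
import numpy as np


def frame_weights(b_factors, k):
    """Relative weight of each frame at each spatial frequency.

    b_factors in Å^2 (one per frame), k in 1/Å. Returns an array of shape
    (n_frames, n_frequencies) whose columns sum to 1.
    """
    b = np.asarray(b_factors, dtype=float)[:, None]
    k = np.atleast_1d(np.asarray(k, dtype=float))[None, :]
    damping = np.exp(-b * k**2 / 4.0)
    return damping / damping.sum(axis=0)


# Hypothetical B-factors: first frame (initial motion) and late frames (radiation damage)
weights = frame_weights([150.0, 50.0, 100.0], [0.0, 0.1, 0.3])
print(weights.round(3))
```

At low frequency all frames contribute almost equally, while at high frequency the frames with the smallest B-factor dominate the weighted sum.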
[Figure: x-axis: electron fluence (e⁻/Å²)]
Map improvement
Sample distortions during imaging
- distortion of sample surface due to illumination with electron beam
- particles located in different depth of the specimen layer
—> defocus variance for particles within single micrograph
- per-particle defocus (astigmatism) estimation = CTF refinement
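A rough estimate shows why the defocus spread within a micrograph matters at high resolution. The sketch assumes the defocus term of the CTF phase, π λ Δz k², and takes a phase error of π/2 as the tolerable limit (an assumption chosen for illustration); the 500 Å height difference and the wavelength of 0.0197 Å (300 kV) are example values.

```python
import math


def resolution_limit_from_defocus_error(defocus_error, wavelength,
                                        max_phase_error=math.pi / 2):
    """Resolution (Å) at which a defocus error (Å) causes the given CTF phase error (rad)."""
    # phase error = pi * wavelength * defocus_error * k^2, solved for k
    k = math.sqrt(max_phase_error / (math.pi * wavelength * defocus_error))
    return 1.0 / k


# Particles 500 Å apart in height within the ice layer, imaged at 300 kV
limit = resolution_limit_from_defocus_error(500.0, 0.0197)
print(round(limit, 2))
```

Under these assumptions, an uncorrected height difference of 500 Å already affects information at around 4.4 Å, which is the range where per-particle CTF refinement starts to pay off.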
[Figure: particle distribution in the ice layer for apoferritin (0.5 mg/mL), apoferritin with 0.5 mM TCEP, protein with carbon over holes, protein and DNA strands with carbon over holes, and T20S proteasome; Noble et al., eLife 2018]
Map improvement
Ewald sphere correction
- the assumption that the image is a 2D projection of the 3D object does not hold for thick specimens
- the wave function at the image plane samples the surface of a sphere in the 3D FT of the object
- Friedel symmetry is lost
- depends on the electron wavelength - stronger effect (more important to consider) at 100 keV than at 300 keV
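The wavelength dependence can be made concrete with the relativistic electron wavelength and the small-angle approximation for the distance between the Ewald sphere (radius 1/λ) and the central plane, λ k²/2. The constants are CODATA values; the function names are illustrative.

```python
import math

PLANCK = 6.62607015e-34          # J s
ELECTRON_MASS = 9.1093837015e-31 # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
SPEED_OF_LIGHT = 299792458.0     # m/s


def electron_wavelength(voltage_kv):
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    energy = ELEMENTARY_CHARGE * voltage_kv * 1e3
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * energy
        * (1.0 + energy / (2.0 * ELECTRON_MASS * SPEED_OF_LIGHT**2))
    )
    return PLANCK / momentum * 1e10


def ewald_departure(k, wavelength):
    """Approximate distance (1/Å) of the Ewald sphere from the central plane at frequency k."""
    return wavelength * k**2 / 2.0


for kv in (100, 300):
    lam = electron_wavelength(kv)
    print(kv, round(lam, 4), ewald_departure(1.0 / 2.0, lam))   # evaluated at 2 Å resolution
```

The departure at 100 kV is almost twice that at 300 kV for the same spatial frequency, in line with the last bullet above.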
[Figure: Ewald sphere geometry: incident beam, single particle in the specimen, diffracted beams, diffraction plane; amplitude A in the image after CTFP, after CTFQ, and after CTFR = CTFP + CTFQ]
Russo & Henderson (2018), Ultramicroscopy