Interpretation of cryo-EM maps
Jiri Novaček

Content
- symmetries
- map validation
- map interpretation
- model building
- map improvement

Symmetries
- regular assemblies of protein oligomers are common in nature
- oligomeric protein structures obey certain rules -> no mirror symmetry (proteins are built from chiral L-amino acids)
- understanding the symmetry rules may prevent incorrect interpretation of the data
- the presence of symmetry generally facilitates determination of the density map
Figure: examples of symmetric assemblies, panels C3 symmetry, C4 symmetry (C4 tetramer), C22 symmetry, D4 octamer (Xu et al., Curr Opin Struct Biol 2019)

Symmetries
Projection Theorem, Euler angles
- a central section through the 3D Fourier transform is the Fourier transform of the projection in that direction
- images for all possible projection directions are required to obtain a structure with homogeneous resolution in all directions
- Euler angles φ and θ cover ranges of (0° - 360°) and (-90° - +90°)

Symmetries
Rotational (cyclic) symmetries
- one symmetry axis (molecules are usually oriented with the symmetry axis along z)
- asymmetric unit: the smallest portion of the angular space to which the symmetry operations can be applied in order to completely fill the angular space
- C1: the most trivial case, no symmetry, φ (0° - 360°), θ (-90° - +90°)
- C2: φ (0° - 180°), θ
(-90° - +90°)
- C3: φ (0° - 120°), θ (-90° - +90°)
- C4: φ (0° - 90°), θ (-90° - +90°)
- C6: φ (0° - 60°), θ (-90° - +90°)

Symmetries
Dihedral symmetries
- one n-fold rotational axis and n two-fold axes perpendicular to it
- asymmetric unit:
  - D2: φ (0° - 180°), θ (0° - +90°)
  - D5: φ (0° - 72°), θ (0° - +90°)
  - D7: φ (0° - ~51°), θ (0° - +90°)
Figure: D2 (4 subunits), point group 222; D4, point group 422

Symmetries
Platonic symmetries
- faces, edges, and corners are related by symmetry operations
- tetrahedral: 4 3-fold axes and 3 2-fold axes
- octahedral: 3 4-fold axes, 4 3-fold axes, and 6 2-fold axes
- icosahedral: 6 5-fold, 10 3-fold, and 15 2-fold axes
Figure: Platonic symmetries (EMAN2)

Symmetries
Helical symmetry
- a single view contains all the information necessary for 3D reconstruction
- 2D surface lattice rolled into 3D
- 3D reconstruction approaches:
  - Fourier-Bessel analysis
    - small inaccuracies in indexing lead to an incorrect structure
    - requires strict helical symmetry
    - requires flat, straight helices
    - laborious
  - Iterative Helical Real-Space Reconstruction (IHRSR)
    - requires a fairly good estimate of the cylinder diameter, rise, and twist
    - can cope with heterogeneous data
    - manages to reconstruct weakly diffracting filaments (where layer lines are not visible)
Figure (Behrmann et al., J Struct Biol 2012): IHRSR cycle, repeated until convergence (no: next iteration): 2D templates (systematically generated reference projections); projection matching (orientation parameters: shifts and Euler angles); 3D reconstruction
(optional point symmetry): structure without helical symmetry; symmetrization: low-pass filtered and symmetrized helical structure; symmetry search: helical symmetry parameters

Map validation
Fourier shell correlation (FSC) between two maps with Fourier coefficients F_1 and F_2, summed over a resolution shell:
$$\mathrm{FSC} = \frac{\sum F_1 F_2^{*}}{\sqrt{\sum |F_1|^2 \, \sum |F_2|^2}}$$
Figure: FSC plotted against resolution (Å)

FSC    C      C_REF   Phase error   (S/N)^(1/2)
0.50   0.67   0.82    35°           1.00
0.33   0.50   0.71    45°           0.71
0.14   0.25   0.50    60°           0.41

Map validation
- observed features of the map should be consistent with the resolution assessment
- visibility of expected structural features:
  - helices visible at 8 Å
  - strands separated at 4.8 Å
  - side chains visible beyond 4 Å

Map validation - experimental
- cryo-ET

Map validation - steps:
- the map is correct at low resolution
- spurious noise features are not present (noise overfitting, over-refinement)
- the FSC curve has a proper shape
- the resolution estimate corresponds to the observed structural features
- acquisition of complementary data to confirm the model (e.g. at low resolution)

Map interpretation
Visualization tools
- Chimera/ChimeraX
- Coot
- PyMOL
- VMD
- Amira (commercial)

Segmentation
- identify the boundaries of map regions which represent different structural components
- component structures can be positioned into the identified segments
- the [...] of the segmented components is related to the map resolution
- manual segmentation | automated segmentation | knowledge-based segmentation

Map interpretation
Figure: map-interpretation workflow: known component structure; segmentation; 10-20 Å: fit known structures; fold assignment from sequence; template found? (yes: homology modelling, no: template-free modelling); resolution?; 4-10 Å: secondary structure assignment,
sequence assignment; rigid body fit; fit different from map: multiple conformations (real-space methods, MD-based methods); <4 Å: model building

Map interpretation
Structure determination from amino acid sequence (e.g. GFCHIKAYTRLIMVG...)
- template-free: ab initio (de novo) prediction, fragment assembly, evolutionary couplings
- template-based: threading, comparative (homology) modelling
- AlphaFold 2
- programs: MODELLER, SWISS-MODEL, Phyre2, RaptorX, I-TASSER, Rosetta, EVfold

Map interpretation
Fold recognition from density
- 4.5-10 Å: secondary structure detection (Baker et al., Structure 2007)
- 4.5 Å and better: de novo Cα tracing and model building (figure: density, Pathwalker, model; Baker et al., Structure 2012)
- programs: SSEhunter, SSEtracer, Ematch, Pathwalker, Coot, Buccaneer, EM-fold, Rosetta, Phenix, ARP/wARP, ModelAngelo
Villa & Lasker, Curr Opin Struct Biol, 2014; Cassidy et al., Curr Opin Microbiology 2018

Map interpretation
Density fitting - manual fitting
- positioning of the atomic structure into the cryo-EM density using visualization programs
- usually efficient (the human brain is efficient at pattern recognition)
- direct feedback
- good for initial placement of the component into the map
- high level of subjectivity may lead to errors
- depends on the contour level at which the map is visualized
- conformational rearrangements cannot be modelled

Map interpretation
Density fitting - automated fitting
- requires a common representation of both the structure and the density map
- a measure of the quality of the fit
- an optimization protocol for fit improvement
Figure: density map and atomic component structure; component representation and placement; optimisation based on goodness-of-fit

Map interpretation
Problems of density fitting - limited resolution
- many local optima with similar numerical values at low resolution
- local resolution, noise, scaling, filtering, masking
- blurring of the atomic structure
Remedies: better
resolution; improved scoring for goodness-of-fit; coarse-graining (change of representation); fit/model validation
Figure: correct fit; fit flipped by 180°

Map interpretation
Problems of density fitting - conformational variability
- many conformations observed in density maps deviate from the conformations of the atomic models being fitted:
  - dynamics
  - crystal packing effects
  - errors in structure prediction
-> allow for conformational changes during the model fitting process = flexible fitting

Map interpretation
Model refinement
- without any restraints, a model may fit well with a high score in near-atomic to low-resolution density
- such a model will, however, not have standard protein geometry: backbone torsions (Ramachandran diagram), peptide planarity, chirality, peptide bond conformation (trans/cis), bond lengths and angles, side chain torsions / rotamers
- refinement methods try to maintain standard geometry while fitting the model into the density map; the geometry restraints reduce the number of degrees of freedom
- the map density contributes as an additional penalty in the scoring function
- programs: MDFF, Refmac, Rosetta, Coot, Phenix, Isolde, iMODFIT

Map interpretation
Model validation
- model fit
- model geometry:
  - peptide planarity
  - backbone torsions (Ramachandran)
  - bond lengths
  - bond angles
  - side chain rotamers
MolProbity: http://molprobity.biochem.duke.edu/
WHAT_CHECK: http://swift.cmbi.ru.nl/gv/whatcheck/
PROCHECK: http://www.ebi.ac.uk/thornton-srv/software/PROCHECK/

Map improvement
- to facilitate map interpretation, the data processing should correct for the imperfections of the imaging system to the highest possible level
- these imperfections comprise:
  - aberrations of the microscope optical system (higher-order)
  - sample drift and distortions caused by the interaction of the electrons with matter
- the effect is most pronounced at high spatial frequencies (high resolution)
-> parameter optimization and additional data processing primarily concern improving the quality of high-resolution maps
(<4.5 Å resolution)
- the effect at medium and low resolution (>8 Å) is limited, and additional data processing usually does not result in any map improvement

Map improvement
Electron lens aberrations
- the objective lens of a transmission electron microscope is a very poor lens

2.2: Description of aberration constants to 6th order
A0        Lateral image shift
A1        Two-fold astigmatism
C1        Defocus
A2        Three-fold astigmatism
B2        Axial coma
A3        Four-fold astigmatism
S3        Axial star aberration
C3 = Cs   Spherical aberration
A4        Five-fold astigmatism
D4        Three-lobe aberration
B4        Fourth-order axial coma
A5        Six-fold astigmatism
S5        Fifth-order star aberration
C5        Fifth-order spherical aberration
R5        Fifth-order rosette aberration

$$B(k) = \exp\left[-\frac{2\pi i}{\lambda} W(k)\right]$$
$$W(k) = \Re\Big\{ A_0 \lambda k^{*} + \tfrac{1}{2} A_1 \lambda^2 k^{*2} + \tfrac{1}{2} C_1 \lambda^2 k^{*} k + \tfrac{1}{3} A_2 \lambda^3 k^{*3} + \tfrac{1}{3} B_2 \lambda^3 k^{*2} k + \tfrac{1}{4} A_3 \lambda^4 k^{*4} + \tfrac{1}{4} S_3 \lambda^4 k^{*3} k + \tfrac{1}{4} C_3 \lambda^4 k^{*2} k^{2} + \tfrac{1}{5} A_4 \lambda^5 k^{*5} + \tfrac{1}{5} D_4 \lambda^5 k^{*4} k + \tfrac{1}{5} B_4 \lambda^5 k^{*3} k^{2} + \tfrac{1}{6} A_5 \lambda^6 k^{*6} + \tfrac{1}{6} S_5 \lambda^6 k^{*4} k^{2} + \tfrac{1}{6} C_5 \lambda^6 k^{*3} k^{3} + \tfrac{1}{6} R_5 \lambda^6 k^{*5} k \Big\}$$

Map improvement
Zernike polynomials
- complete set of orthogonal functions
- the Zernike transform is analogous to the Fourier transform
- can be used to visualize lens aberrations
- the aberrations can be corrected by introducing an additional lens into the microscope or in software during image processing
Frits Zernike, 1953 Nobel Prize in Physics, inventor of phase contrast microscopy
Figure: Zernike polynomials; axis: spatial frequency

Map improvement
Lens aberrations
Figure: astigmatism, panels Original, Compromise, Horizontal Focus, Vertical Focus

Map improvement
Lens aberrations
- a certain level of underfocus is necessary during cryo-EM data collection -> corrected during CTF correction
- astigmatism can be eliminated to a large extent by proper microscope alignment
- defocus and astigmatism are the only aberrations relevant for the quality of medium- and low-resolution maps
- correct estimation of the CTF parameters (defocus, astigmatism) -> quality control: goodness of fit
Figure: astigmatism and defocus

Map improvement
Lens aberrations (spherical aberration)
- dependence on the fourth power of
the frequency
- the lens is stronger off axis; plane of least confusion
- considered constant for a given microscope; further optimization in software is possible
Figure: spherical aberration, lens with Cs = 0 and Cs ≠ 0; plane of least confusion (disk diameter = Cs β^3); Gaussian image plane (disk diameter = 2 Cs β^3)

Map improvement
Sample distortions during imaging
- local motion differs in distinct parts of the image
- approaches (figure axis: x-position (pixels)):
  - alignparts_lmbfgs (Rubinstein & Brubaker, 2015, JSB 192, 188-95): compare particle in each frame to sum of frames [improved version in cryoSPARC ver 2]
  - Relion polishing (Scheres, 2014, eLife 3:e03665): compare particle in each frame to map [improved version with Alignparts-like smoothing in Bayesian polishing]
  - MotionCor2 (Zheng...Agard, 2017, Nat Meth 14, 331-2): compare patch from each frame to sum of frames

Map improvement
Sample distortions during imaging
- the information in each frame is damped by a different B-factor due to distinct effects during data collection
- compensation for local motion (per particle) + per-frame amplitude weighting with the corresponding B-factor => particle polishing
Figure axis: electron fluence (e⁻/Å²)

Map improvement
Sample distortions during imaging
- distortion of the sample surface due to illumination with the electron beam
- particles are located at different depths of the specimen layer -> defocus variance for particles within a single micrograph
- per-particle defocus (astigmatism) estimation = CTF refinement
Figure (Noble et al., eLife 2018): apoferritin (0.5 mg/mL); apoferritin with 0.5 mM TCEP; protein with carbon over holes; protein and DNA strands with carbon over holes; T20S proteasome

Map improvement
Ewald sphere correction
- the assumption that the image is a 2D projection of the 3D object does not hold for thick specimens
- the wave function at the image plane samples the surface of a sphere in the 3D FT of the object
- Friedel symmetry is lost
- depends on electron
wavelength
- stronger effect (more important to consider) at 100 keV than at 300 keV
Figure (Russo & Henderson (2018), Ultramicroscopy): incident beam, single particle specimen, diffracted beams, diffraction plane; A in image, A after CTFP, A after CTFQ, A after CTFR = CTFP + CTFQ
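
The wavelength dependence of the Ewald sphere effect can be illustrated numerically. The sketch below is not taken from the slides: it assumes the standard relativistic electron wavelength formula and treats the Ewald sphere as a sphere of radius 1/λ passing through the origin of reciprocal space. The function names `electron_wavelength_angstrom` and `ewald_sag` are hypothetical, chosen for illustration only.

```python
import math

# Physical constants (SI units, CODATA 2018 values)
H = 6.62607015e-34      # Planck constant, J s
M0 = 9.1093837015e-31   # electron rest mass, kg
E = 1.602176634e-19     # elementary charge, C
C = 299792458.0         # speed of light, m/s


def electron_wavelength_angstrom(voltage_v):
    """Relativistic electron wavelength (in Å) for an accelerating voltage in volts."""
    energy = E * voltage_v  # kinetic energy in joules
    momentum = math.sqrt(2.0 * M0 * energy * (1.0 + energy / (2.0 * M0 * C ** 2)))
    return H / momentum * 1e10  # metres -> Å


def ewald_sag(voltage_v, spatial_freq):
    """Distance (in 1/Å) between the Ewald sphere and the central plane
    at a lateral spatial frequency given in 1/Å.

    Assumes a sphere of radius 1/wavelength touching the origin of reciprocal space.
    """
    radius = 1.0 / electron_wavelength_angstrom(voltage_v)
    return radius - math.sqrt(radius ** 2 - spatial_freq ** 2)


if __name__ == "__main__":
    for kilovolts in (100, 300):
        volts = kilovolts * 1e3
        wavelength = electron_wavelength_angstrom(volts)
        sag = ewald_sag(volts, 0.5)  # 0.5 1/Å corresponds to 2 Å resolution
        print(f"{kilovolts} kV: wavelength = {wavelength:.5f} Å, "
              f"sag at 2 Å = {sag:.5f} 1/Å")
```

The wavelength comes out at roughly 0.037 Å for 100 kV and roughly 0.020 Å for 300 kV. The longer wavelength gives a smaller sphere radius, so the sphere departs further from the central section at the same spatial frequency, which is consistent with the stronger effect at 100 keV.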