LOSCHMIDT LABORATORIES
PROTEIN ENGINEERING
7. Rational and semi-rational design
Loschmidt Laboratories, Department of Experimental Biology, Masaryk University, Brno

□ Protein engineering approaches
□ Semi-rational design
  ■ identification of hot-spots
  ■ evaluation of hot-spots
  ■ selection of substitutions
  ■ design of library
  ■ mutagenesis and screening
□ Rational design
  ■ molecular modeling

Protein engineering
□ altering protein structure to improve its properties
□ three main approaches
  ■ directed evolution
  ■ rational design
  ■ semi-rational design

Protein engineering approaches
Step  Rational design                                        Directed evolution
1     Computer aided design                                  not applied
2     Site-directed mutagenesis -> individual mutated gene   Random mutagenesis -> library of mutated genes (> 10,000 clones)
3     Transformation                                         Transformation
4     Protein expression                                     Protein expression
5     Protein purification                                   not applied
6     not applied                                            Screening and selection (stability, selectivity, affinity, activity)
      -> constructed mutant enzyme                           -> selected mutant enzymes
7     Biochemical testing (both approaches) -> IMPROVED ENZYME

Protein engineering approaches
                                              Rational design   Directed evolution   Semi-rational design
high-throughput screening/selection           not essential     essential            advantageous but not essential
structural and/or functional information      both essential    neither essential    either is sufficient
sequence space exploration                    low               high, random         moderate, targeted
probability to obtain synergistic mutations   moderate          low                  high

Structure information
□ worldwide Protein Data Bank (wwPDB)
  ■ http://www.wwpdb.org/
  ■ central repository of ~180,000 experimental macromolecular structures (April 2021)
□ RCSB PDB
  ■ https://www.rcsb.org/
□ PDBe (Protein Data Bank in Europe)
  ■ https://www.ebi.ac.uk/pdbe/
□ PDBj (Protein Data Bank Japan)
  ■ https://pdbj.org/

3D structure is determined by the sequence
MSLGAKPFGEKKFIEIKGRRMAYIDEGTGDPILFQHGNPTSSYLWRNI
MPHCAGLGRLIACDLIGMGDSDKLDPSGPERYAYAEHRDYLDALWEA
LDLGDRWLWHDWGSALGFDWARRHRERVQGIAYMEAIAMPIEWA
DFPEQDRDLFQAFRSQAGEELVLQD

[Figure: homology modeling workflow: target sequence (MSLGAKPFGE...) -> database search -> selection of template -> sequence alignment -> building model framework -> loop and side-chain modeling -> model optimization -> model validation]
□ MODELLER
  ■ http://salilab.org/modeller/
□ SWISS-MODEL
  ■ http://swissmodel.expasy.org/
□ Robetta
  ■ http://robetta.bakerlab.org/
□ I-TASSER
  ■ https://zhanglab.ccmb.med.umich.edu/I-TASSER/

Semi-rational design
□ combines advantages of rational and random approaches
□ selection of promising target sites (hot-spots) -> mutagenesis -> creation of small "smart" libraries
□ based on knowledge of protein structure and function
□ (+) high-throughput screening usually not needed
□ (+) increased chance of obtaining variants with desired properties
□ (-) some knowledge of protein structure-function relationships is still required, (+) but not that much

Identification of hot-spots
□ hot-spots for engineering catalytic properties
□ hot-spots for engineering thermostability

Hot-spots for engineering catalytic properties
□ residues mediating substrate binding, transition-state stabilization or product release -> mutations can improve or disrupt binding, catalysis or ligand transport
  ■ residues involved in protein-ligand interactions
  ■ residues located in binding pockets
  ■ residues located in access tunnels
-> these residues also include catalytic or other essential residues, which generally should not be mutated!
Analysis of protein-ligand interactions
□ requires 3D structure of protein-ligand complex
  ■ experimental structure (wwPDB)
  ■ theoretical model (molecular docking)
□ software: LigPlot, LigPlot+, PoseView

Analysis of protein-ligand interactions
□ inter-atomic contacts between protein and bound ligands (LPC server)
Residue     Dist (Å)   Surf (Å²)   Number of contacts
ASN 38 A    2.7        22.5        2
ASP 108 A   2.8        35.1        5
ILE 134 A   6.3        0.7         1
PHE 143 A   5.0        6.5         2
PHE 151 A   3.3        26.7        4
PHE 169 A   3.5        6.4         2
VAL 173 A   3.6        23.4        1
LEU 177 A   4.8        8.5         2
ILE 211 A   5.2        3.3         1
LEU 248 A   5.6        10.3        4
HIS 272 A   3.8        33.3        9
PHE 273 A   3.5        2.3         2
BR 901 A    3.8        30.7        2

UniProt (Swiss-Prot) - https://www.uniprot.org/
Feature key    Position   Description
Binding site   38         Halide
Active site    108        Nucleophile
Binding site   109        Halide
Active site    132        Proton donor
Active site    272        Proton acceptor

Catalytic Site Atlas - https://www.ebi.ac.uk/thornton-srv/m-csa/
Catalytic residues (UniProt / PDB) and their roles
■ His272 / His272A: acts as general acid/base to deprotonate water, thus activating water so its lone pair can attack the covalent enzyme intermediate
■ Asp108 / Asp108A: acts as nucleophile on the electrophilic carbon atom to form a covalent enzyme intermediate, which is hydrolysed to give the product
■ Asn38, Trp109 / Asn38A, Trp109A: involved in stabilisation of the halogen, transition states and product
■ Glu132 / Glu132A: acts to modify the pKa of His272 so that it remains in the correct protonation state for its role in catalysis

Analysis of binding pockets
□ binding and active sites of enzymes are often associated with structural pockets and cavities
[Figure: enzyme with active site, pocket and bound ligand; effect of a mutation in the pocket]
  ■ most amino acid residues located in these pockets may come into contact with the ligands during the catalytic cycle -> one can accurately predict which residues may interact with the ligand even without precise knowledge of ligand orientation in the active site
□ requires 3D structure of protein
□ software for detection of pockets
  ■ CASTp, fPocket, MetaPocket, Caver Analyst...
Analysis of binding pockets
□ detailed characterization of all pockets in the structure (CASTp)
[Screenshot: CASTp results for job 1d07: hydrolytic haloalkane dehalogenase LinB from Sphingomonas paucimobilis UT26 with 1,3-propanediol, a product of debromidation of dibrompropane, at 2.0 Å resolution. The pocket information table lists ID, area and volume of each detected pocket; the largest, pocket 42, has area 356.4 and volume 364.2. Atoms lining pocket 42 include ASN 38 (CB, ND2, O), ASP 108 (CG, OD1, OD2), TRP 109 (CD1, NE1) and ILE 134 (CD1, CG1), all in chain A.]

Annotated sites (CASTp, Swiss-Prot annotations)
Residue   Number:Chain   Pockets   Swiss-Prot annotation
ASN       38:A           30, 42    BINDING, halide
ASP       108:A          30, 42    ACT_SITE, nucleophile
TRP       109:A          42, 32    BINDING, halide
GLU       132:A          24, 34    ACT_SITE, proton donor
HIS       272:A          42        ACT_SITE, proton acceptor
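The rule that essential residues should not be mutated can be applied mechanically once the contact list and the annotated sites are known. The following is a minimal sketch, assuming the residue numbers are transcribed from the LPC and CASTp/Swiss-Prot listings for LinB; the variable names and the function `candidate_hot_spots` are hypothetical and not part of any of the tools mentioned.

```python
# Residues in contact with the bound ligand (chain A). The numbers are an
# illustrative transcription of the contact listing and are an assumption.
contact_residues = {
    38: "ASN", 108: "ASP", 134: "ILE", 143: "PHE", 151: "PHE", 169: "PHE",
    173: "VAL", 177: "LEU", 211: "ILE", 248: "LEU", 272: "HIS", 273: "PHE",
}

# Positions annotated as catalytic or halide-binding; these are excluded.
essential_positions = {38, 108, 109, 132, 272}


def candidate_hot_spots(contacts, essential):
    """Return contact positions that are not annotated as essential."""
    return sorted(pos for pos in contacts if pos not in essential)


for pos in candidate_hot_spots(contact_residues, essential_positions):
    print(f"{contact_residues[pos]} {pos}")
```

The result is only a starting list: the remaining positions still need to be evaluated, for example by conservation analysis, before they are targeted.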
Analysis of access tunnels
□ buried binding or active sites are connected with bulk solvent by access tunnels
[Figure: enzyme with access tunnel, active site, pocket and ligand; effect of a mutation in the tunnel]
  ■ adjusted to permit transport of specific molecules
  ■ mutations can speed up or hinder transport of molecules as well as allow transport of other molecules
□ requires 3D structure of protein
□ software for detection of tunnels
  ■ Caver, Mole, HOLE, PoreWalker

Analysis of access tunnels
□ detection and detailed characteristics of access tunnels
[Screenshot: CAVER Analyst 2.0]

Hot-spots for engineering thermostability
□ highly flexible residues - introduction of rigidifying mutations
□ residues located in access tunnels
□ residues predicted by systematic in silico saturation mutagenesis
-> these residues may also include catalytic or other essential residues, which generally should not be mutated!

Identification of highly flexible residues
□ prediction based on crystallographic B-factors
  ■ reflect the degree of thermal motion, and thus the flexibility of individual residues
□ requires 3D structure of protein
  ■ experimental structure determined by X-ray crystallography (wwPDB)

Identification of highly flexible residues
□ average B-factor of each residue in the target protein
[Screenshot: B-FITTER output for "Crystal structure of nidogen/laminin complex" (only the highest 20 averaged B values are shown): chain A residues ranked by average B value, from 48.46 (rank 1) down to 42.76 (rank 13)]
□ B-FITTER

Analysis of access tunnels
□ saturation mutagenesis in tunnel residues has a 2x better chance to significantly improve stability than mutagenesis in other protein regions (based on computational predictions)
[Figure: predicted ΔΔG [kcal/mol] for tunnel residues versus other residues]

Analysis of access tunnels
□ detection of tunnels in proteins and analysis of ligand transport
[Screenshot: CAVER Web]

Systematic in silico saturation mutagenesis
□ computational tools for the prediction of the effect of amino acid substitutions on protein stability
  ■ each residue in the protein structure is replaced by all other possible amino acids and the change in folding free energy (ΔΔG) upon mutation is estimated
  ■ positions with a high proportion of stabilizing mutations and/or a low proportion of destabilizing mutations are good candidates for randomization by experimental saturation mutagenesis
□ usually requires 3D structure of protein
  ■ experimental structure (wwPDB)
  ■ theoretical model (homology modeling)

Systematic in silico saturation mutagenesis
□ fast systematic scan of all possible single-point mutations - prediction of stability changes upon mutation
□ sequence optimality score (the sum of all negative ΔΔGs at a given position) - indicates poorly optimized positions

Evaluation of hot-spots
□ hot-spots identified by computational tools can be further evaluated to prevent replacing indispensable amino acid residues and to prioritize the hot-spots (i.e., order the hot-spots based on their suitability for mutagenesis)
□ analysis of evolutionary conservation
□ prediction of effects of mutations on protein stability or function

Analysis of evolutionary conservation
□ residues essential for maintaining structural or functional properties of a protein tend to be conserved during evolution
  ■ conserved residues are generally not recommended as suitable targets for mutagenesis - their replacement often leads to the loss of protein function
  ■ mutagenesis targeting highly mutable positions provides a significantly higher proportion of viable variants than random mutagenesis
  ■ targeting moderately or highly variable positions, which are expected to be tolerant to a wide range of substitutions, represents a good approach for producing efficient smart libraries (i.e., libraries with a high proportion of correctly folded and active variants)

Analysis of evolutionary conservation
□ residue conservation can be derived from a multiple alignment of a set of related proteins (3D structure not required)
[Figure: multiple sequence alignment of seven related proteins with conserved columns highlighted]

Analysis of evolutionary conservation
□ evolutionary conservation of individual positions in protein mapped on protein 3D structure
[Figure: ConSurf conservation scale from variable to conserved]
□ ConSurf

Prediction of mutation effects
□ computational tools for the prediction of the effect of amino acid substitutions on protein stability or protein function
  ■ in silico site-saturation mutagenesis of identified hot-spots - check if mutations at a given site are likely to be tolerated
  ■ many highly destabilizing/deleterious mutations predicted for a certain position - the given site is not a very good target for mutagenesis
  ■ sites with only a few highly destabilizing/deleterious mutations predicted can still represent promising hot-spots (the amino acids with potentially destabilizing/deleterious effects can be discarded from the library by the appropriate selection of degenerate codons)
□ effects on protein stability - usually requires 3D structure of protein
  ■ experimental structure (wwPDB)
  ■ theoretical model (homology modeling)
□ effects on protein function - sequence information often sufficient

Prediction of mutation effects
□ prediction of effect of substitutions on protein stability
  ■ evaluation of the change of protein free energy upon mutation
  ■ evaluation of contributions of individual interactions to total energy
  ■ usually requires structural information
□ software for prediction of effect of mutation on stability
  ■ Rosetta, FoldX, CUPSAT, ERIS

Prediction of mutation effects
□ prediction of effect of substitutions on protein stability (CUPSAT output)
Amino acid   Overall stability   Torsion        Predicted ΔΔG (kcal/mol)
GLY          Stabilising         Unfavourable   1.43
ALA          Destabilising       Unfavourable   -0.9
VAL          Destabilising       Unfavourable   -2.23
ILE          Destabilising       Unfavourable   -2.12
MET          Stabilising         Unfavourable   1.39
PRO          Stabilising         Unfavourable   1.55
TRP          Stabilising         Favourable     2.73
SER          Stabilising         Unfavourable   1.2
THR          Destabilising       Unfavourable   -0.44
PHE          Stabilising         Favourable     3.64
GLN          Destabilising       Unfavourable   -0.69
LYS          Stabilising         Unfavourable   9.91
TYR          Stabilising         Favourable     0.96
ASN          Stabilising         Favourable     4.14
CYS          Destabilising       Favourable     -6.73
GLU          Stabilising         Unfavourable   4.93
ASP          Stabilising         Favourable     1.31
ARG          Stabilising         Unfavourable   2.94
HIS          Stabilising         Favourable     1.33

Prediction of mutation effects
□ prediction of effect of substitutions on protein function
  ■ evaluation of whether a mutation would impair protein function
  ■ hard to describe by physico-chemical properties -> machine learning
  ■ usually sequence-based calculation
□ software for prediction of effect of mutation on function
  ■ PredictSNP, SIFT, MAPP, PhD-SNP...
[Screenshot: PredictSNP results page]

Selection of substitutions
□ substitutions introduced using degenerate codons
  ■ e.g., NNK (N = A/T/G/C; K = T/G)

IUPAC nucleotide nomenclature
Symbol   Base
A        adenosine
C        cytidine
G        guanine
T        thymidine
U        uridine
R        G A (purine)
Y        T C (pyrimidine)
K        G T (keto)
M        A C (amino)
S        G C (strong)
W        A T (weak)
B        G T C
D        G A T
H        A C T
V        G C A
N        A G C T (any)

Selection of substitutions
□ all possible substitutions - NNK or NNS degenerate codons
  ■ (+) encode all 20 amino acids with the lowest redundancy and price (mixture of 32 codons)
  ■ (-) redundancy is not completely eliminated (3x Arg, Leu, Ser; 2x Ala, Gly, Pro, Thr and Val)
□ introduction of only selected substitutions using degenerate codons encoding reduced amino acid alphabets
  ■ (-) do not encode all 20 amino acids
  ■ (+) decreased library size -> improved screening efficiency
  ■ NDT - balanced set of 12 amino acids (12 codons)

Table 1. Oversampling necessary for 95% coverage as a function of NNK and NDT codon degeneracy.
No.[a]   NNK codons     NNK transformants needed   NDT codons     NDT transformants needed
1        32             94                         12             34
2        1024           3066                       144            430
3        32768          98163                      1728           5175
4        1048576        3141251                    20736          62118
5        33554432       100520093                  248832         745433
6        > 1.0 x 10^9   > 3.2 x 10^9               > 2.9 x 10^6   > 8.9 x 10^6
7        > 3.4 x 10^10  > 1.0 x 10^11              > 3.5 x 10^7   > 1.1 x 10^8
8        > 1.0 x 10^12  > 3.3 x 10^12              > 4.2 x 10^8   > 1.3 x 10^9
9        > 3.5 x 10^13  > 1.0 x 10^14              > 5.1 x 10^9   > 1.5 x 10^10
10       > 1.1 x 10^15  > 3.4 x 10^15              > 6.1 x 10^10  > 1.9 x 10^11
[a] Number of aa positions at one site.

Selection of reduced amino acid alphabets
□ introduction of amino acids exhibiting certain properties
  ■ VRK - 8 hydrophilic amino acids (12 codons)
  ■ NYC - 8 hydrophobic amino acids (8 codons)
  ■ KST - 4 small amino acids (4 codons)
□ introduction of a balanced set of amino acids
  ■ NDT - balanced set of 12 amino acids (12 codons)
□ introduction of substitutions existing (at a given site) in known natural proteins
  ■ likely to increase the proportion of viable variants in the resulting library
  ■ can be obtained by analysis of a multiple sequence alignment
□ discarding amino acids with potentially destabilizing/deleterious effects
  ■ can be obtained by prediction of effects of mutations on protein stability or function

HotSpot Wizard
□ meta-server combining several tools
  ■ automatic identification of hot-spots for engineering of enzyme catalytic properties
  ■ prioritization of hot-spots by their mutability
  ■ distribution of amino acids at individual positions
[Screenshot: HotSpot Wizard results (functional hot spots of 1CV2; further tabs: stability hot-spots (evolution), correlated hot-spots). Residue filters: exclude correlated positions, exclude catalytic pockets, exclude tunnels, exclude buried residues, include residues with moderate mutability, exclude α-helices and β-sheets. The results table lists mutable chain A residues and flags whether each is non-essential, in a tunnel or in the catalytic pocket; side panels list tunnels (length, bottleneck radius), pockets (relevance, volume) and the residues selected for mutagenesis.]

HotSpot Wizard workflow
1. protein structure
2. residues indispensable for protein function: catalytic and binding residues
3. functional residues: active site pocket and tunnels
4. mutability of individual positions of protein

Design of library
□ decisions to be made after evaluation and prioritization of hot-spots:
  ■ how many and which positions to target?
  ■ should the positions be randomized simultaneously or separately?
  ■ should all or only a reduced set of amino acids be introduced at individual positions?
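How these choices translate into library size can be sketched in a few lines. The example below expands a degenerate codon using the IUPAC symbols, counts the encoded amino acids with the standard genetic code, and estimates the number of transformants needed for a given coverage. It assumes that all codon variants are equally likely and reads "95% coverage" as the expected fraction of variants sampled at least once; under that assumption it reproduces the first rows of Table 1. The function names `expand`, `encoded`, `library_size` and `transformants_needed` are illustrative only.

```python
import math
from itertools import product

# IUPAC nucleotide symbols mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered T, C, A, G at each position; '*' = stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate):
    """Return all codons encoded by a three-letter degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in degenerate))]


def encoded(degenerate):
    """Count how many codons encode each amino acid ('*' marks a stop)."""
    counts = {}
    for codon in expand(degenerate):
        aa = CODON_TABLE[codon]
        counts[aa] = counts.get(aa, 0) + 1
    return counts


def library_size(codons):
    """Number of distinct DNA variants when the sites are randomized together."""
    return math.prod(len(expand(c)) for c in codons)


def transformants_needed(n_variants, coverage=0.95):
    """Clones to screen so that the expected covered fraction equals `coverage`."""
    return round(math.log(1 - coverage) / math.log1p(-1 / n_variants))


for scheme in (["NNK"], ["NDT"], ["NNK", "NNK"], ["NDT", "NDT"]):
    size = library_size(scheme)
    print(scheme, size, transformants_needed(size))
```

Randomizing positions separately keeps each library at the single-site size, whereas simultaneous randomization multiplies the sizes, which is why reduced alphabets such as NDT matter most when several sites are combined.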
-> these decisions have a dramatic effect on the size of the resulting library

Design of library - HotSpot Wizard
[Screenshot: HotSpot Wizard results for 1CV2 with residues selected for mutagenesis and the "Design library" button]

Design of library - HotSpot Wizard
[Screenshot: library design (SwiftLib) for positions A136 (Met), A146 (Gln) and A147 (Asp). Amino acids are selected by frequency, with the wild-type optionally included or excluded. For each position the tool lists the desired amino acids, the proposed degenerate codon, the desired ratio (%) and the stop ratio (%). Summary: library size 7315, expected coverage 0.95, codon usage Escherichia coli K12.]

Design of library - HotSpot Wizard
[Screenshot: candidate degenerate codons for one position, ranked by desired ratio (%), with the encoded amino acids and their codon counts: DBS 100.0, DBK 100.0, DBB 100.0, DBN 97.2, DBV 96.3, DBD 96.3, NBS 91.7, NBK 91.7, NBB 91.7, NBN 89.6. For example, DBS encodes Ala:2, Cys:1, Phe:1, Gly:2, Ile:1, Leu:1, Met:1, Arg:1, Ser:3, Thr:2, Val:2, Trp:1.]

Mutagenesis and screening
□ saturation mutagenesis - next lecture

Rational design
□ site-specific changes on the target enzyme
□ few amino-acid substitutions that are predicted to elicit desired improvements of enzyme function
□ based on detailed knowledge of protein structure, function and catalytic mechanism
□ (+) relatively simple characterization of constructed variants
□ (-) complexity of protein structure-function relationships
□ (-) molecular modeling expertise usually required

Molecular modeling
□ "Theoretical or computational technique that provides insight into the behavior of molecular systems." A. R. Leach
□ applications
  ■ protein stabilization
  ■ prediction of protein dynamics
  ■ prediction of protein-ligand interactions
  ■ prediction of reaction barriers and reaction mechanisms
□ relationship between energy and 3D structure
  ■ potential energy surface
□ basic methods
  ■ molecular mechanics
  ■ molecular dynamics
  ■ quantum chemistry
  ■ molecular docking

□ enzymes as biocatalysts
  ■ good activity and selectivity in water solution and at standard temperature
  ■ many biotechnological applications require high temperature or addition of organic solvents
  ■ these conditions can lead to denaturation -> importance of stable proteins

Design of stability
□ computational method FireProt - https://loschmidt.chemi.muni.cz/fireprot/
  ■ prediction of all single-point mutants by FoldX, Rosetta, and back-to-consensus
  ■ smart filtering based on conservation, correlation, electrostatic interactions, and antagonistic effect
  ■ final prediction of multiple-point mutants for gene synthesis
[Screenshot: FireProt results page with the mutation table and structure visualization]

Design of stability - use case
□ stabilization of haloalkane dehalogenase DhaA
  ■ in silico prediction of 5,500 mutants
  ■ experimental testing of 5 mutants
□ output
  ■ 3 more stable mutants
  ■ combined mutant ΔTm = 24 °C
[Figure: FireProt workflow, with counts as shown next to each step. Target protein (5,529). Energy-based approach: conservation and correlation analysis (3,610) -> FoldX prediction (151) -> Rosetta prediction (22) -> interaction analysis (21) -> antagonistic effect prediction -> multiple-point mutant design -> structure and activity check -> stability determination. Evolution-based approach: back-to-consensus analysis (42) -> FoldX prediction (20) -> interaction analysis (13) -> multiple-point mutant design -> structure and activity check -> stability determination. Both branches are merged into the combined mutant.]

Molecular dynamics
□ successive configurations of system in time
□ provides information on energetics, amplitudes and time scales of local motions on atomic level
□ generates ensemble of structures
  ■ more precise calculations of free energies
[Figure: ligand conversion and dynamical behaviour]

Molecular docking
□ predicts structure of receptor (protein) - ligand complex
□ two-component procedure
  ■ searching - finding the conformation of the ligand in the active site of the enzyme
  ■ scoring - evaluation of the binding free energy
□ docking software
  ■ Autodock, Vina, Gold, Medusa, Rosetta Dock...
[Figure: docked pose, energy = -76.69 kcal/mol, RMSD = 0.86 Å]
□ virtual screening
  ■ many compounds against one enzyme
  ■ one compound against many enzymes

Quantum chemistry
□ modeling of reaction
  ■ reaction barrier
[Figure: reaction energy profile (TRITON)]

Design of mutations
[Figure: modeling workflow: investigated system -> MD of free protein -> docking -> MD of complex -> QM of complex -> hypothesis -> design of mutations -> experiment -> knowledge]

Design of mutations
□ identification of functionally important residues
  ■ decomposition of energies to individual contributions
  ■ flexible residues - functionally important dynamics
  ■ residues in contact with ligand
-> further molecular modeling
-> semi-rational design
□ design of modified enzymes by in silico screening
  ■ study of effects of all relevant mutations
  ■ selection and combination of the best mutations
[Figure: all possible single-point mutations -> modeling of TS stabilization and modeling of stability -> best candidate]

Design of mutations
□ effect of mutations at molecular level
  ■ example: improved activity of tunnel mutant (closed tunnel + improved activity)

PROTEIN ENGINEERING
8. Directed evolution