Bi7430 Molecular Biotechnology
Protein Engineering

Outline
❑ Limitations of proteins in biotechnology processes
❑ Definition and aim of protein engineering
❑ Targeted properties of proteins
❑ Basic approaches in protein engineering
  ▪ DIRECTED EVOLUTION
  ▪ RATIONAL DESIGN
  ▪ SEMI-RATIONAL DESIGN
❑ Examples, application of artificial intelligence

Proteins in biotechnology
[Diagram labels: available protein, suitable/adapted protein]
❑ key problem - availability of an optimal protein for a specific process
❑ traditional biotechnology - adapt process
❑ modern biotechnology - adapt protein
HOW TO OBTAIN OPTIMAL PROTEIN?

Proteins in biotechnology
❑ classical screening
  ▪ screening culture collections
  ▪ polluted and extreme environments
❑ environmental gene libraries
  ▪ metagenomic DNA
❑ database mining
  ▪ gene databases
  ▪ (meta)genome sequencing projects
  ▪ numerous uncharacterised proteins
❑ PROTEIN ENGINEERING
IF SUITABLE PROTEIN DOES NOT EXIST IN NATURE?
Aims of protein engineering
❑ the process of constructing novel protein molecules by design from first principles or by altering an existing structure
❑ use of genetic manipulations to alter the coding sequence of a gene and thus modify the properties of the protein
❑ AIMS AND APPLICATIONS
  ▪ technological - optimisation of the protein to suit a particular technological purpose
  ▪ scientific - desire to understand which elements of proteins contribute to folding, stability and function

Targeted properties of proteins
❑ structural properties of proteins
  ▪ stability (temperature, solvents)
  ▪ tolerance to pH, salt
  ▪ resistance to oxidative stress
❑ functional properties of proteins
  ▪ substrate specificity and selectivity
  ▪ kinetic properties (e.g., Km, kcat, Ki)
  ▪ cofactor selectivity
  ▪ protein-protein or protein-DNA interactions

Strategies in protein engineering
[Diagram label: improved protein]

Directed evolution
❑ directed evolution techniques emerged during the mid-1990s
❑ inspired by natural evolution
❑ this form of "evolution" does not match what Darwin had envisioned
  ▪ requires outside intelligence, not blind chance
  ▪ does not take millions of years, but happens rapidly

Directed evolution
❑ evolution in a test tube comprises two steps
  ▪ random mutagenesis - building a mutant library (diversity)
  ▪ screening and selection - identification of the desired biocatalyst
❑ prerequisites for directed evolution
  ▪ gene encoding the protein of interest
  ▪ method to create a mutant library
  ▪ suitable expression system
  ▪ screening or selection system

Methods to create mutant libraries
❑ technology to generate large diversity
  ▪ NON-RECOMBINING - one parent gene -> variants with point mutations
  ▪ RECOMBINING - several parental homologous genes -> chimeras

Non-recombining mutagenesis
❑ UV irradiation or chemical mutagens (traditional)
❑ mutator strains - lack DNA repair mechanisms, so mutations accumulate during replication (e.g., Epicurian coli XL1-Red)
❑ error-prone polymerase chain reaction (ep-PCR)
  ▪ gene amplified in
    imperfect copying process (e.g., unbalanced deoxyribonucleotide concentrations, high Mg2+ concentration, Mn2+, low annealing temperatures)
  ▪ 1 to 20 mutations per 1000 base pairs
❑ saturation mutagenesis
  ▪ randomization of single or multiple codons
  ▪ gene site saturation mutagenesis
❑ other methods
  ▪ insertions/deletions (InDel)
  ▪ cassette mutagenesis (region mutagenesis)

Recombining mutagenesis
❑ also referred to as "sexual mutagenesis"
❑ DNA shuffling
  ▪ fragmentation step
  ▪ random reassembly of segments
❑ StEP - staggered extension process
  ▪ simpler than shuffling
  ▪ random reannealing combined with limited primer extension
❑ other methods
  ▪ shuffling of genes with lower homology, down to 70% (e.g., RACHITT, ITCHY, SCRATCHY)

Screening and selection
❑ most critical step of directed evolution
❑ isolation of positive mutants hiding in the library
  ▪ HIGH THROUGHPUT SCREENING - individual assays of variants one by one
  ▪ DIRECT SELECTION - display techniques (link between genotype and phenotype)

(Ultra)High throughput screening
❑ common methods not applicable
❑ agar plate (pre)screening
❑ microtiter plate screening
  ▪ 96-, 384- or 1536-well format
  ▪ robot assistance (colony picker, liquid handler)
  ▪ libraries of 10^4 variants
  ▪ volume 10 – 100 µL
❑ microfluidic systems (Lesson 6)
  ▪ water-in-oil emulsions (up to 10 kHz)
  ▪ FACS sorting (10^8 events/hour)
  ▪ libraries of 10^9 variants
  ▪ volume 1 – 10 pL

Direct selection
❑ not generally applicable (mutant libraries >10^6 variants)
❑ link between genotype and phenotype
❑ display technologies
  ▪ ribosome display
  ▪ phage display
❑ life-or-death assay
  ▪ auxotrophic strain
  ▪ toxicity-based selection

Example of directed evolution
❑ directed evolution of enantioselectivity
  ▪ lipase from P. aeruginosa (E-value improved from 1.1 to 51)
  ▪ spectrophotometric screening of (R)- and (S)-nitrophenyl esters
  ▪ 40 000 variants screened
  ▪ the best mutant contains six amino acid substitutions
Reetz, M., et al. 2001. Angew. Chem. Int. Ed.
40: 3589-91

Strategies in protein engineering

Rational design
❑ emerged around the 1980s as the original protein engineering approach
❑ knowledge-based - combining theory and experiment
❑ protein engineering cycle: "structure-theory-design-mutation-purification-analysis"
❑ difficulty in predicting the effects of mutations on protein properties
❑ de novo design is the most challenging

Principle of rational design
❑ rational design comprises:
  ▪ design - understanding of protein functionality
  ▪ experiment - construction and testing of mutants
❑ prerequisites for rational design:
  ▪ gene encoding the protein of interest
  ▪ 3D structure (e.g., X-ray, NMR) or sequence alignment
  ▪ structure-function relationship
  ▪ computational methods and capacity
  ▪ site-directed mutagenesis techniques
  ▪ efficient expression system
  ▪ biochemical tests

Design
❑ SEQUENCE HOMOLOGY APPROACH
  ▪ alignment of homologous wild-type sequences
  ▪ identifying amino acid residues responsible for differences
  ▪ design - combination of positive mutations from all parental proteins
❑ ANCESTRAL RECONSTRUCTION
  ▪ construction of a phylogenetic tree
  ▪ design - prediction of nodes by the consensus approach

Bioinformatika (Bioinformatics) Bi5000
▪ Term: autumn
▪ Load: lecture 2 hours/week, exercise 2 hours/week
▪ Lecturers: prof. Mgr. Jiří Damborský, Dr., doc. RNDr.
Roman Pantůček, Ph.D.
▪ Syllabus:
  ▪ bioinformatics databases and how to search them
  ▪ analysis of nucleotide and protein sequences
  ▪ gene finding and identification
  ▪ analysis and prediction of protein structure

Design
❑ STRUCTURE-BASED APPROACH
  ▪ prediction of enzyme function from structure alone is challenging
  ▪ protein structure (X-ray crystallography, NMR, homology models!)
  ▪ molecular modelling
    o molecular docking
    o molecular dynamics
    o quantum mechanics/molecular mechanics (QM/MM)

Strukturní biologie (Structural Biology) Bi9410
▪ Term: autumn
▪ Load: lecture 2 hours/week, exercise 2 hours/week
▪ Lecturer: Mgr. David Bednář
▪ Syllabus:
  ▪ structure, stability and dynamics of biological macromolecules
  ▪ macromolecular interactions and complexes
  ▪ determination and prediction of structure, identification of important regions
  ▪ determining the effect of a mutation on protein structure and function
  ▪ applications in biological research, drug design and biocatalyst design

Construction
❑ site-directed mutagenesis
  ▪ introducing point mutations
❑ multi site-directed mutagenesis
❑ gene synthesis
  ▪ commercial service
  ▪ codon optimisation

Example of rational design
❑ rational design of protein stability
  ▪ stability to high temperature, extreme pH, proteases etc.
  ▪ stabilizing mutations increase the strength of weak interactions
    o salt bridges and H-bonds - Eijsink et al., Biochem. J. 285: 625-628, 1992
    o S-S bonds - Matsumura et al., Nature 342: 291-293, 1989
    o addition of prolines - Watanabe et al., Eur. J. Biochem. 226: 277-283, 1994
    o fewer glycines - Margarit et al., Protein Eng. 5: 543-550, 1992
    o oligomerisation - Dalhus et al., J. Mol. Biol. 318: 707-721, 2002

Example of rational design
❑ engineering a protein to resist boiling
  ▪ reduced rotational freedom - Ser65Pro, Ala96Pro
  ▪ introduction of a disulfide bridge - Gly8Cys + Asn60Cys
  ▪ improved internal hydrogen bond - Ala4Thr
  ▪ filling a cavity - Tyr63Phe
Van den Burg, B., et al., 1998. PNAS 95: 2056-2060
Half-lives (min)
             80°C     100°C
wild type    17.5     <0.5
mutant       stable   170

Strategies in protein engineering
SEMI-RATIONAL DESIGN

Example of semi-rational design
❑ conversion of 1,2,3-trichloropropane by DhaA from Rhodococcus erythropolis Y2
❑ DIRECTED EVOLUTION - importance of access pathways
  [Figure labels: C176, Y176, F176]
  Bosma, et al. 2002: AEM 68: 3582-87
  Gray, et al. 2003: Adv. Appl. Microbiol. 52: 1-27
❑ SEMI-RATIONAL DESIGN - hot spots in access tunnels
❑ library of 5,300 clones screened
  Pavlova, et al. 2009: Nature Chem. Biol. 5: 727-733

Experimental throughput is critical
STANDARD DESIGN
  ▪ random mutagenesis (2-3 positions)
  ▪ library of 10^4 clones
ADVANCED DESIGN
  ▪ random mutagenesis (5-7 positions)
  ▪ library of >10^6 clones
[Figure labels: volume 10× pL, assays/day 10^7; volume 100× µL, assays/day 10^3]

AI in Protein Engineering
  ▪ DEEP MUTATIONAL SCANNING - supervised learning
  ▪ SEQUENCE-BASED PREDICTION - supervised learning
  ▪ MOLECULAR DYNAMICS - unsupervised learning
  ▪ STRUCTURE PREDICTION - deep learning
ACS Catal.
10, 1210-1223 (2020)
❑ … next week (Lesson 7)

AI in Biology, Chemistry, and Bioengineering Bi9680En
▪ Term: autumn
▪ Load: lecture 2 hours/week
▪ Lecturer: Dr. Stanislav Mazurenko
▪ Syllabus:
  ▪ modern bio-challenges: drug design, DNA interpretation, protein engineering
  ▪ types of AI algorithms and workflow for designing predictors
  ▪ clustering algorithms, random forests, artificial neural networks
  ▪ features, databases, and predictors used in applications

Bioinformatics 35: 4986-4993 (2019)
Nucleic Acids Res. 47: W414-W422 (2019)

Tools for protein engineering
Nucleic Acids Res. 48, W356-W362 (2018)
Nucleic Acids Res. 45, W393-W399 (2017)
Brief. Bioinform., bbaa337 (2020)
Bioinformatics 37, 23–28 (2021)
Bioinformatics 34: 3586-3588 (2018)
https://loschmidt.chemi.muni.cz/portal/
www.enantis.com

Proteinové inženýrství (Protein Engineering) Bi7410
▪ Term: spring
▪ Load: lecture 1 hour/week
▪ Lecturer: doc. Radka Chaloupková, Ph.D.
▪ Syllabus:
  ▪ structure-function relationships of proteins
  ▪ methods for expression and purification of recombinant proteins
  ▪ methods for structural and functional analysis of proteins
  ▪ rational design, semi-rational design and directed evolution
  ▪ examples of applications of protein engineering

Reading
❑ Lutz, S. 2010: Beyond directed evolution - semi-rational protein engineering and design. Curr Opin Biotechnol. 21(6): 734–743
❑ Computational enzyme redesign and Computational de novo enzyme design (page 5-7)
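
Worked example: library size versus screening throughput
The library sizes quoted under "Experimental throughput is critical" can be checked with a few lines of Python. This is only a rough sketch under stated assumptions: each randomized position is taken to sample all 20 amino acids (or, alternatively, all 32 NNK codons, a common degenerate-codon choice that is not mentioned on the slides), every variant is assumed to be assayed exactly once unless an oversampling factor is given, and the function names are hypothetical. The two throughputs, 10^3 and 10^7 assays/day, are the ones quoted on that slide.

```python
# Rough estimate of mutant-library size and screening time.
# Assumptions (not from the slides): full randomization with 20 amino acids
# per position, or 32 codons per position for NNK degenerate codons.
# Function names are hypothetical helpers for illustration only.

AMINO_ACIDS = 20   # protein-level diversity per randomized position
NNK_CODONS = 32    # DNA-level diversity per position with NNK codons (assumption)


def library_size(positions: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct variants when `positions` sites are fully randomized."""
    if positions < 0:
        raise ValueError("positions must be non-negative")
    return alphabet ** positions


def days_to_screen(variants: int, assays_per_day: float,
                   oversampling: float = 1.0) -> float:
    """Days needed to assay `variants` clones, optionally oversampling the library."""
    if assays_per_day <= 0:
        raise ValueError("assays_per_day must be positive")
    return variants * oversampling / assays_per_day


if __name__ == "__main__":
    # Positions taken from the slide: 2-3 (standard) and 5-7 (advanced).
    for positions in (2, 3, 5, 7):
        size = library_size(positions)
        print(f"{positions} positions: {size:.1e} variants, "
              f"{days_to_screen(size, 1e3):.2g} days at 1e3 assays/day, "
              f"{days_to_screen(size, 1e7):.2g} days at 1e7 assays/day")
```

Under these assumptions, three randomized positions give 20^3 = 8,000 variants (of the order of 10^4), while five positions already give 20^5 = 3.2 million (>10^6). This is consistent with the slide's point that randomizing more positions requires the higher assay throughput.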