[Figure: IPTG induction of the T7 expression system. Legible labels: E. coli RNA polymerase, T7 gene 1, repressor, lacI gene, E. coli genome, T7 RNA polymerase, IPTG induction, target gene.]

Recombinant protein expression

Recombinant proteins
Recombinant DNA: an artificial DNA sequence created by combining two or more strands of DNA.
Recombinant proteins: proteins obtained by introducing recombinant DNA into a heterologous host, where this DNA is expressed.

Taking advantage of recombinant proteins
Overexpression and purification of recombinant proteins are essential prerequisites for:
• biochemical characterization of protein function (determination of the kinetic parameters Km and kcat for enzymes with their substrates, Ki for enzymes with inhibitors, and Kd for protein-protein or protein-ligand interactions)
• analysis of protein structure (NMR, crystallography)
• protein engineering (improvement of protein properties: activity, stability)
• industrial-scale production of drugs, vaccines and dietary supplements

The goal of recombinant technology:
• High yield of homogeneous protein (mg to kg)
• Maintenance of the biological activity of the protein

Why produce recombinant proteins?
Natural sources:
• Difficult to obtain (tissues, organs)
• Difficult to cultivate (bacteria, viruses, tissue cultures)
• Limited expression
• In most cases, difficult purification of the protein

TABLE 1.2. Examples of low-abundance proteins and peptides isolated from natural biological sources
Protein | Source | Yield (µg) | Reference
Multipotential colony-stimulating factor | pokeweed mitogen-stimulated mouse spleen-cell-conditioned medium (10 liters) | 1 | Cutler et al. (1985)
Human A33 antigen | human colon cancer cell lines (10^10 cells) | 2.5 | Catimel et al. (1996)
Platelet-derived growth factor (PDGF) | human serum (200 liters) | 180 | Heldin et al. (1981)
Granulocyte colony-stimulating factor (G-CSF) | mouse lung-conditioned medium (3 liters) | [illegible] | Nicola et al. (1983)
Granulocyte-macrophage colony-stimulating factor (GM-CSF) | mouse lung-conditioned medium (3 liters) | 12 | Burgess et al. (1986)
Coelenterate morphogen | sea anemone (200 kg) | [illegible] | Schaller and Bodenmuller (1981)
Peptide YY (PYY) | porcine intestine (4000 kg) | 600 | Tatemoto (1982)
Tumor necrosis factor (TNF) | HL60 tissue culture medium (18 liters) | 20 | Wang and Creasey [year illegible]
Murine transferrin receptor | NS-1 myeloma cells (10^10 cells) | [illegible] | van Driel et al. (1984)
Fibroblast growth factor (FGF) | bovine brain (4 kg) | 33 | Gospodarowicz et al. (1984)
Transforming growth factor-β (TGF-β) | human placenta (8.8 kg) | 47 | Frolik et al. (1983)
Human interferon | human leukocyte-conditioned medium ([volume illegible]) | 21 | Rubinstein et al. (1979)
Muscarinic acetylcholine receptor | porcine cerebrum (600 g) | [missing] | Haga and Haga (1985)
β2-adrenergic receptor | rat liver (400 g) | 2 | Graziano et al. (1985)
Adapted, with permission, from Simpson and Nice (1989).

[Figure: Scheme of recombinant protein technology]

Host organisms for recombinant protein expression
• Prokaryotic expression systems (E. coli, Bacillus subtilis, ...)
• Yeasts (Saccharomyces cerevisiae, Pichia pastoris)
• Mammalian cells (human embryonic kidney cells, HEK; Chinese hamster ovary cells, CHO)
• Insect cells with baculoviruses
• In vitro (cell-free) expression (rabbit reticulocyte lysates, E. coli extracts, wheat germ extracts)

[Figure: bacterial, yeast, insect and mammalian expression systems]

Criteria for selection of expression host organism: posttranslational modification
Expression system | N-glycosylation | O-glycosylation | Phosphorylation | Acetylation | Acylation | Carboxylation
E. coli | missing | - | - | - | - | -
Yeasts | highly mannosylated glycans | + | + | + | + | [not legible]
Insect cells | simple, without sialylation | + | + | + | + | [not legible]
Mammalian cells | complex | + | + | + | + | +
(For yeasts and insect cells only four marks survive in the source; they are placed in the first four columns.)

Criteria for selection of expression host organism: cultivation and expression
Expression system | Budget for cultivation | Growth rate | Level of expression | Protein conformation
E. coli | low | high | high | often refolding needed
Yeasts | low | high | low / high | sometimes refolding needed
Insect cells | high | low | low / high | mostly correct
Mammalian cells | high | low | mostly low | mostly correct

PDB statistics
[Figure: bar chart of PDB structures (0-100 %) by organism used for expression; legible categories: E. coli, mammalian, cell free. Data from 2014.]
https://www.rcsb.org/stats/distribution-expression-organism-gene

Production of heterologous proteins in E. coli
Advantages:
• High yield of recombinant proteins
• Well-known genome and proteome, which facilitates gene manipulation
• Vector design facilitates cloning and expression of foreign genes
• Rapid growth in an inexpensive medium
• Adaptability of the system

Production of heterologous proteins in E. coli
Disadvantages:
• Requirement for cDNA of the target protein
• Absence of eukaryotic posttranslational systems (posttranslational modification)
• Formation of insoluble inclusion bodies
• Limited ability to form disulfide bonds
• Missing secretion mechanism for effective release of protein into the cultivation medium
• Codon usage different from that of higher eukaryotes
• Contamination of the protein by lipopolysaccharides (endotoxin)

[Figure: Expression system for recombinant protein production in E. coli; legible label: cultivation conditions]

Expression vector = cloning vector containing the regulatory sequences necessary for expression of foreign gene inserts

pET-32a(+) sequence landmarks
T7 promoter | 764-780
T7 transcription start | 763
Trx·Tag coding sequence | 366-692
His·Tag coding sequence | 327-344
S·Tag coding sequence | 249-293
Multiple cloning sites (Nco I - Xho I) | 158-217
His·Tag coding sequence | 140-157
T7 terminator | 26-72
lacI coding sequence | 1171-2250
pBR322 origin | 3684
bla coding sequence | 4445-5302
f1 origin | 5434-5889
The maps for pET-32b(+) and pET-32c(+) are the same as pET-32a(+) (shown) with the following exceptions: pET-32b(+) is a 5899 bp plasmid; subtract 1 bp from each site beyond BamH I at 198.
pET-32c(+) is a 5901 bp plasmid; add 1 bp to each site beyond BamH I at 198 except for EcoR V, which cuts at 209.

[Figure: pET-32a-c(+) cloning/expression region. Legible labels: T7 promoter, lac operator, His·Tag, S·Tag, S·Tag primer #69945-3, restriction sites Nco I, EcoR V, BamH I, EcoR I, Sac I, Sal I, Hind III, Not I and Xho I, C-terminal His·Tag, T7 terminator, T7 terminator primer #69337-3. Nucleotide and amino acid sequences are not legible.]

Expression vector configuration
• Cloning sites
• Gene for antibiotic resistance (ampicillin)
• Operator: binding site for the repressor
• Promoter

Promoter characteristics:
• Strong promoter (ptac, ptrp, λpL, pT7): expression should lead to accumulation of the target protein to 10-30 % or more of the total cellular protein.
• Easily transferable to various E. coli strains
• Simple and cost-effective induction
  - Thermal induction (λpL)
  - Chemical induction (ptac, ptrp, pT7), e.g. IPTG (isopropyl-β-D-thiogalactopyranoside) for the lac-controlled ptac and pT7 systems
• Minimal level of basal expression
Large-scale gene expression preferably employs cell growth to high density with minimal promoter activity, followed by induction of the promoter. Tight regulation of the promoter is necessary for the production of proteins that may be detrimental to the host cell.
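The map notes for pET-32b(+) and pET-32c(+) describe a simple coordinate shift relative to pET-32a(+). The following is a minimal sketch of that rule in Python, assuming that "beyond BamH I at 198" means positions greater than 198; the function and constant names are hypothetical, and the example positions are the pET-32a(+) values quoted in these notes.

```python
# Sketch only: names are hypothetical, not part of any vendor software.

BAMHI_SITE = 198     # BamH I position quoted in the map notes
ECORV_IN_32C = 209   # EcoR V position in pET-32c(+) quoted in the map notes


def site_in_variant(enzyme: str, pos_in_32a: int, variant: str) -> int:
    """Convert a pET-32a(+) map position to pET-32b(+) or pET-32c(+)."""
    if variant not in ("a", "b", "c"):
        raise ValueError("variant must be 'a', 'b' or 'c'")
    # Assumption: sites at or before BamH I (198) are identical in all vectors.
    if variant == "a" or pos_in_32a <= BAMHI_SITE:
        return pos_in_32a
    if variant == "b":
        # pET-32b(+): subtract 1 bp from each site beyond BamH I.
        return pos_in_32a - 1
    # pET-32c(+): add 1 bp, except EcoR V, which the notes place at 209.
    if enzyme.replace(" ", "").lower() == "ecorv":
        return ECORV_IN_32C
    return pos_in_32a + 1


if __name__ == "__main__":
    sites_32a = [("Xho I", 158), ("BamH I", 198), ("EcoR V", 206), ("Nco I", 212)]
    for enzyme, pos in sites_32a:
        print(enzyme, pos,
              site_in_variant(enzyme, pos, "b"),
              site_in_variant(enzyme, pos, "c"))
```

Such a helper is only a convenience for reading the shared map; the authoritative coordinates remain those on the vendor's vector maps.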
[Figure: pET-32a(+) plasmid map. Restriction sites as legible in the source: Ava I (156), Xho I (158), Eag I (166), Not I (166), Hind III (173), Sal I (179), Sac I (196), EcoR I (192), BamH I (198), EcoR V (206), Nco I (212), BstE II (1702), Bmg I (1730), Apa I (1732), BssH II (1932), Hpa I (2027), PshA I (2366), Psp5 II (2628), BspG I (3148), Tth111 I (3367), Bst1107 I (3393), Sap I (3506), BspLU11 I (3622), AlwN I (4036), Eam1105 I (4357), Bsa I (4576).]

[Figure: pET-32a-c(+) cloning/expression region. Legible labels: T7 promoter, lac operator, Xba I, Msc I, His·Tag, S·Tag, S·Tag primer #69945-3, Nsp V, Bgl II, Kpn I, thrombin site, enterokinase site, multiple cloning sites (Nco I to Xho I), C-terminal His·Tag, T7 terminator primer #69337-3. Annotated elements: ribosome-binding site, transcription terminator.]

Expression vector configuration
Promoter consensus: -35 TTGACA, (N)17, -10 TATAAT
mRNA 5' UAAGGAGG (N)8 start codon; 16S rRNA 3' HO-AUUCCUCC
START codon: AUG (91 %), GUG (8 %), UUG (1 %)
STOP codon: UAAU, UGA, UAG

Ribosome-binding site
• Consists of the Shine-Dalgarno (SD) sequence and the translational start codon.
• The distance between the SD sequence and the start codon is 4-13 nucleotides. This distance influences the efficiency of translation initiation (the optimal distance is 4-8 nucleotides).
• High content of AT base pairs.
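The Shine-Dalgarno spacing rule (4-13 nucleotides, optimum 4-8) can be checked on a sequence with a few lines of Python. This is a minimal sketch under stated assumptions: the SD sequence is approximated by the exact motif AGGAGG (the core of the UAAGGAGG shown on the slide), the function names are hypothetical, and the example sequence is made up for illustration.

```python
# Sketch only: motif choice and function names are illustrative assumptions.

START_CODONS = ("AUG", "GUG", "UUG")  # start codons listed in the notes


def find_sd_spacing(mrna, sd_motif="AGGAGG", min_gap=4, max_gap=13):
    """Return (gap, start_codon) for the first start codon located
    min_gap..max_gap nucleotides downstream of the SD motif, else None."""
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA input
    sd_start = seq.find(sd_motif)
    if sd_start == -1:
        return None
    sd_end = sd_start + len(sd_motif)  # index of the first base after the motif
    for gap in range(min_gap, max_gap + 1):
        codon = seq[sd_end + gap: sd_end + gap + 3]
        if codon in START_CODONS:
            return gap, codon
    return None


def is_optimal_gap(gap):
    """True if the spacing lies in the 4-8 nucleotide optimum from the notes."""
    return 4 <= gap <= 8


if __name__ == "__main__":
    example = "GGAUAAGGAGGAUAUACAUAUGAGCGAU"  # hypothetical 5' region
    hit = find_sd_spacing(example)
    if hit is None:
        print("no SD motif with a start codon 4-13 nt downstream")
    else:
        gap, codon = hit
        print(f"gap = {gap} nt, start codon = {codon}, optimal = {is_optimal_gap(gap)}")
```

Real ribosome-binding sites often deviate from the consensus motif, so an exact string match is a deliberate simplification; a tolerant search would be needed for natural sequences.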
Transcription terminator
T7 terminator, rrnB T1 and T2 (prevention of promoter occlusion, improvement of mRNA stability)

[Figure: Expression system for recombinant protein production in E. coli; legible label: cultivation conditions]

Production of recombinant protein in the BL21(DE3) E. coli host strain

Toxicity of recombinant protein to the host strain
• Toxicity to the host strain is not limited to foreign genes; it may also result from overproduction of specific native genes.
Proteins that are lethal for the host:
• Recombinant proteins containing hydrophobic regions are toxic for the cells because they associate with the membranes or are incorporated into the membrane system of the cell and disturb the membrane potential.
• Proteins inactivating ribosomes.

Selection of E. coli host strain considering protein toxicity to the host
• Tight regulation of the expression system: bacterial strains with various levels of expression regulation are commercially available.

Different levels of expression system regulation: BL21(DE3), BL21(DE3)pLysS/E, BL21
BL21(DE3)
• Approx. 10 % basal expression (before induction) of a given gene.
BL21(DE3)pLysS / BL21(DE3)pLysE
• The pLysS and pLysE plasmids enable tight regulation of expression systems using the T7 promoter. These plasmids carry the gene coding for T7 lysozyme, which inhibits T7 RNA polymerase and thereby reduces basal expression.
• Approx. 1-3 % basal expression (before induction) of a given gene.
BL21
• Expression is induced by infection with bacteriophage CE6 (carrying the gene for T7 RNA polymerase).
• The highest level of repression.

E. coli codon usage
• Genes in both prokaryotes and eukaryotes show a nonrandom usage of synonymous codons.
• Codons that are rarely used by E. coli may occur in heterologous genes originating from eukaryotes or archaebacteria. The frequency of use of synonymous codons usually reflects the abundance of their cognate tRNAs.

Escherichia coli K12 [gbbct]: 14 CDS's (5122 codons)
fields: [triplet] [frequency: per thousand] ([number])
UUU 19.7 (101)   UCU  5.7 (29)    UAU 16.8 (86)    UGU  5.9 (30)
UUC 15.0 (77)    UCC  5.5 (28)    UAC 14.6 (75)    UGC  8.0 (41)
UUA 15.2 (78)    UCA  7.8 (40)    UAA  1.8 (9)     UGA  1.0 (5)
UUG 11.9 (61)    UCG  8.0 (41)    UAG  0.0 (0)     UGG 10.7 (55)
CUU 11.9 (61)    CCU  8.4 (43)    CAU 15.8 (81)    CGU 21.1 (108)
CUC 10.5 (54)    CCC  6.4 (33)    CAC 13.1 (67)    CGC 26.0 (133)
CUA  5.3 (27)    CCA  6.6 (34)    CAA 12.1 (62)    CGA  4.3 (22)
CUG 46.9 (240)   CCG 26.7 (137)   CAG 27.7 (142)   CGG  4.1 (21)
AUU 30.5 (156)   ACU  8.0 (41)    AAU 21.9 (112)   AGU  7.2 (37)
AUC 18.2 (93)    ACC 22.8 (117)   AAC 24.4 (125)   AGC 16.6 (85)
AUA  3.7 (19)    ACA  6.4 (33)    AAA 33.2 (170)   AGA  1.4 (7)
AUG 24.8 (127)   ACG 11.5 (59)    AAG 12.1 (62)    AGG  1.6 (8)
GUU 16.8 (86)    GCU 10.7 (55)    GAU 37.9 (194)   GGU 21.3 (109)
GUC 11.7 (60)    GCC 31.6 (162)   GAC 20.5 (105)   GGC 33.4 (171)
GUA 11.5 (59)    GCA 21.1 (108)   GAA 43.7 (224)   GGA  9.2 (47)
GUG 26.4 (135)   GCG 38.5 (197)   GAG 18.4 (94)    GGG  8.6 (44)
Coding GC 52.35 %, 1st letter GC 60.82 %, 2nd letter GC 40.61 %, 3rd letter GC 55.62 %

Arabidopsis thaliana [gbpln]: 80395 CDS's (31098475 codons)
fields: [triplet] [frequency: per thousand] ([number])
UUU 21.8 (678320)   UCU 25.2 (782818)   UAU 14.6 (455089)    UGU 10.5 (327640)
UUC 20.7 (642407)   UCC 11.2 (348173)   UAC 13.7 (427132)    UGC  7.2 (222769)
UUA 12.7 (394867)   UCA 18.3 (568570)   UAA  0.9 (29405)     UGA  1.2 (36260)
UUG 20.9 (649150)   UCG  9.3 (290158)   UAG  0.5 (16417)     UGG 12.5 (388049)
CUU 24.1 (750114)   CCU 18.7 (580962)   CAU 13.8 (428694)    CGU  9.0 (280392)
CUC 16.1 (500524)   CCC  5.3 (165252)   CAC  8.7 (271155)    CGC  3.8 (117543)
CUA  9.9 (307000)   CCA 16.1 (502101)   CAA 19.4 (604800)    CGA  6.3 (195736)
CUG  9.8 (305822)   CCG  8.6 (268115)   CAG 15.2 (473809)    CGG  4.9 (151572)
AUU 21.5 (668227)   ACU 17.5 (544807)   AAU 22.3 (693344)    AGU 14.0 (435738)
AUC 18.5 (576287)   ACC 10.3 (321640)   AAC 20.9 (650826)    AGC 11.3 (352568)
AUA 12.6 (391867)   ACA 15.7 (487161)   AAA 30.8 (957374)    AGA 19.0 (589788)
AUG 24.5 (762852)   ACG  7.7 (240652)   AAG 32.7 (1016176)   AGG 11.0 (340922)
GUU 27.2 (847061)   GCU 28.3 (880808)   GAU 36.6 (1139637)   GGU 22.2 (689891)
GUC 12.8 (397008)   GCC 10.3 (321500)   GAC 17.2 (535668)    GGC  9.2 (284681)
GUA  9.9 (308605)   GCA 17.5 (543180)   GAA 34.3 (1068012)   GGA 24.2 (751489)
GUG 17.4 (539873)   GCG  9.0 (280804)   GAG 32.2 (1002594)   GGG 10.2 (316620)
Coding GC 44.59 %, 1st letter GC 50.84 %, 2nd letter GC 40.54 %, 3rd letter GC 42.38 %
http://www.kazusa.or.jp/codon/

Low-usage codons in E. coli (Makrides, 1996)
Codon(s) | Amino acid
AGA, AGG, CGA, CGG | Arg
UGU, UGC | Cys
GGA, GGG | Gly
AUA | Ile
CUA, CUC | Leu
CCC, CCU, CCA | Pro
UCA, AGU, UCG, UCC | Ser
ACA | Thr

Expression of heterologous genes containing low-usage codons leads to translation errors!
• Premature termination of translation (truncated product)
• Open reading frame shift (shift by two amino acids at the AGA codon position)
• Amino acid substitution: often arginine (AGA codon) is replaced by lysine

Selection of E. coli host strain considering problems with low-usage codons
• Commercially available strains that supply tRNAs for low-usage codons:
  • BL21(DE3) CodonPlus-RIL: AGG/AGA (arginine, R), AUA (isoleucine, I) and CUA (leucine, L)
  • BL21(DE3) CodonPlus-RP: AGG/AGA (arginine, R) and CCC (proline, P)
  • Rosetta or Rosetta(DE3): AGG/AGA (arginine, R), CGG (arginine, R), AUA (isoleucine, I), CUA (leucine, L), CCC (proline, P) and GGA (glycine, G)
[Figure: plasmids complementing tRNA genes; labels not legible]
OR: Site-directed mutagenesis of low-usage codons

Protein degradation
The E. coli proteolytic system includes a large number of proteases that are localized mainly in the cytoplasm, but also in the periplasm and in the inner and outer membranes.
Proteins typically degraded:
• Incomplete polypeptides
• Proteins with amino acid substitutions
• Excessively synthesized subunits of multimeric complexes
• Proteins damaged by oxidation or free-radical attack
• Foreign, recombinant proteins (proteins < 10 kDa are problematic)

Selection of E. coli host strain considering proteolysis of recombinant protein
Protease-deficient host strains
• Mutations eliminate the production of proteases and thereby the proteolytic degradation of recombinant proteins.
The BL21 expression strain is deficient in:
• lon: cytoplasmic protease
• ompT: outer membrane protease

Targeted protein expression
• Cytoplasmic expression
• Periplasmic expression
• Extracellular expression (into the cultivation medium)
[Figure: E. coli cell with labels: cytoplasm, nucleoid, cell wall, cytoplasmic membrane]

Cytoplasmic expression
• Most commonly used
Advantages
• High protein yield
• Simpler plasmid constructs
• Inclusion bodies
Disadvantages
• Inclusion bodies
• Reducing environment
• Proteolysis
• More complex purification

Inclusion bodies
• Insoluble protein aggregates (up to 2 µm³) consisting of native protein of limited solubility, unfolded protein and partially folded intermediates.
What causes their formation?
1. The microenvironment of E. coli may differ from that of the original source in terms of redox potential (reducing environment in the E. coli cytoplasm), pH, osmolarity, absence of chaperones and cofactors, and lack of post-translational modifications.
2. High level of expression: hydrophobic stretches in the nascent polypeptide are present at high concentrations and associate intermolecularly.

Inclusion bodies
Advantages
• Easy isolation in high purity and concentration
• Protection from proteases
• Advantageous for lethal proteins
Disadvantages
• Protein insolubility
• Refolding needed to recover protein activity
• Refolded protein may not recover its biological activity
• Reduction of final yield

Inclusion bodies
[Figure: E. coli expression yields either soluble protein or inclusion bodies. Inclusion bodies are converted to soluble protein by isolation followed by refolding, or avoided by modification of expression conditions.]
Modification of expression conditions, e.g.:
• Lowering of cultivation temperature
• Co-production of chaperones
• Use of solubility-enhancing tags (thioredoxin)
• Selection of E. coli strain, e.g. a thioredoxin reductase deficient strain

Selection of E. coli host strain considering problems with insolubility
• If the protein contains one or more disulfide bonds, folding is promoted by an oxidizing environment in the cytoplasm, which is provided by the following E. coli strains.
AD494
• Mutation in the gene for thioredoxin reductase (trxB)
Origami
• Mutations in two genes: thioredoxin reductase (trxB) and glutathione reductase (gor)

Periplasmic expression
• The periplasm contains only 4 % of all cellular proteins (approx. 100 proteins)
• Transmembrane transport is mediated by an N-terminal signal peptide
• Prokaryotic signal peptides successfully used in E. coli: ompA and ompT from E. coli, protein A from S. aureus, endoglucanase from B. subtilis
Advantages
• Simpler purification
• Limited proteolysis
• Improved disulfide bond formation and folding
Disadvantages
• The signal peptide does not always ensure transport to the periplasm
• Formation of inclusion bodies

Extracellular expression
• Protein secretion into the cultivation medium
• E. coli lacks an effective transport through the outer membrane (it naturally secretes a limited number of proteins).
• Manipulation of transport pathways to enable protein secretion has so far been unsuccessful.
Advantages
• Minimal contamination by other proteins (simpler purification)
• Limited proteolysis
• Improved folding
Disadvantages
• Very low secretion
• Highly diluted protein

Expression system for recombinant protein production in E. coli
• Vector
• Host strain
• Modification of cultivation conditions

Modification of cultivation conditions
Possibilities for protein solubility enhancement:
• High cell culture optical density
• Medium composition (pH, addition of specific substrates and cofactors, type of cultivation medium: rich or minimal)
• Temperature optimization for induction of expression
• Concentration of inducing agent
• Length of induction

Experimental setup for protein expression and solubility trials (A. Smitkowska et al., 2020)
1. Overnight culture in LB medium, 3 mL.
2. Inoculate 800 µL into each flask containing 20 mL of TB medium (flasks at pH 6 and pH 7).
3. Agitate at 37 °C until OD600 = 0.6-0.8.
4. Sample 1 mL of non-induced culture from each flask, spin down, remove the supernatant and store the pellet at -20 °C.
5. Induce protein expression with IPTG (20 µL of 1 M IPTG).
6. Split the content of each flask into 4 bacterial tubes (4 mL per tube) and grow them at different temperatures/times: 18 °C overnight, 22 °C overnight, 28 °C overnight, 37 °C for 3 h.
7. Sample 1 mL from each tube and place separately in the TissueLyser adapter according to the scheme in Table 2.
8. Centrifuge at 3220 x g at 4 °C for 2 minutes, remove the supernatant with an aspirator.

Example results from expression and solubility test
[Figure: SDS-PAGE gels for Strain 1 and Strain 2; molecular weight markers 95, 72, 55 and 43 kDa; lanes NI6, 28-36, T29, T32, T35.]
E. coli ER2566, TB medium pH 6.0, 3 h induction at 37 °C, lysis buffer: MES pH 6.0 (28) or Tris-HCl pH 7.5 (29)
A. Smitkowska et al., 2020

Evaluation of cultivation temperature optimisation
Production of AHP proteins in soluble form (in %)
t (°C) growth/induction | AHP1 | AHP2 | AHP3 | AHP4 | AHP5 | AHP6
37 °C / 28 °C | 8 % | 85 % | 100 % | 0 | 76 % | 0
37 °C / 22 °C | 82 % | 73 % | 100 % | 0 | 81 % | 51 %
22 °C / 22 °C | 71 % | 78 % | 100 % | 30 % | 81 % | 73 %

Production of heterologous proteins in yeasts
ADVANTAGES:
• easy gene manipulation
• fast growth to high densities (fermentor), low price
• ability to posttranslationally modify the expressed protein
• ability to secrete the produced protein extracellularly
• ability to produce protein with proper conformation
• expressed protein is free of endotoxin contamination
DISADVANTAGES:
• N-oligosaccharides structurally different from those of mammals
• hyperglycosylation

Yeast expression systems: Saccharomyces cerevisiae, Pichia pastoris
- S. cerevisiae: the first yeast used for recombinant protein production
- P. pastoris: the most used yeast expression system
- P. pastoris: ability to grow to high densities (S. cerevisiae produces large amounts of secondary metabolites, which prevent the culture from reaching high density)

Pichia pastoris
- P. pastoris uses a different type of N-glycosylation than S. cerevisiae: in P. pastoris, glycans contain 8-17 mannose residues (linked by α-1,2 bonds).
[Figure: O-linked and N-linked glycans in P. pastoris]
- In S. cerevisiae, glycans contain 50-150 mannose residues (so-called hyperglycosylation, terminal mannoses linked by α-1,3 bonds).
[Figure: O-linked and N-linked glycans in S. cerevisiae]

Vector configuration for expression in P. pastoris
[Figure: P. pastoris expression vector map; partly legible restriction site labels: Aat II, Zra I, Sca I, Pvu I, Nsi I, Xcm I, BspQ I, Sap I.]
[Figure: N-glycan processing in the ER and Golgi of glycoengineered P. pastoris (Δalg3, Δoch1); legible labels: GlcNAcMan5, GalT, MnT (Och1p), GlcNAc, Man.]
Production of heterologous proteins in insect cells (insect cells with baculoviruses)
ADVANTAGES:
• Native conformation of the protein is ensured
• Posttranslational modifications
• No contamination of the final product with endotoxins
• Protein secretion
DISADVANTAGES:
• Negative effect of baculovirus infection on cell viability
• Heterologous genes are not expressed continuously (each expression requires a new infection of cells with baculovirus)
• Glycosylation differs from that of mammalian cells
• Lower yield, more time consuming, expensive media, harder to handle (risk of contamination)

What is a baculovirus?
• Enveloped dsDNA virus with a rod-shaped capsid
• During the life cycle there are two different forms: budded virus and occluded (encapsulated) virus
• Highly species specific; infects only invertebrates
• The most common hosts are immature larval forms of insects
• AcMNPV (Autographa californica multiple nuclear polyhedrosis virus) is one of the most studied and used baculoviruses.
[Figure: Baculovirus (multicapsid nucleopolyhedrovirus): budded virus, occluded virus, occlusion body. Image credit not legible.]

Insect cells with baculoviruses expression system
• Based on infection of cultured insect cells with a recombinant AcMNPV virus (Autographa californica multiple nuclear polyhedrosis virus) carrying a gene for the production of a target protein.
• Insect host cells: ovarian cells of the moth Spodoptera frugiperda (Sf9, Sf21)
[Figure: baculovirus life cycle in the insect (in vivo) and in cultured insect cells (in vitro): virions, polyhedra, cell lysis, generation of recombinant baculovirus.]
• Baculovirus vector (so-called bacmid): a large shuttle vector (AcMNPV) that contains all the genes necessary for the production of viral particles
• Very strong polyhedrin gene promoter (polh)

Bac-to-Bac expression system: "from Bacterium to Baculovirus"
[Figure: Bac-to-Bac workflow. pFastBac donor plasmid + gene of interest → recombinant donor plasmid → transformation into competent DH10Bac E. coli cells → site-specific transposition, antibiotic selection → E. coli (lacZ-) containing recombinant bacmid → recombinant bacmid isolation → recombinant bacmid DNA → transfection into insect cells → recombinant virus particles → determination of viral titer via plaque assay → infection of insect cells → expression verification and virus amplification → virus stock.]
After successful transfection, recombinant viral particles are recovered from the medium (viral stock) and can be amplified to higher amounts so that they can be used to infect a sufficient number of cells.

Expression evaluation
After two generations, a titer of 10^6 to 10^8 pfu/mL is reached.
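A titer in pfu/mL translates directly into the volume of virus stock needed to infect a culture. The sketch below assumes the commonly used relation volume = MOI × number of cells / titer, where MOI (multiplicity of infection, plaque-forming units per cell) is a quantity not mentioned in the notes above; the function name and the example numbers are hypothetical.

```python
# Sketch only: MOI-based dosing is an assumption, not taken from the slides.

def virus_volume_ml(n_cells, moi, titer_pfu_per_ml):
    """Volume of virus stock (mL) that delivers `moi` pfu per cell."""
    if n_cells < 0 or moi < 0:
        raise ValueError("n_cells and moi must be non-negative")
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * n_cells / titer_pfu_per_ml


if __name__ == "__main__":
    cells = 2e7   # hypothetical culture: 2 x 10^7 insect cells
    moi = 5       # hypothetical multiplicity of infection
    for titer in (1e6, 1e7, 1e8):  # titer range quoted in the notes (pfu/mL)
        volume = virus_volume_ml(cells, moi, titer)
        print(f"titer {titer:.0e} pfu/mL -> {volume:g} mL of virus stock")
```

The calculation illustrates why amplification matters: at the low end of the quoted titer range, the required stock volume becomes impractically large for the same number of cells.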