Gene Technology - Introduction

GENE TECHNOLOGIES
Introduction: Chemical structure of nucleic acids, transcription and its regulation in prokaryotes (sigma factor, lac operon, activators and repressors) and eukaryotes (transcription enhancers, epigenetics), translation and its regulation in prokaryotes and eukaryotes.

Syllabus
1. Chemical structure of nucleic acids, transcription and its regulation in prokaryotes (sigma factor, lac operon, activators and repressors) and eukaryotes (transcription enhancers, epigenetics), translation and its regulation in prokaryotes and eukaryotes.
2. Model organisms used in biotechnology - bacteria (E. coli), yeasts (Pichia, Saccharomyces) and fungi (Penicillium), Caenorhabditis elegans (nematode), Drosophila melanogaster, Danio rerio (zebrafish), house mouse, animal cell cultures, Arabidopsis thaliana (thale cress), viruses (bacteriophages, retroviruses). DNA replication in eukaryotes and prokaryotes, repair processes, in-vitro DNA synthesis (PCR, reverse transcription).
3. Manipulation of DNA, RNA, and proteins (cell fractionation, isolation of proteins and nucleic acids).
4. PCR techniques. DNA sequencing, high-throughput sequencing methods.
5. Methods of studying gene expression and function (mapping techniques, DNA libraries, gene expression, metagenomics).
6. Gene cloning strategies (restriction endonucleases, plasmids and cloning vectors, optimization of gene expression, expression in foreign hosts).
7. Technologies in immunology (antibodies (structure, function), targeted antibody design, monoclonal antibodies, ELISA, vaccines (design and production, identification of potential new antigens, DNA vaccines)).

Structure of nucleic acids
- DNA and RNA - polymers consisting of subunits called nucleotides
- Nucleotide - phosphate group, sugar (ribose, deoxyribose), base (A, G, C, T, U)
- A phosphate links two sugar residues via a phosphodiester bond
- The most stable structure - a double-stranded DNA molecule with antiparallel strand orientation (double helix)
- Purine bases pair with pyrimidine bases (A-T, G-C, A-U) via hydrogen bonds; the G-C pair is more stable because it forms three bonds

Structure of nucleic acids (figure: Clark and Pazdernik, 2016)

Nucleic acid base analogues
- Locked Nucleic Acid (LNA), Bridged Nucleic Acid (BNA)
- The 2'-O and 4'-C atoms of the ribose ring are connected by a methylene bridge
- The bridge fixes the ribose ring in the optimal conformation for Watson-Crick pairing
- The pair forms faster and is more stable
- LNA oligonucleotides are ideal for the detection of short or very similar targets within DNA/RNA
- Higher specificity of probes in qPCR (SNP detection), unique resolution of microRNA families, higher stability (in vitro/in vivo), very efficient inhibition of small RNAs in vivo.
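
Worked example: the pairing rules from the "Structure of nucleic acids" slide (antiparallel strands, A-T and G-C pairs, three hydrogen bonds per G-C pair) can be tried out in a few lines of Python. This is only a minimal sketch: the function names are illustrative, and the value of two hydrogen bonds per A-T pair is standard textbook knowledge rather than a number stated on the slide.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds formed by the pair each base takes part in:
# G-C = 3 (from the slide), A-T = 2 (textbook value, an assumption here).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand):
    """Return the antiparallel partner strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_hydrogen_bonds(strand):
    """Count hydrogen bonds in the duplex formed by strand and its partner."""
    return sum(H_BONDS[base] for base in strand.upper())


def gc_fraction(strand):
    """Fraction of G and C bases; a higher value means a more stable duplex."""
    strand = strand.upper()
    return (strand.count("G") + strand.count("C")) / len(strand)


if __name__ == "__main__":
    seq = "ATGC"
    print(reverse_complement(seq))
    print(duplex_hydrogen_bonds(seq))
    print(gc_fraction(seq))
```

Two sequences of equal length can be compared with `duplex_hydrogen_bonds` to see why G-C rich duplexes are harder to separate.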

Nucleic acid base analogues
- Peptide Nucleic Acids (PNAs) - Nielsen et al., Science, 1991
- DNA analogues in which the sugar-phosphate backbone is replaced by N-(2-aminoethyl)glycine units
- Synthetic backbone - unique hybridization properties
- PNA is uncharged = no electrostatic repulsion during hybridization = high stability of PNA-DNA and PNA-RNA duplexes
- PNA hybridizes independently of the salt concentration in solution
- PNAs are not degraded by nucleases or proteases and are not recognised by polymerases
- PNA can bind in antiparallel or parallel orientation and can form a triplex (Hoogsteen pairing)

Peptide Nucleic Acids (PNAs) (figure sources: Brind'Amour, Julie (2020); Hoogsteen (1963) Acta Crystallographica 16; Watson and Crick (1953) Nature 171)

Peptide Nucleic Acids (PNAs)
- The use of PNAs in vivo is limited by low cell permeation; this is addressed by association with DNA oligomers, receptor ligands or cell-penetrating peptides
- Use of PNA:
  - specific delivery to the nucleus
  - use in PCR and Q-PNA PCR
  - nucleic acid binding (DNA/RNA capture)
  - hybridization techniques (PNA-FISH)
Pellestor et al. 2004

DNA conformation
- Right-handed double helix, 1 turn = approx. 10 base pairs, 34 Å
- It can adopt different conformations:
  - B-form - low salt concentration (10 bp/turn)
  - A-form - high salt concentration (11 bp/turn)
  - Z-form - left-handed double helix (12 bp/turn)

G-quadruplexes
- Form in G-rich regions
- Structure stabilized by Hoogsteen pairing and a monovalent cation (K+ > Na+ > Li+)
- Four-stranded non-canonical DNA structure
- Key functions in transcription, replication, genome stability and epigenetic regulation
- Significance in cancer therapy (use of molecules stabilizing the G4 structure)
- A number of proteins interacting specifically with G4 have been discovered
Spiegel et al. 2020

Nucleic acid packing
- The DNA molecule is too long - condensation is required
- In bacteria, supercoiling is introduced by the enzyme DNA gyrase (left-handed twisting)
- The condensed structure is relaxed by topoisomerase I
- In eukaryotes, DNA is wound around positively charged histones - chromatin
- The nucleosome consists of about 200 bp and nine proteins (H2A (2x), H2B (2x), H3 (2x), H4 (2x) and H1)
- Chromatin is further coiled into a helical structure (30-nm filaments, 6 nucleosomes/turn)
- The fibers are attached to the chromosomal axis by so-called "matrix attachment regions" (MARs)
- MARs are approx. 200-1000 bp long and are rich in A/T

Nucleic acid packing (figure: bacteria vs. eukaryotes; 30-nm filaments, 6 nucleosomes/turn)

The central dogma of molecular biology
- Key characteristics of living organisms:
  - the ability to replicate their own genome
  - the ability to generate their own energy
- Organisms need to make the proteins encoded within their DNA
- Proteins - energy production, replication control, intra- and intercellular communication
- The central dogma of molecular biology: DNA is transcribed into RNA, which is then translated into proteins
Clark and Pazdernik, 2016

Transcription
- Making RNA copies based on the DNA code
- Includes:
  - DNA unwinding
  - strand separation at the start of transcription
  - histone removal
  - RNA synthesis by the enzyme RNA polymerase (5'→3' direction)
- "Housekeeping" genes are transcribed continuously
- Inducible genes are transcribed only under specific conditions (lac operon)
- The encoded product is either a protein (via mRNA) or an RNA (tRNA, rRNA, snRNA, ribozyme)
- Cistron (structural gene) - coding region of a gene for a protein or a non-translated RNA
- Open reading frame (ORF) - a stretch of DNA encoding a protein that is not interrupted by a STOP codon

Transcription
- Each gene has a promoter upstream of the coding sequence
- Bacterial promoters - the -10 (TATAAT) and -35 (TTGACA) regions
- Constitutive genes - close match to the consensus
- Regulated genes - activator proteins/transcription factors
- Transcript:
  - transcription start site
  - 5' untranslated region (5'UTR) - ribosome binding
  - open reading frame (ORF) - the protein itself
  - 3' untranslated region (3'UTR) - regulation of translation rate
Clark and Pazdernik, 2016

RNA polymerase
- Composed of several subunits:
  - sigma subunit - recognition of the -10 and -35 regions
  - core enzyme - catalyzes the synthesis (5'→3')
- The core enzyme has five subunits (2 x α, β and β', ω)
  - β and β' - the actual catalytic site
  - α subunits help recognize the promoter
- After RNA polymerase binding - transcription bubble formation
- RNA polymerase uses the non-coding (antisense) strand as template
- The RNA sequence is identical to the coding sequence (with U in place of T)
- RNA synthesis starts from a purine surrounded by pyrimidines (CAT, CGT)
- Synthesis rate 40 bases/second
Clark and Pazdernik, 2016

(two figure slides: Jie Chen et al. PNAS 2010;107(28):12523-12528)

Transcription termination
- Transcription is terminated by a termination signal
- Rho-independent terminator
  - typically a GC-rich hairpin followed by a poly-T site
  - RNA polymerase usually dissociates in the middle of the poly-T sequence
- Rho-dependent terminator
  - contains two inverted hairpins
  - Rho is a homohexameric RNA-dependent ATPase
  - binds to the C-rich region upstream of the termination site
  - moves along the RNA until it reaches the RNA polymerase at the hairpin

Organisation of chromosomes
- Prokaryotes
  - small distance between genes
  - genes of one metabolic pathway lie next to each other (operon)
  - polycistronic mRNA
- Eukaryotes
  - monocistronic mRNA
  - in a polycistronic mRNA only the first cistron is translated
Clark and Pazdernik, 2016

Transcription in eukaryotes
- Involvement of three RNA polymerases:
  - RNA polymerase I (transcription of large ribosomal RNAs)
  - RNA polymerase II (transcription of protein-coding genes)
  - RNA polymerase III (transcription of tRNA, 5S rRNA, small RNAs)
- Transcription by RNA pol. II requires:
  - initiator box, TATA box, transcription factor binding elements
  - basal transcription factors
  - specific transcription factors
  - TATA-binding protein (TBP)
Yokoyama, 2019

RNA polymerase II activation
- TFIID → TFIIB → RNA pol. II/TFIIA → TFIIF → TFIIE, TFIIJ, TFIIH
- TFIIH phosphorylates RNA pol. II
- TFIIH remains associated with RNA polymerase II
Clark and Pazdernik, 2016

Transcription regulation in prokaryotes
- Involvement of transcription activators and repressors
  - activators - positive regulation
  - repressors - negative regulation
- Binding to the promoter region of DNA
- Blocking of RNA polymerase binding or of transcription initiation
- Most genes are controlled by a combination of factors
- Regulatory proteins can slow down elongation or terminate it prematurely
- Anti-terminator proteins bypass the transcription termination site
- Key role of different sigma (σ) subunits
  - σ70 (RpoD) - recognizes most housekeeping genes
  - σ32 (RpoH) - activation of heat shock related genes (chaperonins and proteases)

Heat shock (figure)

Lactose operon
- Transcription regulatory proteins exist in active (binding) and inactive (non-binding) forms
- Transition between the forms is triggered by binding of signaling molecules or inducers
- lac operon = polycistronic - lacZ (β-galactosidase), lacY (lactose permease), lacA (galactoside acetyltransferase)
- lacI = repressor of the lac operon, encoded in the opposite direction
- The promoter contains the lacO binding site (operator) and a Crp site for binding the CRP protein (cAMP receptor protein)
- In case of glucose deficiency and the presence of lactose:
  - elevated cAMP levels
  - formation of allolactose by β-galactosidase (isopropyl-thiogalactoside, IPTG, is a synthetic analogue)
Clark and Pazdernik, 2016

Lactose operon (figure: Wheatley et al., 2016)

Lactose operon (figure: Clark and Pazdernik, 2016)

Two component control system
- Often, the activator/repressor is covalently modified by various groups (methyl, acetyl, AMP-/ADP-ribose).
- In a two-component regulatory system, phosphate is transferred from the sensor kinase to the activator/repressor ("phosphorelay system")
Clark and Pazdernik, 2016

Regulation of transcription in eukaryotes
- Far more complex than in prokaryotes
  - DNA is wound around histones
  - the nuclear membrane does not let most proteins into the nucleus
  - large role of epigenetic modifications (DNA, histones)
- All transcription factors have two binding domains - one for DNA and one for the transcription apparatus

GAL4 transcription factor (figure: Clark and Pazdernik, 2016; Ashkenazy et al., 2010)

Regulation of transcription in eukaryotes (continued)
- Transcription factors work through the so-called mediator complex
- Mediator complex:
  - transmits the signal from activator proteins to RNA polymerase II
  - contains 26 distinct subunits forming the core
  - is directly associated with RNA polymerase II, where it awaits signals
- Transcription factors can also bind to so-called transcription enhancers

Regulation of transcription in eukaryotes (figure: Clark and Pazdernik, 2016)

Insulators
- DNA sequences that prevent transcription enhancers from mistakenly activating genes
- They are placed between an enhancer and the genes it must not regulate
- Insulator binding protein (IBP) binds to these sequences and blocks transcription enhancers
- IBP cannot bind to methylated DNA
Clark and Pazdernik, 2016

AP-1 (activator protein-1)
- Activates a broad spectrum of genes
- The strongest AP-1 stimulators include growth factors and UV radiation
- Dimer of two proteins from the Fos and Jun families
- Belongs to the bZIP family of DNA binding proteins
- AP-1 stimulation
  - increased expression of Fos and Jun proteins
  - increased stability of Fos and Jun proteins
  - phosphorylation of the activation domain by JNK (Jun N-terminal kinase)
Clark and Pazdernik, 2016

Processing of eukaryotic mRNA
- A cap (m7GTP) is added to the 5' end of the mRNA
- The poly(A) tail is added to the 3' end by a polyadenylation complex
Yadong Sun et al. PNAS 2018

Processing of eukaryotic mRNA
- Introns are removed from the primary transcript by the spliceosome and splicing factors
Shi. Nature 2017
https://www.youtube.com/watch?v=JnBf3tq_aXY

Epigenetics
- Any change within the DNA other than in the nucleotide sequence
  A) post-translational modifications of histones
  B) DNA methylation
  C) nucleosome remodelling
  D) RNA-mediated silencing
- Most epigenetic changes affect the access of regulatory proteins to DNA
  - loose chromatin (euchromatin) - easy access of regulatory proteins
  - condensed chromatin (heterochromatin) - access of regulatory proteins is prevented
Frontiers In Bioscience, Landmark, 23, 2018

Epigenetics (figure: Clark and Pazdernik, 2016)

Histone acetylation
- Histone acetyltransferases (HATs) - transfer of acetyl groups to Lys residues in the histone tails
- Histone deacetylases (HDACs) - removal of acetyl groups from Lys residues

Histone methylation
- Lysine methyltransferases (KMTs) - transfer of methyl groups to Lys residues in the histone tails
- Lysine demethylases (KDMs) - removal of methyl groups from Lys residues
  - KDM1 (KDM1A)
  - KDM2-8 (JHDM (JmjC))

DNA methylation
- In prokaryotes, methylation distinguishes the newly synthesized strand from the template.
- In eukaryotes, methylation silences various genes and prevents their expression.
- Methylation occurs in CpG or CpNpG motifs
  - maintenance methylases - methylation of the newly synthesized DNA strand
  - de novo methylases - addition of new methylation to DNA
  - demethylases - removal of unwanted methylation from DNA
- A number of genes lie in the vicinity of so-called CpG islands
- When large stretches of DNA are methylated, CpG islands bind methylcytosine-binding proteins that also activate histone deacetylases = heterochromatin formation

Methylation of cytosine (figure)

The process of translation
- Transfer of the information in mRNA into a specific protein
- Each amino acid is encoded within the mRNA by three bases called a triplet/codon
- Individual codons within the mRNA are recognized by transfer RNA (tRNA) molecules
- Amino acids are attached to the corresponding tRNAs by aminoacyl-tRNA synthetases
Clark and Pazdernik, 2016

Protein synthesis in prokaryotes
- It takes place on ribosomes
  - 30S (16S rRNA + 21 proteins)
  - 50S (5S, 23S rRNA + 34 proteins)
- Large subunit = three binding sites - A (acceptor), P (peptidyl) and E (exit)
- Translation starts at the AUG codon after the Shine-Dalgarno sequence (UAAGGAGG)
- Translation is initiated by a Met derivative (N-formyl-methionine) bound to the 30S subunit
- Initiation factors - assembly of the 30S initiation complex
- fMet-tRNAi binds to the P-site of the ribosome, the A-site is occupied by the next aminoacyl-tRNA, and the peptidyl transferase activity of 23S rRNA catalyzes peptide bond formation
- Adding further amino acids during elongation requires elongation factors; the stop codon binds release factors (RFs)
- Several ribosomes usually bind to one mRNA to form a polysome

(figure: Clark and Pazdernik, 2016)

pET28 vector (figure)

Protein synthesis in eukaryotes
- Translation takes place in the cytoplasm (rough ER)
- Transcription and translation are not coupled
- It takes place on ribosomes
  - 40S (18S rRNA + 32 proteins)
  - 60S (5S, 5.8S and 28S rRNA + 47 proteins)
- mRNA does not contain a Shine-Dalgarno sequence; initiation relies on cap recognition and the Kozak sequence
- The first amino acid is unmodified Met
- A number of eukaryotic proteins are subsequently modified post-translationally

Protein synthesis in eukaryotes (figure: Clark and Pazdernik, 2016)

GENE TECHNOLOGIES
Model organisms: Model organisms used in biotechnology - bacteria (E. coli), yeasts (Pichia, Saccharomyces) and fungi (Penicillium), Caenorhabditis elegans (nematode), Drosophila melanogaster, Danio rerio (zebrafish), house mouse, animal cell cultures, Arabidopsis thaliana, viruses (bacteriophages, retroviruses).

Model organisms
- DNA is found in all living organisms and in many viruses
- Only a small number of organisms, the so-called model organisms, are studied in detail
- The complete genome is now known for the model organisms
- We use model organisms:
  - as a model for studying similar organisms
  - in a wide range of biotechnological processes

Bacteria
- The premier model organisms
- Make up approx. 50% of all living organisms (5 x 10^30)
- Ability to survive in extreme conditions - temperature (Thermus aquaticus), pH (Acidithiobacillus)
- Escherichia coli is the most commonly used:
  - Gram-negative rod
  - has about 10 flagella and thousands of pili on its surface
  - most strains are harmless
  - E. coli O157:H7 - two toxins responsible for bloody diarrhea
Clark and Pazdernik, 2016

Bacteria (figure: Clark and Pazdernik, 2016)

E. coli
- Rapid growth of the culture
- Can grow in a medium containing only mineral salts and sugar
- A liquid culture survives for weeks in the refrigerator
- Can be frozen at -70°C for up to 20 years
- Can grow under both aerobic and anaerobic conditions
- Has one circular chromosome containing about 4000 genes
Clark and Pazdernik, 2016

Plasmids
- Survival strategy requires cooperation with other organisms
- A number of bacteria secrete toxins called bacteriocins
- E. coli produces so-called colicins (E1, M) - perforation of the plasma membrane, DNA/RNA degradation
- Immunity proteins of the producing bacteria neutralise the effect of the toxins
- The ability to produce colicins is due to the presence of plasmids (ori site)
- These plasmids have been modified for biotechnological purposes
Clark and Pazdernik, 2016

Plasmid pET28 (figure)

Bacteria in biotechnology
- Bacillus subtilis - production of proteases and amylases
- Pseudomonas putida - the ability to degrade a range of aromatic compounds
- Streptomyces coelicolor - degrades cellulose and chitin, production of a range of antibiotics (clorobiocin, undecylprodigiosin, actinorhodin)
- Corynebacterium glutamicum - production of L-glutamate and L-lysine
- Streptococcus zooepidemicus - production of hyaluronic acid

Eukaryotes
- Most eukaryotes are diploid (two copies of each chromosome)
- In contrast, a whole range of plants are polyploid (wheat = hexaploid, tomato = tetraploid)
- In animals, germ cells and somatic cells differ
  - diploid germ lines give rise to haploid gametes (eggs and sperm)
  - somatic cells are diploid
  - somatic mutations are passed on within the organism
  - somatic mutations are not transmitted to offspring
- In most plants, cells are totipotent
- In animals, only stem cells have this property

iPSC (induced Pluripotent Stem Cell)
- Method first described by Takahashi and Yamanaka (2006) for the induction of iPSCs from fibroblasts
- Requires the expression of 4 transcription factors - octamer-binding transcription factor 3/4 (Oct3/4), SRY (sex determining region Y)-box 2 (Sox2), Krüppel-like factor 4 (Klf4) and cellular myelocytomatosis (c-Myc) (OSKM).
Abbar et al., 2020

Somatic mutations (figure: Clark and Pazdernik, 2016)

Yeasts and fungi
- Fungi are traditionally used in biotechnology - Penicillium roqueforti (Roquefort), P. candidum, P. caseicolum and P. camemberti (Camembert), Aspergillus oryzae (soy sauce), Penicillium notatum (penicillin), Aspergillus niger (citric acid)
- Usually cultivated in bioreactors
- Yeasts combine the advantages of bacteria and eukaryotes
- The most commonly used yeast is Saccharomyces cerevisiae
- The yeast genome is enclosed by a nuclear membrane
- S. cerevisiae has 16 chromosomes containing telomeres and centromeres
- Some yeasts have extrachromosomal elements, the so-called 2-micron circle.

Yeasts
- Yeasts multiply by budding
- Budding produces identical cells - division by mitosis
- Yeasts have diploid and haploid phases within the life cycle
- Under adverse conditions, yeasts undergo meiosis - formation of haploid spores (called ascospores, inside the ascus)
- Under favorable conditions, spores germinate and conjugate to form diploid cells
- In yeast, conjugation can only occur between two different mating types (a, α)
Clark and Pazdernik, 2016

Pichia pastoris (figure)

Caenorhabditis elegans
- Small nematode (roundworm) living in soil, mainly on decaying plant material
- It has two sexes - 99.9% hermaphrodites (self-fertilizing) and 0.1% males
- The body consists of a simple tube covered with a cuticle
- Inside the body - 959 somatic cells including about 300 neurons
- The head has a variety of sensory organs (taste, smell, temperature, touch)
- The body is translucent = fluorescence techniques are easy to use; generation cycle 3 days
- RNA interference was performed here for the first time - ideal tool for reverse genetics
- First known complete genome of a multicellular organism (100 Mbp)
https://www.hsph.harvard.edu/mair-lab/c-elegans/

Drosophila melanogaster (fruit fly)
- A widely used organism in genetic studies
- Easy to grow, 2-week life cycle
- The egg hatches into a larva (24 h), followed by several larval stages before the adult
- Many mutants available - identification of genes involved in development (homology with humans)
- Genome is 165 Mb - 3 pairs of autosomes and the X/Y chromosomes
- Polytene chromosomes during rapid larval development
Clark and Pazdernik, 2016

Danio rerio (zebrafish)
- A simple model vertebrate used in molecular biology
- Easy to grow and propagate in aquaria, availability of a wide range of mutants
- Embryonic development takes place outside the mother's body; development from a single cell to an individual takes 24 hours
- The embryo is translucent - easy to monitor the effect of mutations on development
- The genome contains 25 pairs of chromosomes (1700 Mb); 70% of human protein-coding genes have orthologs in Danio
- Model for studying a range of human diseases
- Embryos are often used for screening new drugs
https://theconversation.com/animals-in-research-zebrafish-13804

Arabidopsis thaliana
- The most widely used model organism in plant genetics and molecular biology
- Responds to stress factors and diseases similarly to crop plants
- Many of the genes responsible for development and reproduction are identical to those of crop plants
- Easy to grow, space-saving, generation time 6-10 weeks, many seeds
- Can be maintained in a haploid state
- Small genome - five chromosomes (125 Mb), 25 000 genes
  - rice (430 Mb), 40-50 thousand genes
  - wheat (17 Gb), tomato (950 Mb), tobacco (4.5 Gb)

Viruses
- Entities at the edge of the definition of life, pathogens attacking host cells
- Consist of a protein shell called a capsid that encases the DNA/RNA genome
- Found in all living organisms (bacteria, plants, animals)
- Bacterial viruses = bacteriophages (phages)
  - attachment to the host
  - entry of the viral genome
  - replication of the viral genome
  - production of new viral proteins
  - assembly of new viral particles
  - release of virions from the host
- Many viruses go through a latent phase - lysogeny in bacteria
- Integration of the viral genome into the host genome often occurs - provirus (prophage) formation
Clark and Pazdernik, 2016

Viruses
- Viruses can be classified by the shape of the capsid (spherical, complex, filamentous)
- Complex = bacteriophages (T4, P1, Mu)
- ssRNA viruses have a positive (+) or negative (-) genome
- Retroviruses contain reverse transcriptase (transcription of RNA into DNA) and integrate into the genome using long terminal repeats (LTRs)
Clark and Pazdernik, 2016

The life cycle of RNA viruses (figures: mechanism of SARS-CoV-2 (+RNA) replication, V'kovski et al. 2021; mechanisms of retrovirus replication, Dubois et al. 2018)

GENE TECHNOLOGIES
Manipulation of DNA, RNA, and proteins: Cell fractionation, isolation of proteins and nucleic acids.

Isolation of DNA and RNA
- Different types of samples = different strategies
  - blood, brain tissue, heart tissue, liver tissue
  - stool, urine, swabs from the urethra, throat, vagina, rectum, conjunctiva, cerebrospinal fluid
  - seeds, leaves, roots, wood
  - Gram-positive and Gram-negative bacteria, yeasts, fungi
  - food (cheese, meat, egg, milk)
  - soil, water, manure
(figure labels: plant tissues; animal and human tissues; bacteria and environmental samples)

Disintegration of the sample
- Soft animal tissues - lysis at 50-60°C by Proteinase K
- Proteinase K
  - digests preferentially after hydrophobic amino acids
  - active over a wide range of temperatures (20-60°C), pH and buffers
  - activity is stimulated when up to 2% SDS or up to 4 M urea is included in the reaction
- Solid animal tissues and plant tissues - must be crushed mechanically
- Microorganisms - grinding with sea sand or garnet beads, lysozyme (G+)
- Mechanical grinding

Mechanical grinding
- Liquid nitrogen with mortar and pestle
- Retsch mill, Precellys, Cryomill
- Garnet beads
https://www.youtube.com/watch?v=Z8UvIQXRJFY
https://www.youtube.com/watch?v=k6mPWPuR8PY
https://youtu.be/OwoUAO7vaJA?list=TLGGIXBeSy4AvBcyODA5MjAyMg

Lysis buffer
- The goal of the lysis buffer is to suppress the activity of nucleic acid-degrading enzymes and to separate proteins from nucleic acids
- EDTA - chelation of Mg2+ ions = inhibition of nucleases
- RNasin - inhibitor of RNases
- Detergents - sodium and lithium salts of lauryl sulfate, or Triton X-100 and Tween 20 - act as nuclease inhibitors and at the same time release the nucleic acid from its binding to proteins/histones

Deproteinization
- Phenol - one of the most effective denaturing agents, but phenol can degrade nucleic acids with repeated use.
- Chloroform mixed with isoamyl alcohol - effectively denatures proteins (chloroform denatures proteins and isoamyl alcohol reduces foaming)
- Guanidine hydrochloride - disrupts the structure of proteins and biologically inactivates them. It can be used to isolate both DNA and RNA.
- Sodium perchlorate - removes detergents from extraction solutions by forming their complexes with proteins

Removal of saccharides
- Cetrimonium bromide (CTAB) - can be used to precipitate DNA and RNA, while the saccharides remain in the liquid.
- Tetraethylammonium bromide (TEAB) - separation of RNA from saccharides using a 50% ethanol solution of TEAB. The saccharides precipitate and the RNA remains in the liquid. The saccharides are removed by centrifugation.
- 2.5 M LiCl - LiCl precipitation is useful following RNA isolation or in vitro transcription, because RNA is efficiently precipitated, while protein, carbohydrates, and DNA are precipitated very inefficiently or not at all

Phenol-chloroform isolation of NA
- The phenol-chloroform extraction method is most often used to isolate NA from plant tissues or environmental samples, and large amounts of DNA from blood.
- A mixture of phenol, chloroform and isoamyl alcohol is added to the sample.
- TriReagent, TRIzol - a mixture of phenol and guanidine isothiocyanate (chloroform is added separately)
- Chloroform does not mix with the aqueous solution of the cell lysate, so the mixture separates into two phases - upper aqueous and lower chloroform. Shaking mixes the phases, during which the phenol precipitates the proteins present in the aqueous lysate.
- Acidic phenol (pH≈4) - RNA is isolated in the upper aqueous phase, DNA stays in the interphase
- Basic phenol (pH≈8) - DNA is isolated in the upper aqueous phase
- DNA/RNA is precipitated from the aqueous phase by isopropanol

Phenol-chloroform isolation of NA (figure: acidic phenol vs. basic phenol)

NA precipitation
- Precipitation of RNA and DNA can be facilitated by the addition of a co-precipitant - glycogen, GlycoBlue
- GlycoBlue - a dye covalently linked to glycogen, a branched-chain carbohydrate, which is useful as a nucleic acid co-precipitant.

Isolation of NA using commercial kits
- Types of isolation techniques used by commercial kits:
  - resins - bind DNA specifically
  - membranes (filters)
  - silica columns - specific binding of nucleic acids
  - paramagnetic particles with a differently modified surface
(figure panels: silica columns, membranes, paramagnetic particles)

Silica columns (figure)

Paramagnetic particles (MPs)
- A method of nucleic acid isolation that has become increasingly widespread
- MPs are particles with a size of 5 nm-100 μm formed around a core, which is most often gamma-Fe2O3 (maghemite) or Fe3O4 (magnetite).
- The core is covered by a layer with a specifically prepared surface. This can be adjusted according to which molecules we want to isolate from the given material.
- The size of the MPs can also be adjusted according to what we are isolating: 5-50 nm proteins; 20-450 nm nucleic acids, viruses; 10-100 μm cells.
- The principle of isolation is based on the physico-chemical properties of the MPs.
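
Worked example: the size windows quoted for paramagnetic particles overlap (20-50 nm suits both proteins and nucleic acids), which is easy to overlook in a bullet list. The following Python lookup is only an illustrative sketch; the dictionary and function names are hypothetical, and the numbers are simply the ranges from the slide converted to nanometres.

```python
# Size windows from the slide, in nanometres (1 um = 1000 nm).
MP_SIZE_WINDOWS_NM = {
    "proteins": (5, 50),
    "nucleic acids, viruses": (20, 450),
    "cells": (10_000, 100_000),
}


def targets_for_particle(diameter_nm):
    """Return the target classes whose size window contains this diameter."""
    return [
        target
        for target, (low, high) in MP_SIZE_WINDOWS_NM.items()
        if low <= diameter_nm <= high
    ]


if __name__ == "__main__":
    for diameter in (10, 30, 200, 1_000, 50_000):
        print(diameter, "nm ->", targets_for_particle(diameter) or "no listed target")
```

A 30 nm particle falls into two windows, while a 1 μm particle falls into none of the listed ones.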
Gene Technologies84 Paramagnetic particles Binding of DNA fragments depends on the concentration of ethanol https://www.beckman.com/resources/technologies/spri-beads?wvideo=kh244puadj Gene Technologies85 Purification of DNA from RNA ̶ For some applications it is necessary to have RNA without DNA contamination ̶ Precipitation of DNA with 1/10 volume of isopropyl alcohol - DNA precipitates and RNA remains in solution; however, the method is not 100% ̶ Treatment of sample with DNAse I (RapidOut DNA removal kit) - DNase I binds to Inhibition reagent (beads) - special DNase I with lower Km Gene Technologies86 Quantification and Purity ̶ Measure of concentration and purity by spectrophotometer (NanoDrop) - RNA: A260/280 = 2.0, A260/230 >1.5,  = 40 (μg/mL)-1cm-1 - DNA: A260/280 = 1.8, A260/230 >1.5,  = 50 (μg/mL)-1cm-1 ̶ Measure of concentration by Qubit (fluorometry) ̶ Measure of RNA integrity by Fragment Analyzer or TapeStation (Electrophoresis) - RIN (RNA integrity number) > 7 Gene Technologies87 Fragment Analyzer (RIN) Gene Technologies88 Protein isolation ̶ RIPA buffer (from tissue cultures) - 30mM HEPES, pH 7.4,150 mM NaCl, 1% Nonidet P-40, 0.5% sodium deoxycholate, 0.1% sodium dodecyl sulfate, 5mM EDTA, 1mM NaV04, 50mM NaF, 1mM PMSF, 10% pepstatin A, 10 μg/ml leupeptin, and 10 μg/ml aprotinin ̶ Homogenization in SDT buffer - 4% SDS, 0.1M DTT, 0.1M Tris-HCl pH=7.6 ̶ Homogenization in Urea buffer - 9M Urea, 20mM HEPES pH 8.0 The hydrogen bond interaction between urea and the peptide groups opens the entrance for water and contributes to the unfolding denaturation of protein. Gene Technologies89 Proteins quantification ̶ Bradford assay (A595) - interferuje SDS ̶ Bicinchoninic assay (BCA) (A562) - strong interference –SH group and EDTA - no interference with SDS (up to 5%) ̶ Folin assay (A750) ̶ Measurement of Trp fluorescence (280/350 nm) Gene Technologies90 Manipulation of DNA, RNA and proteins PCR techniques. 
DNA sequencing, high-throughput sequencing methods Gene Technologies91 Chemical synthesis of DNA ̶ H. Gobind Khorana synthesized the first active tRNA molecule of 72 nucleotides (1970) ̶ Artificial DNA synthesis proceeds in the 3' → 5' direction - the first base is attached to CPG (controlled pore glass) - the 5' end is blocked with DMT (dimethoxytrityl) - the DMT group is removed using a weak acid (TCA) - the next nucleotide is added as a so-called phosphoramidite activated by tetrazole - 5'-OH ends of unreacted nucleotides are acetylated using acetic anhydride - the process is repeated (from the left) Har Gobind Khorana, Robert W Holley, Luis W Alvarez, Marshall W Nirenberg, Lars Onsager and Yasunari Kawabata at the awarding of the Nobel Prize in 1968. Gene Technologies92 Chemical synthesis of DNA DMT - dimethoxytrityl Gene Technologies93 Chemical synthesis of DNA https://www.youtube.com/watch?v=1S0x3aRCviM Gene Technologies94 Polymerase Chain Reaction K Mullis, F Faloona, S Scharf, R Saiki, G Horn, H Erlich. Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction.
Cold Spring Harb Symp Quant Biol;1986;51 Gene Technologies95 Modifications of PCR ̶ Inverse PCR ̶ Reverse Transcription PCR (RT-PCR) - 5´RACE, 3´RACE ̶ PCR mutagenesis ̶ Emulsion PCR ̶ Droplet Digital PCR Gene Technologies96 RACE PCR (Rapid Amplification of cDNA Ends) SMARTer® Pico PCR cDNA Synthesis Kit Gene Technologies97 Real-Time PCR Specific fluorescent probes: hydrolysis probes (TaqMan); hybridization probes (molecular beacon, FRET) Gene Technologies98 Emulsion PCR Used in NGS technology (454, Ion Torrent) https://www.youtube.com/watch?v=qKouzbp1RWI Gene Technologies99 Digital PCR
Real-time PCR / qPCR | Digital PCR
Quantitative, relative or absolute, but standard curves or reference samples are needed | Quantitative, absolute, and no standards or references needed
Bulk PCR: flexible reaction volumes; impacted by changes in PCR efficiency because data are collected in the exponential phase; prone to inhibitors | Sample partitioning: higher inhibitor tolerance / increased robustness; unaffected by changes in amplification efficiency; higher statistical power subject to Poisson statistics
Measures PCR amplification at each cycle | Measures at the end of the PCR cycles
Detects mutation rate at >1% | Detects mutation rate at ≥ 0.1% (high signal-to-noise ratio)
Well-established protocols | Higher precision for higher reproducibility across laboratories
https://www.qiagen.com/us/applications/digital-pcr/beginners Gene Technologies100 Droplet digital PCR https://www.youtube.com/watch?v=lAVVoyZxlTU Gene Technologies101 Digital PCR (Qiagen) https://www.qiagen.com/cz/applications/digital-pcr/beginners Gene Technologies102 Sequencing of human genome ̶ a monumental task when it began in 1990 ̶ started in 1990 with the participation of the DOE and NIH ̶ sequencing done using contig maps and BACs ̶ the initial plan envisaged a duration of 15 years ̶ in the end, sequencing by the Sanger method was almost complete as early as 2000 ̶ resulting sequence map published on
April 14, 2003, with 99.99% accuracy (National Human Genome Research Institute) ̶ total cost of the project: 3 billion dollars ̶ in 2000, President Bill Clinton asserted the unpatentability of DNA https://www.genome.gov/25019885/online-education-kit-how-to-sequence-a-human-genome/ Gene Technologies103 Celera Genomics Project ̶ founded by the scientist Craig Venter; its sequencing project started in 1998 ̶ the total cost of 300 million dollars was fully covered by private sources ̶ the "whole genome shotgun sequencing" method was used for the first time ̶ used an approach developed by Gene Myers to analyze the sequencing data ̶ this approach was extremely computationally demanding ̶ final calculation performed on 7000 processors to obtain 1000 times the speed of Pentium computers ̶ this innovative approach allowed sequencing to be completed in just 9 months Gene Technologies104 The strong role of diplomacy It is hard to imagine today’s politicians reminding scientists that cooperation has as much value as competition. On 26 June 2000, US President Bill Clinton and UK Prime Minister Tony Blair presided over a carefully choreographed piece of scientific theatre. Through a video link connecting Washington DC and London, they announced to the world that scientists had completed a rough first draft of the human genome sequence. Craig Venter (left), Francis Collins, Bill Clinton (right) Gene Technologies105 Sanger sequencing • Synthesis of DNA in vitro using "terminators" - dideoxynucleotides that prevent further elongation after being incorporated into DNA.
Deoxyribose Dideoxyribose • It requires an initial primer, DNA polymerase and a mixture of dNTPs with labeled ddNTPs • The synthesized strands are separated using polyacrylamide gel electrophoresis or capillary electrophoresis • Possibility of fully automated separation using fluorescently labeled ddNTPs Gene Technologies106 Sanger sequencing Throughput/Performance by Run Module XLRseq: 768 samples per day (690 Kbases) LongSeq: 1152 samples/day (980 Kbases) StdSeq: 2304 samples/day (1550 Kbases) FastSeq: 2304 samples/day (1600 Kbases) RapidSeq: 3840 samples per day (2100 Kbases) Gene Technologies107 Pyrosequencing (1990) • it enables rapid sequencing of short stretches of DNA - sequencing of 30 to 50 bases takes approximately 30 to 45 minutes. • it is bioluminometric DNA sequencing based on the detection of inorganic pyrophosphate (PPi) released during nucleotide incorporation. Gene Technologies108 454 and GS Junior system
Throughput | 1 billion bases per day
Run time | 10.0 hours
Read length | 400
Reads per run | 1,000,000
Accuracy | >99.0% single-read accuracy over 400 bases
DNA input | less than 100 ng DNA
Multiplexing | up to 192 samples/run
Gene Technologies109 Qiagen – PyroMark instruments ̶ https://www.labtube.tv/video/MTAxNzE1 ̶ https://www.qiagen.com/us/knowledge-and-support/knowledge-hub/explainer-videos-and-demos/pyrosequencing-cascade-reaction Gene Technologies110 Preparation of Sequencing Library 1) Fragmentation of the DNA sample (Covaris, fragmentase) 2) End repair (DNA polymerase) 3) Adaptor ligation (ligase) 4) Selection of fragments (SPRI beads) 5) Amplification of fragments 6) Sequencing (Illumina, Ion Torrent, nanoballs) Fragmentase • A mixture of endonucleases (NEases) cleaving one strand and then the opposite one • A mixture of two enzymes (DNase I and SD (strand-displacement) polymerase) Gene Technologies111 Nextera technology https://doi.org/10.1038/nmeth.f.272 • uses in vitro transposition • transposases catalyze the random insertion
of excised transposons • transposase makes random, staggered double-stranded breaks in the target DNA and covalently attaches the 3′ end of the transferred transposon strand to the 5′ end of the target DNA. • for integration only free transposon ends are sufficient Gene Technologies112 Targeted Enrichment PCR enrichment DNA capture Inversion probes Gene Technologies113 Amplicon sequencing Gene Technologies114 Quantification of NGS library Electrophoretic methods • Fragment Analyzer (Adv. Anal.) • TapeStation (Agilent) • BioAnalyzer (Agilent) Fluorometric methods • Qubit (Thermo Scientific) • Quantus (Promega) Real-Time PCR • KapaBiosystem • NEB KBC/MMB115 Illumina sequencing system MiniSeq, MiSeq, NextSeq, HiSeq, NovaSeq Gene Technologies116 Sequencing in clusters KBC/MMB117 1st step – hybridization on flow-cell Gene Technologies118 2nd step – bridge PCR Gene Technologies119 2nd step – bridge PCR Gene Technologies120 2nd step – bridge PCR Gene Technologies121 2nd step – bridge PCR Gene Technologies122 2nd step – bridge PCR Gene Technologies123 3rd step - linearisation Gene Technologies124 4th step – separation of reverse strand Gene Technologies125 5th step – blocking of 5´end Gene Technologies126 6th step – hybridization of seq. 
primer Gene Technologies127 Reversible terminators https://youtu.be/fCd6B5HRaZ8 Gene Technologies128 Sequencing by synthesis (SBS) Gene Technologies129 Gene Technologies130 7th step – paired-end sequencing Gene Technologies131 7th step – paired-end sequencing Gene Technologies132 7th step – paired-end sequencing Gene Technologies133 7th step – paired-end sequencing Gene Technologies134 Index read Gene Technologies135 Single index read Gene Technologies136 Dual index read iSeq, MiniSeq, NextSeq, HiSeq Gene Technologies137 PGM analyser (Ion Torrent) Application: Targeted, Exome, Transcriptome, Genome Leveraging Consumer Technology for scientific breakthroughs https://www.thermofisher.com/cz/en/home/life-science/sequencing/next-generation-sequencing/ion-torrent-next-generation-sequencing-technology.html KBC/MMB138 DNBSeq (MGI) https://en.mgi-tech.com/products/ https://youtu.be/xUVdJN0m38c • it uses phi29 polymerase to amplify a single-stranded template • this process creates DNA nanoballs • the sequencing cell contains positively charged regions that bind the nanoballs • different sequencing technology KBC/MMB139 Current techniques of 2nd generation KBC/MMB140 3rd generation of sequencers (PACBIO) SEQUEL IIe SYSTEM Long-read sequencing https://www.pacb.com/sequencing-systems/ REVIO SYSTEM Long-read sequencing ONSO SYSTEM Short-read sequencing SBB CHEMISTRY KBC/MMB141 PACBIO • Sequencing based on Single Molecule, Real-Time (SMRT®) technology • It uses so-called Zero-Mode Waveguides (ZMWs), which illuminate only the lower part of the well, where the DNA polymerase is immobilized at the bottom • The main advantage is the possibility of long reads (up to 20 kb) • Another advantage is the possibility of direct detection of methylated bases (epigenome) KBC/MMB142 Library preparation https://www.youtube.com/watch?v=v8p4ph2MAvI KBC/MMB143 3rd generation – Oxford Nanopore • The technology is based on
nanopores • At the beginning of sequencing, NA is bound to a nanopore formed by a protein • It is then denatured and passes through the nanopore, generating a change in current • Based on the observed change, individual bases are read in real time • Enables sequencing of very long chains (tens to hundreds of kilobases) • The disadvantage is a higher error rate (accuracy >95%) Gene Technologies144 GENE TECHNOLOGIES Methods of studying gene expression and function Mapping techniques, DNA libraries, gene expression, metagenomics Gene Technologies145 Mapping techniques ̶ Genome maps provide a series of markers for assembling sequence data ̶ Creation of a genome map: - genetic maps (crossbreeding, pedigree analysis, gene transfer) - linkage maps - physical maps (radiation hybrid panel, FISH) ̶ Genetic maps are based on linkage = the probability that two mapped markers will separate from each other in a cross ̶ To determine the relative distance between markers, the percentage of times they are found together is crucial ̶ A variety of markers are used today
Type of mapping | Markers | Methods of localization
Genetic | Genes, biochemical properties, DNA markers (RFLPs, VNTRs, microsatellites, SNPs) | Linkage analysis using crossing or mating; kinship analysis
Physical | STSs, ESTs, VNTRs, microsatellites | Restriction analysis, radiation hybrid panel, FISH, cytogenetic mapping
Gene Technologies146 Genetic markers ̶ RFLP analysis of related individuals, easy identification ̶ Variable Number Tandem Repeat (VNTR, minisatellites) – tandem repeats 9–80 bp in length (forensic testing, paternity tests) ̶ Microsatellite polymorphism – tandem repeats 2–5 bp in length ̶ Single Nucleotide Polymorphism (SNP) ̶ SNPs, VNTRs and RFLPs are also used in physical mapping ̶ For large genomes we need additional markers - STSs (Sequence Tagged Sites) – unique sequences of 100–500 bp - ESTs (Expressed Sequence Tags) – identified in cDNA libraries ̶ Digestion of gDNA using restriction enzymes - physical mapping
method Clark and Pazdernik, 2016 Gene Technologies147 Genetic markers Clark and Pazdernik, 2016 Gene Technologies148 Physical mapping techniques ̶ FISH (fluorescence in situ hybridization) – localization of a specific DNA sequence on metaphase chromosomes relative to banding (chromosome painting) ̶ radiation hybrid mapping – large segments of the cloned genome may contain two fragments from different parts of the genome Clark and Pazdernik, 2016 Gene Technologies149 Number of genes vs. genome size
Organism | Genome size (Mbp) | Number of protein-coding genes
Wheat | 17 000 | 95 000
Rice | 520 | 45 000
Paris japonica (canopy plant) | 149 000 | 26 000
Trichomonas vaginalis | 160 | 46 000
Encephalitozoon intestinalis | 2.25 | 1833
Marbled lungfish | 130 000 | ?
Human | 3200 | 21 850
Nematode | 97 | 20 493
Fruit fly | 180 | 13 600
Streptomyces coelicolor | 8.7 | 7800
E. coli | 4.6 | 4300
Mycoplasma genitalium | 0.58 | 470
Gene Technologies150 DNA libraries ̶ Used for: - finding new genes - genome sequencing - comparison of genes from different organisms ̶ Basic steps in creating a library: - isolation of chromosomal DNA - cleavage of DNA with a restriction enzyme - linearization of the vector - insertion of fragments into the vector - transformation into E.
coli Clark and Pazdernik, 2016 Gene Technologies151 Eukaryotic expression libraries ̶ The vector contains the sequences necessary for transcription and translation ̶ Constructed from complementary DNA (cDNA) ̶ Identification of new genes, splicing variants Clark and Pazdernik, 2016 Gene Technologies152 Medical genomics ̶ The largest application of genomic data is in disease diagnosis ̶ Genetic testing – determination of the presence of a gene associated with the disease: - muscular dystrophy (dystrophin gene) - cystic fibrosis (CFTR gene) - Huntington's disease (HTT gene) Gene Technologies153 Medical genomics ̶ To identify causal mutations, it is more advantageous to sequence the exome (2% of the genome) than the whole genome ̶ Currently, more than 3,000 diseases have been identified using genomics and pedigree analysis - so-called Mendelian diseases (a mutation in one gene leads to the disease) ̶ Many diseases are polygenic (multiple genes contribute to the development of the disease) - Crohn's disease - autoimmune diseases - psychiatric disorders (schizophrenia, AD, mild cognitive impairment) ̶ For these diseases, GWAS (genome-wide association studies) are used - analysis of single-nucleotide polymorphisms (SNPs) - frequency lower than 1% - influence of genotype and environment on disease development Gene Technologies154 Gene expression – WGAs, ChIP ̶ WGAs (whole-genome tiling arrays) cover the entire genome ̶ First used in Arabidopsis (25-mer oligonucleotides) ̶ Discovery of new genes, splicing variants ̶ ChIP (chromatin immunoprecipitation): - analysis of DNA regions bound by individual transcription factors - analysis of DNA regions associated with histone PTMs Clark and Pazdernik, 2016 Gene Technologies155 Gene expression – RT-qPCR Use of specific fluorescent probes: hydrolysis probes (TaqMan); hybridization probes (molecular beacons, FRET) Use of an intercalating dye Gene Technologies156 Gene expression - RNAseq ̶ Advantages of the RNAseq method: - does not depend on probes (more correct
quantification of given RNA molecules) - large dynamic range - detection of alternative splicing variants and the possibility of quantifying them - analysis possible without knowledge of the genome sequence - analysis possible from a single cell Clark and Pazdernik, 2016 Gene Technologies157 MetaRibo-Seq Fremin et al. 2020 ̶ Riboseq – translation arrest and subsequent sequencing of the translatome Gene Technologies158 Metagenomics ̶ A study of the genetic material contained in a sample ̶ Shotgun approach vs. sequencing of specific phylogenetic marker regions (16S, 18S, ITS, mcrA) Johnson et al. 2019
Microbiome | Microorganisms (and their genes) living in a specific environment
Microbiota | Microorganisms (by type) living in a specific environment
Metagenome | The genes of microorganisms in a specific environment
Gohl et al. 2016 Gene Technologies159 Monitoring of gene expression ̶ A whole range of details about a gene can be obtained using reporter genes - adding a reporter gene behind the promoter - adding a reporter gene behind the CDS ̶ The following genes are used: - lacZ gene (β-galactosidase) - phoA gene (alkaline phosphatase) - lux/luc gene (luciferase) - gfp gene (Green Fluorescent Protein) Clark and Pazdernik, 2016 Gene Technologies160 Analysis of methylome ̶ Analysis of gDNA methylation sites ̶ Methylation usually silences transposable elements ̶ Silencing of one copy of the X chromosome in females ̶ Analysis using the bisulfite method - the addition of sodium bisulfite converts non-methylated cytosines to uracil - subsequent sequencing without and with bisulfite treatment reveals the methylation sites ̶ 3rd generation sequencers (Nanopore, PacBio) can read cytosine methylation directly Clark and Pazdernik, 2016 Gene Technologies161 GENE TECHNOLOGIES Gene Cloning Strategies Restriction endonucleases, plasmids, and cloning vectors, optimization of gene expression, expression in foreign hosts Gene Technologies162 Restriction
enzymes ̶ Bacterial enzymes that bind to a specific sequence and cleave both strands ̶ Protect bacteria from foreign DNA (viruses) ̶ Sensitive to DNA methylation ̶ Two basic types: Type I - cleaves the DNA 1000 or more bases from the recognized sequence; Type II - cleaves the DNA at the recognized sequence (blunt or sticky ends) ̶ The number of bases recognized determines the degree of DNA fragmentation ̶ Joining fragments - ligase (T4 ligase) Clark and Pazdernik, 2016 Gene Technologies163 Restriction enzymes (structure) Pingoud and Jeltsch, 2001 Gene Technologies164 Fragmentase ̶ Used for DNA fragmentation in NGS ̶ A mixture of endonucleases (NEases) cleaving one strand and then the opposite one ̶ A mixture of two enzymes (DNase I and SD (strand-displacement) polymerase) Ignatov et al. 2019 Gene Technologies165 Cloning vectors ̶ Specialized plasmids (or other elements) carrying foreign DNA for study/manipulation ̶ Currently, artificial chromosomes and viruses are also used ̶ Basic properties of cloning vectors: - small size (easy handling and isolation) - easy transfer between cells by transformation - easy isolation from the host organism - easy detection and selection - occurrence in a larger number of copies (ori site) - multiple cloning site for insertion of cloned DNA - a method for confirming the presence of inserted DNA in the vector Clark and Pazdernik, 2016 Gene Technologies166 Cloning vectors ̶ Options for checking DNA insertion - insertional inactivation (ATB resistance gene) - ccdB gene (death gene interfering with DNA gyrase activity) (https://link.springer.com/article/10.1007/BF00280310) - alpha complementation (β-galactosidase) ̶ Yeast vectors based on the 2μ circle - ori sites from two organisms, the Cen sequence - selection based on AA synthesis Clark and Pazdernik, 2016 Gene Technologies167 Virus vectors ̶ Bacteriophage vectors - modified to carry non-viral DNA in the capsid - connection of cos sequences = formation of a replication form (RF)
replicated by a rolling circle - an insert with a size of 37 to 52 kb can be used - use of helper viruses to package DNA into the virus capsid ̶ Cosmids - a highly modified lambda vector having only cos sites - the necessity of packaging by helper phage Clark and Pazdernik, 2016 Gene Technologies168 Artificial chromosomes ̶ Used for handling large pieces of DNA (150 – 2000 kb) ̶ Include - yeast artificial chromosomes (YACs) - bacterial artificial chromosomes (BACs) - P1 bacteriophage artificial chromosomes (PACs) ̶ YACs contain a centromere and telomeres for permanent maintenance in yeast ̶ BACs are circular and propagated in bacteria (ori site and resistance gene) Trends in Biotechnology 2000, DOI: (10.1016/S0167-7799(00)01438-4) Gene Technologies169 DNA transformation ̶ Transformation is the process by which foreign DNA is introduced into a cell. ̶ Competent E. coli cells: - the use of calcium ions and heat shock to increase the permeability of the cell wall and membrane - use of electroporation to open the cell wall and membrane ̶ Competent yeast: - a combination of lithium acetate, single-stranded carrier DNA and polyethylene glycol (PEG) Gene Technologies170 Cloning strategies ̶ TOPO Cloning (Thermo) - use of topoisomerase I - Vaccinia virus topoisomerase I specifically recognizes the sequence 5'-(C/T)CCTT-3' - topoisomerase is covalently attached to the 3' end of the vector Gene Technologies171 Cloning strategies ̶ TA cloning - uses the property of Taq DNA polymerase to add A to the 3' end - pMiniT 2.0 (toxic mini-genes) (NEB) - pGEM-T Easy (blue-white selection) (Promega) Gene Technologies172 ̶ GATEWAY cloning vectors (Invitrogen-Thermo) - use of phage lambda integrase and excisionase enzymes - use of ENTRY and DESTINATION vectors - the BP reaction recombines the gene of interest, flanked by attB sites, with the attP sites of a donor vector, giving an entry clone in which the gene is flanked by attL sites.
- the LR reaction moves the gene of interest from the attL sites of the entry clone into the attR sites of the destination vector Clark and Pazdernik, 2016 Cloning strategies Gene Technologies173 Cloning strategies https://www.youtube.com/watch?v=tlVbf5fXhp4 ̶ In 2009 Dr. Daniel Gibson and colleagues at the J. Craig Venter Institute developed a new method to easily assemble multiple linear DNA fragments ̶ Advantages: I. There is no need for specific restriction sites. II. Joins any fragments regardless of order. III. The reaction takes place in one tube. ̶ Gibson's Mix consists of three different enzymes: I. T5 exonuclease II. Phusion DNA polymerase III. Taq DNA ligase Gene Technologies174 Expression vectors ̶ The most commonly used is the lacUV5 promoter (modified lac promoter) - RNA polymerase binding site - lacI repressor site - transcription start site - transcription termination site ̶ Another frequently used promoter is the lambda left promoter (PL) - lambda repressor binding site - most often activated by increased temperature (42°C) ̶ Expression systems also use a promoter that binds only bacteriophage T7 RNA polymerase - E. coli strains carry T7 RNA polymerase under the control of an inducer ̶ Expression vectors often contain sequences for various tags (6xHis, Myc, FLAG, S-tag, MBP) Clark and Pazdernik, 2016 Gene Technologies175 Bacterial Expression Vectors ̶ pET, pRSET E. coli T7 expression vectors - expression in BL21(DE3)pLysS cells ̶ pMAL expression vectors - carry maltose-binding protein (MBP) Gene Technologies176 Yeast Expression Vectors ̶ Inducible AOX promoter (methanol) ̶ Possibility of intra- and extracellular expression ̶ Expression in yeasts P. pastoris and S.
cerevisiae Gene Technologies177 Expression in Bacteria ̶ Special plasmids (expression vectors) are used to increase protein expression - strong promoter, adequate ori site, selection marker for antibiotic ̶ Expression of eukaryotic proteins is more problematic - promoter modification, absence of splicing, low rate of translation - weak interaction of the ribosome with the RBS site, mRNA instability, limited amount of tRNA ̶ The necessity of using specially modified vectors Clark and Pazdernik, 2016 Gene Technologies178 E. coli Origami™ 2 Expression of oncostatin M (OSM): A (37°C), B (18°C). C - control without IPTG, I - lysate, P - pellet, S - soluble fraction (Nguyen et al., 2019, SciRep) ̶ The cells carry mutations in the thioredoxin reductase (trxB) and glutathione reductase (gor) genes ̶ Increased formation of disulfide bonds in the cytoplasm of E. coli ̶ Suitable for proteins requiring the formation of S-S bridges for proper folding Berkmen, 2012 Gene Technologies179 Translational Expression Vectors ̶ Designed for protein expression (pET, pRSET) - maximum translation initiation - consensus RBS site - ATG codon at an optimal distance of 8 bases from the RBS - cloning site directly at the ATG codon (Nco I) ̶ The possibility of further complications in protein folding Clark and Pazdernik, 2016 Gene Technologies180 Codon Effect ̶ Protein expression in other organisms (eukaryotic in bacteria) ̶ Different organisms prefer different codons for a given AA - optimization of the codons used in gene synthesis - up to a 10-fold increase in production - supplying the organism with tRNAs for rare codons - E.
coli ROSETTA – seven tRNAs for rare codons (AGA, AGG, AUA, CUA, GGA, CCC, and CGG) Clark and Pazdernik, 2016 Gene Technologies181 Toxic effect of overexpression Lactose operon Arabinose operon Clark and Pazdernik, 2016 Gene Technologies182 Autoinduction Medium Gene Technologies183 Inclusion Bodies ̶ Misfolded proteins accumulate in inclusion bodies ̶ Molecular chaperones help with proper folding ̶ Possible secretion of proteins into the periplasm or medium ̶ Proteins can be solubilized from inclusion bodies with a chaotropic agent and then renatured Clark and Pazdernik, 2016 Gene Technologies184 Secretion of Proteins ̶ Possible expression into the periplasm or medium ̶ Secretion is controlled by a hydrophobic sequence at the N-terminus that is cleaved by signal peptidase - possible addition of a signal sequence to the protein (risk of inclusion bodies) - possible fusion with a naturally secreted protein (maltose-binding protein in E. coli) - possible secretion in gram-positive bacteria (Bacillus) - use of a special Type I secretion system (hemolysin secretion, E. coli) or Type II (exotoxin A, Pseudomonas) - use of autotransporter proteins Gene Technologies185 Secretion of Proteins Secretion system of type I and II Autotransporter proteins Clark and Pazdernik, 2016 Gene Technologies186 Protein glycosylation ̶ A whole range of proteins in higher organisms is glycosylated ̶ Glycosylation is necessary for proper function - e.g.
membrane proteins ̶ Bacteria carry out O-glycosylation (N-glycosylation has also been discovered in the genus Campylobacter) ̶ Eukaryotic organisms mostly have N-glycosylation ̶ Insect cells are a solution for the expression of glycosylated proteins - a different glycosylation pattern compared to mammals - the solution is modified insect cells with a mammalian glycosylation pathway ̶ A change in the glycosylation pattern can affect the properties of the protein - recombinant human erythropoietin contains an extra N-glycosylation site (Asn-Xxx-Ser/Thr) - lower affinity for the receptor, but a longer half-life prolongs the overall clinical activity Gene Technologies187 Protein glycosylation Clark and Pazdernik, 2016 Gene Technologies188 Protein Expression in Eukaryotic Cells ̶ A number of eukaryotic proteins are more efficiently expressed in eukaryotic cells ̶ Possibility of post-translational modifications - chemical modifications forming new amino acids - formation of disulfide bridges - glycosylation - addition of functional groups (fatty acids, acetylation, phosphorylation, methylation, sulfation) - cleavage of precursor proteins required for secretion, assembly, and/or activation Clark and Pazdernik, 2016 Gene Technologies189 Yeasts ̶ A whole range of advantages - easy cultivation on a small and large scale - the yeast S.
cerevisiae is considered a safe organism - yeasts secrete very few of their own proteins – an advantage when secreting the expressed protein - DNA can be easily introduced by transformation (chemical, enzymatic, electroporation) - a whole series of promoters has been characterized for targeted expression - capable of a whole range of post-translational modifications characteristic of eukaryotic organisms - glycosylation takes place only in secreted proteins ̶ Recombinant proteins are frequently secreted using the signal sequence of the mating factor α gene ̶ The signal peptidase recognizes the Lys-Arg sequence Clark and Pazdernik, 2016 Gene Technologies190 Yeasts ̶ Currently expressed in the yeasts S. cerevisiae and P. pastoris - insulin - clotting factor VIIIa - various growth factors - viral proteins for the production of vaccines or diagnostics (HIV, HBV, HCV) ̶ The most common expression problems in yeast - loss of expression plasmids in large-scale cultivations - secreted proteins remain between the PM and the cell wall - hyper-glycosylation of secreted proteins occurs (solved by strain modification) Sheng et al.
2017 Gene Technologies191 GENE TECHNOLOGIES Technologies in Immunology Antibodies (structure, function), targeted antibody design, monoclonal antibodies, ELISA, vaccines (design and production, identification of potential new antigens, DNA vaccines) Gene Technologies192 Introduction ̶ The surrounding environment is full of infectious microorganisms and viruses ̶ Protection of the body by the cells of the immune system ̶ Antigens - mostly proteins on the surface of microorganisms = activation of immune response ̶ Antibodies - recognize and bind to antigens = produced by B-cells of the adaptive immune system ̶ Antibodies are mostly secreted into the lymph, some bind to the cell surface = B-cell receptors ̶ Massive proliferation of B-cells producing antibodies recognizing a given antigen ̶ Immune system records all successfully used antibodies = faster and more massive response Gene Technologies193 Introduction Gene Technologies194 Antigen, antibody, epitope ̶ Antigen - a foreign molecule that activates the immune system ̶ Strongest immune responses = glycoproteins and lipoproteins ̶ Very often polysaccharides on the surface of microorganisms serve as antigens ̶ DNA can also serve as an antigen ̶ The animal immune system is based on specific (acquired) immunity divided into: - humoral immunity (mediated by immunoglobulins) - cell-mediated immunity (T-lymphocytes = TH and TC) ̶ Antibody = binding to whole proteins ̶ T-lymphocytes = binding to protein fragments ̶ Epitope - region of protein recognized by antibody Gene Technologies195 T-lymphocytes ̶ recognize only antigens expressed on the surface of other cells, mainly macrophages, virus-infected cells or B-lymphocytes ̶ T-lymphocytes recognise these cells via class I and II major histocompatibility complex (MHC) receptor proteins ̶ Class I activates TC cells and class II activates TH cells ̶ MHC receptors are encoded by a family of genes specific to each individual ̶ MHC receptors
are also called human leukocyte antigens (HLA) in humans Gene Technologies196 T-lymphocytes Gene Technologies197 Structure and Function of Immunoglobulins ̶ Antibodies are divided into 5 basic classes ̶ The most abundant in serum are IgG ̶ Only IgG antibodies cross the placenta ̶ IgA - secretory antibodies important in suppressing respiratory and gastrointestinal infections ̶ IgM - 10 binding sites = coating microorganisms and stimulating cells ̶ IgE - on the surface of mast cells, stimulation of allergic response by histamine release Gene Technologies198 Structure and Function of Immunoglobulins ̶ IgG antibody consists of two light and two heavy chains ̶ Light chains are encoded by one of two gene loci, κ or λ ̶ Each of the light and heavy chains consists of one to four constant regions and one variable region ̶ The variable regions form the so-called paratope (antigen-binding site) ̶ We have millions of different variable regions ̶ In the hinge region, antibodies can be cleaved enzymatically (by papain) into one Fc and two Fab fragments Gene Technologies199 Diversity of Antibodies ̶ There is an almost infinite number of antigens = an almost infinite number of antibodies is needed ̶ Genetic problem concerning the number of genes encoding each antibody ̶ The entire human genome would encode only a few million antibodies ̶ The immune system generates a large number of sequences from a relatively small number of genes in the process of V(D)J recombination ̶ The immune system assembles genes for antibodies from collections of short DNA segments ̶ V(D)J recombination occurs in the bone marrow during B-cell development and is initiated by RAG1 and RAG2 proteins followed by NHEJ Gene Technologies200 V(D)J Recombination Smith et al. 2019 Backhaus et al. 2018 Chr. 2 Chr.
14
https://youtu.be/QTOBSFJWogE

Monoclonal Antibodies
 ̶ Antibodies find wide clinical use
 ̶ Clinical use requires one specific antibody against a given antigen
 ̶ One antigen has many epitopes = immunization yields polyclonal antibodies
 ̶ Polyclonal antibodies = mixture of antibodies with different degrees of specificity and binding strength
 ̶ Monoclonal antibody = one specific antibody from one B-cell clone
 ̶ Viability of B-cells outside the body is very low = fusion with myeloma cells
 ̶ The resulting cell is called a hybridoma = an immortal cell producing the targeted antibody

Use of Antibodies
 ̶ ELISA
 ̶ Rapid tests
 ̶ FACS (fluorescence-activated cell sorting)

"Humanization" of Monoclonal Antibodies
 ̶ The human immune system recognises mouse antibodies as foreign
 ̶ Several solutions:
   - Replacing the C-region with a human variant of the antibody
   - Replacing the parts of the V-regions not involved in antigen recognition with a human variant
 ̶ Complementarity Determining Region (CDR) - hypervariable region that recognizes the antigen (Ag)

Herceptin and Casirivimab
 ̶ Herceptin (trastuzumab) - a monoclonal antibody that recognises human epidermal growth factor receptor 2 (HER2)
 ̶ In breast cancer patients, HER2 overproduction is associated with resistance to chemotherapy
 ̶ Binding of antibodies to the receptor prevents its internalization = better efficacy of chemotherapy
 ̶ Casirivimab - a monoclonal antibody that recognizes the SARS-CoV-2 coronavirus spike protein

Nanobodies
 ̶ Camels, alpacas and llamas produce heavy-chain-only antibodies (hcAb) in addition to conventional antibodies
 ̶ The antigen is bound by the terminal variable region of the heavy chain, called the VHH (12-15 kDa)
 ̶ Recombinant antibodies containing only this part are called nanobodies (Nb)
 ̶ The VHH region has a very high affinity for the antigen
 ̶ Nanobodies can cross into the brain

Vaccines
 ̶ The immune system remembers foreign antigens - immune memory
 ̶ Special memory B-cells mediate immune memory
 ̶ Vaccines consist of
derivatives of infectious agents that can no longer cause disease but are still antigenic
 ̶ Types of vaccines:
   - Attenuated = pathogens still alive but no longer producing disease-causing toxins or proteins
   - Subunit = contains only one component of the pathogen, often requires the use of adjuvants
   - Multivalent = targets several proteins from one or more viruses
 ̶ Vaccines from attenuated microorganisms usually induce the best immune response

Vaccines (figure panels): Attenuated vaccines; Subunit vaccines

Search for Suitable Antigens and Adjuvants
 ̶ Reverse vaccinology = sequential cloning of pathogen genes and expression of proteins used for immunization (vaccine for Neisseria meningitidis serogroup B)

Table: adjuvants, their composition, and the vaccines that contain them

Adjuvant: Aluminum
Composition: One or more of the following: amorphous aluminum hydroxyphosphate sulfate (AAHS), aluminum hydroxide, aluminum phosphate, potassium aluminum sulfate (Alum)
Vaccines: Anthrax, DT, DTaP (Daptacel), DTaP (Infanrix), DTaP-IPV (Kinrix), DTaP-IPV (Quadracel), DTaP-HepB-IPV (Pediarix), DTaP-IPV/Hib (Pentacel), Hep A (Havrix), Hep A (Vaqta), Hep B (Engerix-B), Hep B (Recombivax), HepA/Hep B (Twinrix), HIB (PedvaxHIB), HPV (Gardasil 9), Japanese encephalitis (Ixiaro), MenB (Bexsero, Trumenba), Pneumococcal (Prevnar 13), Td (Tenivac), Td (Mass Biologics), Tdap (Adacel), Tdap (Boostrix)

Adjuvant: AS04
Composition: Monophosphoryl lipid A (MPL) + aluminum salt
Vaccines: Cervarix

Adjuvant: MF59
Composition: Oil-in-water emulsion composed of squalene
Vaccines: Fluad

Adjuvant: AS01B
Composition: Monophosphoryl lipid A (MPL) and QS-21, a natural compound extracted from the Chilean soapbark tree, combined in a liposomal formulation
Vaccines: Shingrix

Adjuvant: CpG 1018
Composition: Cytosine phosphoguanine (CpG), a synthetic form of DNA that mimics bacterial and viral genetic material
Vaccines: Heplisav-B

Adjuvant: No adjuvant
Vaccines: ActHIB, chickenpox, live zoster (Zostavax), measles, mumps & rubella (MMR), meningococcal (Menactra, Menveo), rotavirus, seasonal influenza (except Fluad), single antigen polio (IPOL), yellow fever

Search for Suitable Antigens and
Adjuvants
 ̶ Reverse vaccinology
 ̶ Differential fluorescence induction (DFI)
 ̶ In-vivo induced antigen technology (IVIAT)

Adenovirus vaccines
Bermejo et al, 2020
https://sputnikvaccine.com/about-vaccine/

mRNA vaccines
Versteeg et al, 2019
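
Worked example for the Diversity of Antibodies slide: the statement that V(D)J recombination generates a large number of sequences from a small number of genes can be checked with simple arithmetic. The sketch below multiplies the numbers of V, D and J segments for each chain. The segment counts are approximate textbook values for functional human segments and are an assumption, not taken from the slides; the names `SEGMENTS`, `chain_combinations` and `combinatorial_diversity` are hypothetical helpers introduced only for this illustration.

```python
from math import prod

# Approximate numbers of functional human gene segments.
# ASSUMPTION: rounded textbook estimates, not values from the slides;
# real counts differ between individuals and between sources.
SEGMENTS = {
    "heavy": {"V": 40, "D": 23, "J": 6},   # heavy-chain locus
    "kappa": {"V": 40, "J": 5},            # kappa light-chain locus
    "lambda": {"V": 30, "J": 4},           # lambda light-chain locus
}


def chain_combinations(segments):
    """Number of distinct segment joins for one chain (product of counts)."""
    return prod(segments.values())


def combinatorial_diversity(loci=SEGMENTS):
    """Return (heavy, light, total) combinatorial diversity.

    A B-cell uses either a kappa or a lambda light chain, so the light-chain
    options are added; any heavy chain can pair with any light chain, so the
    two numbers are multiplied.
    """
    heavy = chain_combinations(loci["heavy"])
    light = chain_combinations(loci["kappa"]) + chain_combinations(loci["lambda"])
    return heavy, light, heavy * light


if __name__ == "__main__":
    heavy, light, total = combinatorial_diversity()
    print(f"heavy-chain combinations: {heavy}")
    print(f"light-chain combinations: {light}")
    print(f"heavy x light pairings:   {total}")
```

With these assumed counts, fewer than 160 gene segments already give about 1.8 million heavy/light pairings. The sketch covers segment choice only; imprecise joining at the segment junctions and later somatic hypermutation increase the real diversity by several further orders of magnitude.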