C6215 Advanced Biochemistry and its Methods
Lesson 1: Introduction to Genomics

Jan Hejátko
Functional Genomics and Proteomics of Plants, Central European Institute of Technology (CEITEC) and National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno
hejatko@sci.muni.cz, www.ceitec.eu

Outline
▪ Definition Of Genomics
▪ Forward vs Reverse Genetics
▪ Gene Structure and Identification
▪ Nucleic Acid Sequencing
▪ Analysis of Gene Expression

Outline
▪ Definition Of Genomics

GENOMICS – What is it?
▪ Sensu lato (in the broad sense): the study of the STRUCTURE and FUNCTION of genomes
▪ Sensu stricto (in the narrow sense): the study of the FUNCTION of INDIVIDUAL GENES – FUNCTIONAL GENOMICS
   ▪ It uses mainly reverse genetics approaches
▪ Necessary prerequisite: knowledge of the genome (sequence) – work with databases

Outline
▪ Definition Of Genomics
▪ Forward vs Reverse Genetics

Forward Genetics / Reverse Genetics
[Figure labels: 3 : 1; ?; 5′TTATATATATATATTAAAAAATAAAATAA AAGAACAAAAAAGAAAATAAAATA….3′; BIOINFORMATICS; FUNCTIONAL GENOMICS]

Outline
▪ Definition Of Genomics
▪ Forward vs Reverse Genetics
▪ Gene Structure and Identification

Gene Structure
▪ Promoter
▪ Transcriptional start
▪ 5′ UTR
▪ Translational start
▪ Splicing sites
▪ Stop codon
▪ 3′ UTR
▪ Polyadenylation signal
[Figure labels: TATA; ATG….ATTCATCAT; ATTATCTGATATA; 5′ UTR; 3′ UTR; ….ATAAATAAATGCGA]

▪ Omitting 5′ and 3′ UTR
▪ Identification of translation start (ATG) and stop codon (TAG, TAA, TGA)
▪ Finding donor (typically GT) and acceptor (AG) splicing sites
▪ Many ORFs are NOT real coding sequences
▪ Using various statistical models (e.g.
Hidden Markov Model – HMM, see recommended literature, Majoros et al., 2003) to evaluate and score the weight of identified donor and acceptor sites

Identification of Genes Ab Initio

Experimental Gene Identification
▪ Principles of experimental identification of genes using forward and reverse genetics
   ▪ Alteration of phenotype after mutagenesis
      ▪ Forward genetics
   ▪ Identification of sequence-specific mutant and analysis of its phenotype
      ▪ Reverse genetics
   ▪ Analysis of expression of a particular gene and its spatiotemporal specificity

Experimental Gene Identification
▪ Principles of experimental identification of genes using forward and reverse genetics
   ▪ Alteration of phenotype after mutagenesis
      ▪ Forward genetics

Identification of CKI1 via Activation Mutagenesis
→ CKI1 overexpression mimics cytokinin response
Kakimoto, Science, 1996
[Figure labels: NO hormones; tZ; ctrl1; ctrl2; Plasmid Rescue; Pro35S::CKI1]

Signal Transduction via MSP
AHK sensor histidine kinases
• AHK2
• AHK3
• CRE1/AHK4/WOL
HPt Proteins
• AHP1-6
Response Regulators
• ARR1-24
[Figure labels: NUCLEUS; PM; REGULATION OF TRANSCRIPTION; INTERACTION WITH EFFECTOR PROTEINS]

Reverse Genetics
▪ Principles of experimental identification of genes using forward and reverse genetics
   ▪ Alteration of phenotype after mutagenesis
      ▪ Forward genetics
   ▪ Identification of sequence-specific mutant and analysis of its phenotype
      ▪ Reverse genetics

Identification of insertional cki1 mutant

CKI1 Regulates Female Gametophyte Development
Hejátko et al., Mol Genet Genomics (2003)
[Figure labels: CKI1/CKI1; CKI1/cki1-i]

P: CKI1/cki1-i
F1 (anticipated):
              ♂ CKI1         ♂ cki1-i
   ♀ CKI1     CKI1/CKI1      CKI1/cki1-i
   ♀ cki1-i   CKI1/cki1-i    cki1-i/cki1-i
Anticipated: 1 CKI1 : 2 CKI1/cki1-i : 1 cki1-i
Observed: 1 CKI1 : 1 CKI1/cki1-i
→ cki1-i reveals non-Mendelian inheritance

CKI1 and Megagametogenesis
A. ♂ wt x ♀ CKI1/cki1-i
B. ♂ CKI1/cki1-i x ♀ wt
C. ♂ wt x ♀ CKI1/cki1-i
D.
♂ CKI1/cki1-i x ♀ wt
[Figure labels: CKI1 specific primers (PCR positive control); cki1-i specific primers]
→ cki1-i is not transmitted through the female gametophyte

CKI1 and Megagametogenesis
[Figure labels: FG 0; FG 1; FG 2; FG 3; FG 4]

CKI1 and Megagametogenesis
[Figure labels: cki1-i; CKI1; late FG5; FG6; FG7; 24 HAE; 48 HAE]
Hejátko et al., Mol Genet Genomics (2003)

Experimental Gene Identification
▪ Principles of experimental identification of genes using forward and reverse genetics
   ▪ Alteration of phenotype after mutagenesis
      ▪ Forward genetics
   ▪ Identification of sequence-specific mutant and analysis of its phenotype
      ▪ Reverse genetics
   ▪ Analysis of expression of a particular gene and its spatiotemporal specificity

CKI1 and Megagametogenesis
[Figure labels: FG0-FG1; FG3-FG4; FG4-FG5; FG7]

Paternal CKI1 is Expressed Early after Fertilization
♀ wt x ♂ ProCKI1:GUS
[Figure labels: 12 HAP (hours after pollination); 24 HAP; 48 HAP; 72 HAP; 24 HAP]
Hejátko et al., Mol Genet Genomics (2003)

Outline
▪ Definition Of Genomics
▪ Forward vs Reverse Genetics
▪ Gene Structure and Identification
▪ Nucleic Acid Sequencing

Frederick Sanger
1958 – Nobel prize – insulin structure
1977 – Dideoxy sequencing method
1980 – second Nobel prize for NA sequencing

Sanger Sequencing

NGS Sequencing

Outline
▪ Definition Of Genomics
▪ Forward vs Reverse Genetics
▪ Gene Structure and Identification
▪ Nucleic Acid Sequencing
▪ Analysis of Gene Expression

Gene Expression Assays
▪ Methods of gene expression analysis
   ▪ Quantitative analysis of gene expression
      ▪ DNA chips
      ▪ Next generation transcriptional profiling
   ▪ Qualitative analysis of gene expression
      ▪ Preparation of transcriptional fusion of the promoter of the analysed gene with a reporter gene
      ▪ Preparation of translational fusion of the coding region of the analysed gene with a reporter gene
      ▪ Use of the data available in public databases
      ▪ Tissue- and cell-specific gene expression analysis

Expression Assays
▪ Methods of gene expression analysis
   ▪ Quantitative analysis of gene expression
      ▪ DNA chips

DNA Chips
▪ DNA chips
   ▪ a method enabling rapid comparison of a large number of genes/proteins between the tested sample and the control
   ▪ oligo DNA chips are used most often
   ▪ commercially available sets exist for the whole genome
      ▪ Operon (Qiagen): 29,110 70-mer oligonucleotides representing 26,173 protein-coding genes, 28,964 transcripts and 87 microRNA genes of Arabidopsis thaliana
   ▪ photolithographic techniques can be used to prepare the chips, which simplifies oligonucleotide synthesis (e.g. for the whole human genome, ca. 3.1 x 10^9 bp, this technique can prepare 25-mers in only 100 steps)
   ▪ chips serve not only for expression analysis but also, e.g., for genotyping (SNPs – single nucleotide polymorphisms, sequencing by chips, …)
Affymetrix ATH1 Arabidopsis genome array

DNA Chips
▪ For the correct interpretation of the results, good knowledge of advanced statistical methods is required
▪ Control of accuracy of the measurement (repeated measurements on several chips with the same sample, comparing the same samples analysed on different chips with each other)
▪ It is necessary to include a sufficient number of controls and repeats
▪ Control of reproducibility of measurements (repeated measurements with different samples isolated under the same conditions on the same chip, compared with each other)
▪ Identification of a reliable measurement threshold
▪ Finally, comparing the experiment with the control, or comparing different conditions with each other -> the result
▪ A great number of results from various experiments are currently available in publicly accessible databases
Che et al., 2002
[Figure labels: unreliable; reliable]

Gene Expression Assays
▪ Methods of gene expression analysis
   ▪ Quantitative analysis of gene expression
      ▪ DNA chips
      ▪ Next generation transcriptional profiling

Next Gen Transcriptional Profiling
□ Transcriptional profiling via RNA sequencing
mRNA sequencing by Illumina and determination of the number of transcripts
[Figure labels: WT; hormonal mutant; mRNA; cDNA; cDNA]

□ Transcriptional profiling yielded more than 7K differentially regulated
genes…

gene       locus                  sample_1  sample_2  status  value_1     value_2   log2(fold_change)  test_stat     p_value      q_value      significant
AT1G07795  1:2414285-2414967      WT        MT        OK      0           1.1804    1.79769e+308       1.79769e+308  6.88885e-05  0.000391801  yes
HRS1       1:4556891-4558708      WT        MT        OK      0           0.696583  1.79769e+308       1.79769e+308  6.61994e-06  4.67708e-05  yes
ATMLO14    1:9227472-9232296      WT        MT        OK      0           0.514609  1.79769e+308       1.79769e+308  9.74219e-05  0.000535055  yes
NRT1.6     1:9400663-9403789      WT        MT        OK      0           0.877865  1.79769e+308       1.79769e+308  3.2692e-08   3.50131e-07  yes
AT1G27570  1:9575425-9582376      WT        MT        OK      0           2.0829    1.79769e+308       1.79769e+308  9.76039e-06  6.647e-05    yes
AT1G60095  1:22159735-22162419    WT        MT        OK      0           0.688588  1.79769e+308       1.79769e+308  9.95901e-08  9.84992e-07  yes
AT1G03020  1:698206-698515        WT        MT        OK      0           1.78859   1.79769e+308       1.79769e+308  0.00913915   0.0277958    yes
AT1G13609  1:4662720-4663471      WT        MT        OK      0           3.55814   1.79769e+308       1.79769e+308  0.00021683   0.00108079   yes
AT1G21550  1:7553100-7553876      WT        MT        OK      0           0.562868  1.79769e+308       1.79769e+308  0.00115582   0.00471497   yes
AT1G22120  1:7806308-7809632      WT        MT        OK      0           0.617354  1.79769e+308       1.79769e+308  2.48392e-06  1.91089e-05  yes
AT1G31370  1:11238297-11239363    WT        MT        OK      0           1.46254   1.79769e+308       1.79769e+308  4.83523e-05  0.000285143  yes
APUM10     1:13253397-13255570    WT        MT        OK      0           0.581031  1.79769e+308       1.79769e+308  7.87855e-06  5.46603e-05  yes
AT1G48700  1:18010728-18012871    WT        MT        OK      0           0.556525  1.79769e+308       1.79769e+308  6.53917e-05  0.000374736  yes
AT1G59077  1:21746209-21833195    WT        MT        OK      0           138.886   1.79769e+308       1.79769e+308  0.00122789   0.00496816   yes
AT1G60050  1:22121549-22123702    WT        MT        OK      0           0.370087  1.79769e+308       1.79769e+308  0.00117953   0.0048001    yes
AT4G15242  4:8705786-8706997      WT        MT        OK      0.00930712  17.9056   10.9098            -4.40523      1.05673e-05  7.13983e-05  yes
AT5G33251  5:12499071-12500433    WT        MT        OK      0.0498375   52.2837   10.0349            -9.8119       0            0            yes
AT4G12520  4:7421055-7421738      WT        MT        OK      0.0195111   15.8516   9.66612            -3.90043      9.60217e-05  0.000528904  yes
AT1G60020  1:22100651-22105276    WT        MT        OK      0.0118377   7.18823   9.24611            -7.50382      6.19504e-14  1.4988e-12   yes
AT5G15360  5:4987235-4989182      WT        MT        OK      0.0988273   56.4834   9.1587             -10.4392      0            0            yes

Ddii et al., unpublished

Results of –omics Studies vs Biologically Relevant Conclusions

Gene Expression Assays
▪ Methods of gene expression analysis
   ▪ Quantitative analysis of gene expression
      ▪ DNA chips
      ▪ Next generation transcriptional profiling
   ▪ Qualitative analysis of gene expression
      ▪ Preparation of transcriptional fusion of the promoter of the analysed gene with a reporter gene

Transcriptional Fusion
▪ Identification and cloning of the promoter region of the gene
▪ Preparation of recombinant DNA carrying the promoter and the reporter gene (uidA, GFP)
▪ Preparation of transgenic organisms carrying this recombinant DNA and their histological analysis
[Figure labels: TATA box; Initiation of transcription; promoter; 5′ UTR; ATG…ORF of reporter gene]
[Figure: LacZ Reporter in Mouse Embryos]

Gene Expression Assays
▪ Methods of gene expression analysis
   ▪ Quantitative analysis of gene expression
      ▪ DNA chips
      ▪ Next generation transcriptional profiling
   ▪ Qualitative analysis of gene expression
      ▪ Preparation of transcriptional fusion of the promoter of the analysed gene with a reporter gene
      ▪ Preparation of translational fusion of the coding region of the analysed gene with a reporter gene

Translational Fusion
▪ Identification and cloning of the promoter and coding region of the analysed gene
▪ Preparation of a recombinant DNA carrying the promoter and the coding sequence of the studied gene in a fusion with the reporter gene (uidA, GFP)
[Figure labels: TATA box; promoter; 5′ UTR; ATG…ORF of analysed gene…..….ATG…ORF of reporter gene….….....STOP]
[Figures: Histone 2A-GFP in Drosophila embryo by PAM; PIN1-GFP in Arabidopsis]

Translational Fusion
▪ Preparation of transgenic organisms carrying the recombinant DNA and their histological analysis
▪ Compared to transcriptional fusion,
translational fusion allows analysis of the intracellular localization of the gene product (protein) or its dynamics

Gene Expression Assays
▪ Methods of gene expression analysis
   ▪ Quantitative analysis of gene expression
      ▪ DNA chips
      ▪ Next generation transcriptional profiling
   ▪ Qualitative analysis of gene expression
      ▪ Preparation of transcriptional fusion of the promoter of the analysed gene with a reporter gene
      ▪ Preparation of translational fusion of the coding region of the analysed gene with a reporter gene
      ▪ Use of the data available in public databases
      ▪ Tissue- and cell-specific gene expression analysis

Gene Expression Assays
Fluorescence-Activated Cell Sorting (FACS)
□ High-Resolution Expression Map in Arabidopsis
Brady et al., Science, 2007

Expression Maps - RNA
BAR ePlant https://bar.utoronto.ca/eplant/
□ High-Resolution Expression Map in Drosophila
Nikos Karaiskos et al., Science 2017; science.aan3235
Drosophila Virtual Expression eXplorer https://shiny.mdc-berlin.de/DVEX/

Expression Maps - Proteins
□ Human Protein Atlas (http://www.proteinatlas.org/)
Ponten et al., J Int Med, 2011

Summary
▪ Definition Of Genomics
▪ Forward vs Reverse Genetics
▪ Gene Structure and Identification
▪ Nucleic Acid Sequencing
▪ Analysis of Gene Expression

Discussion
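
Worked example: reading the log2(fold_change) column
As a starting point for the discussion, the log2(fold_change) column of the differential expression table in the Next Gen Transcriptional Profiling part can be recomputed from value_1 (WT) and value_2 (MT). The Python sketch below is only an illustration. The function name log2_fold_change is hypothetical, and the handling of zeros is an assumption: the table shows 1.79769e+308, which is the largest finite double-precision number, wherever value_1 is 0, which suggests (but does not prove) that the analysis software substitutes it for an infinite fold change.

```python
import math
import sys

# Largest finite double; it prints as 1.79769e+308, the value shown in the
# table wherever value_1 (WT) is 0.
MAX_DOUBLE = sys.float_info.max


def log2_fold_change(value_1: float, value_2: float) -> float:
    """Return log2(value_2 / value_1) for two expression values.

    Assumption (not stated in the slides): a zero denominator is reported
    as +MAX_DOUBLE, a zero numerator as -MAX_DOUBLE, and 0 vs 0 as 0.0.
    """
    if value_1 == 0 and value_2 == 0:
        return 0.0
    if value_1 == 0:
        return MAX_DOUBLE
    if value_2 == 0:
        return -MAX_DOUBLE
    return math.log2(value_2 / value_1)


# (gene, value_1 in WT, value_2 in MT), copied from the table
rows = [
    ("AT4G15242", 0.00930712, 17.9056),
    ("AT5G33251", 0.0498375, 52.2837),
    ("AT1G07795", 0.0, 1.1804),
]

for gene, wt, mt in rows:
    print(f"{gene}\t{log2_fold_change(wt, mt):.6g}")
```

For the first two genes the result should agree, within rounding, with the tabulated 10.9098 and 10.0349. The third gene illustrates why the top of the table needs care: a transcript undetected in WT gives a formally unbounded fold change even when its expression in MT is low, which is one reason why raw -omics output should not be read directly as a biologically relevant conclusion.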