Expressed genes and natural selection
Products of functional genes and their importance in ecological studies

Genes and adaptation
• study of the selection pressures imposed by the environment and of the evolutionary response to them → origin of adaptations, i.e. genetically determined adjustment to the environment (vs. phenotypic plasticity)
• e.g. interactions with the abiotic environment

Why genes in molecular ecology?
• Genes have functional significance: genetically determined polymorphism
• → study of proximate mechanisms
• Example: Why is the male common rosefinch red?
• ultimate explanation: to appeal to females and sire more offspring with them
• proximate explanation: because he deposits more of the carotenoids obtained from food into his feathers and uses fewer of them in the immune response (because he has good genes)

Functional vs. neutral genetic variability
About 97% of human DNA does not code for proteins!
• coding DNA = functional genes → phenotype → adaptation to a specific environment (natural selection)
• non-coding DNA (repeats, pseudogenes, introns, etc.)

How relevant is the information obtained from genetic data?
Example: 10 microsatellites = "neutral markers", a priori neutral with respect to natural selection
✓ population-genetic structure
✓ inbreeding
✓ bottleneck
✗ adaptation
✗ proximate evolutionary mechanisms

Gene structure
• Exons → protein-coding, under selection
• Introns → non-coding, treated as neutral
• Intergenic regions → non-coding, treated as neutral

As mentioned, the first two are the motors generating variation. The remaining ones are what happens in populations: they shape and reshuffle variation. Their relative importance determines what happens to new variants in populations. The study of the allele frequency changes they induce is the domain of population genetics. But of course all of them are also reflected in the DNA sequence, and all of them can be studied using phylogenetic methods. The sequence data we look at are influenced by all of them, and depending on the application we have to consider some that we are not even interested in, so as not to misinterpret the results.
In the following I'll go into applications, and will point out which assumptions we make in each case and how we have to take these factors into account.

Degenerate genetic code
→ 3rd codon position evolves largely neutrally
→ 1st and 2nd codon positions are under selection

How can we tell that selection acts on a given gene?
• Phylogenetic analysis (detection at the sequence level)
• Population-genetic analysis (detection at the level of allele frequencies)

Studying selection: phylogenetic analysis
Gene tree versus species tree
[Figure: gene tree drawn inside the species tree of taxa A and B; labels: population split, speciation complete]

At first, polymorphisms are segregating within species. After speciation, gene flow is usually interrupted. Evolution is thus uncoupled, and mutation, drift and selection happen independently in both taxa. Over time, the polymorphisms (can) become species-specific. It is the accumulation of
Gene tree versus species tree
[Figure: trees of taxa A, B and C for Gene 1 and Gene 2]

The two genes show incongruences. Still, if you imagine many species and many gene trees, they will share common signals corresponding to species. There is a growing number of methods to estimate species trees from gene trees. It is here that certain population processes come into play!

Gene trees & selection: the case of C4 photosynthesis
• Christin et al. 2007, Curr Biol
• PEPC gene → C4 photosynthesis

[Figure: species tree (◄ intronic sequences) compared with the gene tree (coding sequences ►). Christin et al. 2007, Curr Biol]
Selection at the sequence level
Testing for selection: dN/dS
• dS: rate of synonymous substitutions, the 'neutral' evolutionary rate
• dN: rate of non-synonymous substitutions
• dN/dS = 1: as many non-synonymous as synonymous substitutions → neutral evolution
• dN/dS < 1: fewer non-synonymous than synonymous substitutions → purifying selection
• dN/dS > 1: more non-synonymous than synonymous substitutions → positive selection
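The dN/dS thresholds from the "Testing for selection" slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `classify_selection`, the tolerance `tol` and the example rates are hypothetical and not taken from the slides, and a real analysis would estimate dN and dS with a codon substitution model and test whether the ratio differs significantly from 1.

```python
def classify_selection(dn: float, ds: float, tol: float = 1e-9) -> str:
    """Map a pair of substitution rates (dN, dS) to a selection regime.

    Hypothetical helper: it only applies the three thresholds
    (dN/dS = 1, < 1, > 1) and performs no statistical test.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio dN/dS")
    omega = dn / ds  # the dN/dS ratio
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    if omega < 1.0:
        return "purifying selection"
    return "positive selection"


# Made-up rates for three imaginary genes (illustration only).
examples = {
    "gene_a": (0.02, 0.20),
    "gene_b": (0.15, 0.15),
    "gene_c": (0.30, 0.10),
}

for name, (dn, ds) in examples.items():
    print(name, round(dn / ds, 2), classify_selection(dn, ds))
```

The tolerance is there only because exact equality of floating-point ratios is fragile; in practice the "neutral" class corresponds to a ratio that is not significantly different from 1.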
[Figure: tree from intronic sequences (◄) compared with the tree from coding sequences without positively selected sites (►). Christin et al. 2007, Curr Biol]

PEPC gene
• 12 codons with dN/dS > 1
• At these positions, sequences of unrelated but ecologically similar species are more similar to each other than to those of related species → convergent evolution!
Christin et al. 2007, Curr Biol
Major histocompatibility complex (MHC)
[Figure: MHC class I and class II proteins; antigen-recognition region]
An antigen-presenting cell triggers the immune response.

Studying selection: population-genetic analysis
• Comparison of population-genetic structure at MHC genes and at neutral markers (microsatellites)
• neutral markers: migration, random drift
• MHC, if no selection acts: migration, random drift (same structure as at neutral markers)
• MHC under diversifying selection: migration, random drift, selection
• MHC under balancing selection: migration, random drift, selection
• Can be quantified, e.g. using FST

Study sites: 7 populations in the same phase of the population cycle
Canton Nozeroy (Jura Mountains, Franche-Comté region)
[Map: localities 01-07]
Bryja et al. 2007, Molecular Ecology

Evidence of contemporary natural selection: analysis of population-genetic structure
- Comparison of neutral markers and MHC
→ 2001-2003: phase of increasing population density
[Figure panels: April 2002, October 2002, April 2003, October 2003]

Population differentiation during the density increase
• Differentiation declines as density increases (increased dispersal, i.e. gene flow)
• MHC (especially DQA1): significantly different from microsatellites
• * Significant difference, DQA1 vs. microsatellites
Bryja et al.
2007, Molecular Ecology

Conclusion: the type of selection on MHC depends on population abundance
[Image: Arvicola]
• Low density: local differences in pathogen communities → local diversifying selection
• High density: increase in parasite diversity due to dispersal → balancing selection
Bryja et al. 2007, Molecular Ecology

Methods for studying functional variability
1. Tracking candidate genes
2. Genomic approaches (many genes at once)

Rock pocket mouse Chaetodipus intermedius
Hoekstra, Nachman et al.
• Dark and light coloration
• Matches the colour of the environment (dark coloration on lava)
• Arizona
• Coloration correlates with the environment even at a small scale
• mtDNA does not correlate with coloration
• Sequencing of candidate genes (known from inbred mice)
• melanocortin-1 receptor MC1R
• 4 amino acid substitutions
• Simple inheritance of alleles and coloration

MC1R in humans, mammoths and others
• In humans: red hair and inability to tan
• Coloration of cattle, horses and dogs
• Two distinct variants occur in mammoths

Measuring expression: qPCR
• Relative comparison of the quantity of a particular DNA, e.g. the expression level of a specific gene (i.e. a particular RNA = cDNA), for example between tissue types, elevations, treatment vs. non-treatment, etc.
• housekeeping genes: used as the standard for quantification
• same number of copies in all cells (e.g. genes encoding cytoskeletal proteins)
• constitutive genes: expressed in all cells, independent of the experimental treatment
• housekeeping genes should be validated before they are used in gene expression experiments

A gene that is to be used as a loading control (or internal standard) should have various features; see slide.

Tracking many genes at once
Functional genomics
• Genome (DNA): 3 billion bases; studied by genomics
• → transcription → Transcriptome (mRNA): 20-30,000 genes; studied by transcriptomics
• → translation → Proteome (proteins): ~100,000 proteins; studied by proteomics

Exome
Exome-Seq: targeted exome capture
• targets ca.
20,000 coding sequences
• high depth of coverage for more accurate variant calling

Transcriptomics: analysis of mRNAs
1. microarrays
2. RNA-seq (NGS)

1. Analysis of gene expression by microarrays
Ranz JM, Machado CA: Uncovering evolutionary patterns of gene expression using microarrays. TREE, 21(1): 29-37

Microarray analysis of the transcriptome (~ specific DNA hybridization)
• Target: the mix of transcripts in the form of cDNA (= mRNA transcribed into DNA by reverse transcriptase, i.e. it contains no introns)
• Probe: synthesized oligonucleotides complementary to particular genes
The GeneAtlas System

Monitoring gene expression with microarrays
• Monitors the expression of many (thousands of) genes at once
• Based on hybridization
• The difference relative to a control is measured ("heterologous hybridization") = two-channel experiment
http://upload.wikimedia.org/wikipedia/commons/f/f2/Cdnaarray.jpg
Affymetrix, Agilent Technologies

Chip evaluation: image analysis (comparison of expression levels between control and experiment), tens of thousands of transcripts
http://upload.wikimedia.org/wikipedia/commons/0/0e/Microarray2.gif
- commercially available for the complete transcriptome of about 25 species (Affymetrix); others are being developed rapidly, also to order
- overall, however, microarrays are giving way to RNA-seq

Populus trichocarpa x deltoides and the forest tent caterpillar Malacosoma disstria
Ralph et al. 2006
• cDNA microarray
• 15,496 genes, > ¾ of the genome
• After 24 hours: 1,191 genes up-regulated, 537 down-regulated
• Defence: endochitinases, protease inhibitors
• Signalling functions
• Transport, metabolism, regulation of transcription

2.
RNA-seq ("high-throughput sequencing of cDNAs")
RNA-Seq workflow for gene expression analysis
[Figure: workflow from fragmented mRNA to a list of differentially expressed genes. Adapted from Zeng and Mortazavi, Nature Immunology 13 (2012)]

RNA-Seq quantification
• A reference transcriptome is required
• RPKM = reads per kilobase of transcript per million mapped reads

Gene ontology (http://geneontology.org/) = functional annotation analysis
• based on databases of available annotated genes from model organisms
• Cellular Component: the parts of a cell or its extracellular environment
• Molecular Function: the elemental activities of a gene product at the molecular level, such as binding or catalysis
• Biological Process: operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms

Example of GO annotation: cytochrome c
• molecular function: oxidoreductase activity
• biological process: oxidative phosphorylation
• cellular component: mitochondrial intermembrane space

Examples
What is the level of expression in different tissues?
quantitative real-time PCR
effect of cold acclimation (20 °C vs 15 °C)
- cardiac, red and white muscle
- expression differences in 113, 81 and 196 genes
- the differences are caused by the tissue-specific degree of endothermy

http://www.biomedcentral.com/content/figures/1471-2164-15-754-1-l.jpg
• about 334 million sequences ("reads"); 42 million per sample
• 210 DEGs ("differentially expressed genes"): 119 up-regulated and 91 down-regulated in yellow individuals
[Figure labels: xanthophores, melanophores, erythrophores]

Consistency of results
• expression changes go in the same direction in RNA-seq and in RT-qPCR of selected genes
• hierarchical clustering of expression levels
• "yellow" and "red" individuals are mutually similar

Functional annotation clustering (= gene ontology)
[Figure panels: up-regulated in yellow; down-regulated in yellow]
• xanthophores in yellow individuals are associated with melanogenesis
• as a next step, the role of individual candidate genes can be studied

[Figures 1-3]
• 20,721 SNPs (ddRAD): no genetic difference at neutral loci
• Only differentially expressed genes are responsible for the morphological changes (bill, coloration)
Mason and Taylor 2015

Sequencing of 73 genomes of all three "species"; expression data
• ~55-Mb inversion on chromosome 1 ("supergene")
• multiple candidate genes related to melanogenesis, carotenoid coloration, and bill shape
• latitudinal gradient in ecotype distribution: balanced polymorphism of supergene haplotypes

Conclusion
• Molecular ecology is developing rapidly
• Methods are improving and changing fundamentally
• What held true today may not hold tomorrow, so let us look forward to tomorrow