03/05/2016

Genetic structure of populations, drift, mutations
• Drift → differentiation of populations: random changes in allele frequencies (may lead to fixation of alternative alleles)
• Mutations & selection increase differentiation
[Figure: subpopulations of AA, Aa and aa genotypes diverging under drift]

Gene flow - acts against differentiation of subpopulations
GENE FLOW = the transfer of genes/alleles from one population to another → CHANGE OF ALLELIC FREQUENCIES

MIGRATION versus GENE FLOW
Migration:
• movement of individuals between pops
• immigrants may not reproduce in the new pop! (even strong migration/dispersal does not necessarily mean any gene flow)
• detectable (with substantial difficulty) by direct ecological methods
Gene flow:
• movement of alleles (genes) between pops
• via dispersal of individuals or propagules (gametes such as pollen, seeds)
• passive in plants, mostly active in animals
• if strong → homogenization of allele frequencies between the pops
• by mixing the gene pools, prevents pop differentiation, divergence of pops, establishment of pop structure, and ultimately speciation
• prevents the decrease in ability to survive that is caused by inbreeding
• estimable from genetic data

Quantifying gene flow
1. Direct methods:
• observation
• Capture-Mark-Recapture sampling
• telemetry
2. Indirect methods (methods of population genetics):
• we have information about pop structure (expected subpopulations or estimated from genetic data)
• based on the distribution of genetic variation
• based on deviations from Hardy-Weinberg equilibrium
• estimation based on FST
• model-based methods based on coalescent theory (e.g.
MIGRATE software)
http://popgen.sc.fsu.edu/Migrate/Migrate-n.html

Models of gene flow
• island model (Wright 1931)
  - assumes subpops of the same size
  - assumes symmetrical gene flow
  - assumes equal probability of gene exchange between all subpops
• stepping stone model (Kimura 1953)
  - exchange only between adjacent subpops
• Private alleles (Slatkin 1985): useful for highly polymorphic markers
  - private alleles = alleles occurring only in a single subpopulation
  - p(1) = frequency of private alleles
  - ln p(1) = -0.505 ln(Nem) - 2.44
• F statistics
  - $F_{ST} \approx \frac{1}{4 N_e m + 1}$

Nem = number of adult, reproducing migrants between subpops per generation (island model assumed!)
It is only a rough estimate on a scale of "few" versus "a lot" (only for FST > 0.05-0.10)

Assumptions for using Nem:
• island model (= infinite number of subpops, no natural selection, equal size of all subpops, equal probability of migrant exchange between all subpops)
• migration-drift equilibrium (= no population expansion, no habitat fragmentation, no population bottleneck)

But be aware!
• even for two very distant populations, FST will never reach 1, so the estimated Nem will never be zero, which implies that individuals had been exchanged in the past
• even pops that have never exchanged any migrants will never have Nem equal to zero
• extreme case of complete genetic isolation: Nm = 0, FST = 1
• 1 migrant every fourth generation: Nm = 0.25, FST = 0.5
• 1 migrant every second generation: Nm = 0.5, FST = 0.33
• 1 migrant every generation: Nm = 1, FST = 0.2
• 2 migrants every generation: Nm = 2, FST = 0.11

Inference of Recent Migration
• BayesAss: Bayesian Inference of Recent Migration Using Multilocus Genotypes
• Reference: G.A. Wilson and B. Rannala 2003. Bayesian inference of recent migration rates using multilocus genotypes. Genetics 163: 1177-1191.
• http://www.rannala.org/?page_id=245

Assignment tests
• assign individuals to their most likely population of origin
• done by comparing individual genotypes with the genetic profiles of various populations
• unlike Nem-based indirect methods, they do not compare overall genetic similarity between pops; they use a maximum likelihood method to estimate the probability that a given genotype arose from each alternative pop (Paetkau et al. 1995)
• all pops are assumed to be in HWE and the loci not in LD

Population assignment tests
• program GeneClass (Piry et al. 2004)
• estimates the probability that a given genotype comes from a given predefined population: identification of recent migrants or of samples of unknown origin (fight against poaching)
• may combine data from various genetic markers
Accuracy depends on the level of genetic difference between populations:
• 5 microsatellite loci, FST = 0.14: 99.9% assigned correctly
• 5 microsatellite loci, FST = 0.04: 90.2% assigned correctly

Subspecies identification of chimpanzees in Czech zoos
• chimpanzees in zoos are often of unclear origin
• genetic data from natural populations are available (300 msats, Becquet et al. 2007)
• 30 most informative microsatellites: genotyping of all chimpanzees in CZ
• GeneClass: assignment to the subspecies/populations
• some individuals are clearly assigned genetically to an ESU (Evolutionarily Significant Unit = subspecies): zoos in Liberec, Dvůr Králové
• but also quite a few hybrids (mainly Ostrava, Brno, etc.)
Mapua et al. (2011)

BayesAss, GeneClass 2
• Giant panda
  - Bayesian estimates of gene flow over the last few generations
  - identification of two possible first-generation migrants
  - recommendations for conservation management: construction of a migration corridor
Zhu et al.
2011 Mol Ecol

Models of gene flow
• Island model (Wright 1931)
  - assumes subpops of the same size
  - assumes symmetrical gene flow
  - assumes equal probability of gene exchange between all subpops
• Stepping stone model (Kimura 1953)
  - exchange only between adjacent subpops
• Isolation by distance
  - gene flow rate decreases with increasing distance between subpops

Isolation by distance (IBD)
= the amount of gene flow between pops is inversely proportional to the geographic distance between them
• Sewall G. Wright (1943)
• regression of a log-transformed gene flow estimate (e.g. FST) on the corresponding log-transformed geographic distances
• significance of the correlation tested by a Mantel test (does not assume that pairwise population comparisons are independent)
• relevant geographical scale (depends on dispersal abilities)
• migration-drift equilibrium must hold
• IBD is not present:
  - in very recently isolated populations
  - in completely isolated populations
  - when the amount of migration is high

IBD detection
• correlation between matrices of genetic and geographic distances
• Mantel test
• e.g. Genepop

Isolation by distance
Crotaphytus collaris, Hutchinson & Templeton 1999
[Figure: four scenarios]
• no barriers for tens of thousands of years: equilibrium between drift and migration
• postglacial fragmentation: influence of drift
• no barriers postglacially: influence of migration
• postglacially increasing fragmentation: influence of drift at large scales, equilibrium at small scales
Wellenreuther et al. 2010

Population assignments
Classical problems of population genetics
• Populations are defined, individuals are assigned to populations a priori, and we are interested in population characteristics (F-statistics), i.e. pop genetic diversity and structure
• Populations are defined, but we want to assign individuals of unknown origin to them
• Cryptic population structure = nothing is known at the beginning → we want to estimate clusters (i.e.
natural homogeneous populations) and assign the individuals to the clusters (population assignments)

A. Direct methods
• morphological variation (geographical races)
• leg bands or similar markers (e.g. over one million Ficedula hypoleuca have been ringed in the UK and Sweden, but only six were recaptured on wintering grounds in Africa)
• satellite telemetry: expensive, not useful for small animals

B. Biogeochemical approaches
• ratios of stable isotopes of naturally occurring elements (C, H, N, Sr) vary across the landscape
• determined by the relative frequency of C3 and C4 plants, climate, and bedrock
(1) geographical structure of isotopic ratio distributions
(2) knowledge about where animals incorporate isotopes
(3) tissue samples from individuals at different parts of their annual cycle

C. Genetic approaches
• "few birds have rings, but everybody has a genotype"
• genetic data about population structure
• problems: (1) low genetic differentiation between pops (intense dispersal), (2) low differentiation in the temperate zone (recent postglacial colonization)
• solutions: (1) use more genetic markers, (2) study parasite DNA (e.g.
avian malaria): parasites evolve faster and are more differentiated

Individual-based assignments
• cryptic population structure
• unknown number of clusters
• analysis at the level of the individual
• identify clusters and assign individuals to them simultaneously
• we have individual genotypes (sometimes also geographical coordinates)
• Data: msats (other codominant loci, SINE), AFLP

Dendrogram based on microsatellite genotype distances between individuals (Cavalli-Sforza distances)
May be biased if too few markers are used

LANDSCAPE GENETICS
• approach combining population genetics, spatial statistics (GIS) and landscape ecology
• aims to quantify the influence of landscape features and environmental variables on the distribution of allele frequencies among populations = to understand the relationship between habitats and gene flow
• "landscape": the area that the organism of interest uses (i.e. a number of habitats of varying suitability)
• homogeneous vs. heterogeneous landscape?
  - homogeneous: panmictic population
  - homogeneous, but larger than the dispersal distance of an individual: IBD
  - heterogeneous (i.e. various habitats): gene flow is not equal throughout the landscape

Spatially explicit analyses = spatial genetics = landscape genetics
• based on a Bayesian clustering approach (of the STRUCTURE type): individual-based models
• the model uses both genetic data and geographical coordinates
• e.g. programs BAPS, TESS, Geneland (the "best" number of clusters, K, is estimated automatically)

Bayesian spatial clustering
Spatial models use Voronoi diagrams
Voronoi polygons, Dirichlet tessellation
- a decomposition of a metric space defined by distances to a given discrete set of objects in the space, e.g.
a discrete set of points
- partition of the plane according to a given set of points M
- a Voronoi diagram partitions the plane so that each point b of M is assigned a region V(b), all of whose points are closer to b than to any other point of M
G. F. Voronoi (1868-1908)
http://is.muni.cz/th/143320/fi_b_a2/animace/voroneho_diagram.html
http://ivankuckir.blogspot.cz/2011/03/voroneho-diagram-v-as3.html

Example of very fragmented populations: the best model in BAPS for Central and Southern Dinaromys populations (spatial clustering of groups of individuals) has K = 13, i.e. evidence of very strong structuring.
Best partition:
Cluster 1: {C9, C13}
Cluster 2: {S6}
Cluster 3: {C8, C14}
Cluster 4: {C4}
Cluster 5: {C1, C2}
Cluster 6: {S1, S2, S3, S4}
Cluster 7: {C6}
Cluster 8: {C3, C15}
Cluster 9: {C5, C7}
Cluster 10: {C10}
Cluster 11: {C11, C12}
Cluster 12: {S5}
Cluster 13: {C16}

Program BAPS: software for Bayesian Analysis of genetic Population Structure
http://www.helsinki.fi/bsg/software/

GENELAND
Population genetic and morphometric data analysis using R and the Geneland program
• R platform
• posterior probability maps
• spatial population genetics
Fontaine et al. 2007, Phocoena phocoena

Comparison of features of various "individual-based assignment" programs
STRUCTURE vs. BAPS
Robust support of the population structure: K = 7

Population structure - summary
                             Connected populations    Isolated populations
                             (gene flow)              (no gene flow)
Ne                           ↑                        ↓
Genetic drift                ↓                        ↑
Genetic diversity            ↑                        ↓
Population differentiation   ↓                        ↑
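
The contrast between connected and isolated populations can be made concrete with the island-model relation between FST and Nem used earlier in these slides. The Python sketch below is only an illustration, assuming Wright's approximation FST ≈ 1/(4·Nem + 1). The function names fst_from_nm and nm_from_fst are hypothetical and do not come from any of the programs mentioned here.

```python
def fst_from_nm(nm: float) -> float:
    """Expected FST under the island model for Nem migrants per generation.

    Assumes Wright's approximation FST ~ 1 / (4 * Nem + 1).
    """
    if nm < 0:
        raise ValueError("Nem must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)


def nm_from_fst(fst: float) -> float:
    """Rough Nem estimate obtained by inverting the same approximation.

    Only defined for 0 < FST <= 1; FST = 1 corresponds to Nem = 0.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in the interval (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # Migrant numbers taken from the "be aware" slide.
    for nm in (0, 0.25, 0.5, 1, 2):
        print(f"Nem = {nm:<4} -> FST = {fst_from_nm(nm):.2f}")
```

Because FST falls steeply between Nem = 0 and Nem = 1 and then flattens, the inverse estimate becomes very sensitive to small errors in FST once FST is low. This is one way to see why Nem should be read only on a scale of "few" versus "a lot".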