POPULATION GENETICS: SUBPOPULATIONS

II. GENETIC STRUCTURE OF POPULATION

Assumptions for population structure analysis:
• neutral loci = no effect of natural selection included
• classical population genetics approach = populations are known (or thought to be known) a priori (e.g. we want to quantify the level of genetic differentiation between two localities / putative populations)
• BUT populations are usually not known (e.g. because there is no obvious spatial heterogeneity over the distribution range) - we want to reveal any potential population differentiation/structure from our genetic data -> non-a priori methods

Genetic structure - any pattern in the genetic make-up of individuals within a population

AIMS:
• Detection of any genetic structure (subdivision) in a population (in my dataset)
• Are there any differences between "different" (in space and time) populations?
• Quantification of such differences = description of genetic structure in a population (of genetic differentiation between (sub)populations)
• What factors shape (have shaped) these differences? e.g. population history
• Is there any migration/connection between different populations? = detection and quantification of gene flow, and of what influences gene flow (e.g. spatial heterogeneity)
• What happens during migration/connection of populations?
= hybridisation

Population genetic structure (neutral markers)
• genetic drift - creates subpopulation differentiation (changes in allele frequencies, in the extreme up to fixation of distinct alleles)
• mutation - may increase differentiation
• inbreeding - increases the proportion of homozygotes
• migration (gene flow) - acts AGAINST subpopulation differentiation (counteracts drift)

Effect of population structure on heterozygosity
• Wahlund effect - first documented by the Swedish geneticist Sten Wahlund (1901-1976) in 1928
• both SUBPOPULATIONS are in HWE (Hardy-Weinberg equilibrium), but the pooled dataset (the whole POPULATION) shows a deficit of heterozygotes
• Extreme example: two isolated subpopulations with fixed distinct alleles (more generally: subpopulations with different allele frequencies)
• Isolation breaking -> homozygosity reduction when subpopulations merge

Wahlund, S. (1928) Zusammensetzung von Populationen und Korrelationserscheinungen vom Standpunkt der Vererbungslehre aus betrachtet. Hereditas, 11: 65-106

Wahlund effect - an example
Bunnersjöarna lake (northern Sweden) - Salmo trutta, one trait with 2 alleles

Sample       170/170   170/172 (= Ho)   172/172   Total   p (allele 170)   2pq (= He)
Inflow       50        0 (0)            0         50      1.000            0.000
Outflow      1         13 (0.26)        36        50      0.150            0.255
Whole lake   51        13 (0.13)        36        100     0.575            0.489

DECREASE OF HETEROZYGOSITY DUE TO POPULATION SUBDIVISION (Ryman et al. 1979)

Wright's F-statistics: FIS, FST, FIT
[Photos: Sewall Wright (1889-1988); Masatoshi Nei (*1931)]
Wright (1951), Nei (1987)
• for two alleles at a single locus (Wright 1951)
• more complicated for more alleles (Nei 1987)
• detecting and describing heterozygosity decrease
• describing heterozygosity (and its deviation from HWE) at different levels

F-statistics and heterozygosity
• HI - averaged observed heterozygosity of an individual in a subpopulation
• HS - expected heterozygosity of an individual in a subpopulation under HWE
• HT - expected heterozygosity of an individual over the total population under HWE
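As a quick check, the three heterozygosity levels can be computed for the Bunnersjöarna lake example. This is only a minimal sketch: the function and variable names (`heterozygosity`, `lake`, `h_i`, `h_s`, `h_t`) are illustrative and not taken from any package, and the two samples are weighted equally (they have the same size here).

```python
def heterozygosity(counts):
    """Return (p, Ho, He) for one biallelic locus.

    counts: genotype counts (n_AA, n_AB, n_BB); p is the frequency of allele A.
    """
    n_aa, n_ab, n_bb = counts
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    h_obs = n_ab / n            # observed heterozygosity
    h_exp = 2 * p * (1 - p)     # expected heterozygosity under HWE
    return p, h_obs, h_exp


# Genotype counts (170/170, 170/172, 172/172) from the lake example.
lake = {"inflow": (50, 0, 0), "outflow": (1, 13, 36)}

per_sample = {name: heterozygosity(c) for name, c in lake.items()}

# Pool both samples into a single "whole lake" dataset.
pooled = tuple(sum(col) for col in zip(*lake.values()))
p_pooled, ho_pooled, he_pooled = heterozygosity(pooled)

# HI: mean observed heterozygosity; HS: mean expected heterozygosity
# within subpopulations; HT: expected heterozygosity of the pooled data.
h_i = sum(ho for _, ho, _ in per_sample.values()) / len(per_sample)
h_s = sum(he for _, _, he in per_sample.values()) / len(per_sample)
h_t = he_pooled

for name, (p, ho, he) in per_sample.items():
    print(f"{name:10s} p={p:.3f}  Ho={ho:.3f}  He={he:.3f}")
print(f"whole lake p={p_pooled:.3f}  Ho={ho_pooled:.3f}  He={he_pooled:.3f}")
print(f"HI={h_i:.4f}  HS={h_s:.4f}  HT={h_t:.4f}")
```

The pooled sample shows far fewer heterozygotes than expected under HWE, which is the Wahlund effect in numbers.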
[Diagram: levels of heterozygosity. Individual level: HI = mean of observed heterozygosities (HO); subpopulation level: HS = mean of expected heterozygosities; total population: HT = total expected heterozygosity.]

With k subpopulations:

$$H_I = \frac{1}{k}\sum_{x=1}^{k} H_x$$

where $H_x$ = observed heterozygosity in subpopulation x.

$$H_S = \frac{1}{k}\sum_{x=1}^{k}\left(1 - \sum_{i} p_{i,x}^{2}\right)$$

where $p_{i,x}$ = frequency of the i-th allele in subpopulation x; HS is the expected heterozygosity averaged over subpopulations.

$$H_T = 1 - \sum_{i} p_{i,0}^{2}$$

where $p_{i,0}$ = frequency of the i-th allele in the total population.

F-statistics

$$F_{IS} = \frac{H_S - H_I}{H_S}$$

Heterozygosity decrease of an individual due to non-random mating in a subpopulation (vs. HWE).

$$F_{ST} = \frac{H_T - H_S}{H_T}$$

Influence of the division of the total population into subpopulations (i.e. heterozygosity decrease due to the Wahlund effect).

$$F_{IT} = \frac{H_T - H_I}{H_T}$$

Total coefficient of inbreeding: measures the heterozygosity decrease of an individual in relation to the total population.

$$(1 - F_{IT}) = (1 - F_{ST})(1 - F_{IS})$$

Weir & Cockerham (1984): f (≈ FIS), θ (≈ FST), F (≈ FIT)
Correction for sample size and number of subpopulations

Computation of F-statistics

Computation of allele frequencies (p0(j) = mean frequency of allele A in the whole population)

          Subpopulation 1 (N1 = 40)     Subpopulation 2 (N2 = 20)
Locus     AA    AB    BB    p1(j)       AA    AB    BB    p2(j)      p0(j)    Note
Loc I     10    20    10    0.5         5     10    5     0.5        0.5      HWE
Loc II    16    8     16    0.5         4     4     12    0.3        0.4      heterozygote deficit
Loc III   12    28    0     0.65        6     12    2     0.6        0.625    heterozygote excess
Loc IV    0     0     40    0.0         20    0     0     1.0        0.5      alternatively fixed alleles

Computation of heterozygosities

          Observed heterozygosity    Expected heterozygosity    Wright's F-statistics
Locus     H1(j)   H2(j)   HI(j)      HS(j)     HT(j)            FIS(j)    FST(j)    FIT(j)
Loc I     0.5     0.5     0.5        0.5       0.5              0.0       0.0       0.0
Loc II    0.2     0.2     0.2        0.46      0.48             0.565     0.042     0.583
Loc III   0.7     0.6     0.65       0.4675    0.46875          -0.39     0.0027    -0.387
Loc IV    0.0     0.0     0.0        0.0       0.5              n/a       1.0       1.0
Mean      FIS = 0.058    FST = 0.261    FIT = 0.300

Mean values of F-statistics may hide distinct evolutionary histories of different loci

F-statistics
• FIS - decrease of heterozygosity in a local subpopulation; high values -> inbreeding
• FIT - summary measure, of limited use
• FST - subdivision measure = limited gene flow between subpopulations (i.e. existence of a barrier -> Wahlund effect); originally developed to estimate the amount of allelic fixation due to genetic drift (fixation index)

FST computation - an example

Sample       170/170   170/172 (= Ho)   172/172   Total   p (allele 170)   2pq (= He)
Inflow       50        0 (0)            0         50      1.000            0.000
Outflow      1         13 (0.26)        36        50      0.150            0.255
Whole lake   51        13 (0.13)        36        100     0.575            0.489
(expected)   (33.1)    (48.9)           (18.1)

$$F_{ST} = \frac{H_T - H_S}{H_T} = \frac{0.489 - 0.128}{0.489} \approx 0.738$$

As a consequence of the gene flow barrier, heterozygosity is about 73.8% lower than it would be under HWE (Ryman et al. 1979)

Permutation test of FST significance
1. Real measured populations -> real FST
2. Populations are merged into a single dataset
3. Individuals are randomly re-separated into populations 1000 x -> 1000 simulated FST values
Examples:
• 0.8% of simulated values are higher than the real FST: p = 0.008 (i.e. significant difference)
• 35.4% of simulated values are higher than the real FST: p = 0.354 (i.e. non-significant difference)

FST analysis - BE AWARE
• Global vs. pairwise indices
• Absolute values depend on the heterozygosity level of the loci used!!! (i.e. microsatellite-based FST cannot be compared to allozyme-based FST)
• This demands standardization: FST' = FST/FSTmax (Hedrick 2005) - e.g. in GenAlEx
• If null alleles are present, FST needs to be corrected (null alleles increase apparent homozygosity); FreeNA software

Giant Panda
• 192 faecal samples -> 136 genotypes -> 53 unique genotypes
• separation by a river (ca 26 ky ago) and by roads (recently)
• even roads are important barriers, although weaker ones
[Figure (a): map of habitat patches separated by a river and roads. Table 3: pairwise comparisons between patches in the Xiaoxiangling and Daxiangling populations; * = significant after Bonferroni correction (P < 0.01).]
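The F-statistics formulas and the permutation test of FST significance can be sketched in a few lines of Python. This is a hedged illustration, not the procedure of any particular program: `f_statistics` and `permutation_test` are hypothetical names, the locus is assumed biallelic, subpopulations are weighted equally (as in the worked tables), and the p-value is taken simply as the fraction of permuted FST values at least as large as the real one.

```python
import random


def f_statistics(subpops):
    """Wright's F-statistics for one biallelic locus.

    subpops: list of (n_AA, n_AB, n_BB) genotype counts, one per subpopulation.
    """
    k = len(subpops)
    h_obs, h_exp, freqs = [], [], []
    for n_aa, n_ab, n_bb in subpops:
        n = n_aa + n_ab + n_bb
        p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
        freqs.append(p)
        h_obs.append(n_ab / n)               # observed heterozygosity
        h_exp.append(2 * p * (1 - p))        # expected under HWE
    h_i, h_s = sum(h_obs) / k, sum(h_exp) / k
    p0 = sum(freqs) / k                      # mean allele frequency
    h_t = 2 * p0 * (1 - p0)
    nan = float("nan")                       # used where the ratio is 0/0
    return {
        "HI": h_i, "HS": h_s, "HT": h_t,
        "FIS": (h_s - h_i) / h_s if h_s > 0 else nan,
        "FST": (h_t - h_s) / h_t if h_t > 0 else nan,
        "FIT": (h_t - h_i) / h_t if h_t > 0 else nan,
    }


def permutation_test(subpops, n_perm=1000, seed=0):
    """Return (real FST, fraction of permuted FST values >= real FST)."""
    rng = random.Random(seed)
    observed = f_statistics(subpops)["FST"]
    sizes = [sum(c) for c in subpops]
    # One entry per individual: number of copies of allele A (2, 1 or 0).
    pool = [g for c in subpops for g, n in zip((2, 1, 0), c) for _ in range(n)]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pool)                    # merge and randomly re-separate
        start, shuffled = 0, []
        for size in sizes:
            chunk = pool[start:start + size]
            shuffled.append((chunk.count(2), chunk.count(1), chunk.count(0)))
            start += size
        if f_statistics(shuffled)["FST"] >= observed:
            hits += 1
    return observed, hits / n_perm


# Lake example: inflow and outflow genotype counts (170/170, 170/172, 172/172).
lake = [(50, 0, 0), (1, 13, 36)]
fst_obs, p_value = permutation_test(lake)
print(f"FST = {fst_obs:.3f}, permutation p = {p_value:.3f}")

# Loc II from the worked table (heterozygote deficit).
print(f_statistics([(16, 8, 16), (4, 4, 12)]))
```

Permuting individuals (whole genotypes) rather than single alleles keeps within-individual heterozygosity intact, so only the assignment to subpopulations is randomised.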
GST (Nei 1973)
• Analogue of FST for haploid (haplodiploid) organisms and mtDNA sequences
• Takes into account haplotype (gene) diversity instead of heterozygosity
• Haplotype diversity = probability that any two randomly chosen sequences in a population will be different

RST
• Analogue of FST
• Takes into account the size of alleles (number of repeats in microsatellite loci)
• Assumes a known mutation model: the SMM (stepwise mutation model)
• Indicates traces of mutations
• RST > FST: higher effect of mutations
• RST = FST: higher effect of genetic drift
• Randomisation tests for RST significance (Hardy et al. 2003, program SPAGeDi 1.1)
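Haplotype diversity and a GST-style index can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the haplotype labels and counts are invented toy data, subpopulations are weighted equally, and GST is computed from uncorrected gene diversities (1 minus the sum of squared haplotype frequencies) without any sample-size correction.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype in one sample."""
    n = len(haplotypes)
    return {h: c / n for h, c in Counter(haplotypes).items()}


def gene_diversity(freqs):
    """1 - sum of squared haplotype frequencies (no sample-size correction)."""
    return 1 - sum(x * x for x in freqs.values())


def haplotype_diversity(haplotypes):
    """Probability that two sequences drawn without replacement differ."""
    n = len(haplotypes)
    return n / (n - 1) * gene_diversity(haplotype_frequencies(haplotypes))


def g_st(subpops):
    """GST = (HT - HS) / HT, with diversities based on haplotype frequencies."""
    freqs = [haplotype_frequencies(s) for s in subpops]
    h_s = sum(gene_diversity(f) for f in freqs) / len(freqs)
    all_haps = set().union(*freqs)
    # Mean frequency of each haplotype across subpopulations.
    mean_freqs = {
        h: sum(f.get(h, 0.0) for f in freqs) / len(freqs) for h in all_haps
    }
    h_t = gene_diversity(mean_freqs)
    return (h_t - h_s) / h_t


# Toy data (invented): two mtDNA haplotypes, "A" and "B", in two samples.
pop1 = ["A"] * 8 + ["B"] * 2
pop2 = ["A"] * 2 + ["B"] * 8

print(f"haplotype diversity pop1 = {haplotype_diversity(pop1):.3f}")
print(f"GST = {g_st([pop1, pop2]):.3f}")
```

Two samples fixed for different haplotypes give GST = 1, mirroring the "alternatively fixed alleles" case for FST.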