[Figure: colonized area, showing where historical data and where genetic data are available]

Population history (& genetic data)
Past evolutionary and demographic processes have left traces in genetic variation. By analyzing these traces, we attempt to reconstruct the evolutionary history of populations.

Studying population history = modelling
- Selection of the most appropriate model (evolutionary scenario)
- Estimation of parameters (e.g. time of events, number of founders, duration of bottlenecks, population size, mutation rate)

Description of recent invasions (invasion genetics)
Description of older history (phylogeography)

Inferring population history - ABC modelling
We have observed data (e.g. microsatellite genotypes).
We know the genetic variation and structure.
We would like to know which demographic processes created the observed data, and how and when = population evolutionary history

Why is the ABC approach useful in modelling population history?
It allows us to deal with much more complex models with many parameters and large amounts of complex data (many samples, populations, genetic loci, sequences), and hence with much more realistic models.

Approximate Bayesian Computation
■ model choice and parameter estimation
■ the exact LIKELIHOOD function is intractable in complex situations and can be bypassed (approximated) by a SIMILARITY MEASURE between many simulated datasets (generated under various models) and a single real (observed) dataset
■ data SIMULATION under various models
■ COMPARISON of simulated and observed data - model choice
■ According to the most supported model, we can ESTIMATE VALUES of its parameters - parameter estimation

Decreasing dimensionality
[Figure: simulated dataset versus observed data, compared through their summary statistics]
Comparison of simulated and real datasets to infer the probability of various models (evolutionary scenarios of population history)

DIYABC
Do It Yourself: software that allows population history to be inferred using the ABC approach (Cornuet et al. 2008, 2010, 2014)
http://www.montpellier.inrs.fr/CBGP/dig3bc/
Genetic data: sequences, SNPs, genotypes

ABC works in 3 steps
1. SIMULATION STEP: a very large reference table is produced and recorded
- prior parameter distributions
- scenario
- mutation model
- summary statistics (e.g. number of alleles, expected heterozygosity, FST)
- the most time-consuming step
- based on the genealogical tree of sampled genes and coalescent theory
2. REJECTION STEP: only the simulated datasets closest to the observed dataset are retained
- based on Euclidean distances in the multidimensional space of summary statistics
3. ESTIMATION STEP: posterior distributions of parameters are estimated through a local linear regression procedure
- posterior distributions of parameters are estimated according to the most supported scenario

Black rat (Rattus rattus) invasion in Senegal
Konečný et al. 2013, Molecular Ecology

[Figure: Rattus rattus distribution]

Genetic data
- 14 microsatellites (9-22 alleles, mean: 14.14)
- mean allelic richness: 3.06 (range 1.87-4.71)
- mean expected heterozygosity: 0.538 (range 0.323-0.762)
- both allelic richness and heterozygosity decreased with longitude

Genetic information
Historical information
Formulation of our models and related parameters = evolutionary scenarios
Model choice
Parameter estimation

ABC analysis in four steps - four questions
Comparison of our observed dataset with simulated ones and inferring posterior probabilities of scenarios
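
To make the simulation and rejection steps of ABC more concrete, here is a minimal sketch on a toy problem. It is not DIYABC and not the analysis from Konečný et al. 2013. Everything in it is an assumption chosen for illustration: a single parameter theta (scaled mutation rate) with a uniform prior, a standard neutral coalescent with infinite-sites mutations, two summary statistics (mean and variance of the number of segregating sites across loci), and the hypothetical function names simulate_segregating_sites, summary_statistics and abc_rejection. The local linear regression of the estimation step is left out; the accepted values are used directly as a rough posterior sample.

```python
import math
import random


def simulate_segregating_sites(theta, n_genes, rng):
    """Toy coalescent: number of segregating sites at one locus."""
    total_branch_length = 0.0
    for k in range(n_genes, 1, -1):
        # With k lineages, the time to the next coalescence is exponential
        # with rate k(k-1)/2; k branches are extended during that time.
        waiting_time = rng.expovariate(k * (k - 1) / 2)
        total_branch_length += k * waiting_time
    # Mutations form a Poisson process with rate theta/2 along the tree:
    # count the arrivals that fall within the total branch length.
    mutations = 0
    position = rng.expovariate(theta / 2)
    while position < total_branch_length:
        mutations += 1
        position += rng.expovariate(theta / 2)
    return mutations


def summary_statistics(counts):
    """Reduce a multi-locus dataset to (mean, variance) across loci."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return (mean, variance)


def abc_rejection(observed, n_simulations, n_loci, n_genes, tolerance, rng):
    # 1. Simulation step: draw theta from the prior, simulate a dataset,
    #    and record (parameter, summary statistics) in the reference table.
    table = []
    for _ in range(n_simulations):
        theta = rng.uniform(0.1, 10.0)  # assumed prior, for illustration
        counts = [simulate_segregating_sites(theta, n_genes, rng)
                  for _ in range(n_loci)]
        table.append((theta, summary_statistics(counts)))

    # Scale each statistic by its standard deviation across simulations,
    # so that no single statistic dominates the distance.
    scales = []
    for j in range(len(observed)):
        column = [stats[j] for _, stats in table]
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (len(column) - 1))
        scales.append(sd or 1.0)

    def distance(stats):
        # Euclidean distance in the space of scaled summary statistics.
        return math.sqrt(sum(((s - o) / sc) ** 2
                             for s, o, sc in zip(stats, observed, scales)))

    # 2. Rejection step: keep only the simulations closest to the observed data.
    table.sort(key=lambda row: distance(row[1]))
    n_keep = max(1, int(tolerance * n_simulations))
    return [theta for theta, _ in table[:n_keep]]


rng = random.Random(1)
# Pseudo-observed dataset generated with a known theta, so the result can be checked.
observed_counts = [simulate_segregating_sites(4.0, 10, rng) for _ in range(20)]
observed_stats = summary_statistics(observed_counts)
accepted = abc_rejection(observed_stats, n_simulations=2000, n_loci=20,
                         n_genes=10, tolerance=0.02, rng=rng)
posterior_mean = sum(accepted) / len(accepted)
print(f"accepted {len(accepted)} simulations, posterior mean theta = {posterior_mean:.2f}")
```

In a real analysis the reference table is far larger and covers several competing scenarios, which is why the simulation step dominates the running time. The same distance-based ranking can then be used for model choice, by counting how often each scenario appears among the closest simulations.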