Methods in climatology II. Extreme Value Analysis

Motivation

The Hot Summer of 2010: Redrawing the Temperature Record Map of Europe
[...] approximately 50% of Europe. [...] (for example, Moscow reached 38.2°C), night-time (Kiev reached 25°C), and daily mean (Helsinki reached 26.1°C) temperatures (fig. S1). Preliminary estimates for Russia referred a death toll of 55,000, an annual crop failure of ~25%, more than 1 million ha of burned areas, and ~US$15 billion (~1% gross domestic product) of total economic loss. During the same period, parts of eastern Asia also experienced extreme warm temperatures, and Pakistan was hit by devastating monsoon floods.
[...] of the 21st century. Increasing greenhouse gas concentrations are expected to amplify the variability of summer temperatures in Europe [...] more frequent, persistent, and intense heatwaves (6-10). Consistent with these expectations, Europe has experienced devastating heatwaves in recent years. The exceptional summer of 2003 [...] around 70,000 heat-related deaths, mainly in [...]
(Science, 2011, vol. 332, www.sciencemag.org)

[Figure: European summer temperature]

The flood in Moravia and Silesia in July 1997
• 52 victims
• material damage of 63 billion Czech crowns

Terminology 1) Descriptive Extremity Indices
Definitions and mathematical formulas of the indices used in ECA&D are provided below. A core set of 26 indices follows the definitions recommended by the CCl/CLIVAR/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI). Further indices are defined specifically for Europe.
[...] may lead to additional indices or changes in the index definitions.
Source: http://www.ecad.eu/indicesextremes/

Examples:
• R20mm: very heavy precipitation days (precipitation > 20 mm) (days)
  Let $RR_{ij}$ be the daily precipitation amount for day $i$ of period $j$. Then counted is the number of days where $RR_{ij} > 20$ mm.
• R95p: days with RR > 95th percentile of daily amounts (very wet days) (days)
  Let $RR_{wj}$ be the daily precipitation amount at wet day $w$ (RR > 1.0 mm) of period $j$, and let $RR_{wn}95$ be the 95th percentile of precipitation at wet days in the 1961-1990 period. Then counted is the number of days where $RR_{wj} > RR_{wn}95$.

[Figure: Mean number of tropical days in the Czech Republic (CR) in the 1961-2020 period]

Terminology 2) Extreme Value Analysis (EVA)
• Analysis of the frequency of occurrence and the intensity of rare events ("extremes") that occur with low probability
• Analysis of annual maximum precipitation, air temperatures exceeding a very high threshold, ...

[Figure: Variability of annual absolute maximum air temperatures at Brno, Turany (1961-2019); left: linear trends (1961-1990 and 1991-2019), right: density distribution]

• Extreme Value Theory (EVT)
• Extreme Value Distribution (EVD)

EVA example

N [years]    2     5     10    20    50    100
Tmax [°C]    32.8  34.4  35.3  36.1  36.9  37.5

Estimated return values of annual absolute maximum air temperature for mean return periods N [years] at Brno, Turany (1961-2011).

Purpose
• find reliable estimates of the return value X(T) for large return periods T (i.e. rare events),
• even for T larger than the period of observation,
• including estimates of the uncertainty of X(T).
Minima can be treated with the same tools as maxima, because
$\min\{x_1, \dots, x_n\} = -\max\{-x_1, \dots, -x_n\}$

Main steps
1. Choose an appropriate parametric distribution function
2. Calibrate it such that it describes the available data well
3. Extrapolate the distribution function

Which is the "appropriate" distribution?
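The meaning of a return period T can be made concrete with a little arithmetic. The sketch below is an illustration, not part of the original slides: it assumes annual blocks that are independent and identically distributed, so the T-year return value is exceeded in any one year with probability 1/T. The function names are hypothetical. It also shows the min/max identity from the Purpose slide.

```python
def annual_exceedance_probability(T):
    """Probability that the T-year return value is exceeded in a single year.

    Assumes annual maxima, so T must be larger than 1 year.
    """
    if T <= 1:
        raise ValueError("return period must exceed 1 year")
    return 1.0 / T


def prob_at_least_one_exceedance(T, n_years):
    """Probability of at least one exceedance of X(T) within n_years.

    Assumption: years are independent and identically distributed.
    """
    p = annual_exceedance_probability(T)
    return 1.0 - (1.0 - p) ** n_years


def block_minimum(values):
    """Minimum computed through the identity min{x} = -max{-x}."""
    return -max(-v for v in values)


if __name__ == "__main__":
    # Chance of seeing the 100-year value at least once in 100 years
    print(prob_at_least_one_exceedance(100, 100))
    print(block_minimum([3.0, 1.0, 2.0]))
```

Under these assumptions a "100-year event" is more likely than not to occur at least once within any given 100-year window, which is why T should be read as a mean recurrence interval and not as a guaranteed spacing.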
Extreme Value Theory (EVT): Input data
• BM (Block Maxima)
• POT (Peak Over Threshold)
[Figure: the same series with the values selected by the BM and by the POT approach]

Extreme Value Theory (EVT): Appropriate distributions
• BM (Block Maxima): Generalized Extreme Value distribution (GEV)
• POT (Peak Over Threshold): Generalized Pareto distribution (GPD)

Extreme Value Distributions: Generalized Extreme Value distribution (GEV)
The suitably normalized maximum of a large number of iid random variables is distributed like the Gumbel, Fréchet or Weibull distribution, independently of the parent distribution.
(iid: independent and identically distributed)
3-parameter distribution: location ($\mu$), scale ($\sigma$), shape ($\xi$)
GEV cumulative distribution function:
$\mathrm{GEV}(x; \mu, \sigma, \xi) = \exp\left\{-\left[1 + \xi\,\frac{x - \mu}{\sigma}\right]^{-1/\xi}\right\}$, where $1 + \xi\,\frac{x - \mu}{\sigma} > 0$
• $\xi = 0$: Gumbel, unbounded
• $\xi > 0$: Fréchet, lower bound
• $\xi < 0$: Weibull, upper bound

Modelling Block Maxima (BM)
• Build blocks
  Divide the full dataset into equal-sized chunks of data, e.g. yearly blocks of 365/366 daily precipitation measurements.
• Extract block maxima
  Determine the maximum of each block.
• Fit the GEV to the maxima and estimate X(T)
  Estimate the parameters of a GEV fitted to the block maxima:
  - Maximum Likelihood (ML) estimation: preferred when i) samples are sufficiently large; ii) the climate is not stationary. In this case the ML estimation may include "covariates".
  - L-moments estimation: when samples are small
  - Method of moments: underestimates long-period return values
• Calculate the return value function X(T) and its uncertainty (confidence intervals)

Extreme Value Distributions: Generalized Pareto distribution (GPD)
Estimate X(T) (for rare extremes) by parametric modelling of independent exceedances above a large threshold.
For a large threshold $u$ the distribution $E_u(y)$ of the excesses $y = x - u$ asymptotes to a limit distribution:
$E_u(y) \to \mathrm{GPD}(y; \sigma, \xi)$ for $u \to \infty$
$\mathrm{GPD}(y; \sigma, \xi) = 1 - \left(1 + \xi\,\frac{y}{\sigma}\right)^{-1/\xi}$
[Figure: GPD cumulative distribution function depending on the shape parameter $\xi$]
• $\xi = 0$: exponential distribution

Modelling Peak Over Threshold (POT)
• Select a threshold u
  It should be large enough to be in the asymptotic limit.
• Extract the exceedances from the dataset
  n values out of the total N data values; the exceedances need to be mutually independent.
• Fit the GPD to the exceedances; this yields the conditional distribution:
  $\mathrm{prob}(X > x \mid X > u) = 1 - \mathrm{GPD}(x - u; \sigma, \xi)$
• Estimate the unconditional distribution and return values:
  $\mathrm{prob}(X > x) = \mathrm{prob}(X > u) \cdot \left(1 - \mathrm{GPD}(x - u; \sigma, \xi)\right)$
  with prob(X > u) estimated as n/N (the third model parameter).
  Return values X(T) follow from the unconditional distribution.

Modelling Peak Over Threshold (POT): Assumptions
Exceedances are identically distributed
• may be violated e.g. by seasonality, by trends
Exceedances are independent
• may be violated by serial correlation
• much more critical than for the block maximum approach
• in general solved by declustering of the original data
• e.g. exceedances should be separated by at least x days

Modelling Peak Over Threshold (POT): Threshold Selection
• Mean residual life plot: the idea is to find the lowest threshold above which the plot is nearly linear, taking into account the 95% confidence bounds.
[Figure: Mean Residual Life Plot: Fort Prec]
• Fitting the data to a GPD over a range of thresholds: the stability of the parameter estimates is checked.
[Figure: GPD parameter estimates as a function of threshold]
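The POT steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular package: the function names are hypothetical, the GPD parameters are taken as given (no fitting is done), declustering uses a simple "separated by at least `min_gap` time steps" rule, and the return value uses the usual large-T approximation of one exceedance of X(T) per T years, with `n_per_year` observations per year.

```python
import math


def gpd_cdf(y, sigma, xi):
    """GPD(y; sigma, xi) for an excess y = x - u over the threshold."""
    if y < 0:
        return 0.0
    if abs(xi) < 1e-12:               # xi = 0: exponential distribution
        return 1.0 - math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:                      # beyond the upper bound (xi < 0 only)
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def decluster(series, u, min_gap):
    """Return one peak per cluster of exceedances of u.

    Exceedances fewer than min_gap time steps apart are treated as one
    cluster; only the cluster maximum is kept.
    """
    peaks, last = [], None
    for i, x in enumerate(series):
        if x <= u:
            continue
        if last is not None and i - last < min_gap:
            peaks[-1] = max(peaks[-1], x)   # same cluster: keep the larger
        else:
            peaks.append(x)                 # new cluster
        last = i
    return peaks


def pot_exceedance_prob(x, u, sigma, xi, zeta):
    """prob(X > x) = prob(X > u) * (1 - GPD(x - u)), with zeta = n / N."""
    return zeta * (1.0 - gpd_cdf(x - u, sigma, xi))


def pot_return_level(T, u, sigma, xi, zeta, n_per_year):
    """Value exceeded on average once every T years."""
    m = T * n_per_year * zeta         # expected exceedances of u in T years
    if m <= 1.0:
        raise ValueError("T too short for this threshold")
    if abs(xi) < 1e-12:
        return u + sigma * math.log(m)
    return u + sigma / xi * (m ** xi - 1.0)


if __name__ == "__main__":
    daily = [0, 5, 6, 0, 0, 0, 7, 0, 4, 0]       # made-up toy series
    print(decluster(daily, u=3, min_gap=3))
    print(pot_return_level(50, u=10.0, sigma=2.0, xi=0.1,
                           zeta=0.02, n_per_year=365))
```

In practice sigma and xi would come from a fit to the declustered excesses, and zeta would be the number of retained peaks divided by N; the numbers in the usage example are invented for illustration only.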
BM versus POT
• The POT approach typically utilizes more of the available data than the block maxima approach.
• However, threshold excesses commonly cluster above a high threshold, especially with atmospheric data; if the clustering is ignored, confidence intervals are too narrow.
• The block maxima approach may include points that are not very extreme.
• In some cases it might miss extreme values simply because a larger value occurred somewhere else in the block (e.g., the second, or third, point that exceeds the threshold).
• The block maxima approach typically satisfies the independence assumption to a good approximation, and is easily interpretable in terms of return values.
[Figure: values selected by POT and by BM in the same series]

BM versus POT

BM
• Theoretical assumptions are less critical in practice.
• Independence of maxima can be achieved by selecting a large block size.
• Estimation uncertainties can be large because of the small sample size.
• Easier to apply in practice.

POT
• More efficient if a "small" threshold is justified (more independent exceedances than block maxima).
• Independence assumption is critical in practice. Needs declustering techniques.
• Needs diagnostics for threshold selection. The choice is somewhat ambiguous in practice.
• Less easy to apply in practice.

Source: Analysis of Climate and Weather Data, Extreme Value Analysis - An Introduction, Christoph.frei [at] meteoswiss.ch
ftp://ftp.pmodwrc.ch/pub/people/anna.shapiro/analisys%20of%20climate/Xstat%5B1%5D.pdf

General comments
Quality control and dealing with "outliers"
The fitted distribution may be very sensitive to the inclusion/exclusion of the outlier:
• Inclusion: the quality of the fit is reduced
• Exclusion: return values are underestimated; not a recommended approach

[Figure: Peaks over Threshold, return values with confidence intervals from ML (delta method) as a function of T; Q95 (m/s), index for "storminess" in Europe, ERA40, 1958-2002; marked event: Vivian, Feb 1990]
The confidence interval implies that there is a non-zero probability that the upper bound is smaller than the maximum observed value.

Source: Analysis of Climate and Weather Data, Extreme Value Analysis - An Introduction, Christoph.frei [at] meteoswiss.ch
ftp://ftp.pmodwrc.ch/pub/people/anna.shapiro/analisys%20of%20climate/Xstat%5B1%5D.pdf

EVA tools: Climate Explorer
https://climexp.knmi.nl
Maximum July air temperatures, Brno, Turany, 1961-2010
Abs. max: 36.2°C (2007)

EVA tools: Climate Explorer
[Figure: GEV fit to July maximum temperature at Brno with 95% CI, plotted against return period [yr]; the 2007 value is marked]

EVA tools: in2extRemes
http://www.assessment.ucar.edu/toolkit/

Project Abstract
Extreme value statistics are used primarily to quantify the stochastic behavior of a process at unusually large (or small) values. Particularly, such analyses usually require estimation of the probability of events that are more extreme than any previously observed. Many fields have begun to use extreme value theory and some have been using it for a very long time, including meteorology, hydrology, finance and ocean wave modeling, to name just a few. The extreme value analysis software package in2extRemes is an interactive (point-and-click) software package for analyzing extreme value data using the R statistical programming language. A graphical user interface to the package extRemes (version >= 2.0) is provided, so a knowledge of R is not necessarily required. The software packages come with tutorials (available soon) that explain how they can be used to treat weather and climate extremes in a realistic manner (e.g., taking into account diurnal and annual cycles, trends, physically-based covariates).
EVA tools: in2extRemes
R Console:

    > library(in2extRemes)
    Loading required package: extRemes
    Loading required package: Lmoments
    Loading required package: distillery
    Loading required package: car
    Attaching package: 'extRemes'
    The following objects are masked from 'package:stats':
        qqnorm, qqplot

EVA tools: in2extRemes - GUI

    > in2extRemes()

[Screenshot: "Into the extRemes Package" window]

EVA tools: in2extRemes - plot data
[Figure: data series plotted against year]

EVA tools: in2extRemes - fitting GEV to data
Analyze - Extreme Value Distributions

EVA tools: in2extRemes - estimate N return values (N=100)
Analyze - Parameter Confidence Intervals

              95% lower CI    Estimate      95% upper CI
    location  23.31796949     30.4153253    32.5206811
    scale      6.5393136       8.2712610     9.9427083
    shape     -0.03707258      0.1744151     0.3859029

    Preparing to calculate 95 % CI for 100-year return level
    Model is fixed
    Using Normal Approximation Method.
    fevd(x = ..., data = xdat, location.fun = ~1, scale.fun = ~1,
         shape.fun = ~1, use.phi = TRUE, type = "GEV", units = ...)
    [1] "Normal Approx."
    [1] "95% Confidence Interval: (57.6761, 119.8934)"
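As a cross-check on the output above, the 100-year return value can be computed directly from the GEV formula. The sketch below is an illustration in Python (not the extRemes implementation) and uses hypothetical function names. It assumes the middle column of the parameter table holds the point estimates (location 30.4153253, scale 8.2712610, shape 0.1744151), as read from the slide, and inverts the GEV distribution function at the non-exceedance probability 1 - 1/T.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV(x; mu, sigma, xi) cumulative distribution function."""
    if abs(xi) < 1e-12:                       # xi = 0: Gumbel
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:                              # outside the support
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(T, mu, sigma, xi):
    """Solve GEV(x) = 1 - 1/T for x (T in blocks, e.g. years)."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Assumed point estimates, read from the console output on the slide
mu, sigma, xi = 30.4153253, 8.2712610, 0.1744151
# Reported 95% confidence interval for the 100-year return level
ci_low, ci_high = 57.6761, 119.8934

x100 = gev_return_level(100, mu, sigma, xi)

if __name__ == "__main__":
    print("100-year return level:", x100)
    print("midpoint of reported CI:", 0.5 * (ci_low + ci_high))
```

With these inputs the computed value lies close to the midpoint of the reported interval, as expected for a symmetric normal-approximation interval. Because the shape estimate is positive (heavy upper tail), a symmetric interval is a rough summary; an asymmetric method such as profile likelihood is a common alternative.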