Global change research methods II. Variability in time

Intro to the time series analysis

Using time series, we can examine the dynamics of phenomena over time.

[Figure: example time series; x-axis: years 1860 to 2020]

Knowing the time series dynamics is fundamental for:
a) revealing the causes behind the time series development
b) reconstructing its behavior in the past
c) predicting its future development

Time series compilation

Primary (directly measured data) and secondary (derived) sources

Possible problems with TS compilation:
• selection of observation time points (minutes to years)
• calendar problems
• length of the series
• changes in measurement methodology
• etc.

Some of these problems lead to a violation of homogeneity.

Homogeneous TS
• The series reflects only the natural fluctuations of the studied element.
• External influences (such as a change in measurement methodology) are suppressed.

Important role of EDA (Exploratory Data Analysis)

Exploratory Data Analysis (EDA)

EDA is used to investigate data sets and summarize their main characteristics. It helps determine how best to manipulate data sources to get the answers you need, making it easier to discover patterns, spot anomalies, test a hypothesis, or check assumptions. EDA often employs visualization methods.

EDA often leads to a transformation:
1) to meet the prerequisites for subsequent analysis
2) to highlight the signal we are searching for (e.g. trend)

Common methods of transformation
• add a constant: $y' = y + C$
• linearization: $y' = \ln(y)$
• remove a mean or trend: $y' = y - \bar{y}$
• standardization (z-scores): $y' = (y - \bar{y}) / s_d$

Time series components

$y_t = T_t + S_t + C_t + e_t$

a) trend ($T_t$)
b) seasonality ($S_t$)
c) cyclicity ($C_t$)
d) random noise ($e_t$)

[Figure: example time series; x-axis: years 1970 to 2020]

Time series trend

A trend is a general tendency of the development of the investigated phenomenon over a long period.
It is the result of long-term and permanent processes (on the scale of the assessed length of the time series).

[Figure: time series plot; x-axis: Year, 1960 to 2020]

1) Monotonic trend - parameters are stable (e.g. linear regression) / deterministic methods
2) Adaptive trend - parameters change through time (e.g. moving averages)

Trend analysis involves:
1) Choosing the appropriate type of trend
2) Estimating the trend parameters
3) Testing the statistical significance of the trend parameters

Linear regression trend

[Figure: principle of the least squares method, with residuals; x-axis: Year]

The regression line parameters b0 (intercept) and b1 (slope) are estimated by the least squares method.

> trend.1 <- lm(bt$AHN ~ bt$YEAR)
> summary(trend.1)

Call:
lm(formula = bt$AHN ~ bt$YEAR)

Residuals:
     Min       1Q   Median       3Q      Max
-1.87025 -0.52417  0.05185  0.52435  1.17572

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -7E.E22528   9.572056  -7.504 7.14e-11 ***
bt$YEAR       0.044285   0.005007   8.844 1.81e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7055 on 60 degrees of freedom
Multiple R-squared: 0.5659, Adjusted R-squared: 0.5587
F-statistic: 78.22 on 1 and 60 DF, p-value: 1.805e-12

Model verification

Analysis of residuals

> # plot residuals against fitted values
> res <- resid(trend.1)
> plot(fitted(trend.1), res)
> abline(0, 0)
> # create Q-Q plot for residuals
> qqnorm(res)
> # add a straight diagonal line to the plot
> qqline(res)

[Figures: residuals vs. fitted(trend.1); Normal Q-Q Plot, x-axis: Theoretical Quantiles]

Root Mean Square Error (RMSE)

$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$

> # calculate RMSE of regression model
> sqrt(mean(trend.1$residuals^2))
[1] 1.731824

RMSE is the square root of the mean squared residual. It measures the typical distance between the values predicted by the model and the actual values in the dataset.

Time series smoothing

Smoothing is used when the trend is changing and cannot be described "globally" by a single mathematical curve (adaptive trend).
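The contrast between a global and an adaptive trend can be sketched in base R as follows. This is only an illustration under stated assumptions: the series is synthetic (the names year, temp, fit, ma5 and all numeric values are hypothetical, not the lecture's data), and stats::filter() is just one possible way to compute an equal-weight centred moving average.

```r
# Synthetic annual series (hypothetical data, not the lecture's data set):
# a slowly bending trend plus random noise.
set.seed(1)
year <- 1961:2020
temp <- 8 + 0.0008 * (year - 1961)^2 + rnorm(length(year), sd = 0.5)

# Global (monotonic) trend: one straight line for the whole period.
fit <- lm(temp ~ year)

# Adaptive trend: centred moving average with equal weights.
# Window length is 2 * k + 1 = 5 years; stats::filter() returns NA
# where the window does not fit (the first and last k years).
k <- 2
w <- rep(1 / (2 * k + 1), 2 * k + 1)
ma5 <- stats::filter(temp, filter = w, sides = 2)

# Compare both trend estimates on one plot.
plot(year, temp, type = "l", xlab = "Year", ylab = "Temperature")
abline(fit, col = "red", lwd = 2)        # global linear trend
lines(year, ma5, col = "blue", lwd = 2)  # adaptive trend
```

Widening the window (a larger k) gives a smoother curve at the cost of more missing values at both ends of the series.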
[Figure: time series plot; x-axis: Year, 1960 to 2020]

Moving averages

[Figure: schematic of a moving window over the observations y1 to y9]

Smoothing methods highlight low frequencies in time series (low-pass filters).

Moving averages

> x <- as.vector(bt$YEAR)
> y <- as.vector(bt$AHN)
> plot(x, y, type="l", xlab="Year", ylab="Temperature", main="Moving averages")
>
> # Moving averages
> if (!require(forecast)) { install.packages("forecast") }
> library(forecast)
>
> sm.1 <- ma(y, order=5)
> lines(x, sm.1, col="blue", lwd=3)
> sm.1 <- ma(y, order=15)
> lines(x, sm.1, col="green", lwd=3)
> sm.1 <- ma(y, order=30)
> lines(x, sm.1, col="red", lwd=3)
>
> # Add legend
> legend(x="bottomright", legend=c("ma_05", "ma_15", "ma_30"), bty="n", lwd=3, col=c("blue", "green", "red"))

[Figure: temperature series with 5-, 15- and 30-year moving averages; x-axis: Year, 1960 to 2020]

Gaussian filter

Weighted moving averages

[Figure: Gaussian filter weights; smoothed series, x-axis: Year, 1960 to 2010]

> # Gaussian filter
> if (!require(smoother)) { install.packages("smoother") }
> library(smoother)
>
> plot(x, y, type="l", xlab="Year", ylab="Temperature", main="Gaussian filter")
>
> sm.2 <- smth.gaussian(y, window=10, tails=TRUE)
> lines(x, sm.2, col="magenta", lwd=3)
>
> sm.2 <- smth.gaussian(y, window=30, tails=TRUE)
> lines(x, sm.2, col="cyan", lwd=3)
> legend(x="bottomright", legend=c("gf_10" [...] lwd=3, col=c("magenta"