M7777 Applied Functional Data Analysis
8. Functional Data Simulation
Jan Koláček (kolacek@math.muni.cz)
Dept. of Mathematics and Statistics, Faculty of Science, Masaryk University, Brno
Jan Koláček (SCI MUNI), M7777 Applied FDA, Fall 2019

1. Wiener process (limit of a Random Walk)

[Figure: simulated random walk paths]

Combinations with Wiener process:
$$x_i(t_k) = m(t_k) + S_k,$$
where $S_k$ is the value of the random walk at step $k$.

2. Regression model

• Simulate $N \times M$ measurements
$$x_i(t_k) = m(t_k) + \varepsilon_{ik}, \qquad \varepsilon_{ik} \overset{iid}{\sim} N(0, \sigma^2),$$
where $m(t)$ is any regression function, $i = 1, \dots, N$, $k = 1, \dots, M$.
• Smooth the data by FDA to obtain $x_i(t)$, $i = 1, \dots, N$.

[Figure: two panels, "Regression" and "Smoothed regression"]

3. Basis expansion

$$x_i(t) = \sum_{j=1}^{K} c_{ij}\,\phi_j(t)$$
• $\boldsymbol{\phi}(t) = (\phi_1(t), \dots, \phi_K(t))$ ... a given basis system
• $c_{ij}$ ... iid random basis coefficients for the $i$-th curve, $j = 1, \dots, K$

4. Gaussian process

Let us consider a regression model
$$x_i(t_k) = m(t_k) + \varepsilon_{ik}$$
with a covariance function $c(r, s)$, i.e. $\mathrm{Cov}(\varepsilon_{ir}, \varepsilon_{is}) = c(t_r, t_s)$, usually
$$c(u, v) = \sigma^2 \exp\left(-\,[\ldots]\,(u - v)^2\right).$$
• Set $\mathbf{m} = (m(t_1), \dots, m(t_M))'$, $\Sigma = \left(c(t_r, t_s)\right)_{r,s=1}^{M}$, $\mathbf{x}_i = (x_i(t_1), \dots, x_i(t_M))'$.
Thus $\mathbf{x}_i \sim N_M(\mathbf{m}, \Sigma)$ and $x_i(t) = \lim_{M \to \infty} \mathbf{x}_i$.

[Figure: Gaussian process]

5. Random Gaussian process

Let $(t_1^*, y_1^*), \dots, (t_L^*, y_L^*)$, $L < M$, be given, and suppose
$$x_i(t_k^*) = y_k^* + \varepsilon_k^*, \qquad \boldsymbol{\varepsilon}^* \sim N_L(\mathbf{0}, \sigma_*^2 I_L).$$
Then $\mathbf{x}_i \mid \mathbf{y}^* \sim N_M(\mathbf{m}^*, \Sigma^*)$ with
$$\mathbf{m}^* = \Sigma_{tt^*}\left(\Sigma_{t^*t^*} + \sigma_*^2 I_L\right)^{-1}\mathbf{y}^*,$$
$$\Sigma^* = \Sigma_{tt} - \Sigma_{tt^*}\left(\Sigma_{t^*t^*} + \sigma_*^2 I_L\right)^{-1}\Sigma_{t^*t},$$
where $\Sigma_{ab} = \mathrm{Cov}(a, b)$.
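The conditioning formulas for $\mathbf{m}^*$ and $\Sigma^*$ can be checked numerically. The following is a minimal sketch in plain Python, not the course's own implementation. The helper names (`sq_exp_cov`, `cov_matrix`, `cholesky`, `chol_solve`, `condition`), the kernel constants 0.5 and 10, the 21-point grid and the three observed points are all illustrative assumptions.

```python
import math

def sq_exp_cov(u, v, sigma2=0.5, delta=10.0):
    """Assumed kernel: c(u, v) = sigma2 * exp(-delta * (u - v)^2)."""
    return sigma2 * math.exp(-delta * (u - v) ** 2)

def cov_matrix(a, b):
    """Matrix Sigma_ab with entries c(a_r, b_s)."""
    return [[sq_exp_cov(u, v) for v in b] for u in a]

def cholesky(A):
    """Lower-triangular G with A = G G' for symmetric positive definite A."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(G[i][k] * G[j][k] for k in range(j))
            G[i][j] = math.sqrt(s) if i == j else s / G[j][j]
    return G

def chol_solve(G, b):
    """Solve (G G') x = b by forward, then backward substitution."""
    n = len(G)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(G[i][k] * z[k] for k in range(i))) / G[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(G[k][i] * x[k] for k in range(i + 1, n))) / G[i][i]
    return x

def condition(t, t_star, y_star, noise_var):
    """Return (m_star, Sigma_star) of x | y* on the grid t."""
    L, M = len(t_star), len(t)
    A = cov_matrix(t_star, t_star)
    for k in range(L):
        A[k][k] += noise_var                 # Sigma_{t*t*} + sigma_*^2 I_L
    G = cholesky(A)
    S_tts = cov_matrix(t, t_star)            # Sigma_{tt*}, M x L
    S_tt = cov_matrix(t, t)                  # Sigma_{tt},  M x M
    alpha = chol_solve(G, y_star)            # A^{-1} y*
    m_star = [sum(row[k] * alpha[k] for k in range(L)) for row in S_tts]
    # W[s] = A^{-1} Sigma_{t* t_s}: one small solve per grid point.
    W = [chol_solve(G, row) for row in S_tts]
    Sigma_star = [[S_tt[r][s] - sum(S_tts[r][k] * W[s][k] for k in range(L))
                   for s in range(M)] for r in range(M)]
    return m_star, Sigma_star

# Illustrative inputs (made up): grid on [0, 1] and three noisy observations.
t = [k / 20 for k in range(21)]
t_star = [0.1, 0.5, 0.9]
y_star = [1.0, -0.5, 0.8]
m_star, Sigma_star = condition(t, t_star, y_star, noise_var=1e-4)
```

With a small $\sigma_*^2$ the conditional mean passes close to the given points and the conditional variance there shrinks to roughly $\sigma_*^2$. A curve is then drawn as $\mathbf{m}^*$ plus a Cholesky factor of $\Sigma^*$ times a standard normal vector. In practice a linear algebra library would replace the hand-written solver.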
[Figure: Random Gaussian process]

Regression Simulation

1. Generate $y_i$ with known $\alpha$, $\beta(t)$, $x_i(t)$ and $\varepsilon_i$, $i = 1, \dots, 30$.
2. Get estimates $\hat\beta(t)$, $\hat y$ by the considered methods:
a) Estimation through a basis expansion ... $\hat\beta_{be}(t)$, $\hat y_{be}$
b) Estimation with a roughness penalty ... $\hat\beta_{rp}(t)$, $\hat y_{rp}$
c) Regression on functional principal components ... $\hat\beta_{pc}(t)$, $\hat y_{pc}$
d) Nonparametric regression ... $\hat\beta_{nr}(t)$, $\hat y_{nr}$
3. Compare $\hat\beta_{be}$, $\hat\beta_{rp}$, $\hat\beta_{pc}$, $\hat\beta_{nr}$ with the known $\beta$.
4. Compare the estimates $\hat y_{be}$, $\hat y_{rp}$, $\hat y_{pc}$, $\hat y_{nr}$ with the known model fits.

Regression Simulation 1

Let $(t_1, \dots, t_M) = (1, 2, \dots, 365)$. We simulate 30 regression curves $x_i(t)$ as a Gaussian process with
$$m(t) = \sin(t/365), \qquad c(u, v) = 0.01 \exp\left(-\,[\ldots]\,(u - v)^2\right).$$
The regression model takes the form
$$y_i = -10 + \int_1^{365} \beta(t)\,x_i(t)\,dt + \varepsilon_i$$
with $\beta(t) = 1 + 2t/365 - (t/365)^2$ and $\varepsilon_i \sim N(0, 5)$.

[Figure: Simulated Data]

[Figure: Comparison of fits; panels: 1. Basis expansion, 2. Roughness Penalty, 3. Functional PCA, 4. Nonparametric]

Regression Simulation 2

Let $(t_1, \dots, t_M) = (0, 0.01, \dots, 1)$. We simulate 30 regression curves $x_i(t)$ as a Gaussian process with
$$m(t) = \exp(t/2\pi), \qquad c(u, v) = 0.5 \exp\left(-10\,(u - v)^2\right).$$
The regression model takes the form
$$y_i = 5 + \int_0^1 \beta(t)\,x_i(t)\,dt + \varepsilon_i$$
with $\beta(t) = \sin(2\pi t)$ and $\varepsilon_i \sim N(0, 0.1)$.
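The data-generating step of Regression Simulation 2 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the code behind the slides: $N(0, 0.1)$ is read as variance 0.1, the integral is approximated by the trapezoidal rule, and a small diagonal jitter is added to $\Sigma$ so that the Cholesky factorization is numerically stable. The seed and all function names are hypothetical.

```python
import math
import random

random.seed(2019)                      # arbitrary seed, for reproducibility only

N, M = 30, 101
t = [k / (M - 1) for k in range(M)]    # grid 0, 0.01, ..., 1

def mean_fn(s):
    return math.exp(s / (2 * math.pi))           # m(t) = exp(t / (2 pi))

def cov_fn(u, v):
    return 0.5 * math.exp(-10 * (u - v) ** 2)    # c(u, v) from the slide

def beta(s):
    return math.sin(2 * math.pi * s)             # beta(t) = sin(2 pi t)

def cholesky(A):
    """Lower-triangular G with A = G G' for symmetric positive definite A."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(G[i][k] * G[j][k] for k in range(j))
            G[i][j] = math.sqrt(s) if i == j else s / G[j][j]
    return G

def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of values over grid."""
    return sum((grid[k + 1] - grid[k]) * (values[k] + values[k + 1]) / 2
               for k in range(len(grid) - 1))

JITTER = 1e-6   # assumption: numerical stabiliser, not part of the model
Sigma = [[cov_fn(u, v) + (JITTER if r == s else 0.0)
          for s, v in enumerate(t)] for r, u in enumerate(t)]
G = cholesky(Sigma)

def draw_curve():
    """One draw x_i = m + G z with z ~ N(0, I), so x_i ~ N_M(m, Sigma)."""
    z = [random.gauss(0, 1) for _ in range(M)]
    return [mean_fn(t[r]) + sum(G[r][k] * z[k] for k in range(r + 1))
            for r in range(M)]

X = [draw_curve() for _ in range(N)]
noise_sd = math.sqrt(0.1)              # assumption: 0.1 is the variance
y = [5 + trapezoid([beta(s) * x for s, x in zip(t, xi)], t)
     + random.gauss(0, noise_sd) for xi in X]
```

The rows of `X` are the discretised curves $x_i(t_k)$ and `y` holds the 30 responses, which can then be passed to any of the four estimation methods.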
[Figure: Simulated Data]

Problems to solve

1. Conduct the following simulation (Kokoszka and Reimherr, 2017).
• Generate 1000 random functions
$$X(t_j) = Z t_j + U + \eta(t_j) + \varepsilon(t_j),$$
where $t_j$ are 101 equidistant points on $[0, 1]$, $\eta(t_j) \sim N(0, 1)$, $Z \sim N(1, 0.2^2)$, $U \sim \mathrm{UNIF}(0, 5)$, and the random curves $\varepsilon(t)$ are generated as
$$\varepsilon(t) = \sum_{k=1}^{10} \frac{1}{k}\left\{Z_{1k} \sin(2\pi t k) + Z_{2k} \cos(2\pi t k)\right\}$$
with independent standard normal $Z_{1k}$, $Z_{2k}$.
• Consider a regression model of the form
$$y_i = 0.01 \int_0^1 \beta(t)\,X_i(t)\,dt + \varepsilon_i$$
with $\beta(t) = -f_1(t) + 3 f_2(t) + f_3(t)$ and $\varepsilon_i \sim N(0, 0.4)$, where $f_1$, $f_2$, $f_3$ are the normal densities $N(0.2, 0.03^2)$, $N(0.5, 0.04^2)$, $N(0.75, 0.05^2)$, respectively.
• Try all regression approaches studied in the previous lesson, i.e.
  • estimation through a basis expansion,
  • estimation with a roughness penalty and
  • regression on FPCA.
  Plot the estimates $\hat\beta(t)$ and compare them with the original $\beta(t)$ (see Figure 1).
• Conduct the nonparametric regression. Plot the estimated values $\hat y_i$ against the simulated $y_i$ for all cases (see Figure 2).

Figure 2.
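The data-generating step of the simulation problem above can be sketched in plain Python as follows; the estimation steps are left to the reader. This is a minimal sketch under stated assumptions: $N(0, 0.4)$ is read as variance 0.4, the factor 0.01 is taken as multiplying the integral as written, the integral is approximated by the trapezoidal rule, and the seed and all function names are hypothetical.

```python
import math
import random

random.seed(1)                         # arbitrary seed, for reproducibility only

N, M = 1000, 101
t = [j / (M - 1) for j in range(M)]    # 101 equidistant points on [0, 1]

def normal_pdf(x, mu, sd):
    """Density of N(mu, sd^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def beta(s):
    """beta(t) = -f1(t) + 3 f2(t) + f3(t)."""
    return (-normal_pdf(s, 0.2, 0.03)
            + 3 * normal_pdf(s, 0.5, 0.04)
            + normal_pdf(s, 0.75, 0.05))

def smooth_error(s, z1, z2):
    """eps(t) = sum_{k=1}^{10} (1/k) {Z1k sin(2 pi t k) + Z2k cos(2 pi t k)}."""
    return sum((z1[k - 1] * math.sin(2 * math.pi * s * k)
                + z2[k - 1] * math.cos(2 * math.pi * s * k)) / k
               for k in range(1, 11))

def draw_curve():
    """One curve X(t_j) = Z t_j + U + eta(t_j) + eps(t_j) on the grid."""
    Z = random.gauss(1, 0.2)           # Z ~ N(1, 0.2^2)
    U = random.uniform(0, 5)           # U ~ UNIF(0, 5)
    z1 = [random.gauss(0, 1) for _ in range(10)]
    z2 = [random.gauss(0, 1) for _ in range(10)]
    return [Z * s + U + random.gauss(0, 1) + smooth_error(s, z1, z2)
            for s in t]

def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of values over grid."""
    return sum((grid[k + 1] - grid[k]) * (values[k] + values[k + 1]) / 2
               for k in range(len(grid) - 1))

X = [draw_curve() for _ in range(N)]
beta_vals = [beta(s) for s in t]
noise_sd = math.sqrt(0.4)              # assumption: 0.4 is the variance
y = [0.01 * trapezoid([b * x for b, x in zip(beta_vals, xi)], t)
     + random.gauss(0, noise_sd) for xi in X]
```

Since $f_1$, $f_2$, $f_3$ are densities concentrated inside $[0, 1]$, $\int_0^1 \beta(t)\,dt \approx -1 + 3 + 1 = 3$, which gives a quick sanity check on `beta_vals`.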