A Small Labor Market Model for the Czech Economy

Jan Brůha
Czech National Bank, Na Příkopě 28, 115 03 Praha 1, Czech Republic.

Abstract

An empirical small labor market model for the Czech Republic is estimated in the state space framework. Its purpose is the joint modeling of the labor force, employment, wages, hours worked, output, and the GDP deflator in a consistent 'structural' framework suitable for short-run forecasting. The model entails, in the long run, five driving forces: a trend labor force component, a trend labor productivity component, a long-run inflation rate, an unemployment trend, and a trend hours worked component. In the short run, the dynamics is governed by a VAR model. The model aims at describing the co-movements in the labor-market variables, provides a model-based decomposition of the underlying series into trend and cyclical components, and outperforms unrestricted VARs in forecasting. The paper also presents an extension of the model to the 'data-rich' environment. Finally, the paper describes the second moments of labor market data at various frequencies and discusses to what extent these properties can be replicated by the model.

Key words: Structural time series; labor market; forecasting.
J.E.L. Classification: C51, C53, E17, J21
Email address: jan.bruha@cnb.cz; janJDruha@yahoo.co.uk (Jan Brůha).

Preludium

At the beginning of this paper, I would like to pay homage to Osvald Vasicek, one of the most distinguished scholars I met during my studies. I was a student at Masaryk University during the 1990s, and Osvald was one of the most inspiring professors. I learnt a lot from him. He introduced me to the realm of stochastic processes, stochastic filtering, and dynamic macroeconomics. He helped me in my first steps when I was trying to understand filtering and its computer implementation, and he pointed me to interesting applications in macroeconomics. This paper is therefore dedicated to Osvald as a homage.
During my studies at Masaryk University, Osvald was the chair of the Department of Applied Mathematics. This was one of the most inspiring environments I have ever encountered. There were other inspiring scholars; let me mention two of them: Pavel Osecky and Miloslav Mikulik. The lectures of both were inspiring and drew me into the hard but exciting fields of statistics and numerical techniques. Let me express my deepest respect and homage also to these two great personages, and let me applaud Osvald for creating such an inspiring environment, which significantly influenced the rest of my career.

1 Introduction

The main objective of this paper is to contribute to the understanding of the dynamics of key variables of the Czech labor market in the consistent framework of structural^1 multivariate time series. For this purpose, a small labor market model, containing the labor force, output, employment, hours worked, wages, and inflation, is proposed and estimated in the state space framework. The unobserved states have a straightforward interpretation, and the model variables can be decomposed into frequency components (short-run movements versus movements in trends). This implies, inter alia, that the observed movements in the time series can be decomposed into trend and cyclical components without resorting to ad hoc statistical approaches.

^1 The term 'structural' is understood here in the sense of time series econometricians (cf. Harvey, 1989, p. 2: 'A structural time series model is one which is set up in terms of components which have a direct interpretation.'), and should not be considered 'structural' in the sense of the Cowles Commission.

The model entails long-run dynamics and short-run fluctuations. The long-run dynamics is derived from five primitive trends (labor force, labor productivity, inflation rate, unemployment, and hours worked).
These primitive trends, through theoretical restrictions, then span the trends in all observable variables. The long-run restrictions are consistent with a frictionless economy in which a Cobb-Douglas production function is used to derive the level of employment desired by firms. The short-run dynamics is based on an empirical VAR, which aims at replicating the second moments of the labor market variables. The model is cast in the state space form and is estimated on Czech data from 1996Q1 to 2010Q1.

The proposed framework can be useful for the following purposes:
• as a natural benchmark against trend-cycle decompositions based on ad hoc filtering methods;
• to learn about the statistical properties of labor-market data at various frequencies;
• to gain insights into recent labor market developments;
• finally, to be used as an independent tool in the process of near-term forecasting.

Since the estimation technique distinguishes between cycles and trends, there is no need to detrend the data before estimation: the long-run and short-run dynamics of the model are estimated jointly, which avoids the unfortunate practice of detrending data by purely statistical methods prior to model estimation^2. Moreover, the state-space framework alleviates the so-called 'end-of-sample' bias in the trend-cycle decomposition, since Kalman smoothing adapts automatically at the end of the sample. I show on actual data that the output gap based on the HP filter is subject to substantial revisions, which is not the case for the filter based on the presented model. I also show that there are periods when the assessment of the cyclical position of the economy differs significantly between the presented model and the HP filter; the leading example of such a discrepancy is the current recession.
^2 Canova and Ferroni (2009), who adopted a non-structural approach to detrending using a parsimonious econometric specification, recently confirmed by simulations that an incorrect specification of trends distorts the estimation of the parameters of the cyclical part of macroeconomic models.

The model is cast in the state-space framework, which is very convenient for shock decomposition, for incorporating external judgments, and for running conditional projections. For 'real-time' forecasting, two properties are especially important. First, state space models can easily deal with missing data: if some series are available sooner than others, the earlier information can be incorporated into the model without waiting until all series are available. This is especially useful for the extension of the model to the data-rich environment (see below for the discussion of this extension), because some series whose dynamics can provide useful information about the core series, such as sentiment indicators or inflation measurements, are available sooner than the national accounts data. Second, the model can be calibrated so that the measurement noise variances are increased for the last observations, which can be useful if a significant revision of the core series is expected.

The paper also characterizes the second moments of Czech data and confronts the moments implied by the model with those observed in the data. It discusses to what extent a model with long-run neoclassical features is able to replicate the spectral properties of the observed time series.

Finally, following Bernanke et al (2005) and Boivin and Giannoni (2006), I present an extension of the model to the 'data-rich' environment: the state space is extended to gain additional information from a larger set of economic time series.
That set includes both alternative definitions of model variables (such as inflation or employment) and series not directly related to the model variables that may nevertheless improve filtering and forecasting (this is the case of so-called leading indicators, such as various measures of economic sentiment, import and energy prices, asset prices, or sectoral data).

The rest of the paper is organized as follows. The rest of this section reviews related papers. Section 2 presents the model and discusses the data and the estimation. Section 3 summarizes the model properties. Section 4 extends the model to the data-rich environment. The last section, Section 5, concludes.

1.1 Related papers

The model presented here is related to Proietti and Musso (2007), who apply the framework of structural multivariate time series to the Euro Area. They specify a model with a production function and two Phillips curves and, by specifying the permanent and transitory components in factor inputs, identify potential output and the output gap. The main difference between the two papers lies in the specification of the models spanning the trends and cycles.

Hjelm and Jonsson (2010) survey various approaches to filtering the trend component from economic time series, including multivariate model-based approaches. The presented model can be considered an instance of multivariate, model-based filtering.

Andrle (2008) discusses the role of stochastic trends in macroeconomic modeling and the effects of detrending. He argues for a joint modeling of trends and cycles, as permanent shocks may spill over the whole frequency range. The purpose of this paper is different, as it does not take the structural approach, and the decomposition into trends and cycles is taken here from the statistical, not economic, perspective. However, both papers share an emphasis on multivariate filtering that respects selected economic relations.

The paper is also related to the growing literature on the use of real-time data in economic forecasting.
For example, Benes et al (2010) illustrate how to incorporate real-time data into the Small Quarterly Structural Model for the United States with real-financial linkage (see Carabenciov et al, 2008, for a description of the model). Their main concern is a theoretically coherent approach to the asynchronous release of data (financial data are usually available in real time, while national accounts are available with a lag of about two quarters).

This research is also related to papers studying the relations between low-frequency movements in employment, productivity, and possibly other variables. For example, Ball (2009) argues that low-frequency movements in unemployment are caused by unemployment hysteresis. Farmer (2010) documents the low-frequency correlation between output dynamics and unemployment and explains this correlation in a model where self-fulfilling beliefs select an equilibrium from a set of possible equilibria. For Farmer (2010), this is the preferred interpretation of Keynes' original ideas. Finally, King (2005) reviews the literature on the low-frequency correlation between productivity and unemployment and investigates the correlation of productivity with job matching and job destruction. He concludes that the trend movements in unemployment cannot be explained by productivity alone.

During the past five years, there has been some interesting research on the Czech labor market^3. Some papers deal with institutional features and hence with what New Keynesians would call the natural unemployment rate. Galuscak and Pavel (2007) investigate the effect of net replacement rates on work incentives, contributing to the understanding of the equilibrium level of unemployment. Bicakova et al (2006) investigate, inter alia, the employment effects of changes in taxes and net benefits. Other studies deal with labor-market rigidities and hence have implications for fluctuations of labor-market variables over the cycle.
For example, Babecky et al (2008) use a firm-level survey to investigate the determinants of wage and price formation in Czech firms. They find efficiency wage models relevant for wage setting. Finally, some studies try to distinguish structural and cyclical factors. Hurnik and Navratil (2005) estimate a time-varying NAIRU to distinguish between the two factors. Galuscak and Munich (2007) address the issue of structural and cyclical unemployment using movements in the Beveridge curve parameters.

^3 The reader can consult the papers quoted in the following paragraphs for references to older studies.

2 The model and the estimation strategy

It is assumed that any observable variable $x_t$ is the sum of a trend component and a cyclical component: $x_t = \bar{x}_t + \hat{x}_t$, where $\bar{x}_t$ is the trend component and $\hat{x}_t$ is the cyclical component. These two components are not directly observable, and the model can be used to filter them. The dynamics of the cyclical components is modeled simply as a VAR process. The dynamics of the long-run component is described in the next subsection.

2.1 The long run component

The building blocks of the long-run dynamics are the production function and the labor demand equation. The production function links the long-run trends in the log^4 of employment $\bar{e}_t$ and of hours per employee $\bar{h}_t$ to the trend component of the log of real output $\bar{y}_t$ using a log-linear specification:

$$\bar{y}_t = \bar{e}_t + \bar{h}_t + \theta^y_t, \qquad (1)$$

where $\theta^y_t$ is the long-run labor productivity. The labor demand links the trend log real wage $(\bar{w}_t - \bar{p}_t)$ to the trend marginal product of labor:

$$\bar{e}_t + \bar{h}_t = \bar{y}_t - (\bar{w}_t - \bar{p}_t). \qquad (2)$$

The other four trends are the long-run growth in the GDP deflator $\theta^p_t$, the trend component in the labor force $\theta^l_t$, the trend component in hours per employee $\theta^h_t$, and the trend unemployment $\theta^u_t$.
The first of these trends should be pinned down by monetary policy, while the latter three reflect labor-market institutions and demographic factors, which are outside the scope of this paper. Therefore, the long-run trend in the log of the GDP deflator is $\bar{p}_t = \theta^p_t$, the long-run trend in the log of the labor force is given simply as $\bar{l}_t = \theta^l_t$, and the trend in the log of employment obeys $\bar{e}_t = \bar{l}_t - \theta^u_t = \theta^l_t - \theta^u_t$.

^4 All variables are in logs unless otherwise stated.

This simple model then implies the long-run elasticities, which are given in Table A.1.

The trends $\theta^x_t$ ($x \in \{l, y, u, h, p\}$) are modeled as random walks with drift:

$$\theta^x_t = \theta^x_{t-1} + \rho^x_{t-1} + \sigma^x_\theta \varepsilon^x_{1t},$$

where the drifts $\rho^x_t$ follow AR processes:

$$\rho^x_t - \mu^x = \varrho^x \left(\rho^x_{t-1} - \mu^x\right) + \sigma^x_\rho \varepsilon^x_{2t}, \qquad (3)$$

where $\{\varepsilon^x_{1t}\}_{t=0}^{\infty}$ and $\{\varepsilon^x_{2t}\}_{t=0}^{\infty}$ are i.i.d. white-noise processes. Hence, the trends $\theta^x_t$ can be represented as ARIMA(1,1,0) processes. Note that the Harvey-Jaeger (1993) process would be obtained for $\mu^x = 0$ and $\varrho^x = 1$ (and thus the trend component would be given as an ARIMA(0,2,0)).

It is worth explaining why I depart from the original Harvey-Jaeger formulation. First, for some processes, it is desirable to require the drift to fluctuate around a certain number. For example, most people would expect the growth in labor productivity to fluctuate around some positive constant, which is given by $\mu^y$. Therefore, even if the growth of trend productivity is perceived to be negative during a severe recession, if we expect growth to recover in the long run^5, we would want the drift $\rho^y_t$ to return to $\mu^y$. This does not happen under the Harvey-Jaeger framework: under it, if labor productivity growth becomes negative, all its expected future values remain negative^6. Similarly, the drift in the price trend $\rho^p_t$ is the trend inflation, and the Harvey-Jaeger ARIMA(0,2,0) model would suggest that it is a random walk.
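The difference between the two drift specifications can be illustrated numerically. The following minimal Python sketch computes the expected paths of the drift and of the trend implied by (3). The function names and the parameter values (a current drift of -1% per quarter, a long-run mean of 0.5%, and an AR coefficient of 0.8) are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def expected_drift_path(rho_now, mu, varrho, horizon):
    """E_t[rho_{t+tau}] for tau = 0..horizon under the AR(1) drift in (3)."""
    taus = np.arange(horizon + 1)
    return mu + (varrho ** taus) * (rho_now - mu)

def expected_trend_path(theta_now, rho_now, mu, varrho, horizon):
    """E_t[theta_{t+tau}] for tau = 0..horizon.

    Uses theta_{t+1} = theta_t + rho_t + noise, so that
    E_t[theta_{t+tau}] = theta_t + sum_{j=0}^{tau-1} E_t[rho_{t+j}].
    """
    drifts = expected_drift_path(rho_now, mu, varrho, horizon)
    return theta_now + np.concatenate(([0.0], np.cumsum(drifts[:-1])))

# Illustrative (assumed) values: the drift is currently negative.
ar_drift = expected_drift_path(rho_now=-0.01, mu=0.005, varrho=0.8, horizon=40)
# Harvey-Jaeger special case: mu = 0 and unit AR coefficient.
hj_drift = expected_drift_path(rho_now=-0.01, mu=0.0, varrho=1.0, horizon=40)
```

Under these assumed values, the AR(1) drift returns towards its mean, whereas in the Harvey-Jaeger case the expected drift stays at its current negative value at all horizons.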
A random-walk trend inflation is not something one would want in an economy with a well-functioning monetary policy. The formulation suggested here avoids such features; in fact, the coefficient $\mu^p$ corresponds to what is implied by the inflation target. Standard errors $\sigma^x_\theta$ [...]

^6 [...] $< 0$, and hence $E_t \theta^y_{t+\tau}$ would exponentially decrease as a function of $\tau$. On the other hand, under the formulation suggested here, $E_t \rho^y_{t+\tau} \to \mu^y$ even if $\rho^y_t < 0$, and therefore one expects positive productivity growth in the future.

2.2 The state-space formulation of the model

The model is formulated and estimated jointly in the state-space framework. The state equation is given as:

$$\begin{bmatrix} \theta_t \\ \rho_t \\ \hat{X}_t \end{bmatrix} = \begin{bmatrix} I & I & 0 \\ 0 & P & 0 \\ 0 & 0 & A \end{bmatrix} \begin{bmatrix} \theta_{t-1} \\ \rho_{t-1} \\ \hat{X}_{t-1} \end{bmatrix} + \begin{bmatrix} 0 \\ (I - P)M \\ 0 \end{bmatrix} + \begin{bmatrix} \Sigma_1 & 0 & 0 \\ 0 & \Sigma_2 & 0 \\ 0 & 0 & \Sigma_c \end{bmatrix} \varepsilon_t, \qquad (4)$$

where $\theta_t$ is the column vector of the trends $\theta^x_t$, $\rho_t$ is the column vector of the drifts $\rho^x_t$, $\hat{X}_t$ is the column vector of cyclical components, $I$ is the identity matrix and $0$ is the zero matrix of an appropriate dimension, $A$ is the VAR matrix, which determines the cyclical dynamics of the model^7, $P$ is the diagonal matrix containing the $\varrho^x$'s, $M$ is the column vector containing the $\mu^x$'s, $\Sigma_1$, $\Sigma_2$, and $\Sigma_c$ are diagonal matrices of standard deviations, and $\{\varepsilon_t\}_{t=0}^{\infty}$ is a multivariate white noise process with $E\varepsilon_t = 0$ and $E(\varepsilon_t \varepsilon_s^T) = \delta_{ts} I$.

^7 The third-order VAR was chosen on empirical grounds. The matrix $A$ and the vector $\hat{X}_t$ are rewritten into the first-order form using the obvious transformation.

The observation equation is given as:

$$y_t = \begin{bmatrix} C & 0 & I \end{bmatrix} \begin{bmatrix} \theta_t \\ \rho_t \\ \hat{X}_t \end{bmatrix} + \Sigma_v u_t, \qquad (5)$$

where $y_t$ is the vector of observable variables, $u_t$ is the measurement noise, and $\Sigma_v$ its covariance matrix. The matrix $C$ is given by the long-run elasticities (see Table A.1).

For later use in the paper, I introduce the following notation compactly describing the model. The system (4) and (5) is written in the compact form:

$$x_{t+1} = \tilde{A} x_t + \tilde{M} + \tilde{\Sigma}_x \varepsilon_{t+1}, \qquad (6)$$

$$y_t = \tilde{C} x_t + \Sigma_v u_t, \qquad (7)$$

where $x_t^T = \begin{bmatrix} \theta_t^T & \rho_t^T & \hat{X}_t^T \end{bmatrix}$ is the stacked vector of all states, and the matrices with the tilde refer to the matrices from the state space system, i.e.,

$$\tilde{A} = \begin{bmatrix} I & I & 0 \\ 0 & P & 0 \\ 0 & 0 & A \end{bmatrix}, \quad \tilde{C} = \begin{bmatrix} C & 0 & I \end{bmatrix}, \quad \tilde{M} = \begin{bmatrix} 0 \\ (I - P)M \\ 0 \end{bmatrix}, \quad \text{and} \quad \tilde{\Sigma}_x = \begin{bmatrix} \Sigma_1 & 0 & 0 \\ 0 & \Sigma_2 & 0 \\ 0 & 0 & \Sigma_c \end{bmatrix}.$$

2.3 Data & Estimation

I use quarterly national accounts data from 1996Q1 to 2010Q1. I use seasonally adjusted data on the labor force, output, employment, nominal wages, and the GDP deflator. The model is estimated using the prediction-error minimization approach. The parameters of the model are constrained so that the growth rates are stationary, which means simply that $|\varrho^x| < 1$ for $x \in \{l, y, u, p\}$ and that the moduli of the eigenvalues of $A$ are less than one.

3 Model Properties

In this section, I briefly review the model properties: (i) filtering data, (ii) replicating moments, and (iii) forecasting.

3.1 Multivariate filtering

The model can be used for multivariate filtering of the main model variables (output, employment, hours, and wages) so that the main restrictions (mainly the production function) are satisfied. Figure A.1 compares growth rates in the observed variables (solid line), growth rates in the model-based trends (dashed line), and growth rates in the HP trends (dot-dashed line). One can see that the model-based filter yields a somewhat more volatile trend in the labor force, and hence in employment and output, than the HP filter. The hours, inflation, and productivity trends are comparable to their HP counterparts.

Besides being consistent with the intra-temporal restrictions, the model-based filtering has another advantage: the problem of the end-point bias is alleviated. Figure A.2 displays growth rates in the recursively computed trends (on consecutive data vintages) for the model filter, the HP filter, and the HP filter with the end-point bias reduction (suggested by Bruchez, 2003). Two trends are shown: the trend in real output and the trend in the GDP deflator. The results reveal that the HP-based trend is subject to significant revisions, and the end-point bias correction helps only marginally.
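The recursive filtering behind these comparisons operates on the compact system (6)-(7). A minimal Python sketch of a Kalman filter for that system, which also skips missing observations as discussed in the Introduction, is given below. The function name, the argument names, and the initialization are assumptions made for illustration; the smoothing step and the prediction-error estimation used in the paper are not shown.

```python
import numpy as np

def kalman_filter(y, A, M, C, Sx, Sv, x0, P0):
    """Filter x_{t+1} = A x_t + M + Sx e_{t+1},  y_t = C x_t + Sv u_t.

    y  : (T, ny) array; NaN marks a missing observation.
    x0 : state mean one period before the first observation.
    P0 : state covariance one period before the first observation.
    Returns the (T, n) array of filtered state means.
    """
    Q = Sx @ Sx.T          # state innovation covariance
    R = Sv @ Sv.T          # measurement noise covariance
    x, P = x0.copy(), P0.copy()
    filtered = []
    for obs in y:
        # Prediction step.
        x = A @ x + M
        P = A @ P @ A.T + Q
        # Update step, using only the entries that are observed.
        avail = ~np.isnan(obs)
        if avail.any():
            Ca = C[avail]
            Ra = R[np.ix_(avail, avail)]
            F = Ca @ P @ Ca.T + Ra
            K = P @ Ca.T @ np.linalg.inv(F)
            x = x + K @ (obs[avail] - Ca @ x)
            P = P - K @ Ca @ P
        filtered.append(x.copy())
    return np.array(filtered)
```

When a whole row of `y` is missing, the update step is skipped and the filtered state equals the one-step prediction; when only some series are missing, the remaining ones still update all states through the gain `K`.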
It is interesting to note that the HP trend growth in 2008 was revised first up and then down, so the revisions need not even be monotonic. On the other hand, the model-based filter exhibits only slow revisions.

Another striking feature is the discrepancy between the implications of the HP filter and of the model filter for the growth in trend output in the last quarters. The model-based filter indicates a larger and more negative cyclical position than the HP filter^8. Since capacity utilization is lower, as employment and hours have not yet recovered while productivity partially has, the model refuses to see the development in the last quarters as a closure of the gap. In other words, the HP filter can be blamed not only for the end-point bias, but also for being univariate, i.e., for extracting possibly different non-linear trends from each series independently.

Similar discrepancies exist for other years and also for inflation. According to the model, the cyclical inflation position in 1998-2000 was much further below the trend level than the HP filter would imply. The model story could be corroborated by the fact that the inflation development of that time was a big surprise for economic agents. On the other hand, the tendency of the HP filter 'to go through the middle of the series^9' and to smooth out the fluctuations means that the cyclical position, which had initially been similar to that of the model, has been revised up. If the disinflation of that time was really a surprise for economic agents, then one can argue that the model filtration is more credible than the HP filtration. However, if a strong and quick disinflation is followed by a period of stable inflation, then the HP filter would tend to underestimate the trend inflation at the beginning of the disinflation period and to overestimate it when the disinflation ends and the economy moves to the steady period.
Therefore, here too, economic intuition favors the model-based filter as the more credible story.

The reader may ask whether the model filter is multivariate on trends, on cycles, or both. This can be answered from the inverse filter^10 for trends and cycles, where the weight of the last observation in the trend component is lower than it would be for the HP filter. Figures A.8 and A.9 show what the weights for the unobserved states (trends and cycles) for the last observation period look like (as a function of current and lagged observations). It is apparent that the filter is indeed multivariate, i.e., that the estimation of trends (here smoothing) really depends on all observed variables and not only on the corresponding series.

^8 This is not just due to the end-point bias, but also due to the tendency of the HP filter to smooth large drops; it is a kind of folk theorem that the Great Depression is an unimportant event if it is measured by conventional HP cycles.

^9 Indeed, by the very construction of the HP filter, a bust must be followed by a boom.

^10 Following Koopman and Harvey (2003), the Kalman filter can be 'inverted' to inquire how observations in each series translate to unobserved states, here to trends and cycles.

3.2 Second moments

The model aims at replicating the second moments in the data. Figure A.3 compares the correlations at various lags and leads of selected variables (or their linear combinations) in the data with the model implications. The figure shows the sample correlation function (blue solid line) together with the correlation implied by the model (dashed red line)^11. I also report sample cospectra and quadrature spectra for the same pairs of variables (Figure A.4) and the sample coherence (Figure A.5). These values have been computed using the Bartlett estimate of the multivariate spectrum; see Hamilton (1994, chap. 10). In these figures, the usual business cycle frequencies (periods from 6 to 32 quarters) are highlighted.
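As an illustration of the sample quantities reported in Figures A.4 and A.5, the following Python sketch computes a Bartlett-kernel estimate of the cross-spectrum of two series and derives the cospectrum, the quadrature spectrum, and the coherence from it. The function names, the truncation lag `q`, and the frequency grid are assumptions made for this example; in particular, the sign convention for the quadrature spectrum (minus the imaginary part) is only one common choice and may differ from the one used for the figures.

```python
import numpy as np

def cross_cov(x, y, k):
    """Sample cross-covariance Cov(x_t, y_{t-k}) for k >= 0 (divisor T)."""
    x = x - x.mean()
    y = y - y.mean()
    T = len(x)
    return np.dot(x[k:], y[:T - k]) / T

def bartlett_cross_spectrum(x, y, omegas, q):
    """s_xy(w) = (1/2pi) sum_{|k|<=q} (1 - |k|/(q+1)) gamma_xy(k) exp(-i w k)."""
    s = np.zeros(len(omegas), dtype=complex)
    for k in range(-q, q + 1):
        weight = 1.0 - abs(k) / (q + 1.0)
        # gamma_xy(-k) = Cov(x_t, y_{t+k}) = gamma_yx(k)
        g = cross_cov(x, y, k) if k >= 0 else cross_cov(y, x, -k)
        s += weight * g * np.exp(-1j * omegas * k)
    return s / (2.0 * np.pi)

def co_quad(x, y, omegas, q):
    """Cospectrum and quadrature spectrum (assumed sign convention)."""
    s = bartlett_cross_spectrum(x, y, omegas, q)
    return s.real, -s.imag

def coherence(x, y, omegas, q):
    """Squared coherence |s_xy|^2 / (s_xx s_yy), between 0 and 1."""
    sxy = bartlett_cross_spectrum(x, y, omegas, q)
    sxx = bartlett_cross_spectrum(x, x, omegas, q).real
    syy = bartlett_cross_spectrum(y, y, omegas, q).real
    return np.abs(sxy) ** 2 / (sxx * syy)

def business_cycle_frequencies(n):
    """Grid of n frequencies for periods between 32 and 6 quarters."""
    return np.linspace(2 * np.pi / 32, 2 * np.pi / 6, n)
```

Using the same kernel and the same truncation lag for the auto-spectra and the cross-spectrum keeps the estimated spectral matrix positive semi-definite, so that the coherence stays within the unit interval.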
The first sub-figure in Figure A.3 shows the correlation of productivity with the real wage; labor productivity somewhat leads the real wage, but the highest correlation is contemporaneous. The model is able to roughly replicate this feature, although the correlation in the data is somewhat higher. It is also apparent that the correlation between the real wage and productivity arises at all frequencies (coherence is relatively high at all frequencies), and the quadrature spectra suggest that the lead of productivity arises at the business cycle frequencies.

The next sub-figure shows that there is a slight correlation between the excess of the real wage over productivity and inflation, where the excess leads inflation by about five lags. In other words, excessively high wages are corrected through inflation. Both the cospectra and the coherence suggest that this correlation arises mainly at frequencies around 1/5 periods. The model tends to overstate this feature of the data, which is not surprising given that the higher-frequency movements are not modeled and the low-frequency component of the relation implied by the model is zero (as trend inflation is governed by a trend independent of those governing real wages and productivity).

The third sub-figure in the first row shows the correlation between output per capita (here per labor force) and unemployment.

^11 The model correlation has been computed from the state space matrices of (6) and (7) as follows. Denote the population covariance matrices as $\Gamma^x_s = E(x_{t+s} x_t^T) - E(x_{t+s})E(x_t^T)$ and $\Gamma^y_s = E(y_{t+s} y_t^T) - E(y_{t+s})E(y_t^T)$. Then $\Gamma^x_s = \tilde{A}^s \Gamma^x_0$ and $\Gamma^y_s = \tilde{C} \Gamma^x_s \tilde{C}^T + \delta_{s0} \Sigma_v \Sigma_v^T$, where $\delta_{s0}$ is the Kronecker delta. The matrix $\Gamma^x_0$ satisfies the equation $\Gamma^x_0 = \tilde{A} \Gamma^x_0 \tilde{A}^T + \tilde{\Sigma}_x \tilde{\Sigma}_x^T$, which can be easily solved using the vec operator (Hamilton, 1994): $\mathrm{vec}\,\Gamma^x_0 = [I - \tilde{A} \otimes \tilde{A}]^{-1} \mathrm{vec}(\tilde{\Sigma}_x \tilde{\Sigma}_x^T)$. The model correlations for trends in variables, reported below, are computed similarly with the obvious modification of the state space matrices.
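The computation described in footnote 11 can be sketched in Python as follows. The sketch assumes that all eigenvalues of the transition matrix passed in are inside the unit circle (so it applies to a stationary block of the system, such as the cyclical part), and the function names are hypothetical.

```python
import numpy as np

def stationary_cov(A, S):
    """Solve G0 = A G0 A' + S S' via vec(G0) = (I - A kron A)^{-1} vec(S S')."""
    n = A.shape[0]
    Q = S @ S.T
    # Column-major (Fortran-order) reshaping implements the vec operator.
    vec_q = Q.reshape(-1, order="F")
    vec_g = np.linalg.solve(np.eye(n * n) - np.kron(A, A), vec_q)
    return vec_g.reshape(n, n, order="F")

def model_autocov(A, S, C, Sv, s):
    """Gamma_y(s) = C A^s G0 C' + delta_{s0} Sv Sv' for a lag s >= 0."""
    G0 = stationary_cov(A, S)
    Gx = np.linalg.matrix_power(A, s) @ G0
    Gy = C @ Gx @ C.T
    if s == 0:
        Gy = Gy + Sv @ Sv.T
    return Gy
```

Correlations at lag `s` follow by dividing the entries of `model_autocov(A, S, C, Sv, s)` by the square roots of the corresponding diagonal entries of `model_autocov(A, S, C, Sv, 0)`.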
For output per labor force and unemployment, there is a negative correlation, with output leading by about 2-3 quarters. Most of the co-movement arises at the business cycle frequencies (especially the lead), with some coherence even at low frequencies.

This low-frequency coherence forces me to ask whether the neoclassical view of the labor market is a complete description of the story. The neoclassical view is that long-run unemployment is given by a natural rate. The natural rate is determined by long-run features, such as labor-market institutions, and is independent of productivity growth and of monetary and other cyclical factors. However, the spectral properties of the investigated time series, as well as the counter-cyclical behavior of the filtered unemployment trend, cast some doubt on the long-run neoclassical view of unemployment. The fact that the presented model, which is spanned by 'neoclassical' trends, has problems explaining exactly those features may signal the need for alternative views of the labor market.

This low-frequency correlation is not specific to Czech data. King (2005) reviews studies that document this correlation in US data. Farmer (2010) also finds it and uses it to support his Keynesian (NOT new Keynesian!) interpretation of unemployment dynamics.

The first two sub-figures in the second row show the correlations between real output and employment and between real output and total hours worked. There is a positive correlation between real output and the two employment measures, with output leading by about 2 to 3 quarters. The quadrature spectrum peaks at the business cycle frequencies, which means that this lead is caused by business cycle movements. The correlation is stronger for output and employment than for output and total hours. The intuition for why this is the case is given by the last sub-figure in the second row.
Although the sample correlation of employment and hours per employee is virtually zero, the strong negative cospectrum and quadrature spectrum at business cycle frequencies suggest that hours and employees are substitutes at these frequencies. Hence, it appears as if firms use the hours margin to manage their labor demand in the short and medium run. The model gives qualitatively the same picture as the data, but overestimates the observed correlations.

The last row shows the correlations between the real wage and total hours, employment, and hours per employee. The strongest correlation exists between the real wage and employment, with the real wage leading by about 4 quarters. Surprisingly, the sample correlation between the real wage and hours per employee is negative (both in the model and in the data), which is attributable to the substitution between hours and employees at medium and high frequencies discussed above. Again, the model tends to overestimate the observed correlations.

These findings are corroborated by Figures A.6 and A.7. Figure A.6 shows the population cross-correlations, the population cross-correlations for business cycle frequencies^12 (6 to 32 quarters), and the cross-correlations implied by the model. It is apparent that the second moments implied by the model are closer to the second moments at the business cycle frequencies than to the sample correlations. Figure A.7 shows the population cross-correlations, the population cross-correlations for low frequencies (periods of more than 32 quarters), and the cross-correlations implied by the trend component of the model. Apparently, the spectral properties of some trends in the data are well described by those implied by the model. However, the model fails to describe other low-frequency movements in the data, which is particularly the case for the relation between the real wage and employment and hours. Again, this may indicate that the neoclassical approach to the labor market fails to account for some interesting long-run movements.
If this interpretation is correct, it would mean that new Keynesian DSGE models, which have a neoclassical long run, may also miss important aspects of the data.

^12 The frequency-specific cross-correlations can be derived from the spectral density. Let $s_{xy}(\omega)$ be the cross-spectral density of variables $x$ and $y$ at frequency $\omega$. Then the population cross-correlation between $x$ and $y$ at lag $k$ corresponding to the frequency band $[\omega_1, \omega_2]$ is given as $\Gamma^{[\omega_1, \omega_2]}_k = \int_{\omega_1}^{\omega_2} s_{xy}(\omega) e^{i \omega k} \, d\omega$. To estimate the frequency-specific second moments reported in these two figures, I use the trapezoid rule to approximate the integral, and $s_{xy}(\omega)$ is given as a linear transformation of the Bartlett estimator of the multivariate spectral density.

3.3 Forecasting

The model can be used for forecasting. Figure A.10 displays the recursive forecasts for the main model variables, while Figure A.11 compares the relative root mean square errors of the random walk forecast and the VAR forecast^13 to the model forecast for the main variables at various horizons. It seems that for some variables (especially real wages, hours worked, and inflation), the model does a good job. On the other hand, the labor force and employment forecasts are similar to the unrestricted VAR forecasts. This is not surprising, as the trends driving these variables are identified to be more volatile.

^13 In the figure, I compare the results with VAR(1) forecasts. Higher-order VARs can improve the forecast at short horizons (1-2 quarters), but fail completely at horizons greater than 3 quarters. The lucid paper by Tiao and Xu (1993) provides intuition for why this may be the case.

I use the model for a set of experiments. As the model is cast in the state space framework, forecasts can easily be conditioned on a variable or on shocks. Figure A.12 shows what the model forecast made 8 quarters ago would have looked like if the decline in real activity had been known.
The unconditional forecast (green line) shows a prediction for 12 quarters for all model variables (plus two additional derived variables^14) based on the knowledge of the data of that time. The conditional forecast shows the same forecast conditional on the actually realized output. It is interesting to note two counter-factual features: (i) the model would predict a more dramatic fall in real wages followed by a rapid recovery, and (ii) the model is accurate in predicting the total hours worked, but fails in decomposing the total hours into employment and hours per employee. Since both features were widely discussed during the forecasting process, the model confirms that the behavior of these variables is somewhat unusual from the historical perspective.

4 The Model in the Data-Rich Environment

This section discusses the extension of the model to the data-rich environment. The basic idea is to consider a large set of economic time series to gain additional information, which can increase the forecasting power of the model in real time. The additional series include both alternative definitions of model variables (such as alternative measures of inflation or of employment) and series not directly related to the model variables that may nevertheless improve filtering and forecasting (this is the case of so-called leading indicators, such as various measures of economic sentiment, import and energy prices, asset prices, or sectoral data).

Several papers that deal with 'data-rich' environments have appeared; see Bernanke et al (2005) for an atheoretical model, Boivin and Giannoni (2006) for DSGE modeling in a data-rich environment, or Moench (2008) for a financial application. My approach here is closest to Bernanke et al (2005). However, because of the short time series for the Czech economy, I need a parsimonious specification of the extension and a good initial guess. I assume that the vector of additional variables^15 $z_t$ can be decomposed into a part, which can
I assume that the vector of additional variables^15 $z_t$ can be decomposed into a part that can be explained by a combination of the dynamics of the states of the original model $x_t$ and of hidden variables (factors) unrelated to the original model, denoted $\phi_t$. Thus, the dynamics of $z_t$ can be written as follows:

$$z_t = \Psi_x x_t + \Psi_\phi \phi_t + u_{zt},$$

where $u_{zt}$ is the idiosyncratic noise of the individual series, and the matrices $\Psi_x$ and $\Psi_\phi$ project $z_t$ on $x_t$ and $\phi_t$.

14 The linearity of the framework makes it easy to derive forecasts, shock decompositions, filtered estimates, confidence intervals etc. for any linear combination of model variables. Using the unscented transform, even non-linear transformations can be handled.

15 The list of additional variables used for this exercise is provided in Appendix A.

To build the state space for the extended model, it is necessary to estimate the matrices $\Psi_x$, $\Psi_\phi$ and $S_z$. In principle, the matrix $\Psi_x$ could be estimated by least-squares projections, since $\phi_t$ is assumed to be unrelated (and hence orthogonal) to the states $x_t$. This does not, however, work in practice, as the projection matrix is badly conditioned, which is not surprising given the large number of series in $z_t$ and in $x_t$. Hence, I estimate the factor structure^16 for $x_t$ and $z_t$:

$$x_t = \Lambda_x \phi^x_t, \qquad z_t = \Lambda_z \phi^z_t,$$

where the dimensions of $\phi^x_t$ and $\phi^z_t$ are much smaller than those of $x_t$ and $z_t$, and $\Lambda_x$ and $\Lambda_z$ are loading matrices. Then, I set

$$\Psi_x = \Lambda_z \Pi \Lambda_x^{+},$$

where $\Pi$ is the least-squares projection of $\phi^z_t$ on $\phi^x_t$ and the cross superscript denotes the pseudo-inverse matrix. I set $\phi_t$ equal to the residual of the projection of $z_t$ on $x_t$, and the whole approximation reads:

$$\phi_t = \Lambda_z^{+} \left[ z_t - \Lambda_z \Pi \Lambda_x^{+} x_t \right] = \Lambda_z^{+} \left[ z_t - \Psi_x x_t \right],$$
$$u_{zt} = z_t - \Psi_x x_t - \Psi_\phi \phi_t.$$

Finally, I investigate the second-order properties of $\phi_t$ and of $u_{zt}$.
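The projection steps described above can be sketched in a few lines. This is a hedged illustration rather than the estimation procedure used in the paper: it assumes one factor per block, demeaned series without missing values, and extracts the loadings by power iteration instead of the maximum likelihood approach cited in the text. The function names (`first_factor`, `projection_matrix`) and the toy data are hypothetical.

```python
def first_factor(data, iterations=200):
    """One-factor approximation data_t ~ loading * factor_t.

    The loading is the leading eigenvector of the second-moment matrix,
    found by power iteration. Series are assumed to be demeaned and not
    identically zero, and the starting vector must not be orthogonal to
    the leading eigenvector.
    """
    n = len(data[0])
    M = [[sum(row[i] * row[j] for row in data) for j in range(n)]
         for i in range(n)]
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    # v has unit length, so its pseudo-inverse is simply its transpose
    factor = [sum(v[i] * row[i] for i in range(n)) for row in data]
    return v, factor


def projection_matrix(x_data, z_data):
    """Approximate the projection of z_t on x_t through the factors:
    (loading of z) * (LS coefficient of z-factor on x-factor) * pinv(loading of x).
    """
    lam_x, f_x = first_factor(x_data)
    lam_z, f_z = first_factor(z_data)
    # Least-squares coefficient of the z-factor on the x-factor
    coef = sum(a * b for a, b in zip(f_z, f_x)) / sum(a * a for a in f_x)
    return [[lz * coef * lx for lx in lam_x] for lz in lam_z]


# Toy data: both blocks are driven by the same scalar series
common = [1.0, -2.0, 0.5, 1.5, -1.0]
x_data = [[v, 2.0 * v] for v in common]
z_data = [[3.0 * v, -v, 0.5 * v] for v in common]

psi_x = projection_matrix(x_data, z_data)
# Part of z_t not explained by x_t (zero here by construction)
residuals = [[z - sum(p * xv for p, xv in zip(psi_row, x_row))
              for psi_row, z in zip(psi_x, z_row)]
             for x_row, z_row in zip(x_data, z_data)]
```

Working through the low-dimensional factors means only a small regression has to be estimated, which is the reason given in the text for avoiding the direct, badly conditioned projection of $z_t$ on $x_t$. With real data the residuals would not vanish, and they would be the input for the hidden factors unrelated to the original model.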
It seems that the vector $\phi_t$ can be described as a stationary VAR(1) process and that $u_{zt}$ is a serially uncorrelated noise (with some intra-temporal cross-correlation between series):

$$\phi_{t+1} = \Phi \phi_t + S_\phi \varepsilon^\phi_{t+1}, \qquad \varepsilon^\phi_{t+1} \sim \text{i.i.d. } N(0, I), \qquad u_{zt} \sim \text{i.i.d. } N(0, S_z S_z^T),$$

for some suitable matrix $\Phi$ (with eigenvalues less than one in modulus). Based on these findings, I formulate the extended model as follows:

$$\begin{pmatrix} x_{t+1} \\ \phi_{t+1} \end{pmatrix} = \begin{pmatrix} \tilde{A} & 0 \\ 0 & \Phi \end{pmatrix} \begin{pmatrix} x_t \\ \phi_t \end{pmatrix} + \begin{pmatrix} \tilde{S}_x & 0 \\ 0 & S_\phi \end{pmatrix} \varepsilon_{t+1}, \qquad \begin{pmatrix} y_t \\ z_t \end{pmatrix} = \begin{pmatrix} \tilde{C} & 0 \\ \Psi_x & \Psi_\phi \end{pmatrix} \begin{pmatrix} x_t \\ \phi_t \end{pmatrix} + \begin{pmatrix} \tilde{S}_v & 0 \\ 0 & S_z \end{pmatrix} \begin{pmatrix} v_t \\ \bar{u}_{zt} \end{pmatrix}, \qquad (8)$$

where $\bar{u}_{zt}$ is the 'white' version of $u_{zt}$ (i.e., $u_{zt} = S_z \bar{u}_{zt}$), and the tilde variables correspond to the system matrices given in the original model (6).

16 Since there are a lot of missing data in $z_t$, I use the estimation approach suggested by Bańbura and Modugno (2010).

In the extended model, I first use the values from the original model plus the values obtained as described above. Then, I optimize these values to increase the forecasting power (I run a few thousand iterations of the Nelder-Mead simplex search). The changes in parameter values are not large and mainly affect the entries in the covariance matrices.

One issue is worth discussing. The dynamics of the unobserved states $x_t$ and $\phi_t$ are unrelated and the observable variables $y_t$ do not depend on $\phi_t$. The reader may then wonder what the rationale behind such a formulation is and whether observing $z_t$ provides any additional information about $y_t$ at all. The answer to the first point is that $\phi_t$ is the part of the $z_t$ dynamics that cannot be related to the dynamics of $x_t$; in other words, the vector $\phi_t$ summarizes information about $z_t$ not contained in $x_t$ or in $y_t$. This does not mean that the author of this paper believes that an innovation to, say, an asset price originating outside the labor market has no impact on labor market variables. That would be crazy in light of the recent recession. But the issue is that the decomposition of states here reflects purely statistical properties, not economic properties.
Hence the structure of the extended model reflects the need for parsimony in describing the data generating process, not the economic structure. In fact, there exist a huge number of other models that are observationally equivalent to the described structure of the extended model but would imply different interpretations of shocks. Therefore, the exercise with the extended model version in this section should be interpreted as purely reduced-form. To answer the second question, note that observations of $z_t$ alone (even if $y_t$ were not observed) provide information about both $x_t$ and $\phi_t$. Hence, even the formulation used here can increase our information about $x_t$ and thereby improve our forecast of $y_t$.

Figure ?? compares the forecast performance of the original and the extended model. One can see that the forecasting improvement over the original model is marginal and limited to a subset of variables. One possible reason is the short span of available data. Nevertheless, the framework built can be used to test the usefulness of the 'data rich' environment in the future.

5 Conclusion

In this paper, I propose an empirical small labor-market model for the Czech economy. The model allows for consistent filtering of trends and cycles and for short-run forecasting. I show that the filtered trends differ from those based on the HP filter and propose an explanation for the difference. The paper then discusses the second moments of the data and assesses the ability of the model to fit them. I argue that some features of aggregate labor market data may be difficult to rationalize by any model that is neoclassical in the long run, i.e. in which trend unemployment is given by external (structural) features. Therefore, I think that economists may find it worth considering alternative theories to the neoclassical approach, which is used to pin down the long-run dynamics even in new Keynesian models.
Despite this, the presented model can be used as an independent tool for short-run forecasting. I document that the model outperforms VAR models, especially at horizons of 2 to 4 quarters. The state space formulation of the model allows for a simple treatment of missing data or asynchronous data releases. Finally, I consider the extension to the data-rich environment and show that such an extension does not bring much benefit for the case considered.

The analysis in the paper can be extended in several dimensions. First, the model can be extended to jointly filter the trend-cyclical components and seasonally adjust the data. In the present version, the data are seasonally adjusted outside the model. Joint filtering and seasonal adjustment may increase the model's efficiency, as model-based adjustment can respect theoretical restrictions. Second, the short-run dynamics of the present model are data driven. It may be useful to add more economic structure to the model. This is true not only for the short-run dynamics: detailed modeling of the labor force, in terms of exogenous demographic factors and possibly endogenous participation rates, may be useful too. Modeling participation rates is, on the other hand, challenging, as Czech households have in the past 10 years witnessed many changes in policy-induced incentives to enter or leave the labor force. This is especially true for marginal groups such as students or retired persons. Finally, the reader may have noticed that the model does not allow for correlation between shocks to the trends and to the cyclical part of the model. As Morley et al (2003) show, the possibility of correlation between these two kinds of shocks may have important implications for the trend-cycle decomposition. Andrle (2008) moreover argues that it is plausible from an economic point of view to allow for such correlations.
Nevertheless, it would be almost impossible to estimate and interpret such correlations in an atheoretical model, and therefore I leave this possibility aside in this version. The decomposition provided in this paper should therefore be interpreted from the statistical point of view. Still, I think that even such a statistical decomposition can be useful, as confronting its properties with those in the data can teach interesting lessons.

A Variables used in the extended model

I considered a large set (> 100) of variables that could enter the data-rich extension of the model. However, I used only those that have less than 25% missing data and that are available at quarterly frequency. The resulting series are (the data source is given in parentheses):

• employment (VSPS);
• consumption deflator (national accounts);
• export deflator (national accounts);
• import deflator (national accounts);
• real consumption (national accounts);
• real exports (national accounts);
• real imports (national accounts);
• energy import price (Czech Statistical Office);
• other import price (Czech Statistical Office);
• factors limiting the production - insufficient demand (Eurostat);
• factors limiting the production - labour (Eurostat);
• factors limiting the production - equipment (Eurostat);
• current level of capacity utilization (%) (Eurostat);
• electricity, gas, steam and air conditioning supply - number of persons employed index (Eurostat);
• electricity, gas, steam and air conditioning supply - gross wages and salaries index (Eurostat);
• capital goods - number of persons employed index (Eurostat);
• capital goods - gross wages and salaries index (Eurostat);
• consumer goods - number of persons employed index (Eurostat);
• consumer goods - gross wages and salaries index (Eurostat);
• durable consumer goods - number of persons employed index (Eurostat);
• durable consumer goods - gross wages and salaries index (Eurostat);
• intermediate goods - number of persons employed index (Eurostat);
• intermediate goods - gross wages and salaries index (Eurostat);
• non-durable consumer goods - number of persons employed index (Eurostat);
• non-durable consumer goods - gross wages and salaries index (Eurostat);
• energy - number of persons employed index (Eurostat);
• energy - gross wages and salaries index (Eurostat);
• 3M Pribor (CNB);
• 6M Pribor (CNB);
• 12M Pribor (CNB);
• construction confidence indicator (Eurostat);
• economic sentiment indicator (Eurostat);
• industrial confidence indicator (Eurostat);
• retail confidence indicator (Eurostat);
• PX 50 (Eurostat);
• Deutscher Aktienindex (Eurostat);
• Harmonized Index of Consumer Prices (Eurostat).

References

[Andrle (2008)] Andrle, M. (2008). The Role of Trends and Detrending in DSGE Models. Munich Personal RePEc Archive 13289; http://mpra.ub.uni-muenchen.de/13289/

[Babecky et al (2008)] Babecky, J., Dybczak, K., Galuscak, K. (2008). Survey on Wage and Price Formation of Czech Firms. CNB Working Paper 2008/12.

[Ball (2009)] Ball, L. (2009). Hysteresis in Unemployment: Old and New Evidence. NBER Working Paper 14818.

[Bańbura and Modugno (2010)] Bańbura, M., Modugno, M. (2010). Maximum Likelihood Estimation of Factor Models on Data Sets with Arbitrary Pattern of Missing Data. ECB Working Paper 1189, May 2010.

[Benes et al (2010)] Benes, J., Clinton, K., Johnson, M., Laxton, D., Matheson, T. (2010). Structural Models in Real Time. IMF Working Paper WP/10/56, March 2010.

[Bernanke et al (2005)] Bernanke, B. S., Boivin, J., Eliasz, P. (2005). Measuring the Effects of Monetary Policy: A Factor-Augmented Vector Autoregressive (FAVAR) Approach. Quarterly Journal of Economics, February 2005, pp. 525-546.

[Bicakova et al (2006)] Bicakova, A., Slacalek, J., Slavik, M. (2006). Fiscal Implications of Personal Tax Adjustments in the Czech Republic. CNB Working Paper 7/2006.

[Boivin and Giannoni (2006)] Boivin, J., Giannoni, M.P. (2006). DSGE Models in a Data-Rich Environment.
NBER Working Paper 12772.

[Bruchez (2003)] Bruchez, P.A. (2003). A Modification of the HP Filter Aiming at Reducing the End-Point Bias. Working Paper OT/2003/3.

[Canova, Ferroni (2009)] Canova, F., Ferroni, F. (2009). Multiple filtering devices for the estimation of cyclical DSGE models. Working paper.

[Chernozhukov and Hong (2003)] Chernozhukov, V., Hong, H. (2003). An MCMC approach to classical estimation. Journal of Econometrics 115, pp. 293-346.

[Farmer (2010)] Farmer, R. (2010). Expectations, Employment and Prices. Oxford University Press.

[Galuscak and Pavel (2007)] Galuscak, K., Pavel, J. (2007). Unemployment and Inactivity Traps in the Czech Republic: Incentive Effects of Policies. CNB Working Paper 2007/09.

[Galuscak and Munich (2007)] Galuscak, K., Munich, D. (2007). Structural and Cyclical Unemployment: What Can Be Derived from the Matching Function. Czech Journal of Economics and Finance 57, 3-4, pp.

[Hamilton (1994)] Hamilton, J. (1994). Time Series Analysis. Princeton University Press.

[Harvey (1989)] Harvey, A.C. (1989). Forecasting, structural time series and the Kalman filter. Cambridge University Press, Cambridge.

[Harvey, Jaeger (1993)] Harvey, A.C., Jaeger, A. (1993). Detrending, Stylized Facts, and the Business Cycle. Journal of Applied Econometrics 8(3), pp. 231-247.

[Hurnik, Navratil (2005)] Hurnik, J., Navratil, D. (2005). Labor Market Performance and Macroeconomic Policy: The Time-Varying NAIRU in the Czech Republic. Czech Journal of Economics and Finance 55, pp. 25-40.

[King (2005)] King, T.B. (2005). Labor Productivity and Job-Market Flows: Trends, Cycles, and Correlations. Federal Reserve Bank of St. Louis Supervisory Policy Analysis Working Paper 2005-04.

[Koopman, Harvey (2003)] Koopman, S.J., Harvey, A.C. (2003). Computing observation weights for signal extraction and filtering. Journal of Economic Dynamics and Control 27, pp. 1317-1333.

[Leser (1961)] Leser, C.E.V. (1961). A Simple Method of Trend Construction.
Journal of the Royal Statistical Society B 23, pp. 91-107.

[Moench (2008)] Moench, E. (2008). Forecasting the yield curve in a data-rich environment: A no-arbitrage factor-augmented VAR approach. Journal of Econometrics 146, pp. 26-43.

[Morley et al (2003)] Morley, J., Nelson, C., Zivot, E. (2003). Why are the Beveridge-Nelson and Unobserved Components Decompositions of GDP so Different? The Review of Economics and Statistics LXXXV, pp. 235-243.

[Proietti, Musso (2007)] Proietti, T., Musso, A. (2007). Growth Accounting for the Euro Area: A Structural Approach. ECB Working Paper 804, August 2007.

[Tiao, Xu (1993)] Tiao, G., Xu, D. (1993). Robustness of maximum likelihood estimates for multi-step predictions: the exponential smoothing case. Biometrika 80, pp. 623-641.

Fig. A.1. Model-based filtering

Fig. A.2. Recursive identification of trends (panels: growth rate in trend of output; growth rate in trend inflation; horizontal axis: 1998 to 2010)

Table A.1 Long-run elasticities

                                        Trend in
Variable              Labor force  Labor productivity  Unemployment  Hours worked  Inflation
Labor force l_t            1               0                0             0            0
Real output y_t            1               1               -1             1            0
Employment e_t             1               0               -1             0            0
Nominal wage w_t           0               1                0             0            1
Hours worked h_t           0               0                0             1            0
GDP deflator p_t           0               0                0             0            1

Fig. A.4. Co-spectra and quadrature spectra of selected variables

Fig. A.5. Coherence of selected variables

Fig. A.6. Correlation of selected variables at various lags at the business cycle frequencies

Fig. A.7. Correlation of selected variables at various lags at low frequencies

Fig. A.8. Inverse filter for trends

Fig. A.9. Inverse filter for cycles

Fig. A.10. Recursive forecast for model variables

Fig. A.11. Comparison of RMSE for various models (panels: relative RMSE of labor force, employment, hours per employee, GDP deflator, real output, and real wage (h); horizontal axis: forecast horizon; series: random walk relative to the model, VAR(1) relative to the model)

Fig. A.12. Unconditional versus conditional forecasting: the case of economic crisis (panels: conditional forecast of labor force, real output, employment, hours per employee, nominal wage, real wage (h), GDP deflator, and total hours; horizontal axis: 2007.5 to 2011.5; legend includes data and unconditional forecast)