Forecasting with Unobserved Heterogeneity

Matteo Richiardi (a, b)
(a) University of Turin, Department of Economics, via Po 53, 10124 Torino.
(b) LABORatorio R. Revelli, Collegio Carlo Alberto, via Real Collegio 30, 10024 Moncalieri, Torino.

May 9, 2012

Abstract

Forecasting based on random intercept models
requires imputing the individual permanent effects to the simulated individuals. When these individuals enter the simulation with a history of past outcomes, this involves sampling from conditional distributions, which might be infeasible. I present a method for drawing individual permanent effects from a conditional distribution which only requires inverting the corresponding estimated unconditional distribution. While the algorithms currently available in the literature require polynomial time, the proposed method only requires matching two ranks and therefore works in N ln N time.

Keywords: forecasting, microsimulation, random intercept models, unobserved heterogeneity

JEL codes: C15 (Statistical Simulation Methods: General), C53 (Forecasting Models; Simulation Methods), C63 (Computational Techniques; Simulation Modeling)

Email: matteo.richiardi@unito.it

1 Introduction

In this paper I present a method for imputing individual-specific effects to a population that is different from the one on which these effects have been estimated. The method is relevant for projecting forward in time models with unobserved heterogeneity (UH). UH generally comes in the form of fixed effects or random effects models, where the deviations from the conditional expectation function are specified as

$$ e_{i,t} = \alpha_i + u_{i,t} \tag{1} $$

where $\alpha_i$ is the individual-specific effect, that is, unobserved permanent heterogeneity, and $u_{i,t}$ is a random component. Fixed effects and random effects models differ with respect to the assumptions made on the correlation between the individual-specific effect and the other covariates [1], but they both assume that the unobserved individual component is constant over time: any variable component ends up in the random term, if unpredictable, or in the covariates $X$, if predictable. For this reason, these models are often referred to as random intercept models.

Assuming the processes are correctly specified, the techniques used to estimate these models allow us to recover unbiased coefficient estimates, which can then be used for prediction. If we are only concerned with projecting the estimation sample forward in time, the estimated individual intercepts will then be treated as additional covariates. [2] However, an issue arises if the estimates have to be applied to a different population, for which the individual effects are (by definition) unobservable. [3] Imputing the missing variables in the simulation sample from their estimated counterparts in the estimation sample by means of standard imputation techniques (Pickles, 2005; Howell, 2008) is at best problematic, as the random intercepts and the observable explanatory variables can in principle be unrelated (as assumed in random effects models). Should we then set the random intercepts to zero, and take into account only the observables in simulating the outcomes of interest? This is often done and, as we will see, can be justified on some grounds. But for many practical purposes this is not satisfactory, and we should assign a random intercept, that is, an unobserved permanent effect, to each new individual in the simulation sample.

[1] See Honore (2002) for a comprehensive discussion on the choice between these two approaches.
[2] This is generally done automatically by standard statistical packages, e.g. with the predict post-estimation command in Stata.
[3] In this case, the out-of-sample nature of prediction regards both time and the units of analysis.

If the new individuals enter the simulation without a history of previous outcomes, the solution is straightforward, and involves sampling from the unconditional distribution of the estimated random intercepts. If, on the other hand, the new individuals come with a history of previous outcomes, the posterior distribution of heterogeneity given the observed past outcomes should be used, in a Bayesian perspective. However, these conditional distributions might be difficult to derive analytically, difficult to invert or computationally burdensome to sample from empirically. As an alternative to the Bayesian approach, an optimal assignment approach can be followed (Panis, 2003). This involves assigning each individual the value for
unobserved heterogeneity that best matches that individual's observed past outcomes. Solving the optimal assignment problem requires finding the distribution of individual effects that maximizes the likelihood of observing the true data. Methods for solving the optimal assignment problem can be borrowed from the linear programming literature. However, they work in polynomial time. Indeed, a widely held conjecture in that literature is that it is not possible to solve the optimal assignment problem in less than polynomial time. This might be an impediment in forecasting exercises that involve hundreds of thousands or even millions of individuals, as is common for instance in dynamic microsimulation models (Li and O'Donoghue, 2012). By exploiting the properties of the likelihood function, I develop a more efficient method that works in N ln N time, where N is the number of individual effects to be assigned. This method can be applied to a wide variety of models, linear and non-linear, continuous and discrete.

The paper proceeds as follows. Section 2 discusses the relevance of the setting to which these notes apply, that is, when the estimation sample and the simulation sample do not coincide and new individuals enter the simulation with a previous history of outcomes. Section 3 introduces two illustrative models that will be used later in the discussion (a continuous response linear model and a binary response latent variable model) and discusses whether and when simulating UH leads to better forecasts. Section 4 describes the Bayesian approach to the imputation of the random intercepts, which involves deriving the conditional distributions and then sampling via the Inverse Transform method (IT). Section 5 explains how the problem is approached in a linear programming setting. Section 6 presents the Rank method and shows that it indeed solves the optimal assignment problem, in a probabilistic sense. Section 7 discusses a drawback of the Rank method in nonlinear applications (as in the probit/logit models), and assesses it by means of a Monte Carlo analysis. Section 8 concludes.

2 Relevance of the problem

The problem of imputing the random intercepts arises, as we have seen, when the estimates have to be applied for projecting forward in time a different population, and individuals enter the simulation with a previous history of outcomes. Indeed, this is a rather common situation, for instance in dynamic microsimulation. First, microsimulation models generally include different processes (like schooling, household formation, labor market transitions, retirement, etc.): it is quite unlikely that a single dataset exists with all the relevant variables so that it can be used for estimation of all processes and as a basis for the simulation; this is all the more so as models with random intercepts require panel data for estimation. A more common situation is to estimate different processes on different datasets, and then apply the estimated coefficients to an initial
population to be simulated forward in time. True, to simulate we need information on all the variables included in the empirical specifications, but this falls short of requiring the union of all the datasets used for estimation, as (i) we do not need the longitudinal dimension required for dealing with unobserved heterogeneity, (ii) we do not even need retrospective information if the empirical specifications only include first-order lags, as the base year values will become lagged values in the first year of the simulation, and (iii) we might impute missing information from other donor datasets. [4]

Second, even when the initial population coincides with the estimation sample, it is often the case that it needs to be expanded over time, for instance to include partners and offspring. These new individuals entering the simulation might come with a previous history of outcomes as well:
not only in the case of spouses, but also of newborns. If the latter sounds puzzling, consider that many datasets register information only for individuals above a minimum age: for instance, EU-SILC has data only for those aged 16+. Given the wide coverage of the EU-SILC survey, this is a likely feeder of a microsimulation model for European countries. In this case, newborns enter the microsimulation at age 16, after having already experienced meaningful education and labor market choices/lotteries.

Finally, even with cohort models where the evolution of a single cohort of individuals is simulated through time, it is quite likely that the cohort is not observed from birth, so that individuals enter the microsimulation with a previous history of outcomes.

[4] Conversely, estimating the models on imputed data would impinge on the properties of the estimates (Rubin, 1976; Little and Rubin, 1987).

3 Two benchmark cases: linear and discrete choice models

As an illustration, consider two benchmark cases. Eq. 2 refers to a simple linear model, where the (observed) continuous response is $y^*$. If we assume, conversely, that $y^*$ is a latent (unobserved) variable, and that only a discrete 0-1 outcome $y$ can be observed, we get the standard binary response model (eq. 3).

$$ y^*_{i,t} = \beta x_{i,t} + u_{i,t} + \alpha_i \tag{2} $$

with $E(U) = E(A) = 0$, and

$$ y_{i,t} = \begin{cases} 1 & \text{if } y^*_{i,t} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{3} $$

Hereafter, I'll refer to the two models above by assuming that either $y^*$ (continuous response model) or $y$ (binary response model) is observable. Estimates $\hat\beta$ for the effect of the explanatory variables $X$, and for the parameters governing the $A$ and $U$ distributions, are obtained, together with individual intercepts $\hat\alpha_i$. [5] The model must then be applied to a different population $j = 1, \dots, N$, for which we know, at the beginning of the simulation at time $s = 0$, only the observable characteristics $x_{j,0}$ and possibly $y_{j,0}$. While $\hat\beta$ can be directly used to construct the predicted outcome, a problem arises in assigning each simulated individual $j$ a specific $\hat\alpha_j$. A simple solution is to set $\hat\alpha_j = 0$ for each individual in the simulation sample. I'll now discuss its implications for the projected outcomes. For the sake of simplicity, in this discussion I will assume $\hat\beta = \beta = 0$.

3.1 What if unobserved heterogeneity is neglected in forecasting?

A simple but at first sight counterintuitive result is that neglecting UH is good if the goal is to forecast the trend, that is, the average value of the outcome. To see why, assume the forecasts are evaluated on the basis of the mean squared error between the predicted and the actual outcome, at some future time $s$:

$$ MSE(s) = \sum_j (\hat\omega_{j,s} - \omega_{j,s})^2 / N \tag{4} $$

where $\omega = y^*, y$. We want to minimize the expected value of the MSE. In the linear case of eq. 2, we get, in expected terms: [6]

$$ E(Y - \hat Y)^2 = E(U + A - \hat A)^2 = E U^2 + E(A - \hat A)^2 = \sigma_U^2 + 2(\sigma_A^2 - \sigma_{A,\hat A}) \geq \sigma_U^2 \tag{5} $$

where $\sigma_{A,\hat A}$ is the covariance between the true and the estimated random intercepts, which is at most equal to $\sigma_A^2$ (when the random intercepts are perfectly estimated). Hence, setting $\hat A = 0$ leads to more accurate predictions of the evolution of the aggregate outcome. The discrete case is a bit more involved, but the intuition still goes through. [7]

However, the goal of microsimulation models is to forecast distributions, rather than averages (Orcutt, 1957). Consequently, it is the likelihood of individual longitudinal trajectories that matters the most in microsimulation modeling. What we want, therefore, is to get the covariance $Cov(\omega_{j,s}, \omega_{j,s+1})$ between individual outcomes at different periods right. For the linear case, the expected value of this covariance is

$$ E(Y_{j,s} - EY_{j,s})(Y_{j,s+1} - EY_{j,s+1}) = \alpha_j^2 \tag{6} $$

Clearly, setting $\hat A = 0$ leads to less accurate predictions of the autocorrelation structure, if $\hat A$ has any predictive power ($\sigma_{A,\hat A} > 0$). Again, the discrete case is more complicated, but the intuition remains the same.

[5] Under the standard assumptions of a homoscedastic Gaussian error term $U$, $\hat\sigma_U$ is obtained. In random effects models ($A$ is normally distributed and independent of $X$), $\hat\sigma_A$ is obtained. In fixed effects models, a non-parametric distribution for the random intercepts can be recovered.
[6] I denote random variables with capital letters, and their realization for a specific individual $j$ with small letters and the index $j$.
[7] Note that what I'm discussing here is setting the $\hat\alpha_j$ to 0 but using the UH-corrected coefficient $\hat\beta$, not estimating the model without UH as Panis (2003) does, to provide a benchmark to the projections with imputed $\hat\alpha_j$.

4 Sampling from the conditional distributions via the Inverse Transform method

According to the Bayesian approach, the correct way of assigning random intercepts $\hat\alpha_j$ to individuals with a previous history of outcomes $B$ is sampling from the estimated distribution of $A$, conditional on $B$. Applying Bayes' theorem:

$$ f_{A|B}(\alpha) = \frac{f_A(A = \alpha) \Pr(B \mid A)}{\Pr(B)} = \frac{f_A(A = \alpha) \Pr_U(B)}{\Pr_E(B)} \tag{7} $$

where $f_{A|B}$ and $f_A$ are respectively the conditional and the unconditional distribution of $A$ with respect to $B$, and $\Pr_U(B)$ and $\Pr_E(B)$ the conditional and unconditional distribution of $B$ with respect to $\alpha$. Sampling from this
distribution can be done by the Inverse Transform method: a random number $r$ is extracted from the uniform distribution on $[0, 1]$, which gives the value of the conditional cumulative distribution function $F_{A|B}(\alpha) = \int f_{A|B}(\alpha)\,d\alpha$; $\hat\alpha$ is then assigned by inverting this function, using the estimated distribution of $A$, $\hat f_{A|B}$:

$$ r \sim U(0, 1) \;\Rightarrow\; \hat\alpha = \hat F^{-1}_{A|B}(r) \tag{8} $$

In our continuous response linear model the conditional PDF is:

$$ f_{A|Y=k}(\alpha, k) = \frac{f_A(A = \alpha)\, f_U(U + \alpha = k)}{f_E(E = k)} = \frac{f_A(\alpha)\, f_U(k - \alpha)}{f_E(k)} \tag{9} $$

while in the binary response model the conditional PDFs are: [8]

$$ f_{A|Y=1}(\alpha) = \frac{f_A(A = \alpha) \Pr(U + \alpha > 0)}{\Pr(U + A > 0)} = \frac{f_A(\alpha) F_U(\alpha)}{1 - F_E(0)} = 2 f_A(\alpha) F_U(\alpha) \tag{10a} $$

$$ f_{A|Y=0}(\alpha) = \frac{f_A(A = \alpha) \Pr(U + \alpha \leq 0)}{\Pr(U + A \leq 0)} = \frac{f_A(\alpha) (1 - F_U(\alpha))}{F_E(0)} = 2 f_A(\alpha) (1 - F_U(\alpha)) \tag{10b} $$

Consequently, the conditional CDFs are:

$$ F_{A|Y=k}(\alpha, k) = \int \frac{f_A(\alpha)\, f_U(k - \alpha)}{f_E(k)}\,d\alpha \tag{11} $$

for the linear case, and

$$ F_{A|Y=1}(\alpha) = \int 2 f_A(\alpha) F_U(\alpha)\,d\alpha \tag{12a} $$

$$ F_{A|Y=0}(\alpha) = \int 2 f_A(\alpha) (1 - F_U(\alpha))\,d\alpha \tag{12b} $$

for the binary case.

These are the functions to be inverted, which might prove no easy task for most distributional assumptions about the error terms, even in our simple case with no explanatory variables.

[8] Assuming symmetry around the mean 0 of the error terms.

5 Optimal assignment

Solving the optimal assignment pr
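
As a rough numerical illustration of the Inverse Transform step of Section 4 (a uniform draw pushed through the inverse of the conditional CDF of A given Y = 1 in the binary response model with no covariates), the following Python sketch may help. It is only a sketch under stated assumptions: A and U are taken to be Gaussian with hypothetical standard deviations `sigma_a` and `sigma_u`, the conditional CDF is tabulated on a truncated grid rather than derived analytically, and the function names are illustrative and not part of the paper.

```python
import math
import numpy as np


def normal_pdf(x, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, sigma):
    """CDF of a zero-mean normal with standard deviation sigma."""
    erf = np.vectorize(math.erf)
    return 0.5 * (1.0 + erf(x / (sigma * math.sqrt(2.0))))


def draw_alpha_given_y1(n, sigma_a=1.0, sigma_u=1.0, grid_size=4001, seed=0):
    """Draw n intercepts from f_{A|Y=1} by inverting a tabulated CDF.

    Assumptions (not from the paper): Gaussian A and U, and a grid
    truncated at +/- 6 standard deviations of A.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(-6.0 * sigma_a, 6.0 * sigma_a, grid_size)

    # Conditional density 2 f_A(a) F_U(a), valid under symmetry of the errors.
    pdf = 2.0 * normal_pdf(grid, sigma_a) * normal_cdf(grid, sigma_u)

    # Conditional CDF by cumulative trapezoidal integration of the density.
    steps = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]  # absorb the small truncation and discretization error

    # Inverse Transform: r ~ U(0, 1), alpha = F^{-1}(r) by linear interpolation.
    r = rng.uniform(0.0, 1.0, size=n)
    return np.interp(r, cdf, grid)


draws = draw_alpha_given_y1(200_000)
print(draws.mean())
```

With `sigma_a = sigma_u = 1`, the sample mean of the draws should be close to 0.56, the mean of A given U + A > 0 under these Gaussian assumptions. The tabulate-and-interpolate step stands in for an analytical inverse, which is the part the text notes may be hard to obtain.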
