IZA DP No. 996

Bayesian Inference for Duration Data with Unobserved and Unknown Heterogeneity: Monte Carlo Evidence and an Application

M. Daniele Paserman
Hebrew University and IZA Bonn

January 2004

Forschungsinstitut zur Zukunft der Arbeit / Institute for the Study of Labor
Discussion Paper Series

IZA, P.O. Box 7240, D-53072 Bonn, Germany
Tel.: +49-228-3894-0, Fax: +49-228-3894-210, Email: iza@iza.org

This paper can be downloaded without charge at: http:/
An index to IZA Discussion Papers is located at: http://www.iza.org/publications/dps/

This Discussion Paper is issued within the framework of IZA's research area General Labor Economics. Any opinions expressed here are those of the author(s) and not those of the institute. Research disseminated by IZA may include views on policy, but the institute itself takes no institutional policy positions.

The Institute for the Study of Labor (IZA) in Bonn is a local and virtual international research center and a place of communication between science, politics and business. IZA is an independent, nonprofit limited liability company (Gesellschaft mit beschränkter Haftung) supported by Deutsche Post World Net. The center is associated with the University of Bonn and offers a stimulating research environment through its research networks, research support, and visitors and doctoral programs. IZA engages in (i) original and internationally competitive research in all fields of labor economics, (ii) development of policy concepts, and (iii) dissemination of research results and concepts to the interested public. The current research program deals with (1) mobility and flexibility of labor, (2) internationalization of labor markets, (3) welfare state and labor market, (4) labor markets in transition countries, (5) the future of labor, (6) evaluation of labor market policies and projects and (7) general labor economics.

IZA Discussion Papers often represent
preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available on the IZA website (www.iza.org) or directly from the author.

ABSTRACT

This paper describes a semiparametric Bayesian method for analyzing duration data. The proposed estimator specifies a complete functional form for duration spells, but allows flexibility by introducing an individual heterogeneity term, which follows a Dirichlet mixture distribution. I show how to obtain predictive distributions for duration data that correctly account for the uncertainty present in the model. I also directly compare the performance of the proposed estimator with Heckman and Singer's (1984) Non Parametric Maximum Likelihood Estimator (NPMLE). The methodology is applied to the analysis of youth unemployment spells. Compared to the NPMLE, the proposed estimator reflects more accurately the uncertainty surrounding the heterogeneity distribution.

JEL Classification: C11, C41

Keywords: duration data, Dirichlet process, Bayesian inference, Markov chain Monte Carlo simulation

M. Daniele Paserman
Department of Economics, Hebrew University, Jerusalem, 91905, Israel
Email: dpaserma@shum.huji.ac.il

I wish to thank Gary Chamberlain and Keisuke Hirano for many lengthy discussions. Caroline Hoxby, Guido Imbens, Larry Katz, Jack Porter, Tiemen Woutersen, and participants in seminars at Harvard University, Hebrew University, and at the Second Haifa Winter Workshop on Computer Science and Statistics (CSStat 2003) also provided helpful suggestions. All errors are my own.

1. Introduction

This paper develops a semiparametric Bayesian methodology for analyzing duration data. The methodology specifies a hazard model belonging to a parametric family, and allows a flexible distribution for a residual heterogeneity term, by modeling it as a mixture of Dirichlet processes (Ferguson,
1973; Antoniak, 1974). Markov chain Monte Carlo methods are then used to simulate posterior quantities for parameters of interest, and to generate predictive distributions.

It is common to model duration data as a combination of a baseline hazard and a mixing distribution, and to interpret the baseline hazard as representing structural duration dependence, and the mixing distribution as unobserved heterogeneity. It is well known that parameter estimates in this model are sensitive to the assumptions made about the mixing distribution. In an important contribution, Heckman and Singer (1984) propose a Non Parametric Maximum Likelihood Estimator (NPMLE) that overcomes the excessive sensitivity of parameter estimates to assumptions about the distribution of residual heterogeneity. The NPMLE specifies a hazard function up to a finite number of unknown parameters, and then lets the heterogeneity term follow a discrete mixture structure. This estimator has been used frequently in the literature to model unobserved heterogeneity in a variety of settings: unemployment duration for Canadian men (Ham and Rea, 1987); the effects of training on the length of unemployment and employment spells in an experimental study (Ham and Lalonde, 1996); welfare spells (Blank, 1989); and transitions in and out of poverty (Stevens, 1999).

The NPMLE performs well in estimating the structural parameters of the duration model, but proves to be an unreliable guide to the shape of the true mixing distribution of unobservables. In the case where the number of mixture points is unknown, a distribution theory for the NPMLE has not yet, to the best of my knowledge, been developed.[1]

The estimator proposed here addresses the shortcomings of the NPMLE. The Bayesian approach enables one to obtain, conditional on the prior distribution, exact finite sample posterior probability intervals for the parameters of interest that correctly account for the uncertainty present in the model.[2] The Dirichlet process is a prior on the space of distribution functions, and allows flexibility in
the heterogeneity distribution: this can be multimodal, skewed, or fat-tailed. The posterior distribution is a mixture of a continuous density and a discrete density. Importantly, the algorithm used to obtain posterior distributions of the parameters of interest also generates a posterior distribution for the number of mass points in the heterogeneity distribution. Therefore, the marginal distribution of the parameters reflects the uncertainty surrounding the number of mixture points. This enables one to directly compare the performance of my estimator to Heckman and Singer's NPMLE.

In most applications of the Dirichlet process, the data is modeled as a normal density, mixed with respect to the distribution of the parameters. If the common prior distribution follows a Dirichlet process, then the data will come from a Dirichlet mixture of normals (Ferguson, 1983; Escobar, 1994; Escobar and West, 1995). An interesting economic application is in Hirano (2002), who uses this methodology to study the structure of earning dynamics in a longitudinal data set.

[1] Van der Vaart (1996) proves asymptotic normality for the NPMLE in certain special cases, but does not provide a general proof.
[2] In a parametric model, the 95 percent posterior probability interval does have the frequentist property that, in repeated samples, and for large sample sizes, it contains the true parameter 95% of the time. In semiparametric applications such as the one studied here, it is no longer clear that the posterior probability interval has the desired frequentist property. Nevertheless, it can still be a useful summary measure of uncertainty.

Non parametric analysis of duration data presents some peculiarities, because of the nonlinearity of the problem and because the residual heterogeneity term usually enters the model multiplicatively. The normal model is not convenient in this case. I overcome these difficulties by specifying a Weibull hazard function, and letting the heterogeneity term follow a Dirichlet mixture of Gamma distributions. The posterior distribution for the mixture density in
this case has not been previously derived.

Semiparametric Bayesian analysis for proportional hazard models has been described in Kalbfleisch and Prentice (1980). Hjort (1990) proposes a nonparametric Bayes estimator based on Beta processes. In economic applications, Ruggiero (1994) proposes a fully Bayesian estimator for the regression parameters in a proportional hazards model, by specifying a Dirichlet prior distribution for the baseline hazard, treated as a nuisance parameter. He then computes the posterior distribution of the parameter of interest, conditional on the data and integrated with respect to the nuisance parameter, and applies this methodology to an analysis of survival times of job vacancies. My approach differs in that I specify the complete distribution of duration times, up to a finite dimensional parameter vector, and allow a flexible mixture for the distribution
of the individual heterogeneity term. This allows one to generate predictive distributions for duration spells, possibly at the cost of additional functional form assumptions. My approach is similar to that developed independently by Campolieti (2001): the difference lies in the fact that Campolieti models the hazard in discrete time using a multiperiod probit model and a normal prior for the Dirichlet process. The Weibull-Gamma combination used in this paper adheres more closely to the types of models commonly analyzed in duration studies.

In addition, I present results from a small Monte
Carlo study showing that the proposed estimator has the desired frequentist properties of unbiasedness (i.e., the posterior mean approximates the true parameter value) and correct coverage rates of the posterior interval.

The rest of the paper is structured as follows: in Section 2 I first present a brief description of the Dirichlet process and discuss some of its properties; then I describe its application to the Bayesian estimation of duration data. In Section 3 I present some suggestive Monte Carlo evidence on the performance of the estimation technique on simulated data sets. Section 4
applies this methodology to an analysis of unemployment spells of young men. It also compares the performance of the proposed estimator to Heckman and Singer's NPMLE: parameter estimates and standard errors based on the Dirichlet model reflect substantially more accurately the uncertainty surrounding the distribution of unobserved heterogeneity. Section 5 concludes.

2. Dirichlet Mixture Models for Duration Data

2.1. The Dirichlet Process

The following definitions and properties of a Dirichlet process are due to Antoniak (1974).

Definition 1. Let $\Theta$ be a set, and $\mathcal{A}$ a $\sigma$-field of subsets of $\Theta$. Let $\alpha$ be a finite, non-null, non-negative, finitely additive measure on $(\Theta, \mathcal{A})$. We say a random probability measure $P$ on $(\Theta, \mathcal{A})$ is a Dirichlet process on $(\Theta, \mathcal{A})$ with base measure $\alpha$, denoted $P \in \mathcal{D}(\alpha)$, if for every $k = 1, 2, \ldots$ and measurable partition $B_1, B_2, \ldots, B_k$ of $\Theta$, the joint distribution of the random probabilities $(P(B_1), \ldots, P(B_k))$ is Dirichlet with parameters $(\alpha(B_1), \ldots, \alpha(B_k))$. (Based on Antoniak, 1974, Definition 1.)

Following are some useful properties of the Dirichlet process:

1. If $P \in \mathcal{D}(\alpha)$ and $A \in \mathcal{A}$, then $E(P(A)) = \alpha(A)/\alpha(\Theta)$.
2. If $P \in \mathcal{D}(\alpha)$ and, conditional on $P$, $\theta_1, \theta_2, \ldots, \theta_N$ are i.i.d. $P$, then $P \mid \theta_1, \theta_2, \ldots, \theta_N \in \mathcal{D}\left(\alpha + \sum_{i=1}^{N} \delta_{\theta_i}\right)$, where $\delta_x$ denotes the probability measure giving mass one to the point $x$.
3. If $P \in \mathcal{D}(\alpha)$, then $P$ is almost surely discrete.

The almost sure discreteness of the Dirichlet process is a key feature for model analysis. Suppose that $P \in \mathcal{D}(\alpha P_0)$ is a Dirichlet process defined by $\alpha$, a positive scalar, and $P_0$, a probability measure. The probability measure $P_0$ can be thought of as the prior expectation of $P$. The scalar $\alpha$ is a precision parameter which determines the prior concentration of $P$ around $P_0$. In other words, $\alpha$ represents the weight of the belief that $P$ is centered around the distribution $P_0$.

Briefly, in any sample $\theta$ of size $N$ from $P$, there is positive probability of coincident values. For any $i = 1, 2, \ldots, N$, let $\theta^{(i)}$ denote the vector $\theta$ without element $i$: $\theta^{(i)} = \{\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_N\}$. Then the conditional prior for $(\theta_i \mid \theta^{(i)})$ is

$$(\theta_i \mid \theta^{(i)}) \sim \frac{\alpha}{\alpha + N - 1} P_0 + \frac{1}{\alpha + N - 1} \sum_{j=1,\, j \neq i}^{N} \delta_{\theta_j}. \tag{1}$$

Similarly, the distribution of a new draw $(\theta_{N+1} \mid \theta)$ is given by:

$$(\theta_{N+1} \mid \theta) \sim \frac{\alpha}{\alpha + N} P_0 + \frac{1}{\alpha + N} \sum_{j=1}^{N} \delta_{\theta_j}. \tag{2}$$

Thus, given $\theta$, a sample of size $N$ from $P$, the next case $\theta_{N+1}$ represents a new, distinct value with probability $\alpha/(\alpha + N)$ and is otherwise drawn uniformly from among the first $N$ values. These first $N$ values themselves behave as described by (1) and so with positive probability reduce to $k \leq N$ distinct values. If we write the $k$ distinct values among the $N$ elements of $\theta$ as $\theta_j^*$, $j = 1, \ldots, k$, and let $N_j$ be the number of occurrences of $\theta_j^*$, then we can rewrite equation (2) as

$$(\theta_{N+1} \mid \theta) \sim \frac{\alpha}{\alpha + N} P_0 + \frac{1}{\alpha + N} \sum_{j=1}^{k} N_j \delta_{\theta_j^*}. \tag{3}$$

Antoniak (1974) summarizes the prior distribution for $k$ induced by this process, and shows that it depends critically on $\alpha$. A value of $\alpha = 1$ indicates that we are giving the prior $P_0$ the same weight as every other observation. For instance, for $N$ relatively large, $E(k \mid \alpha, N) \approx \alpha \ln(1 + N/\alpha)$; for $N$ between 50 and 250, the prior for $k$ heavily favors single digit values.

Now assume the data $t = (t_1, \ldots, t_N)$ are conditionally independent and follow a distribution with density $f(t_i \mid \theta_i)$. It then follows, from a simple application of Bayes' Theorem, that the posterior distribution of $\theta_i$ given $\theta^{(i)}$ and $t$ is

$$(\theta_i \mid \theta^{(i)}, t_i) \sim q_0 P_{1i} + \sum_{j=1,\, j \neq i}^{N} q_j \delta_{\theta_j}, \tag{4}$$

where

$$q_0 \propto \alpha \int f(t_i \mid \theta)\, dP_0; \tag{5}$$

$$q_j \propto f(t_i \mid \theta_j), \tag{6}$$

and $P_{1i}$ is the marginal posterior distribution of $\theta_i$ given the data $t$ and the prior $P_0$. This posterior distribution has an interpretation analogous to the one above: with probability proportional to $q_0$ we draw a new value of $\theta$ from the posterior distribution $P_{1i}$, and with probability proportional to $q_j$ we draw from one of the already existing values, $\theta_j$. The proportionality factor is easily obtained by noting that $q_0$ and the $q_j$, $j \neq i$, must sum to one.

The conditional distribution of $(\theta_i \mid \theta^{(i)}, t)$ is easily sampled from, given a convenient choice of the prior distribution $P_0$. Given some starting value for $\theta$ (possibly drawn from the $P_{1i}$ distribution), one can sample new elements of $\theta$ sequentially, by drawing from the distribution of $(\theta_1 \mid \theta^{(1)}, t)$, $(\theta_2 \mid \theta^{(2)}, t)$, and so on up to $(\theta_N \mid \theta^{(N)}, t)$,
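
As a rough illustration of the sequential draws in equations (4) to (6), the sketch below runs repeated scans over i = 1, ..., N. It rests on assumptions that are not taken from the paper: the density is exponential, f(t | theta) = theta * exp(-theta * t), and the prior P0 is Gamma with shape a and rate b, chosen only because the integral in (5) and the posterior P1i then have closed forms. The function name `gibbs_scan` and all numerical values are hypothetical.

```python
import math
import random


def gibbs_scan(theta, t, alpha, a, b, rng):
    """One pass over i = 1..N, redrawing each theta_i as in equation (4).

    Illustrative assumptions (not the paper's Weibull-Gamma model):
    f(t | theta) = theta * exp(-theta * t) and P0 = Gamma(shape=a, rate=b).
    """
    theta = list(theta)
    n = len(t)
    for i in range(n):
        others = theta[:i] + theta[i + 1:]  # theta^(i): all elements except i
        # Equation (5): q0 is proportional to alpha * integral of f(t_i | theta) dP0,
        # which for this conjugate pair equals alpha * a * b^a / (b + t_i)^(a + 1).
        q0 = alpha * a * b ** a / (b + t[i]) ** (a + 1)
        # Equation (6): q_j is proportional to f(t_i | theta_j) for each j != i.
        qj = [th * math.exp(-th * t[i]) for th in others]
        # choices() normalises the weights, so they effectively sum to one.
        pick = rng.choices(range(n), weights=[q0] + qj)[0]
        if pick == 0:
            # New value from P1i, here Gamma(shape=a + 1, rate=b + t_i);
            # gammavariate takes a scale parameter, hence the reciprocal.
            theta[i] = rng.gammavariate(a + 1, 1.0 / (b + t[i]))
        else:
            theta[i] = others[pick - 1]  # reuse an existing value theta_j
    return theta


rng = random.Random(0)
t = [rng.expovariate(1.0) for _ in range(50)]     # simulated durations
theta = [rng.gammavariate(2.0, 1.0) for _ in t]   # starting values drawn from P0
for _ in range(200):
    theta = gibbs_scan(theta, t, alpha=1.0, a=2.0, b=1.0, rng=rng)
print("distinct mass points k =", len(set(theta)))
```

Because existing values are copied exactly, the number of distinct elements of theta after a scan gives one draw of the number of mass points k.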