
on Econometrics 
By:  Martin Huber 
Abstract:  Sample selection is inherent to a range of treatment evaluation problems, such as the estimation of the returns to schooling or of the effect of school vouchers on college admission test scores, when some students abstain from the test in a nonrandom manner. Parametric and semiparametric estimators tackling selectivity typically rely on restrictive functional form assumptions that are unlikely to hold in reality. This paper proposes nonparametric weighting and matching estimators of average and quantile treatment effects that are consistent under more general forms of sample selection and incorporate effect heterogeneity with respect to observed characteristics. These estimators control for the double selection problem (i) into the observed population (e.g., working or taking the test) and (ii) into treatment by conditioning on nested propensity scores characterizing either selection probability. Weighting estimators based on parametric propensity score models are shown to be root-n consistent and asymptotically normal. Simulations suggest that the proposed methods yield decent results in scenarios where parametric estimators are inconsistent. 
Keywords:  treatment effects, sample selection, inverse probability weighting, propensity score matching. 
JEL:  C13 C14 C21 
Date:  2009–04 
URL:  http://d.repec.org/n?u=RePEc:usg:dp2009:200907&r=ecm 
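The nested weighting idea can be illustrated with a small sketch. This is not Huber's estimator itself: the two propensity scores are taken as given rather than estimated from the covariates, and the function name and arguments are hypothetical.

```python
import numpy as np

def ipw_double_selection(y, d, s, p_sel, p_treat):
    """Inverse-probability-weighted ATE under double selection.

    y: outcomes (meaningful only where s == 1)
    d: binary treatment indicator
    s: binary selection indicator (e.g., took the test)
    p_sel: P(s = 1 | x) for each unit (here taken as given)
    p_treat: P(d = 1 | x, s = 1) for each unit (here taken as given)
    """
    obs = s == 1
    # weight observed treated units by 1 / (p_treat * p_sel),
    # observed controls by 1 / ((1 - p_treat) * p_sel)
    w1 = (d * obs) / (p_treat * p_sel)
    w0 = ((1 - d) * obs) / ((1 - p_treat) * p_sel)
    y_filled = np.where(obs, y, 0.0)      # unobserved outcomes carry zero weight
    mu1 = np.sum(w1 * y_filled) / np.sum(w1)   # normalized weighted means
    mu0 = np.sum(w0 * y_filled) / np.sum(w0)
    return mu1 - mu0
```

With everyone observed and a constant treatment probability, this collapses to a difference in means, which is a convenient sanity check.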
By:  David Roodman 
Abstract:  At the heart of many econometric models is a linear function and a normal error. Examples include the classical small-sample linear regression model and the probit, ordered probit, multinomial probit, Tobit, interval regression, and truncated-distribution regression models. Because the normal distribution has a natural multidimensional generalization, such models can be combined into multi-equation systems in which the errors share a multivariate normal distribution. The literature has historically focused on multi-stage procedures for estimating mixed models, which are more efficient computationally, if less so statistically, than maximum likelihood (ML). But faster computers and simulated likelihood methods such as the Geweke, Hajivassiliou, and Keane (GHK) algorithm for estimating higher-dimensional cumulative normal distributions have made direct ML estimation practical. ML also facilitates a generalization to switching, selection, and other models in which the number and types of equations vary by observation. The Stata module cmp fits Seemingly Unrelated Regressions (SUR) models of this broad family. Its estimator is also consistent for recursive systems in which all endogenous variables appear on the right-hand sides as observed. If all the equations are structural, then estimation is full-information maximum likelihood (FIML). If only the final stage or stages are, then it is limited-information maximum likelihood (LIML). cmp can mimic a dozen built-in Stata commands and several user-written ones. It is also appropriate for a panoply of models previously hard to estimate. Heteroskedasticity, however, can render it inconsistent. This paper explains the theory and implementation of cmp and of a related Mata function, ghk2(), that implements the GHK algorithm. 
Keywords:  econometrics, cmp, GHK algorithm, seemingly unrelated regressions 
Date:  2009–03 
URL:  http://d.repec.org/n?u=RePEc:cgd:wpaper:168&r=ecm 
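For readers unfamiliar with it, the GHK simulator approximates a multivariate normal orthant probability by factoring the covariance and drawing each component from a normal truncated by its bound given the earlier draws. A bare-bones sketch follows (not the ghk2() implementation; no antithetic or Halton draws):

```python
import numpy as np
from statistics import NormalDist

_N = NormalDist()  # standard normal cdf / inverse cdf from the stdlib

def ghk_mvn_cdf(b, sigma, n_draws=2000, seed=0):
    """GHK simulator for P(X <= b) with X ~ N(0, sigma).

    Factor sigma = L L' (Cholesky); then, component by component, compute
    the conditional truncation point, accumulate its probability, and draw
    the component from the truncated standard normal by CDF inversion.
    """
    L = np.linalg.cholesky(sigma)
    k = len(b)
    u = np.random.default_rng(seed).uniform(size=(n_draws, k))
    prob = np.ones(n_draws)
    eta = np.zeros((n_draws, k))
    for j in range(k):
        mu = eta[:, :j] @ L[j, :j]          # contribution of earlier components
        ub = (b[j] - mu) / L[j, j]          # conditional truncation point
        pj = np.array([_N.cdf(v) for v in ub])
        prob *= pj                          # accumulate P(eta_j <= ub | past)
        # draw eta_j from N(0,1) truncated to (-inf, ub] by inverting the CDF
        eta[:, j] = [_N.inv_cdf(min(max(ui * p, 1e-12), 1.0 - 1e-12))
                     for ui, p in zip(u[:, j], pj)]
    return prob.mean()
```

For a bivariate standard normal with correlation rho, the exact orthant probability P(X1 <= 0, X2 <= 0) = 1/4 + arcsin(rho)/(2*pi) gives a convenient check.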
By:  Arnab Bhattacharjee 
Abstract:  We develop tests of the proportional hazards assumption, with respect to a continuous covariate, in the presence of unobserved heterogeneity with unknown distribution at the individual observation level. The proposed tests are especially powerful against ordered alternatives useful for modeling non-proportional hazards situations. In contrast to the case when the heterogeneity distribution is known up to finite-dimensional parameters, the null hypothesis for the current problem is similar to a test for absence of covariate dependence. However, the two testing problems differ in the nature of the relevant alternative hypotheses. We develop tests for both problems against ordered alternatives. Small sample performance and an application to real data highlight the usefulness of the framework and methodology. 
Keywords:  Two-sample tests, Increasing hazard ratio, Trend tests, Partial orders, Mixed proportional hazards model, Time-varying coefficients. 
JEL:  C12 C14 C24 C41 
Date:  2009–04 
URL:  http://d.repec.org/n?u=RePEc:san:wpecon:0904&r=ecm 
By:  Roy Cerqueti (University of Macerata); Paolo Falbo (University of Brescia); Cristian Pelizzari (University of Brescia) 
Abstract:  While a large portion of the literature on Markov chain bootstrap methods (possibly of order higher than one) has focused on the correct estimation of the transition probabilities, little or no attention has been devoted to the problem of estimating the dimension of the transition probability matrix. Indeed, it is usual to assume that the Markov chain has a one-step memory and that the state space cannot be clustered and coincides with the distinct observed values. In this paper we question the appropriateness of such a standard approach. In particular, we advance a method to jointly estimate the order of the Markov chain and identify a suitable clustering of the states. Indeed, in several real-life applications the "memory" of many processes extends well over the last observation; in those cases a correct representation of past trajectories requires a significantly richer set than the state space. On the contrary, it can sometimes happen that some distinct values do not correspond to really "different" states of a process; this is a common conclusion whenever, for example, a process assuming two distinct values in t is not affected in its distribution in t+1. Such a situation would suggest reducing the dimension of the transition probability matrix. Our methods are based on solving two optimization problems. More specifically, we consider two competing objectives that a researcher will in general pursue when dealing with bootstrapping: preserving the similarity between the observed and the bootstrap series, and reducing the probability of getting a perfect replication of the original sample. A brief axiomatic discussion is developed to define the desirable properties of such optimal criteria. Two numerical examples are presented to illustrate the method. 
Keywords:  order of Markov chains, similarity of time series, transition probability matrices, multiplicity of time series, partition of states of Markov chains, Markov chains, bootstrap methods 
JEL:  C14 C15 C61 
Date:  2009–04 
URL:  http://d.repec.org/n?u=RePEc:mcr:wpdief:wpaper00053&r=ecm 
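As a point of reference for the order-selection problem, a crude penalized-likelihood alternative (not the authors' optimization-based criteria) fits empirical transition frequencies at each candidate order and compares BIC values; all names below are illustrative:

```python
import math
from collections import Counter, defaultdict

def log_likelihood_order(seq, k):
    """Log-likelihood of a Markov chain of order k fitted to seq by
    empirical transition frequencies (k = 0 is the iid fit)."""
    trans = defaultdict(Counter)
    for t in range(k, len(seq)):
        trans[tuple(seq[t - k:t])][seq[t]] += 1   # context -> next-state counts
    ll = 0.0
    for counts in trans.values():
        total = sum(counts.values())
        for c in counts.values():
            ll += c * math.log(c / total)
    return ll

def bic_order(seq, k, n_states):
    """BIC for order k: each of the n_states**k contexts contributes
    n_states - 1 free transition probabilities."""
    n_params = (n_states ** k) * (n_states - 1)
    return -2.0 * log_likelihood_order(seq, k) + n_params * math.log(len(seq) - k)
```

On a simulated order-1 chain with strong dependence, the BIC should prefer k = 1 over both the iid fit (k = 0) and the over-parameterized k = 2.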
By:  Ingmar Nolte (Warwick Business School, FERC, CoFE); Valeri Voev (University of Aarhus, CoFE and CREATES) 
Abstract:  The expected value of sums of squared intraday returns (realized variance) gives rise to a least squares regression which adapts itself to the assumptions of the noise process and allows for a joint inference on integrated volatility (IV), noise moments and price-noise relations. In the iid noise case we derive the asymptotic variance of the regression parameter estimating the IV, show that it is consistent and compare its asymptotic efficiency against alternative consistent IV measures. In the case of noise which is correlated with the efficient return process, we postulate a new “asymptotically increasing” type of dependence and analyze its ability to cope with the empirically observed price-noise dependence in quote data. In the empirical section of the paper we apply the LS methodology to estimate the integrated volatility as well as the noise properties of 25 liquid stocks, using both midquote and transaction price data. We find that while iid noise is an oversimplification, its non-iid characteristics have a negligible effect on volatility estimation within our framework, for which we provide a sound theoretical reason. In terms of noise-price endogeneity, we are not able to find empirical support for simple ad hoc theoretical models, and we provide an alternative explanation for the observed patterns in midquote data, based on market microstructure theory. 
Keywords:  High frequency data, Subsampling, Realized volatility, Market microstructure 
JEL:  G10 F31 C32 
Date:  2009–04–27 
URL:  http://d.repec.org/n?u=RePEc:aah:create:200916&r=ecm 
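The linearity that drives the least squares idea — under iid noise, E[RV] = IV + 2nω², where n is the number of returns and ω² the noise variance — can be sketched with a toy estimator that computes realized variance on progressively sparser sampling grids and regresses it on n. This is a simplification of the paper's estimator, and the names are hypothetical:

```python
import numpy as np

def rv_regression(prices):
    """Regress realized variance (RV) computed at several sampling
    frequencies on the number of returns used. Under iid noise,
    E[RV] = IV + 2*n*omega^2, so the intercept estimates integrated
    variance and the slope estimates twice the noise variance."""
    logp = np.log(prices)
    ns, rvs = [], []
    for step in range(1, 21):           # sparser grids: every step-th obs
        r = np.diff(logp[::step])
        ns.append(len(r))
        rvs.append(np.sum(r ** 2))
    X = np.column_stack([np.ones(len(ns)), ns])
    beta, *_ = np.linalg.lstsq(X, np.array(rvs), rcond=None)
    return beta[0], beta[1] / 2.0       # (IV estimate, noise variance estimate)
```

On simulated data with a Brownian efficient price plus iid noise, both the intercept and the slope should land near their true values.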
By:  Marcin Owczarczuk (Department of Applied Econometrics, Warsaw School of Economics) 
Abstract:  This paper presents maximum score type estimators for linear, binomial, tobit and truncated regression models. These estimators estimate the normalized vector of slopes and do not provide an estimator of the intercept, although it may appear in the model. Strong consistency is proved. In addition, in the case of truncated and tobit regression models, maximum score estimators allow the sample to be restricted in a way that makes the ordinary least squares method consistent. 
Keywords:  maximum score estimation, tobit, truncated, binomial, semiparametric 
JEL:  C24 C25 C21 
Date:  2009–03–05 
URL:  http://d.repec.org/n?u=RePEc:wse:wpaper:30&r=ecm 
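For intuition, a toy two-regressor version of a maximum score objective — in the spirit of Manski's binary-choice estimator, not the paper's estimators — scans unit-norm slope vectors and scores how often the predicted sign of the index agrees with the observed choice:

```python
import numpy as np

def max_score_2d(X, y, n_grid=720):
    """Maximum score estimation for y = 1{x'b + e > 0} with two regressors.
    The slope vector is normalized to unit length, so we scan directions
    b(t) = (cos t, sin t) and keep the one maximizing sign agreement."""
    best_b, best_score = None, -np.inf
    for t in np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False):
        b = np.array([np.cos(t), np.sin(t)])
        pred = np.where(X @ b >= 0.0, 1.0, -1.0)
        score = np.sum((2.0 * y - 1.0) * pred)   # +1 for agreement, -1 otherwise
        if score > best_score:
            best_b, best_score = b, score
    return best_b
```

Only the direction of the slope vector is identified, which matches the abstract's point that the estimators recover the normalized slopes but not the intercept.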
By:  Leech, Dennis (Department of Economics, University of Warwick); Leech, Robert (Division of Neuroscience and Mental Health, Imperial College London); Simmonds, Anna (MRC Clinical Sciences Center, Imperial College London) 
Abstract:  An increasing trend in functional MRI experiments involves discriminating between experimental conditions on the basis of fine-grained spatial patterns extending across many voxels. Typically, these approaches have used randomized resampling to derive inferences. Here, we introduce an analytical method for drawing inferences from multivoxel patterns. This approach extends the general linear model to the multivoxel case, resulting in a variant of the Mahalanobis distance statistic which can be evaluated on the χ2 distribution. We apply this parametric inference to a single-subject fMRI dataset and show that the approach is both computationally more efficient and more sensitive than resampling inference. 
Date:  2009 
URL:  http://d.repec.org/n?u=RePEc:wrk:warwec:899&r=ecm 
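A generic two-condition version of a Mahalanobis statistic referred to the χ2 distribution can be sketched as follows; this is an illustration of the general idea, not the paper's GLM extension:

```python
import numpy as np

def mahalanobis_stat(A, B):
    """Mahalanobis-type statistic for the difference in mean multivoxel
    patterns between conditions A and B (rows = scans, cols = voxels).
    With a well-estimated covariance, it is referred to a chi-squared
    distribution with (number of voxels) degrees of freedom."""
    d = A.mean(axis=0) - B.mean(axis=0)
    resid = np.vstack([A - A.mean(axis=0), B - B.mean(axis=0)])
    pooled = np.cov(resid, rowvar=False)             # pooled voxel covariance
    cov_d = pooled * (1.0 / len(A) + 1.0 / len(B))   # covariance of the difference
    return float(d @ np.linalg.solve(cov_d, d))
```

Under the null (both conditions drawn from the same distribution), the statistic averages close to the number of voxels, as a χ2 variable should.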
By:  Michael Greenacre 
Abstract:  The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or “stacked”) tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis. 
Keywords:  Constraints, correspondence analysis, missing data, multiple correspondence 
JEL:  C19 C88 
Date:  2009–04 
URL:  http://d.repec.org/n?u=RePEc:upf:upfgen:1154&r=ecm 
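The unconstrained building block — simple correspondence analysis via an SVD of the standardized residuals of a contingency table — can be sketched briefly (the canonical, linearly restricted variant is not implemented here):

```python
import numpy as np

def correspondence_analysis(N, n_dims=2):
    """Simple correspondence analysis of a contingency table N.

    Standardized residuals S = (P - r c') / sqrt(r c') are decomposed by
    SVD; squared singular values are the principal inertias, and scaled
    singular vectors give row/column principal coordinates."""
    P = N / N.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S)
    rows = (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]
    cols = (Vt.T[:, :n_dims] * s[:n_dims]) / np.sqrt(c)[:, None]
    return rows, cols, s[:n_dims] ** 2
```

A standard identity ties the decomposition to the chi-squared statistic: the total inertia (sum of squared singular values) equals Pearson's χ2 divided by the grand total.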
By:  Monika Oleksiak (Warsaw School of Economics) 
Abstract:  The primary goal of the study is to diagnose satisfaction and loyalty drivers in the Polish retail banking sector. The problem is approached with Customer Satisfaction Index (CSI) models, which were developed for national satisfaction studies in the United States and European countries. These are multi-equation path models with latent variables. The data come from a survey on Poles’ usage of and attitudes towards retail banks, conducted quarterly on a representative sample. The model used in the study is a compromise between the author’s synthesis of national CSI models and the data constraints. There are two approaches to the estimation of CSI models: Partial Least Squares, used in national satisfaction studies, and covariance-based methods (SEM, LISREL). The paper discusses which of these two methods is better and in what circumstances. In this study both methods are used, and a comparison of their performance is the secondary goal of the study. 
Keywords:  satisfaction, loyalty, customer satisfaction index models, banking sector, structural equation models with latent variables, structural equations modeling, partial least squares, covariance based methods 
JEL:  C13 C39 C51 G21 M31 
Date:  2009–03–19 
URL:  http://d.repec.org/n?u=RePEc:wse:wpaper:33&r=ecm 
By:  Patrick Bajari; Jeremy Fox; Kyoo il Kim; Stephen P. Ryan 
Abstract:  The random coefficients multinomial choice logit model has been widely used in empirical choice analysis for the last 30 years. We are the first to prove that the distribution of random coefficients in this model is nonparametrically identified. Our approach exploits the structure of the logit model, and so requires no monotonicity assumptions and requires variation in product characteristics only within an infinitesimally small open set. Our identification argument is constructive and may be applied to other choice models with random coefficients. 
JEL:  C14 C25 L00 
Date:  2009–04 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:14934&r=ecm 
By:  Cho, SeongHoon; Lambert, Dayton M.; Kim, Seung Gyu; Jung, Su Hyun 
Abstract:  This study deals with the issue of extreme coefficients in geographically weighted regression (GWR) and their effects on mapping coefficients, using three datasets with different spatial resolutions. We found that although GWR yields extreme coefficients regardless of the resolution of the dataset or the type of kernel function, 1) GWR tends to generate extreme coefficients for less spatially dense datasets, 2) coefficient maps based on polygon data representing aggregated areal units are more sensitive to extreme coefficients, and 3) coefficient maps using bandwidths generated by a fixed calibration procedure are more vulnerable to extreme coefficients than those using adaptive calibration. 
Keywords:  extreme coefficient, fixed and adaptive calibrations, geographically weighted regression, Mapping, Research Methods/Statistical Methods 
Date:  2009 
URL:  http://d.repec.org/n?u=RePEc:ags:aaea09:49117&r=ecm 
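A minimal fixed-bandwidth GWR with a Gaussian kernel, for concreteness (an illustrative sketch, not the calibration procedures compared in the paper):

```python
import numpy as np

def gwr(coords, X, y, bandwidth):
    """Fixed-bandwidth geographically weighted regression: at each
    location, fit weighted least squares with Gaussian kernel weights
    decaying in distance, yielding one coefficient vector per site."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])          # add intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))   # Gaussian kernel weights
        XtW = Xd.T * w                             # weighted design matrix
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas
```

The "extreme coefficients" the abstract describes arise when the locally weighted design matrix `XtW @ Xd` is near-singular — for example, at sites with few effective neighbors — which is why sparse data and fixed bandwidths are more vulnerable. On noise-free, spatially constant data every local fit recovers the true coefficients exactly.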
By:  Justyna Wróblewska (Cracow University of Economics) 
Abstract:  In this paper we present a Bayesian model selection procedure within the class of cointegrated processes. In order to make inference about the cointegration space we use the class of Matrix Angular Central Gaussian distributions. To carry out posterior simulations we use an algorithm based on the collapsed Gibbs sampler. The presented methods are applied to the analysis of the price-wage mechanism in the Polish economy. 
Keywords:  cointegration, Bayesian analysis, Grassmann manifold, Stiefel manifold, posterior probability 
JEL:  C11 C32 C52 
Date:  2009–03–22 
URL:  http://d.repec.org/n?u=RePEc:wse:wpaper:32&r=ecm 
By:  Andrés González Gómez; Lavan Mahadeva; Diego Rodríguez; Luis Eduardo Rojas 
Abstract:  If theory-consistent models can ever hope to forecast well and to be useful for policy, they have to relate to data which, though rich in information, are uncertain, unbalanced, and sometimes include forecasts from external sources about the future paths of other variables. One example among many is financial market data, which can help, but only after irrelevant short-term volatility has been smoothed out. In this paper we propose combining different types of useful but awkward data sets with a linearised forward-looking DSGE model through a Kalman filter fixed-interval smoother to improve the utility of these models as policy tools. We apply this scheme to a model for Colombia. 
Date:  2009–04–21 
URL:  http://d.repec.org/n?u=RePEc:col:000094:005480&r=ecm 
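A fixed-interval smoother is the standard Kalman forward pass followed by a backward (Rauch-Tung-Striebel) recursion. A scalar sketch, with missing observations standing in for the "unbalanced" data (this is illustrative, not the authors' DSGE implementation):

```python
import numpy as np

def rts_smoother(y, a, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter plus Rauch-Tung-Striebel fixed-interval
    smoother for x_t = a*x_{t-1} + w_t (var q), y_t = x_t + v_t (var r).
    Missing observations (np.nan) skip the update step, which is one
    simple way ragged or unbalanced data enter a state-space model."""
    n = len(y)
    xp = np.empty(n); pp = np.empty(n)   # one-step predictions
    xf = np.empty(n); pf = np.empty(n)   # filtered estimates
    x, p = x0, p0
    for t in range(n):
        x, p = a * x, a * a * p + q      # predict
        xp[t], pp[t] = x, p
        if not np.isnan(y[t]):           # update only when y_t is observed
            k = p / (p + r)
            x += k * (y[t] - x)
            p *= (1.0 - k)
        xf[t], pf[t] = x, p
    xs = xf.copy()
    for t in range(n - 2, -1, -1):       # backward smoothing pass
        g = pf[t] * a / pp[t + 1]
        xs[t] = xf[t] + g * (xs[t + 1] - xp[t + 1])
    return xs
```

Because the smoother uses the whole sample, gaps are filled using information from both sides, which is exactly what makes it attractive for mixing data of uneven quality.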
By:  Mark Craddock (Department of Mathematical Sciences, University of Technology, Sydney); Eckhard Platen (School of Finance and Economics, University of Technology, Sydney) 
Abstract:  This paper uses Lie symmetry group methods to obtain transition probability densities for scalar diffusions, where the diffusion coefficient is given by a power law. We will show that if the drift of the diffusion satisfies a certain family of Riccati equations, then it is possible to compute a generalized Laplace transform of the transition density for the process. Various explicit examples are provided. We also obtain fundamental solutions of the Kolmogorov forward equation for diffusions, which do not correspond to transition probability densities. 
Keywords:  Lie symmetry groups, fundamental solutions, transition probability densities, Itô diffusions 
Date:  2009–03–01 
URL:  http://d.repec.org/n?u=RePEc:uts:rpaper:246&r=ecm 
By:  Ghislain Yanou (Centre d'Economie de la Sorbonne) 
Abstract:  In this paper, we propose a methodology for building an estimator of the covariance matrix. We use a robust measure of moments called L-moments (see Hosking, 1986), and their extension into a multivariate framework (see Serfling and Xiao, 2007). Random matrix theory (see Edelman, 1989) allows us to extract factors which contain real information. An empirical study of the American market shows that the Global Minimum L-variance Portfolio (GMLP) obtained from our estimator outperforms the Global Minimum Variance Portfolio (GMVP) obtained from the empirical estimator of the covariance matrix. 
Keywords:  Covariance matrix, L-variance-covariance, L-correlation, concomitance, random matrix theory. 
JEL:  G11 
Date:  2008–12 
URL:  http://d.repec.org/n?u=RePEc:mse:cesdoc:bla08103&r=ecm 
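Sample L-moments are linear combinations of order statistics. A sketch of the first two — the building block behind L-variance and L-correlation; this is the standard unbiased univariate estimator, not the paper's multivariate construction:

```python
import numpy as np

def l_moments(x):
    """First two sample L-moments. lam1 is the mean; lam2 (the L-scale)
    is half the expected absolute difference of two random draws, a
    robust alternative to the standard deviation. Both are computed as
    linear combinations of the order statistics."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    lam1 = x.mean()                       # b0
    i = np.arange(n)
    b1 = np.sum(i / (n - 1) * x) / n      # probability-weighted moment b1
    lam2 = 2.0 * b1 - lam1
    return lam1, lam2
```

Because each order statistic enters linearly (never squared), a single outlier shifts the L-scale far less than it shifts the variance, which is the robustness property the covariance estimator builds on. For the uniform(0,1) distribution the L-scale is 1/6, giving a quick check.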
By:  Dominique Guegan (Paris School of Economics  Centre d'Economie de la Sorbonne); PierreAndré Maugis (Centre d'Economie de la Sorbonne) 
Abstract:  We present here a new way of building vine copulas that allows us to create a vast number of new vine copulas, permitting more precise modeling in high dimensions. To deal with this great number of copulas we present a new, efficient selection methodology using a lattice structure on the vine set. Our model allows for many degrees of freedom, but further improvements face numerous statistical and computational problems caused by the complexity of vines as estimators; we expose these problems in this paper. Robust n-variate models would be a great breakthrough for asset risk management in banks and insurance companies. 
Keywords:  Vines, multivariate copulas, model selection. 
JEL:  D81 C10 C40 C52 
Date:  2008–12 
URL:  http://d.repec.org/n?u=RePEc:mse:cesdoc:b08095&r=ecm 
By:  Balakrishna, B S 
Abstract:  The jump distribution for the default intensities in a reduced form framework is modeled and calibrated to provide reasonable fits to CDX.NA.IG and iTraxx Europe CDOs, for the 5, 7 and 10 year maturities simultaneously. Calibration is carried out using an efficient Monte Carlo simulation algorithm suitable for both homogeneous and heterogeneous collections of credit names. The underlying jump process is found to relate closely to a maximally skewed stable Lévy process with index of stability alpha ~ 1.5. 
Keywords:  Default Risk; Default Correlation; Default Intensity; Intensity Model; Lévy Density; CDO; Monte Carlo 
JEL:  G13 
Date:  2008–07–16 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:14922&r=ecm 
By:  Agostino Tarsitano (Dipartimento di Economia e Statistica, Università della Calabria) 
Abstract:  Rank correlation is a fundamental tool for expressing dependence in cases in which the data are arranged in order. There are, by contrast, circumstances where the ordinal association is of a nonlinear type. In this paper we investigate the effectiveness of several measures of rank correlation. These measures have been divided into three classes: conventional rank correlations, weighted rank correlations, and correlations of scores. Our findings suggest that none is systematically better than the others in all circumstances. However, a simple weighted version of the Kendall rank correlation coefficient provides plausible answers in many special situations where inter-category distances cannot be considered on the same basis. 
Keywords:  Ordinal Data, Nonlinear Association, Weighted Rank Correlation 
Date:  2009–04 
URL:  http://d.repec.org/n?u=RePEc:clb:wpaper:200906&r=ecm 
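A pairwise-weighted variant of Kendall's tau — one simple way to let some ranks count more than others, not necessarily the paper's exact weighting — can be sketched as:

```python
import numpy as np

def weighted_kendall(x, y, weights=None):
    """Weighted Kendall's tau: each pair (i, j) contributes with weight
    w_i * w_j, so (dis)agreement among heavily weighted items (e.g., top
    ranks) counts more. With weights=None it reduces to ordinary tau."""
    n = len(x)
    if weights is None:
        weights = np.ones(n)
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            wij = weights[i] * weights[j]
            s = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])  # +1 concordant, -1 discordant
            num += wij * s
            den += wij
    return num / den
```

Putting large weights on the top-ranked items makes the coefficient reflect mostly how well the top of the two rankings agree — the kind of unequal inter-category treatment the abstract has in mind.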