nep-ecm New Economics Papers
on Econometrics
Issue of 2009‒05‒02
eighteen papers chosen by
Sune Karlsson
Orebro University

  1. Treatment evaluation in the presence of sample selection By Martin Huber
  2. Estimating Fully Observed Recursive Mixed-Process Models with cmp By David Roodman
  3. Testing for Proportional Hazards with Unrestricted Univariate Unobserved Heterogeneity By Arnab Bhattacharjee
  4. Optimal Dimension of Transition Probability Matrices for Markov Chain Bootstrapping By Roy Cerqueti; Paolo Falbo; Cristian Pelizzari
  5. Least Squares Inference on Integrated Volatility and the Relationship between Efficient Prices and Noise By Ingmar Nolte; Valeri Voev
  6. Maximum Score Type Estimators By Marcin Owczarczuk
  7. Parametric inference for functional information mapping By Leech, Dennis; Leech, Robert; Simmonds, Anna
  8. Canonical correspondence analysis in social science research By Michael Greenacre
  9. Satisfaction Drivers in Retail Banking: Comparison of Partial Least Squares and Covariance Based Methods By Monika Oleksiak
  10. The Random Coefficients Logit Model Is Identified By Patrick Bajari; Jeremy Fox; Kyoo il Kim; Stephen P. Ryan
  11. Extreme coefficients in Geographically Weighted Regression and their effects on mapping By Cho, Seong-Hoon; Lambert, Dayton M.; Kim, Seung Gyu; Jung, Su Hyun
  12. Bayesian Model Selection in the Analysis of Cointegration By Justyna Wróblewska
  14. On Explicit Probability Laws for Classes of Scalar Diffusions By Mark Craddock; Eckhard Platen
  15. Extension of random matrix theory to the L-moments for robust portfolio allocation. By Ghislain Yanou
  16. Note on new prospects on vines. By Dominique Guegan; Pierre-André Maugis
  17. Levy Density Based Intensity Modeling of the Correlation Smile By Balakrishna, B S

  1. By: Martin Huber
    Abstract: Sample selection is inherent to a range of treatment evaluation problems, such as the estimation of the returns to schooling or of the effect of school vouchers on scores in college admissions tests when some students abstain from the test in a non-random manner. Parametric and semiparametric estimators tackling selectivity typically rely on restrictive functional form assumptions that are unlikely to hold in reality. This paper proposes nonparametric weighting and matching estimators of average and quantile treatment effects that are consistent under more general forms of sample selection and incorporate effect heterogeneity with respect to observed characteristics. These estimators control for the double selection problem (i) into the observed population (e.g., working or taking the test) and (ii) into treatment by conditioning on nested propensity scores characterizing each selection probability. Weighting estimators based on parametric propensity score models are shown to be root-n-consistent and asymptotically normal. Simulations suggest that the proposed methods yield decent results in scenarios where parametric estimators are inconsistent.
    Keywords: treatment effects, sample selection, inverse probability weighting, propensity score matching.
    JEL: C13 C14 C21
    Date: 2009–04
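The paper's weighting idea can be sketched on simulated data. In the toy example below the propensities are known by construction, whereas in practice they would be estimated (the paper conditions on nested propensity scores); all variable names and parameter values are illustrative assumptions, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated example with a single covariate x.
x = 0.5 * rng.normal(size=n)
p_d = 1 / (1 + np.exp(-x))               # treatment propensity P(D=1 | x)
d = (rng.uniform(size=n) < p_d).astype(float)
p_s = 1 / (1 + np.exp(-(0.5 + x)))       # selection probability P(S=1 | x)
s = (rng.uniform(size=n) < p_s).astype(float)
y = d + x + rng.normal(size=n)           # true average treatment effect = 1

# Weight observed outcomes (s == 1) by the inverse of the product of the
# treatment and selection propensities (normalized, Hajek-style).
w1 = s * d / (p_d * p_s)
w0 = s * (1 - d) / ((1 - p_d) * p_s)
ate_hat = (w1 * y).sum() / w1.sum() - (w0 * y).sum() / w0.sum()
```

With known propensities the weighted contrast is unbiased for the average treatment effect even though outcomes are only seen for the selected subsample.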
  2. By: David Roodman
    Abstract: At the heart of many econometric models is a linear function and a normal error. Examples include the classical small-sample linear regression model and the probit, ordered probit, multinomial probit, Tobit, interval regression, and truncated-distribution regression models. Because the normal distribution has a natural multidimensional generalization, such models can be combined into multi-equation systems in which the errors share a multivariate normal distribution. The literature has historically focused on multi-stage procedures for estimating mixed models, which are computationally more efficient, if statistically less so, than maximum likelihood (ML). But faster computers and simulated likelihood methods such as the Geweke, Hajivassiliou, and Keane (GHK) algorithm for estimating higher-dimensional cumulative normal distributions have made direct ML estimation practical. ML also facilitates a generalization to switching, selection, and other models in which the number and types of equations vary by observation. The Stata module cmp fits Seemingly Unrelated Regressions (SUR) models of this broad family. Its estimator is also consistent for recursive systems in which all endogenous variables appear on the right-hand sides as observed. If all the equations are structural, then estimation is full-information maximum likelihood (FIML). If only the final stage or stages are, then it is limited-information maximum likelihood (LIML). cmp can mimic a dozen built-in Stata commands and several user-written ones. It is also appropriate for a panoply of models previously hard to estimate. Heteroskedasticity, however, can render it inconsistent. This paper explains the theory and implementation of cmp and of a related Mata function, ghk2(), that implements the GHK algorithm.
    Keywords: econometrics, cmp, GHK algorithm, seemingly unrelated regressions
    Date: 2009–03
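The GHK algorithm mentioned in the abstract estimates orthant probabilities of a multivariate normal by drawing from sequentially truncated univariate normals along the Cholesky factor. The paper's implementation is the Stata/Mata routine ghk2(); the standalone Python sketch below is only an illustration of the same recursion.

```python
import numpy as np
from statistics import NormalDist

_nd = NormalDist()
Phi, Phi_inv = _nd.cdf, _nd.inv_cdf

def ghk(upper, cov, draws=5000, seed=0):
    """GHK simulation of P(Y_1 < u_1, ..., Y_k < u_k) for Y ~ N(0, cov)."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(cov)
    k = len(upper)
    e = np.zeros((draws, k))         # standardized innovations drawn so far
    prob = np.ones(draws)
    for j in range(k):
        # Conditional upper bound for the j-th innovation given earlier ones.
        t = (upper[j] - e[:, :j] @ L[j, :j]) / L[j, j]
        pj = np.array([Phi(v) for v in t])
        prob *= pj
        # Draw the innovation from N(0, 1) truncated above at t.
        u = rng.uniform(size=draws)
        e[:, j] = [Phi_inv(min(max(ui * p, 1e-12), 1 - 1e-12))
                   for ui, p in zip(u, pj)]
    return prob.mean()
```

For a bivariate normal with correlation 0.5, the orthant probability P(Y1 < 0, Y2 < 0) is 1/4 + arcsin(0.5)/(2π) = 1/3, which the simulator recovers closely.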
  3. By: Arnab Bhattacharjee
    Abstract: We develop tests of the proportional hazards assumption, with respect to a continuous covariate, in the presence of unobserved heterogeneity with unknown distribution at the individual observation level. The proposed tests are especially powerful against ordered alternatives useful for modeling non-proportional hazards situations. In contrast to the case when the heterogeneity distribution is known up to finite-dimensional parameters, the null hypothesis for the current problem is similar to a test for absence of covariate dependence. However, the two testing problems differ in the nature of the relevant alternative hypotheses. We develop tests for both problems against ordered alternatives. Small sample performance and an application to real data highlight the usefulness of the framework and methodology.
    Keywords: Two-sample tests, Increasing hazard ratio, Trend tests, Partial orders, Mixed proportional hazards model, Time-varying coefficients.
    JEL: C12 C14 C24 C41
    Date: 2009–04
  4. By: Roy Cerqueti (University of Macerata); Paolo Falbo (University of Brescia); Cristian Pelizzari (University of Brescia)
    Abstract: While the large portion of the literature on Markov chain (possibly of order higher than one) bootstrap methods has focused on the correct estimation of the transition probabilities, little or no attention has been devoted to the problem of estimating the dimension of the transition probability matrix. Indeed, it is usual to assume that the Markov chain has a one-step memory property and that the state space cannot be clustered and coincides with the distinct observed values. In this paper we question the opportunity of such a standard approach. In particular, we advance a method to jointly estimate the order of the Markov chain and identify a suitable clustering of the states. Indeed, in several real-life applications the "memory" of many processes extends well over the last observation; in those cases a correct representation of past trajectories requires a significantly richer set than the state space. On the contrary, it can sometimes happen that some distinct values do not correspond to really "different" states of a process; this is a common conclusion whenever, for example, a process assuming two distinct values in t is not affected in its distribution in t+1. Such a situation would suggest reducing the dimension of the transition probability matrix. Our methods are based on solving two optimization problems. More specifically, we consider two competing objectives that a researcher will in general pursue when dealing with bootstrapping: preserving the similarity between the observed and the bootstrap series and reducing the probability of getting a perfect replication of the original sample. A brief axiomatic discussion is developed to define the desirable properties for such optimal criteria. Two numerical examples are presented to illustrate the method.
    Keywords: order of Markov chains, similarity of time series, transition probability matrices, multiplicity of time series, partition of states of Markov chains, Markov chains, bootstrap methods
    JEL: C14 C15 C61
    Date: 2009–04
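The paper's contribution is the joint choice of the chain's order and a clustering of its states; the sketch below shows only the standard fixed-order Markov chain bootstrap that it generalizes, with the state space taken as the distinct observed values. The function name and interface are illustrative, not the paper's.

```python
import numpy as np

def markov_bootstrap(series, order=1, length=None, seed=0):
    """Resample a categorical series by recording the observed successors of
    each history of the given order and simulating a new path from them."""
    rng = np.random.default_rng(seed)
    length = length or len(series)
    # Collect observed successors for each length-'order' history.
    successors = {}
    for i in range(order, len(series)):
        key = tuple(series[i - order:i])
        successors.setdefault(key, []).append(series[i])
    # Start from the first observed history and simulate forward.
    path = list(series[:order])
    for _ in range(length - order):
        pool = successors[tuple(path[-order:])]
        path.append(pool[rng.integers(len(pool))])
    return path
```

Sampling uniformly from the recorded successors is equivalent to drawing from the empirical transition probabilities; a simulated history never observed in the data would raise a KeyError, one symptom of the state-space dimension problem the paper addresses.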
  5. By: Ingmar Nolte (Warwick Business School, FERC, CoFE); Valeri Voev (University of Aarhus, CoFE and CREATES)
    Abstract: The expected value of sums of squared intraday returns (realized variance) gives rise to a least squares regression which adapts itself to the assumptions of the noise process and allows for a joint inference on integrated volatility (IV), noise moments and price-noise relations. In the iid noise case we derive the asymptotic variance of the regression parameter estimating the IV, show that it is consistent and compare its asymptotic efficiency against alternative consistent IV measures. In the case of noise which is correlated with the efficient return process, we postulate a new “asymptotically increasing” type of dependence and analyze its ability to cope with the empirically observed price-noise dependence in quote data. In the empirical section of the paper we apply the LS methodology to estimate the integrated volatility as well as the noise properties of 25 liquid stocks both with midquote and transaction price data. We find that while iid noise is an oversimplification, its non-iid characteristics have a negligible effect on volatility estimation within our framework, for which we provide a sound theoretical reason. In terms of noise-price endogeneity, we are not able to find empirical support for simple ad hoc theoretical models and we provide an alternative explanation for the observed patterns in midquote data, based on market microstructure theory.
    Keywords: High frequency data, Subsampling, Realized volatility, Market microstructure
    JEL: G10 F31 C32
    Date: 2009–04–27
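The iid-noise case of the abstract's regression can be illustrated with simulated data: under iid noise with variance ω², E[RV] = IV + 2nω² where n is the number of returns, so regressing realized variance computed at several sampling frequencies on n recovers IV from the intercept and the noise variance from half the slope. All parameter values below are arbitrary assumptions; the paper's framework is more general (dependent and endogenous noise).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 23400                  # one trading day of 1-second efficient prices
sigma2 = 0.04 / 252        # daily integrated variance of the efficient price
omega = 5e-4               # std. dev. of the iid microstructure noise
days = 20                  # average RV over several simulated days
ks = np.arange(1, 121)     # sample prices every k seconds

rv = np.zeros(len(ks))
n_obs = np.zeros(len(ks), dtype=int)
for _ in range(days):
    eff = np.cumsum(np.sqrt(sigma2 / N) * rng.normal(size=N + 1))
    obs = eff + omega * rng.normal(size=N + 1)
    for i, k in enumerate(ks):
        r = np.diff(obs[::k])          # returns sampled every k seconds
        rv[i] += (r ** 2).sum() / days
        n_obs[i] = len(r)

# E[RV] = IV + 2 * n * omega^2: least squares of RV on n identifies both.
X = np.column_stack([np.ones(len(ks)), n_obs])
beta, *_ = np.linalg.lstsq(X, rv, rcond=None)
iv_hat, noise_var_hat = beta[0], beta[1] / 2
```

Sparse sampling alone would discard most of the data; the regression instead exploits how the noise bias grows linearly with the sampling frequency.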
  6. By: Marcin Owczarczuk (Department of Applied Econometrics, Warsaw School of Economics)
    Abstract: This paper presents maximum score type estimators for linear, binomial, tobit and truncated regression models. These estimators estimate the normalized vector of slopes and do not provide an estimator of the intercept, although it may appear in the model. Strong consistency is proved. In addition, in the case of truncated and tobit regression models, maximum score estimators allow restriction of the sample in order to make the ordinary least squares method consistent.
    Keywords: maximum score estimation, tobit, truncated, binomial, semiparametric
    JEL: C24 C25 C21
    Date: 2009–03–05
  7. By: Leech, Dennis (Department of Economics, University of Warwick); Leech, Robert (Division of Neuroscience and Mental Health, Imperial College London); Simmonds, Anna (MRC Clinical Sciences Center, Imperial College London)
    Abstract: An increasing trend in functional MRI experiments involves discriminating between experimental conditions on the basis of fine-grained spatial patterns extending across many voxels. Typically, these approaches have used randomized resampling to derive inferences. Here, we introduce an analytical method for drawing inferences from multivoxel patterns. This approach extends the general linear model to the multivoxel case resulting in a variant of the Mahalanobis distance statistic which can be evaluated on the χ² distribution. We apply this parametric inference to a single-subject fMRI dataset and consider how the approach is both computationally more efficient and more sensitive than resampling inference.
    Date: 2009
  8. By: Michael Greenacre
    Abstract: The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or “stacked”) tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis.
    Keywords: Constraints, correspondence analysis, missing data, multiple correspondence
    JEL: C19 C88
    Date: 2009–04
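The unconstrained core of the method, simple correspondence analysis, reduces to a singular value decomposition of the standardized residuals of a two-way table; canonical correspondence analysis adds linear restrictions on the scores, which this sketch omits.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis via SVD of the standardized residuals
    of a two-way contingency table."""
    P = table / table.sum()
    r = P.sum(axis=1)              # row masses
    c = P.sum(axis=0)              # column masses
    # Standardized residuals of the independence model r c'.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal row coordinates; squared singular values are the inertias.
    rows = (U * sv) / np.sqrt(r)[:, None]
    return rows, sv ** 2
```

A useful check is that the principal inertias sum to the Pearson chi-square statistic of the table divided by its grand total.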
  9. By: Monika Oleksiak (Warsaw School of Economics)
    Abstract: The primary goal of the study is to diagnose satisfaction and loyalty drivers in the Polish retail banking sector. The problem is approached with Customer Satisfaction Index (CSI) models, which were developed for national satisfaction studies in the United States and European countries. These are multi-equation path models with latent variables. The data come from a survey on Poles’ usage of and attitudes towards retail banks, conducted quarterly on a representative sample. The model used in the study is a compromise between the author’s synthesis of national CSI models and the data constraints. There are two approaches to the estimation of CSI models: Partial Least Squares, used in national satisfaction studies, and Covariance Based Methods (SEM, Lisrel). We discuss which of these two methods is better and under what circumstances. In this study both methods are used; comparing their performance is the secondary goal of the study.
    Keywords: satisfaction, loyalty, customer satisfaction index models, banking sector, structural equation models with latent variables, structural equations modeling, partial least squares, covariance based methods
    JEL: C13 C39 C51 G21 M31
    Date: 2009–03–19
  10. By: Patrick Bajari; Jeremy Fox; Kyoo il Kim; Stephen P. Ryan
    Abstract: The random coefficients, multinomial choice logit model has been widely used in empirical choice analysis for the last 30 years. We are the first to prove that the distribution of random coefficients in this model is nonparametrically identified. Our approach exploits the structure of the logit model, and so requires no monotonicity assumptions and requires variation in product characteristics within only an infinitesimally small open set. Our identification argument is constructive and may be applied to other choice models with random coefficients.
    JEL: C14 C25 L00
    Date: 2009–04
  11. By: Cho, Seong-Hoon; Lambert, Dayton M.; Kim, Seung Gyu; Jung, Su Hyun
    Abstract: This study deals with the issue of extreme coefficients in geographically weighted regression (GWR) and their effects on mapping coefficients using three datasets with different spatial resolutions. We found that although GWR yields extreme coefficients regardless of the resolution of the dataset or the type of kernel function, 1) GWR tends to generate extreme coefficients for less spatially dense datasets, 2) coefficient maps based on polygon data representing aggregated areal units are more sensitive to extreme coefficients, and 3) coefficient maps using bandwidths generated by a fixed calibration procedure are more vulnerable to extreme coefficients than those using adaptive calibration.
    Keywords: extreme coefficients, fixed and adaptive calibration, geographically weighted regression, mapping, research methods/statistical methods
    Date: 2009
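GWR fits a separate weighted least squares regression at each location, with weights decaying in distance under a kernel. A minimal sketch with a fixed Gaussian kernel (function name and interface are illustrative): the extreme coefficients the abstract studies arise when a small bandwidth leaves a location with too few effectively weighted neighbours, making the local cross-product matrix near-singular.

```python
import numpy as np

def gwr(coords, X, y, bandwidth):
    """Geographically weighted regression with a fixed Gaussian kernel:
    one weighted least squares fit per location."""
    betas = np.empty((len(coords), X.shape[1]))
    for i, c in enumerate(coords):
        dist = np.linalg.norm(coords - c, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # kernel weights
        Xw = X * w[:, None]                          # W X
        # Solve the local normal equations X' W X beta = X' W y.
        betas[i] = np.linalg.solve(X.T @ Xw, Xw.T @ y)
    return betas
```

An adaptive calibration would instead vary the bandwidth so each local fit sees a fixed number of neighbours, which is why the paper finds it less prone to extreme coefficients.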
  12. By: Justyna Wróblewska (Cracow University of Economics)
    Abstract: In this paper we present the Bayesian model selection procedure within the class of cointegrated processes. In order to make inference about the cointegration space we use the class of Matrix Angular Central Gaussian distributions. To carry out posterior simulations we use an algorithm based on the collapsed Gibbs sampler. The presented methods are applied to the analysis of the price-wage mechanism in the Polish economy.
    Keywords: cointegration, Bayesian analysis, Grassmann manifold, Stiefel manifold, posterior probability
    JEL: C11 C32 C52
    Date: 2009–03–22
  13. By: Andrés González Gómez; Lavan Mahadeva; Diego Rodríguez; Luis Eduardo Rojas
    Abstract: If theory-consistent models can ever hope to forecast well and to be useful for policy, they have to relate to data which, though rich in information, is uncertain, unbalanced, and sometimes consists of forecasts from external sources about the future path of other variables. One example from many is financial market data, which can help but only after smoothing out irrelevant short-term volatility. In this paper we propose combining different types of useful but awkward data sets with a linearised forward-looking DSGE model through a Kalman filter fixed-interval smoother to improve the utility of these models as policy tools. We apply this scheme to a model for Colombia.
    Date: 2009–04–21
  14. By: Mark Craddock (Department of Mathematical Sciences, University of Technology, Sydney); Eckhard Platen (School of Finance and Economics, University of Technology, Sydney)
    Abstract: This paper uses Lie symmetry group methods to obtain transition probability densities for scalar diffusions, where the diffusion coefficient is given by a power law. We will show that if the drift of the diffusion satisfies a certain family of Riccati equations, then it is possible to compute a generalized Laplace transform of the transition density for the process. Various explicit examples are provided. We also obtain fundamental solutions of the Kolmogorov forward equation for diffusions, which do not correspond to transition probability densities.
    Keywords: Lie symmetry groups; fundamental solutions; transition probability densities; Itô diffusions
    Date: 2009–03–01
  15. By: Ghislain Yanou (Centre d'Economie de la Sorbonne)
    Abstract: In this paper, we propose a methodology for building an estimator of the covariance matrix. We use a robust measure of moments called L-moments (see Hosking, 1986), and their extension into a multivariate framework (see Serfling and Xiao, 2007). Random matrix theory (see Edelman, 1989) allows us to extract factors which contain real information. An empirical study on the American market shows that the Global Minimum L-variance Portfolio (GMLP) obtained from our estimator outperforms the Global Minimum Variance Portfolio (GMVP) obtained from the empirical estimator of the covariance matrix.
    Keywords: Covariance matrix, Lvariance-covariance, Lcorrelation, concomitance, random matrix theory.
    JEL: G11
    Date: 2008–12
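The random matrix theory step can be sketched in its classical covariance/correlation form: eigenvalues of an empirical correlation matrix that fall inside the Marchenko-Pastur bulk are indistinguishable from pure noise and can be flattened, keeping only the informative factors. The paper applies this idea to L-moment-based matrices; the version below uses ordinary correlations and is only illustrative.

```python
import numpy as np

def rmt_filter(returns):
    """Replace eigenvalues of the empirical correlation matrix lying below
    the Marchenko-Pastur upper edge (the pure-noise band) by their average."""
    T, N = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    lam_max = (1 + np.sqrt(N / T)) ** 2       # MP upper edge for q = N / T
    vals, vecs = np.linalg.eigh(corr)
    noise = vals < lam_max
    vals_f = vals.copy()
    if noise.any():
        # Averaging the noise band preserves the trace of the matrix.
        vals_f[noise] = vals[noise].mean()
    filtered = vecs @ np.diag(vals_f) @ vecs.T
    np.fill_diagonal(filtered, 1.0)
    return filtered
```

The filtered matrix can then replace the raw estimate when solving for the global minimum variance portfolio, which is where the noisy small eigenvalues do the most damage.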
  16. By: Dominique Guegan (Paris School of Economics - Centre d'Economie de la Sorbonne); Pierre-André Maugis (Centre d'Economie de la Sorbonne)
    Abstract: We present here a new way of building vine copulas that allows us to create a vast number of new vine copulas, allowing for more precise modeling in high dimensions. To deal with this great number of copulas we present a new efficient selection methodology using a lattice structure on the vine set. Our model allows for many degrees of freedom, but further improvements face numerous statistical and computational problems caused by the complexity of vines as estimators, problems that we expose in this paper. Robust n-variate models would be a great breakthrough for asset risk management in banks and insurance companies.
    Keywords: Vines, multivariate copulas, model selection.
    JEL: D81 C10 C40 C52
    Date: 2008–12
  17. By: Balakrishna, B S
    Abstract: The jump distribution for the default intensities in a reduced form framework is modeled and calibrated to provide reasonable fits to CDX.NA.IG and iTraxx Europe CDOs, to 5, 7 and 10 year maturities simultaneously. Calibration is carried out using an efficient Monte Carlo simulation algorithm suitable for both homogeneous and heterogeneous collections of credit names. The underlying jump process is found to relate closely to a maximally skewed stable Levy process with index of stability alpha ~ 1.5.
    Keywords: Default Risk; Default Correlation; Default Intensity; Intensity Model; Levy Density; CDO; Monte Carlo
    JEL: G13
    Date: 2008–07–16
  18. By: Agostino Tarsitano (Dipartimento di Economia e Statistica, Università della Calabria)
    Abstract: Rank correlation is a fundamental tool to express dependence in cases in which the data are arranged in order. There are, by contrast, circumstances where the ordinal association is of a nonlinear type. In this paper we investigate the effectiveness of several measures of rank correlation. These measures have been divided into three classes: conventional rank correlations, weighted rank correlations, and correlations of scores. Our findings suggest that none is systematically better than the others in all circumstances. However, a simple weighted version of the Kendall rank correlation coefficient provides plausible answers to many special situations where inter-category distances could not be considered on the same basis.
    Keywords: Ordinal Data, Nonlinear Association, Weighted Rank Correlation
    Date: 2009–04
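The weighted Kendall coefficient the abstract favors generalizes the ordinary tau by attaching a weight to each pair of observations. A minimal sketch (the specific weight function the paper recommends is not reproduced here; the uniform default recovers Kendall's tau-a):

```python
import numpy as np

def weighted_kendall(x, y, weight=lambda i, j: 1.0):
    """Kendall-type rank correlation where each pair (i, j) contributes with
    weight(i, j); the constant weight gives the ordinary Kendall tau-a."""
    x, y = np.asarray(x), np.asarray(y)
    num = den = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            w = weight(i, j)
            num += w * np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
            den += w
    return num / den
```

Passing a weight that decays with rank (for data sorted by the ranks of x) emphasizes agreement among the top-ranked items, one way to express the unequal inter-category distances the abstract mentions.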

This nep-ecm issue is ©2009 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.