nep-ecm New Economics Papers
on Econometrics
Issue of 2006‒03‒11
fifteen papers chosen by
Sune Karlsson
Orebro University

  1. The Power of Bootstrap and Asymptotic Tests By Russell Davidson; James MacKinnon
  2. "A New Light from Old Wisdoms : Alternative Estimation Methods of Simultaneous Equations with Possibly Many Instruments" By T. W. Anderson; Naoto Kunitomo; Yukitoshi Matsushita
  3. Probit Models with Binary Endogenous Regressors By Jacob Nielsen Arendt; Anders Holm
  4. College Education and Wages in the U.K.: Estimating Conditional Average Structural Functions in Nonadditive Models with Binary Endogenous Variables By Tobias J. Klein
  5. Matching Estimation of Dynamic Treatment Models: Some Practical Issues By Michael Lechner
  6. Graphical Data Representation in Bankruptcy Analysis By Wolfgang Härdle; Rouslan Moro; Dorothea Schäfer
  7. A Large Deviation Approach to the Measurement of Mobility By Robert Aebi; Klaus Neusser; Peter Steiner
  8. Forecasting interest rates: A Comparative Assessment of some second generation non-linear models By Dilip M. Nachane; Jose G. Clavel
  9. Using Predicted Outcome Stratified Sampling to Reduce the Variability in Predictive Performance of a One-Shot Train-and-Test Split for Individual Customer Predictions By G. VERSTRAETEN; D. VAN DEN POEL
  10. Estimation with the Nested Logit Model: Specifications and Software Particularities By Nadja Silberhorn; Yasemin Boztug; Lutz Hildebrandt
  11. A Review of Methodological Research Pertinent to Longitudinal Survey Design and Data Collection By Nick Buck; Jonathan Burton; Annette Jäckle; Heather Laurie; Peter Lynn
  12. Heterogeneity and Microeconometrics Modelling By Martin Browning; Jesus Carro
  14. I didn't run a single regression By Christian Müller
  15. Concordant Convergence Empirics By Don J Webber; Paul White

  1. By: Russell Davidson (McGill University); James MacKinnon (Queen's University)
    Abstract: We introduce the concept of the bootstrap discrepancy, which measures the difference between the rejection probability of a bootstrap test based on a given test statistic and that of a (usually infeasible) test based on the true distribution of the statistic. We show that the bootstrap discrepancy is of the same order of magnitude under the null hypothesis and under non-null processes described by a Pitman drift. However, complications arise in the measurement of power. If the test statistic is not an exact pivot, critical values depend on which data-generating process (DGP) is used to determine the distribution under the null hypothesis. We propose as the proper choice the DGP which minimizes the bootstrap discrepancy. We also show that, under an asymptotic independence condition, the power of both bootstrap and asymptotic tests can be estimated cheaply by simulation. The theory of the paper and the proposed simulation method are illustrated by Monte Carlo experiments using the logit model.
    Keywords: bootstrap test, bootstrap discrepancy, Pitman drift, drifting DGP, Monte Carlo, test power
    JEL: C12 C15
    Date: 2004–07
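As an illustration of the kind of simulation the paper builds on, here is a minimal sketch of a bootstrap test: resample under the null, recompute the statistic, and compare. The toy DGP, the t-statistic, and all names below are our own assumptions for exposition, not the paper's setup.

```python
import numpy as np

def bootstrap_pvalue(x, n_boot=999, rng=None):
    """Bootstrap p-value for H0: mean = 0, using the t-statistic.
    A toy stand-in for the paper's setting, not its actual procedure."""
    rng = np.random.default_rng(rng)
    n = len(x)
    tau = np.sqrt(n) * x.mean() / x.std(ddof=1)
    # Bootstrap DGP: resample from the data re-centred so the null holds
    centred = x - x.mean()
    taus = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(centred, size=n, replace=True)
        taus[b] = np.sqrt(n) * xb.mean() / xb.std(ddof=1)
    # Two-sided bootstrap p-value
    return (1 + np.sum(np.abs(taus) >= abs(tau))) / (n_boot + 1)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50)   # data generated under the null
p = bootstrap_pvalue(x, rng=1)
```

Estimating rejection probabilities, as the paper does, then amounts to repeating this over many simulated samples and counting how often p falls below the nominal level.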
  2. By: T. W. Anderson (Department of Statistics and Department of Economics, Stanford University); Naoto Kunitomo (Faculty of Economics, University of Tokyo); Yukitoshi Matsushita (Graduate School of Economics, University of Tokyo)
    Abstract: We compare four different estimation methods for a coefficient of a linear structural equation with instrumental variables. As the classical methods we consider the limited information maximum likelihood (LIML) estimator and the two-stage least squares (TSLS) estimator, and as the semi-parametric estimation methods we consider the maximum empirical likelihood (MEL) estimator and the generalized method of moments (GMM) (or estimating equation) estimator. We prove several theorems, both new and old, on the asymptotic optimality of the LIML estimator when the number of instruments is large, and relate them to the results of some recent studies. Tables and figures of the distribution functions of the four estimators are given for enough values of the parameters to cover most cases of interest. We find that the LIML estimator performs well when the number of instruments is large, that is, in the micro-econometric models with many instruments in the terminology of the recent econometric literature.
    Date: 2006–02
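For readers less familiar with the classical estimators compared here, the following is a minimal TSLS sketch in a many-instruments design. The DGP, coefficient values, and names are illustrative assumptions; the paper's LIML, MEL, and GMM estimators are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 20                      # many (weak-ish) instruments
Z = rng.normal(size=(n, k))
first_stage = np.full(k, 0.1)       # assumed first-stage coefficients
u = rng.normal(size=n)
v = 0.8 * u + rng.normal(size=n)    # endogeneity: corr(u, v) > 0
x = Z @ first_stage + v
y = 1.0 * x + u                     # true structural coefficient beta = 1

# TSLS: project x on the instrument space, then regress y on the projection
Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)
beta_tsls = (x @ Pz @ y) / (x @ Pz @ x)
beta_ols = (x @ y) / (x @ x)        # OLS for comparison (biased upward here)
```

In designs like this, OLS is inconsistent because of the correlation between u and v, while TSLS accumulates a many-instruments bias as k grows; this is exactly the regime in which the paper argues for LIML.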
  3. By: Jacob Nielsen Arendt (Department of Business and Economics, University of Southern Denmark); Anders Holm (Department of Sociology, University of Copenhagen)
    Abstract: Sample selection and endogeneity are frequent causes of bias in non-experimental empirical studies. In binary models a standard solution involves complex multivariate models. A simple approximation has been shown to work well in bivariate models. This paper extends the approximation to a trivariate model. Simulations show that the approximation outperforms full maximum likelihood, while a least squares approximation may be severely biased. The methods are used to estimate the influence of trust in the parliament and in politicians on voting propensity. No previous study has allowed for endogeneity of trust in voting, and it is shown to severely affect the results.
    Keywords: endogeneity; multivariate probit; approximation; Monte Carlo simulation
    Date: 2006–02
  4. By: Tobias J. Klein (University of Mannheim, Department of Economics)
    Abstract: We propose and implement an estimator for identifiable features of correlated random coefficient models with binary endogenous variables and nonadditive errors in the outcome equation. It is suitable, e.g., for estimation of the average returns to college education when they are heterogeneous across individuals and correlated with the schooling choice. The estimated features are of central interest to economists and are directly linked to the marginal and average treatment effect in policy evaluation. They are identified under assumptions weaker than typical exclusion restrictions used in the context of classical instrumental variables analysis. In our application for the U.K., we relate levels of expected wages to unobserved ability, measured ability, family background, type of secondary school, and the decision whether to attend college.
    Keywords: Returns to college education, correlated random coefficient model, local instrumental variables, local linear regression
    JEL: C14 C31 J31
    Date: 2006–02
  5. By: Michael Lechner
    Abstract: Lechner and Miquel (2001) approached the causal analysis of sequences of interventions from a potential outcome perspective based on selection-on-observables type assumptions (sequential conditional independence assumptions). Lechner (2004) proposed matching estimators for this framework. However, many practical issues that might have substantial consequences for the interpretation of the results have not been thoroughly investigated so far. This paper discusses some of these practical issues. The discussion is based on estimates from an artificial data set for which the true values of the parameters are known and which shares many features of data that could be used for an empirical dynamic matching analysis.
    JEL: C31 C41
    Date: 2006–01
  6. By: Wolfgang Härdle; Rouslan Moro; Dorothea Schäfer
    Abstract: Graphical data representation is an important tool for model selection in bankruptcy analysis, since the problem is highly non-linear and its numerical representation is much less transparent. In classical rating models a convenient representation of ratings in closed form is possible, reducing the need for graphical tools. In contrast, non-linear non-parametric models, which often achieve better accuracy, rely on visualisation. We demonstrate an application of visualisation techniques at different stages of corporate default analysis based on Support Vector Machines (SVM). These stages are the selection of variables (predictors), probability of default (PD) estimation, and the representation of PDs for two- and higher-dimensional models with colour coding. It is at this stage that the selection of a proper colour scheme becomes essential for a correct visualisation of PDs. The mapping of scores into PDs is done as a non-parametric regression with monotonisation. The SVM learns a non-parametric score function that is, in turn, non-parametrically transformed into PDs. Since PDs cannot be represented in closed form, other ways of displaying them must be found. Graphical tools give this possibility.
    Keywords: company rating, default probability, support vector machines, colour coding
    JEL: C14 G33 C45
    Date: 2006–02
  7. By: Robert Aebi; Klaus Neusser; Peter Steiner
    Abstract: We propose an approach to measure the mobility immanent in regular Markov processes. For this purpose, we distinguish between mobility in equilibrium and mobility associated with convergence towards equilibrium. The former aspect is measured as the expectation of a functional, defined on the Cartesian square of the state space, with respect to the invariant distribution. Based on large deviations techniques, we show how the two aspects of mobility are related and how the second one can be characterized by a certain relative entropy. Finally, we show that some prominent mobility indices can be considered as special cases.
    Keywords: mobility index; large deviations; relative entropy
    JEL: C22 J62
    Date: 2005–12
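The two ingredients of the equilibrium measure, the invariant distribution and the expectation of a functional under it, are easy to sketch for a small chain. The transition matrix and the choice f(i, j) = 1{i ≠ j} below are our illustration, not the paper's:

```python
import numpy as np

# A regular 3-state transition matrix (illustrative, not from the paper)
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# Invariant distribution: left eigenvector of P for eigenvalue 1
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

# Equilibrium mobility: expectation of f(i, j) under pi_i * P_ij.
# With f = 1{i != j} this is the expected per-period move rate.
f = 1.0 - np.eye(3)
mobility = float(np.sum(pi[:, None] * P * f))

# The Shorrocks-Prais trace index, one of the "prominent mobility indices"
n = P.shape[0]
shorrocks = (n - np.trace(P)) / (n - 1)
```

For this matrix the invariant distribution is (2/7, 3/7, 2/7), the expected move rate is 0.4, and the trace index is 0.6.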
  8. By: Dilip M. Nachane (Indira Gandhi Institute of Development Research); Jose G. Clavel (Universidad de Murcia)
    Abstract: Modelling and forecasting of interest rates has traditionally proceeded in the framework of linear stationary models such as ARMA and VAR, but only with moderate success. We examine here four models which account for several specific features of real-world asset prices, such as non-stationarity and non-linearity. Our four candidate models are based respectively on wavelet analysis, mixed spectrum analysis, non-linear ARMA models with Fourier coefficients, and the Kalman filter. These models are applied to weekly data on interest rates in India, and their forecasting performance is evaluated vis-à-vis three GARCH models (GARCH(1,1), GARCH-M(1,1) and EGARCH(1,1)) as well as the random walk model. The Kalman filter model emerges on top, with the wavelet and mixed spectrum models also showing considerable promise.
    Keywords: Interest rates, wavelets, mixed spectra, non-linear ARMA, Kalman filter, GARCH, Forecast encompassing
    Date: 2005
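Of the four candidate models, the Kalman filter is the simplest to sketch. Below is a one-step-ahead forecasting recursion for a local-level (random-walk-plus-noise) state-space model; the specification and the variance values are our simplifying assumptions, not the model estimated in the paper.

```python
import numpy as np

def local_level_forecasts(y, q=0.1, r=1.0):
    """One-step-ahead forecasts from a local-level model via the
    Kalman filter. q (state) and r (observation) variances are assumed."""
    a, p = y[0], 1.0              # simple initialisation at the first value
    preds = []
    for yt in y:
        preds.append(a)           # forecast of y_t given data up to t-1
        # Update step: blend forecast and observation by the Kalman gain
        gain = (p + q) / (p + q + r)
        a = a + gain * (yt - a)
        p = (1 - gain) * (p + q)
    return np.array(preds)

rates = np.array([5.0, 5.1, 5.3, 5.2, 5.4, 5.5])
forecasts = local_level_forecasts(rates)
```

Comparing such recursive forecasts against GARCH and random-walk benchmarks, as the paper does, is then a matter of computing out-of-sample error statistics on the same data.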
  9. By: G. VERSTRAETEN; D. VAN DEN POEL
    Abstract: Since it is generally recognized that models evaluated on the data used to construct them are overly optimistic, in predictive modeling practice the assessment of a model's predictive performance frequently relies on a one-shot train-and-test split between observations used for estimating a model and those used for validating it. Previous research has indicated the usefulness of stratified sampling for reducing the variation in predictive performance in a linear regression application. In this paper, we validate the previous findings on six real-life European predictive modeling applications for marketing and credit scoring using a dichotomous outcome variable. We find confirmation of the reduction in variability using a procedure we describe as predicted outcome stratified sampling in a logistic regression model, and we find that the gain in variation reduction is, also in large data sets, almost always significant, and in certain applications markedly high.
    Date: 2006–01
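The idea of predicted-outcome stratified sampling can be sketched roughly as follows: score every observation with a preliminary model, then draw the test set evenly from score strata instead of fully at random. The decile scheme and all names below are our assumptions, not the authors' exact procedure.

```python
import numpy as np

def stratified_split(scores, test_frac=0.3, n_bins=10, rng=None):
    """Split indices into train/test, stratifying on a predicted score:
    within each score decile, draw the same fraction into the test set."""
    rng = np.random.default_rng(rng)
    order = np.argsort(scores)
    strata = np.array_split(order, n_bins)     # score deciles
    test = []
    for s in strata:
        s = rng.permutation(s)
        test.extend(s[: int(round(test_frac * len(s)))])
    test = np.array(sorted(test))
    train = np.setdiff1d(np.arange(len(scores)), test)
    return train, test

scores = np.random.default_rng(0).random(1000)  # stand-in predicted scores
train, test = stratified_split(scores, rng=1)
```

Because every score stratum is represented in the same proportion in both halves, repeated splits vary less in measured predictive performance than purely random splits, which is the effect the paper quantifies.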
  10. By: Nadja Silberhorn; Yasemin Boztug; Lutz Hildebrandt
    Abstract: Due to its ability to allow for and account for similarities between pairs of alternatives, the nested logit model is increasingly used in practical applications. However, the fact that there are two different specifications of the nested logit model has not received adequate attention. The utility maximization nested logit (UMNL) model and the non-normalized nested logit (NNNL) model have different properties and influence the estimation results in different ways. As the NNNL specification is not consistent with random utility theory (RUT), the UMNL form is preferred. This article introduces the distinct specifications of the nested logit model and points out particularities arising in model estimation. Additionally, it demonstrates the performance of simulation studies with the nested logit model. In simulation studies using NNNL software (e.g. PROC MDC in SAS©), the simulation of the utility function's error terms needs to assume RUT-conformity; but as the NNNL specification is not consistent with RUT, the input parameters cannot be reproduced without imposing restrictions. The effects of using various software packages on the estimation results of a nested logit model are shown on the basis of a simulation study.
    Keywords: nested logit model, utility maximization nested logit, non-normalized nested logit, simulation study
    Date: 2006–02
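The distinction the paper discusses hinges on how the nested logit is normalised. Below is a minimal sketch of the UMNL choice probabilities (the function and names are our illustration, not any package's API); with all dissimilarity parameters equal to one it collapses to the plain multinomial logit:

```python
import numpy as np

def umnl_probs(v, nests, lam):
    """UMNL choice probabilities. v: utilities per alternative;
    nests: list of index arrays; lam: one dissimilarity parameter
    per nest (lam = 1 everywhere reduces to the MNL)."""
    v = np.asarray(v, dtype=float)
    probs = np.empty_like(v)
    # Inclusive value of each nest (the log-sum, scaled by lambda)
    iv = np.array([lam[m] * np.log(np.sum(np.exp(v[g] / lam[m])))
                   for m, g in enumerate(nests)])
    p_nest = np.exp(iv) / np.exp(iv).sum()      # upper-level choice
    for m, g in enumerate(nests):
        within = np.exp(v[g] / lam[m])
        probs[g] = p_nest[m] * within / within.sum()  # lower-level choice
    return probs

p = umnl_probs([1.0, 1.0, 0.0],
               nests=[np.array([0, 1]), np.array([2])],
               lam=[0.5, 1.0])
```

The NNNL form omits the division of utilities by lambda at the lower level, which is why its parameters are not directly comparable to UMNL estimates without rescaling.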
  11. By: Nick Buck (Institute for Social and Economic Research); Jonathan Burton (Cabinet Office); Annette Jäckle (Institute for Social and Economic Research); Heather Laurie (Institute for Social and Economic Research); Peter Lynn (Institute for Social and Economic Research)
    Abstract: This paper presents a review of methodological research regarding issues that are pertinent to surveys involving longitudinal data collection, i.e. repeated measurement over time on the same units. The objective of the review is to identify important gaps in our knowledge of issues affecting the design and implementation of such surveys and the use of the data that they provide. This should help to inform the development of an agenda for future methodological research as well as serving as a useful summary of current knowledge. The issues addressed relate to sample design, missing data (as a result of item and unit non-response and attrition) and measurement error (including panel conditioning).
    Keywords: complex surveys, data collection, dependent interviewing, household surveys, item non-response, longitudinal data quality, measurement error, methodology, non-contacts, non-response bias, panel attrition, panel data estimation, questionnaire design, recall error, refusals, respondent incentives, response rates, sampling, seam effect, survey errors, survey methodology, survey non-response, survey quality
    Date: 2005–12
  12. By: Martin Browning (Department of Economics, University of Copenhagen); Jesus Carro (Department of Economics, Carlos III, Madrid)
    Abstract: Presented at the 2005 Econometric Society World Congress Plenary Session on "Modelling Heterogeneity". We survey the treatment of heterogeneity in applied microeconometric analyses. There are three themes. First, there is usually much more heterogeneity than empirical researchers allow for. Second, the inappropriate treatment of heterogeneity can lead to serious error when estimating outcomes of interest. Finally, once we move away from the traditional linear model with a single 'fixed effect', it is very difficult to account for heterogeneity while fitting the data and maintaining coherence with theoretical structures. The latter task is one for economists: "heterogeneity is too important to be left to the statisticians". The paper concludes with a report of our own research on dynamic discrete choice models that allow for maximal heterogeneity.
    Keywords: heterogeneity; applied microeconometrics; fixed effects; dynamic discrete choice
    JEL: C30 C33 C51
    Date: 2006–01
  13. By: Dirk Baur; Renee Fry
    Abstract: This paper poses a multivariate test for contagion that distinguishes between vulnerability, positive and negative contagion. The model provides a time series of contagion with which the existence, severity and significance of crisis periods can be endogenously determined. Eleven stock markets from the Asian region are analyzed during the Asian crisis, and contagion is significant in four periods. These episodes are split equally between positive and negative movements. Anecdotal evidence is matched to the significant contagion episodes, with events surrounding Hong Kong among the key drivers.
    JEL: C10 C51 F36 G14
    Date: 2006–01
  14. By: Christian Müller (Swiss Institute for Business Cycle Research (KOF), Swiss Federal Institute of Technology Zurich (ETH))
    Abstract: Growth regression economics is haunted by the fact that results are easily overturned by estimating alternative model specifications. Recent research therefore aims at obtaining robust regression results by systematically running multiple models and picking surviving variables. This note shows that a very popular one of these approaches, the robust regression due to Sala-i-Martin (1997), very likely leads to inconsistent conclusions, but may be remedied by refining the ‘testimation’ algorithm. To this end I do not need to run a single regression.
    Keywords: robust estimation, growth regression
    JEL: C50
    Date: 2006–01
  15. By: Don J Webber (School of Economics, University of the West of England); Paul White (Faculty of Computing, Engineering and Mathematical Sciences, University of the West of England)
    Abstract: We present a new model to test the convergence hypothesis based on the ideas of concordance and then employ the model to test empirically for GDP per capita convergence across 97 countries. Our results suggest the presence of switching, while there is more ‘strong divergence’ than ‘strong convergence’.
    Keywords: Convergence; Concordance; Income per capita.
    JEL: C14 F19
    Date: 2004–12

This nep-ecm issue is ©2006 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at . For comments, please write to the director of NEP, Marco Novarese, at <>. Put “NEP” in the subject line, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.