nep-ecm New Economics Papers
on Econometrics
Issue of 2007‒04‒21
seventeen papers chosen by
Sune Karlsson
Orebro University

  1. On Weighted Estimation in Linear Regression in the Presence of Parameter Uncertainty By Judith A. Clarke
  2. An Embarrassment of Riches: Forecasting Using Large Panels By Eklund, Jana; Karlsson, Sune
  3. Heterogeneous Treatment Effects: Instrumental Variables without Monotonicity? By Tobias J. Klein
  4. Copula-based measures of dependence structure in assets returns By Viviana Fernandez
  5. Forecasting crude oil and natural gas spot prices by classification methods By Viviana Fernández
  6. A Multivariate Integer Count Hurdle Model: Theory and Application to Exchange Rate Dynamics By Katarzyna Bien; Ingmar Nolte; Winfried Pohlmeier
  7. The relationship between ARIMA-GARCH and unobserved component models with GARCH disturbances By Santiago Pellegrini; Esther Ruiz; Antoni Espasa
  8. Environmental Efficiency Measurement with Translog Distance Functions: A Parametric Approach By Cuesta, Rafael A.; Knox Lovell, C.A.; Zofío, José Luis
  9. A class of spatial econometric methods in the empirical analysis of clusters of firms in the space By Giuseppe Arbia; Giuseppe Espa; Danny Quah
  10. Assessing the reasonableness of an imputation model By Maarten L. Buis
  11. Providing Intuition to the Fieller Method with Two Geometric Representations using STATA and Eviews By J.G. Hirschberg; J. N. Lye
  12. A measure of association (correlation) in nominal data (contingency tables), using determinants By Colignatus, Thomas
  13. Regression Discontinuity Designs: A Guide to Practice By Guido Imbens; Thomas Lemieux
  14. The Propensity Score method in public policy evaluation: a survey. By Bia, Michela
  15. Exact prediction of inflation in the USA By Ivan, Kitov
  16. The nearest correlation matrix problem: Solution by differential evolution method of global optimization By Mishra, SK
  17. "Misclassification of the Dependent Variable in Binary Choice Models: Evidence from Five Latin American Countries" By Evangelos M. Falaris

  1. By: Judith A. Clarke (Department of Economics, University of Victoria)
    Abstract: We consider estimating the linear regression model’s coefficients when there is uncertainty about coefficient restrictions. Theorems establish that the mean squared errors of combination estimators, formed as weighted averages of the ordinary least squares and one or more restricted least squares estimators, depend on finding the optimal estimator of a single normally distributed vector. Our results generalize those of Magnus and Durbin (1999) [Magnus, J.R., Durbin, J. 1999. Estimation of regression coefficients of interest when other regression coefficients are of no interest. Econometrica 67, 639-643] and Danilov and Magnus (2004) [Danilov, D., Magnus, J.R. 2004. On the harm that ignoring pretesting can cause. Journal of Econometrics 122, 27-46].
    Keywords: Logit, Mean squared error, weighted estimator, linear restrictions
    JEL: C12 C13 C20 C52
    Date: 2007–04–09
    URL: http://d.repec.org/n?u=RePEc:vic:vicewp:0701&r=ecm
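    A minimal numerical sketch of the kind of combination estimator studied in this paper: a weighted average of the OLS and restricted least squares estimators. The data, the restrictions, and the Wald-based weight below are illustrative assumptions, not the MSE-optimal weighting derived in the paper.

import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 4
X = rng.normal(size=(n, k))
beta_true = np.array([1.0, 0.5, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

# Unrestricted OLS
XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y

# Restricted LS imposing R beta = r (here: last two coefficients equal zero)
R = np.array([[0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
r = np.zeros(2)
A = R @ XtX_inv @ R.T
b_rls = b_ols - XtX_inv @ R.T @ np.linalg.solve(A, R @ b_ols - r)

# Wald statistic for the restrictions and an illustrative (hypothetical) weight w = W/(W+q)
resid = y - X @ b_ols
s2 = resid @ resid / (n - k)
W = (R @ b_ols - r) @ np.linalg.solve(s2 * A, R @ b_ols - r)
w = W / (W + R.shape[0])

b_combined = w * b_ols + (1 - w) * b_rls   # weighted average of OLS and RLS
print(np.round(b_combined, 3))
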
  2. By: Eklund, Jana (Bank of England); Karlsson, Sune (Department of Business, Economics, Statistics and Informatics)
    Abstract: The increasing availability of data and potential predictor variables poses new challenges to forecasters. The task of formulating a single forecasting model that can extract all the relevant information is becoming increasingly difficult in the face of this abundance of data. The two leading approaches to addressing this "embarrassment of riches" are philosophically distinct. One approach builds forecast models based on summaries of the predictor variables, such as principal components, and the second approach is analogous to forecast combination, where the forecasts from a multitude of possible models are averaged. Using several data sets we compare the performance of the two approaches in the guise of the diffusion index or factor models popularized by Stock and Watson and forecast combination as an application of Bayesian model averaging. We find that none of the methods is uniformly superior and that no method performs better than, or is outperformed by, a simple AR(p) process.
    Keywords: Bayesian model averaging; Diffusion indexes; GDP growth rate; Inflation rate
    JEL: C11 C51 C52 C53
    Date: 2007–03–31
    URL: http://d.repec.org/n?u=RePEc:hhs:oruesi:2007_001&r=ecm
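    The two approaches compared above can be sketched on simulated data: a diffusion-index forecast built from principal components of a large predictor panel, and, as a crude stand-in for Bayesian model averaging, an equal-weight average of single-predictor forecasts. Dimensions and data are made up.

import numpy as np

rng = np.random.default_rng(1)
T, N, k = 120, 50, 3                       # time periods, predictors, factors kept
X = rng.normal(size=(T, N))                # large predictor panel
y = 0.5 * X[:, :5].sum(axis=1) + rng.normal(size=T)

# Diffusion-index forecast: principal components of the standardized panel
Xs = (X - X.mean(0)) / X.std(0)
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
F = Xs @ Vt[:k].T                          # first k factors
Z = np.column_stack([np.ones(T - 1), F[:-1]])
coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
f_factor = np.r_[1.0, F[-1]] @ coef        # one-step-ahead forecast

# Crude "combination" benchmark: equal-weight average of single-predictor forecasts
preds = []
for j in range(N):
    Zj = np.column_stack([np.ones(T - 1), X[:-1, j]])
    cj, *_ = np.linalg.lstsq(Zj, y[1:], rcond=None)
    preds.append(np.r_[1.0, X[-1, j]] @ cj)
f_combo = np.mean(preds)
print(round(f_factor, 3), round(f_combo, 3))
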
  3. By: Tobias J. Klein (University of Mannheim and IZA)
    Abstract: A fundamental identification problem in program evaluation arises when idiosyncratic gains from participation and the treatment decision depend on each other. Imbens and Angrist (1994) were the first to exploit a monotonicity condition in order to identify an average treatment effect parameter using instrumental variables. More recently, Heckman and Vytlacil (1999) suggested estimation of a variety of treatment effect parameters using a local version of their approach. However, identification hinges on the same monotonicity assumption that is fundamentally untestable. We investigate the sensitivity of respective estimates to reasonable departures from monotonicity that are likely to be encountered in practice and relate it to properties of a structural parameter. One of our results is that the bias vanishes under a testable linearity condition. Our findings are illustrated in a Monte Carlo analysis.
    Keywords: program evaluation, heterogeneity, dummy endogenous variable, selection on unobservables, instrumental variables, monotonicity, identification
    JEL: C21
    Date: 2007–04
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp2738&r=ecm
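    For context, with a binary instrument and no covariates the local average treatment effect identified under the Imbens-Angrist monotonicity condition reduces to the Wald estimator. A small simulated sketch (not from the paper):

import numpy as np

rng = np.random.default_rng(2)
n = 5000
z = rng.integers(0, 2, size=n)                                 # binary instrument
u = rng.normal(size=n)                                         # unobserved confounder
d = (0.3 * z + 0.2 * u + rng.normal(size=n) > 0).astype(int)   # treatment take-up
y = 1.0 * d + u + rng.normal(size=n)                           # outcome with endogenous treatment

# Wald / LATE estimate: reduced-form contrast divided by first-stage contrast
late = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())
print(round(late, 3))
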
  4. By: Viviana Fernandez
    Abstract: Copula modeling has become an increasingly popular tool in finance for modeling the dependence of asset returns. In essence, copulas enable us to extract the dependence structure from the joint distribution function of a set of random variables and, at the same time, to separate the dependence structure from the univariate marginal behavior. In this study, based on U.S. stock data, we illustrate how tail-dependency tests may be misleading as a tool to select a copula that closely mimics the dependency structure of the data. This problem becomes more severe when the data are scaled by conditional volatility and/or filtered for serial correlation. The discussion is complemented, under more general settings, with Monte Carlo simulations.
    Date: 2006
    URL: http://d.repec.org/n?u=RePEc:edj:ceauch:228&r=ecm
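    A rough empirical check of the tail dependency discussed above: estimate the upper-tail dependence coefficient from the ranks of two series at a few thresholds. Simulated heavy-tailed returns stand in for the U.S. stock data; the paper's copula selection exercise is not reproduced.

import numpy as np

rng = np.random.default_rng(3)
n = 2000
common = rng.standard_t(4, size=n)
x = common + 0.5 * rng.standard_t(4, size=n)     # two dependent, heavy-tailed return series
y = common + 0.5 * rng.standard_t(4, size=n)

# Pseudo-observations: ranks rescaled to (0, 1)
u = (np.argsort(np.argsort(x)) + 1) / (n + 1.0)
v = (np.argsort(np.argsort(y)) + 1) / (n + 1.0)

# Empirical upper-tail dependence: P(V > q | U > q) for thresholds q near 1
for q in (0.90, 0.95, 0.99):
    lam = np.mean((u > q) & (v > q)) / (1.0 - q)
    print(q, round(lam, 3))
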
  5. By: Viviana Fernández
    Abstract: In this article, we forecast crude oil and natural gas spot prices at a daily frequency based on two classification techniques: artificial neural networks (ANN) and support vector machines (SVM). As a benchmark, we utilize an autoregressive integrated moving average (ARIMA) specification. We evaluate out-of-sample forecasts based on encompassing tests and the mean-squared prediction error (MSPE). We find that at short time horizons (e.g., 2-4 days), ARIMA tends to outperform both ANN and SVM. However, at longer time horizons (e.g., 10-20 days), we find that in general ARIMA is encompassed by these two methods, and linear combinations of ANN and SVM forecasts are more accurate than the corresponding individual forecasts. Based on MSPE calculations, we reach similar conclusions: the two classification methods under consideration outperform ARIMA at longer time horizons.
    Date: 2006
    URL: http://d.repec.org/n?u=RePEc:edj:ceauch:229&r=ecm
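    A stripped-down version of the comparison above, assuming scikit-learn is available: fit a support vector regression and a linear autoregression on lagged values of a simulated series and compare mean squared prediction errors on a holdout sample. The paper's daily crude oil and natural gas data and its encompassing tests are not reproduced.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(4)
T, p = 600, 5
e = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):                      # simple AR(1) stand-in for a price series
    x[t] = 0.9 * x[t - 1] + e[t]

# Lagged-feature matrix for one-step-ahead prediction
Xlag = np.column_stack([x[p - j - 1:T - j - 1] for j in range(p)])
ytgt = x[p:]
split = len(ytgt) - 100
Xtr, Xte, ytr, yte = Xlag[:split], Xlag[split:], ytgt[:split], ytgt[split:]

svm_pred = SVR(C=1.0, epsilon=0.1).fit(Xtr, ytr).predict(Xte)

Ztr = np.column_stack([np.ones(split), Xtr])       # linear AR(p) benchmark
coef, *_ = np.linalg.lstsq(Ztr, ytr, rcond=None)
ar_pred = np.column_stack([np.ones(len(yte)), Xte]) @ coef

print("MSPE SVM:", round(np.mean((yte - svm_pred) ** 2), 3))
print("MSPE AR :", round(np.mean((yte - ar_pred) ** 2), 3))
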
  6. By: Katarzyna Bien (University of Konstanz); Ingmar Nolte (University of Konstanz); Winfried Pohlmeier (University of Konstanz)
    Abstract: In this paper we propose a model for the conditional multivariate density of integer count variables defined on the set Z^n. Applying the concept of copula functions, we allow for a general form of dependence between the marginal processes which is able to pick up the complex nonlinear dynamics of multivariate financial time series at high frequencies. We use the model to estimate the conditional bivariate density of the high frequency changes of the EUR/GBP and the EUR/USD exchange rates.
    Keywords: Integer Count Hurdle, Copula Functions, Discrete Multivariate Distributions, Foreign Exchange Market
    JEL: G10 F30 C30
    Date: 2006–11–14
    URL: http://d.repec.org/n?u=RePEc:knz:cofedp:0606&r=ecm
  7. By: Santiago Pellegrini; Esther Ruiz; Antoni Espasa
    Abstract: The objective of this paper is to analyze the consequences of fitting ARIMA-GARCH models to series generated by conditionally heteroscedastic unobserved component models. Focusing on the local level model, we show that the heteroscedasticity is weaker in the ARIMA than in the local level disturbances. In certain cases, the IMA(1,1) model could even be wrongly seen as homoscedastic. Next, with regard to forecasting performance, we show that the prediction intervals based on the ARIMA model can be inappropriate as they incorporate the unit root while the intervals of the local level model can converge to the homoscedastic intervals when the heteroscedasticity appears only in the transitory noise. All the analytical results are illustrated with simulated and real time series.
    Date: 2007–04
    URL: http://d.repec.org/n?u=RePEc:cte:wsrepe:ws072706&r=ecm
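    To see the mechanism analyzed above, one can simulate a local level model whose transitory disturbance follows a GARCH(1,1) process; the first differences then have the MA(1) reduced form of an IMA(1,1) model, with the heteroscedasticity diluted across two noise terms. The parameters below are arbitrary.

import numpy as np

rng = np.random.default_rng(5)
T = 2000
omega, alpha, beta = 0.1, 0.15, 0.80        # GARCH(1,1) parameters of the transitory noise
h = np.full(T, omega / (1 - alpha - beta))  # conditional variance, initialized at its mean
eps = np.zeros(T)                           # transitory (measurement) disturbance
for t in range(1, T):
    h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
    eps[t] = np.sqrt(h[t]) * rng.normal()

eta = 0.5 * rng.normal(size=T)              # homoscedastic level disturbance
mu = np.cumsum(eta)                         # local level (random walk)
y = mu + eps                                # observed series

# First differences: dy_t = eta_t + eps_t - eps_{t-1}, an MA(1) in reduced form
dy = np.diff(y)
acf1 = np.corrcoef(dy[1:], dy[:-1])[0, 1]
print("lag-1 autocorrelation of dy:", round(acf1, 3))
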
  8. By: Cuesta, Rafael A. (Departamento de Economía, Universidad de Oviedo, E-33071, Oviedo, Spain.); Knox Lovell, C.A. (Department of Economics, Terry College of Business, University of Georgia, Athens, GA 30602, USA); Zofío, José Luis (Departamento de Análisis Económico (Teoría e Historia Económica). Universidad Autónoma de Madrid.)
    Abstract: We use a flexible parametric hyperbolic distance function to estimate environmental efficiency when some outputs are undesirable. Cuesta and Zofio (J. Prod. Analysis (2005), 31-48) introduced this distance function specification in conventional input-output space to estimate technical efficiency within a stochastic frontier context. We extend their approach to accommodate undesirable outputs and to estimate environmental efficiency within a stochastic frontier context. This provides a parametric counterpart to Färe et al.’s popular nonparametric environmental efficiency measures (Rev. Econ. Stat. 75 (1989), 90-98). The distance function model is applied to a panel of U.S. electricity generating units that produce marketed electricity and non-marketed SO2 emissions.
    Keywords: Undesirable outputs; parametric distance functions; stochastic frontier analysis; environmental efficiency
    JEL: C32 L95
    Date: 2007–03
    URL: http://d.repec.org/n?u=RePEc:uam:wpaper:200702&r=ecm
  9. By: Giuseppe Arbia; Giuseppe Espa; Danny Quah
    Abstract: In this paper we aim at identifying stylized facts in order to suggest adequate models of the spatial co–agglomeration of industries. We describe a class of spatial statistical methods to be used in the empirical analysis of spatial clusters. Compared to previous contributions using point pattern methods, the main innovation of the present paper is to consider clustering for bivariate (rather than univariate) distributions, which allows uncovering co–agglomeration and repulsion phenomena between the different industrial sectors. Furthermore, we present the results of an empirical application of such methods to a set of European Patent Office (EPO) data and produce a series of empirical findings on the pair–wise inter–sectoral spatial distribution of patents in Italy in the nineties. In this analysis we are able to identify some distinctive joint patterns of location between patents of different sectors and to propose some possible economic interpretations.
    Keywords: Agglomeration, Bivariate K–functions, co–agglomeration, Non parametric concentration measures, Spatial clusters, Spatial econometrics
    JEL: C21 D92 L60 O18 R12
    Date: 2007
    URL: http://d.repec.org/n?u=RePEc:trn:utwpde:0705&r=ecm
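    A bare-bones estimator of the bivariate (cross) K-function underlying this kind of co-agglomeration analysis, ignoring the edge corrections and inhomogeneity adjustments a serious application would need; the point coordinates are simulated, not EPO data.

import numpy as np

rng = np.random.default_rng(6)
area = 1.0                                  # unit square study region
pts1 = rng.uniform(size=(80, 2))            # locations of patents/firms in sector 1
pts2 = rng.uniform(size=(120, 2))           # locations in sector 2

def cross_K(p1, p2, r, area):
    """Naive bivariate K_12(r): scaled count of sector-2 points within r of sector-1 points."""
    d = np.sqrt(((p1[:, None, :] - p2[None, :, :]) ** 2).sum(-1))
    return area * np.mean(d <= r)

for r in (0.05, 0.10, 0.20):
    # Under two independent uniform patterns, K_12(r) is roughly pi * r^2
    print(r, round(cross_K(pts1, pts2, r, area), 4), round(np.pi * r ** 2, 4))
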
  10. By: Maarten L. Buis (Vrije Universiteit Amsterdam)
    Abstract: Multiple imputation is a popular way of dealing with missing values under the missing at random (MAR) assumption. Imputation models can become quite complicated, for instance, when the model of substantive interest contains many interactions or when the data originate from a nested design. This paper will discuss two methods to assess how plausible the results are. The first method consists of comparing the point estimates obtained by multiple imputation with point estimates obtained by another method for controlling for bias due to missing data. Second, the changes in standard error between the model that ignores the missing cases and the multiple imputation model are decomposed into three components: changes due to changes in sample size, changes due to uncertainty in the imputation model used in multiple imputation, and changes due to changes in the estimates that underlie the standard error. This decomposition helps in assessing the reasonableness of the change in standard error. These two methods will be illustrated with two new user-written Stata commands.
    Date: 2007–04–11
    URL: http://d.repec.org/n?u=RePEc:boc:dsug07:02&r=ecm
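    The standard-error comparisons discussed above rest on Rubin's combination rules, under which the total multiple-imputation variance splits into a within-imputation and a between-imputation part. The sketch below shows that split on invented numbers; the paper's Stata commands are not reproduced.

import numpy as np

# Point estimates and squared standard errors from m imputed data sets (made-up values)
est = np.array([0.52, 0.48, 0.55, 0.50, 0.47])
var = np.array([0.010, 0.011, 0.009, 0.010, 0.012])
m = len(est)

qbar = est.mean()                            # combined point estimate
W = var.mean()                               # within-imputation variance
B = est.var(ddof=1)                          # between-imputation variance
T_total = W + (1 + 1 / m) * B                # Rubin's total variance

print("estimate:", round(qbar, 3))
print("within:", round(W, 4), " between:", round(B, 4), " total SE:", round(np.sqrt(T_total), 4))
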
  11. By: J.G. Hirschberg; J. N. Lye
    Abstract: The Fieller Method for the construction of confidence intervals for ratios of the expected values of two normally distributed random variables has been shown by a number of authors to be superior to the delta approximation. However, it is not widely used, due in part to the tendency to present the intervals only in a formula context. In addition, potential users have been deterred by the difficulty of interpreting non-finite confidence intervals when the confidence level is less than 100%. In this paper we present two graphical methods, which can be easily constructed using two widely used statistical software packages (Eviews and Stata), for the representation of the Fieller intervals. An application is presented to assess the results of a model of the non-accelerating inflation rate of unemployment (NAIRU).
    Keywords: Fieller method, ratios of parameters, confidence interval, confidence ellipsoid, 1st derivative function, NAIRU, EViews, STATA
    JEL: C12 C20 E24
    Date: 2007
    URL: http://d.repec.org/n?u=RePEc:mlb:wpaper:992&r=ecm
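    The Fieller interval comes from inverting a z-test on the linear combination a - theta*b; solving the resulting quadratic in theta gives the interval when it is finite. A compact sketch with made-up estimates and covariances; the paper's graphical Eviews/Stata representations are not reproduced.

import numpy as np

# Estimates a_hat, b_hat for the ratio theta = a/b, their variances and covariance (made up)
a_hat, b_hat = 2.0, 0.8
v_aa, v_bb, v_ab = 0.25, 0.04, 0.02
z = 1.96                                     # approximate 95% critical value

# Fieller: the set of theta with (a_hat - theta*b_hat)^2 <= z^2 (v_aa - 2*theta*v_ab + theta^2*v_bb)
A = b_hat ** 2 - z ** 2 * v_bb
B = -2 * (a_hat * b_hat - z ** 2 * v_ab)
C = a_hat ** 2 - z ** 2 * v_aa
disc = B ** 2 - 4 * A * C

if A > 0 and disc > 0:                       # finite interval case
    lo = (-B - np.sqrt(disc)) / (2 * A)
    hi = (-B + np.sqrt(disc)) / (2 * A)
    print("Fieller 95% CI:", round(lo, 3), round(hi, 3))
else:
    print("Interval is unbounded (denominator not significantly different from zero)")
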
  12. By: Colignatus, Thomas
    Abstract: Nominal data currently lack a correlation coefficient, such as has already been defined for real data. A measure is possible using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table and n ≤ m, the suggested measure is r = Sqrt[det[A'A]] with A = Normalized[M]. With M an n1 × n2 × ... × nk contingency matrix, we can construct a matrix of pairwise correlations R so that the overall correlation is f[R]. An option is to use f[R] = Sqrt[1 - det[R]]. However, for both nominal and cardinal data the advisable choice for such a function f is to take the maximal multiple correlation within R.
    Keywords: association; correlation; contingency table; volume ratio; determinant; nonparametric methods; nominal data; nominal scale; categorical data; Fisher’s exact test; odds ratio; tetrachoric correlation coefficient; phi; Cramer’s V; Pearson; contingency coefficient; uncertainty coefficient; Theil’s U; eta; meta-analysis; Simpson’s paradox; causality; statistical independence
    JEL: C10
    Date: 2007–03–20
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:2662&r=ecm
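    A small sketch of the determinant-based measure proposed above, under one reading of Normalized[M] as scaling each column of the contingency table to unit Euclidean length; the 3 x 2 table is invented.

import numpy as np

# Contingency table M (m x n with n <= m); rows and columns are nominal categories
M = np.array([[30.0, 10.0],
              [10.0, 25.0],
              [ 5.0, 20.0]])

A = M / np.linalg.norm(M, axis=0)            # normalize each column to unit length
r = np.sqrt(np.linalg.det(A.T @ A))          # suggested measure r = Sqrt[det[A'A]]
print(round(r, 4))                           # 1 for orthogonal columns, 0 for proportional columns
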
  13. By: Guido Imbens; Thomas Lemieux
    Abstract: In Regression Discontinuity (RD) designs for evaluating causal effects of interventions, assignment to a treatment is determined at least partly by the value of an observed covariate lying on either side of a fixed threshold. These designs were first introduced in the evaluation literature by Thistlethwaite and Campbell (1960). With the exception of a few unpublished theoretical papers, these methods did not attract much attention in the economics literature until recently. Starting in the late 1990s, there have been a large number of studies in economics applying and extending RD methods. In this paper we review some of the practical and theoretical issues involved in the implementation of RD methods.
    JEL: C14 C21
    Date: 2007–04
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:13039&r=ecm
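    The core computation in a sharp regression discontinuity design is a local linear fit on each side of the cutoff; the sketch below uses simulated data, a uniform kernel, and an arbitrary bandwidth, whereas practice requires careful bandwidth choice and robustness checks as discussed in the paper.

import numpy as np

rng = np.random.default_rng(7)
n, cutoff, h = 2000, 0.0, 0.5
x = rng.uniform(-1, 1, size=n)                            # forcing variable
d = (x >= cutoff).astype(float)                           # sharp treatment assignment
y = 0.4 * d + 1.0 * x + rng.normal(scale=0.5, size=n)     # true jump of 0.4 at the cutoff

def local_linear_intercept(xs, ys):
    """Intercept of a linear fit, i.e. the fitted value at the cutoff (xs is centered)."""
    Z = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(Z, ys, rcond=None)
    return coef[0]

left = (x < cutoff) & (x > cutoff - h)
right = (x >= cutoff) & (x < cutoff + h)
tau = (local_linear_intercept(x[right] - cutoff, y[right])
       - local_linear_intercept(x[left] - cutoff, y[left]))
print("RD estimate of the jump:", round(tau, 3))
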
  14. By: Bia, Michela
    Abstract: Recently, in the field of causal inference, nonparametric techniques that use matching procedures based, for example, on the propensity score (Rosenbaum and Rubin, 1983) have received growing attention. In this paper we focus on propensity score methods, introduced by Rosenbaum and Rubin (1983). The key result underlying this methodology is that, given the ignorability assumption, treatment assignment and the potential outcomes are independent given the propensity score. Much of the work on propensity score analysis has focused on the case where the treatment is binary, but in many cases of interest the treatment takes on more than two values. In this article we examine an extension of the propensity score method in a setting with a continuous treatment.
    Date: 2007–04
    URL: http://d.repec.org/n?u=RePEc:uca:ucapdv:79&r=ecm
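    As background, the binary-treatment baseline that the paper extends can be sketched in a few lines, assuming scikit-learn: estimate the propensity score with a logistic regression and match each treated unit to its nearest control on the score. Simulated data, no balance diagnostics, and not the paper's continuous-treatment extension.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 3000
X = rng.normal(size=(n, 2))                  # observed covariates
p_true = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
d = rng.binomial(1, p_true)                  # treatment depends on covariates
y = 1.0 * d + X[:, 0] + rng.normal(size=n)   # true treatment effect of 1.0

ps = LogisticRegression().fit(X, d).predict_proba(X)[:, 1]   # estimated propensity score

treated = np.where(d == 1)[0]
controls = np.where(d == 0)[0]
# Nearest-neighbour matching on the propensity score (with replacement)
matches = controls[np.argmin(np.abs(ps[treated][:, None] - ps[controls][None, :]), axis=1)]
att = np.mean(y[treated] - y[matches])
print("matched ATT estimate:", round(att, 3))
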
  15. By: Ivan, Kitov
    Abstract: A linear and lagged relationship between inflation and the labor force growth rate has recently been found for the USA. It accurately describes the period after the late 1950s with a linear coefficient of 4.0, an intercept of -0.03, and a lag of 2 years. The previously reported agreement between observed and predicted inflation is substantially improved by some simple measures removing the most obvious errors in the labor force time series. The labor force readings originally obtained from the Bureau of Labor Statistics (BLS) website are corrected for step-like adjustments. Additionally, a half-year time shift between the inflation and the annual labor force readings is compensated for. The GDP deflator represents inflation. Linear regression analysis demonstrates that the annual labor force growth rate used as a predictor explains almost 82% (R2=0.82) of the inflation variation between 1965 and 2002. A moving average technique applied to the annual time series results in a substantial increase in R2: it grows from 0.87 for two-year windows to 0.96 for four-year windows. Regression of cumulative curves is characterized by R2>0.999. This allows effective replacement of the GDP deflator by a “labor force growth” index. The linear and lagged relationship provides a precise forecast at the two-year horizon, with a root mean square forecasting error (RMSFE) as low as 0.008 (0.8%) for the entire period between 1965 and 2002. For the last 20 years, the RMSFE is only 0.4%. Thus, the forecast methodology effectively outperforms any other forecasting technique reported in the economic and financial literature. Moreover, further significant improvements in forecasting accuracy are accessible through improvements in the labor force measurements in line with the US Census Bureau population estimates, which are currently neglected by the BLS.
    Keywords: inflation; labor force; forecast; the USA
    JEL: E61 E31 J21
    Date: 2006–07
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:2735&r=ecm
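    The lagged linear relationship stated above can be written as a simple forecasting rule, inflation(t) = 4.0 * labor force growth(t-2) - 0.03. The sketch below applies it to made-up labor force growth rates, not actual BLS data.

import numpy as np

# Hypothetical annual labor force growth rates for years t-2 (decimal fractions, not real data)
lf_growth = np.array([0.015, 0.018, 0.012, 0.020])

predicted_inflation = 4.0 * lf_growth - 0.03   # the linear, two-year-lagged rule from the abstract
print(np.round(predicted_inflation, 3))        # e.g. 1.5% growth two years earlier -> 3% inflation
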
  16. By: Mishra, SK
    Abstract: Correlation matrices have many applications, particularly in marketing and financial economics - such as in risk management, option pricing and forecasting demand for a group of products in order to realize savings by properly managing inventories, etc. Various methods have been proposed by different authors to solve the nearest correlation matrix problem, using majorization, hypersphere decomposition, semi-definite programming, or geometric programming, etc. In this paper we propose to obtain the nearest valid correlation matrix by the differential evolution method of global optimization. We may draw some conclusions from the exercise in this paper. First, the ‘nearest correlation matrix’ problem may be solved satisfactorily by evolutionary algorithms such as the differential evolution method; other methods, such as the Particle Swarm Optimizer, may also be used. Secondly, these methods are easily amenable to the choice of the norm to minimize: the absolute, Frobenius or Chebyshev norm may easily be used. Thirdly, the ‘complete the correlation matrix’ problem can be solved (in a limited sense) by these methods. Fourthly, one may easily opt for weighted or un-weighted norm minimization. Fifthly, minimization of the absolute norm to obtain nearest correlation matrices appears to give better results. In solving the nearest correlation matrix problem the resulting valid correlation matrices are often near-singular and thus on the borderline of losing positive semi-definiteness. One finds it difficult to round off their elements even at the 6th or 7th decimal place without running the risk of making the rounded-off matrix no longer positive semi-definite. Such matrices are, therefore, difficult to handle. It is possible to obtain more robust positive definite valid correlation matrices by constraining the determinant (the product of eigenvalues) of the resulting correlation matrix to take on a value significantly larger than zero, but this can be done only at the cost of a compromise on the criterion of ‘nearness’. The method proposed by us does this very well.
    Keywords: Correlation matrix; product moment; nearest; complete; positive semi-definite; majorization; hypersphere decomposition; semi-definite programming; geometric programming; Particle Swarm; Differential Evolution; Particle Swarm Optimization; Global Optimization; risk management; option pricing; financial economics; marketing; computer program; Fortran; norm; absolute; maximum; Frobenius; Chebyshev; Euclidean.
    JEL: C63 G00 C88 C61 G19
    Date: 2007–04–14
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:2760&r=ecm
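    A compact sketch of the approach proposed above, assuming SciPy is available: parameterize a valid correlation matrix through the hypersphere (Cholesky-angle) decomposition, so every candidate has a unit diagonal and is positive semi-definite, and let differential evolution minimize the Frobenius distance to a given invalid target matrix. The target matrix and settings are illustrative, not from the paper.

import numpy as np
from scipy.optimize import differential_evolution

def corr_from_angles(theta, n):
    """Hypersphere (Cholesky-angle) parameterization: any angle vector maps to a
    valid correlation matrix R = B B' with unit diagonal."""
    B = np.zeros((n, n))
    B[0, 0] = 1.0
    idx = 0
    for i in range(1, n):
        ang = theta[idx:idx + i]
        idx += i
        s = 1.0
        for j in range(i):
            B[i, j] = np.cos(ang[j]) * s
            s *= np.sin(ang[j])
        B[i, i] = s
    return B @ B.T

def nearest_corr_de(G, seed=0):
    """Nearest (Frobenius-norm) valid correlation matrix to G via differential evolution."""
    n = G.shape[0]
    k = n * (n - 1) // 2
    obj = lambda th: np.linalg.norm(corr_from_angles(th, n) - G, "fro")
    res = differential_evolution(obj, bounds=[(0.0, np.pi)] * k, seed=seed)
    return corr_from_angles(res.x, n), res.fun

# Example: an invalid "correlation" matrix (not positive semi-definite)
G = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, -0.8],
              [0.2, -0.8, 1.0]])
R, dist = nearest_corr_de(G)
print(np.round(R, 4), round(dist, 4))
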
  17. By: Evangelos M. Falaris (Department of Economics,University of Delaware)
    Abstract: Misclassification of the dependent variable in binary choice models can result in inconsistency of the parameter estimates. I estimate probit models that treat misclassification probabilities as estimable parameters for three labor market outcomes: formal sector employment, pension contribution and job change. I use Living Standards Measurement Study data from Nicaragua, Peru, Brazil, Guatemala, and Panama. I find that there is significant misclassification in eleven of the sixteen cases that I investigate. If misclassification is present, but is ignored, estimates of the probit parameters and their standard errors are biased toward zero. In most cases, predicted probabilities of the outcomes are significantly affected by misclassification of the dependent variable. Even a moderate degree of misclassification can have substantial effects on the estimated parameters and on many of the predictions.
    Keywords: Data Quality; Misclassification; Formal Sector; Pension Contributor; Job Change; Nicaragua; Peru; Brazil; Guatemala; Panama
    JEL: C81 C25 O17 J26 J62
    URL: http://d.repec.org/n?u=RePEc:dlw:wpaper:07-05.&r=ecm
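    The kind of model estimated in this paper augments the probit likelihood with misclassification probabilities: P(y=1|x) = a0 + (1 - a0 - a1)*Phi(x'b), where a0 and a1 are the probabilities of misreporting a true 0 as 1 and a true 1 as 0. Below is a self-contained maximum likelihood sketch on simulated data; the reparameterization keeping a0 + a1 < 1 is one convenient choice, not necessarily the paper's.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import norm

rng = np.random.default_rng(9)
n = 4000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([-0.3, 0.8, -0.5])
a0_true, a1_true = 0.05, 0.10                       # true misclassification probabilities

y_star = rng.binomial(1, norm.cdf(X @ beta_true))   # true outcome
flip = np.where(y_star == 1, rng.random(n) < a1_true, rng.random(n) < a0_true)
y = np.where(flip, 1 - y_star, y_star)              # observed, possibly misclassified, outcome

def negloglik(par):
    b, t0, t1 = par[:3], par[3], par[4]
    a0, a1 = 0.5 * expit(t0), 0.5 * expit(t1)       # keeps each below 0.5, so a0 + a1 < 1
    p = a0 + (1 - a0 - a1) * norm.cdf(X @ b)
    p = np.clip(p, 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(negloglik, x0=np.zeros(5), method="BFGS")
b_hat = res.x[:3]
a0_hat, a1_hat = 0.5 * expit(res.x[3]), 0.5 * expit(res.x[4])
print(np.round(b_hat, 3), round(a0_hat, 3), round(a1_hat, 3))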

This nep-ecm issue is ©2007 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.