
on Econometrics 
By:  Judith A. Clarke (Department of Economics, University of Victoria) 
Abstract:  We consider estimating the linear regression model’s coefficients when there is uncertainty about coefficient restrictions. Theorems establish that the mean squared errors of combination estimators, formed as weighted averages of the ordinary least squares and one or more restricted least squares estimators, depend on finding the optimal estimator of a single normally distributed vector. Our results generalize those of Magnus and Durbin (1999) [Magnus, J.R., Durbin, J. 1999. Estimation of regression coefficients of interest when other regression coefficients are of no interest. Econometrica 67, 639–643] and Danilov and Magnus (2004) [Danilov, D., Magnus, J.R. 2004. On the harm that ignoring pretesting can cause. Journal of Econometrics 122, 27–46]. 
Keywords:  Logit, Mean squared error, weighted estimator, linear restrictions 
JEL:  C12 C13 C20 C52 
Date:  2007–04–09 
URL:  http://d.repec.org/n?u=RePEc:vic:vicewp:0701&r=ecm 
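The combination estimators studied in the abstract above are weighted averages of OLS and restricted least squares under linear restrictions R @ beta = r. A minimal numerical sketch (the weight lam is a free illustrative parameter here, not the paper's optimal weight, and the function names are ours):

```python
import numpy as np

def restricted_ls(X, y, R, r):
    """OLS and restricted LS under the linear restrictions R @ beta = r."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b_ols = XtX_inv @ X.T @ y
    # Project the OLS estimator onto the restriction set R @ beta = r.
    A = R @ XtX_inv @ R.T
    b_rls = b_ols - XtX_inv @ R.T @ np.linalg.solve(A, R @ b_ols - r)
    return b_ols, b_rls

def combination_estimator(X, y, R, r, lam):
    """Weighted average of OLS and restricted LS; lam is the OLS weight."""
    b_ols, b_rls = restricted_ls(X, y, R, r)
    return lam * b_ols + (1.0 - lam) * b_rls
```

Setting lam = 1 recovers OLS and lam = 0 the restricted estimator; the paper's contribution concerns how to choose the weights, which this sketch leaves open.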
By:  Eklund, Jana (Bank of England); Karlsson, Sune (Department of Business, Economics, Statistics and Informatics) 
Abstract:  The increasing availability of data and potential predictor variables poses new challenges to forecasters. The task of formulating a single forecasting model that can extract all the relevant information is becoming increasingly difficult in the face of this abundance of data. The two leading approaches to addressing this "embarrassment of riches" are philosophically distinct. One approach builds forecast models based on summaries of the predictor variables, such as principal components, and the second approach is analogous to forecast combination, where the forecasts from a multitude of possible models are averaged. Using several data sets we compare the performance of the two approaches in the guise of the diffusion index or factor models popularized by Stock and Watson and forecast combination as an application of Bayesian model averaging. We find that none of the methods is uniformly superior and that no method performs better than, or is outperformed by, a simple AR(p) process. 
Keywords:  Bayesian model averaging; Diffusion indexes; GDP growth rate; Inflation rate 
JEL:  C11 C51 C52 C53 
Date:  2007–03–31 
URL:  http://d.repec.org/n?u=RePEc:hhs:oruesi:2007_001&r=ecm 
By:  Tobias J. Klein (University of Mannheim and IZA) 
Abstract:  A fundamental identification problem in program evaluation arises when idiosyncratic gains from participation and the treatment decision depend on each other. Imbens and Angrist (1994) were the first to exploit a monotonicity condition in order to identify an average treatment effect parameter using instrumental variables. More recently, Heckman and Vytlacil (1999) suggested estimation of a variety of treatment effect parameters using a local version of their approach. However, identification hinges on the same monotonicity assumption that is fundamentally untestable. We investigate the sensitivity of respective estimates to reasonable departures from monotonicity that are likely to be encountered in practice and relate it to properties of a structural parameter. One of our results is that the bias vanishes under a testable linearity condition. Our findings are illustrated in a Monte Carlo analysis. 
Keywords:  program evaluation, heterogeneity, dummy endogenous variable, selection on unobservables, instrumental variables, monotonicity, identification 
JEL:  C21 
Date:  2007–04 
URL:  http://d.repec.org/n?u=RePEc:iza:izadps:dp2738&r=ecm 
By:  Viviana Fernandez 
Abstract:  Copula modeling has become an increasingly popular tool in finance to model assets returns dependency. In essence, copulas enable us to extract the dependence structure from the joint distribution function of a set of random variables and, at the same time, to separate the dependence structure from the univariate marginal behavior. In this study, based on U.S. stock data, we illustrate how tail-dependency tests may be misleading as a tool to select a copula that closely mimics the dependency structure of the data. This problem becomes more severe when the data is scaled by conditional volatility and/or filtered out for serial correlation. The discussion is complemented, under more general settings, with Monte Carlo simulations. 
Date:  2006 
URL:  http://d.repec.org/n?u=RePEc:edj:ceauch:228&r=ecm 
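As background to the tail-dependency tests discussed above, the quantity such tests target can be estimated empirically from pseudo-observations. A minimal sketch (the threshold and function name are ours):

```python
import numpy as np

def empirical_upper_tail_dependence(u, v, q=0.95):
    """Empirical upper-tail dependence at quantile threshold q:
    an estimate of P(U > q | V > q) from pseudo-observations in (0, 1).
    The limit as q -> 1 is the upper tail-dependence coefficient that
    copula-selection tests of this kind target."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    denom = np.mean(v > q)
    if denom == 0.0:
        return 0.0
    return np.mean((u > q) & (v > q)) / denom
```

Comonotone data give a value near 1 and countermonotone data a value near 0; in practice the estimate is noisy for q close to 1, which is one source of the fragility the abstract documents.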
By:  Viviana Fernández 
Abstract:  In this article, we forecast crude oil and natural gas spot prices at a daily frequency based on two classification techniques: artificial neural networks (ANN) and support vector machines (SVM). As a benchmark, we utilize an autoregressive integrated moving average (ARIMA) specification. We evaluate out-of-sample forecasts based on encompassing tests and mean-squared prediction error (MSPE). We find that at short time horizons (e.g., 2–4 days), ARIMA tends to outperform both ANN and SVM. However, at longer time horizons (e.g., 10–20 days), we find that in general ARIMA is encompassed by these two methods, and linear combinations of ANN and SVM forecasts are more accurate than the corresponding individual forecasts. Based on MSPE calculations, we reach similar conclusions: the two classification methods under consideration outperform ARIMA at longer time horizons. 
Date:  2006 
URL:  http://d.repec.org/n?u=RePEc:edj:ceauch:229&r=ecm 
By:  Katarzyna Bien (University of Konstanz); Ingmar Nolte (University of Konstanz); Winfried Pohlmeier (University of Konstanz) 
Abstract:  In this paper we propose a model for the conditional multivariate density of integer count variables defined on the set Z^n. Applying the concept of copula functions, we allow for a general form of dependence between the marginal processes which is able to pick up the complex nonlinear dynamics of multivariate financial time series at high frequencies. We use the model to estimate the conditional bivariate density of the high frequency changes of the EUR/GBP and the EUR/USD exchange rates. 
Keywords:  Integer Count Hurdle, Copula Functions, Discrete Multivariate, Distributions, Foreign Exchange Market 
JEL:  G10 F30 C30 
Date:  2006–11–14 
URL:  http://d.repec.org/n?u=RePEc:knz:cofedp:0606&r=ecm 
By:  Santiago Pellegrini; Esther Ruiz; Antoni Espasa 
Abstract:  The objective of this paper is to analyze the consequences of fitting ARIMA-GARCH models to series generated by conditionally heteroscedastic unobserved component models. Focusing on the local level model, we show that the heteroscedasticity is weaker in the ARIMA than in the local level disturbances. In certain cases, the IMA(1,1) model could even be wrongly seen as homoscedastic. Next, with regard to forecasting performance, we show that the prediction intervals based on the ARIMA model can be inappropriate as they incorporate the unit root while the intervals of the local level model can converge to the homoscedastic intervals when the heteroscedasticity appears only in the transitory noise. All the analytical results are illustrated with simulated and real time series. 
Date:  2007–04 
URL:  http://d.repec.org/n?u=RePEc:cte:wsrepe:ws072706&r=ecm 
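The reduction the abstract builds on is standard: in the homoscedastic case, a local level model with signal-to-noise ratio q = var(level noise)/var(transitory noise) has an IMA(1,1) reduced form. A small sketch of the implied MA coefficient (the function name is ours):

```python
import math

def ima_theta(q):
    """MA(1) coefficient of the IMA(1,1) reduced form of a homoscedastic
    local level model, where q = var(level noise) / var(transitory noise).
    The differenced series has lag-1 autocorrelation -1/(q + 2); matching
    theta / (1 + theta**2) to it and taking the invertible root gives
    the expression below. q = 0 (no level shocks) yields theta = -1."""
    return (math.sqrt(q * q + 4.0 * q) - q - 2.0) / 2.0
```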
By:  Cuesta, Rafael A. (Departamento de Economía, Universidad de Oviedo, E33071, Oviedo, Spain.); Knox Lovell, C.A. (Department of Economics, Terry College of Business, University of Georgia, Athens, GA 30602, USA); Zofío, José Luis (Departamento de Análisis Económico (Teoría e Historia Económica). Universidad Autónoma de Madrid.) 
Abstract:  We use a flexible parametric hyperbolic distance function to estimate environmental efficiency when some outputs are undesirable. Cuesta and Zofio (J. Prod. Analysis (2005), 31–48) introduced this distance function specification in conventional input–output space to estimate technical efficiency within a stochastic frontier context. We extend their approach to accommodate undesirable outputs and to estimate environmental efficiency within a stochastic frontier context. This provides a parametric counterpart to Färe et al.’s popular nonparametric environmental efficiency measures (Rev. Econ. Stat. 75 (1989), 90–98). The distance function model is applied to a panel of U.S. electricity generating units that produce marketed electricity and non-marketed SO2 emissions. 
Keywords:  Undesirable outputs; parametric distance functions; stochastic frontier analysis; environmental efficiency 
JEL:  C32 L95 
Date:  2007–03 
URL:  http://d.repec.org/n?u=RePEc:uam:wpaper:200702&r=ecm 
By:  Giuseppe Arbia; Giuseppe Espa; Danny Quah 
Abstract:  In this paper we aim at identifying stylized facts in order to suggest adequate models of spatial co–agglomeration of industries. We describe a class of spatial statistical methods to be used in the empirical analysis of spatial clusters. Compared to previous contributions using point pattern methods, the main innovation of the present paper is to consider clustering for bivariate (rather than univariate) distributions, which allows uncovering co–agglomeration and repulsion phenomena between different industrial sectors. Furthermore, we present the results of an empirical application of such methods to a set of European Patent Office (EPO) data and present empirical evidence on the pair–wise intra–sectoral spatial distribution of patents in Italy in the nineties. In this analysis we are able to identify some distinctive joint patterns of location between patents of different sectors and to propose some possible economic interpretations. 
Keywords:  Agglomeration, Bivariate K–functions, co–agglomeration, Non parametric concentration measures, Spatial clusters, Spatial econometrics 
JEL:  C21 D92 L60 O18 R12 
Date:  2007 
URL:  http://d.repec.org/n?u=RePEc:trn:utwpde:0705&r=ecm 
By:  Maarten L. Buis (Vrije Universiteit Amsterdam) 
Abstract:  Multiple imputation is a popular way of dealing with missing values under the missing at random (MAR) assumption. Imputation models can become quite complicated, for instance, when the model of substantive interest contains many interactions or when the data originate from a nested design. This paper will discuss two methods to assess how plausible the results are. The first method consists of comparing the point estimates obtained by multiple imputation with point estimates obtained by another method for controlling for bias due to missing data. Second, the changes in standard error between the model that ignores the missing cases and the multiple imputation model are decomposed into three components: changes due to changes in sample size, changes due to uncertainty in the imputation model used in multiple imputation, and changes due to changes in the estimates that underlie the standard error. This decomposition helps in assessing the reasonableness of the change in standard error. These two methods will be illustrated with two new user-written Stata commands. 
Date:  2007–04–11 
URL:  http://d.repec.org/n?u=RePEc:boc:dsug07:02&r=ecm 
By:  J.G. Hirschberg; J. N. Lye 
Abstract:  The Fieller Method for the construction of confidence intervals for ratios of the expected values of two normally distributed random variables has been shown by a number of authors to be superior to the delta approximation. However, it is not widely used, due in part to the tendency to present the intervals only in a formula context. In addition, potential users have been deterred by the potential difficulty in interpreting non-finite confidence intervals when the confidence level is less than 100%. In this paper we present two graphical methods, which can be easily constructed using two widely used statistical software packages (EViews and Stata), for the representation of the Fieller intervals. An application is presented to assess the results of a model of the non-accelerating inflation rate of unemployment (NAIRU). 
Keywords:  Fieller method, ratios of parameters, confidence interval, confidence ellipsoid, 1st derivative function, NAIRU, EViews, STATA 
JEL:  C12 C20 E24 
Date:  2007 
URL:  http://d.repec.org/n?u=RePEc:mlb:wpaper:992&r=ecm 
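For reference, the Fieller interval comes from inverting a quadratic inequality in the ratio; a minimal sketch (variable names are ours), which also shows where the non-finite cases mentioned in the abstract arise:

```python
import math

def fieller_interval(a, b, v11, v22, v12, z=1.96):
    """Fieller confidence interval for the ratio a/b, where a and b are
    estimates with variances v11, v22 and covariance v12, and z is the
    normal critical value. Returns (lo, hi) when the interval is a
    bounded set; None when it is non-finite (b not significantly
    different from zero at this confidence level)."""
    # Roots of (b^2 - z^2 v22) t^2 - 2 (a b - z^2 v12) t + (a^2 - z^2 v11) = 0
    A = b * b - z * z * v22
    B = -2.0 * (a * b - z * z * v12)
    C = a * a - z * z * v11
    disc = B * B - 4.0 * A * C
    if A <= 0.0 or disc < 0.0:
        return None  # interval is the whole line or a complement of an interval
    root = math.sqrt(disc)
    return ((-B - root) / (2.0 * A), (-B + root) / (2.0 * A))
```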
By:  Colignatus, Thomas 
Abstract:  Nominal data currently lack a correlation coefficient, such as has already been defined for real data. A measure is possible using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table and n ≤ m, the suggested measure is r = Sqrt[det[A'A]] with A = Normalized[M]. With M an n1 × n2 × ... × nk contingency matrix, we can construct a matrix of pairwise correlations R so that the overall correlation is f[R]. An option is to use f[R] = Sqrt[1 - det[R]]. However, for both nominal and cardinal data the advisable choice for such a function f is to take the maximal multiple correlation within R. 
Keywords:  association; correlation; contingency table; volume ratio; determinant; nonparametric methods; nominal data; nominal scale; categorical data; Fisher’s exact test; odds ratio; tetrachoric correlation coefficient; phi; Cramer’s V; Pearson; contingency coefficient; uncertainty coefficient; Theil’s U; eta; meta-analysis; Simpson’s paradox; causality; statistical independence 
JEL:  C10 
Date:  2007–03–20 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:2662&r=ecm 
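The two-way measure in the abstract is direct to compute; a sketch (our assumption: Normalized[M] scales each column of M to unit Euclidean length):

```python
import math
import numpy as np

def nominal_correlation(M):
    """Volume-ratio correlation for an m x n contingency table (n <= m):
    r = Sqrt[det[A'A]] with A = Normalized[M], here read as scaling each
    column of M to unit Euclidean length. Under independence the columns
    of M are proportional, so det[A'A] = 0 and r = 0; orthogonal columns
    (perfect association) give r = 1."""
    M = np.asarray(M, dtype=float)
    A = M / np.linalg.norm(M, axis=0)
    return math.sqrt(max(np.linalg.det(A.T @ A), 0.0))
```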
By:  Guido Imbens; Thomas Lemieux 
Abstract:  In Regression Discontinuity (RD) designs for evaluating causal effects of interventions, assignment to a treatment is determined at least partly by the value of an observed covariate lying on either side of a fixed threshold. These designs were first introduced in the evaluation literature by Thistlethwaite and Campbell (1960). With the exception of a few unpublished theoretical papers, these methods did not attract much attention in the economics literature until recently. Starting in the late 1990s, there have been a large number of studies in economics applying and extending RD methods. In this paper we review some of the practical and theoretical issues involved in the implementation of RD methods. 
JEL:  C14 C21 
Date:  2007–04 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:13039&r=ecm 
By:  Bia, Michela 
Abstract:  Recently, in the field of causal inference, nonparametric techniques that use matching procedures based, for example, on the propensity score (Rosenbaum, Rubin, 1983) have received growing attention. In this paper we focus on propensity score methods, introduced by Rosenbaum and Rubin (1983). The key result underlying this methodology is that, given the ignorability assumption, treatment assignment and the potential outcomes are independent given the propensity score. Much of the work on propensity score analysis has focused on the case where the treatment is binary, but in many cases of interest the treatment takes on more than two values. In this article we examine an extension of the propensity score method to a setting with a continuous treatment. 
Date:  2007–04 
URL:  http://d.repec.org/n?u=RePEc:uca:ucapdv:79&r=ecm 
By:  Ivan, Kitov 
Abstract:  A linear and lagged relationship between inflation and the labor force growth rate has recently been found for the USA. It accurately describes the period after the late 1950s with linear coefficient 4.0, intercept 0.03, and a lag of 2 years. The previously reported agreement between observed and predicted inflation is substantially improved by some simple measures removing the most obvious errors in the labor force time series. The labor force readings originally obtained from the Bureau of Labor Statistics (BLS) website are corrected for step-like adjustments. Additionally, a half-year time shift between the inflation and the annual labor force readings is compensated for. The GDP deflator represents inflation. Linear regression analysis demonstrates that the annual labor force growth rate used as a predictor explains almost 82% (R2=0.82) of the inflation variation between 1965 and 2002. A moving average technique applied to the annual time series results in a substantial increase in R2: it grows from 0.87 for two-year windows to 0.96 for four-year windows. Regression of cumulative curves is characterized by R2>0.999. This allows effective replacement of the GDP deflation index by a “labor force growth” index. The linear and lagged relationship provides a precise forecast at the two-year horizon with a root mean square forecasting error (RMSFE) as low as 0.008 (0.8%) for the entire period between 1965 and 2002. For the last 20 years, RMSFE is only 0.4%. Thus, the forecast methodology effectively outperforms any other forecasting technique reported in the economic and financial literature. Moreover, further significant improvements in forecasting accuracy are accessible through improvements in the labor force measurements in line with the US Census Bureau population estimates, which are neglected by BLS. 
Keywords:  inflation; labor force; forecast; the USA 
JEL:  E61 E31 J21 
Date:  2006–07 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:2735&r=ecm 
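The relation stated in the abstract (coefficient 4.0, intercept 0.03, lag 2 years) can be written out directly; a toy sketch (the year-indexing convention is ours):

```python
def predicted_inflation(labor_force, coef=4.0, intercept=0.03, lag=2):
    """Inflation forecasts from the abstract's linear lagged relation:
    pi(t) = coef * g(t - lag) + intercept, where g is the labor force
    growth rate. Input: annual labor force levels for years 1, 2, ...
    Output: (target year, forecast) pairs, each available lag years
    ahead of the target year."""
    growth = [(labor_force[i] - labor_force[i - 1]) / labor_force[i - 1]
              for i in range(1, len(labor_force))]
    return [(i + lag, coef * g + intercept)
            for i, g in enumerate(growth, start=1)]
```

For example, 2% labor force growth implies a predicted inflation of 4.0 * 0.02 + 0.03 = 11% two years later under this relation.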
By:  Mishra, SK 
Abstract:  Correlation matrices have many applications, particularly in marketing and financial economics, such as in risk management, option pricing and forecasting demand for a group of products in order to realize savings by properly managing inventories. Various methods have been proposed by different authors to solve the nearest correlation matrix problem by majorization, hypersphere decomposition, semidefinite programming, or geometric programming. In this paper we propose to obtain the nearest valid correlation matrix by the differential evolution method of global optimization. We may draw some conclusions from the exercise in this paper. First, the ‘nearest correlation matrix’ problem may be solved satisfactorily by evolutionary algorithms like the differential evolution method or the Particle Swarm Optimizer. Secondly, these methods are easily amenable to the choice of norm to minimize: the absolute, Frobenius or Chebyshev norm may easily be used. Thirdly, the ‘complete the correlation matrix’ problem can be solved (in a limited sense) by these methods. Fourthly, one may easily opt for weighted or unweighted norm minimization. Fifthly, minimization of the absolute norm to obtain nearest correlation matrices appears to give better results. In solving the nearest correlation matrix problem the resulting valid correlation matrices are often near-singular and thus on the borderline of semi-negativity. One finds difficulty in rounding off their elements even at the 6th or 7th place after the decimal without running the risk of making the rounded-off matrix negative definite. Such matrices are, therefore, difficult to handle. It is possible to obtain more robust positive definite valid correlation matrices by constraining the determinant (the product of eigenvalues) of the resulting correlation matrix to take on a value significantly larger than zero, but this can be done only at the cost of a compromise on the criterion of ‘nearness.’ The method proposed by us does this very well. 
Keywords:  Correlation matrix; product moment; nearest; complete; positive semidefinite; majorization; hypersphere decomposition; semidefinite programming; geometric programming; Particle Swarm; Differential Evolution; Particle Swarm Optimization; Global Optimization; risk management; option pricing; financial economics; marketing; computer program; Fortran; norm; absolute; maximum; Frobenius; Chebyshev; Euclidean. 
JEL:  C63 G00 C88 C61 G19 
Date:  2007–04–14 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:2760&r=ecm 
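The approach in the abstract can be sketched in a few lines for the 3 × 3 case, using SciPy's differential evolution with the off-diagonal entries as decision variables (the eigenvalue penalty for positive semidefiniteness is our simplification, not the paper's exact setup):

```python
import numpy as np
from scipy.optimize import differential_evolution

def nearest_correlation_de(G, seed=0):
    """Nearest valid 3x3 correlation matrix to a symmetric 'pseudo'
    correlation matrix G in the Frobenius norm, found by differential
    evolution over the three off-diagonal entries. Positive
    semidefiniteness is imposed via a penalty on negative eigenvalues."""
    iu = np.triu_indices(3, k=1)

    def build(x):
        C = np.eye(3)
        C[iu] = x
        C[(iu[1], iu[0])] = x
        return C

    def objective(x):
        C = build(x)
        neg = np.minimum(np.linalg.eigvalsh(C), 0.0)
        return np.linalg.norm(C - G, "fro") + 1e4 * np.sum(neg ** 2)

    res = differential_evolution(objective, [(-1.0, 1.0)] * 3, seed=seed)
    return build(res.x)
```

Consistent with the abstract's observation, the optimum typically sits on the boundary of the positive semidefinite cone, so the returned matrix is near-singular.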
By:  Evangelos M. Falaris (Department of Economics,University of Delaware) 
Abstract:  Misclassification of the dependent variable in binary choice models can result in inconsistency of the parameter estimates. I estimate probit models that treat misclassification probabilities as estimable parameters for three labor market outcomes: formal sector employment, pension contribution and job change. I use Living Standards Measurement Study data from Nicaragua, Peru, Brazil, Guatemala, and Panama. I find that there is significant misclassification in eleven of the sixteen cases that I investigate. If misclassification is present, but is ignored, estimates of the probit parameters and their standard errors are biased toward zero. In most cases, predicted probabilities of the outcomes are significantly affected by misclassification of the dependent variable. Even a moderate degree of misclassification can have substantial effects on the estimated parameters and on many of the predictions. 
Keywords:  Data Quality; Misclassification; Formal Sector; Pension Contributor; Job Change; Nicaragua; Peru; Brazil; Guatemala; Panama 
JEL:  C81 C25 O17 J26 J62 
URL:  http://d.repec.org/n?u=RePEc:dlw:wpaper:0705.&r=ecm 
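The estimable-misclassification probit described in the abstract above has a standard likelihood in this literature; a hedged sketch (the exact specification in the paper may differ):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def misclassification_probit_nll(params, X, y):
    """Negative log-likelihood of a probit with a misclassified binary
    outcome: P(observed y = 1 | x) = a0 + (1 - a0 - a1) * Phi(x @ beta),
    where a0 = P(record 1 | true 0) and a1 = P(record 0 | true 1).
    params = (a0, a1, beta...); setting a0 = a1 = 0 recovers the
    ordinary probit."""
    a0, a1, beta = params[0], params[1], np.asarray(params[2:])
    p = a0 + (1.0 - a0 - a1) * norm.cdf(X @ beta)
    p = np.clip(p, 1e-10, 1.0 - 1e-10)  # guard the logs
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
```

Minimizing this with, e.g., scipy.optimize.minimize under bounds that keep a0 + a1 well below 1 yields the joint estimates of the misclassification probabilities and the probit coefficients.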