nep-ecm New Economics Papers
on Econometrics
Issue of 2017‒06‒04
nineteen papers chosen by
Sune Karlsson
Örebro universitet

  1. System Priors for Econometric Time Series By Michal Andrle; Miroslav Plasil
  3. Pseudolikelihood estimation of the stochastic frontier model By Andor, Mark; Parmeter, Christopher
  4. Chow-Lin x N: How adding a panel dimension can improve accuracy By Bettendorf, Timo; Bursian, Dirk
  5. Sharp convergence rates for forward regression in high-dimensional sparse linear models By Damian Kozbur
  6. Nonparametric Regressions with Thresholds: Identification and Estimations By Yan-Yu Chiou; Mei-Yuan Chen; Jau-er Chen
  7. Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators By Lillo Rodríguez, Rosa Elvira; Laniado Rodas, Henry; Cabana Garceran del Vall, Elisa
  8. Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case By Arthur Lewbel
  9. Conditional Independence test for categorical data using Poisson log-linear model By Tsagris, Michail
  10. A menu on output gap estimation methods By Luis J. Álvarez; Ana Gómez-Loscos
  11. A general framework for prediction in penalized regression By Lee, Dae-Jin; Durbán Reguera, María Luz; Carballo González, Alba
  12. Identification, data combination and the risk of disclosure By Tatiana Komarova; Denis Nekipelov; Evgeny Yakovlev
  13. Kernel depth functions for functional data By Muñoz García, Alberto; Hernández Banadik, Nicolás Jorge
  14. Testing for Volatility Co-movement in Bivariate Stochastic Volatility Models By Chen, J.; Kobayashi, M.; McAleer, M.J.
  15. Estimation of a Dynamic Multilevel Factor Model with possible long-range dependence By Rodríguez Caballero, Carlos Vladimir; Ergemen, Yunus Emre
  16. Endogenous Environmental Variables In Stochastic Frontier Models By Amsler, Christine; Prokhorov, Artem; Schmidt, Peter
  17. Why You Should Never Use the Hodrick-Prescott Filter By James D. Hamilton
  18. Financial Time Series Forecasting: Semantic Analysis Of Economic News By Kateryna Kononova; Anton Dek
  19. Prediction Bands for Functional Data Based on Depth Measures By Elías Fernández, Antonio; Jiménez Recaredo, Raúl José

  1. By: Michal Andrle; Miroslav Plasil
    Abstract: This paper introduces "system priors" into Bayesian analysis of econometric time series and provides a simple and illustrative application. Unlike priors on individual parameters, system priors offer a simple and efficient way of formulating well-defined and economically meaningful priors about model properties that determine the overall behavior of the model. The generality of system priors is illustrated using an AR(2) process with a prior that its dynamics come mostly from business-cycle frequencies.
    Keywords: Bayesian analysis, system priors, time series
    JEL: C11 C18 C22 C51
    Date: 2017–05
  2. By: Davide De Gaetano
    Abstract: This paper proposes some weighting schemes to average forecasts across different estimation windows in order to account for structural changes in the unconditional variance of a GARCH(1,1) model. Each combination is obtained by averaging forecasts generated by recursively increasing an initial estimation window of a fixed number of observations v. Three choices of combination weights are proposed. In the first scheme, the forecast combination is obtained by averaging the individual forecasts with equal weights; the second weighting method assigns heavier weights to forecasts that use more recent information; the third is a trimmed version of the equal-weights combination, in which a fixed fraction of the worst-performing forecasts is discarded. Simulation results show that forecast combinations with high values of v perform better than alternative schemes proposed in the literature. An application to real data confirms the simulation results.
    Keywords: Forecast combinations, Structural breaks, GARCH models.
    JEL: C53 C58 G17
    Date: 2017–05
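The three weighting schemes described in the abstract can be sketched in a few lines. This is an illustrative sketch, not the paper's code: `forecasts` is assumed to hold one-step-ahead variance forecasts from recursively expanded estimation windows (oldest window first), and `losses` each forecast's past loss; all names and the trimming fraction are assumptions.

```python
def equal_weights(forecasts):
    """Scheme 1: simple average of the individual forecasts."""
    return sum(forecasts) / len(forecasts)

def recency_weights(forecasts):
    """Scheme 2: heavier weights on forecasts that use more recent
    information; here weights are proportional to the window index."""
    m = len(forecasts)
    total = m * (m + 1) / 2
    return sum((i + 1) / total * f for i, f in enumerate(forecasts))

def trimmed_equal_weights(forecasts, losses, trim_frac=0.25):
    """Scheme 3: discard the fraction of forecasts with the worst past
    performance (largest loss), then average the rest with equal weights."""
    n_keep = max(1, int(len(forecasts) * (1 - trim_frac)))
    kept = [f for _, f in sorted(zip(losses, forecasts))[:n_keep]]
    return sum(kept) / len(kept)
```

For example, with four forecasts the recency scheme weights them 1/10, 2/10, 3/10, 4/10, so the most recent window dominates.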
  3. By: Andor, Mark; Parmeter, Christopher
    Abstract: Stochastic frontier analysis is a popular tool for assessing firm performance. Almost universally, it has been applied using maximum likelihood estimation. An alternative approach, pseudolikelihood estimation, which decouples estimation of the error component structure from estimation of the production frontier, has been adopted in both the nonparametric and panel data settings. To date, no formal comparison of these methods has been conducted in a standard, parametric, cross-sectional framework. We compare the two competing methods using Monte Carlo simulations. Our results indicate that pseudolikelihood estimation enjoys almost identical performance to maximum likelihood estimation across a range of scenarios and performance metrics, and for certain metrics outperforms maximum likelihood estimation when the distribution of inefficiency is incorrectly specified.
    Keywords: stochastic frontier analysis, maximum likelihood, production function, Monte Carlo simulation
    JEL: C1 C5 D2
    Date: 2017
  4. By: Bettendorf, Timo; Bursian, Dirk
    Abstract: Single equation models are well established among academics and practitioners for performing temporal disaggregation of low-frequency time series using available related series. In this paper, we propose an extension that exploits information from the cross-sectional dimension. More specifically, we suggest jointly estimating multiple Chow and Lin (1971) equations, one for each cross-sectional unit (e.g. country), restricting the coefficients to be the same across units in order to interpolate unit-specific data. Using actual data on real GDP and industrial production for euro area countries, we provide evidence that this approach can result in more accurate interpolated time series for individual countries. The results suggest that the inclusion of time fixed effects, which is not feasible in standard single equation models, can help increase the accuracy of the resulting series.
    Keywords: temporal disaggregation, interpolation, panel data
    JEL: C23 C53
    Date: 2017
  5. By: Damian Kozbur
    Abstract: Forward regression is a statistical model selection and estimation procedure which inductively selects covariates that add predictive power into a working statistical regression model. Once a model is selected, unknown regression parameters are estimated by least squares. This paper analyzes forward regression in high-dimensional sparse linear models. Probabilistic bounds for prediction error norm and number of selected covariates are proved. The analysis in this paper gives sharp rates and does not require β-min or irrepresentability conditions.
    Keywords: Forward regression, high-dimensional models, sparsity, model selection
    Date: 2017–05
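The procedure the abstract describes — inductively add the covariate that most improves the fit, then estimate the selected model by least squares — can be sketched as a greedy search over residual sums of squares. This is an illustrative sketch: the fixed step count `k` stands in for the paper's stopping analysis, and the function name is an assumption.

```python
import numpy as np

def forward_regression(X, y, k):
    """Greedy forward selection: at each step add the covariate that most
    reduces the residual sum of squares; refit the final model by OLS."""
    n, p = X.shape
    selected = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = float(np.sum((y - X[:, cols] @ beta) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    # final least-squares fit on the selected covariates
    beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    return selected, beta
```

With a single strong signal covariate, one forward step recovers it and the OLS refit recovers its coefficient.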
  6. By: Yan-Yu Chiou; Mei-Yuan Chen; Jau-er Chen
    Abstract: This paper examines nonparametric regressions with an exogenous threshold variable, allowing for an unknown number of thresholds. Given the number of thresholds and corresponding threshold values, we first establish the asymptotic properties of the local-constant estimator for a nonparametric regression with multiple thresholds. We then determine the unknown number of thresholds and derive the limiting distribution of the proposed test. The Monte Carlo simulation results indicate the adequacy of the modified test and accuracy of the sequential estimation of the threshold values. We apply our testing procedure to an empirical study of the 401(k) retirement savings plan with income thresholds.
    Date: 2017–05
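A minimal sketch of the local-constant (Nadaraya-Watson) estimator underlying the paper, restricted to observations in the same regime of the threshold variable as the evaluation point. The Gaussian kernel, the bandwidth `h`, and treating the threshold values as known are illustrative assumptions; the paper estimates the thresholds and their number.

```python
import numpy as np

def local_constant(x0, x, y, h):
    """Nadaraya-Watson (local-constant) estimate of E[y | x = x0]
    with a Gaussian kernel and bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return float(np.sum(w * y) / np.sum(w))

def threshold_nw(x0, q0, x, y, q, thresholds, h):
    """Local-constant estimate at x0 using only observations whose
    threshold variable q lies in the same regime as q0."""
    r = sum(q0 > c for c in thresholds)
    mask = np.array([sum(qi > c for c in thresholds) == r for qi in q])
    return local_constant(x0, x[mask], y[mask], h)
```

Splitting the sample at the (estimated) thresholds and smoothing within each regime is what makes the regression function discontinuous at the threshold values.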
  7. By: Lillo Rodríguez, Rosa Elvira; Laniado Rodas, Henry; Cabana Garceran del Vall, Elisa
    Abstract: A collection of methods for multivariate outlier detection based on a robust Mahalanobis distance is proposed. The procedure consists of different combinations of robust estimates of location and of the covariance matrix based on shrinkage. The performance of our proposal is illustrated in a simulation study through comparison with other techniques from the literature. The high correct classification rates and low false classification rates in the vast majority of cases, together with good computational times, show the merits of our proposal. The performance is also illustrated with a real dataset, and some conclusions are drawn.
    Keywords: robust covariance matrix; robust location; robust estimation; high-dimension; shrinkage estimator; robust Mahalanobis distance; outlier detection
    Date: 2017–05
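The idea can be sketched as follows: compute Mahalanobis distances from a robust location estimate using a covariance matrix shrunk toward a scaled identity. The coordinatewise median and the fixed shrinkage intensity `alpha` are simplifications of the paper's robust, data-driven estimators; all names are illustrative.

```python
import numpy as np

def shrinkage_covariance(X, alpha=0.1):
    """Shrink the sample covariance toward a scaled-identity target;
    shrinkage keeps the matrix well conditioned in high dimension."""
    S = np.cov(X, rowvar=False)
    target = np.trace(S) / S.shape[0] * np.eye(S.shape[0])
    return (1 - alpha) * S + alpha * target

def mahalanobis_distances(X, alpha=0.1):
    """Distance of each row of X from a robust center, in the metric
    of the inverse shrinkage covariance."""
    mu = np.median(X, axis=0)  # simple robust location
    Sinv = np.linalg.inv(shrinkage_covariance(X, alpha))
    diff = X - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, Sinv, diff))
```

Observations are flagged as outliers when their distance exceeds a cutoff, typically a chi-square quantile with degrees of freedom equal to the data dimension.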
  8. By: Arthur Lewbel (Boston College)
    Abstract: Lewbel (2012) provides an estimator for linear regression models containing an endogenous regressor, when no outside instruments or other such information is available. The method works by exploiting model heteroscedasticity to construct instruments using the available regressors. Some authors have considered the method in empirical applications where an endogenous regressor is binary (e.g., endogenous Diff-in-Diff or endogenous binary treatment models), without proving validity of the estimator in that case. The present paper shows that the assumptions required for Lewbel’s estimator can indeed be satisfied when an endogenous regressor is binary.
    Keywords: Simultaneous systems, linear regressions, endogeneity, identification, heteroscedasticity, binary regressors, dummy regressors, linear probability model, logit, probit
    JEL: C35 C36 C30 C13
    Date: 2016–12–15
  9. By: Tsagris, Michail
    Abstract: We demonstrate how to test for the conditional independence of two categorical variables using Poisson log-linear models. The size of the conditioning set can vary from 0 (simple independence) up to many variables. We also provide an R function for performing the test. Instead of calculating all possible tables with a for loop, we perform the test using the log-linear models, thus speeding up the process. Time comparison simulation studies are presented.
    Keywords: Conditional independence, categorical data, Poisson log-linear models
    JEL: C12
    Date: 2017–03
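For a single conditioning variable, the deviance of the Poisson log-linear model without the XY interaction equals the familiar G² statistic of the stratified contingency table, which is the quantity the test is built on. A stdlib-only sketch under that assumption (the paper's R function also handles larger conditioning sets); compare `g2` to a chi-square with `df` degrees of freedom:

```python
from collections import Counter
from math import log

def g2_ci_test(x, y, z):
    """G^2 (deviance) statistic for testing X independent of Y given Z,
    with x, y, z equal-length sequences of category labels.
    Expected counts under conditional independence:
    E[n_xyz] = n_xz * n_yz / n_z within each stratum of Z."""
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    g2 = 0.0
    for (xi, yi, zi), o in n_xyz.items():
        e = n_xz[(xi, zi)] * n_yz[(yi, zi)] / n_z[zi]
        g2 += 2 * o * log(o / e)
    # degrees of freedom: sum over strata of (levels_x - 1)(levels_y - 1)
    df = sum((len({xi for (xi, zi) in n_xz if zi == s}) - 1)
             * (len({yi for (yi, zi) in n_yz if zi == s}) - 1)
             for s in n_z)
    return g2, df
```

When the observed table factorizes exactly within each stratum, G² is zero, as conditional independence requires.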
  10. By: Luis J. Álvarez (Banco de España); Ana Gómez-Loscos (Banco de España)
    Abstract: This paper presents a survey of output gap modeling techniques, which are of special interest for policy-making institutions. We distinguish between univariate methods, which estimate trend output on the basis of actual output alone, and multivariate methods, which incorporate useful information on other variables, based on economic theory. We present the main advantages and drawbacks of the different methods.
    Keywords: output gap, potential output, business cycle, trend output, survey
    JEL: E32 O4
    Date: 2017–05
  11. By: Lee, Dae-Jin; Durbán Reguera, María Luz; Carballo González, Alba
    Abstract: We present several methods for the prediction of new observations in penalized regression, based on the approaches proposed in: i) Currie et al. (2004), ii) Gilmour et al. (2004) and iii) Sacks et al. (1989). We extend the method introduced by Currie et al. (2004) to the prediction of new observations in the mixed model framework. In the context of penalties based on differences between adjacent coefficients (Eilers & Marx (1996)), the equivalence of the different methods is shown. We demonstrate several properties of the new coefficients in terms of the order of the penalty. We also introduce the concept of the memory of a P-spline; this new idea tells us how much past information is used to predict. The methodology and the concept of the memory of a P-spline are illustrated with three real data sets, two on the yearly mortality rates of Spanish men and one on rental prices.
    Keywords: Mixed Models; P-splines; Penalized regression; Prediction
    Date: 2017–05
  12. By: Tatiana Komarova; Denis Nekipelov; Evgeny Yakovlev
    Abstract: It is commonplace that the data needed for econometric inference are not contained in a single source. In this paper we analyze the problem of parametric inference from combined individual-level data when data combination is based on personal and demographic identifiers such as name, age, or address. Our main question is the identification of the econometric model based on the combined data when the data do not contain exact individual identifiers and no parametric assumptions are imposed on the joint distribution of information that is common across the combined dataset. We demonstrate the conditions on the observable marginal distributions of data in individual datasets that can and cannot guarantee identification of the parameters of interest. We also note that the data combination procedure is essential in a semiparametric setting such as ours. Given that the (non-parametric) data combination procedure can only be defined in finite samples, we introduce a new notion of identification based on the concept of limits of statistical experiments. Our results apply to the setting where the individual data used for inferences are sensitive and their combination may lead to a substantial increase in data sensitivity or to a de-anonymization of previously anonymized information. We demonstrate that the point identification of an econometric model from combined data is incompatible with restrictions on the risk of individual disclosure. If the data combination procedure guarantees a bound on the risk of individual disclosure, then the information available from the combined dataset allows one to identify the parameter of interest only partially, and the size of the identification region is inversely related to the upper bound guarantee for the disclosure risk.
This result is new in the context of data combination, as we note that the quality of the links needed in the combined data to assure point identification may be much higher than the average link quality in the entire dataset, so that point inference requires the use of the most sensitive subset of the data. Our results provide important insights into the ongoing discourse on the empirical analysis of merged administrative records, as well as into discussions of the disclosive nature of policies implemented by data-driven companies (such as Internet services companies and medical companies using individual patient records for policy decisions).
    Keywords: Data protection; model identification; data combination.
    JEL: C13 C14 C25 C35
  13. By: Muñoz García, Alberto; Hernández Banadik, Nicolás Jorge
    Abstract: In recent years the concept of data depth has been increasingly used in statistics as a center-outward ordering of sample points in multivariate data sets. Recently, data depth has been extended to functional data. In this paper we propose new intrinsic functional data depths based on the representation of functional data in Reproducing Kernel Hilbert Spaces, and test their performance against a number of well-known alternatives in the problem of functional outlier detection.
    Keywords: Outlier detection; Reproducing Kernel Hilbert Spaces; Functional Data Analysis; Kernel depth
    Date: 2017–04
  14. By: Chen, J.; Kobayashi, M.; McAleer, M.J.
    Abstract: The paper considers the problem of volatility co-movement, namely whether two financial returns have a perfectly correlated common volatility process, in the framework of multivariate stochastic volatility models, and proposes a test for volatility co-movement. The proposed test is a stochastic volatility version of the co-movement test of Engle and Susmel (1993), who investigated whether international equity markets have volatility co-movement in the framework of the ARCH model. In the empirical analysis we find that volatility co-movement exists among closely linked stock markets, and that volatility co-movement in exchange rate markets tends to be found when the overall volatility level is low, in contrast to the often-cited finding in the financial contagion literature that financial returns co-move in levels during financial crises.
    Keywords: Lagrange multiplier test, Volatility co-movement, Stock markets, Exchange rate markets, Financial crisis
    JEL: C12 C58 G01 G11
    Date: 2017–02–01
  15. By: Rodríguez Caballero, Carlos Vladimir; Ergemen, Yunus Emre
    Abstract: A dynamic multilevel factor model with possible stochastic time trends is proposed. In the model, long-range dependence and short memory dynamics are allowed in global and regional common factors as well as model innovations. Estimation of global and regional common factors is performed on the prewhitened series, for which the prewhitening parameter is estimated semiparametrically from the cross-sectional and regional average of the observable series. Employing canonical correlation analysis and a sequential least-squares algorithm on the prewhitened series, the resulting multilevel factor estimates have a centered asymptotic normal distribution. Selection of the number of global and regional factors is also discussed. Estimates are found to have good small-sample performance via Monte Carlo simulations. The method is then applied to the Nord Pool electricity market for the analysis of price comovements among different regions within the power grid. The global factor is identified to be the system price, and fractional cointegration relationships are found between regional prices and the system price.
    Keywords: Nord Pool power market; fractional cointegration; short memory; long-range dependence; Multi-level factor
    Date: 2017–05
  16. By: Amsler, Christine; Prokhorov, Artem; Schmidt, Peter
    Abstract: This paper considers a stochastic frontier model that contains environmental variables that affect the level of inefficiency but not the frontier. The model contains statistical noise, potentially endogenous regressors, and technical inefficiency that follows the scaling property, in the sense that it is the product of a basic (half-normal) inefficiency term and a parametric function of the environmental variables. The environmental variables may be endogenous because they are correlated with the statistical noise or with the basic inefficiency term. Several previous papers have considered the case of inputs that are endogenous because they are correlated with statistical noise, and if they contain environmental variables these are exogenous. One recent paper allows the environmental variables to be correlated with statistical noise. Our paper is the first to allow both the inputs and the environmental variables to be endogenous in the sense that they are correlated either with statistical noise or with the basic inefficiency term. Correlation of inputs or environmental variables with the basic inefficiency term raises non-trivial conceptual issues about the meaning of exogeneity, and technical issues of estimation of the model.
    Keywords: environmental variables; stochastic frontier; endogeneity
    Date: 2017–04–09
  17. By: James D. Hamilton
    Abstract: Here's why. (1) The HP filter produces series with spurious dynamic relations that have no basis in the underlying data-generating process. (2) Filtered values at the end of the sample are very different from those in the middle, and are also characterized by spurious dynamics. (3) A statistical formalization of the problem typically produces values for the smoothing parameter vastly at odds with common practice, e.g., a value for λ far below 1600 for quarterly data. (4) There's a better alternative. A regression of the variable at date t+h on the four most recent values as of date t offers a robust approach to detrending that achieves all the objectives sought by users of the HP filter with none of its drawbacks.
    JEL: C22 E32 E47
    Date: 2017–05
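The alternative described in point (4) is easy to state in code: regress y at date t+h on a constant and the four most recent values y_t, ..., y_{t-3}, and take the residuals as the cyclical component. A sketch with h = 8, the horizon Hamilton suggests for quarterly data; the function name is illustrative.

```python
import numpy as np

def hamilton_filter(y, h=8):
    """Regression-based detrending: regress y_{t+h} on a constant and
    y_t, y_{t-1}, y_{t-2}, y_{t-3}; the residuals are the cycle."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Regressor matrix aligned so row t uses y_t, ..., y_{t-3}
    # and the dependent variable is y_{t+h}, for t = 3, ..., T-h-1.
    X = np.column_stack([np.ones(T - h - 3)] +
                        [y[3 - lag: T - h - lag] for lag in range(4)])
    target = y[3 + h: T]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    cycle = target - X @ beta  # cyclical component (residuals)
    return cycle, beta
```

On a deterministic linear trend the regression fits y_{t+h} exactly, so the extracted cycle is zero, which is the sense in which the procedure removes the trend without the HP filter's spurious dynamics.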
  18. By: Kateryna Kononova; Anton Dek
    Abstract: The paper proposes a method for financial time series forecasting that takes the semantics of news into account. For the semantic analysis of financial news, a sample of economically negative and positive words was formed on the basis of the Loughran-McDonald Master Dictionary. The sample included words with a high frequency of occurrence in financial market news. For single-root words, only the common stem was kept, so that a single query covers several words. Neural networks were chosen for modeling and forecasting. To automate the extraction of information from economic news, a script based on the generated sample of positive and negative words was developed in the MATLAB Simulink programming environment. Experimental studies with different neural network architectures showed the high adequacy of the constructed models and confirmed the feasibility of using information from news feeds to predict stock prices.
    Date: 2017–05
  19. By: Elías Fernández, Antonio; Jiménez Recaredo, Raúl José
    Abstract: We propose a new methodology for predicting a partially observed curve from a functional data sample. The novelty of our approach lies in the selection of sample curves that form tight bands preserving the shape of the curve to be predicted, making it a deep datum. The subsampling problem involved is handled by algorithms specially designed to be used in conjunction with two different tools for computing central regions for functional data. From this merge, we obtain prediction bands for the unobserved part of the curve in question. We test our algorithms by forecasting Spanish electricity demand and imputing missing daily temperatures. The results are consistent with our simulations, which show that we can predict at far horizons.
    Keywords: daily temperatures; electricity demand; central regions; depth measures
    Date: 2017–05

This nep-ecm issue is ©2017 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.