nep-ecm New Economics Papers
on Econometrics
Issue of 2022‒09‒12
nineteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Power of unit root tests against nonlinear and noncausal alternatives By Frédérique Bec; Alain Guay; Heino Bohn Nielsen; Sarra Saïdi
  2. Efficient estimation of spatial econometrics models with skewed and heavy-tailed distributed errors By Vincenzo Verardi
  3. The Econometrics of Financial Duration Modeling By Giuseppe Cavaliere; Thomas Mikosch; Anders Rahbek; Frederik Vilandt
  4. Difference-in-Differences with a Misclassified Treatment By Akanksha Negi; Digvijay Singh Negi
  5. Bootstrap inference in the presence of bias By Giuseppe Cavaliere; Sílvia Gonçalves; Morten Ørregaard Nielsen
  6. Identification and Inference with Min-over-max Estimators for the Measurement of Labor Market Fairness By Karthik Rajkumar
  7. Weak Instruments, First-Stage Heteroskedasticity, the Robust F-Test and a GMM Estimator with the Weight Matrix Based on First-Stage Residuals By Frank Windmeijer
  8. Identification and (Fast) Estimation of Large Nonlinear Panel Models with Two-Way Fixed Effects By Mugnier, Martin; Wang, Ao
  9. Factor Network Autoregressions By Matteo Barigozzi; Giuseppe Cavaliere; Graziano Moramarco
  10. Estimating Inequality with Missing Incomes By Paolo Brunori; Pedro Salas-Rojo; Paolo Verme
  11. How does a local Instrumental Variable Method perform across settings with instruments of differing strengths? A simulation study and an evaluation of emergency surgery. By Moler-Zapata, S.;; Grieve, R.;; Basu, A.;; O'Neill, S.;
  12. Bayesian ranking and selection with applications to field studies, economic mobility, and forecasting By Dillon Bowen
  13. Testing for Outliers with Conformal P-Values By Bates, Stephen; Candes, Emmanuel; Lei, Lihua; Romano, Yaniv; Sesia, Matteo
  14. A Hawkes model with CARMA(p,q) intensity By Lorenzo Mercuri; Andrea Perchiazzo; Edit Rroji
  15. Investigation of the Convex Time Budget Experiment by Parameter Recovery Simulation By Yuta Shimodaira; Kohei Shiozawa; Keigo Inukai
  16. Preferences and performance in simultaneous first-price auctions: a structural analysis By Gentry, Matthew; Komarova, Tatiana; Schiraldi, Pasquale
  17. Computing Bayes: From Then `Til Now By Gael M. Martin; David T. Frazier; Christian P. Robert
  18. Skewed SVARs: tracking the structural sources of macroeconomic tail risks By Carlos Montes-Galdón; Eva Ortega
  19. The Limit of the Marginal Distribution Model in Consumer Choice By Yanqiu Ruan; Xiaobo Li; Karthyek Murthy; Karthik Natarajan

  1. By: Frédérique Bec; Alain Guay; Heino Bohn Nielsen; Sarra Saïdi (Université de Cergy-Pontoise, THEMA)
    Abstract: The increasing sophistication of economic and financial time series modelling creates a need for a test of the time dependence structure of a series that does not require a proper specification of the alternative, since the latter is unknown beforehand. Moreover, stationarity has to be established before estimating and testing causal/noncausal or linear/nonlinear models, as their econometric theory has been developed under the maintained assumption of stationarity. In this paper, we propose a new unit root test statistic which is asymptotically consistent against all stationary alternatives and still retains good power properties in finite samples. A large simulation study assesses the power of our test relative to existing unit root tests built specifically for various kinds of stationary alternatives, when the true DGP is causal or noncausal, linear or nonlinear stationary. Across various sample sizes and degrees of persistence, the new test performs very well in terms of finite-sample power, whatever the alternative under consideration.
    Keywords: Unit root test, Threshold autoregressive model, Noncausal model.
    JEL: C12 C22 C32
    Date: 2022
  2. By: Vincenzo Verardi (Université de Namur)
    Abstract: In spatial econometrics, estimation of models by maximum likelihood (ML) generally relies on the assumption of normally distributed errors. While this approach leads to highly efficient estimators when the distribution is Gaussian, GMM might yield more efficient estimators if the distribution is misspecified. For the SAR model, Lee (2004) proposes an alternative QML estimator that is less sensitive to violations of the normality assumption. In this presentation, I derive an estimator that is highly efficient for skewed and heavy-tailed distributions. More precisely, I assume that the distribution of the errors is a Tukey g-and-h (Tgh). However, because the density function of the Tgh has no explicit form, the optimization program for the MLE requires a numerical inversion of the quantile function to fit the model, which is computationally demanding. To overcome this difficulty, I rely on the local asymptotic normality (LAN) property of spatial econometrics models to propose an estimator that avoids such a computational burden. Monte Carlo simulations show that the proposed estimator outperforms existing ones as soon as the distribution of the errors departs from Gaussianity, either by exhibiting heavier tails or skewness. I illustrate the usefulness of the suggested procedure with a trade regression.
    Date: 2022–08–01
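The computational bottleneck the abstract describes is easy to see in code. Below is a minimal, illustrative sketch (not the author's implementation; the parameter values are arbitrary) of the Tukey g-and-h mechanics: sampling is a direct transform of a standard normal draw, but evaluating the density inside an MLE requires numerically inverting that transform, here done by bisection.

```python
import math
import random

def tgh_transform(z, g=0.5, h=0.1):
    """Tukey g-and-h transform of a standard normal draw z (monotone for h >= 0)."""
    gz = (math.exp(g * z) - 1.0) / g if g != 0 else z
    return gz * math.exp(h * z * z / 2.0)

def tgh_sample(n, g=0.5, h=0.1, seed=0):
    """Sampling is cheap: transform standard normal draws directly."""
    rng = random.Random(seed)
    return [tgh_transform(rng.gauss(0.0, 1.0), g, h) for _ in range(n)]

def tgh_invert(x, g=0.5, h=0.1, lo=-10.0, hi=10.0, tol=1e-10):
    """Numerically invert the transform by bisection -- the step the abstract
    flags as computationally demanding when repeated inside an optimizer."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tgh_transform(mid, g, h) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```
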
  3. By: Giuseppe Cavaliere; Thomas Mikosch; Anders Rahbek; Frederik Vilandt
    Abstract: We discuss estimation and inference in financial duration models. For the classical autoregressive conditional duration (ACD) models of Engle and Russell (1998, Econometrica 66, 1127-1162), we show the surprising result that the large sample behavior of likelihood estimators depends on the tail behavior of the durations. Even under stationarity, asymptotic normality breaks down for tail indices smaller than one. Instead, estimators are mixed Gaussian with non-standard rates of convergence. We exploit the crucial fact that for duration data the number of observations within any time span is random. Our results apply to general econometric models where the number of observed events is random.
    Date: 2022–08
  4. By: Akanksha Negi; Digvijay Singh Negi
    Abstract: This paper studies identification and estimation of the average treatment effect on the treated (ATT) in difference-in-differences (DID) designs when the variable that classifies individuals into treatment and control groups (treatment status, D) is endogenously misclassified. We show that misclassification in D hampers consistent estimation of the ATT because 1) it prevents us from distinguishing the truly treated from those misclassified as treated and 2) differential misclassification in counterfactual trends may cause parallel trends to fail with D even when they hold with the true but unobserved D*. We propose a solution that corrects for endogenous one-sided misclassification in the context of a parametric DID regression allowing for considerable heterogeneity in treatment effects, and we establish its asymptotic properties in panel and repeated cross-section settings. Finally, we illustrate the method by using it to estimate the insurance impact of a large-scale in-kind food transfer program in India which is known to suffer from large targeting errors.
    Date: 2022–08
  5. By: Giuseppe Cavaliere; Sílvia Gonçalves; Morten Ørregaard Nielsen
    Abstract: We consider bootstrap inference for estimators which are (asymptotically) biased. We show that, even when the bias term cannot be consistently estimated, valid inference can be obtained by proper implementations of the bootstrap. Specifically, we show that the prepivoting approach of Beran (1987, 1988), originally proposed to deliver higher-order refinements, restores bootstrap validity by transforming the original bootstrap p-value into an asymptotically uniform random variable. We propose two different implementations of prepivoting (plug-in and double bootstrap), and provide general high-level conditions that imply validity of bootstrap inference. To illustrate the practical relevance and implementation of our results, we discuss five applications: (i) a simple location model for i.i.d. data, possibly with infinite variance; (ii) regression models with omitted controls; (iii) inference on a target parameter based on model averaging; (iv) ridge-type regularized estimators; and (v) dynamic panel data models.
    Date: 2022–08
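The prepivoting idea can be illustrated in a few lines. The sketch below is a stylized double-bootstrap version for the paper's first application (an i.i.d. location model) and is not the authors' code; the t-statistic, resample counts and recentring scheme are illustrative assumptions.

```python
import random
import statistics

def tstat(x, mu0):
    """Ordinary t-statistic for H0: mean = mu0."""
    n = len(x)
    return (statistics.fmean(x) - mu0) / (statistics.stdev(x) / n ** 0.5)

def boot_pvalue(x, mu0, B, rng):
    """One-level bootstrap p-value; resamples are recentred at the sample
    mean so the bootstrap imposes the null."""
    t0 = abs(tstat(x, mu0))
    xbar = statistics.fmean(x)
    hits = 0
    for _ in range(B):
        xs = [rng.choice(x) for _ in x]
        if abs(tstat(xs, xbar)) >= t0:
            hits += 1
    return (hits + 1) / (B + 1)

def prepivoted_pvalue(x, mu0, B_outer=199, B_inner=99, seed=0):
    """Double-bootstrap prepivoting in the spirit of Beran (1988): transform
    the bootstrap p-value by its own bootstrap distribution, pushing the
    result toward uniformity under the null even when the estimator is biased."""
    rng = random.Random(seed)
    p_hat = boot_pvalue(x, mu0, B_outer, rng)
    xbar = statistics.fmean(x)
    hits = 0
    for _ in range(B_outer):
        xs = [rng.choice(x) for _ in x]
        if boot_pvalue(xs, xbar, B_inner, rng) <= p_hat:
            hits += 1
    return (hits + 1) / (B_outer + 1)
```
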
  6. By: Karthik Rajkumar
    Abstract: These notes show how to do inference on the Demographic Parity (DP) metric. Although the metric is a complex statistic involving min and max computations, we propose a smooth approximation of those functions and derive its asymptotic distribution. The limits of these approximations and their gradients converge to those of the true max and min functions wherever the latter exist. More importantly, when the true max and min functions are not differentiable, the approximations still are, and they provide valid asymptotic inference everywhere in the domain. We conclude with some directions on how to compute confidence intervals for DP, how to test whether it is under 0.8 (the U.S. Equal Employment Opportunity Commission fairness threshold), and how to do inference in an A/B test.
    Date: 2022–07
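The abstract does not name the smoothing; one standard differentiable surrogate for max and min is log-sum-exp, sketched below. The function names and the reading of DP as a min-over-max ratio of group selection rates are illustrative assumptions, not the paper's exact construction.

```python
import math

def softmax_max(xs, beta=50.0):
    """Log-sum-exp surrogate for max: (1/beta) * log(sum exp(beta * x)).
    Converges to max(xs) as beta -> infinity; shifting by the true max
    keeps the exponentials from overflowing."""
    m = max(xs)
    return m + math.log(sum(math.exp(beta * (x - m)) for x in xs)) / beta

def softmax_min(xs, beta=50.0):
    """Smooth min via -smoothmax(-x)."""
    return -softmax_max([-x for x in xs], beta)

def dp_ratio(rates, beta=50.0):
    """Smoothed min-over-max ratio of group selection rates -- a
    differentiable stand-in for a demographic parity metric (hypothetical
    formulation; the paper's smoothing may differ)."""
    return softmax_min(rates, beta) / softmax_max(rates, beta)
```

Unlike the raw min/max, these surrogates have gradients everywhere, which is what makes delta-method asymptotics available.
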
  7. By: Frank Windmeijer
    Abstract: This paper is concerned with the findings related to the robust first-stage F-statistic in the Monte Carlo analysis of Andrews (2018), who found in a heteroskedastic grouped-data design that even for very large values of the robust F-statistic, the standard 2SLS confidence intervals had large coverage distortions. This finding appears to discredit the robust F-statistic as a test for underidentification. However, it is shown here that large values of the robust F-statistic do imply that there is first-stage information, but this may not be utilized well by the 2SLS estimator, or the standard GMM estimator. An estimator that corrects for this is a robust GMM estimator, denoted GMMf, with the robust weight matrix not based on the structural residuals, but on the first-stage residuals. For the grouped-data setting of Andrews (2018), this GMMf estimator gives the weights to the group specific estimators according to the group specific concentration parameters in the same way as 2SLS does under homoskedasticity, which is formally shown using weak instrument asymptotics. The GMMf estimator is much better behaved than the 2SLS estimator in the Andrews (2018) design, behaving well in terms of relative bias and Wald-test size distortion at more standard values of the robust F-statistic. We show that the same patterns can occur in a dynamic panel data model when the error variance is heteroskedastic over time. We further derive the conditions under which the Stock and Yogo (2005) weak instruments critical values apply to the robust F-statistic in relation to the behaviour of the GMMf estimator.
    Date: 2022–08
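For readers who want the baseline objects the paper starts from, here is a minimal just-identified sketch: the IV (Wald) estimator, to which 2SLS reduces with one instrument and one regressor, and the classical homoskedastic first-stage F-statistic. The paper's contribution concerns the heteroskedasticity-robust F and a GMM estimator weighted by first-stage residuals; neither is implemented here.

```python
import statistics

def _cov(a, b):
    """Sample covariance (avoids statistics.covariance, which needs Python 3.10+)."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def iv_wald(y, x, z):
    """Just-identified IV (Wald) estimator with a single instrument:
    beta = cov(z, y) / cov(z, x); 2SLS reduces to this case."""
    return _cov(z, y) / _cov(z, x)

def first_stage_F(x, z):
    """Classical first-stage F for x = a + b*z + v with one instrument
    (F = t^2 on b); the rule of thumb flags F < 10 as weak. The robust
    version discussed in the paper replaces the error-variance estimate."""
    n = len(x)
    b = _cov(z, x) / statistics.variance(z)
    a = statistics.fmean(x) - b * statistics.fmean(z)
    ssr = sum((xi - a - b * zi) ** 2 for xi, zi in zip(x, z))
    var_b = (ssr / (n - 2)) / ((n - 1) * statistics.variance(z))
    return b * b / var_b
```
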
  8. By: Mugnier, Martin (CREST, ENSAE, Institut Polytechnique de Paris); Wang, Ao (University of Warwick, CAGE Research Centre)
    Abstract: We study a nonlinear two-way fixed effects panel model that allows for unobserved individual heterogeneity in slopes (interacting with covariates) and (unknown) flexibly specified link function. The former is particularly relevant when the researcher is interested in the distributional causal effects of covariates, and the latter mitigates potential misspecification errors due to imposing a known link function. We show that the fixed effects parameters and the (nonparametrically specified) link function can be identified when both individual and time dimensions are large. We propose a novel iterative Gauss-Seidel estimation procedure that overcomes the practical challenge of dimensionality in the number of fixed effects when the dataset is large. We revisit two empirical studies in trade (Helpman et al., 2008) and innovation (Aghion et al., 2013), and find non-negligible unobserved dispersion in trade elasticity (across countries) and the effect of institutional ownership on innovation (across firms). These exercises emphasize the usefulness of our method in capturing flexible (and unobserved) heterogeneity in the causal relationship of interest that may have important implications for the subsequent policy analysis.
    Date: 2022
  9. By: Matteo Barigozzi; Giuseppe Cavaliere; Graziano Moramarco
    Abstract: We propose a factor network autoregressive (FNAR) model for time series with complex network structures. The coefficients of the model reflect many different types of connections between economic agents ("multilayer network"), which are summarized into a smaller number of network matrices ("network factors") through a novel tensor-based principal component approach. We provide consistency results for the estimation of the factors and the coefficients of the FNAR. Our approach combines two different dimension-reduction techniques and can be applied to ultra-high dimensional datasets. In an empirical application, we use the FNAR to investigate the cross-country interdependence of GDP growth rates based on a variety of international trade and financial linkages. The model provides a rich characterization of macroeconomic network effects and exhibits good forecast performance compared to popular dimension-reduction methods.
    Date: 2022–08
  10. By: Paolo Brunori; Pedro Salas-Rojo; Paolo Verme
    Abstract: The measurement of income inequality is affected by missing observations, especially if they are concentrated in the tails of an income distribution. This paper conducts an experiment to test how the different correction methods proposed by the statistical, econometric and machine learning literature address measurement biases of inequality due to item non-response. We take a baseline survey and artificially corrupt the data using several alternative non-linear functions that simulate patterns of income non-response, and show how biased inequality statistics can be when item non-response is ignored. The comparative assessment of correction methods indicates that most methods are able to partially correct for missing data biases. Sample reweighting based on probabilities of non-response produces inequality estimates quite close to the true values under most simulated missing data patterns. Matching and Pareto corrections can also be effective for selected missing data patterns. Other methods, such as single and multiple imputation and machine learning methods, are less effective. A final discussion provides some elements that help explain these findings.
    Keywords: Income Inequality; Item non-response; Income Distributions; Inequality Predictions; Imputations.
    JEL: D31 D63 E64 O15
    Date: 2022
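The mechanical bias the abstract studies is easy to reproduce: with a standard Gini formula, deleting top incomes (simulating tail non-response) lowers measured inequality. A minimal sketch:

```python
def gini(incomes):
    """Gini coefficient via the sorted-rank formula:
    G = 2 * sum_i i * x_(i) / (n * sum x) - (n + 1) / n,
    where x_(i) is the i-th smallest income."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * cum / (n * total) - (n + 1.0) / n
```

Dropping the richest observations from any sample and recomputing `gini` illustrates the downward bias that the correction methods in the paper try to undo.
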
  11. By: Moler-Zapata, S.;; Grieve, R.;; Basu, A.;; O'Neill, S.;
    Abstract: Local instrumental variable (LIV) approaches use continuous/multi-valued instrumental variables (IV) to generate consistent estimates of average treatment effects (ATEs) and Conditional Average Treatment Effects (CATEs). However, there is little evidence on how LIV approaches perform with different sample sizes or according to the strength of the IV (as measured by the first-stage F-statistic). We examined the performance of an LIV approach and a two-stage least squares (2SLS) approach in settings with different sample sizes and IV strengths, and considered the implications for practice. Our simulation study considered three sample sizes (n = 5000, 10000, 50000), six levels of IV strength (F-statistic = 10, 25, 50, 100, 500, 1000) under four ‘heterogeneity’ scenarios: effect homogeneity, overt heterogeneity (over measured covariates), essential heterogeneity (over unmeasured covariates), and overt and essential heterogeneity combined. Compared to 2SLS, the LIV approach provided estimates for ATE and CATE with lower levels of bias and RMSE, irrespective of the sample size or IV strength. With smaller sample sizes, both approaches required IVs with greater strength to ensure low (less than 5%) levels of bias. In the presence of overt and/or essential heterogeneity, the LIV approach reported estimates with low bias even when the sample size was smaller (n = 5000), provided that the instrument was moderately strong (F-statistic greater than 50, for the ATE estimand). We considered both methods in evaluating emergency surgery across three different acute conditions with IVs of differing strengths (F-statistic ranging from 100 to 9000), and sample sizes (100000 to 300000). We found that 2SLS did not detect significant differences in effectiveness across subgroups, even with subgroup by treatment interactions included in the model. The LIV approach found there were substantive differences in the effectiveness of emergency surgery according to subgroups; for each of the three acute conditions, frail patients had worse outcomes following emergency surgery. These findings indicate that when a continuous IV of a moderate strength is available, LIV approaches are better suited than 2SLS to estimate policy-relevant treatment effect parameters.
    Keywords: instrumental variables; instrument strength; tendency to operate; emergency surgery;
    Date: 2022–07
  12. By: Dillon Bowen
    Abstract: Decision-making often involves ranking and selection. For example, to assemble a team of political forecasters, we might begin by narrowing our choice set to the candidates we are confident rank among the top 10% in forecasting ability. Unfortunately, we do not know each candidate's true ability but observe a noisy estimate of it. This paper develops new Bayesian algorithms to rank and select candidates based on noisy estimates. Using simulations based on empirical data, we show that our algorithms often outperform frequentist ranking and selection algorithms. Our Bayesian ranking algorithms yield shorter rank confidence intervals while maintaining approximately correct coverage. Our Bayesian selection algorithms select more candidates while maintaining correct error rates. We apply our ranking and selection procedures to field experiments, economic mobility, forecasting, and similar problems. Finally, we implement our ranking and selection techniques in a user-friendly Python package documented here:
    Date: 2022–08
  13. By: Bates, Stephen (UC Berkeley); Candes, Emmanuel (Stanford U); Lei, Lihua (Stanford U); Romano, Yaniv (Israel Institute of Technology); Sesia, Matteo (University of Southern California)
    Abstract: This paper studies the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense. We then introduce a new method to compute p-values that are both valid conditionally on the training data and independent of each other for different test points; this paves the way to stronger type-I error guarantees. Our results depart from classical conformal inference as we leverage concentration inequalities rather than combinatorial arguments to establish our finite-sample guarantees. Furthermore, our techniques also yield a uniform confidence bound for the false positive rate of any outlier detection algorithm, as a function of the threshold applied to its raw statistics. Finally, the relevance of our results is demonstrated by numerical experiments on real and simulated data.
    Date: 2022–05
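The marginal conformal p-values the paper builds on have a one-line form: the rank of a test point's nonconformity score among the calibration scores. A minimal sketch, assuming higher scores indicate more outlying points:

```python
import bisect

def conformal_pvalues(cal_scores, test_scores):
    """Marginal conformal p-value for each test point: with n calibration
    scores, p = (1 + #{calibration scores >= test score}) / (n + 1).
    Valid when the test point is exchangeable with the calibration set."""
    n = len(cal_scores)
    s = sorted(cal_scores)
    pvals = []
    for t in test_scores:
        ge = n - bisect.bisect_left(s, t)  # calibration scores >= t
        pvals.append((ge + 1) / (n + 1))
    return pvals
```

These p-values are marginally valid but mutually dependent across test points, which is exactly the complication the paper's conditional construction addresses.
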
  14. By: Lorenzo Mercuri; Andrea Perchiazzo; Edit Rroji
    Abstract: In this paper we introduce a new model, the CARMA(p,q)-Hawkes process. The Hawkes model with exponential kernel implies a strictly decreasing autocorrelation function, whereas empirical evidence rejects this monotonicity assumption. The proposed model is a Hawkes process whose intensity follows a Continuous Time Autoregressive Moving Average (CARMA) process and is thus able to reproduce more realistic dependence structures. We also study conditions for stationarity and positivity of the intensity and the strong mixing property of the increments. Furthermore, we compute the likelihood, present a simulation method and discuss an estimation method based on the autocorrelation function. A simulation and estimation exercise highlights the main features of the CARMA(p,q)-Hawkes model.
    Date: 2022–08
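For context, the classical exponential-kernel Hawkes process that the paper generalizes can be simulated exactly by Ogata's thinning algorithm. The sketch below covers only this baseline case with illustrative parameters, not the CARMA(p,q)-driven intensity proposed in the paper:

```python
import math
import random

def simulate_hawkes_exp(mu=1.0, alpha=0.5, beta=1.5, T=100.0, seed=0):
    """Ogata thinning for a Hawkes process with exponential kernel,
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
    Stationary when alpha / beta < 1."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    lam_excess = 0.0                         # intensity above the baseline mu
    while True:
        lam_bar = mu + lam_excess            # upper bound until the next event
        w = rng.expovariate(lam_bar)         # candidate waiting time
        lam_excess *= math.exp(-beta * w)    # decay excess intensity to t + w
        t += w
        if t > T:
            return events
        if rng.random() <= (mu + lam_excess) / lam_bar:  # accept (thinning)
            events.append(t)
            lam_excess += alpha              # self-excitation jump
```

The strictly decreasing kernel here is what forces the monotone autocorrelation that the abstract argues is rejected in data.
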
  15. By: Yuta Shimodaira; Kohei Shiozawa; Keigo Inukai
    Abstract: The convex time budget (CTB) method is a widely used experimental method for eliciting an individual’s time preference. Researchers adopting the CTB experiment usually assume quasi-hyperbolic discounting utility as a behavioural model and estimate the parameters of the utility function. However, few studies using the CTB method have examined parameter recovery. We conduct simulations and find that the estimation error of the present-bias parameter is so large that its effect is difficult to detect. The large error stems from an improper combination of the experimental method and the utility model, and it is not a problem that can be dealt with after data collection. This paper highlights the importance of running parameter recovery simulations to audit estimation errors at the experimental design stage.
    Date: 2022–08
  16. By: Gentry, Matthew; Komarova, Tatiana; Schiraldi, Pasquale
    Abstract: Motivated by the prevalence of simultaneous bidding across a wide range of auction markets, we develop and estimate a model of strategic interaction in simultaneous first-price auctions when objects are heterogeneous and bidders have non-additive preferences over combinations. We establish non-parametric identification of primitives in this model under standard exclusion restrictions, providing a basis for both estimation and testing of preferences over combinations. We then apply our model to data on Michigan Department of Transportation (MDOT) highway procurement auctions, quantifying the magnitude of cost synergies and evaluating the performance of the simultaneous first-price mechanism in the MDOT marketplace.
    Keywords: auctions; complementarities; identification; ES/N000056/1
    JEL: D44
    Date: 2022–07–04
  17. By: Gael M. Martin; David T. Frazier; Christian P. Robert
    Abstract: This paper takes the reader on a journey through the history of Bayesian computation, from the 18th century to the present day. Beginning with the one-dimensional integral first confronted by Bayes in 1763, we highlight the key contributions of: Laplace, Metropolis (and, importantly, his coauthors!), Hammersley and Handscomb, and Hastings, all of which set the foundations for the computational revolution in the late 20th century -- led, primarily, by Markov chain Monte Carlo (MCMC) algorithms. A very short outline of 21st century computational methods -- including pseudo-marginal MCMC, Hamiltonian Monte Carlo, sequential Monte Carlo, and the various `approximate' methods -- completes the paper.
    Keywords: History of Bayesian computation, Laplace approximation, Metropolis-Hastings algorithm, importance sampling, Markov chain Monte Carlo, pseudo-marginal methods, Hamiltonian Monte Carlo, sequential Monte Carlo, approximate Bayesian methods
    Date: 2022
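As a companion to the historical narrative, here is the random-walk Metropolis algorithm in its minimal modern form (a special case of Metropolis-Hastings with a symmetric proposal, so the Hastings correction cancels), targeting a standard normal as the canonical test case:

```python
import math
import random

def metropolis_hastings(log_target, x0=0.0, n=10_000, step=1.0, seed=0):
    """Random-walk Metropolis: propose x' = x + N(0, step^2) and accept
    with probability min(1, pi(x') / pi(x)). Only the target's log-density
    up to a constant is needed -- the property that made MCMC practical."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    chain = []
    for _ in range(n):
        xp = x + rng.gauss(0.0, step)
        lpp = log_target(xp)
        if rng.random() < math.exp(min(0.0, lpp - lp)):  # accept move
            x, lp = xp, lpp
        chain.append(x)
    return chain

# Target: a standard normal, known only up to its normalizing constant.
chain = metropolis_hastings(lambda x: -0.5 * x * x)
```

After discarding a burn-in, the chain's sample mean and variance approximate those of the target.
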
  18. By: Carlos Montes-Galdón (European Central Bank); Eva Ortega (Banco de España)
    Abstract: This paper proposes a vector autoregressive model with structural shocks (SVAR) that are identified using sign restrictions and whose distribution is subject to time-varying skewness. It also presents an efficient Bayesian algorithm to estimate the model. The model allows for the joint tracking of asymmetric risks to macroeconomic variables included in the SVAR. It also provides a narrative about the structural reasons for the changes over time in those risks. Using euro area data, our estimation suggests that there has been a significant variation in the skewness of demand, supply and monetary policy shocks between 1999 and 2019. This variation lies behind a significant proportion of the joint dynamics of real GDP growth and inflation in the euro area over this period, and also generates important asymmetric tail risks in these macroeconomic variables. Finally, compared to the literature on growth- and inflation-at-risk, we found that financial stress indicators do not suffice to explain all the macroeconomic tail risks.
    Keywords: Bayesian SVAR, skewness, growth-at-risk, inflation-at-risk
    JEL: C11 C32 C51 E31 E32
    Date: 2022–03
  19. By: Yanqiu Ruan; Xiaobo Li; Karthyek Murthy; Karthik Natarajan
    Abstract: Given data on choices made by consumers for different assortments, a key challenge is to develop parsimonious models that describe and predict consumer choice behavior. One such choice model is the marginal distribution model which requires only the specification of the marginal distributions of the random utilities of the alternatives to explain choice data. In this paper, we develop an exact characterisation of the set of choice probabilities which are representable by the marginal distribution model consistently across any collection of assortments. Allowing for the possibility of alternatives to be grouped based on the marginal distribution of their utilities, we show (a) verifying consistency of choice probability data with this model is possible in polynomial time and (b) finding the closest fit reduces to solving a mixed integer convex program. Our results show that the marginal distribution model provides much better representational power as compared to multinomial logit and much better computational performance as compared to the random utility model.
    Date: 2022–08

This nep-ecm issue is ©2022 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.