
on Econometrics 
By:  Frédérique Bec; Alain Guay; Heino Bohn Nielsen; Sarra Saïdi (Université de Cergy-Pontoise, THEMA) 
Abstract:  The increasing sophistication of economic and financial time series modelling creates a need for a test of the time dependence structure of the series that does not require a proper specification of the alternative, since the latter is unknown beforehand. Yet stationarity has to be established before proceeding to the estimation and testing of causal/noncausal or linear/nonlinear models, as their econometric theory has been developed under the maintained assumption of stationarity. In this paper, we propose a new unit root test statistic which is asymptotically consistent against all stationary alternatives and still retains good power properties in finite samples. A large simulation study assesses the power of our test against existing unit root tests built specifically for various kinds of stationary alternatives, when the true DGP is either causal or noncausal, linear or nonlinear stationary. Across various sample sizes and degrees of persistence, the new test performs very well in terms of finite-sample power, regardless of the alternative under consideration. 
Keywords:  Unit root test, Threshold autoregressive model, Noncausal model. 
JEL:  C12 C22 C32 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:ema:worpap:202214&r= 
By:  Vincenzo Verardi (Université de Namur) 
Abstract:  In spatial econometrics, estimation of models by maximum likelihood (ML) generally relies on the assumption of normally distributed errors. While this approach leads to highly efficient estimators when the distribution is Gaussian, GMM might yield more efficient estimators if the distribution is misspecified. For the SAR model, Lee (2004) proposes an alternative QML estimator that is less sensitive to violations of the normality assumption. In this presentation, I derive an estimator that is highly efficient for skewed and heavy-tailed distributions. More precisely, I assume that the distribution of the errors is a Tukey g-and-h (Tgh). However, because the density function of the Tgh has no explicit form, the optimization program for the MLE needs a numerical inversion of the quantile function to fit the model, which is a computationally demanding task. To overcome this difficulty, I rely on the local asymptotic normality (LAN) property of spatial econometric models to propose an estimator that avoids such a computational burden. Monte Carlo simulations show that this estimator outperforms the available alternatives as soon as the distribution of the errors departs from Gaussianity, either by exhibiting heavier tails or skewness. I illustrate the usefulness of the suggested procedure with a trade regression. 
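For context, although the Tukey g-and-h density has no closed form, its quantile transform is explicit: X = (exp(gZ) - 1)/g * exp(hZ^2/2) for Z ~ N(0,1), with g controlling skewness and h tail heaviness. A minimal sampling sketch (not the paper's estimator; parameter values are illustrative):

```python
import numpy as np

def tukey_gh_sample(n, g=0.5, h=0.1, seed=0):
    """Draw n samples from a Tukey g-and-h distribution by pushing
    standard normals through the explicit quantile transform
    T(z) = (exp(g*z) - 1)/g * exp(h*z**2/2); g = h = 0 gives N(0,1)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    if g == 0:
        return z * np.exp(h * z**2 / 2)
    return (np.exp(g * z) - 1) / g * np.exp(h * z**2 / 2)

x = tukey_gh_sample(100_000, g=0.5, h=0.1)   # positively skewed, heavy-tailed
```

This is also the standard way to simulate Tgh errors in Monte Carlo designs such as the one described above.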
Date:  2022–08–01 
URL:  http://d.repec.org/n?u=RePEc:boc:fsug22:18&r= 
By:  Giuseppe Cavaliere; Thomas Mikosch; Anders Rahbek; Frederik Vilandt 
Abstract:  We discuss estimation and inference in financial duration models. For the classical autoregressive conditional duration (ACD) models of Engle and Russell (1998, Econometrica 66, 1127–1162), we show the surprising result that the large-sample behavior of likelihood estimators depends on the tail behavior of the durations. Even under stationarity, asymptotic normality breaks down for tail indices smaller than one. Instead, estimators are mixed Gaussian with nonstandard rates of convergence. We exploit the crucial fact that for duration data the number of observations within any time span is random. Our results apply to general econometric models in which the number of observed events is random. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02098&r= 
By:  Akanksha Negi; Digvijay Singh Negi 
Abstract:  This paper studies identification and estimation of the average treatment effect on the treated (ATT) in difference-in-differences (DID) designs when the variable that classifies individuals into treatment and control groups (treatment status, D) is endogenously misclassified. We show that misclassification in D hampers consistent estimation of the ATT because (1) it prevents us from distinguishing the truly treated from those misclassified as treated, and (2) differential misclassification in counterfactual trends may cause parallel trends to be violated with D even when they hold with the true but unobserved D*. We propose a correction for endogenous one-sided misclassification in the context of a parametric DID regression which allows for considerable heterogeneity in treatment effects, and establish its asymptotic properties in panel and repeated cross-section settings. Furthermore, we illustrate the method by using it to estimate the insurance impact of a large-scale in-kind food transfer program in India which is known to suffer from large targeting errors. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02412&r= 
By:  Giuseppe Cavaliere; Sílvia Gonçalves; Morten Ørregaard Nielsen 
Abstract:  We consider bootstrap inference for estimators which are (asymptotically) biased. We show that, even when the bias term cannot be consistently estimated, valid inference can be obtained by proper implementations of the bootstrap. Specifically, we show that the prepivoting approach of Beran (1987, 1988), originally proposed to deliver higher-order refinements, restores bootstrap validity by transforming the original bootstrap p-value into an asymptotically uniform random variable. We propose two different implementations of prepivoting (plug-in and double bootstrap), and provide general high-level conditions that imply validity of bootstrap inference. To illustrate the practical relevance and implementation of our results, we discuss five applications: (i) a simple location model for i.i.d. data, possibly with infinite variance; (ii) regression models with omitted controls; (iii) inference on a target parameter based on model averaging; (iv) ridge-type regularized estimators; and (v) dynamic panel data models. 
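To fix ideas on the double-bootstrap implementation of prepivoting, here is a minimal sketch for a simple location model like application (i): the first-level bootstrap p-value is transformed through its own bootstrap distribution. The t-type statistic, resampling scheme, and bootstrap sizes are illustrative choices, not the paper's:

```python
import numpy as np

def boot_pvalue(x, mu0, B=199, rng=None):
    """One-level bootstrap p-value for H0: E[x] = mu0, comparing the
    observed t-statistic with bootstrap t-statistics centered at the
    sample mean."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(x)
    t_obs = abs(np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1))
    xs = x[rng.integers(0, n, size=(B, n))]
    t_star = np.abs(np.sqrt(n) * (xs.mean(1) - x.mean()) / xs.std(1, ddof=1))
    return (1 + np.sum(t_star >= t_obs)) / (B + 1)

def prepivoted_pvalue(x, mu0, B=199, C=99, seed=0):
    """Double-bootstrap prepivoting: the first-level p-value is mapped
    through its estimated distribution, i.e. we report the fraction of
    inner (second-level) p-values at or below it."""
    rng = np.random.default_rng(seed)
    p_hat = boot_pvalue(x, mu0, B, rng)
    n = len(x)
    p_inner = np.empty(B)
    for b in range(B):
        xb = x[rng.integers(0, n, size=n)]
        # each bootstrap sample is tested against its own null, x.mean()
        p_inner[b] = boot_pvalue(xb, x.mean(), C, rng)
    return (1 + np.sum(p_inner <= p_hat)) / (B + 1)

rng = np.random.default_rng(1)
x = rng.standard_normal(60)                 # H0 true: mean is zero
p_true = prepivoted_pvalue(x, mu0=0.0)
p_false = prepivoted_pvalue(x, mu0=1.0)     # H0 badly violated
```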
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02028&r= 
By:  Karthik Rajkumar 
Abstract:  These notes show how to do inference on the Demographic Parity (DP) metric. Although the metric is a complex statistic involving min and max computations, we propose a smooth approximation of those functions and derive its asymptotic distribution. The approximations and their gradients converge to those of the true max and min functions wherever the latter exist. More importantly, where the true max and min functions are not differentiable, the approximations still are, providing valid asymptotic inference everywhere in the domain. We conclude with directions on how to compute confidence intervals for DP, how to test whether it is under 0.8 (the U.S. Equal Employment Opportunity Commission fairness threshold), and how to do inference in an A/B test. 
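A standard smooth approximation of max (and, by negation, min) is the log-sum-exp function, which is differentiable everywhere and converges to the true max as its temperature grows. The sketch below applies it to a hypothetical DP-style ratio of per-group selection rates; the construction is illustrative and not taken from the notes themselves:

```python
import numpy as np

def smooth_max(v, t=50.0):
    """Log-sum-exp approximation to max(v): m + log(sum exp(t*(v-m)))/t.
    Differentiable everywhere; tends to max(v) as t -> infinity."""
    v = np.asarray(v, dtype=float)
    m = v.max()                       # subtract max for numerical stability
    return m + np.log(np.exp(t * (v - m)).sum()) / t

def smooth_min(v, t=50.0):
    return -smooth_max(-np.asarray(v, dtype=float), t)

rates = np.array([0.42, 0.55, 0.61])  # hypothetical per-group selection rates
dp = smooth_min(rates) / smooth_max(rates)
```

The gradient of `smooth_max` is the softmax weight vector, which is what makes delta-method asymptotics available even where the true max is kinked.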
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2207.13797&r= 
By:  Frank Windmeijer 
Abstract:  This paper is concerned with findings related to the robust first-stage F-statistic in the Monte Carlo analysis of Andrews (2018), who found in a heteroskedastic grouped-data design that, even for very large values of the robust F-statistic, the standard 2SLS confidence intervals had large coverage distortions. This finding appears to discredit the robust F-statistic as a test for underidentification. However, it is shown here that large values of the robust F-statistic do imply that there is first-stage information, but this information may not be used well by the 2SLS estimator or the standard GMM estimator. An estimator that corrects for this is a robust GMM estimator, denoted GMMf, with the robust weight matrix based not on the structural residuals but on the first-stage residuals. For the grouped-data setting of Andrews (2018), this GMMf estimator weights the group-specific estimators according to the group-specific concentration parameters in the same way 2SLS does under homoskedasticity, which is shown formally using weak-instrument asymptotics. The GMMf estimator is much better behaved than the 2SLS estimator in the Andrews (2018) design, behaving well in terms of relative bias and Wald-test size distortion at more standard values of the robust F-statistic. We show that the same patterns can occur in a dynamic panel data model when the error variance is heteroskedastic over time. We further derive the conditions under which the Stock and Yogo (2005) weak-instrument critical values apply to the robust F-statistic in relation to the behaviour of the GMMf estimator. 
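As a reference point, a minimal numpy sketch of a heteroskedasticity-robust (HC0-form) first-stage F-statistic; the simulated design, variable names, and the assumption of demeaned data with no included exogenous controls are illustrative, not the paper's setup:

```python
import numpy as np

def robust_first_stage_F(x, Z):
    """Robust first-stage F for the regression x = Z @ pi + v:
    F = pi' V^{-1} pi / k, with V the HC0 sandwich variance of pi-hat.
    Assumes x and Z are demeaned and Z contains only the instruments."""
    n, k = Z.shape
    pi = np.linalg.solve(Z.T @ Z, Z.T @ x)
    v = x - Z @ pi
    bread = np.linalg.inv(Z.T @ Z)
    meat = (Z * v[:, None] ** 2).T @ Z      # sum_i z_i z_i' v_i^2
    V = bread @ meat @ bread
    return float(pi @ np.linalg.solve(V, pi)) / k

rng = np.random.default_rng(0)
n = 2000
Z = rng.standard_normal((n, 2))
# strong first stage with heteroskedastic errors
x = Z @ np.array([1.0, 0.5]) + rng.standard_normal(n) * (1 + 0.5 * np.abs(Z[:, 0]))
F_strong = robust_first_stage_F(x, Z)
F_weak = robust_first_stage_F(rng.standard_normal(n), Z)   # no first stage
```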
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.01967&r= 
By:  Mugnier, Martin (CREST, ENSAE, Institut Polytechnique de Paris); Wang, Ao (University of Warwick, CAGE Research Centre) 
Abstract:  We study a nonlinear two-way fixed effects panel model that allows for unobserved individual heterogeneity in slopes (interacting with covariates) and a flexibly specified, unknown link function. The former is particularly relevant when the researcher is interested in the distributional causal effects of covariates, while the latter mitigates potential misspecification errors from imposing a known link function. We show that the fixed effects parameters and the (nonparametrically specified) link function can be identified when both the individual and time dimensions are large. We propose a novel iterative Gauss-Seidel estimation procedure that overcomes the practical challenge of dimensionality in the number of fixed effects when the dataset is large. We revisit two empirical studies in trade (Helpman et al., 2008) and innovation (Aghion et al., 2013), and find non-negligible unobserved dispersion in trade elasticity (across countries) and in the effect of institutional ownership on innovation (across firms). These exercises highlight the usefulness of our method in capturing flexible (and unobserved) heterogeneity in the causal relationship of interest, which may have important implications for subsequent policy analysis. 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:wrk:warwec:1422&r= 
By:  Matteo Barigozzi; Giuseppe Cavaliere; Graziano Moramarco 
Abstract:  We propose a factor network autoregressive (FNAR) model for time series with complex network structures. The coefficients of the model reflect many different types of connections between economic agents ("multilayer network"), which are summarized into a smaller number of network matrices ("network factors") through a novel tensor-based principal component approach. We provide consistency results for the estimation of the factors and the coefficients of the FNAR. Our approach combines two different dimension-reduction techniques and can be applied to ultra-high-dimensional datasets. In an empirical application, we use the FNAR to investigate the cross-country interdependence of GDP growth rates based on a variety of international trade and financial linkages. The model provides a rich characterization of macroeconomic network effects and exhibits good forecast performance compared to popular dimension-reduction methods. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02925&r= 
By:  Paolo Brunori; Pedro Salas-Rojo; Paolo Verme 
Abstract:  The measurement of income inequality is affected by missing observations, especially when they are concentrated in the tails of an income distribution. This paper conducts an experiment to test how the different correction methods proposed in the statistical, econometric and machine learning literature address measurement biases of inequality due to item nonresponse. We take a baseline survey and artificially corrupt the data using several alternative nonlinear functions that simulate patterns of income nonresponse, and show how biased inequality statistics can be when item nonresponse is ignored. The comparative assessment of correction methods indicates that most methods partially correct for missing data biases. Sample reweighting based on probabilities of nonresponse produces inequality estimates quite close to the true values under most simulated missing-data patterns. Matching and Pareto corrections can also be effective for selected missing-data patterns. Other methods, such as single and multiple imputation and machine learning methods, are less effective. A final discussion provides some elements that help explain these findings. 
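To illustrate the reweighting idea that performs well in this comparison, the sketch below simulates income-dependent nonresponse and compares a naive Gini coefficient with one using inverse response-probability weights. The data-generating process and the assumption of known response probabilities are illustrative, not the paper's experiment:

```python
import numpy as np

def gini(y, w=None):
    """Weighted Gini coefficient from the weighted Lorenz curve."""
    y = np.asarray(y, float)
    w = np.ones_like(y) if w is None else np.asarray(w, float)
    order = np.argsort(y)
    y, w = y[order], w[order]
    cyw = np.cumsum(y * w)
    lorenz_pairs = cyw + np.concatenate(([0.0], cyw[:-1]))
    return 1 - np.sum(lorenz_pairs * w) / (w.sum() * cyw[-1])

rng = np.random.default_rng(0)
income = rng.lognormal(mean=10, sigma=0.8, size=50_000)
# richer households respond less often (a simulated nonresponse pattern)
p_resp = 1 / (1 + np.exp(1.5 * (np.log(income) - 10)))
resp = rng.random(income.size) < p_resp

g_true = gini(income)
g_naive = gini(income[resp])                    # item nonresponse ignored
g_rw = gini(income[resp], w=1 / p_resp[resp])   # inverse-probability weights
```

Ignoring nonresponse thins the upper tail and understates inequality; reweighting by the inverse response probability restores it.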
Keywords:  Income Inequality; Item nonresponse; Income Distributions; Inequality Predictions; Imputations. 
JEL:  D31 D63 E64 O15 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:frz:wpaper:wp2022_19.rdf&r= 
By:  Moler-Zapata, S.; Grieve, R.; Basu, A.; O'Neill, S. 
Abstract:  Local instrumental variable (LIV) approaches use continuous/multivalued instrumental variables (IVs) to generate consistent estimates of average treatment effects (ATEs) and conditional average treatment effects (CATEs). However, there is little evidence on how LIV approaches perform with different sample sizes or according to the strength of the IV (as measured by the first-stage F-statistic). We examined the performance of an LIV approach and a two-stage least squares (2SLS) approach in settings with different sample sizes and IV strengths, and considered the implications for practice. Our simulation study considered three sample sizes (n = 5,000, 10,000, 50,000) and six levels of IV strength (F-statistic = 10, 25, 50, 100, 500, 1,000) under four 'heterogeneity' scenarios: effect homogeneity, overt heterogeneity (over measured covariates), essential heterogeneity (over unmeasured covariates), and overt and essential heterogeneity combined. Compared to 2SLS, the LIV approach provided estimates of the ATE and CATE with lower bias and RMSE, irrespective of sample size or IV strength. With smaller sample sizes, both approaches required stronger IVs to ensure low (less than 5%) levels of bias. In the presence of overt and/or essential heterogeneity, the LIV approach reported estimates with low bias even when the sample size was smaller (n = 5,000), provided that the instrument was moderately strong (F-statistic greater than 50 for the ATE estimand). We considered both methods in evaluating emergency surgery across three different acute conditions, with IVs of differing strengths (F-statistic ranging from 100 to 9,000) and sample sizes (100,000 to 300,000). We found that 2SLS did not detect significant differences in effectiveness across subgroups, even with subgroup-by-treatment interactions included in the model. The LIV approach found substantive differences in the effectiveness of emergency surgery across subgroups; for each of the three acute conditions, frail patients had worse outcomes following emergency surgery. These findings indicate that when a continuous IV of moderate strength is available, LIV approaches are better suited than 2SLS to estimating policy-relevant treatment effect parameters. 
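For reference, the 2SLS benchmark used in this comparison can be sketched in a few lines; the simulated design below (single endogenous regressor, single instrument) is illustrative and unrelated to the emergency-surgery application:

```python
import numpy as np

def tsls(y, X, Z):
    """Two-stage least squares: regress each column of X on the
    instruments Z, then regress y on the first-stage fitted values.
    Equivalent to beta = (X' P_Z X)^{-1} X' P_Z y."""
    Xh = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # first-stage fitted values
    return np.linalg.solve(Xh.T @ X, Xh.T @ y)

rng = np.random.default_rng(0)
n = 5000
z = rng.standard_normal(n)                  # instrument
u = rng.standard_normal(n)                  # unobserved confounder
x = z + u + rng.standard_normal(n)          # endogenous treatment
y = 0.5 * x + u + rng.standard_normal(n)    # true effect 0.5

b_ols = (x @ y) / (x @ x)                   # biased upward by u
b_2sls = tsls(y, x[:, None], z[:, None])[0]
```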
Keywords:  instrumental variables; instrument strength; tendency to operate; emergency surgery; 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:yor:hectdg:22/18&r= 
By:  Dillon Bowen 
Abstract:  Decision-making often involves ranking and selection. For example, to assemble a team of political forecasters, we might begin by narrowing our choice set to the candidates we are confident rank among the top 10% in forecasting ability. Unfortunately, we do not know each candidate's true ability but observe a noisy estimate of it. This paper develops new Bayesian algorithms to rank and select candidates based on noisy estimates. Using simulations based on empirical data, we show that our algorithms often outperform frequentist ranking and selection algorithms. Our Bayesian ranking algorithms yield shorter rank confidence intervals while maintaining approximately correct coverage. Our Bayesian selection algorithms select more candidates while maintaining correct error rates. We apply our ranking and selection procedures to field experiments, economic mobility, forecasting, and similar problems. Finally, we implement our ranking and selection techniques in a user-friendly Python package documented at https://dsbowenconditionalinference.readthedocs.io/en/latest/. 
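A minimal sketch of the Bayesian shrink-then-rank idea under a normal-normal model with a known prior; the prior, standard errors, and the 95% selection cutoff are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 200                        # candidates
theta = rng.normal(0, 1, K)    # true abilities (hypothetical)
s = np.full(K, 0.8)            # known standard errors of the estimates
y = theta + rng.normal(0, s)   # observed noisy estimates

# normal-normal posterior with prior N(mu, tau2), here assumed known
mu, tau2 = 0.0, 1.0
shrink = tau2 / (tau2 + s**2)
post_mean = mu + shrink * (y - mu)
post_sd = np.sqrt(tau2 * s**2 / (tau2 + s**2))

# posterior probability that each candidate ranks in the top 10%
draws = rng.normal(post_mean, post_sd, size=(4000, K))
ranks = (-draws).argsort(1).argsort(1)     # 0 = best, within each draw
p_top = (ranks < K // 10).mean(0)
selected = np.where(p_top > 0.95)[0]       # confident top-10% set
```

Shrinking the noisy estimates toward the prior mean before ranking is what protects the selection from being dominated by lucky draws.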
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02038&r= 
By:  Bates, Stephen (UC Berkeley); Candes, Emmanuel (Stanford U); Lei, Lihua (Stanford U); Romano, Yaniv (Israel Institute of Technology); Sesia, Matteo (University of Southern California) 
Abstract:  This paper studies the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense. We then introduce a new method to compute p-values that are both valid conditionally on the training data and independent of each other for different test points; this paves the way to stronger type-I error guarantees. Our results depart from classical conformal inference as we leverage concentration inequalities rather than combinatorial arguments to establish our finite-sample guarantees. Furthermore, our techniques also yield a uniform confidence bound for the false positive rate of any outlier detection algorithm, as a function of the threshold applied to its raw statistics. Finally, the relevance of our results is demonstrated by numerical experiments on real and simulated data. 
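The marginal conformal p-values discussed here have a simple closed form: rank each test point's nonconformity score against a held-out calibration set. Below is a minimal sketch combining them with Benjamini-Hochberg FDR control on simulated outliers; the score distributions are illustrative:

```python
import numpy as np

def conformal_pvalues(cal_scores, test_scores):
    """Marginal conformal p-values for outlier detection, where a larger
    nonconformity score means more outlying:
    p_j = (1 + #{calibration scores >= test score}) / (n_cal + 1)."""
    cal = np.asarray(cal_scores)
    n = len(cal)
    return np.array([(1 + np.sum(cal >= t)) / (n + 1) for t in test_scores])

def benjamini_hochberg(p, alpha=0.1):
    """Boolean mask of BH step-up rejections at FDR level alpha."""
    p = np.asarray(p)
    m = len(p)
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    reject = np.zeros(m, bool)
    reject[order[:k]] = True
    return reject

rng = np.random.default_rng(0)
cal = np.abs(rng.standard_normal(1000))                  # calibration scores
test = np.concatenate([np.abs(rng.standard_normal(50)),  # 50 inliers
                       np.abs(rng.normal(4, 1, 10))])    # 10 outliers
p = conformal_pvalues(cal, test)
rej = benjamini_hochberg(p, alpha=0.1)
```

As the abstract notes, these p-values share the same calibration set and are therefore mutually dependent, which is exactly why their compatibility with BH needs proof.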
Date:  2022–05 
URL:  http://d.repec.org/n?u=RePEc:ecl:stabus:4027&r= 
By:  Lorenzo Mercuri; Andrea Perchiazzo; Edit Rroji 
Abstract:  In this paper we introduce a new model, the CARMA(p,q)-Hawkes process, motivated by the fact that the Hawkes model with an exponential kernel implies a strictly decreasing autocorrelation function, while empirical evidence rejects this monotonicity assumption. The proposed model is a Hawkes process whose intensity follows a Continuous-Time Autoregressive Moving Average (CARMA) process and is thereby able to reproduce more realistic dependence structures. We also study the conditions for stationarity and positivity of the intensity and the strong mixing property of the increments. Furthermore, we compute the likelihood, present a simulation method, and discuss an estimation method based on the autocorrelation function. A simulation and estimation exercise highlights the main features of the CARMA(p,q)-Hawkes model. 
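For context, the exponential-kernel Hawkes baseline that the CARMA(p,q)-Hawkes generalizes can be sketched as follows; parameter values are illustrative, and stationarity requires the branching ratio alpha/beta < 1:

```python
import numpy as np

def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.2):
    """Exponential-kernel Hawkes intensity:
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i))."""
    events = np.asarray(events)
    past = events[events < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

def simulate_hawkes(T, mu=0.5, alpha=0.8, beta=1.2, seed=0):
    """Ogata thinning: propose points at a dominating rate and accept
    each with probability lambda(t) / lambda_bar. Adding alpha to the
    current intensity gives a valid upper bound between events."""
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while t < T:
        lam_bar = hawkes_intensity(t, events, mu, alpha, beta) + alpha
        t += rng.exponential(1 / lam_bar)
        if t < T and rng.random() < hawkes_intensity(t, events, mu, alpha, beta) / lam_bar:
            events.append(t)
    return np.array(events)

ev = simulate_hawkes(200.0)    # stationary mean rate: mu / (1 - alpha/beta)
```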
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02659&r= 
By:  Yuta Shimodaira; Kohei Shiozawa; Keigo Inukai 
Abstract:  The convex time budget (CTB) method is a widely used experimental method for eliciting an individual's time preference. Researchers adopting the CTB experiment usually assume a quasi-hyperbolic discounting utility as the behavioural model and estimate the parameters of the utility function. However, few studies using the CTB method have examined parameter recovery. We conduct simulations and find that the estimation error of the present-bias parameter is so large that its effect is difficult to detect. The large error stems from an improper combination of the experimental method and the utility model, and it cannot be remedied after data collection. This paper highlights the importance of running parameter recovery simulations to audit estimation errors at the experimental design stage. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:dpr:wpaper:1185&r= 
By:  Gentry, Matthew; Komarova, Tatiana; Schiraldi, Pasquale 
Abstract:  Motivated by the prevalence of simultaneous bidding across a wide range of auction markets, we develop and estimate a model of strategic interaction in simultaneous first-price auctions when objects are heterogeneous and bidders have non-additive preferences over combinations. We establish nonparametric identification of primitives in this model under standard exclusion restrictions, providing a basis for both estimation and testing of preferences over combinations. We then apply our model to data on Michigan Department of Transportation (MDOT) highway procurement auctions, quantifying the magnitude of cost synergies and evaluating the performance of the simultaneous first-price mechanism in the MDOT marketplace. 
Keywords:  auctions; complementarities; identification; ES/N000056/1 
JEL:  D44 
Date:  2022–07–04 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:115627&r= 
By:  Gael M. Martin; David T. Frazier; Christian P. Robert 
Abstract:  This paper takes the reader on a journey through the history of Bayesian computation, from the 18th century to the present day. Beginning with the one-dimensional integral first confronted by Bayes in 1763, we highlight the key contributions of Laplace, Metropolis (and, importantly, his coauthors!), Hammersley and Handscomb, and Hastings, all of which set the foundations for the computational revolution of the late 20th century, led primarily by Markov chain Monte Carlo (MCMC) algorithms. A very short outline of 21st-century computational methods, including pseudo-marginal MCMC, Hamiltonian Monte Carlo, sequential Monte Carlo, and the various `approximate' methods, completes the paper. 
Keywords:  History of Bayesian computation, Laplace approximation, Metropolis-Hastings algorithm, importance sampling, Markov chain Monte Carlo, pseudo-marginal methods, Hamiltonian Monte Carlo, sequential Monte Carlo, approximate Bayesian methods 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:msh:ebswps:202214&r= 
By:  Carlos Montes-Galdón (European Central Bank); Eva Ortega (Banco de España) 
Abstract:  This paper proposes a vector autoregressive model with structural shocks (SVAR) that are identified using sign restrictions and whose distribution is subject to time-varying skewness. It also presents an efficient Bayesian algorithm to estimate the model. The model allows for the joint tracking of asymmetric risks to the macroeconomic variables included in the SVAR, and provides a narrative about the structural reasons for the changes in those risks over time. Using euro area data, our estimation suggests that there has been significant variation in the skewness of demand, supply and monetary policy shocks between 1999 and 2019. This variation lies behind a significant proportion of the joint dynamics of real GDP growth and inflation in the euro area over this period, and also generates important asymmetric tail risks in these macroeconomic variables. Finally, compared to the literature on growth- and inflation-at-risk, we find that financial stress indicators do not suffice to explain all the macroeconomic tail risks. 
Keywords:  Bayesian SVAR, skewness, growth-at-risk, inflation-at-risk 
JEL:  C11 C32 C51 E31 E32 
Date:  2022–03 
URL:  http://d.repec.org/n?u=RePEc:bde:wpaper:2208&r= 
By:  Yanqiu Ruan; Xiaobo Li; Karthyek Murthy; Karthik Natarajan 
Abstract:  Given data on choices made by consumers across different assortments, a key challenge is to develop parsimonious models that describe and predict consumer choice behavior. One such choice model is the marginal distribution model, which requires only the specification of the marginal distributions of the random utilities of the alternatives to explain choice data. In this paper, we develop an exact characterisation of the set of choice probabilities that are representable by the marginal distribution model consistently across any collection of assortments. Allowing for the possibility that alternatives are grouped based on the marginal distribution of their utilities, we show that (a) verifying consistency of choice probability data with this model is possible in polynomial time, and (b) finding the closest fit reduces to solving a mixed-integer convex program. Our results show that the marginal distribution model provides much better representational power than the multinomial logit model and much better computational performance than the random utility model. 
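As a point of comparison, the multinomial logit benchmark mentioned here imposes the IIA property (the odds between two alternatives do not depend on the assortment offered), which richer models such as the marginal distribution model relax. A minimal sketch with hypothetical mean utilities:

```python
import numpy as np

def mnl_probs(utilities, assortment):
    """Multinomial logit choice probabilities over an offered assortment:
    P(i | S) = exp(u_i) / sum_{j in S} exp(u_j)."""
    u = np.asarray(utilities, float)
    S = np.asarray(assortment)
    w = np.exp(u[S] - u[S].max())       # max-subtracted for stability
    return w / w.sum()

u = np.array([1.0, 0.2, -0.5, 0.7])    # hypothetical mean utilities
p_full = mnl_probs(u, [0, 1, 2, 3])
p_sub = mnl_probs(u, [0, 3])
# IIA: the odds of item 0 vs item 3 are identical in both assortments
```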
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.06115&r= 