
NEP: New Economics Papers on Econometrics
By:  Simar, Léopold (Université catholique de Louvain, LIDAM/ISBA, Belgium); Zelenyuk, Valentin; Zhao, Shirong 
Abstract:  We develop new central limit theorems (CLTs) for aggregate directional distance functions (DDFs), which embed the CLTs for aggregate efficiency and for simple mean DDFs as special cases. Moreover, we develop new CLTs for aggregate DDFs in the presence of weak disposability of undesirable outputs. Our Monte Carlo simulations confirm the good performance of statistical inference based on the new CLTs we have derived and illustrate how wrong inference based on the standard CLTs can be. To our knowledge, this is the first study that provides both the asymptotic theory and simulation evidence for nonparametric frontier approaches when some outputs are undesirable. Finally, we provide an empirical illustration using a data set of large US banks and supply the computational code for alternative applications.
Keywords:  Inference ; Data Envelopment Analysis ; Nonparametric Efficiency Estimators ; Undesirable Outputs ; Weak Disposability 
JEL:  C12 C13 C14 
Date:  2024–03–04 
URL:  http://d.repec.org/n?u=RePEc:aiz:louvad:2024010&r=ecm 
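The aggregate statistics in this paper are built from individual DEA estimates of the directional distance function. As a minimal, hypothetical sketch (constant returns to scale, strong disposability, a user-chosen direction g = (gx, gy) — not the weak-disposability technology the paper treats), each DDF value can be computed by a small linear program:

```python
import numpy as np
from scipy.optimize import linprog

def ddf(x0, y0, X, Y, gx, gy):
    """DEA estimate of the directional distance function for unit (x0, y0):
    max beta s.t. sum_j lam_j x_j <= x0 - beta*gx,
                  sum_j lam_j y_j >= y0 + beta*gy, lam >= 0 (CRS)."""
    n = X.shape[0]
    # decision vector: [beta, lam_1, ..., lam_n]; linprog minimizes, so use -beta
    c = np.r_[-1.0, np.zeros(n)]
    # input constraints: sum_j lam_j x_j + beta*gx <= x0
    A_in = np.c_[gx.reshape(-1, 1), X.T]
    # output constraints: -sum_j lam_j y_j + beta*gy <= -y0
    A_out = np.c_[gy.reshape(-1, 1), -Y.T]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]), b_ub=np.r_[x0, -y0],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.x[0]
```

Averaging such values over units gives the simple mean DDF whose inference the new CLTs cover; the paper's weak-disposability variant would change the constraint set.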
By:  Zihao Li; Hui Lan; Vasilis Syrgkanis; Mengdi Wang; Masatoshi Uehara 
Abstract:  In this paper, we study nonparametric estimation of instrumental variable (IV) regressions. While recent advances in machine learning have introduced flexible methods for IV estimation, they often encounter one or more of the following limitations: (1) restricting the IV regression to be uniquely identified; (2) requiring a minimax computation oracle, which is highly unstable in practice; (3) lacking a model selection procedure. We present the first method and analysis that avoids all three limitations while still enabling general function approximation. Specifically, we propose a minimax-oracle-free method called Regularized DeepIV (RDIV) regression that converges to the least-norm IV solution. Our method consists of two stages: first, we learn the conditional distribution of covariates, and then, using the learned distribution, we learn the estimator by minimizing a Tikhonov-regularized loss function. We further show that our method allows model selection procedures that achieve the oracle rates in the misspecified regime. When extended to an iterative estimator, our method matches the current state-of-the-art convergence rate. Our method is a Tikhonov-regularized variant of the popular DeepIV method with a nonparametric MLE first-stage estimator, and our results provide the first rigorous guarantees for this empirically used method, showcasing the importance of regularization, which was absent from the original work.
Date:  2024–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2403.04236&r=ecm 
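To fix ideas, here is a drastically simplified linear analogue of the Tikhonov-regularization idea — project the endogenous regressor on the instrument, then ridge-regularize the second stage. This is only a sketch of the regularized two-stage structure, not the authors' RDIV with learned conditional distributions and deep function classes:

```python
import numpy as np

def ridge_2sls(Y, X, Z, alpha=1e-3):
    """Linear sketch of Tikhonov-regularised IV: project the endogenous
    regressors on the instruments (first stage), then ridge-regress the
    outcome on the projections (regularised second stage)."""
    Zb = np.c_[np.ones(len(Z)), Z]
    X_hat = Zb @ np.linalg.lstsq(Zb, X, rcond=None)[0]   # first-stage fit
    Xb = np.c_[np.ones(len(X_hat)), X_hat]
    # Tikhonov (ridge) second stage: (Xb'Xb + alpha*I)^{-1} Xb'Y
    return np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ Y)
```

The ridge term plays the same stabilizing role here that the Tikhonov penalty plays in the nonparametric setting, where the inverse problem is ill-posed.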
By:  Chad Hazlett; Tanvi Shinkre 
Abstract:  Researchers in many fields endeavor to estimate treatment effects by regressing outcome data (Y) on a treatment (D) and observed confounders (X). Even absent unobserved confounding, the regression coefficient on the treatment reports a weighted average of strata-specific treatment effects (Angrist, 1998). Where heterogeneous treatment effects cannot be ruled out, the resulting coefficient is thus not generally equal to the average treatment effect (ATE) and is unlikely to be the quantity of direct scientific or policy interest. The difference between the coefficient and the ATE has led researchers to propose various interpretational, bounding, and diagnostic aids (Humphreys, 2009; Aronow and Samii, 2016; Sloczynski, 2022; Chattopadhyay and Zubizarreta, 2023). We note that the linear regression of Y on D and X can be misspecified when the treatment effect is heterogeneous in X. The "weights of regression", for which we provide a new (more general) expression, simply characterize how the OLS coefficient will depart from the ATE under the misspecification resulting from unmodeled treatment effect heterogeneity. Consequently, a natural alternative to suffering these weights is to address the misspecification that gives rise to them. For investigators committed to linear approaches, we propose relying on the slightly weaker assumption that the potential outcomes are linear in X. Numerous well-known estimators are unbiased for the ATE under this assumption, namely regression imputation (g-computation, the T-learner), regression with an interaction of the treatment and covariates (Lin, 2013), and balancing weights. Any of these approaches avoids the apparent weighting problem of the misspecified linear regression, at an efficiency cost that will be small when there are few covariates relative to sample size. We demonstrate these lessons using simulations in observational and experimental settings.
Date:  2024–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2403.03299&r=ecm 
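The core contrast can be seen in a few lines of simulation (a hypothetical toy setup, not the authors' designs): with one binary confounder and a heterogeneous effect, plain OLS of Y on D and X weights strata by p(x)(1-p(x)) and misses the ATE, while the Lin (2013) regression interacting D with the centered covariate recovers it:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.binomial(1, 0.5, size=n)        # one binary confounder
p = np.where(x == 1, 0.9, 0.5)          # propensity varies with x
d = rng.binomial(1, p)
tau = 1.0 + x                           # effect: 1 if x=0, 2 if x=1; ATE = 1.5
y = x + tau * d + rng.normal(size=n)

# OLS of y on (1, d, x): the d coefficient weights strata by p(x)(1-p(x)),
# here (0.25, 0.09), giving roughly 1.26 rather than the ATE of 1.5
b_ols = np.linalg.lstsq(np.c_[np.ones(n), d, x], y, rcond=None)[0][1]

# Lin (2013): interact d with the *centered* covariate; the d coefficient
# now targets the ATE under linearity of the potential outcomes in x
xc = x - x.mean()
b_lin = np.linalg.lstsq(np.c_[np.ones(n), d, xc, d * xc], y, rcond=None)[0][1]
```

With two strata the interacted regression is saturated, so the comparison isolates exactly the weighting distortion the abstract describes.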
By:  Jacquemain, Alexandre (Université catholique de Louvain, LIDAM/ISBA, Belgium); Heuchenne, Cédric (Université de Liège); Pircalabelu, Eugen (Université catholique de Louvain, LIDAM/ISBA, Belgium) 
Abstract:  The Lorenz regression estimates the explained Gini coefficient, a quantity with a natural application in the measurement of inequality of opportunity. Assuming a single-index model, it corresponds to the Gini coefficient of the conditional expectation of a response given some covariates, and it can be estimated without having to estimate the link function. However, it is prone to overestimation when many covariates are included. In this paper, we propose a penalised bootstrap procedure which selects the relevant covariates and produces valid inference for the explained Gini coefficient. The obtained estimator achieves the oracle property. Numerically, it is computed by the SCAD-FABS algorithm, an adaptation of the FABS algorithm to the SCAD penalty. The performance of the procedure is ensured by theoretical guarantees and assessed via Monte Carlo simulations. Finally, a real data example is presented.
Keywords:  FABS algorithm ; Gini coefficient ; Lorenz regression ; SCAD penalty ; single-index models
Date:  2024–02–13 
URL:  http://d.repec.org/n?u=RePEc:aiz:louvad:2024005&r=ecm 
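As background, the Gini coefficient of a fitted index is easy to compute; the sketch below uses an unpenalised linear fit as a crude stand-in for the explained Gini of the single-index model (the paper's SCAD-FABS estimator and penalised bootstrap are not reproduced here, and the rank formula assumes non-negative values):

```python
import numpy as np

def gini(y):
    """Gini coefficient of a non-negative variable (ascending-rank formula)."""
    y = np.sort(np.asarray(y, dtype=float))
    n = len(y)
    ranks = np.arange(1, n + 1)
    return 2 * np.sum(ranks * y) / (n * y.sum()) - (n + 1) / n

def explained_gini(y, X):
    """Gini of fitted values from a linear regression of y on X (with
    intercept) -- a crude stand-in for the explained Gini that the Lorenz
    regression targets without estimating the link function."""
    Xb = np.c_[np.ones(len(y)), X]
    fitted = Xb @ np.linalg.lstsq(Xb, y, rcond=None)[0]
    return gini(fitted)
```

The overestimation problem the abstract mentions arises because adding irrelevant covariates can only increase the Gini of the fitted index in-sample, which is what the SCAD penalty counteracts.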
By:  Romain Aumond (CREST, ENSAE and Institut Polytechnique de Paris); Julien Royer (CREST and Institut Polytechnique de Paris) 
Abstract:  Tracking macroeconomic data at a high frequency is difficult, as most time series are only available at a low frequency. Recently, the development of macroeconomic nowcasters to infer the current position of the economic cycle has attracted the attention of both academics and practitioners, with most central banks having developed statistical tools to track their economic situation. The specifications usually rely on a Markov-switching dynamic factor model with mixed-frequency data whose states allow for the identification of recession and expansion periods. However, such models are notoriously not robust to the occurrence of extreme shocks such as Covid-19. In this paper, we show how the addition of time-varying volatilities in the dynamics of the model alleviates the effect of extreme observations and renders the dating of recessions more robust. Both stochastic and conditional volatility models are considered, and we adapt recent Bayesian estimation techniques to infer the competing models' parameters. We illustrate the good behavior of our estimation procedure, as well as the robustness of our proposed model to various misspecifications, through simulations. Additionally, in a real data exercise, we show how, both in-sample and out-of-sample, the inclusion of a dynamic volatility component is beneficial for the identification of phases of the US economy.
Keywords:  Nowcasting; Bayesian Inference; Dynamic Factor Models; Markov Switching
Date:  2024–03–08 
URL:  http://d.repec.org/n?u=RePEc:crs:wpaper:202404&r=ecm 
By:  Markus Bibinger; Nikolaus Hautsch; Alexander Ristig 
Abstract:  We propose methods to infer jumps of a semimartingale, which describes long-term price dynamics, based on discrete, noisy, high-frequency observations. In contrast to the classical model of additive, centered market microstructure noise, we consider one-sided microstructure noise for order prices in a limit order book. We develop methods to estimate, locate, and test for jumps using local order statistics. We provide a local test and show that we can consistently estimate price jumps. The main contribution is a global test for jumps. We establish the asymptotic properties and optimality of this test. We derive the asymptotic distribution of a maximum statistic under the null hypothesis of no jumps based on extreme value theory and prove consistency under the alternative hypothesis. The rate of convergence for local alternatives is determined and shown to be much faster than the optimal rates for the standard market microstructure noise model, which allows the identification of smaller jumps. In the process, we establish uniform consistency for spot volatility estimation under one-sided microstructure noise. A simulation study sheds light on the finite-sample implementation and properties of our new statistics and draws a comparison to a popular method for market microstructure noise. We showcase how our new approach helps to improve jump detection in an empirical analysis of intraday limit order book data.
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2403.00819&r=ecm 
By:  Thilo Reinschlüssel; Martin C. Arnold
Abstract:  We propose a novel approach to elicit the weight of a potentially non-stationary regressor in the consistent and oracle-efficient estimation of autoregressive models using the adaptive Lasso. The enhanced weight builds on a statistic that exploits the distinct orders in probability of the OLS estimator in time series regressions when the degree of integration differs. We provide theoretical results on the benefit of our approach for detecting stationarity when a tuning criterion selects the $\ell_1$ penalty parameter. Monte Carlo evidence shows that our proposal is superior to using OLS-based weights, as suggested by Kock [Econom. Theory, 32, 2016, 243–259]. We apply the modified estimator to model selection for German inflation rates after the introduction of the Euro. The results indicate that energy commodity price inflation and headline inflation are best described by stationary autoregressions.
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.16580&r=ecm 
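The baseline the paper improves on — adaptive Lasso with OLS-based weights — can be sketched with the standard trick of folding the weights into the design: rescale each column by 1/w_j, run a plain Lasso, and scale back. This is a generic illustration (with a tiny coordinate-descent Lasso to stay self-contained), not the authors' enhanced weight statistic:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=500):
    """Plain Lasso by cyclic coordinate descent for the objective
    0.5/n * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    r = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                      # remove j-th contribution
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            r -= X[:, j] * b[j]                      # add it back
    return b

def adaptive_lasso(X, y, lam, gamma=1.0):
    """Adaptive Lasso with OLS-based weights w_j = |b_ols_j|^{-gamma},
    implemented by rescaling columns, running plain Lasso, scaling back."""
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    scale = np.abs(b_ols) ** gamma                   # column scale = 1 / w_j
    return lasso_cd(X * scale, y, lam) * scale
```

The paper's contribution replaces `np.abs(b_ols) ** gamma` with a weight built from a statistic that separates stationary and integrated regressors; the rescaling mechanics stay the same.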
By:  Masahiro Kato; Akihiro Oga; Wataru Komatsubara; Ryo Inokuchi 
Abstract:  This study designs an adaptive experiment for efficiently estimating average treatment effect (ATEs). We consider an adaptive experiment where an experimenter sequentially samples an experimental unit from a covariate density decided by the experimenter and assigns a treatment. After assigning a treatment, the experimenter observes the corresponding outcome immediately. At the end of the experiment, the experimenter estimates an ATE using gathered samples. The objective of the experimenter is to estimate the ATE with a smaller asymptotic variance. Existing studies have designed experiments that adaptively optimize the propensity score (treatmentassignment probability). As a generalization of such an approach, we propose a framework under which an experimenter optimizes the covariate density, as well as the propensity score, and find that optimizing both covariate density and propensity score reduces the asymptotic variance more than optimizing only the propensity score. Based on this idea, in each round of our experiment, the experimenter optimizes the covariate density and propensity score based on past observations. To design an adaptive experiment, we first derive the efficient covariate density and propensity score that minimizes the semiparametric efficiency bound, a lower bound for the asymptotic variance given a fixed covariate density and a fixed propensity score. Next, we design an adaptive experiment using the efficient covariate density and propensity score sequentially estimated during the experiment. Lastly, we propose an ATE estimator whose asymptotic variance aligns with the minimized semiparametric efficiency bound. 
Date:  2024–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2403.03589&r=ecm 
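The propensity-only part of this program is the classical Neyman allocation: for a fixed covariate density, the variance term of the efficiency bound, E[s1(x)^2/e(x) + s0(x)^2/(1-e(x))], is minimized pointwise by e*(x) = s1(x)/(s1(x)+s0(x)). A hypothetical numerical check (made-up conditional standard deviations, effect-heterogeneity term omitted; the paper's joint optimization over the covariate density is not shown):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, size=100_000)
s1 = 1.0 + 2.0 * x            # assumed conditional std of Y(1)
s0 = np.full_like(x, 0.5)     # assumed conditional std of Y(0)

def variance_bound(e):
    """Variance term of the efficiency bound for a fixed covariate density:
    E[s1(x)^2 / e(x) + s0(x)^2 / (1 - e(x))]."""
    return np.mean(s1 ** 2 / e + s0 ** 2 / (1 - e))

e_balanced = np.full_like(x, 0.5)
e_neyman = s1 / (s1 + s0)     # Neyman allocation: variance-minimizing propensity
```

Setting the derivative of s1^2/e + s0^2/(1-e) to zero gives (1-e)/e = s0/s1, i.e. the Neyman rule; the paper's extra gain comes from also reshaping the density of x toward regions where outcomes are noisy.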
By:  Laura Battaglia; Timothy Christensen; Stephen Hansen; Szymon Sacher 
Abstract:  The leading strategy for analyzing unstructured data uses two steps. First, latent variables of economic interest are estimated with an upstream information retrieval model. Second, the estimates are treated as "data" in a downstream econometric model. We establish theoretical arguments for why this two-step strategy leads to biased inference in empirically plausible settings. More constructively, we propose a one-step strategy for valid inference that uses the upstream and downstream models jointly. The one-step strategy (i) substantially reduces bias in simulations; (ii) has quantitatively important effects in a leading application using CEO time-use data; and (iii) can be readily adapted by applied researchers.
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.15585&r=ecm 
By:  Luis Antonio Fantozzi Alvarez; Rodrigo Toneto 
Abstract:  This note introduces a novel weighted local average treatment effect representation for the two-stage least-squares (2SLS) estimand in the case of a continuous instrument with a binary treatment. Under standard conditions, we obtain weights that are nonnegative, integrate to unity, and assign larger values to instrument support points that deviate from their average. Our representation does not require instruments to be discretized, nor does it rely on limiting arguments, such as those used in the definition of the marginal treatment effect (MTE). The pattern of the weights also has a clear interpretation. We believe these features of the representation to be useful for applied researchers when communicating their results. As a direct byproduct of our approach, we also obtain a representation of the 2SLS estimand as a weighted average of treatment effects among "marginal compliance" groups, without having to resort to the threshold-crossing representation underlying the MTE construction. As an application, we consider the interpretation of "event-study 2SLS" specifications with continuous instruments.
Keywords:  Instrumental variables; Local average treatment effects; Event study
JEL:  C21 C23 C26 
Date:  2024–03–12 
URL:  http://d.repec.org/n?u=RePEc:spa:wpaper:2024wpecon11&r=ecm 
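The object being decomposed is the just-identified 2SLS estimand with a single instrument, which reduces to a covariance ratio. A minimal sketch of that estimand (the note's new weight expression itself is not reproduced here; the simulated design is hypothetical, with a homogeneous effect so the estimand is known):

```python
import numpy as np

def tsls(y, d, z):
    """2SLS estimand with a single (possibly continuous) instrument and no
    covariates: cov(z, y) / cov(z, d). This is the quantity the note
    re-expresses as a nonnegative weighted average of local effects."""
    zc = z - z.mean()
    return (zc @ y) / (zc @ d)
```

With heterogeneous effects, this same ratio becomes the weighted average the note characterizes, with more weight on instrument values far from their mean.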
By:  Benjamin Wee 
Abstract:  Simulation Based Calibration (SBC) is applied to analyse two commonly used, competing Markov chain Monte Carlo algorithms for estimating the posterior distribution of a stochastic volatility model. In particular, the bespoke 'offset mixture approximation' algorithm proposed by Kim, Shephard, and Chib (1998) is explored together with a Hamiltonian Monte Carlo algorithm implemented through Stan. The SBC analysis involves a simulation study to assess whether each sampling algorithm has the capacity to produce valid inference for the correctly specified model, while also characterising statistical efficiency through the effective sample size. Results show that Stan's No-U-Turn sampler, an implementation of Hamiltonian Monte Carlo, produces a well-calibrated posterior estimate, while the celebrated offset mixture approach is less efficient and poorly calibrated, though model parameterisation also plays a role. Limitations and restrictions of generality are discussed.
Date:  2024–01 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.12384&r=ecm 
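The SBC logic itself is simple and worth seeing in miniature. For a toy conjugate model (normal mean, known variance — not the stochastic volatility model of the paper) the posterior can be sampled exactly, so the rank statistic must be uniform; a miscalibrated sampler would show up as a non-uniform rank histogram:

```python
import numpy as np

def sbc_ranks(n_reps=2000, m=10, L=20, seed=0):
    """Simulation Based Calibration for the conjugate model
    theta ~ N(0, 1), y_i | theta ~ N(theta, 1).
    Each replication draws theta from the prior, simulates data, draws L
    exact posterior samples, and records the rank of theta among them.
    A calibrated sampler yields ranks uniform on {0, ..., L}."""
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_reps, dtype=int)
    for r in range(n_reps):
        theta = rng.normal()                          # draw from the prior
        y = rng.normal(theta, 1.0, size=m)            # simulate data
        post_mean = y.sum() / (m + 1)                 # conjugate posterior
        post_sd = np.sqrt(1.0 / (m + 1))
        draws = rng.normal(post_mean, post_sd, size=L)
        ranks[r] = np.sum(draws < theta)              # rank in {0, ..., L}
    return ranks
```

In the paper, `draws` would instead come from the offset-mixture sampler or from Stan's NUTS, and departures from uniformity diagnose the miscalibration reported for the former.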
By:  Dalderop, J.; Linton, O. B. 
Abstract:  Option-implied risk-neutral densities are widely used for constructing forward-looking risk measures. Meanwhile, investor risk aversion introduces a multiplicative pricing kernel between the risk-neutral and true conditional densities of the underlying asset's return. This paper proposes a simple local estimator of the pricing kernel based on inverse density weighting and characterizes its asymptotic bias and variance. The estimator can be used to correct biased density forecasts and performs well in a simulation study. A local exponential linear variant of the estimator is proposed to include conditioning variables. In an application, we estimate a demand-based model for S&P 500 index options using net positions data and attribute the U-shaped pricing kernel to heterogeneous beliefs about conditional volatility.
Keywords:  Density Forecasting, Nonparametric Estimation, Option Pricing, Trade Data 
JEL:  C14 G13 
Date:  2024–03–05 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:2411&r=ecm 
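The inverse-density-weighting idea can be sketched directly: given a risk-neutral density q and a sample of returns from the physical measure, estimate the pricing kernel as the ratio of q to a kernel estimate of the physical density. A toy version (hand-rolled Gaussian KDE, fixed bandwidth; the paper's local estimator and its bias/variance theory are not reproduced):

```python
import numpy as np

def kde(x, sample, h):
    """Gaussian kernel density estimate of the sample, evaluated at points x."""
    u = (x[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def pricing_kernel(x, q_density, returns_sample, h=0.05):
    """Ratio-type estimate M(x) = q(x) / p_hat(x): the risk-neutral density
    over a kernel estimate of the physical density (inverse density weighting)."""
    return q_density(x) / kde(x, returns_sample, h)
```

Under risk aversion the physical density sits to the right of the risk-neutral one, so the estimated kernel declines in the return, which is the monotone benchmark against which the U-shape in the application stands out.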
By:  Robin M. Cross; Steven T. Buccola 
Abstract:  This paper considers the problem of interpreting orthogonalization model coefficients. We derive a causal economic interpretation of the Gram-Schmidt orthogonalization process and provide the conditions for its equivalence to total effects from a recursive Directed Acyclic Graph. We extend the Gram-Schmidt process to groups of simultaneous regressors common in economic data sets and derive its finite sample properties, finding its coefficients to be unbiased, stable, and more efficient than those from Ordinary Least Squares. Finally, we apply the estimator to childhood reading comprehension scores, controlling for such highly collinear characteristics as race, education, and income. The model expands Bohren et al.'s decomposition of systemic discrimination into channel-specific effects and improves its coefficient significance levels.
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.17103&r=ecm 
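The equivalence with total effects is easy to verify numerically in a toy recursive chain x1 -> x2 -> y (a hypothetical example, not the paper's grouped extension): orthogonalizing the regressors left-to-right and regressing y on each orthogonalized column gives the simple-regression (total) effect for x1 and the direct effect for x2:

```python
import numpy as np

def gram_schmidt_coefs(X, y):
    """Orthogonalise the columns of X left-to-right (Gram-Schmidt), then
    regress y on each orthogonalised column separately. Under a recursive
    causal ordering of the columns, coefficient j is x_j's total effect."""
    Q = X.astype(float).copy()
    p = X.shape[1]
    for j in range(p):
        for k in range(j):
            Q[:, j] -= (Q[:, k] @ Q[:, j]) / (Q[:, k] @ Q[:, k]) * Q[:, k]
    return np.array([(Q[:, j] @ y) / (Q[:, j] @ Q[:, j]) for j in range(p)])
```

With x2 = 0.5*x1 + noise and y = x1 + 2*x2 + noise, x1's total effect is 1 + 2*0.5 = 2 (direct plus mediated), and that is what the first orthogonalized coefficient recovers.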
By:  Yuichi Kitamura; Louise Laage 
Abstract:  In the standard stochastic block model for networks, the probability of a connection between two nodes, often referred to as the edge probability, depends on the unobserved communities each of these nodes belongs to. We consider a flexible framework in which each edge probability, together with the probability of community assignment, is also impacted by observed covariates. We propose a computationally tractable two-step procedure to estimate the conditional edge probabilities as well as the community assignment probabilities. The first step relies on a spectral clustering algorithm applied to a localized adjacency matrix of the network. In the second step, k-nearest neighbor regression estimates are computed on the extracted communities. We study the statistical properties of these estimators by providing non-asymptotic bounds.
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.16322&r=ecm 
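A bare-bones version of the first step — spectral clustering of a plain two-community SBM from the adjacency spectrum — looks as follows (no covariates, no localization, and no second-step k-NN regression; a planted-partition toy, not the paper's procedure):

```python
import numpy as np

def spectral_two_communities(A):
    """Assign each node to one of two communities using the sign pattern of
    the second-largest-magnitude eigenvector of the adjacency matrix."""
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(-np.abs(vals))     # sort eigenpairs by |eigenvalue|
    v2 = vecs[:, order[1]]
    return (v2 > 0).astype(int)

# planted two-block network: within-block edge prob 0.8, between-block 0.05
rng = np.random.default_rng(4)
n = 60
z = np.repeat([0, 1], n // 2)             # true community labels
P = np.where(z[:, None] == z[None, :], 0.8, 0.05)
A = (rng.uniform(size=(n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                               # symmetric adjacency, no self-loops
labels = spectral_two_communities(A)
```

In the paper, this step runs on a localized adjacency matrix (restricted to nodes with similar covariates), so the recovered communities can then feed covariate-conditional edge-probability estimates.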
By:  Thanasis Stengos (Department of Economics and Finance, University of Guelph, Guelph ON Canada); Stelios Arvanitis (Athens University); Mehmet Pinar (Universidad de Sevilla); Nikolas Topaloglou (Athens University) 
Abstract:  In the frequentist model averaging framework and within a linear model background, we consider averaging methodologies that extend the analysis of both the generalized Jackknife Model Averaging (JMA) and the Mallows Model Averaging (MMA) criteria in a multi-objective setting. We consider an estimator arising from a stochastic dominance perspective. We also consider averaging estimators that emerge from the minimization of several scalarizations of the vector criterion consisting of both the MMA and the JMA criteria, as well as an estimator that can be represented as a Nash bargaining solution between the competing scalar criteria. We derive the limit theory of the estimators under both a correct specification and a global misspecification framework. Characterizations of the averaging estimators introduced in the context of conservative optimization are also provided. Monte Carlo experiments suggest that the averaging estimators proposed here occasionally provide bias and/or MSE/MAE reductions. An empirical application using data from growth theory suggests that our model averaging methods assign relatively higher weights to the traditional Solow-type growth variables, yet they do not seem to exclude regressors that underpin the importance of factors like geography or institutions.
Keywords:  frequentist model averaging, Jackknife MA, Mallows MA, multi-objective optimization, stochastic dominance, approximate bound, ℓp-scalarization, Nash bargaining solution, growth regressions, core regressors, auxiliary regressors.
JEL:  C51 C52 
Date:  2024 
URL:  http://d.repec.org/n?u=RePEc:gue:guelph:202401&r=ecm 
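One ingredient of the vector criterion, the Mallows criterion of Hansen-type MMA, can be sketched for two candidate fits: pick the weight minimizing the penalized averaged residual sum of squares. This toy grid search (two models, known error variance) only illustrates the scalar MMA building block, not the multi-objective scalarizations or the Nash bargaining solution:

```python
import numpy as np

def mallows_weight(y, fits, ks, sigma2, grid=201):
    """Mallows model averaging over two candidate fits: choose w in [0, 1]
    minimising ||y - w*f1 - (1-w)*f2||^2 + 2*sigma2*(w*k1 + (1-w)*k2),
    where k1, k2 are the models' parameter counts."""
    ws = np.linspace(0.0, 1.0, grid)
    f1, f2 = fits
    k1, k2 = ks
    crit = [np.sum((y - w * f1 - (1 - w) * f2) ** 2)
            + 2.0 * sigma2 * (w * k1 + (1 - w) * k2) for w in ws]
    return ws[int(np.argmin(crit))]
```

The JMA criterion replaces the in-sample residuals with leave-one-out residuals; the paper's estimators then trade these two scalar criteria off against each other.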
By:  Andrew Ellis; Ran Spiegler 
Abstract:  A representative researcher pursuing a question has repeated opportunities for empirical research. To process findings, she must impose an identifying assumption, which ensures that repeated observation would provide a definitive answer to her question. Research designs vary in quality and are implemented only when the assumption is plausible enough according to a KL-divergence-based criterion; beliefs are then Bayes-updated as if the assumption were perfectly valid. We study the dynamics of this learning process and its induced long-run beliefs. The rate of research cannot uniformly accelerate over time. We characterize environments in which it is stationary. Long-run beliefs can exhibit history-dependence. We apply the model to stylized examples of empirical methodologies: experiments, causal-inference techniques, and (in an extension) "structural" identification methods such as "calibration" and "Heckman selection."
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.18713&r=ecm 
By:  Sukjin Han; Hiroaki Kaido; Lorenzo Magnolfi 
Abstract:  A key primitive of a strategic environment is the information available to players. Specifying a priori an information structure is often difficult for empirical researchers. We develop a test of information ordering that allows researchers to examine if the true information structure is at least as informative as a proposed baseline. We construct a computationally tractable test statistic by utilizing the notion of Bayes Correlated Equilibrium (BCE) to translate the ordering of information structures into an ordering of functions. We apply our test to examine whether hubs provide informational advantages to certain airlines in addition to market power. 
Date:  2024–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2402.19425&r=ecm 
By:  Hafner, Christian (Université catholique de Louvain, LIDAM/ISBA, Belgium); Linton, Oliver (obl20@cam.ac.uk); Wang, Linqi 
Abstract:  We develop a dynamic framework to detect the occurrence of permanent and transitory breaks in the illiquidity process. We propose various tests that can be applied separately to individual events and can be aggregated across different events over time for a given firm or across different firms. In an empirical study, we use this methodology to study the impact of stock splits on the illiquidity dynamics of the Dow Jones index constituents and the effects of reverse splits using stocks from the S&P 500, S&P 400 and S&P 600 indices. Our empirical results show that stock splits have a positive and significant effect on the permanent component of the illiquidity process while a majority of the stocks engaging in reverse splits experience an improvement in liquidity conditions. 
Keywords:  Amihud illiquidity ; Difference in Difference ; Event Study ; Nonparametric Estimation ; Reverse Split ; Structural Change 
JEL:  C12 C14 G14 G32 
Date:  2024–03–01 
URL:  http://d.repec.org/n?u=RePEc:aiz:louvad:2024007&r=ecm 
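As context for the keywords, the illiquidity series in question is the Amihud measure, |return| / dollar volume, and the simplest (non-dynamic) check for a permanent shift compares its mean across pre- and post-event windows. This crude two-sample sketch is only a stand-in for the paper's tests on the permanent and transitory components:

```python
import numpy as np

def amihud(returns, dollar_volume):
    """Daily Amihud illiquidity: |return| / dollar volume."""
    return np.abs(returns) / dollar_volume

def mean_shift_t(illiq, event, w=60):
    """Crude permanent-change check: t-statistic for a shift in the mean of
    the illiquidity series between a pre-event and a post-event window."""
    pre = illiq[event - w:event]
    post = illiq[event:event + w]
    se = np.sqrt(pre.var(ddof=1) / w + post.var(ddof=1) / w)
    return (post.mean() - pre.mean()) / se
```

The paper's framework goes further by modelling the illiquidity dynamics explicitly, separating permanent from transitory breaks, and aggregating the resulting event-level tests across events and firms.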