nep-ecm 2024-04-08 papers

on Econometrics

Issue of 2024‒04‒08
eighteen papers chosen by
Sune Karlsson, Örebro universitet

Central Limit Theorems for Directional Distance Functions with and without Undesirable Outputs By Simar, Léopold; Zelenyuk, Valentin; Zhao, Shirong
Regularized DeepIV with Model Selection By Zihao Li; Hui Lan; Vasilis Syrgkanis; Mengdi Wang; Masatoshi Uehara
Understanding and avoiding the "weights of regression": Heterogeneous effects, misspecification, and longstanding solutions By Chad Hazlett; Tanvi Shinkre
A penalised bootstrap estimation procedure for the explained Gini coefficient By Jacquemain, Alexandre; Heuchenne, Cédric; Pircalabelu, Eugen
Improving the robustness of Markov-switching dynamic factor models with time-varying volatility By Romain Aumond; Julien Royer
Jump detection in high-frequency order prices By Markus Bibinger; Nikolaus Hautsch; Alexander Ristig
Information-Enriched Selection of Stationary and Non-Stationary Autoregressions using the Adaptive Lasso By Thilo Reinschlüssel; Martin C. Arnold
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choices By Masahiro Kato; Akihiro Oga; Wataru Komatsubara; Ryo Inokuchi
Inference for Regression with Variables Generated from Unstructured Data By Laura Battaglia; Timothy Christensen; Stephen Hansen; Szymon Sacher
The interpretation of 2SLS with a continuous instrument: a weighted LATE representation By Luis Antonio Fantozzi Alvarez; Rodrigo Toneto
Comparing MCMC algorithms in Stochastic Volatility Models using Simulation Based Calibration By Benjamin Wee
Estimating a Density Ratio Model for Stock Market Risk and Option Demand By Dalderop, J.; Linton, O. B.
Causal Orthogonalization: Multicollinearity, Economic Interpretability, and the Gram-Schmidt Process By Robin M. Cross; Steven T. Buccola
Estimating Stochastic Block Models in the Presence of Covariates By Yuichi Kitamura; Louise Laage
Multi-Objective Frequentistic Model Averaging with an Application to Economic Growth By Thanasis Stengos; Stelios Arvanitis; Mehmet Pinar; Nikolas Topaloglou
Identifying Assumptions and Research Dynamics By Andrew Ellis; Ran Spiegler
Testing Information Ordering for Strategic Agents By Sukjin Han; Hiroaki Kaido; Lorenzo Magnolfi
The effect of stock splits on liquidity in a dynamic model By Hafner, Christian; Linton, Oliver; Wang, Linqi

Central Limit Theorems for Directional Distance Functions with and without Undesirable Outputs

By:	Simar, Léopold (Université catholique de Louvain, LIDAM/ISBA, Belgium); Zelenyuk, Valentin; Zhao, Shirong
Abstract:	We develop new central limit theorems (CLTs) for the aggregate directional distance functions (DDFs), which embed the CLTs for the aggregate efficiency and simple mean DDFs as special cases. Moreover, we develop new CLTs for the aggregate DDFs in the presence of the weak disposability of undesirable outputs. Our Monte-Carlo simulations confirm the good performance of statistical inference based on the new CLTs we have derived and illustrate how wrong the inference based on the standard CLTs can be. To our knowledge, this is the first study that provides both the asymptotic theory and the simulation evidence for the non-parametric frontier approaches when some outputs are undesirable. Finally, we provide an empirical illustration using a data set from large US banks as well as supply the computational code for alternative applications.
Keywords:	Inference ; Data Envelopment Analysis ; Non-parametric Efficiency Estimators ; Undesirable Outputs ; Weak Disposability
JEL:	C12 C13 C14
Date:	2024–03–04
URL:	http://d.repec.org/n?u=RePEc:aiz:louvad:2024010&r=ecm

Regularized DeepIV with Model Selection

By:	Zihao Li; Hui Lan; Vasilis Syrgkanis; Mengdi Wang; Masatoshi Uehara
Abstract:	In this paper, we study nonparametric estimation of instrumental variable (IV) regressions. While recent advancements in machine learning have introduced flexible methods for IV estimation, they often encounter one or more of the following limitations: (1) restricting the IV regression to be uniquely identified; (2) requiring a minimax computation oracle, which is highly unstable in practice; (3) lacking a model selection procedure. We present the first method and analysis that can avoid all three limitations, while still enabling general function approximation. Specifically, we propose a minimax-oracle-free method called Regularized DeepIV (RDIV) regression that can converge to the least-norm IV solution. Our method consists of two stages: first, we learn the conditional distribution of covariates, and by utilizing the learned distribution, we learn the estimator by minimizing a Tikhonov-regularized loss function. We further show that our method allows model selection procedures that can achieve the oracle rates in the misspecified regime. When extended to an iterative estimator, our method matches the current state-of-the-art convergence rate. Our method is a Tikhonov-regularized variant of the popular DeepIV method with a non-parametric MLE first-stage estimator, and our results provide the first rigorous guarantees for this empirically used method, showcasing the importance of regularization, which was absent from the original work.
Date:	2024–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2403.04236&r=ecm

Understanding and avoiding the "weights of regression": Heterogeneous effects, misspecification, and longstanding solutions

By:	Chad Hazlett; Tanvi Shinkre
Abstract:	Researchers in many fields endeavor to estimate treatment effects by regressing outcome data (Y) on a treatment (D) and observed confounders (X). Even absent unobserved confounding, the regression coefficient on the treatment reports a weighted average of strata-specific treatment effects (Angrist, 1998). Where heterogeneous treatment effects cannot be ruled out, the resulting coefficient is thus not generally equal to the average treatment effect (ATE), and is unlikely to be the quantity of direct scientific or policy interest. The difference between the coefficient and the ATE has led researchers to propose various interpretational, bounding, and diagnostic aids (Humphreys, 2009; Aronow and Samii, 2016; Sloczynski, 2022; Chattopadhyay and Zubizarreta, 2023). We note that the linear regression of Y on D and X can be misspecified when the treatment effect is heterogeneous in X. The "weights of regression", for which we provide a new (more general) expression, simply characterize how the OLS coefficient will depart from the ATE under the misspecification resulting from unmodeled treatment effect heterogeneity. Consequently, a natural alternative to suffering these weights is to address the misspecification that gives rise to them. For investigators committed to linear approaches, we propose relying on the slightly weaker assumption that the potential outcomes are linear in X. Numerous well-known estimators are unbiased for the ATE under this assumption, namely regression-imputation/g-computation/T-learner, regression with an interaction of the treatment and covariates (Lin, 2013), and balancing weights. Any of these approaches avoid the apparent weighting problem of the misspecified linear regression, at an efficiency cost that will be small when there are few covariates relative to sample size. We demonstrate these lessons using simulations in observational and experimental settings.
Date:	2024–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2403.03299&r=ecm
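
Illustrative sketch (not taken from the paper): the simulation below assumes a hypothetical design with one binary covariate, where both the treatment probability and the treatment effect differ across strata. It compares the coefficient from the additive regression of Y on D and X with the coefficient from the interacted regression with centered covariates (Lin, 2013). All variable names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical design: one binary covariate X; both the treatment
# probability and the treatment effect differ across its two strata.
x = rng.binomial(1, 0.5, size=n)
d = rng.binomial(1, np.where(x == 1, 0.5, 0.1))
tau = np.where(x == 1, 1.0, 3.0)            # stratum-specific effects
y = 0.5 * x + tau * d + rng.normal(size=n)
true_ate = 0.5 * 1.0 + 0.5 * 3.0            # equals 2.0 by construction


def ols(design, outcome):
    """Return least-squares coefficients for outcome ~ design."""
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


ones = np.ones(n)

# Additive regression of Y on D and X: the coefficient on D is a
# variance-weighted average of the stratum effects, not the ATE.
beta_additive = ols(np.column_stack([ones, d, x]), y)[1]

# Interacted regression with X centered at its sample mean (Lin, 2013):
# the coefficient on D targets the ATE when potential outcomes are linear in X.
x_centered = x - x.mean()
beta_interacted = ols(
    np.column_stack([ones, d, x_centered, d * x_centered]), y
)[1]

print(f"true ATE              : {true_ate:.3f}")
print(f"additive OLS on D     : {beta_additive:.3f}")
print(f"interacted OLS on D   : {beta_interacted:.3f}")
```

In this toy design the additive coefficient is pulled toward the stratum whose treatment assignment has the larger variance, while the interacted specification stays close to the ATE.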

A penalised bootstrap estimation procedure for the explained Gini coefficient

By:	Jacquemain, Alexandre (Université catholique de Louvain, LIDAM/ISBA, Belgium); Heuchenne, Cédric (Université de Liège); Pircalabelu, Eugen (Université catholique de Louvain, LIDAM/ISBA, Belgium)
Abstract:	The Lorenz regression estimates the explained Gini coefficient, a quantity with a natural application in the measurement of inequality of opportunity. Assuming a single-index model, it corresponds to the Gini coefficient of the conditional expectation of a response given some covariates and it can be estimated without having to estimate the link function. However, it is prone to overestimation when many covariates are included. In this paper, we propose a penalised bootstrap procedure which selects the relevant covariates and produces valid inference for the explained Gini coefficient. The obtained estimator achieves the Oracle property. Numerically, it is computed by the SCAD-FABS algorithm, an adaptation of the FABS algorithm to the SCAD penalty. The performance of the procedure is ensured by theoretical guarantees and assessed via Monte-Carlo simulations. Finally, a real data example is presented.
Keywords:	FABS algorithm ; Gini coefficient ; Lorenz regression ; SCAD penalty ; single-index models
Date:	2024–02–13
URL:	http://d.repec.org/n?u=RePEc:aiz:louvad:2024005&r=ecm
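
Illustrative sketch (not the authors' Lorenz regression or the SCAD-FABS algorithm): the snippet computes a plain sample Gini coefficient and, in a toy setting where the conditional expectation is known by construction, compares the Gini coefficient of the response with that of its conditional expectation. The data-generating process and all names are assumptions made for illustration.

```python
import numpy as np


def gini(values):
    """Sample Gini coefficient of a non-negative array (sorted-rank formula)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * v) / (n * np.sum(v)) - (n + 1.0) / n


rng = np.random.default_rng(1)
n = 50_000

# Hypothetical setup: E[Y | X] = exp(X), multiplied by positive noise with mean one.
x = rng.normal(scale=0.5, size=n)
cond_mean = np.exp(x)
noise = np.exp(0.5 * rng.normal(size=n) - 0.125)
y = cond_mean * noise

gini_total = gini(y)                 # inequality in the response itself
gini_explained = gini(cond_mean)     # inequality in the conditional expectation

print(f"Gini of Y        : {gini_total:.3f}")
print(f"Gini of E[Y | X] : {gini_explained:.3f}")
```

In practice the conditional expectation is unknown; the abstract's point is that the explained Gini coefficient can be estimated without estimating the link function, which this sketch sidesteps by construction.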

Improving the robustness of Markov-switching dynamic factor models with time-varying volatility

By:	Romain Aumond (CREST, ENSAE and Institut Polytechnique de Paris); Julien Royer (CREST and Institut Polytechnique de Paris)
Abstract:	Tracking macroeconomic data at a high frequency is difficult as most time series are only available at a low frequency. Recently, the development of macroeconomic nowcasters to infer the current position of the economic cycle has attracted the attention of both academics and practitioners, with most of the central banks having developed statistical tools to track their economic situation. The specifications usually rely on a Markov-switching dynamic factor model with mixed-frequency data whose states allow for the identification of recession and expansion periods. However, such models are notoriously not robust to the occurrence of extreme shocks such as Covid-19. In this paper, we show how the addition of time-varying volatilities in the dynamics of the model alleviates the effect of extreme observations and renders the dating of recessions more robust. Both stochastic and conditional volatility models are considered and we adapt recent Bayesian estimation techniques to infer the competing models' parameters. We illustrate the good behavior of our estimation procedure as well as the robustness of our proposed model to various misspecifications through simulations. Additionally, in a real data exercise, it is shown how, both in-sample and out-of-sample, the inclusion of a dynamic volatility component is beneficial for the identification of phases of the US economy.
Keywords:	Nowcasting; Bayesian Inference; Dynamic Factor Models; Markov Switching
Date:	2024–03–08
URL:	http://d.repec.org/n?u=RePEc:crs:wpaper:2024-04&r=ecm

Jump detection in high-frequency order prices

By:	Markus Bibinger; Nikolaus Hautsch; Alexander Ristig
Abstract:	We propose methods to infer jumps of a semi-martingale, which describes long-term price dynamics based on discrete, noisy, high-frequency observations. Different to the classical model of additive, centered market microstructure noise, we consider one-sided microstructure noise for order prices in a limit order book. We develop methods to estimate, locate and test for jumps using local order statistics. We provide a local test and show that we can consistently estimate price jumps. The main contribution is a global test for jumps. We establish the asymptotic properties and optimality of this test. We derive the asymptotic distribution of a maximum statistic under the null hypothesis of no jumps based on extreme value theory. We prove consistency under the alternative hypothesis. The rate of convergence for local alternatives is determined and shown to be much faster than optimal rates for the standard market microstructure noise model. This allows the identification of smaller jumps. In the process, we establish uniform consistency for spot volatility estimation under one-sided microstructure noise. A simulation study sheds light on the finite-sample implementation and properties of our new statistics and draws a comparison to a popular method for market microstructure noise. We showcase how our new approach helps to improve jump detection in an empirical analysis of intra-daily limit order book data.
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2403.00819&r=ecm

Information-Enriched Selection of Stationary and Non-Stationary Autoregressions using the Adaptive Lasso

By:	Thilo Reinschlüssel; Martin C. Arnold
Abstract:	We propose a novel approach to elicit the weight of a potentially non-stationary regressor in the consistent and oracle-efficient estimation of autoregressive models using the adaptive Lasso. The enhanced weight builds on a statistic that exploits distinct orders in probability of the OLS estimator in time series regressions when the degree of integration differs. We provide theoretical results on the benefit of our approach for detecting stationarity when a tuning criterion selects the $\ell_1$ penalty parameter. Monte Carlo evidence shows that our proposal is superior to using OLS-based weights, as suggested by Kock [Econom. Theory, 32, 2016, 243-259]. We apply the modified estimator to model selection for German inflation rates after the introduction of the Euro. The results indicate that energy commodity price inflation and headline inflation are best described by stationary autoregressions.
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.16580&r=ecm

Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choices

By:	Masahiro Kato; Akihiro Oga; Wataru Komatsubara; Ryo Inokuchi
Abstract:	This study designs an adaptive experiment for efficiently estimating average treatment effects (ATEs). We consider an adaptive experiment where an experimenter sequentially samples an experimental unit from a covariate density decided by the experimenter and assigns a treatment. After assigning a treatment, the experimenter observes the corresponding outcome immediately. At the end of the experiment, the experimenter estimates an ATE using gathered samples. The objective of the experimenter is to estimate the ATE with a smaller asymptotic variance. Existing studies have designed experiments that adaptively optimize the propensity score (treatment-assignment probability). As a generalization of such an approach, we propose a framework under which an experimenter optimizes the covariate density, as well as the propensity score, and find that optimizing both covariate density and propensity score reduces the asymptotic variance more than optimizing only the propensity score. Based on this idea, in each round of our experiment, the experimenter optimizes the covariate density and propensity score based on past observations. To design an adaptive experiment, we first derive the efficient covariate density and propensity score that minimize the semiparametric efficiency bound, a lower bound for the asymptotic variance given a fixed covariate density and a fixed propensity score. Next, we design an adaptive experiment using the efficient covariate density and propensity score sequentially estimated during the experiment. Lastly, we propose an ATE estimator whose asymptotic variance aligns with the minimized semiparametric efficiency bound.
Date:	2024–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2403.03589&r=ecm

Inference for Regression with Variables Generated from Unstructured Data

By:	Laura Battaglia; Timothy Christensen; Stephen Hansen; Szymon Sacher
Abstract:	The leading strategy for analyzing unstructured data uses two steps. First, latent variables of economic interest are estimated with an upstream information retrieval model. Second, the estimates are treated as "data" in a downstream econometric model. We establish theoretical arguments for why this two-step strategy leads to biased inference in empirically plausible settings. More constructively, we propose a one-step strategy for valid inference that uses the upstream and downstream models jointly. The one-step strategy (i) substantially reduces bias in simulations; (ii) has quantitatively important effects in a leading application using CEO time-use data; and (iii) can be readily adapted by applied researchers.
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.15585&r=ecm

The interpretation of 2SLS with a continuous instrument: a weighted LATE representation

By:	Luis Antonio Fantozzi Alvarez; Rodrigo Toneto
Abstract:	This note introduces a novel weighted local average treatment effect representation for the two-stage least-squares (2SLS) estimand in the continuous instrument with binary treatment case. Under standard conditions, we obtain weights that are nonnegative, integrate to unity, and assign larger values to instrument support points that deviate from their average. Our representation neither requires instruments to be discretized nor relies on limiting arguments, such as those used in the definition of the marginal treatment effect (MTE). The pattern of the weights also has a clear interpretation. We believe these features of the representation to be useful for applied researchers when communicating their results. As a direct byproduct of our approach, we also obtain a representation of the 2SLS estimand as a weighted average of treatment effects among "marginal compliance" groups, without having to resort to the threshold-crossing representation underlying the MTE construction. As an application, we consider the interpretation of "event-study 2SLS" specifications with continuous instruments.
Keywords:	Instrumental variables; Local average treatment effects; Event-study
JEL:	C21 C23 C26
Date:	2024–03–12
URL:	http://d.repec.org/n?u=RePEc:spa:wpaper:2024wpecon11&r=ecm
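
Illustrative sketch (it does not reproduce the paper's weights): with a single continuous instrument and a binary treatment, the 2SLS slope coincides with the ratio Cov(Y, Z) / Cov(D, Z). The simulation below assumes a hypothetical threshold-crossing treatment rule and heterogeneous effects purely to generate data; every name and parameter is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

z = rng.normal(size=n)                 # continuous instrument
v = rng.normal(size=n)                 # unobserved resistance to treatment
d = (z > v).astype(float)              # binary treatment, monotone in z
effect = 1.0 + 0.5 * v                 # heterogeneous treatment effect
y = effect * d + v + rng.normal(size=n)

# Ratio form of the 2SLS estimand with one instrument.
wald_ratio = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]

# The same number from two explicit least-squares stages.
ones = np.ones(n)
first_stage = np.linalg.lstsq(np.column_stack([ones, z]), d, rcond=None)[0]
d_hat = first_stage[0] + first_stage[1] * z
second_stage = np.linalg.lstsq(np.column_stack([ones, d_hat]), y, rcond=None)[0]
tsls_slope = second_stage[1]

print(f"Cov(Y,Z)/Cov(D,Z) : {wald_ratio:.4f}")
print(f"two-stage slope   : {tsls_slope:.4f}")
```

The single number produced here averages over units whose treatment status responds to the instrument; the note summarized above characterizes the weights in that average.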

Comparing MCMC algorithms in Stochastic Volatility Models using Simulation Based Calibration

By:	Benjamin Wee
Abstract:	Simulation Based Calibration (SBC) is applied to analyse two commonly used, competing Markov chain Monte Carlo algorithms for estimating the posterior distribution of a stochastic volatility model. In particular, the bespoke 'off-set mixture approximation' algorithm proposed by Kim, Shephard, and Chib (1998) is explored together with a Hamiltonian Monte Carlo algorithm implemented through Stan. The SBC analysis involves a simulation study to assess whether each sampling algorithm has the capacity to produce valid inference for the correctly specified model, while also characterising statistical efficiency through the effective sample size. Results show that Stan's No-U-Turn sampler, an implementation of Hamiltonian Monte Carlo, produces a well-calibrated posterior estimate while the celebrated off-set mixture approach is less efficient and poorly calibrated, though model parameterisation also plays a role. Limitations and restrictions of generality are discussed.
Date:	2024–01
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.12384&r=ecm
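
Illustrative sketch of the SBC idea (not the stochastic volatility model, and not the samplers compared in the paper): for a conjugate normal model the exact posterior is available, so exact posterior draws stand in for MCMC output. If the sampler is correct, the rank of the prior draw among the posterior draws is uniform. The model, sample sizes, and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n_sims, n_obs, n_draws = 2_000, 10, 99

ranks = np.empty(n_sims, dtype=int)
for s in range(n_sims):
    theta = rng.normal()                          # parameter drawn from the N(0, 1) prior
    y = rng.normal(loc=theta, size=n_obs)         # data simulated given theta
    # Exact conjugate posterior: N(sum(y) / (n + 1), 1 / (n + 1)).
    post_mean = y.sum() / (n_obs + 1)
    post_sd = np.sqrt(1.0 / (n_obs + 1))
    draws = rng.normal(loc=post_mean, scale=post_sd, size=n_draws)
    ranks[s] = np.sum(draws < theta)              # rank statistic in 0..n_draws

# Under a calibrated sampler the ranks are uniform on {0, ..., n_draws}.
counts = np.bincount(ranks, minlength=n_draws + 1)
print("mean rank:", ranks.mean(), "(uniform benchmark:", n_draws / 2, ")")
```

Replacing the exact posterior draws with the output of an approximate or misconfigured sampler would typically show up as a non-uniform histogram of `counts`.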

Estimating a Density Ratio Model for Stock Market Risk and Option Demand

By:	Dalderop, J.; Linton, O. B.
Abstract:	Option-implied risk-neutral densities are widely used for constructing forward-looking risk measures. Meanwhile, investor risk aversion introduces a multiplicative pricing kernel between the risk-neutral and true conditional densities of the underlying asset's return. This paper proposes a simple local estimator of the pricing kernel based on inverse density weighting, and characterizes its asymptotic bias and variance. The estimator can be used to correct biased density forecasts, and performs well in a simulation study. A local exponential linear variant of the estimator is proposed to include conditioning variables. In an application, we estimate a demand-based model for S&P 500 index options using net positions data, and attribute the U-shaped pricing kernel to heterogeneous beliefs about conditional volatility.
Keywords:	Density Forecasting, Nonparametric Estimation, Option Pricing, Trade Data
JEL:	C14 G13
Date:	2024–03–05
URL:	http://d.repec.org/n?u=RePEc:cam:camdae:2411&r=ecm

Causal Orthogonalization: Multicollinearity, Economic Interpretability, and the Gram-Schmidt Process

By:	Robin M. Cross; Steven T. Buccola
Abstract:	This paper considers the problem of interpreting orthogonalization model coefficients. We derive a causal economic interpretation of the Gram-Schmidt orthogonalization process and provide the conditions for its equivalence to total effects from a recursive Directed Acyclic Graph. We extend the Gram-Schmidt process to groups of simultaneous regressors common in economic data sets and derive its finite sample properties, finding its coefficients to be unbiased, stable, and more efficient than those from Ordinary Least Squares. Finally, we apply the estimator to childhood reading comprehension scores, controlling for such highly collinear characteristics as race, education, and income. The model expands Bohren et al.'s decomposition of systemic discrimination into channel-specific effects and improves its coefficient significance levels.
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.17103&r=ecm
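
Illustrative sketch of the classical two-regressor Gram-Schmidt step (not the authors' grouped estimator): the second regressor is replaced by its residual after projection on the first. Under an assumed recursive ordering x1 -> x2 -> y, the coefficient on x1 in the orthogonalized regression equals the simple-regression slope of y on x1. The data-generating process and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

# Hypothetical recursive structure with highly collinear regressors.
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)
y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)


def ols(design, outcome):
    """Return least-squares coefficients for outcome ~ design."""
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


ones = np.ones(n)

# Gram-Schmidt step: remove from x2 its projection on (1, x1).
g = ols(np.column_stack([ones, x1]), x2)
x2_orth = x2 - (g[0] + g[1] * x1)

b_orth = ols(np.column_stack([ones, x1, x2_orth]), y)    # orthogonalized model
b_full = ols(np.column_stack([ones, x1, x2]), y)         # ordinary OLS
b_simple = ols(np.column_stack([ones, x1]), y)           # y on x1 only

print(f"x1, orthogonalized model : {b_orth[1]:.3f}")
print(f"x1, simple regression    : {b_simple[1]:.3f}")
print(f"x2, orthogonalized model : {b_orth[2]:.3f}")
print(f"x2, full OLS             : {b_full[2]:.3f}")
```

The coefficient on the orthogonalized x2 matches the full-OLS coefficient on x2, while the coefficient on x1 absorbs the indirect path through x2 (here 1 + 2 × 0.9 = 2.8 in population).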

Estimating Stochastic Block Models in the Presence of Covariates

By:	Yuichi Kitamura; Louise Laage
Abstract:	In the standard stochastic block model for networks, the probability of a connection between two nodes, often referred to as the edge probability, depends on the unobserved communities each of these nodes belongs to. We consider a flexible framework in which each edge probability, together with the probability of community assignment, are also impacted by observed covariates. We propose a computationally tractable two-step procedure to estimate the conditional edge probabilities as well as the community assignment probabilities. The first step relies on a spectral clustering algorithm applied to a localized adjacency matrix of the network. In the second step, k-nearest neighbor regression estimates are computed on the extracted communities. We study the statistical properties of these estimators by providing non-asymptotic bounds.
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.16322&r=ecm
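
Illustrative sketch of the spectral step in its simplest form (no covariates, no localization, and no second-step k-nearest-neighbor regression, all of which the paper adds): for an assumed two-community block model, the sign pattern of the eigenvector belonging to the second-largest eigenvalue of the adjacency matrix recovers the communities. Sizes, probabilities, and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
labels = np.repeat([0, 1], n // 2)        # true (unobserved) communities

# Edge probabilities: higher within a community than across communities.
p_in, p_out = 0.5, 0.1
same = labels[:, None] == labels[None, :]
probs = np.where(same, p_in, p_out)

# Draw a symmetric adjacency matrix without self-loops.
upper = np.triu(rng.random((n, n)) < probs, k=1)
adjacency = (upper | upper.T).astype(float)

# eigh returns eigenvalues in ascending order, so column -2 belongs
# to the second-largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(adjacency)
estimated = (eigvecs[:, -2] > 0).astype(int)

# Community labels are identified only up to a swap.
agreement = np.mean(estimated == labels)
accuracy = max(agreement, 1.0 - agreement)
print(f"share of nodes correctly clustered: {accuracy:.3f}")
```

With more than two communities one would typically cluster the rows of several leading eigenvectors, for example with k-means, rather than read off a sign.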

Multi-Objective Frequentistic Model Averaging with an Application to Economic Growth

By:	Thanasis Stengos (Department of Economics and Finance, University of Guelph, Guelph ON Canada); Stelios Arvanitis (Athens University); Mehmet Pinar (Universidad de Sevilla); Nikolas Topaloglou (Athens University)
Abstract:	In the Frequentistic Model Averaging framework and within a linear model background, we consider averaging methodologies that extend the analysis of both the generalized Jackknife Model Averaging (JMA) and the Mallows Model Averaging (MMA) criteria in a multi-objective setting. We consider an estimator arising from a stochastic dominance perspective. We also consider averaging estimators that emerge from the minimization of several scalarizations of the vector criterion consisting of both the MMA and the JMA criteria as well as an estimator that can be represented as a Nash bargaining solution between the competing scalar criteria. We derive the limit theory of the estimators under both a correct specification and a global misspecification framework. Characterizations of the averaging estimators introduced in the context of conservative optimization are also provided. Monte Carlo experiments suggest that the averaging estimators proposed here occasionally provide bias and/or MSE/MAE reductions. An empirical application using data from growth theory suggests that our model averaging methods assign relatively higher weights towards the traditional Solow type growth variables, yet they do not seem to exclude regressors that underpin the importance of factors like geography or institutions.
Keywords:	frequentistic model averaging, Jackknife MA, Mallows MA, multi-objective optimization, stochastic dominance, approximate bound, £P-scalarization, Nash bargaining solution, growth regressions, core regressors, auxiliary regressors.
JEL:	C51 C52
Date:	2024
URL:	http://d.repec.org/n?u=RePEc:gue:guelph:2024-01&r=ecm

Identifying Assumptions and Research Dynamics

By:	Andrew Ellis; Ran Spiegler
Abstract:	A representative researcher pursuing a question has repeated opportunities for empirical research. To process findings, she must impose an identifying assumption, which ensures that repeated observation would provide a definitive answer to her question. Research designs vary in quality and are implemented only when the assumption is plausible enough according to a KL-divergence-based criterion, and then beliefs are Bayes-updated as if the assumption were perfectly valid. We study the dynamics of this learning process and its induced long-run beliefs. The rate of research cannot uniformly accelerate over time. We characterize environments in which it is stationary. Long-run beliefs can exhibit history-dependence. We apply the model to stylized examples of empirical methodologies: experiments, causal-inference techniques, and (in an extension) "structural" identification methods such as "calibration" and "Heckman selection."
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.18713&r=ecm

Testing Information Ordering for Strategic Agents

By:	Sukjin Han; Hiroaki Kaido; Lorenzo Magnolfi
Abstract:	A key primitive of a strategic environment is the information available to players. Specifying a priori an information structure is often difficult for empirical researchers. We develop a test of information ordering that allows researchers to examine if the true information structure is at least as informative as a proposed baseline. We construct a computationally tractable test statistic by utilizing the notion of Bayes Correlated Equilibrium (BCE) to translate the ordering of information structures into an ordering of functions. We apply our test to examine whether hubs provide informational advantages to certain airlines in addition to market power.
Date:	2024–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2402.19425&r=ecm

The effect of stock splits on liquidity in a dynamic model

By:	Hafner, Christian (Université catholique de Louvain, LIDAM/ISBA, Belgium); Linton, Oliver (obl20@cam.ac.uk); Wang, Linqi
Abstract:	We develop a dynamic framework to detect the occurrence of permanent and transitory breaks in the illiquidity process. We propose various tests that can be applied separately to individual events and can be aggregated across different events over time for a given firm or across different firms. In an empirical study, we use this methodology to study the impact of stock splits on the illiquidity dynamics of the Dow Jones index constituents and the effects of reverse splits using stocks from the S&P 500, S&P 400 and S&P 600 indices. Our empirical results show that stock splits have a positive and significant effect on the permanent component of the illiquidity process while a majority of the stocks engaging in reverse splits experience an improvement in liquidity conditions.
Keywords:	Amihud illiquidity ; Difference in Difference ; Event Study ; Nonparametric Estimation ; Reverse Split ; Structural Change
JEL:	C12 C14 G14 G32
Date:	2024–03–01
URL:	http://d.repec.org/n?u=RePEc:aiz:louvad:2024007&r=ecm

This nep-ecm issue is ©2024 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.