New Economics Papers on Econometrics
By: | Aristide Houndetoungan; Abdoul Haki Maoude |
Abstract: | We present a simulation-based approach to approximate the asymptotic variance and asymptotic distribution function of two-stage estimators. We focus on extremum estimators in the second stage and consider a large class of estimators in the first stage. This class includes extremum estimators, high-dimensional estimators, and other types of estimators (e.g., Bayesian estimators). We accommodate scenarios where the asymptotic distributions of both the first- and second-stage estimators are non-normal. We also allow for the second-stage estimator to exhibit a significant bias due to the first-stage sampling error. We introduce a debiased plug-in estimator and establish its limiting distribution. Our method is readily implementable with complex models. Unlike resampling methods, we eliminate the need for multiple computations of the plug-in estimator. Monte Carlo simulations confirm the effectiveness of our approach in finite samples. We present an empirical application on peer effects in adolescent fast-food consumption habits, where we employ the proposed method to address the issue of biased instrumental variable estimates resulting from the presence of many weak instruments. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.05030&r=ecm |
By: | Stauskas, Ovidijus; De Vos, Ignace |
Abstract: | The Common Correlated Effects (CCE) estimator is a popular method to estimate panel data regression models with interactive effects. Due to its simplicity in approximating the common factors with cross-section averages of the observables, it lends itself to a wide range of applications, including static and dynamic models, homogeneous or heterogeneous coefficients, and very general types of factor structures. Despite such flexibility, with very few exceptions, CCE properties are usually examined under the restrictive assumption that all the observed variables load on the same set of factors, which ensures joint identification of the factor space. In this paper, we explore an empirically relevant scenario in which the dependent and explanatory variables are driven by distinct but correlated factors. In doing this, we consider panel dimensions such that T/N is finite even in large samples, which is known to induce an asymptotic bias in the CCE setting. We subsequently develop a toolbox to perform asymptotically valid inference in homogeneous and heterogeneous panels. |
Keywords: | Panel data, bootstrap, interactive effects, CCE, factors, information criterion |
JEL: | C15 C33 C38 |
Date: | 2024–02–15 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:120194&r=ecm |
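A minimal sketch of the pooled CCE estimator described in the abstract above, assuming a balanced panel stored as NumPy arrays (Y of shape T x N, X of shape T x N x k); cross-section averages of the observables serve as factor proxies. All names are illustrative, and the paper's distinct-factor setting and bootstrap-based inference are not reproduced here.

```python
import numpy as np

def cce_pooled(Y, X):
    """Pooled CCE estimator for y_it = x_it' beta + interactive effects + error.
    Y: (T, N) array of outcomes; X: (T, N, k) array of regressors.
    Cross-section averages of (y, x) proxy the unobserved common factors."""
    T, N, k = X.shape
    H = np.column_stack([np.ones(T), Y.mean(axis=1), X.mean(axis=1)])  # factor proxies
    M = np.eye(T) - H @ np.linalg.pinv(H.T @ H) @ H.T                  # annihilator of the averages
    XtX, XtY = np.zeros((k, k)), np.zeros(k)
    for i in range(N):
        Xi, yi = X[:, i, :], Y[:, i]
        XtX += Xi.T @ M @ Xi
        XtY += Xi.T @ M @ yi
    return np.linalg.solve(XtX, XtY)
```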
By: | Masayuki Sawada; Takuya Ishihara; Daisuke Kurisu; Yasumasa Matsuda |
Abstract: | We introduce a multivariate local-linear estimator for multivariate regression discontinuity designs in which treatment is assigned by crossing a boundary in the space of running variables. The dominant approach uses the Euclidean distance from a boundary point as the scalar running variable; hence, multivariate designs are handled as univariate designs. However, the distance running variable is incompatible with the assumption required for asymptotic validity. We instead handle multivariate designs as genuinely multivariate. In this study, we develop novel asymptotic normality results for multivariate local-polynomial estimators. Our estimator is asymptotically valid and can capture heterogeneous treatment effects over the boundary. We demonstrate the effectiveness of our estimator through numerical simulations. Our empirical illustration of a Colombian scholarship study reveals a richer heterogeneity (including its absence) of the treatment effect that is hidden in the original estimates. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.08941&r=ecm |
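A minimal sketch of a bivariate local-linear boundary estimator in the spirit of the design above: rather than collapsing the two running variables into a scalar distance, both enter the local regression on each side of the boundary. The product Epanechnikov kernel, the bandwidth h, and all function and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def local_linear_boundary_rd(y, R, d, b, h):
    """Bivariate local-linear RD estimate at a boundary point b.
    y: (n,) outcomes; R: (n, 2) running variables; d: (n,) treatment indicator
    (1 if the boundary is crossed); b: (2,) evaluation point on the boundary;
    h: bandwidth. Product Epanechnikov kernel in both running variables."""
    U = (R - b) / h
    w = np.prod(np.where(np.abs(U) <= 1, 0.75 * (1 - U ** 2), 0.0), axis=1)
    fits = []
    for side in (1, 0):
        m = (d == side) & (w > 0)
        Z = np.column_stack([np.ones(m.sum()), R[m] - b])      # local-linear design
        W = np.diag(w[m])
        coef = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y[m])
        fits.append(coef[0])                                   # intercept = fit at b
    return fits[0] - fits[1]                                   # treated minus control at b
```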
By: | Chotipong Charoensom |
Abstract: | This paper proposes an approach to developing regime switching models in which the latent process determining the switching is endogenously controlled by the model shocks, with free functional forms. The linear endogeneity assumption of conventional endogenous regime switching models can therefore be relaxed. A recursive filter technique is applied to carry out maximum likelihood estimation of the model parameters. A nonlinear endogenous two-regime switching mean-volatility model is examined in numerical examples to investigate the model's performance. In the examples, the endogeneity in switching allows for heterogeneous effects of the shock signs (asymmetric endogeneity) and of the states prevailing before the switching determination (state-dependent endogeneity). Monte Carlo simulations show that a conventional switching model ignoring the nonlinear endogeneity yields biased volatility estimates; the estimates tend to lie above or below their true values depending on the characteristics of the endogeneity. In particular, the true model, which accounts for the nonlinear endogeneity, provides more precise estimates. The same model is also applied to real data on excess returns in the US stock market, and the estimation results informatively describe the effects influencing the regime shifts. |
Keywords: | Nonlinear endogeneity; Regime switching; Maximum likelihood estimation; Asymmetric endogeneity; State-dependent endogeneity |
JEL: | C13 C32 |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:pui:dpaper:217&r=ecm |
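For reference, a minimal sketch of the standard recursive (Hamilton) filter for a two-regime switching mean-volatility model with exogenous switching; the paper's contribution is to let the switching probabilities depend nonlinearly on the model shocks, which this baseline omits. The two-regime setup and all parameter names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def hamilton_filter_loglik(y, mu, sigma, P):
    """Log-likelihood of a two-regime switching mean/volatility model via the
    standard recursive (Hamilton) filter with exogenous switching.
    y: (T,) data; mu, sigma: length-2 arrays of regime means and volatilities;
    P: 2x2 transition matrix with P[i, j] = Pr(s_t = j | s_{t-1} = i)."""
    xi = np.array([0.5, 0.5])                      # filtered regime probabilities
    loglik = 0.0
    for yt in y:
        pred = P.T @ xi                            # one-step-ahead regime probabilities
        joint = pred * norm.pdf(yt, loc=mu, scale=sigma)
        loglik += np.log(joint.sum())
        xi = joint / joint.sum()                   # Bayes update of regime probabilities
    return loglik
```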
By: | Tobias Rüttenauer; Ozan Aksoy |
Abstract: | The conventional Two-Way Fixed-Effects (TWFE) estimator has been under strain lately. Recent literature has revealed potential shortcomings of TWFE when treatment effects are heterogeneous. Scholars have developed new advanced dynamic Difference-in-Differences (DiD) estimators to tackle these potential shortcomings. However, confusion remains in applied research as to when the conventional TWFE estimator is biased and which issues the novel estimators can and cannot address. In this study, we first provide an intuitive explanation of the problems of TWFE and elucidate the key features of the novel alternative DiD estimators. We then systematically demonstrate the conditions under which the conventional TWFE estimator is inconsistent. We employ Monte Carlo simulations to assess the performance of dynamic DiD estimators under violations of key assumptions, which are likely to occur in applied settings. While the new dynamic DiD estimators offer notable advantages in capturing heterogeneous treatment effects, we show that the conventional TWFE estimator generally performs well if the model specifies an event-time function. All estimators are equally sensitive to violations of the parallel trends assumption, anticipation effects, or violations of time-varying exogeneity. Despite their advantages, the new dynamic DiD estimators tackle a very specific problem and do not serve as a universal remedy for violations of the most critical assumptions. Finally, based on our simulations, we derive recommendations for how and when to use TWFE and the new DiD estimators in applied research. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.09928&r=ecm |
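A toy Monte Carlo sketch of the kind of exercise described above: a staggered-adoption panel with dynamic treatment effects, estimated by conventional TWFE with an event-time function (binned relative-time dummies, never-treated units at the reference level). The data-generating process, cohort dates, and binning are illustrative assumptions, not the paper's simulation design.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
N, T, NEVER = 200, 12, 10_000
df = pd.DataFrame([(i, t) for i in range(N) for t in range(T)], columns=["id", "t"])
df["g"] = df["id"].map({i: rng.choice([4, 7, NEVER]) for i in range(N)})   # adoption cohort
df["event"] = df["t"] - df["g"]
df["y"] = (0.01 * df["id"] + 0.2 * df["t"]                                 # unit and time effects
           + np.where(df["event"] >= 0, 0.5 * (df["event"] + 1), 0.0)      # dynamic treatment effect
           + rng.normal(size=len(df)))

# TWFE with an event-time function: binned relative-time dummies, never-treated at the
# reference level (-1), plus unit and period fixed effects.
df["e"] = df["event"].clip(-5, 5).where(df["g"] < NEVER, -1)
twfe = smf.ols("y ~ C(e, Treatment(reference=-1)) + C(id) + C(t)", data=df).fit()
print(twfe.params.filter(like="C(e,"))
```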
By: | Jooyoung Cha |
Abstract: | Impulse response analysis studies how the economy responds to shocks, such as changes in interest rates, and helps policymakers manage these effects. While Vector Autoregression models (VARs) with structural assumptions have traditionally dominated the estimation of impulse responses, local projections (the projection of future responses on the current shock) have recently gained attention for their robustness and interpretability. Including many lags as controls has been proposed as a means of achieving robustness, and including a richer set of controls helps in interpreting the estimand as a causal parameter. In both cases, the large number of controls leads to the consideration of high-dimensional techniques. Methods such as the LASSO exist, but they mostly rely on sparsity assumptions, under which most of the parameters are exactly zero; this is limiting in dense data-generating processes. This paper proposes a novel approach that incorporates high-dimensional covariates in local projections without relying on sparsity constraints. Adopting the Orthogonal Greedy Algorithm with a high-dimensional AIC (OGA+HDAIC) model selection method, this approach offers several advantages: robustness in both sparse and dense scenarios, improved interpretability by prioritizing cross-sectional explanatory power, and more reliable causal inference in local projections. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.07743&r=ecm |
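A minimal sketch of an Orthogonal Greedy Algorithm with an HDAIC-style stopping rule for selecting high-dimensional controls, as named in the abstract above. The exact penalty constant and the embedding into a local-projection regression are assumptions; columns of X are assumed standardized.

```python
import numpy as np

def oga_hdaic(X, y, max_steps=30, c=2.0):
    """Orthogonal Greedy Algorithm with an HDAIC-style stopping rule.
    X: (n, p) matrix of standardized controls; y: (n,) response. At each step the
    column most correlated with the current residual is added and the fit is
    re-projected by OLS; the step minimizing log(sigma2) + c*k*log(p)/n is kept."""
    n, p = X.shape
    selected, path = [], []
    resid = y - y.mean()
    for _ in range(min(max_steps, p)):
        score = np.abs(X.T @ resid)
        score[selected] = -np.inf                     # never reselect a column
        selected.append(int(np.argmax(score)))
        beta = np.linalg.lstsq(X[:, selected], y, rcond=None)[0]
        resid = y - X[:, selected] @ beta
        hdaic = np.log(resid @ resid / n) + c * len(selected) * np.log(p) / n
        path.append((list(selected), hdaic))
    return min(path, key=lambda s: s[1])[0]           # selected controls at the best step
```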
By: | Clément de Chaisemartin; Xavier D'Haultfœuille; Gonzalo Vazquez-Bare |
Abstract: | Many treatments or policy interventions are continuous in nature. Examples include prices, taxes or temperatures. Empirical researchers have usually relied on two-way fixed effect regressions to estimate treatment effects in such cases. However, such estimators are not robust to heterogeneous treatment effects in general; they also rely on the linearity of treatment effects. We propose estimators for continuous treatments that do not impose those restrictions, and that can be used when there are no stayers: the treatment of all units changes from one period to the next. We start by extending the nonparametric results of de Chaisemartin et al. (2023) to cases without stayers. We also present a parametric estimator, and use it to revisit Deschênes and Greenstone (2012). |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.05432&r=ecm |
By: | Arnaud Dufays; Aristide Houndetoungan; Alain Coën |
Abstract: | Change-point processes are a flexible approach to modelling long time series. We propose a method to uncover which model parameters truly vary when a change-point is detected. Given a set of breakpoints, we use a penalized likelihood approach to select the best set of parameters that change over time, and we prove that the penalty function leads to consistent selection of the true model. Estimation is carried out via the deterministic annealing expectation-maximization algorithm. Our method accounts for model selection uncertainty and associates a probability with each possible time-varying parameter specification. Monte Carlo simulations highlight that the method works well for many time series models, including heteroskedastic processes. For a sample of 14 hedge fund (HF) strategies, using an asset-based style pricing model, we shed light on the promising ability of our method to detect the time-varying dynamics of risk exposures as well as to forecast HF returns. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.05329&r=ecm |
By: | Jochmans, Koen |
Abstract: | This paper concerns the analysis of network data when unobserved node-specific heterogeneity is present. We postulate a weighted version of the classic stochastic block model, where nodes belong to one of a finite number of latent communities and the placement of edges between them, as well as any weight assigned to these edges, depends on the communities to which the nodes belong. A simple rank condition is presented under which we establish that the number of latent communities, their distribution, and the conditional distribution of edges and weights given community membership are all nonparametrically identified from knowledge of the joint (marginal) distribution of edges and weights in graphs of a fixed size. The identification argument is constructive, and we present a computationally attractive nonparametric estimator based on it. Limit theory is derived under asymptotics where we observe a growing number of independent networks of a fixed size. The results of a series of numerical experiments are reported. |
Keywords: | Heterogeneity; network; random graph; sorting; stochastic block model |
Date: | 2024–02–26 |
URL: | http://d.repec.org/n?u=RePEc:tse:wpaper:129137&r=ecm |
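A small sketch that simulates the kind of weighted stochastic block model data studied above: community labels drawn from pi, with edge probabilities and exponential edge weights that depend on community membership. The paper's rank-condition identification and nonparametric estimator are not reproduced, and all parameter choices below are illustrative.

```python
import numpy as np

def simulate_weighted_sbm(n, pi, P, mu, rng=None):
    """Draw one undirected weighted stochastic block model graph.
    pi: community probabilities; P[a, b]: edge probability between communities a and b;
    mu[a, b]: mean of an exponential weight given that an edge is present."""
    rng = rng or np.random.default_rng()
    z = rng.choice(len(pi), size=n, p=pi)              # latent community labels
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < P[z[i], z[j]]:
                W[i, j] = W[j, i] = rng.exponential(mu[z[i], z[j]])
    return z, W

# Example: many independent small graphs, matching the paper's asymptotic scheme.
# graphs = [simulate_weighted_sbm(10, [0.6, 0.4], np.array([[0.8, 0.2], [0.2, 0.7]]),
#                                 np.array([[2.0, 0.5], [0.5, 1.0]])) for _ in range(500)]
```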
By: | Philipp Bach; Oliver Schacht; Victor Chernozhukov; Sven Klaassen; Martin Spindler |
Abstract: | Proper hyperparameter tuning is essential for achieving optimal performance of modern machine learning (ML) methods in predictive tasks. While there is an extensive literature on tuning ML learners for prediction, there is little guidance available on tuning ML learners for causal machine learning, or on how to select among different ML learners. In this paper, we empirically assess the relationship between the predictive performance of ML methods and the resulting causal estimation based on the Double Machine Learning (DML) approach by Chernozhukov et al. (2018). DML relies on estimating so-called nuisance parameters by treating them as supervised learning problems and using them as plug-in estimates to solve for the (causal) parameter. We conduct an extensive simulation study using data from the 2019 Atlantic Causal Inference Conference Data Challenge. We provide empirical insights on the role of hyperparameter tuning and other practical decisions for causal estimation with DML. First, we assess the importance of data splitting schemes for tuning ML learners within Double Machine Learning. Second, we investigate how the choice of ML methods and hyperparameters, including recent AutoML frameworks, impacts the estimation performance for a causal parameter of interest. Third, we assess to what extent the choice of a particular causal model, as characterized by its parametric assumptions, can be based on predictive performance metrics. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.04674&r=ecm |
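A minimal hand-rolled sketch of cross-fitted DML for the partially linear model, using scikit-learn random forests as nuisance learners; the hyperparameters passed through `tune` are exactly the objects whose choice the paper investigates. Inputs are assumed to be NumPy arrays, and the learner choice is illustrative rather than a recommendation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(y, d, X, learner=RandomForestRegressor, n_folds=5, **tune):
    """Cross-fitted DML for the partially linear model y = theta*d + g(X) + u.
    The nuisance functions E[y|X] and E[d|X] are fit as supervised-learning problems;
    `tune` passes hyperparameters to the learner. Returns (theta_hat, std_error)."""
    y_res, d_res = np.zeros(len(y)), np.zeros(len(y))
    for train, test in KFold(n_folds, shuffle=True, random_state=0).split(X):
        y_res[test] = y[test] - learner(**tune).fit(X[train], y[train]).predict(X[test])
        d_res[test] = d[test] - learner(**tune).fit(X[train], d[train]).predict(X[test])
    theta = (d_res @ y_res) / (d_res @ d_res)                  # partialling-out estimate
    psi = (y_res - theta * d_res) * d_res                      # influence-function terms
    se = np.sqrt(np.mean(psi ** 2) / len(y)) / np.mean(d_res ** 2)
    return theta, se

# Usage sketch: theta, se = dml_plr(y, d, X, n_estimators=500, min_samples_leaf=5)
```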
By: | Yiqi Liu; Francesca Molinari |
Abstract: | Decision-making processes increasingly rely on the use of algorithms. Yet, algorithms' predictive ability frequently exhibits systematic variation across subgroups of the population. While both fairness and accuracy are desirable properties of an algorithm, they often come at the cost of one another. What, then, should a fairness-minded policymaker do when confronted with finite data? In this paper, we provide a consistent estimator for a theoretical fairness-accuracy frontier put forward by Liang, Lu and Mu (2023) and propose inference methods to test hypotheses that have received much attention in the fairness literature, such as (i) whether fully excluding a covariate from use in training the algorithm is optimal and (ii) whether there are less discriminatory alternatives to an existing algorithm. We also provide an estimator for the distance between a given algorithm and the fairest point on the frontier, and characterize its asymptotic distribution. We leverage the fact that the fairness-accuracy frontier is part of the boundary of a convex set that can be fully represented by its support function. We show that the estimated support function converges to a tight Gaussian process as the sample size increases, and then express policy-relevant hypotheses as restrictions on the support function to construct valid test statistics. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.08879&r=ecm |
By: | Jungjun Choi; Ming Yuan |
Abstract: | This paper studies the principal components (PC) estimator for high-dimensional approximate factor models with weak factors, in the sense that the factor loading matrix ($\boldsymbol{\Lambda}^0$) scales sublinearly in the number $N$ of cross-section units, i.e., $\boldsymbol{\Lambda}^{0\top} \boldsymbol{\Lambda}^0 / N^\alpha$ is positive definite in the limit for some $\alpha \in (0, 1)$. While the consistency and asymptotic normality of these estimates are by now well known when the factors are strong, i.e., $\alpha=1$, the statistical properties for weak factors remain less explored. Here, we show that the PC estimator maintains consistency and asymptotic normality for any $\alpha\in(0, 1)$, provided suitable conditions regarding the dependence structure in the noise are met. This complements an earlier result by Onatski (2012) that the PC estimator is inconsistent when $\alpha=0$, and the more recent work by Bai and Ng (2023), who established the asymptotic normality of the PC estimator when $\alpha \in (1/2, 1)$. Our proof strategy integrates the traditional eigendecomposition-based approach for factor models with a leave-one-out analysis similar in spirit to those used in matrix completion and other settings. This combination allows us to handle weaker factors than the former approach can while relaxing the incoherence and independence assumptions often associated with the latter. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.05789&r=ecm |
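For concreteness, a minimal sketch of the principal-components estimator of an approximate factor model under the usual normalization F'F/T = I_r; the paper's weak-factor asymptotics concern the behavior of exactly these estimates when the loadings scale sublinearly in N. The shapes and names are illustrative.

```python
import numpy as np

def pc_factors(X, r):
    """Principal-components estimator of the approximate factor model X = F L' + e.
    X: (T, N) panel; r: number of factors. Normalization F'F / T = I_r."""
    T, _ = X.shape
    eigval, eigvec = np.linalg.eigh(X @ X.T)
    F = np.sqrt(T) * eigvec[:, -r:][:, ::-1]          # top-r eigenvectors, descending order
    L = X.T @ F / T                                    # least-squares loadings
    return F, L
```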
By: | Fabrizio Ghezzi; Eduardo Rossi; Lorenzo Trapani |
Abstract: | We study online changepoint detection in the context of a linear regression model. We propose a class of heavily weighted statistics based on the CUSUM process of the regression residuals, which are specifically designed to ensure timely detection of breaks occurring early in the monitoring horizon. We subsequently propose a class of composite statistics, constructed using different weighting schemes; the decision rule to mark a changepoint is based on the largest statistic across the various weights, thus effectively working like a veto-based voting mechanism, which ensures fast detection irrespective of the location of the changepoint. Our theory is derived under a very general form of weak dependence, so that our tests can be applied to virtually all time series encountered in economics, medicine, and other applied sciences. Monte Carlo simulations show that our methodologies are able to control the procedure-wise Type I error and have short detection delays in the presence of breaks. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.04433&r=ecm |
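A minimal sketch of online CUSUM monitoring of regression residuals with a weighted boundary. The weighting shown is the familiar (1 + k/n)(k/(k+n))^gamma form from the monitoring literature, whereas the paper proposes heavier weights and a composite max-over-weights statistic; the critical value below is a placeholder rather than a calibrated one, and all names are illustrative.

```python
import numpy as np

def cusum_monitor(X_train, y_train, X_new, y_new, gamma=0.25, crit=3.0):
    """Online CUSUM monitoring of regression residuals.
    The model is estimated on the training sample; incoming residuals are cumulated
    and compared at each step k with a weighted boundary, here the familiar
    (1 + k/n) * (k/(k+n))**gamma form. `crit` is an illustrative critical value."""
    beta = np.linalg.lstsq(X_train, y_train, rcond=None)[0]
    sigma = np.std(y_train - X_train @ beta, ddof=X_train.shape[1])
    n = len(y_train)
    cusum = 0.0
    for k in range(1, len(y_new) + 1):
        cusum += y_new[k - 1] - X_new[k - 1] @ beta                 # new out-of-sample residual
        boundary = sigma * np.sqrt(n) * (1 + k / n) * (k / (k + n)) ** gamma
        if abs(cusum) > crit * boundary:
            return k                                                 # changepoint flagged at step k
    return None                                                      # no break detected
```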
By: | H. Rangika Iroshani Peiris; Chao Wang; Richard Gerlach; Minh-Ngoc Tran |
Abstract: | A semi-parametric joint Value-at-Risk (VaR) and Expected Shortfall (ES) forecasting framework employing multiple realized measures is developed. The proposed framework extends the quantile regression using multiple realized measures as exogenous variables to model the VaR. Then, the information from realized measures is used to model the time-varying relationship between VaR and ES. Finally, a measurement equation that models the contemporaneous dependence between the quantile and realized measures is used to complete the model. A quasi-likelihood, built on the asymmetric Laplace distribution, enables the Bayesian inference for the proposed model. An adaptive Markov Chain Monte Carlo method is used for the model estimation. The empirical section evaluates the performance of the proposed framework with six stock markets from January 2000 to June 2022, covering the period of COVID-19. Three realized measures, including 5-minute realized variance, bi-power variation, and realized kernel, are incorporated and evaluated in the proposed framework. One-step ahead VaR and ES forecasting results of the proposed model are compared to a range of parametric and semi-parametric models, lending support to the effectiveness of the proposed framework. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.09985&r=ecm |
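A minimal static sketch of the first building block above: a quantile regression of returns on lagged realized measures to produce a one-step-ahead VaR. The column names (`ret`, `rv`, `bv`, `rk`) are assumptions, and the paper's dynamic VaR-ES link, measurement equation, and Bayesian estimation are not reproduced.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def var_forecast(df, level=0.01):
    """One-step-ahead VaR from a quantile regression of returns on lagged realized
    measures. `df` is assumed to hold columns 'ret', 'rv', 'bv', 'rk'."""
    d = df.assign(rv1=df["rv"].shift(1), bv1=df["bv"].shift(1),
                  rk1=df["rk"].shift(1)).dropna()
    fit = smf.quantreg("ret ~ rv1 + bv1 + rk1", d).fit(q=level)
    x_next = pd.DataFrame({"rv1": [df["rv"].iloc[-1]],              # today's measures feed
                           "bv1": [df["bv"].iloc[-1]],              # tomorrow's quantile
                           "rk1": [df["rk"].iloc[-1]]})
    return float(np.asarray(fit.predict(x_next))[0])
```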
By: | Sobin Joseph; Shashi Jain |
Abstract: | An extension of the Hawkes process, the marked Hawkes process distinguishes itself by featuring a variable jump size at each event, in contrast to the constant jump size of a Hawkes process without marks. While an extensive literature has been dedicated to the non-parametric estimation of both the linear and non-linear Hawkes process, there remains a significant gap regarding the marked Hawkes process. In response, we propose a methodology for estimating the conditional intensity of the marked Hawkes process. We introduce two distinct models: the Shallow Neural Hawkes with Marks, for Hawkes processes with excitatory kernels, and the Neural Network for Non-Linear Hawkes with Marks, for non-linear Hawkes processes. Both approaches take the past arrival times and their corresponding marks as input to obtain the arrival intensity. The approach is entirely non-parametric, preserving the interpretability associated with the marked Hawkes process. To validate the efficacy of our method, we subject it to synthetic datasets with known ground truth. Additionally, we apply our method to model cryptocurrency order book data, demonstrating its applicability to real-world scenarios. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.04740&r=ecm |
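A minimal sketch of simulating a linear marked Hawkes process with an exponential kernel by Ogata thinning; such (time, mark) sequences are the inputs the neural estimators above would take. The exponential marks and all parameter names are illustrative assumptions.

```python
import numpy as np

def simulate_marked_hawkes(mu, alpha, beta, horizon, rng=None):
    """Simulate a linear marked Hawkes process by Ogata thinning.
    Intensity: lambda(t) = mu + sum_{t_i < t} alpha * m_i * exp(-beta * (t - t_i)),
    with i.i.d. unit-exponential marks m_i. Returns arrival times and marks."""
    rng = rng or np.random.default_rng()
    times, marks, t = [], [], 0.0
    while t < horizon:
        lam_bar = mu + sum(alpha * m * np.exp(-beta * (t - s))
                           for s, m in zip(times, marks))            # upper bound (decaying kernel)
        t += rng.exponential(1.0 / lam_bar)
        lam_t = mu + sum(alpha * m * np.exp(-beta * (t - s))
                         for s, m in zip(times, marks))
        if t < horizon and rng.random() < lam_t / lam_bar:           # thinning acceptance step
            times.append(t)
            marks.append(rng.exponential(1.0))
    return np.array(times), np.array(marks)

# times, marks = simulate_marked_hawkes(mu=0.5, alpha=0.8, beta=1.5, horizon=1000.0)
```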
By: | Zongwu Cai (Department of Economics, The University of Kansas, Lawrence, KS 66045, USA); Hongwei Mei (Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409, USA); Rui Wang (Department of Economics, The University of Kansas, Lawrence, KS 66045, USA) |
Abstract: | For a heterogeneous agent model with aggregate shocks, the seminal paper by Krusell and Smith (1998) provides an equilibrium framework depending only on the (conditional) mean wealth rather than the wealth distribution of all agents, which is referred to as approximate aggregation for their prototype model. Their result can be obtained through the analysis of a forward-backward system consisting of the Hamilton-Jacobi-Bellman equation, the Fokker-Planck equation, and a constraint. In contrast to the existing literature, this paper proposes a statistical method to verify whether a heterogeneous agent model features approximate aggregation in the scenario where only one agent's wealth, together with the aggregate shocks, is observable over time. Our main approach lies in studying a model specification testing problem for the evolution of the wealth (i.e., the Fokker-Planck equation) within an appropriate parametric family featuring approximate aggregation. The key challenge stems from the partially observed information, as the wealth distribution of all agents is unobservable. To overcome this difficulty, a novel two-step estimator is first proposed for estimating the parameter in the parametric family. Then, several test statistics are constructed and their asymptotic properties established, which in turn provide several testing rules. Finally, some Monte Carlo simulations are conducted to illustrate the finite sample performance of the proposed tests. |
Keywords: | Heterogeneous agent model with aggregate shocks; Approximate aggregation; Model specification test; Equilibrium estimator; Partial observation |
JEL: | C12 C13 E20 |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:kan:wpaper:202405&r=ecm |
By: | Connor R. Forsythe; Cristian Arteaga; John P. Helveston |
Abstract: | This paper introduces the Heterogeneous Aggregate Valence Analysis (HAVAN) model, a novel class of discrete choice models. We adopt the term "valence" to encompass any latent quantity used to model consumer decision-making (e.g., utility, regret, etc.). Diverging from traditional models that parameterize heterogeneous preferences across various product attributes, HAVAN models (pronounced "haven") instead directly characterize alternative-specific heterogeneous preferences. This innovative perspective on consumer heterogeneity affords unprecedented flexibility and significantly reduces simulation burdens commonly associated with mixed logit models. In a simulation experiment, the HAVAN model demonstrates superior predictive performance compared to state-of-the-art artificial neural networks. This finding underscores the potential for HAVAN models to improve discrete choice modeling capabilities. |
Date: | 2024–01 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.00184&r=ecm |
By: | Nigar Hashimzade; Oleg Kirsanov; Tatiana Kirsanova; Junior Maih |
Abstract: | This paper presents a framework for empirical analysis of dynamic macroeconomic models using Bayesian filtering, with a specific focus on the state-space formulation of New Keynesian Dynamic Stochastic General Equilibrium (NK DSGE) models with multiple regimes. We outline the theoretical foundations of model estimation, provide the details of two families of powerful multiple-regime filters, IMM and GPB, and construct corresponding multiple-regime smoothers. A simulation exercise, based on a prototypical NK DSGE model, is used to demonstrate the computational robustness of the proposed filters and smoothers and evaluate their accuracy and speed. We show that the canonical IMM filter is faster than the commonly used Kim and Nelson (1999) filter and is no less, and often more, accurate. Using it with the matching smoother improves the precision in recovering unobserved variables by about 25%. Furthermore, applying it to U.S. macroeconomic time series for 1947-2023, we successfully identify significant past policy shifts, including those related to the post-Covid-19 period. Our results demonstrate the practical applicability and potential of the proposed routines in macroeconomic analysis. |
Keywords: | Markov switching models, filtering, smoothing |
JEL: | C11 C32 C54 E52 |
Date: | 2024 |
URL: | http://d.repec.org/n?u=RePEc:ces:ceswps:_10941&r=ecm |
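A compact sketch of the canonical IMM filter for a linear-Gaussian state space with Markov-switching matrices, in the spirit of the filters discussed above: a mixing step, regime-matched Kalman updates, and a regime-probability update. The interface (dictionaries of matrices per regime) and all names are illustrative, and the smoother counterpart is omitted.

```python
import numpy as np

def imm_filter(y, models, P_trans, x0, V0):
    """Interacting Multiple Model (IMM) filter for a linear-Gaussian state space with
    Markov-switching matrices. y: (T, m) observations; models: list of dicts with
    keys "A", "Q", "H", "R" per regime; P_trans[i, j] = Pr(regime j | regime i).
    Returns the path of filtered regime probabilities."""
    M = len(models)
    x = np.tile(np.asarray(x0, dtype=float), (M, 1))
    V = np.tile(np.asarray(V0, dtype=float), (M, 1, 1))
    mu = np.full(M, 1.0 / M)
    mu_path = []
    for yt in y:
        # 1) Mixing: blend previous regime-conditional moments.
        c = P_trans.T @ mu
        w = (P_trans * mu[:, None]) / c[None, :]
        x_mix = w.T @ x
        V_mix = np.zeros_like(V)
        for j in range(M):
            for i in range(M):
                d = x[i] - x_mix[j]
                V_mix[j] += w[i, j] * (V[i] + np.outer(d, d))
        # 2) Regime-matched Kalman prediction and update, with regime likelihoods.
        lik = np.zeros(M)
        for j, mdl in enumerate(models):
            xp = mdl["A"] @ x_mix[j]
            Vp = mdl["A"] @ V_mix[j] @ mdl["A"].T + mdl["Q"]
            S = mdl["H"] @ Vp @ mdl["H"].T + mdl["R"]
            e = yt - mdl["H"] @ xp
            K = Vp @ mdl["H"].T @ np.linalg.inv(S)
            x[j] = xp + K @ e
            V[j] = Vp - K @ mdl["H"] @ Vp
            lik[j] = np.exp(-0.5 * e @ np.linalg.solve(S, e)) / np.sqrt(np.linalg.det(2 * np.pi * S))
        # 3) Update regime probabilities.
        mu = lik * c
        mu /= mu.sum()
        mu_path.append(mu.copy())
    return np.array(mu_path)
```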
By: | Ryan T. Godwin |
Abstract: | We find that in zero-truncated count data (y=1, 2, ...), individuals often gain information at first observation (y=1), leading to a common but unaddressed phenomenon of "one-inflation". The current standard, the zero-truncated negative binomial (ZTNB) model, is misspecified under one-inflation, causing bias and inconsistency. To address this, we introduce the one-inflated zero-truncated negative binomial (OIZTNB) regression model. The importance of our model is highlighted through simulation studies, and through the discovery of one-inflation in four datasets that have traditionally championed ZTNB. We recommend OIZTNB over ZTNB for most data, and provide estimation, marginal effects, and testing in the accompanying R package oneinfl. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.02272&r=ecm |
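An intercept-only sketch of the one-inflated zero-truncated negative binomial likelihood described above; the paper's oneinfl R package adds regressors, marginal effects, and tests, none of which are reproduced here. The parameterization and optimizer choice are assumptions.

```python
import numpy as np
from scipy.stats import nbinom
from scipy.optimize import minimize

def oiztnb_negloglik(params, y):
    """Negative log-likelihood of an intercept-only one-inflated zero-truncated
    negative binomial. params = (log r, logit p, logit omega), where omega is the
    extra probability mass placed on y = 1."""
    r = np.exp(params[0])
    p = 1.0 / (1.0 + np.exp(-params[1]))
    w = 1.0 / (1.0 + np.exp(-params[2]))
    ztnb = nbinom.pmf(y, r, p) / (1.0 - nbinom.pmf(0, r, p))   # zero-truncated NB pmf
    lik = np.where(y == 1, w + (1.0 - w) * ztnb, (1.0 - w) * ztnb)
    return -np.sum(np.log(lik))

# Maximum likelihood fit on positive counts y:
# res = minimize(oiztnb_negloglik, x0=np.zeros(3), args=(y,), method="Nelder-Mead")
```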
By: | Stefano Pietrosanti (Bank of Italy); Edoardo Rainone (Bank of Italy) |
Abstract: | We present a simple model of a credit market in which firms borrow from multiple banks and credit relationships are simultaneous and interdependent. In this environment, financial and real shocks induce credit reallocation across more and less affected lenders and borrowers. We show that this interdependence introduces a bias in standard estimates of the effect of shocks on credit relationships. Moreover, we show that the use of firm fixed effects does not solve the issue and may even magnify it, and that the same bias contaminates the fixed-effects estimates. We propose a novel model that nests commonly used ones, uses the same information set, and accounts for and quantifies spillover effects among credit relationships. We document its properties with Monte Carlo simulations and apply it to real credit register data. Evidence from the empirical application suggests that estimates not accounting for spillovers are indeed highly biased. |
Keywords: | credit markets, shocks propagation, networks, identification |
JEL: | C30 L14 G21 |
Date: | 2023–12 |
URL: | http://d.repec.org/n?u=RePEc:bdi:wptemi:td_1436_23&r=ecm |
By: | Luofeng Liao; Christian Kroer; Sergei Leonenkov; Okke Schrijvers; Liang Shi; Nicolas Stier-Moses; Congshan Zhang |
Abstract: | Online A/B testing is widely used in the internet industry to inform decisions on new feature roll-outs. For online marketplaces (such as advertising markets), standard approaches to A/B testing may lead to biased results when buyers operate under a budget constraint, as budget consumption in one arm of the experiment impacts performance of the other arm. To counteract this interference, one can use a budget-split design in which the budget constraint operates on a per-arm basis and each arm receives an equal fraction of the budget, leading to "budget-controlled A/B testing." Despite the clear advantages of budget-controlled A/B testing, performance degrades when budgets are split too thinly, limiting the overall throughput of such systems. In this paper, we propose a parallel budget-controlled A/B testing design in which we use market segmentation to identify submarkets within the larger market and run parallel experiments on each submarket. Our contributions are as follows. First, we introduce and demonstrate the effectiveness of the parallel budget-controlled A/B test design with submarkets in a large online marketplace environment. Second, we formally define market interference in first-price auction markets using the first price pacing equilibrium (FPPE) framework. Third, we propose a debiased surrogate that eliminates the first-order bias of FPPE, drawing upon the principles of sensitivity analysis in mathematical programming. Fourth, we derive a plug-in estimator for the surrogate and establish its asymptotic normality. Fifth, we provide an estimation procedure for submarket parallel budget-controlled A/B tests. Finally, we present numerical examples on semi-synthetic data, confirming that the debiasing technique achieves the desired coverage properties. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2402.07322&r=ecm |