New Economics Papers on Econometrics
By: | Sylvain Barde; Rowan Cherodian; Guy Tchuente |
Abstract: | We propose a novel estimation procedure for models with endogenous variables in the presence of spatial correlation, based on Eigenvector Spatial Filtering. The procedure, called Moran's $I$ 2-Stage Lasso (Mi-2SL), uses a two-stage Lasso estimator in which the standardised Moran's $I$ is used to set the Lasso tuning parameter. Unlike existing spatial econometric methods, this has the key benefit of not requiring the researcher to explicitly model the spatial correlation process, which is useful when the aim is only to remove the resulting bias when estimating the direct effect of covariates. We show the conditions necessary for consistent and asymptotically normal parameter estimation assuming the support (relevant) set of eigenvectors is known. Our Monte Carlo simulation results also show that Mi-2SL performs well against common alternatives in the presence of spatial correlation. Our empirical application replicates the instrumental variables estimates of Cadena and Kovak (2016) using Mi-2SL and shows that, in that case, Mi-2SL can boost the performance of the first stage.
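The standardised Moran's $I$ that drives the Mi-2SL tuning step can be sketched as a z-score of the classical Moran statistic. A minimal illustration in Python, assuming the textbook Cliff-Ord moments under the normality assumption (the paper's exact standardisation of the first-stage residuals may differ):

```python
import numpy as np

def standardised_morans_i(e, W):
    # Standardised Moran's I (z-score) of a vector e under spatial weight
    # matrix W, using the classical moments derived under normality.
    # Illustrative sketch only: in Mi-2SL this statistic is used to set
    # the Lasso tuning parameter, a loop not shown here.
    e = np.asarray(e, dtype=float)
    e = e - e.mean()                     # deviations from the mean
    n = len(e)
    S0 = W.sum()                         # sum of all weights
    S1 = 0.5 * ((W + W.T) ** 2).sum()
    S2 = ((W.sum(axis=0) + W.sum(axis=1)) ** 2).sum()
    I = (n / S0) * (e @ W @ e) / (e @ e)
    EI = -1.0 / (n - 1)                  # null expectation
    VI = (n**2 * S1 - n * S2 + 3 * S0**2) / ((n**2 - 1) * S0**2) - EI**2
    return (I - EI) / np.sqrt(VI)
```

Roughly, the procedure would shrink or relax the Lasso penalty on the eigenvector set until this z-score of the residuals is no longer significant.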
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.02584&r=ecm |
By: | Gyungbae Park |
Abstract: | This paper studies debiased machine learning when nuisance parameters appear in indicator functions. An important example is maximized average welfare under optimal treatment assignment rules. For asymptotically valid inference for a parameter of interest, the current literature on debiased machine learning relies on Gateaux differentiability of the functions inside moment conditions, which does not hold when nuisance parameters appear in indicator functions. In this paper, we propose smoothing the indicator functions, and develop an asymptotic distribution theory for this class of models. The asymptotic behavior of the proposed estimator exhibits a trade-off between bias and variance due to smoothing. We study how a parameter which controls the degree of smoothing can be chosen optimally to minimize an upper bound of the asymptotic mean squared error. A Monte Carlo simulation supports the asymptotic distribution theory, and an empirical example illustrates the implementation of the method. |
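The smoothing idea can be illustrated by replacing the hard indicator with a logistic kernel whose bandwidth controls the bias-variance trade-off. A hedged sketch in Python (the function names and the welfare functional below are illustrative, not the paper's notation):

```python
import math

def smooth_indicator(v, h):
    # Smooth surrogate for the indicator 1{v > 0}: a logistic kernel with
    # bandwidth h. As h -> 0 it converges pointwise to the hard indicator
    # (for v != 0); a larger h lowers variance at the cost of bias.
    x = v / h
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)          # numerically stable branch for x < 0
    return z / (1.0 + z)

def smoothed_welfare(scores, y_treated, y_control, h):
    # Illustrative smoothed plug-in for average welfare under the rule
    # "treat when score > 0": E[1{score>0} Y(1) + 1{score<=0} Y(0)],
    # with the indicator replaced by its smooth surrogate.
    total = 0.0
    for s, y1, y0 in zip(scores, y_treated, y_control):
        w = smooth_indicator(s, h)
        total += w * y1 + (1 - w) * y0
    return total / len(scores)
```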
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15934&r=ecm |
By: | Yuya Shimizu; Taisuke Otsu |
Abstract: | This paper studies optimal hypothesis testing for nonregular statistical models with parameter-dependent support. We consider both one-sided and two-sided hypothesis testing and develop asymptotically uniformly most powerful tests based on the likelihood ratio process. The proposed one-sided test involves randomization to achieve asymptotic size control, a tuning constant to avoid discontinuities in the limiting likelihood ratio process, and a user-specified alternative hypothetical value to achieve asymptotic optimality. Our two-sided test becomes asymptotically uniformly most powerful without imposing further restrictions such as unbiasedness. Simulation results illustrate the desirable power properties of the proposed tests.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.16413&r=ecm |
By: | Myungkou Shin |
Abstract: | Treatment effect heterogeneity is of great concern when evaluating a treatment. However, even in the simple case of a binary treatment, the distribution of treatment effects is difficult to identify because of the fundamental limitation that we cannot observe both the treated and the untreated potential outcome for a given individual. This paper assumes a finite mixture model on the potential outcomes and a vector of control covariates to address treatment endogeneity, and imposes a Markov condition on the potential outcomes and covariates within each type to identify the treatment effect distribution. The mixture weights of the finite mixture model are consistently estimated with a nonnegative matrix factorization algorithm, which allows us to consistently estimate the component distribution parameters, including those of the treatment effect distribution.
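The nonnegative matrix factorization step can be illustrated with generic Lee-Seung multiplicative updates; the paper's specific algorithm, and the mapping from the mixture model to the factorised matrix, are not reproduced in this sketch:

```python
import numpy as np

def nmf(V, r, iters=500, seed=0):
    # Minimal Lee-Seung multiplicative-update NMF for the Frobenius loss,
    # factorising a nonnegative matrix V as V ~ W @ H with W, H >= 0.
    # Illustrative only: in the paper, an NMF algorithm recovers the
    # mixture weights of the finite mixture model.
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + 0.1         # positive random initialisation
    H = rng.random((r, m)) + 0.1
    eps = 1e-12                          # guards against division by zero
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The multiplicative updates preserve nonnegativity by construction, which is why they are a natural fit for estimating mixture weights.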
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.18503&r=ecm |
By: | Lauren Bin Dong; David E. A. Giles (Department of Economics, University of Victoria)
Abstract: | The empirical likelihood ratio (ELR) test for the problem of testing for normality in a linear regression model is derived in this paper. The sampling properties of the ELR test and four other commonly used tests are explored and analyzed using Monte Carlo simulation. The ELR test has good power properties against various alternative hypotheses. |
Keywords: | Regression residual, empirical likelihood ratio, Monte Carlo simulation, normality
JEL: | C12 C15 C16
Date: | 2024–03–21 |
URL: | http://d.repec.org/n?u=RePEc:vic:vicddp:0402&r=ecm |
By: | Amparo Baíllo; Javier Cárcamo; Carlos Mora-Corral
Abstract: | We introduce a 2-dimensional stochastic dominance (2DSD) index to characterize both strict and almost stochastic dominance. Based on this index, we derive an estimator for the minimum violation ratio (MVR), also known as the critical parameter, of the almost stochastic ordering condition between two variables. We determine the asymptotic properties of the empirical 2DSD index and MVR for the most frequently used stochastic orders. We also provide conditions under which the bootstrap estimators of these quantities are strongly consistent. As an application, we develop consistent bootstrap testing procedures for almost stochastic dominance. The performance of the tests is checked via simulations and the analysis of real data. |
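The minimum violation ratio for almost stochastic dominance has a simple empirical plug-in. A sketch assuming the standard Leshno-Levy form for first-order dominance, eps = integral of (F_X - F_Y)+ divided by integral of |F_X - F_Y| (the paper's 2DSD-index-based estimator is more general):

```python
import numpy as np

def violation_ratio(x, y):
    # Empirical minimum violation ratio (critical parameter) for almost
    # first-order stochastic dominance of X over Y, evaluated on the
    # pooled sample grid. eps = 0 means exact FSD of X over Y; eps near 1
    # means Y dominates X. Illustrative sketch, not the paper's estimator.
    grid = np.sort(np.concatenate([x, y]))
    Fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    Fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    d = Fx - Fy
    dx = np.diff(grid)                       # step widths on the grid
    num = np.sum(np.clip(d[:-1], 0, None) * dx)   # area where FSD fails
    den = np.sum(np.abs(d[:-1]) * dx)             # total area between CDFs
    return num / den if den > 0 else 0.0
```

A bootstrap test would recompute this ratio on resampled pairs (x*, y*) and compare its distribution to the chosen dominance threshold.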
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15258&r=ecm |
By: | Fryzlewicz, Piotr |
Abstract: | We propose Robust Narrowest Significance Pursuit (RNSP), a methodology for detecting localized regions in data sequences, each of which must contain a change-point in the median, at a prescribed global significance level. RNSP works by fitting the postulated constant model over many regions of the data using a new sign-multiresolution sup-norm-type loss, and greedily identifying the shortest intervals on which the constancy is significantly violated. By working with the signs of the data around fitted model candidates, RNSP fulfils its coverage promises under minimal assumptions, requiring only sign-symmetry and serial independence of the signs of the true residuals. In particular, it permits their heterogeneity and arbitrarily heavy tails. The intervals of significance returned by RNSP have a finite-sample character, are unconditional in nature and do not rely on any assumptions on the true signal. Code implementing RNSP is available at https://github.com/pfryz/nsp.
Keywords: | confidence intervals; structural breaks; post-selection inference; narrowest-over-threshold
JEL: | C1 J1 |
Date: | 2024–03–15 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:121646&r=ecm |
By: | Benjamin Lu; Jia Wan; Derek Ouyang; Jacob Goldin; Daniel E. Ho |
Abstract: | Measuring average differences in an outcome across racial or ethnic groups is a crucial first step for equity assessments, but researchers often lack access to data on individuals’ races and ethnicities to calculate them. A common solution is to impute the missing race or ethnicity labels using proxies, then use those imputations to estimate the disparity. Conventional standard errors mischaracterize the resulting estimate’s uncertainty because they treat the imputation model as given and fixed, instead of as an unknown object that must be estimated with uncertainty. We propose a dual-bootstrap approach that explicitly accounts for measurement uncertainty and thus enables more accurate statistical inference, which we demonstrate via simulation. In addition, we adapt our approach to the commonly used Bayesian Improved Surname Geocoding (BISG) imputation algorithm, where direct bootstrapping is infeasible because the underlying Census Bureau data are unavailable. In simulations, we find that measurement uncertainty is generally insignificant for BISG except in particular circumstances; bias, not variance, is likely the predominant source of error. We apply our method to quantify the uncertainty of prevalence estimates of common health conditions by race using data from the American Family Cohort. |
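The dual-bootstrap logic, resampling both the data that train the imputation model and the analysis sample inside each replicate, can be sketched with a toy frequency-table imputer (all names and the disparity estimator here are illustrative; as the abstract notes, BISG itself cannot be bootstrapped directly):

```python
import random
import statistics

def dual_bootstrap_disparity(ref, target, B=200, seed=0):
    # Hedged sketch of the dual-bootstrap idea: each replicate resamples
    # BOTH the reference data used to fit the imputation (proxy) model
    # AND the target sample, so the resulting SE reflects imputation-model
    # uncertainty as well as sampling uncertainty.
    # ref:    list of (proxy_key, group) pairs that train the imputer.
    # target: list of (proxy_key, outcome) pairs with group unknown.
    rng = random.Random(seed)
    estimates = []
    for _ in range(B):
        ref_b = [rng.choice(ref) for _ in ref]        # refit the imputer
        counts, ones = {}, {}
        for key, g in ref_b:                          # P(group=1 | key)
            counts[key] = counts.get(key, 0) + 1
            ones[key] = ones.get(key, 0) + g
        prob = {k: ones[k] / counts[k] for k in counts}
        tgt_b = [rng.choice(target) for _ in target]  # resample target
        w1 = sum(prob.get(k, 0.5) for k, _ in tgt_b)
        w0 = len(tgt_b) - w1
        m1 = sum(prob.get(k, 0.5) * y for k, y in tgt_b) / w1
        m0 = sum((1 - prob.get(k, 0.5)) * y for k, y in tgt_b) / w0
        estimates.append(m1 - m0)                     # imputed disparity
    return statistics.mean(estimates), statistics.stdev(estimates)
```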
JEL: | C10 J10 J15 |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:32312&r=ecm |
By: | Takuya Ura; Lina Zhang |
Abstract: | This paper provides a framework for policy-relevant treatment effects using instrumental variables. In this framework, treatment selection may or may not satisfy the classical monotonicity condition, and multidimensional unobserved heterogeneity can be accommodated. We can bound the target parameter by extracting information from identifiable estimands. We also provide a more conservative yet computationally simpler bound by applying a convex relaxation method. Linear shape restrictions can be easily incorporated to further tighten the bounds. Numerical and simulation results illustrate the informativeness of our convex-relaxation bounds, i.e., that our bounds are sufficiently tight.
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.13738&r=ecm |
By: | Jiawei Fu; Tara Slough |
Abstract: | The credibility revolution advances the use of research designs that permit identification and estimation of causal effects. However, understanding which mechanisms produce measured causal effects remains a challenge. A dominant current approach to the quantitative evaluation of mechanisms relies on the detection of heterogeneous treatment effects with respect to pre-treatment covariates. This paper develops a framework to understand when the existence of such heterogeneous treatment effects can support inferences about the activation of a mechanism. We show first that this design cannot provide evidence of mechanism activation without additional, generally implicit, assumptions. Further, even when these assumptions are satisfied, if a measured outcome is produced by a non-linear transformation of a directly-affected outcome of theoretical interest, heterogeneous treatment effects are not informative of mechanism activation. We provide novel guidance for interpretation and research design in light of these findings. |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.01566&r=ecm |
By: | Hiroki Masuda; Lorenzo Mercuri; Yuma Uehara |
Abstract: | The aim of this paper is to discuss an estimation and a simulation method, in the R package YUIMA, for a linear regression model driven by a Student-$t$ Lévy process with constant scale and arbitrary degrees of freedom. This process finds applications in several fields, for example finance, physics, and biology. The model presents two main issues. The first is related to the simulation of a sample path at the high-frequency level: only the $t$-Lévy increments defined on a unit time interval are Student-$t$ distributed. In YUIMA, we solve this problem by means of the inverse Fourier transform, which allows us to simulate the increments of a Student-$t$ Lévy process defined on an interval of any length. The second issue is that joint estimation of trend, scale, and degrees of freedom does not seem to have been investigated yet. In YUIMA, we develop a two-step estimation procedure that efficiently deals with this issue. Numerical examples are given in order to explain the methods and classes used in the YUIMA package.
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.12078&r=ecm |
By: | Jonas Esser; Mateus Maia; Andrew C. Parnell; Judith Bosmans; Hanneke van Dongen; Thomas Klausch; Keefe Murphy |
Abstract: | In recent years, theoretical results and simulation evidence have shown Bayesian additive regression trees (BART) to be a highly effective method for nonparametric regression. Motivated by cost-effectiveness analyses in health economics, where interest lies in jointly modelling the costs of healthcare treatments and the associated health-related quality of life experienced by a patient, we propose a multivariate extension of BART applicable in regression and classification analyses with several correlated outcome variables. Our framework overcomes some key limitations of existing multivariate BART models by allowing each individual response to be associated with different ensembles of trees, while still handling dependencies between the outcomes. In the case of continuous outcomes, our model is essentially a nonparametric version of seemingly unrelated regression. Likewise, our proposal for binary outcomes is a nonparametric generalisation of the multivariate probit model. We give suggestions for easily interpretable prior distributions, which allow specification of both informative and uninformative priors. We provide detailed discussions of MCMC sampling methods to conduct posterior inference. Our methods are implemented in the R package `suBART'. We showcase their performance through extensive simulations and an application to an empirical case study from health economics. By also accommodating propensity scores in a manner befitting a causal analysis, we find substantial evidence for a novel trauma care intervention's cost-effectiveness.
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.02228&r=ecm |
By: | Shosei Sakaguchi |
Abstract: | Many public policies and medical interventions involve dynamics in their treatment assignments, where treatments are sequentially assigned to the same individuals across multiple stages, and the effect of treatment at each stage is usually heterogeneous with respect to the history of prior treatments and associated characteristics. We study statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's history. We propose a step-wise doubly-robust approach to learn the optimal DTR using observational data under the assumption of sequential ignorability. The approach solves the sequential treatment assignment problem through backward induction, where, at each step, we combine estimators of propensity scores and action-value functions (Q-functions) to construct augmented inverse probability weighting estimators of values of policies for each stage. The approach consistently estimates the optimal DTR if either a propensity score or Q-function for each stage is consistently estimated. Furthermore, the resulting DTR can achieve the optimal convergence rate $n^{-1/2}$ of regret under mild conditions on the convergence rate for estimators of the nuisance parameters. |
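The backward-induction structure can be illustrated with a stripped-down two-stage, regression-only (Q-learning-style) version; the paper's estimator is doubly robust, additionally combining these Q-functions with propensity-score weights, which this sketch omits:

```python
import numpy as np

def backward_induction_dtr(X1, A1, X2, A2, Y):
    # Two-stage backward induction for a dynamic treatment regime with
    # binary treatments, via linear Q-function regressions. Illustrative
    # sketch only: the regression leg of the step-wise procedure, without
    # the augmented inverse probability weighting correction.
    def fit_q(features, y):
        Z = np.column_stack([np.ones(len(y)), features])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        return lambda f: np.column_stack([np.ones(len(f)), f]) @ beta
    # Stage 2: regress Y on (history, A2), then take the better arm.
    H2 = np.column_stack([X1, A1, X2])
    q2 = fit_q(np.column_stack([H2, A2]), Y)
    v2 = np.maximum(q2(np.column_stack([H2, np.zeros(len(Y))])),
                    q2(np.column_stack([H2, np.ones(len(Y))])))
    # Stage 1: regress the stage-2 value on (X1, A1); the estimated rule
    # treats whenever treating yields the higher predicted value.
    q1 = fit_q(np.column_stack([X1, A1]), v2)
    d1 = (q1(np.column_stack([X1, np.ones(len(Y))]))
          > q1(np.column_stack([X1, np.zeros(len(Y))]))).astype(int)
    return d1, v2
```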
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.00221&r=ecm |
By: | Patrick M. Kline; Evan K. Rose; Christopher R. Walters |
Abstract: | We develop an empirical Bayes ranking procedure that assigns ordinal grades to noisy measurements, balancing the information content of the assigned grades against the expected frequency of ranking errors. Applying the method to a massive correspondence experiment, we grade the race and gender contact gaps of 97 U.S. employers, the identities of which we disclose for the first time. The grades are presented alongside measures of uncertainty about each firm’s contact gap in an accessible report card that is easily adaptable to other settings where ranks and levels are of simultaneous interest. |
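The shrinkage step behind such report cards can be sketched with a normal-normal empirical Bayes model; the grade-assignment rule below is a plain cutpoint map, not the paper's procedure that balances grade informativeness against expected ranking errors:

```python
import statistics

def eb_posteriors(estimates, ses):
    # Normal-normal empirical Bayes: shrink noisy unit-level estimates
    # toward the grand mean, with the prior variance tau^2 estimated by
    # method of moments. Sketch of the shrinkage step only.
    mu = statistics.mean(estimates)
    tau2 = max(statistics.pvariance(estimates)
               - statistics.mean([s * s for s in ses]), 0.0)
    post = []
    for est, se in zip(estimates, ses):
        w = tau2 / (tau2 + se * se) if tau2 + se * se > 0 else 0.0
        post.append(mu + w * (est - mu))   # posterior mean
    return post

def assign_grades(posteriors, cutpoints):
    # Map posterior means to ordinal grades via ascending cutpoints.
    return [sum(p > c for c in cutpoints) for p in posteriors]
```

With very noisy measurements tau^2 dominates less, the posteriors collapse toward the grand mean, and fewer distinct grades are warranted, which is the tension the paper's procedure formalises.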
JEL: | C11 C13 J71 |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:32313&r=ecm |
By: | Stephane Bonhomme; Angela Denis |
Abstract: | A growing number of applications involve settings where, in order to infer heterogeneous effects, a researcher compares various units. Examples of research designs include children moving between different neighborhoods, workers moving between firms, patients migrating from one city to another, and banks offering loans to different firms. We present a unified framework for these settings, based on a linear model with normal random coefficients and normal errors. Using the model, we discuss how to recover the mean and dispersion of the effects and other features of their distribution, and how to construct predictors of the effects. We provide moment conditions on the model's parameters, and outline various estimation strategies. A main objective of the paper is to clarify some of the underlying assumptions by highlighting their economic content, and to discuss and inform some of the key practical choices.
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.01495&r=ecm |