nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒11‒27
sixteen papers chosen by
Sune Karlsson, Örebro universitet


  1. Mean Group Instrumental Variable Estimation of Time-Varying Large Heterogeneous Panels with Endogenous Regressors By Yu Bai; Massimiliano Marcellino; George Kapetanios
  2. CATE Lasso: Conditional Average Treatment Effect Estimation with High-Dimensional Linear Regression By Masahiro Kato; Masaaki Imaizumi
  3. Empirical likelihood for network data By Matsushita, Yukitoshi; Otsu, Taisuke
  4. Truncated two-parameter Poisson-Dirichlet approximation for Pitman-Yor process hierarchical models By Zhang, Junyi; Dassios, Angelos
  5. Unobserved Grouped Heteroskedasticity and Fixed Effects By Jorge A. Rivero
  6. On propensity score matching with a diverging number of matches By Yihui He; Fang Han
  7. Bayesian Estimation of Panel Models under Potentially Sparse Heterogeneity By Hyungsik Roger Moon; Frank Schorfheide; Boyuan Zhang
  8. Bias-Corrected Instrumental Variable Estimation in Linear Dynamic Panel Data Models By Chen, Weihao; Cizek, Pavel
  9. Speeding up estimation of spatially varying coefficients models. By Ghislain Geniaux
  10. Evaluating difference-in-differences models under different treatment assignment mechanism and in the presence of spillover effects By Guilherme Araújo Lima; Igor Viveiros Melo Souza; Mauro Sayar Ferreira
  11. A joint modeling approach for longitudinal outcomes and non-ignorable dropout under population heterogeneity in mental health studies By Park, Jung Yeon; Wall, Melanie M; Moustaki, Irini; Grossman, Arnold
  12. Rental Price Dynamics in Germany: A Distributional Regression Model with Heterogenous Covariate Effects By Julian Granna; Stefan Lang
  13. Fair Adaptive Experiments By Waverly Wei; Xinwei Ma; Jingshen Wang
  14. Learning Probability Distributions of Intraday Electricity Prices By Jozef Barunik; Lubos Hanus
  15. How to build a cross-impact model from first principles: Theoretical requirements and empirical results By Mehdi Tomas; Iacopo Mastromatteo; Michael Benzaquen
  16. The Hitchhiker's guide to markup estimation By Basile Grassi; Giovanni Morzenti; Maarten de Ridder

  1. By: Yu Bai; Massimiliano Marcellino; George Kapetanios
    Abstract: Large heterogeneous panel data models are extended to the setting where the heterogeneous coefficients change over time and the regressors are endogenous. A kernel-based non-parametric time-varying parameter instrumental variable mean group (TVP-IV-MG) estimator is proposed for the time-varying cross-sectional mean coefficients. Uniform consistency is shown and the pointwise asymptotic normality of the proposed estimator is derived. A data-driven bandwidth selection procedure is also proposed. The finite sample performance of the proposed estimator is investigated through a Monte Carlo study and an empirical application to a multi-country Phillips curve with time-varying parameters.
    Keywords: large heterogeneous panels, non-parametric methods, time-varying parameters, mean group estimator
    JEL: C14 C26 C51
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:msh:ebswps:2023-13&r=ecm
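As a stylized illustration of kernel-based estimation of a time-varying coefficient, the following numpy sketch pools a local-constant kernel estimate over a panel. It simplifies the paper's setting heavily: the coefficient is common across units rather than heterogeneous, and the regressor is exogenous, so no instrumenting is needed. The function `kernel_beta` and the Gaussian kernel choice are illustrative assumptions, not the authors' TVP-IV-MG estimator.

```python
import numpy as np

rng = np.random.default_rng(6)
N, T = 200, 100
t_grid = np.arange(1, T + 1) / T
beta_t = 1.0 + t_grid                      # time-varying coefficient (common across units here)

x = rng.normal(0, 1, (N, T))
y = beta_t * x + rng.normal(0, 1, (N, T))  # exogenous regressor for this sketch

def kernel_beta(s, h):
    """Local-constant kernel estimate of the coefficient at time s."""
    k = np.exp(-0.5 * ((t_grid - s) / h) ** 2)   # Gaussian kernel weights over time
    w = k / k.sum()
    # Pool the cross-section at each date, then kernel-weight over time.
    num = np.sum(w * np.mean(x * y, axis=0))
    den = np.sum(w * np.mean(x * x, axis=0))
    return num / den

b_mid = kernel_beta(0.5, h=0.1)
print(b_mid)   # close to beta(0.5) = 1.5
```

Averaging unit-specific kernel estimates (the mean group step) and replacing the pooled moments with instrumented ones would bring the sketch closer to the paper's estimator.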
  2. By: Masahiro Kato; Masaaki Imaizumi
    Abstract: In causal inference with two treatments, Conditional Average Treatment Effects (CATEs) play an important role as a quantity representing an individualized causal effect, defined as the difference between the expected outcomes of the two treatments conditioned on covariates. This study assumes a linear regression model between potential outcomes and covariates for each of the two treatments and defines the CATE as the difference between the two linear regression models. We then propose a method for consistently estimating CATEs even under high-dimensional and non-sparse parameters. We demonstrate that desirable theoretical properties, such as consistency, remain attainable without explicitly assuming sparsity if we impose a weaker assumption, called implicit sparsity, that originates from the definition of CATEs. Under this assumption, the parameters of the linear models for the potential outcomes can be divided into treatment-specific and common parameters, where the treatment-specific parameters take different values across the two regression models while the common parameters are identical. Thus, in the difference between the two linear regression models, the common parameters cancel, leaving only the differences in the treatment-specific parameters; the non-zero parameters of the CATE therefore correspond to these differences. Leveraging this assumption, we develop a Lasso regression method specialized for CATE estimation and show that the estimator is consistent. Finally, we confirm the soundness of the proposed method through simulation studies.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.16819&r=ecm
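The implicit-sparsity mechanism is easy to see in a stylized simulation: regress the outcome on covariates and treatment-covariate interactions, and penalize only the interaction block, whose coefficients equal the sparse CATE parameters even though the common parameters are dense. The proximal-gradient (ISTA) solver and all tuning constants below are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 20

# Potential-outcome models share dense common parameters c; only the
# treatment-specific difference delta is sparse ("implicit sparsity").
c = rng.normal(0, 1, p)
delta = np.zeros(p)
delta[0], delta[3] = 2.0, -1.5

X = rng.normal(0, 1, (n, p))
T = rng.integers(0, 2, n)
y = X @ c + T * (X @ delta) + rng.normal(0, 0.5, n)

# Design [X, T*X]: the interaction coefficients are exactly delta.
Z = np.hstack([X, T[:, None] * X])

def partial_lasso(Z, y, lam, n_pen, iters=500):
    """Proximal-gradient (ISTA) Lasso shrinking only the last n_pen coords."""
    n_obs, d = Z.shape
    step = n_obs / np.linalg.norm(Z, 2) ** 2   # 1 / Lipschitz constant of the gradient
    b = np.zeros(d)
    for _ in range(iters):
        b = b - step * Z.T @ (Z @ b - y) / n_obs
        tail = b[-n_pen:]
        b[-n_pen:] = np.sign(tail) * np.maximum(np.abs(tail) - lam * step, 0)
    return b

b_hat = partial_lasso(Z, y, lam=0.05, n_pen=p)
delta_hat = b_hat[p:]
print(np.round(delta_hat[:5], 2))  # large entries only at indices 0 and 3
```

The dense common parameters are left unpenalized; the l1 penalty bites only where the two regression models actually differ, which is the sparsity the CATE definition induces.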
  3. By: Matsushita, Yukitoshi; Otsu, Taisuke
    Abstract: This article develops a concept of nonparametric likelihood for network data based on network moments, and proposes general inference methods by adapting the theory of jackknife empirical likelihood. Our methodology can be used not only to conduct inference on population network moments and parameters in network formation models, but also to implement goodness-of-fit testing, such as testing block size for stochastic block models. Theoretically we show that the jackknife empirical likelihood statistic for acyclic or cyclic subgraph moments loses its asymptotic pivotalness in severely or moderately sparse cases, respectively, and develop a modified statistic to recover pivotalness in such cases. The main advantage of our modified jackknife empirical likelihood method is its validity under weaker sparsity conditions than existing methods although it is computationally more demanding than the unmodified version. Supplementary materials for this article are available online.
    Keywords: Bootstrap/resampling; Goodness-of-fit methods; Nonparametric methods
    JEL: C1
    Date: 2023–08–23
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:119936&r=ecm
  4. By: Zhang, Junyi; Dassios, Angelos
    Abstract: In this paper, we construct an approximation to the Pitman–Yor process by truncating its two-parameter Poisson–Dirichlet representation. The truncation is based on a decreasing sequence of random weights, thus having a lower approximation error compared to the popular truncated stick-breaking process. We develop an exact simulation algorithm to sample from the approximation process and provide an alternative MCMC algorithm for the parameter regime where the exact simulation algorithm becomes slow. The effectiveness of the simulation algorithms is demonstrated by the estimation of the functionals of a Pitman–Yor process. Then we adapt the approximation process into a Pitman–Yor process mixture model and devise a blocked Gibbs sampler for posterior inference.
    Keywords: Bayesian non-parametric statistics; Markov chain Monte Carlo; mixture model; Pitman-Yor process; two-parameter Poisson-Dirichlet distribution
    JEL: C1
    Date: 2023–09–28
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:120294&r=ecm
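For reference, the standard truncated stick-breaking construction that the paper's decreasing-weight approximation is compared against can be sketched in a few lines. This is a generic Pitman-Yor truncation, not the authors' algorithm.

```python
import numpy as np

def truncated_stick_breaking(theta, d, K, rng):
    """Truncated stick-breaking weights for a Pitman-Yor(theta, d) process.

    V_i ~ Beta(1 - d, theta + i*d); w_i = V_i * prod_{j<i} (1 - V_j).
    The remaining mass 1 - sum(w) is the truncation error.
    """
    i = np.arange(1, K + 1)
    V = rng.beta(1 - d, theta + i * d)
    w = V * np.concatenate(([1.0], np.cumprod(1 - V[:-1])))
    return w

rng = np.random.default_rng(1)
w = truncated_stick_breaking(theta=1.0, d=0.25, K=200, rng=rng)
print(w.sum())   # truncation error 1 - sum(w) is small for large K
```

Stick-breaking weights are decreasing only in expectation, not pathwise, which is why a truncation built on a genuinely decreasing sequence of weights, as in the paper, can achieve a lower approximation error at the same truncation level K.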
  5. By: Jorge A. Rivero
    Abstract: This paper extends the linear grouped fixed effects (GFE) panel model to allow for heteroskedasticity driven by a discrete latent group variable. Key features of GFE are preserved: individuals belong to one of a finite number of groups, and group membership is unrestricted and estimated. Ignoring group heteroskedasticity may lead to poor classification, which worsens the finite-sample bias and standard errors of the estimators. I introduce the "weighted grouped fixed effects" (WGFE) estimator, which minimizes a weighted average of group sums of squared residuals. I establish $\sqrt{NT}$-consistency and normality under a concept of group separation based on second moments. A test of group homoskedasticity is discussed, and a fast computation procedure is provided. Simulations show that WGFE outperforms alternatives that exclude second moment information. I demonstrate this approach by examining the link between income and democracy and the effect of unionization on earnings.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.14068&r=ecm
  6. By: Yihui He; Fang Han
    Abstract: This paper reexamines Abadie and Imbens (2016)'s work on propensity score matching for average treatment effect estimation. We explore the asymptotic behavior of these estimators when the number of nearest neighbors, $M$, grows with the sample size. It is shown, hardly surprising but technically nontrivial, that the modified estimators can improve upon the original fixed-$M$ estimators in terms of efficiency. Additionally, we demonstrate the potential to attain the semiparametric efficiency lower bound when the propensity score achieves "sufficient" dimension reduction, echoing Hahn (1998)'s insight about the role of dimension reduction in propensity score-based causal inference.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.14142&r=ecm
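A minimal matching sketch shows the estimator being studied: impute each unit's missing potential outcome by averaging its M nearest opposite-arm neighbors on the propensity score. For simplicity the sketch matches on the true score, whereas Abadie and Imbens (2016) and this paper analyze matching on an estimated score; the design and constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau = 2000, 1.0

# Confounded design: treatment probability and outcome both depend on x.
x = rng.normal(0, 1, n)
ps = 1 / (1 + np.exp(-x))                 # true propensity score
T = (rng.uniform(size=n) < ps).astype(int)
y = x + tau * T + rng.normal(0, 0.3, n)

def matching_ate(score, T, y, M):
    """ATE by M-nearest-neighbor matching on a scalar score.

    Each unit's missing potential outcome is imputed by the mean outcome
    of its M closest neighbors in the opposite treatment arm.
    """
    imputed = np.empty(len(y))
    for arm in (0, 1):
        donors = np.flatnonzero(T != arm)       # opposite-arm pool
        for i in np.flatnonzero(T == arm):
            nn = donors[np.argsort(np.abs(score[donors] - score[i]))[:M]]
            imputed[i] = y[nn].mean()
    y1 = np.where(T == 1, y, imputed)
    y0 = np.where(T == 0, y, imputed)
    return (y1 - y0).mean()

est = matching_ate(ps, T, y, M=10)
print(est)   # close to the true effect tau = 1
```

Letting M grow with the sample size, as in the paper, averages over more donors per unit and is what delivers the efficiency gains over fixed-M matching.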
  7. By: Hyungsik Roger Moon; Frank Schorfheide; Boyuan Zhang
    Abstract: We incorporate a version of a spike and slab prior, comprising a point mass at zero ("spike") and a Normal distribution around zero ("slab"), into a dynamic panel data framework to model coefficient heterogeneity. In addition to homogeneity and full heterogeneity, our specification can also capture sparse heterogeneity: a core group of units shares common parameters, while a set of deviators has idiosyncratic parameters. We fit a model with unobserved components to income data from the Panel Study of Income Dynamics. We find evidence of sparse heterogeneity for balanced panels composed of individuals with long employment histories.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.13785&r=ecm
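The spike-and-slab idea can be sketched with a toy two-component calculation: given noisy unit-level coefficient estimates, the posterior probability that a unit belongs to the core ("spike") group follows from the two marginal likelihoods. The one-shot normal-normal update below is a stand-in for the authors' full Bayesian panel treatment; all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, q, v, s2 = 500, 0.8, 1.0, 0.05

# Sparse heterogeneity: a core group shares the common coefficient
# (deviation 0, the "spike"); deviators draw from the "slab" N(0, v).
is_core = rng.uniform(size=n) < q
lam = np.where(is_core, 0.0, rng.normal(0, np.sqrt(v), n))

# Noisy unit-level estimates of the deviations.
b = lam + rng.normal(0, np.sqrt(s2), n)

def norm_pdf(x, var):
    return np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Posterior spike probability: the two mixture components have marginal
# likelihoods N(0, s2) for the spike and N(0, s2 + v) for the slab.
post_core = q * norm_pdf(b, s2) / (q * norm_pdf(b, s2) + (1 - q) * norm_pdf(b, s2 + v))
print((post_core > 0.5).mean())   # share of units classified into the core group
```

Units with estimates close to zero get a high spike probability and are pooled with the core group; units with large deviations are flagged as deviators, which is exactly the core-versus-deviator split the abstract describes.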
  8. By: Chen, Weihao (Tilburg University, Center For Economic Research); Cizek, Pavel (Tilburg University, Center For Economic Research)
    Keywords: bias correction; dynamic panel data models; endogeneity; instrumental variables
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:tiu:tiucen:9bf2c16c-522f-4223-8037-ce88ed351cc3&r=ecm
  9. By: Ghislain Geniaux (ECODEVELOPPEMENT - Unité de recherche d'Écodéveloppement - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement)
    Abstract: Spatially varying coefficient models, like GWR (Brunsdon et al., 1996; McMillen, 1996), are widely used in application domains where it is of interest to consider the spatial heterogeneity of a model's coefficients (housing markets, land use, population ecology, seismology, mining research, ...). In most application areas and disciplines, the continuous increase in spatial data sample sizes, both in terms of volume and richness of explanatory variables, has raised new methodological issues. The two main issues here are the time required to calculate each local coefficient and the memory required to store the n × n hat matrix used for estimating the variance of parameters. To address these two issues, various avenues have been explored (Harris et al., 2010; Pozdnoukhov and Kaiser, 2011; Tran et al., 2016; Geniaux and Martinetti, 2018; Li et al., 2019; Murakami et al., 2020). The use of a subset of target points for local regressions, which dates from the first explorations of weighted local regressions (Cleveland and Devlin, 1988; Loader, 1999), has been widely studied in the field of nonparametric econometrics. It has been little explored for varying coefficient models, and in particular for GWR, where a single 2D kernel is used for all the coefficients. McMillen (2012) and McMillen and Soppelsa (2015) used a direct transposition of Loader's (1999) proposal in the context of GWR for selecting target points. In this paper, we propose an original two-stage method: we select a set of target points based on a spatial smooth of the residuals of a first-stage regression and perform a GWR only on this subsample; in a second stage, we use a spatial smoothing method to extrapolate the remaining GWR coefficients. In addition to using an effective sample of target points, we explore the computational gain provided by using a rough Gaussian kernel. Monte Carlo experiments show that this way of selecting target points outperforms selection based on point density or random selection. Simulation results also show that using target points can even reduce the bias and RMSE of the β coefficients compared to classic GWR, by allowing a more accurate bandwidth size to be selected. Our best estimator appears to be scalable under two conditions: the use of a ratio of target points that provides a satisfactory approximation of the coefficients (10 to 20% of locations) and an optimal bandwidth that remains within a reasonable neighborhood.
    Date: 2022–05–19
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-04229918&r=ecm
  10. By: Guilherme Araújo Lima (UFMG); Igor Viveiros Melo Souza (UFMG); Mauro Sayar Ferreira (UFMG)
    Abstract: We conduct Monte Carlo experiments to evaluate the performance of different Difference-in-Differences estimators under treatment assignment mechanisms affected by shocks suffered by treated units, and in contexts where the treatment effect spills over to units in the control group. In particular, we compare the estimators proposed by Callaway and Sant'Anna (2021), Borusyak et al. (2021), and Sun and Abraham (2021), as well as the two-way fixed effects (TWFE) estimator. The results demonstrate that the treatment assignment mechanisms we design and the presence of spillover effects can severely compromise the performance of the considered estimators, leading to bias and, even more importantly, inconsistency. Caution should therefore be exercised when interpreting results in applications where the environment resembles those we consider. The development of more robust estimators is both a necessity and a promising research avenue.
    Keywords: Difference-in-Differences; Causal Inference; Treatment assignment mechanisms; Spillover effects.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:cdp:texdis:td662&r=ecm
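A tiny Monte Carlo makes the spillover problem concrete: when treatment leaks to the control group, the canonical 2x2 DiD estimator converges to the treatment effect net of the spillover. This toy design is illustrative and is not one of the authors' experiments.

```python
import numpy as np

rng = np.random.default_rng(4)
n, tau, spill, trend = 1000, 2.0, 0.5, 1.0

# Two groups, two periods; treatment raises treated outcomes by tau in
# the post period, but also spills over to controls by `spill`.
def outcomes(effect):
    pre = rng.normal(0, 1, n)
    post = rng.normal(0, 1, n) + trend + effect
    return pre, post

t_pre, t_post = outcomes(tau)      # treated group
c_pre, c_post = outcomes(spill)    # control group contaminated by spillover

# Canonical 2x2 difference-in-differences estimator.
did = (t_post.mean() - t_pre.mean()) - (c_post.mean() - c_pre.mean())
print(did)   # converges to tau - spill = 1.5, not the true effect tau = 2
```

The control group no longer identifies the untreated counterfactual trend, so the bias does not vanish as n grows; this is the inconsistency the abstract warns about.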
  11. By: Park, Jung Yeon; Wall, Melanie M; Moustaki, Irini; Grossman, Arnold
    Abstract: The paper proposes a joint mixture model to model non-ignorable drop-out in longitudinal cohort studies of mental health outcomes. The model combines a (non)-linear growth curve model for the time-dependent outcomes and a discrete-time survival model for the drop-out with random effects shared by the two sub-models. The mixture part of the model takes into account population heterogeneity by accounting for latent subgroups of the shared effects that may lead to different patterns for the growth and the drop-out tendency. A simulation study shows that the joint mixture model provides greater precision in estimating the average slope and covariance matrix of random effects. We illustrate its benefits with data from a longitudinal cohort study that characterizes depression symptoms over time yet is hindered by non-trivial participant drop-out.
    Keywords: latent growth curve; MNAR drop-out; survival analysis; finite mixture model; mental health
    JEL: C1
    Date: 2022–10–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:110867&r=ecm
  12. By: Julian Granna; Stefan Lang
    Abstract: Modeling real estate prices in the context of hedonic models typically involves fitting a Generalized Additive Model, where only the mean of a (lognormal) distribution is regressed on a set of variables, without taking into account other parameters of the distribution. Thus far, applying regression models that capture the full conditional distribution of prices has been infeasible for large data sets, even on powerful machines. Moreover, accounting for heterogeneity of effects across time and location is often achieved by naive stratification of the data rather than on a model basis. We apply a novel batchwise backfitting algorithm in the context of a structured additive regression model that enables us to efficiently model all distributional parameters of an appropriate distribution. Using a large German dataset of rental prices comprising over a million observations, we choose variables relevant for modeling the location and scale parameters using a boosting variant of the algorithm. Moreover, we identify heterogeneity of covariates' effects on the parameters with respect to both time and location on a model basis. In this way, we allow the influence of variables on the price distribution to vary with the dwelling's location and the date of sale. Modeling the full distribution of prices further enables us to investigate the influence of the variables not only on the median, but also on other quantiles of rental prices.
    Keywords: distributional regression; Hedonic regression; parameter instability
    JEL: R3
    Date: 2023–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2023_130&r=ecm
  13. By: Waverly Wei; Xinwei Ma; Jingshen Wang
    Abstract: Randomized experiments have been the gold standard for assessing the effectiveness of a treatment or policy. The classical complete randomization approach assigns treatments based on a prespecified probability and may lead to inefficient use of data. Adaptive experiments improve upon complete randomization by sequentially learning and updating treatment assignment probabilities. However, their application can also raise fairness and equity concerns, as assignment probabilities may vary drastically across groups of participants. Furthermore, when treatment is expected to be extremely beneficial to certain groups of participants, it is more appropriate to expose many of these participants to favorable treatment. In response to these challenges, we propose a fair adaptive experiment strategy that simultaneously enhances data use efficiency, achieves an envy-free treatment assignment guarantee, and improves the overall welfare of participants. An important feature of our proposed strategy is that we do not impose parametric modeling assumptions on the outcome variables, making it more versatile and applicable to a wider array of applications. Through our theoretical investigation, we characterize the convergence rate of the estimated treatment effects and the associated standard deviations at the group level and further prove that our adaptive treatment assignment algorithm, despite not having a closed-form expression, approaches the optimal allocation rule asymptotically. Our proof strategy takes into account the fact that the allocation decisions in our design depend on sequentially accumulated data, which poses a significant challenge in characterizing the properties and conducting statistical inference of our method. We further provide simulation evidence to showcase the performance of our fair adaptive experiment strategy.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.16290&r=ecm
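A simple batched scheme conveys the flavor of adaptive allocation: steer assignment probabilities toward the Neyman allocation (proportional to outcome standard deviations) while clipping them away from 0 and 1, a crude stand-in for the paper's fairness guarantees. The specific update rule and constants below are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)
sd1, sd0 = 2.0, 1.0            # outcome noise in treatment / control
floor = 0.1                    # floor keeping both arms in play for everyone

p = 0.5                        # start from complete randomization
y1, y0 = [], []
for batch in range(5):
    T = rng.uniform(size=1000) < p
    y1 += list(rng.normal(1.0, sd1, T.sum()))
    y0 += list(rng.normal(0.0, sd0, (~T).sum()))
    # Neyman allocation: assign proportionally to outcome standard
    # deviations, clipped so neither arm is starved of observations.
    s1, s0 = np.std(y1), np.std(y0)
    p = np.clip(s1 / (s1 + s0), floor, 1 - floor)

print(p)   # approaches sd1/(sd1+sd0) = 2/3
```

Running such an update separately within participant groups, with constraints linking the group-level probabilities, is roughly where the fairness and envy-freeness considerations of the paper enter.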
  14. By: Jozef Barunik; Lubos Hanus
    Abstract: We propose a novel machine learning approach to probabilistic forecasting of hourly intraday electricity prices. In contrast to recent advances in data-rich probabilistic forecasting that approximate the distributions with some features such as moments, our method is non-parametric and selects the best distribution from all possible empirical distributions learned from the data. The model we propose is a multiple output neural network with a monotonicity adjusting penalty. Such a distributional neural network can learn complex patterns in electricity prices from data-rich environments and it outperforms state-of-the-art benchmarks.
    Date: 2023–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2310.02867&r=ecm
  15. By: Mehdi Tomas; Iacopo Mastromatteo (SISSA / ISAS - Scuola Internazionale Superiore di Studi Avanzati / International School for Advanced Studies); Michael Benzaquen (LadHyX - Laboratoire d'hydrodynamique - X - École polytechnique - CNRS - Centre National de la Recherche Scientifique)
    Abstract: Cross-impact, namely the fact that on average buy (sell) trades on a financial instrument induce positive (negative) price changes in other correlated assets, can be measured from abundant, although noisy, market data. In this paper we propose a principled approach that allows one to perform model selection for cross-impact models, showing that symmetries and consistency requirements are particularly effective in reducing the universe of possible models to a much smaller set of viable candidates, thus mitigating the effect of noise on the properties of the inferred model. We review the empirical performance of a large number of cross-impact models, comparing their strengths and weaknesses on a number of asset classes (futures, stocks, calendar spreads). Besides showing which models perform better, we argue that in the presence of comparable statistical performance, which is often the case in a noisy world, it is relevant to favor models that provide ex-ante theoretical guarantees on their behavior in limit cases. From this perspective, we advocate that the empirical validation of universal properties (symmetries, invariances) should be regarded as holding a much deeper epistemological value than any measure of statistical performance on specific model instances.
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-02567489&r=ecm
  16. By: Basile Grassi; Giovanni Morzenti; Maarten de Ridder
    Abstract: Is it feasible to estimate firm-level markups with commonly available datasets? Common methods to measure markups hinge on a production function estimation, but most datasets do not contain data on the quantity that firms produce. We use a tractable analytical framework, simulation from a quantitative model, and firm-level administrative production and pricing data to study the biases in markup estimates that may arise as a result. While the level of markup estimates from revenue data is biased, these estimates do correlate highly with true markups. They also display similar correlations with variables such as profitability and market share in our data. Finally, we show that imposing a Cobb-Douglas production function or simplifying the production function estimation may reduce the informativeness of markup estimates.
    Keywords: Macroeconomics, Production Functions, Markups, Competition
    Date: 2022–12–20
    URL: http://d.repec.org/n?u=RePEc:cep:poidwp:063&r=ecm

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.