nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒08‒14
nineteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Hybrid unadjusted Langevin methods for high-dimensional latent variable models By Ruben Loaiza-Maya; Didier Nibbering; Dan Zhu
  2. Panel Data Models with Time-Varying Latent Group Structures By Yiren Wang; Peter C. B. Phillips; Liangjun Su
  3. New asymptotics applied to functional coefficient regression and climate sensitivity analysis By Qiying Wang; Peter C. B. Phillips; Ying Wang
  4. Demand Estimation with Infrequent Purchases and Small Market Sizes By Ali Hortacsu; Olivia R. Natan; Hayden Parsley; Timothy Schwieg; Kevin R. Williams
  5. Adaptive Principal Component Regression with Applications to Panel Data By Anish Agarwal; Keegan Harris; Justin Whitehouse; Zhiwei Steven Wu
  6. Estimating the price elasticity of gasoline demand in correlated random coefficient models with endogeneity By Michael Bates; Seolah Kim
  7. Generalised Covariances and Correlations By Tobias Fissler; Marc-Oliver Pohle
  8. A Double Machine Learning Approach to Combining Experimental and Observational Data By Marco Morucci; Vittorio Orlandi; Harsh Parikh; Sudeepa Roy; Cynthia Rudin; Alexander Volfovsky
  9. Does regional variation in wage levels identify the effects of a national minimum wage? By Daniel Haanwinckel
  10. Local Projections for Applied Economics By Òscar Jordà
  11. Getting the Right Tail Right: Modeling Tails of Health Expenditure Distributions By Martin Karlsson; Yulong Wang; Nicolas R. Ziebarth
  12. The contribution of realized covariance models to the economic value of volatility timing By Bauwens, Luc; Xu, Yongdeng
  13. Online Learning of Order Flow and Market Impact with Bayesian Change-Point Detection Methods By Ioanna-Yvonni Tsaknaki; Fabrizio Lillo; Piero Mazzarisi
  14. Synthetic Decomposition for Counterfactual Predictions By Nathan Canen; Kyungchul Song
  15. Causality by Vote: Aggregating Evidence on Causal Relations in Economic Growth Processes By Manuel de Mier; Fernando Delbianco; Fernando Tohmé; Luisina Patrizio; Facundo Rodriguez; Mauro Romero Stéfani
  16. Measuring Cause-Effect with the Variability of the Largest Eigenvalue By Alejandro Rodriguez Dominguez; Irving Ramirez Carrillo; David Parraga Riquelme
  17. Asymptotics for the Generalized Autoregressive Conditional Duration Model By Giuseppe Cavaliere; Thomas Mikosch; Anders Rahbek; Frederik Vilandt
  18. Impulse Response Analysis at the Zero Lower Bound By Luca Benati; Thomas A. Lubik
  19. Systemic Tail Risk: High-Frequency Measurement, Evidence and Implications By Deniz Erdemlioglu; Christopher J. Neely; Xiye Yang

  1. By: Ruben Loaiza-Maya; Didier Nibbering; Dan Zhu
    Abstract: The exact estimation of latent variable models with big data is known to be challenging. The latents have to be integrated out numerically, and the dimension of the latent variables increases with the sample size. This paper develops a novel approximate Bayesian method based on the Langevin diffusion process. The method employs the Fisher identity to integrate out the latent variables, which makes it accurate and computationally feasible when applied to big data. In contrast to other approximate estimation methods, it does not require the choice of a parametric distribution for the unknowns, which often leads to inaccuracies. In an empirical discrete choice example with a million observations, the proposed method accurately estimates the posterior choice probabilities using only 2% of the computation time of exact MCMC.
    Date: 2023–06
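The Langevin diffusion at the core of the method can be illustrated with a textbook unadjusted Langevin sampler on a toy Gaussian target (a minimal sketch of the building block only, with hypothetical step size; the paper's hybrid scheme for latent variables is substantially more involved):

```python
import numpy as np

rng = np.random.default_rng(7)

def unadjusted_langevin(grad_log_post, theta0, step, n_iter):
    """Unadjusted Langevin algorithm: discretised Langevin diffusion
    theta' = theta + step * grad log pi(theta) + sqrt(2 * step) * noise,
    with no Metropolis accept/reject correction (hence 'unadjusted')."""
    theta = np.asarray(theta0, dtype=float)
    draws = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        theta = (theta + step * grad_log_post(theta)
                 + np.sqrt(2 * step) * rng.normal(size=theta.size))
        draws[i] = theta
    return draws

# Toy target: standard normal posterior, so grad log pi(theta) = -theta.
draws = unadjusted_langevin(lambda th: -th, theta0=[3.0], step=0.05, n_iter=50_000)
# For a small step size, the chain's stationary distribution is close to N(0, 1).
```

The discretisation bias of the unadjusted chain shrinks with the step size, which is what makes the "approximate" label in the abstract precise.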
  2. By: Yiren Wang (Singapore Management University); Peter C. B. Phillips (Cowles Foundation, Yale University); Liangjun Su (Tsinghua University)
    Abstract: This paper considers a linear panel model with interactive fixed effects and unobserved individual and time heterogeneities that are captured by some latent group structures and an unknown structural break, respectively. To enhance realism the model may have different numbers of groups and/or different group memberships before and after the break. Following a preliminary nuclear-norm-regularized estimation and row- and column-wise linear regressions, we estimate the break point based on the idea of binary segmentation and, simultaneously, the latent group structures together with the number of groups before and after the break via a sequential testing K-means algorithm. It is shown that the break point, the number of groups and the group memberships can each be estimated correctly with probability approaching one. Asymptotic distributions of the estimators of the slope coefficients are established. Monte Carlo simulations demonstrate excellent finite sample performance for the proposed estimation algorithm. An empirical application to real house price data across 377 Metropolitan Statistical Areas in the US from 1975 to 2014 suggests the presence both of structural breaks and of changes in group membership.
    Date: 2023–06
  3. By: Qiying Wang (University of Sydney); Peter C. B. Phillips (Cowles Foundation, Yale University); Ying Wang (Renmin University of China)
    Abstract: A general asymptotic theory is established for sample cross moments of nonstationary time series, allowing for long range dependence and local unit roots. The theory provides a substantial extension of earlier results on nonparametric regression that include near-cointegrated nonparametric regression as well as spurious nonparametric regression. Many new models are covered by the limit theory, among which are functional coefficient regressions in which both regressors and the functional covariate are nonstationary. Simulations show finite sample performance matching well with the asymptotic theory and having broad relevance to applications, while revealing how dual nonstationarity in regressors and covariates raises sensitivity to bandwidth choice and the impact of dimensionality in nonparametric regression.
    Date: 2023–06
  4. By: Ali Hortacsu (University of Chicago and NBER); Olivia R. Natan (University of California, Berkeley); Hayden Parsley (University of Texas, Austin); Timothy Schwieg (University of Chicago, Booth); Kevin R. Williams (Cowles Foundation, Yale University)
    Abstract: We propose a demand estimation method that allows for a large number of zero sale observations, rich unobserved heterogeneity, and endogenous prices. We do so by modeling small market sizes through Poisson arrivals. Each of these arriving consumers solves a standard discrete choice problem. We present a Bayesian IV estimation approach that addresses sampling error in product shares and scales well to rich data environments. The data requirements are traditional market-level data as well as a measure of market sizes or consumer arrivals. After presenting simulation studies, we demonstrate the method in an empirical application of air travel demand.
    Keywords: Discrete Choice Modeling, Demand Estimation, Zero-Sale Observations, Bayesian Methods, Airline Markets.
    JEL: C11 C18 L93
    Date: 2021–11
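The role of Poisson consumer arrivals in generating zero sales can be illustrated with a small simulation (hypothetical parameter values; a plain logit choice stage stands in for the paper's richer unobserved heterogeneity and its Bayesian IV estimator):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_market(arrival_rate, utilities, n_markets):
    """Simulate sales when a Poisson number of consumers arrives per market
    and each arriving consumer solves a logit discrete choice problem
    (including an outside option with utility normalized to 0)."""
    J = len(utilities)
    exp_u = np.exp(np.append(utilities, 0.0))   # last entry: outside good
    probs = exp_u / exp_u.sum()
    sales = np.zeros((n_markets, J), dtype=int)
    for m in range(n_markets):
        n_arrivals = rng.poisson(arrival_rate)  # small market size
        choices = rng.choice(J + 1, size=n_arrivals, p=probs)
        for j in range(J):
            sales[m, j] = np.sum(choices == j)
    return sales

# With few arrivals, many product-market sales are exactly zero even though
# every choice probability is strictly positive.
sales = simulate_market(arrival_rate=3.0, utilities=np.array([-1.0, -2.0]),
                        n_markets=500)
zero_share = np.mean(sales == 0)
print(f"fraction of zero-sale observations: {zero_share:.2f}")
```

Treating such zeros as zero market shares (as share-inversion methods do) is exactly the sampling-error problem the abstract flags.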
  5. By: Anish Agarwal; Keegan Harris; Justin Whitehouse; Zhiwei Steven Wu
    Abstract: Principal component regression (PCR) is a popular technique for fixed-design error-in-variables regression, a generalization of the linear regression setting in which the observed covariates are corrupted with random noise. We provide the first time-uniform finite sample guarantees for online (regularized) PCR whenever data is collected adaptively. Since the proof techniques for analyzing PCR in the fixed design setting do not readily extend to the online setting, our results rely on adapting tools from modern martingale concentration to the error-in-variables setting. As an application of our bounds, we provide a framework for experiment design in panel data settings when interventions are assigned adaptively. Our framework may be thought of as a generalization of the synthetic control and synthetic interventions frameworks, where data is collected via an adaptive intervention assignment policy.
    Date: 2023–07
  6. By: Michael Bates (University of California-Riverside); Seolah Kim (Albion College)
    Abstract: We propose a per-cluster instrumental-variables approach (PCIV) for estimating correlated random coefficient models in the presence of contemporaneous endogeneity and two-way fixed effects. We use variation across clusters to estimate coefficients with homogeneous slopes (such as time effects) and within-cluster variation to estimate the cluster-specific heterogeneity directly. We then aggregate them to population averages. We demonstrate consistency, showing robustness over standard estimators, and provide analytic standard errors for robust inference. Basic implementation is straightforward using standard software such as Stata. In Monte Carlo simulation, PCIV performs relatively well against pooled 2SLS and fixed-effects IV (FEIV) with a finite number of clusters or finite observations per cluster. We apply PCIV in estimating the price elasticity of gasoline demand using state fuel taxes as instrumental variables. PCIV estimation allows for greater transparency of the underlying data. In our setting, we provide evidence of correlation between heterogeneity in the first and second stages, violating a key assumption underpinning consistency of standard estimators. We see significant divergence in the implicit weighting when applying FEIV from the natural weights applied in PCIV. Overlooking effect heterogeneity with standard estimators is consequential. Our estimated distribution of elasticities reveals significant heterogeneity and meaningful differences in estimated averages.
    Date: 2023–06–15
  7. By: Tobias Fissler; Marc-Oliver Pohle
    Abstract: The covariance of two random variables measures the average joint deviations from their respective means. We generalise this well-known measure by replacing the means with other statistical functionals such as quantiles, expectiles, or thresholds. Deviations from these functionals are defined via generalised errors, often induced by identification or moment functions. As a normalised measure of dependence, a generalised correlation is constructed. Replacing the common Cauchy-Schwarz normalisation by a novel Fréchet-Hoeffding normalisation, we obtain attainability of the entire interval [-1, 1] for any given marginals. We uncover favourable properties of these new dependence measures. The families of quantile and threshold correlations give rise to function-valued distributional correlations, exhibiting the entire dependence structure. They lead to tail correlations, which should arguably supersede the coefficients of tail dependence. Finally, we construct summary covariances (correlations), which arise as (normalised) weighted averages of distributional covariances. We retrieve Pearson covariance and Spearman correlation as special cases. The applicability and usefulness of our new dependence measures is illustrated on demographic data from the Panel Study of Income Dynamics.
    Date: 2023–07
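One way to read the quantile member of this family is as the covariance of the quantile identification errors, with a Fréchet-Hoeffding normalisation (our reading of the construction with hypothetical level choices, not necessarily the authors' exact definitions):

```python
import numpy as np

rng = np.random.default_rng(2)

def quantile_correlation(x, y, alpha, beta):
    """Sample quantile correlation at levels (alpha, beta): the covariance
    of the quantile identification errors (1{x <= q_alpha} - alpha) and
    (1{y <= q_beta} - beta), normalised by its Frechet-Hoeffding bounds so
    that the whole interval [-1, 1] is attainable for any marginals."""
    u = (x <= np.quantile(x, alpha)).astype(float) - alpha
    v = (y <= np.quantile(y, beta)).astype(float) - beta
    cov = np.mean(u * v)
    upper = min(alpha, beta) - alpha * beta              # comonotone bound
    lower = max(alpha + beta - 1.0, 0.0) - alpha * beta  # countermonotone bound
    return cov / upper if cov >= 0 else -cov / lower

# Comonotone data attains +1 at every pair of levels, including deep in the
# tail; independent data is near 0.
x = rng.normal(size=100_000)
tail = quantile_correlation(x, np.exp(x), 0.1, 0.1)       # left-tail dependence
indep = quantile_correlation(x, rng.normal(size=100_000), 0.5, 0.5)
print(tail, indep)
```

Varying `alpha` and `beta` over (0, 1) traces out the function-valued distributional correlation the abstract describes.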
  8. By: Marco Morucci; Vittorio Orlandi; Harsh Parikh; Sudeepa Roy; Cynthia Rudin; Alexander Volfovsky
    Abstract: Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only one assumption is violated, we provide semi-parametrically efficient treatment effect estimators. However, our no-free-lunch theorem highlights the necessity of accurately identifying the violated assumption for consistent treatment effect estimation. We demonstrate the applicability of our approach in three real-world case studies, highlighting its relevance for practical settings.
    Date: 2023–07
  9. By: Daniel Haanwinckel
    Abstract: This paper examines the identification assumptions underlying estimators of the causal effects of national minimum wages on employment and wages, such as the "fraction affected" and "effective minimum wage" designs. Specifically, I conduct a series of simulation exercises to investigate whether these assumptions hold in the context of particular economic models used as data-generating processes. I find that, in many cases, the fraction affected design exhibits small biases that lead to inflated rejection rates of the true causal effect. These biases can be larger in the presence of either trends in the dispersion of wages within regions, equilibrium responses to the minimum wage, or violations of the parallel trends assumption. I propose two diagnostic exercises to complement the standard test for differential pre-trends commonly used to validate this design. For the effective minimum wage design, I show that while the identification assumptions emphasized by Lee (1999) are crucial, they are not sufficient for unbiased estimation. Under various economically plausible scenarios, estimators within this framework can exhibit significant biases that are difficult to diagnose through specification tests.
    Date: 2023–07
  10. By: Òscar Jordà
    Abstract: The dynamic causal effect of an intervention on an outcome is of paramount interest to applied macro- and micro-economics research. However, this question has been generally approached differently by the two literatures. In making the transition from traditional time series methods to applied microeconometrics, local projections can serve as a natural bridge. Local projections can translate the familiar language of vector autoregressions (VARs) and impulse responses into the language of potential outcomes and treatment effects. There are gains to be made by both literatures from greater integration of well established methods in each. This review shows how to make these connections and points to potential areas of further research.
    Keywords: local projections; vector autoregressions; panel data; potential outcomes
    Date: 2023–07–14
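The basic local projection recipe, one regression per horizon, can be sketched on simulated AR(1) data (a minimal version with an observed shock and no controls or HAC standard errors):

```python
import numpy as np

rng = np.random.default_rng(3)

def local_projection_irf(y, shock, horizons):
    """Estimate the impulse response of y to an observed shock by local
    projections: one OLS regression of y_{t+h} on shock_t per horizon h."""
    irf = []
    for h in range(horizons + 1):
        lhs = y[h:]
        rhs = shock[: len(y) - h]
        X = np.column_stack([np.ones_like(rhs), rhs])
        b = np.linalg.lstsq(X, lhs, rcond=None)[0]
        irf.append(b[1])
    return np.array(irf)

# AR(1) data: the true impulse response to a unit shock is rho**h.
rho, T = 0.8, 20_000
eps = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + eps[t]
irf = local_projection_irf(y, eps, horizons=5)
# irf should be close to [1, 0.8, 0.64, 0.512, ...]
```

Because each horizon is a separate regression, the same code reads naturally as a sequence of treatment-effect regressions, which is the bridge to potential outcomes the review emphasises.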
  11. By: Martin Karlsson; Yulong Wang; Nicolas R. Ziebarth
    Abstract: Health expenditure data almost always include extreme values. Such heavy tails can be a threat to the commonly adopted least squares methods. To accommodate extreme values, we propose the use of an estimation method that recovers the often ignored right tail of health expenditure distributions. We apply the proposed method to a claims dataset from one of the biggest German private health insurers and find that the age gradient in health care spending differs substantially from the standard least squares method. Finally, we extend the popular two-part model and develop a novel three-part model.
    JEL: C10 C13 I10 I13
    Date: 2023–07
  12. By: Bauwens, Luc (Université catholique de Louvain, CORE, Belgium); Xu, Yongdeng (Cardiff Business School)
    Abstract: Realized covariance models specify the conditional expectation of a realized covariance matrix as a function of past realized covariance matrices through a GARCH-type structure. We compare the forecasting performance of several such models in terms of economic value, measured through economic loss functions, on two datasets. Our empirical results indicate that the (HEAVY-type) models that use realized volatilities yield economic value and significantly surpass the (GARCH) models that use only daily returns for daily and weekly horizons. Among the HEAVY-type models, for a dataset of twenty-nine stocks, those that are specified to capture the heterogeneity of the dynamics of the individual conditional variance processes and to allow these to differ from the correlation processes (namely, DCC-type models) are more beneficial than the models that impose the same dynamics on the variance and covariance processes (namely, BEKK-type models), whereas for the dataset of three assets, the different models perform similarly. Finally, using a directly rescaled intra-day covariance to estimate the full-day covariance provides more economic value than using the overnight returns, as the latter tend to yield noisy estimators of the overnight covariance, impairing their predictive capacity.
    Keywords: volatility timing, realized volatility, high-frequency data, forecasting
    JEL: G11 G17 C32 C58
    Date: 2023–07
  13. By: Ioanna-Yvonni Tsaknaki; Fabrizio Lillo; Piero Mazzarisi
    Abstract: Financial order flow exhibits a remarkable level of persistence, wherein buy (sell) trades are often followed by subsequent buy (sell) trades over extended periods. This persistence can be attributed to the division and gradual execution of large orders. Consequently, distinct order flow regimes might emerge, which can be identified through suitable time series models applied to market data. In this paper, we propose the use of Bayesian online change-point detection (BOCPD) methods to identify regime shifts in real-time and enable online predictions of order flow and market impact. To enhance the effectiveness of our approach, we have developed a novel BOCPD method using a score-driven approach. This method accommodates temporal correlations and time-varying parameters within each regime. Through empirical application to NASDAQ data, we have found that: (i) Our newly proposed model demonstrates superior out-of-sample predictive performance compared to existing models that assume i.i.d. behavior within each regime; (ii) When examining the residuals, our model demonstrates good specification in terms of both distributional assumptions and temporal correlations; (iii) Within a given regime, the price dynamics exhibit a concave relationship with respect to time and volume, mirroring the characteristics of actual large orders; (iv) By incorporating regime information, our model produces more accurate online predictions of order flow and market impact compared to models that do not consider regimes.
    Date: 2023–07
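The standard BOCPD recursion that the paper starts from, in the i.i.d.-within-regime form it improves upon, can be sketched for a Bernoulli stream of trade signs (a minimal Adams-MacKay-style sketch with hypothetical hazard and prior parameters, not the authors' score-driven extension):

```python
import numpy as np

rng = np.random.default_rng(4)

def bocpd_map_runlength(x, hazard=0.01, a0=1.0, b0=1.0):
    """Minimal Bayesian online change-point detection for a Bernoulli
    sequence (e.g. buy=1 / sell=0 trade signs) with a Beta(a0, b0)
    conjugate prior and a constant hazard rate. Returns the most probable
    run length after each observation."""
    r = np.array([1.0])                    # run-length posterior, start at 0
    a, b = np.array([a0]), np.array([b0])  # per-hypothesis Beta parameters
    rl_map = np.empty(len(x), dtype=int)
    for t, obs in enumerate(x):
        # Predictive probability of obs under each run-length hypothesis.
        pred = a / (a + b) if obs == 1 else b / (a + b)
        growth = r * pred * (1 - hazard)   # the current regime continues
        cp = np.sum(r * pred) * hazard     # a new regime starts
        r = np.append(cp, growth)
        r /= r.sum()
        a = np.append(a0, a + obs)         # update sufficient statistics
        b = np.append(b0, b + 1 - obs)
        rl_map[t] = int(np.argmax(r))
    return rl_map

# A persistent buy regime switching to a sell regime at t=200: the MAP run
# length grows with the regime, then collapses shortly after the switch.
x = np.concatenate([rng.binomial(1, 0.9, 200), rng.binomial(1, 0.1, 200)])
rl = bocpd_map_runlength(x)
```

The paper's score-driven variant replaces the i.i.d. Bernoulli likelihood inside each regime with a model allowing temporal correlation and time-varying parameters.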
  14. By: Nathan Canen; Kyungchul Song
    Abstract: Counterfactual predictions are challenging when the policy variable goes beyond its pre-policy support. However, in many cases, information about the policy of interest is available from different ("source") regions where a similar policy has already been implemented. In this paper, we propose a novel method of using such data from source regions to predict a new policy in a target region. Instead of relying on extrapolation of a structural relationship using a parametric specification, we formulate a transferability condition and construct a synthetic outcome-policy relationship such that it is as close as possible to meeting the condition. The synthetic relationship weighs both the similarity in distributions of observables and in structural relationships. We develop a general procedure to construct asymptotic confidence intervals for counterfactual predictions and prove its asymptotic validity. We then apply our proposal to predict average teenage employment in Texas following a counterfactual increase in the minimum wage.
    Date: 2023–07
  15. By: Manuel de Mier (UNS); Fernando Delbianco (UNS/INMABB-CONICET); Fernando Tohmé (UNS/INMABB-CONICET); Luisina Patrizio (UNS); Facundo Rodriguez (UNS); Mauro Romero Stéfani (UNS)
    Abstract: In this paper we investigate the performance of five causality-detection methods and how their results can be aggregated when multiple units are considered in a panel data setting. The aggregation procedure employs voting rules for determining which causal paths are identified for the sample population. Using simulated and real-world panel data, we show the performance of these methods in detecting the correct causal paths in comparison to a benchmark that comprises a standard representation of growth processes as the ground-truth model. We find that the results may be better when only simulated, instead of real-world, data are analyzed. While this may suggest that the methods presented here are currently incapable of detecting causal links, it is plausible that the ground “truth” may incorporate false relations.
    Keywords: Granger causality, Transfer Entropy, Stochastic Causality, LiNGAM, Ground Truth, Economic Growth.
    JEL: C18 C43 O47
    Date: 2023–07
  16. By: Alejandro Rodriguez Dominguez; Irving Ramirez Carrillo; David Parraga Riquelme
    Abstract: We present a method to test and monitor structural relationships between time variables. The distribution of the first eigenvalue of lagged correlation matrices (the Tracy-Widom distribution) is used to test for structural time relationships between variables against the null hypothesis of independence. This distribution describes the asymptotic dynamics of the largest eigenvalue as a function of the lag in lagged correlation matrices. By analyzing the time series of the standard deviation of the greatest eigenvalue for 2×2 correlation matrices at different lags, we can detect deviations from the Tracy-Widom distribution and thereby test for structural relationships between the two time variables. These relationships can be related to causality. We use the standard deviation of the first eigenvalue at different lags as a proxy for testing and monitoring structural causal relationships. The method is applied to analyze causal dependencies between daily monetary flows in a retail brokerage business, allowing us to control for liquidity risks.
    Date: 2023–07
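The basic object of the method, the largest eigenvalue of lagged 2×2 correlation matrices tracked across rolling windows, can be sketched as follows (a simplified illustration with hypothetical lags and window length; the Tracy-Widom test itself is omitted):

```python
import numpy as np

rng = np.random.default_rng(5)

def top_eigenvalue_by_lag(x, y, lags, window=50):
    """For each lag, form rolling 2x2 correlation matrices between y_t and
    x_{t-lag} and summarise the largest eigenvalue across windows by its
    mean and standard deviation. For a 2x2 correlation matrix
    [[1, c], [c, 1]] the largest eigenvalue is simply 1 + |c|."""
    out = {}
    for lag in lags:
        a, b = y[lag:], x[: len(x) - lag]   # pairs (y_{t+lag}, x_t)
        eigs = [1.0 + abs(np.corrcoef(a[s:s + window], b[s:s + window])[0, 1])
                for s in range(0, len(a) - window, window)]
        out[lag] = (float(np.mean(eigs)), float(np.std(eigs)))
    return out

# x drives y with a two-period delay: the largest eigenvalue behaves very
# differently at lag 2 than at unrelated lags.
T = 5000
x = rng.normal(size=T)
y = np.empty(T)
y[:2] = rng.normal(size=2)
y[2:] = 0.8 * x[:-2] + 0.6 * rng.normal(size=T - 2)
stats = top_eigenvalue_by_lag(x, y, lags=[0, 1, 2, 3])
```

At unrelated lags the eigenvalue fluctuations reflect pure sampling noise, which is the baseline the Tracy-Widom distribution formalises; deviations at a particular lag flag a structural relationship.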
  17. By: Giuseppe Cavaliere; Thomas Mikosch; Anders Rahbek; Frederik Vilandt
    Abstract: Engle and Russell (1998, Econometrica, 66:1127–1162) apply results from the GARCH literature to prove consistency and asymptotic normality of the (exponential) QMLE for the generalized autoregressive conditional duration (ACD) model, the so-called ACD(1, 1), under the assumption of strict stationarity and ergodicity. The GARCH results, however, do not account for the fact that the number of durations over a given observation period is random. Thus, in contrast with Engle and Russell (1998), we show that strict stationarity and ergodicity alone are not sufficient for consistency and asymptotic normality, and provide additional sufficient conditions to account for the random number of durations. In particular, we argue that the durations need to satisfy the stronger requirement that they have finite mean.
    Date: 2023–07
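The ACD(1,1) model under discussion is straightforward to simulate, which also illustrates the finite-mean condition the authors highlight (a standard textbook parameterisation with hypothetical coefficient values, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(6)

def simulate_acd(omega, alpha, beta, n, burn=500):
    """Simulate an ACD(1,1): duration x_i = psi_i * eps_i with iid
    unit-mean exponential innovations and conditional mean
    psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1}."""
    psi = omega / (1 - alpha - beta)   # start at the unconditional mean
    x = np.empty(n + burn)
    for i in range(n + burn):
        eps = rng.exponential(1.0)
        x[i] = psi * eps
        psi = omega + alpha * x[i] + beta * psi
    return x[burn:]

# With alpha + beta < 1 the durations have finite unconditional mean
# omega / (1 - alpha - beta), here 0.1 / (1 - 0.9) = 1.0; the sample mean
# should sit close to it.
x = simulate_acd(omega=0.1, alpha=0.1, beta=0.8, n=200_000)
print(x.mean())
```

The finite mean matters because the number of durations observed in a fixed calendar window is random: by renewal-type arguments, that count only behaves regularly when durations have finite mean.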
  18. By: Luca Benati; Thomas A. Lubik
    Abstract: We study whether the response of the economy to structural shocks changes at the zero lower bound. Monte Carlo evidence suggests that VARs have a limited ability to detect changes in impulse response functions at the ZLB compared to the standard environment with positive interest rates. This issue is confounded given the short sample lengths that characterize ZLB episodes. This is especially the case for time-varying parameter VARs, whose estimates are two-sided, and therefore tend to smooth changes across regimes. In contrast, fixed-coefficient VARs estimated by sub-sample exhibit greater power. Pooled estimates from panel VARs for six countries based on (long-run and) sign restrictions detect changes in the IRFs in several instances. This evidence is, however, weaker than it appears. Based on (long-run and) sign restrictions we find that prior and posterior IRFs are often close, so that the concern raised by Baumeister and Hamilton (2015) appears to be relevant. Evidence from a multivariate permanent-transitory decomposition of GDP shocks is markedly sharper. It points towards material changes in the IRFs: at the ZLB the IRFs of GDP and unemployment exhibit more inertia, the response of prices is flatter, and the responses of interest rates are weaker.
    Keywords: Zero Lower Bound; Bayesian VARs; structural VARs; monetary policy; sign restrictions
    JEL: C32 C52
    Date: 2023–06
  19. By: Deniz Erdemlioglu; Christopher J. Neely; Xiye Yang
    Abstract: We develop a new framework to measure market-wide (systemic) tail risk in the cross-section of high-frequency stock returns. We estimate the time-varying jump intensities of asset prices and introduce a testing approach that identifies multi-asset tail risk based on the release times of scheduled news announcements. Using high-frequency data on individual U.S. stocks and sector-specific ETF portfolios, we find that most of the FOMC announcements create systemic left tail risk, but there is no evidence that macro announcements do so. The magnitude of the tail risk induced by Fed news varies over the business cycle, peaks during the global financial crisis and remains high over different phases of unconventional monetary policy. We use our approach to construct a Fed-induced systemic tail risk (STR) indicator. STR helps explain the pre-FOMC announcement drift and significantly increases variance risk premia, particularly for the meetings without press conferences.
    Keywords: time-varying tail risk; high-frequency data; Federal Open Market Committee (FOMC) news; monetary policy announcements; cojumps; systemic risk; jump intensity
    JEL: C12 C14 C22 C32 C58 G12 G14
    Date: 2023–07–20

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at <>. For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.