nep-ecm New Economics Papers
on Econometrics
Issue of 2025–12–22
seventeen papers chosen by
Sune Karlsson, Örebro universitet


  1. Estimation and inference in models with multiple behavioural equilibria By Alexander Mayer; Davide Raggi
  2. Debiased Bayesian Inference for High-dimensional Regression Models By Qihui Chen; Zheng Fang; Ruixuan Liu
  3. Testing Parametric Distribution Family Assumptions via Differences in Differential Entropy By Mittelhammer, Ron; Judge, George; Henry, Miguel
  4. Estimation of Panel Data Models with Nonlinear Factor Structure By Christina Maschmann; Joakim Westerlund
  5. Balancing Weights for Causal Mediation Analysis By Kentaro Kawato
  6. Inference for Batched Adaptive Experiments By Jan Kemper; Davud Rostam-Afschar
  7. High-dimensional Penalized Linear IV Estimation & Inference using BRIDGE and Adaptive LASSO By Eleftheria Kelekidou
  8. Evaluating A/B Testing Methodologies via Sample Splitting: Theory and Practice By Ryan Kessler; James McQueen; Miikka Rokkanen
  9. Learning Time-Varying Correlation Networks with FDR Control via Time-Varying P-values By Bufan Li; Lujia Bai; Weichi Wu
  10. Foundation Priors By Sanjog Misra
  11. Optimal Screening in Experiments with Partial Compliance By Christopher Carter; Adeline Delavande; Mario Fiorini; Peter Siminski; Patrick Vu
  12. A Simulation Study Comparing Handling Missing Data Strategies By Oatley, Scott; Gayle, Vernon Professor; Connelly, Roxanne
  13. Learning from crises: A new class of time-varying parameter VARs with observable adaptation By Nicolas Hardy; Dimitris Korobilis
  14. New Measures for Richer Theories: Some Thoughts and an Example By Orazio Attanasio; Víctor Sancibrián; Federica Ambrosio
  15. Estimation of Industrial Heterogeneity from Maximum Entropy and Zonotopes Using the Enterprise Surveys By Ting-Yen Wang
  16. George Judge's Contributions to Econometrics in Agricultural and Applied Economics By Rausser, Gordon; Villas-Boas, Sofia B.
  17. A Resampling Approach for Causal Inference on Novel Two-Point Time-Series with Application to Identify Risk Factors for Type-2 Diabetes and Cardiovascular Disease By Dai, Xiaowu; Mouti, Saad; do Vale, Marjorie Lima; Ray, Sumantra; Bohn, Jeffrey; Goldberg, Lisa

  1. By: Alexander Mayer; Davide Raggi
    Abstract: We develop estimation and inference methods for a stylized macroeconomic model with potentially multiple behavioural equilibria, where agents form expectations using a constant-gain learning rule. We first show geometric ergodicity of the underlying process and then establish (strong) consistency and asymptotic normality of the nonlinear least squares estimator for the structural parameters. We propose inference procedures for the structural parameters and uniform confidence bands for the equilibria. When equilibrium solutions are repeated, mixed convergence rates and non-standard limit distributions emerge. Monte Carlo simulations and an empirical application illustrate the finite-sample performance of our methods.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.04541
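    Illustrative sketch (not from the paper): a stylized univariate expectations-feedback model with constant-gain learning, simulated and re-estimated by nonlinear least squares; the model, parameter names, and values below are assumptions chosen only to mimic the estimation idea described above.
      # Hypothetical illustration; this is not the authors' model or code.
      import numpy as np
      from scipy.optimize import least_squares

      rng = np.random.default_rng(0)
      T, c, delta, gamma = 800, 1.0, 0.6, 0.05   # assumed "true" structural parameters

      # Simulate: agents update their forecast a_t of y_t by constant-gain learning,
      # and the realized outcome feeds back on that forecast.
      y, a, shocks = np.zeros(T), np.zeros(T), rng.normal(size=T)
      for t in range(1, T):
          a[t] = a[t - 1] + gamma * (y[t - 1] - a[t - 1])   # constant-gain update
          y[t] = c + delta * a[t] + shocks[t]

      # Nonlinear least squares: for candidate parameters, rebuild the learning path
      # implied by the observed series and match the resulting fitted values.
      def residuals(theta):
          c_, delta_, gamma_ = theta
          a_hat = np.zeros(T)
          for t in range(1, T):
              a_hat[t] = a_hat[t - 1] + gamma_ * (y[t - 1] - a_hat[t - 1])
          return y[1:] - (c_ + delta_ * a_hat[1:])

      fit = least_squares(residuals, x0=[0.5, 0.3, 0.1],
                          bounds=([-5.0, -0.99, 0.001], [5.0, 0.99, 1.0]))
      print("estimated (c, delta, gamma):", np.round(fit.x, 3))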
  2. By: Qihui Chen; Zheng Fang; Ruixuan Liu
    Abstract: There has been significant progress in Bayesian inference based on sparsity-inducing (e.g., spike-and-slab and horseshoe-type) priors for high-dimensional regression models. The resulting posteriors, however, in general do not possess desirable frequentist properties, and the credible sets thus cannot serve as valid confidence sets even asymptotically. We introduce a novel debiasing approach that corrects the bias for the entire Bayesian posterior distribution. We establish a new Bernstein-von Mises theorem that guarantees the frequentist validity of the debiased posterior. We demonstrate the practical performance of our proposal through Monte Carlo simulations and two empirical applications in economics.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.09257
  3. By: Mittelhammer, Ron; Judge, George; Henry, Miguel
    Abstract: We introduce a broadly applicable statistical procedure for testing which parametric distribution family generated a random sample of data. The method, termed the Difference in Differential Entropy (DDE) test, provides a unified framework applicable to a wide range of distributional families, with asymptotic validity grounded in established maximum likelihood, bootstrap, and kernel density estimation principles. The test is straightforward to implement, computationally efficient, and requires no user-defined tuning parameters or complex specialized regularity conditions. It compares an MLE-based estimate of differential entropy under the null hypothesis with a nonparametric bootstrapped kernel density estimate, using their divergence as an information-theoretic measure of model fit. The test procedure is constructive in the sense of being informative whether or not the null hypothesis is rejected; when it is not rejected, the outcome suggests that the hypothesized distribution is close to the actual distribution of the data in both shape and probability implications. Monte Carlo experiments demonstrate notable size accuracy and power even in relatively small samples, and three empirical applications using classical datasets from distinct domains illustrate the method’s practical utility.
    Keywords: Research Methods/ Statistical Methods
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ags:assa26:380041
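    Illustrative sketch (not from the paper): the DDE idea described above can be mimicked for a normal null family by comparing the differential entropy implied by the MLE fit with a kernel-density entropy estimate and calibrating their difference by a parametric bootstrap; the exact estimator, bootstrap scheme, and critical values in the paper may differ.
      # Hypothetical illustration of a difference-in-differential-entropy style test.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      x = rng.normal(loc=1.0, scale=2.0, size=500)        # sample to be tested

      def entropy_mle_normal(z):
          # Differential entropy of a normal distribution evaluated at its MLE fit
          return 0.5 * np.log(2 * np.pi * np.e * z.std(ddof=1) ** 2)

      def entropy_kde(z):
          # Nonparametric entropy estimate from a Gaussian kernel density estimate
          return -np.mean(np.log(stats.gaussian_kde(z)(z)))

      dde_obs = entropy_mle_normal(x) - entropy_kde(x)

      # Parametric bootstrap under the normal null to calibrate the statistic
      B, mu_hat, sigma_hat = 199, x.mean(), x.std(ddof=1)
      dde_boot = np.array([
          entropy_mle_normal(xb) - entropy_kde(xb)
          for xb in (rng.normal(mu_hat, sigma_hat, x.size) for _ in range(B))
      ])
      p_value = np.mean(np.abs(dde_boot) >= np.abs(dde_obs))
      print(f"DDE = {dde_obs:.4f}, bootstrap p-value = {p_value:.3f}")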
  4. By: Christina Maschmann; Joakim Westerlund
    Abstract: Panel data models with unobserved heterogeneity in the form of interactive effects typically assume that the time effects - or "common factors" - enter linearly. This assumption is unnatural in the sense that it pertains to the unobserved component of the model, and there is rarely any reason to believe that this component takes on a particular functional form. This is in stark contrast to the relationship between the observables, which can often be credibly argued to be linear. Linearity in the factors has persevered mainly because it is convenient and because it improves on standard fixed effects. The present paper relaxes this assumption. It does so by combining the common correlated effects (CCE) approach to standard interactive effects with the method of sieves. The new estimator - abbreviated "SCCE" - retains many of the advantages of CCE, including its computational simplicity and good small-sample and asymptotic properties, but is applicable under a much broader class of factor structures that includes the linear one as a special case. This makes it well-suited for a wide range of empirical applications.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.03693
  5. By: Kentaro Kawato
    Abstract: This paper develops methods for estimating the natural direct and indirect effects in causal mediation analysis. The efficient influence function-based estimator (EIF-based estimator) and the inverse probability weighting estimator (IPW estimator), which are standard in causal mediation analysis, both rely on the inverse of the estimated propensity scores, and thus they are vulnerable to two key issues: (i) instability and (ii) finite-sample covariate imbalance. We propose estimators based on the weights obtained by an algorithm that directly penalizes weight dispersion while enforcing approximate covariate and mediator balance, thereby improving stability and mitigating bias in finite samples. We establish the convergence rates of the proposed weights and show that the resulting estimators are asymptotically normal and achieve the semiparametric efficiency bound. Monte Carlo simulations demonstrate that the proposed estimator outperforms not only the EIF-based estimator and the IPW estimator but also the regression imputation estimator in challenging scenarios with model misspecification. Furthermore, the proposed method is applied to a real dataset from a study examining the effects of media framing on immigration attitudes.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.09337
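    Illustrative sketch (not from the paper): a generic balancing-weights problem that minimizes weight dispersion subject to approximate covariate balance, in the spirit of the algorithm described above; the paper's mediation-specific version additionally balances the mediator and targets natural direct and indirect effects.
      # Hypothetical illustration: re-weight treated units so their covariate means
      # approximately match the full-sample means, with minimal weight dispersion.
      import numpy as np
      from scipy.optimize import minimize

      rng = np.random.default_rng(0)
      n = 200
      X = rng.normal(size=(n, 3))                              # covariates
      d = rng.binomial(1, 1 / (1 + np.exp(-0.5 * X[:, 0])))    # treatment depends on X
      target, Xt = X.mean(axis=0), X[d == 1]
      m = Xt.shape[0]

      def dispersion(w):
          return np.sum((w - 1.0 / m) ** 2)                    # penalize weight dispersion

      constraints = [
          {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},    # weights sum to one
          # approximate balance: |weighted treated mean - target| <= 0.1 per covariate
          {"type": "ineq", "fun": lambda w: 0.1 - np.abs(Xt.T @ w - target)},
      ]
      res = minimize(dispersion, x0=np.full(m, 1.0 / m), method="SLSQP",
                     bounds=[(0.0, None)] * m, constraints=constraints)
      w = res.x
      print("max abs imbalance after weighting:", np.max(np.abs(Xt.T @ w - target)))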
  6. By: Jan Kemper; Davud Rostam-Afschar
    Abstract: The advantages of adaptive experiments have led to their rapid adoption in economics and other fields, as well as among practitioners. However, adaptive experiments pose challenges for causal inference. This note suggests a BOLS (batched ordinary least squares) test statistic for inference on treatment effects in adaptive experiments. The statistic provides a precision-equalizing aggregation of per-period treatment-control differences under heteroskedasticity. The combined test statistic is a normalized average of heteroskedastic per-period z-statistics and can be used to construct asymptotically valid confidence intervals. We provide simulation results comparing rejection rates in the typical case with few treatment periods and few (or many) observations per batch.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.10156
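    Illustrative sketch (not from the paper): a BOLS-style aggregation computed as a normalized average of heteroskedasticity-robust per-batch z-statistics, as described above; batch sizes, assignment probabilities, and the exact weighting used in the paper may differ.
      # Hypothetical illustration with equal batch sizes and simple random assignment.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      T, n_per_arm = 5, 200                               # batches and observations per arm

      z_stats = []
      for t in range(T):
          y_treat = rng.normal(0.2, 1.0, n_per_arm)       # treated outcomes in batch t
          y_ctrl = rng.normal(0.0, 1.5, n_per_arm)        # control outcomes (different variance)
          diff = y_treat.mean() - y_ctrl.mean()
          se = np.sqrt(y_treat.var(ddof=1) / n_per_arm + y_ctrl.var(ddof=1) / n_per_arm)
          z_stats.append(diff / se)                       # heteroskedastic per-batch z-statistic

      z_bols = np.sum(z_stats) / np.sqrt(T)               # normalized average; ~N(0,1) under the null
      p_value = 2 * stats.norm.sf(abs(z_bols))
      print(f"BOLS z = {z_bols:.3f}, two-sided p-value = {p_value:.4f}")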
  7. By: Eleftheria Kelekidou
    Abstract: This paper is an exposition of how BRIDGE and adaptive LASSO can be used in a two-stage least squares problem to estimate the second-stage coefficients when the number of parameters p in both stages grows with the sample size n. Facing a larger class of problems than the usual analysis in the literature, i.e., replacing the assumption of normal errors with sub-Gaussian errors, I prove that both methods ensure model selection consistency and oracle efficiency even when the number of instruments and covariates exceeds the sample size. For BRIDGE, I also prove that if p grows with n but more slowly, the same properties hold even without sub-Gaussian errors. When p is greater than n, BRIDGE requires a slightly weaker set of assumptions to attain the desirable properties, as adaptive LASSO requires a good initial estimator of the relevant weights. However, adaptive LASSO is expected to be much faster computationally, so the methods are competitive on different fronts and the recommended one depends on the researcher's resources.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.00265
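    Illustrative sketch (not from the paper): a rough two-stage penalized IV pipeline using scikit-learn, with a Lasso first stage and an adaptive-Lasso-style second stage implemented via the usual column-rescaling trick; the paper's BRIDGE estimator, tuning rules, and theoretical conditions are not reproduced here.
      # Hypothetical illustration; variable names and data-generating process are assumptions.
      import numpy as np
      from sklearn.linear_model import LassoCV, Ridge

      rng = np.random.default_rng(0)
      n, p_z, p_x = 400, 60, 40                 # sample size, instruments, second-stage regressors
      Z = rng.normal(size=(n, p_z))
      v = rng.normal(size=n)                                       # first-stage error
      x_endog = Z[:, :3] @ np.array([1.0, 0.8, -0.6]) + v          # 3 relevant instruments
      X = np.column_stack([x_endog, rng.normal(size=(n, p_x - 1))])
      beta = np.zeros(p_x); beta[0], beta[1] = 1.5, -1.0           # sparse second stage
      y = X @ beta + 0.7 * v + rng.normal(size=n)                  # endogeneity enters via v

      # First stage: Lasso of the endogenous regressor on all instruments
      x_hat = LassoCV(cv=5).fit(Z, x_endog).predict(Z)
      X2 = X.copy(); X2[:, 0] = x_hat

      # Adaptive Lasso via rescaling: weights from an initial ridge fit
      w = np.abs(Ridge(alpha=1.0).fit(X2, y).coef_) + 1e-6
      lasso = LassoCV(cv=5).fit(X2 * w, y)
      b_adaptive = lasso.coef_ * w                                 # map back to the original scale
      print("selected second-stage coefficients:", np.flatnonzero(np.abs(b_adaptive) > 1e-8))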
  8. By: Ryan Kessler; James McQueen; Miikka Rokkanen
    Abstract: We develop a theoretical framework for sample splitting in A/B testing environments, where data for each test are partitioned into two splits to measure methodological performance when the true impacts of tests are unobserved. We show that sample-split estimators are generally biased for full-sample performance but consistently estimate sample-split analogues of it. We derive their asymptotic distributions, construct valid confidence intervals, and characterize the bias-variance trade-offs underlying sample-split design choices. We validate our theoretical results through simulations and provide implementation guidance for A/B testing products seeking to evaluate new estimators and decision rules.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.03366
  9. By: Bufan Li; Lujia Bai; Weichi Wu
    Abstract: This paper presents a systematic framework for controlling false discovery rate in learning time-varying correlation networks from high-dimensional, non-linear, non-Gaussian and non-stationary time series with an increasing number of potential abrupt change points in means. We propose a bootstrap-assisted approach to derive dependent and time-varying P-values from a robust estimate of time-varying correlation functions, which are not sensitive to change points. Our procedure is based on a new high-dimensional Gaussian approximation result for the uniform approximation of P-values across time and different coordinates. Moreover, we establish theoretically guaranteed Benjamini--Hochberg and Benjamini--Yekutieli procedures for the dependent and time-varying P-values, which can achieve uniform false discovery rate control. The proposed methods are supported by rigorous mathematical proofs and simulation studies. We also illustrate the real-world application of our framework using both brain electroencephalogram and financial time series data.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.10467
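    Illustrative sketch (not from the paper): the standard Benjamini-Hochberg step-up procedure applied to an array of edge-level p-values at several time points; the paper's contribution is the construction of dependent, time-varying p-values and the theory guaranteeing uniform FDR control, which this sketch does not reproduce.
      # Hypothetical illustration of the BH step-up step only.
      import numpy as np

      def benjamini_hochberg(pvals, alpha=0.1):
          """Boolean rejections from the Benjamini-Hochberg step-up procedure."""
          p = np.asarray(pvals).ravel()
          m = p.size
          order = np.argsort(p)
          below = p[order] <= alpha * np.arange(1, m + 1) / m
          k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
          reject = np.zeros(m, dtype=bool)
          reject[order[:k]] = True
          return reject.reshape(np.shape(pvals))

      rng = np.random.default_rng(0)
      pvals = rng.uniform(size=(10, 45))                    # 10 time points x 45 edge hypotheses
      pvals[:, :5] = rng.uniform(0, 1e-3, size=(10, 5))     # a few genuinely non-null edges
      rejections = benjamini_hochberg(pvals, alpha=0.1)
      print("edges discovered per time point:", rejections.sum(axis=1))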
  10. By: Sanjog Misra
    Abstract: Foundation models, and in particular large language models, can generate highly informative responses, prompting growing interest in using these ''synthetic'' outputs as data in empirical research and decision-making. This paper introduces the idea of a foundation prior, under which model-generated outputs are treated not as real observations but as draws from the prior predictive distribution induced by the foundation prior. Such synthetic data reflect both the model's learned patterns and the user's subjective priors, expectations, and biases. We model the subjectivity of the generative process by making explicit the dependence of synthetic outputs on the user's anticipated data distribution, the prompt-engineering process, and the trust placed in the foundation model. We derive the foundation prior as an exponentially tilted, generalized Bayesian update of the user's primitive prior, where a trust parameter governs the weight assigned to synthetic data. We then show how synthetic data and the associated foundation prior can be incorporated into standard statistical and econometric workflows, and discuss their use in applications such as refining complex models, informing latent constructs, guiding experimental design, and augmenting random-coefficient and partially linear specifications. By treating generative outputs as structured, explicitly subjective priors rather than as empirical observations, the framework offers a principled way to harness foundation models in empirical work while avoiding the conflation of synthetic ''facts'' with real data.
    Date: 2025–11
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.01107
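    Illustrative formula (notation assumed, not taken from the paper): with a primitive prior pi_0, synthetic draws from the foundation model, and a trust parameter tau >= 0, an exponentially tilted, generalized-Bayes update of the kind described above can be written as
      \pi_F(\theta \mid \tilde{y}) \;\propto\; \pi_0(\theta)\,\exp\{\tau\,\ell(\tilde{y} \mid \theta)\},
    where ell(y-tilde | theta) is the log-likelihood (or loss) assigned to the synthetic draws; tau = 0 returns the primitive prior, and larger tau places more weight on the foundation model's output.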
  11. By: Christopher Carter; Adeline Delavande; Mario Fiorini; Peter Siminski; Patrick Vu
    Abstract: This note studies optimal experimental design under partial compliance when experimenters can screen participants prior to randomization. Theoretical results show that retaining all compliers and screening out all non-compliers achieves three complementary aims: (i) the estimator identifies the same Local Average Treatment Effect as standard 2SLS with no screening; (ii) median bias is minimized; and (iii) statistical power is maximized. In practice, complier status is unobserved. We therefore discuss feasible screening strategies and propose a simple test for screening efficacy. Future work will conduct an experiment to demonstrate the feasibility and advantages of the optimal screening design.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.09206
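    Illustrative sketch (not from the paper): a small simulation with oracle knowledge of compliance types, showing that the Wald/2SLS estimand is the same Local Average Treatment Effect with and without screening, while screening raises the first stage (and hence precision); the paper's feasible screening strategies and efficacy test are not reproduced here.
      # Hypothetical illustration; compliance types are observed here only for exposition.
      import numpy as np

      rng = np.random.default_rng(0)
      n = 4000
      complier = rng.uniform(size=n) < 0.6        # compliers take treatment iff assigned
      z = rng.integers(0, 2, n)                   # random assignment
      d = np.where(complier, z, 0)                # partial compliance (no always-takers)
      y = 1.0 + 2.0 * d + rng.normal(size=n)      # treatment effect of 2 among the treated

      def wald(y, d, z):
          # Just-identified 2SLS (Wald) estimate of the LATE
          return (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

      keep = complier                             # oracle screening: retain compliers only
      print("no screening:   ", round(wald(y, d, z), 3))
      print("compliers only: ", round(wald(y[keep], d[keep], z[keep]), 3))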
  12. By: Oatley, Scott (University of Manchester); Gayle, Vernon Professor (University of Edinburgh); Connelly, Roxanne (University of Edinburgh)
    Abstract: Missing data is a threat to the accurate reporting of substantive results within data analysis. While strategies for handling missing data are widely available, many studies fail to account for missingness in their analysis. Those that do sometimes rely on less-than-gold-standard approaches. The gold-standard approaches, multiple imputation (MI) and full information maximum likelihood (FIML), are rarely compared with one another. This paper assesses the efficiency of different missing data handling techniques and directly compares these gold-standard methods by means of a Monte Carlo simulation. Results confirm that under a missing-at-random assumption, methods such as listwise deletion and single imputation are inefficient at handling missing data. MI- and FIML-based approaches, when conducted correctly, provide equally compelling reductions in bias under a Missing at Random (MAR) mechanism. A discussion of statistical and time-based efficiency is also provided.
    Date: 2025–12–05
    URL: https://d.repec.org/n?u=RePEc:osf:socarx:4vtqs_v1
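    Illustrative sketch (not from the paper): a minimal Monte Carlo contrasting listwise deletion with a single regression imputation under a missing-at-random mechanism, to show the bias mechanism the study investigates; the paper's actual comparison of multiple imputation and FIML is far more extensive.
      # Hypothetical illustration; the data-generating process below is an assumption.
      import numpy as np

      rng = np.random.default_rng(0)
      R, n = 1000, 500
      bias_cc, bias_imp = [], []

      for _ in range(R):
          x = rng.normal(size=n)
          y = 1.0 + 0.8 * x + rng.normal(size=n)       # true mean of y is 1.0
          p_miss = 1 / (1 + np.exp(-(x - 0.5)))        # MAR: missingness depends on observed x
          miss = rng.uniform(size=n) < p_miss

          # Listwise deletion (complete-case) estimate of E[y]
          bias_cc.append(y[~miss].mean() - 1.0)

          # Single regression imputation of missing y using x
          b = np.polyfit(x[~miss], y[~miss], 1)
          y_fill = np.where(miss, np.polyval(b, x), y)
          bias_imp.append(y_fill.mean() - 1.0)

      print(f"mean bias, listwise deletion:     {np.mean(bias_cc):+.3f}")
      print(f"mean bias, regression imputation: {np.mean(bias_imp):+.3f}")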
  13. By: Nicolas Hardy; Dimitris Korobilis
    Abstract: We revisit macroeconomic time-varying parameter vector autoregressions (TVP-VARs), whose persistent coefficients may adapt too slowly to large, abrupt shifts such as those during major crises. We explore the performance of an adaptively-varying parameter (AVP) VAR that incorporates deterministic adjustments driven by observable exogenous variables, replacing latent state innovations with linear combinations of macroeconomic and financial indicators. This reformulation collapses the state equation into the measurement equation, enabling simple linear estimation of the model. Simulations show that adaptive parameters are substantially more parsimonious than conventional TVPs, effectively disciplining parameter dynamics without sacrificing flexibility. Using macroeconomic datasets for both the U.S. and the euro area, we demonstrate that AVP-VAR consistently improves out-of-sample forecasts, especially during periods of heightened volatility.
    Date: 2025–12
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.03763
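    Illustrative sketch (not from the paper): a univariate AR(1) stand-in for the AVP idea described above, where the autoregressive coefficient varies deterministically with an observable indicator, so the time variation collapses into the measurement equation and the model is estimable by OLS with an interaction term.
      # Hypothetical illustration; the paper works with full VARs, not this AR(1) toy model.
      import numpy as np

      rng = np.random.default_rng(0)
      T = 400
      z = rng.normal(size=T)                   # observable adaptation variable (e.g. a financial indicator)
      y = np.zeros(T)
      for t in range(1, T):
          phi_t = 0.5 + 0.3 * z[t]             # coefficient adapts with the observable
          y[t] = phi_t * y[t - 1] + rng.normal()

      # Collapsed form: y_t = b0 * y_{t-1} + b1 * (z_t * y_{t-1}) + e_t, estimable by OLS
      Y = y[1:]
      X = np.column_stack([y[:-1], z[1:] * y[:-1]])
      b0, b1 = np.linalg.lstsq(X, Y, rcond=None)[0]
      print(f"estimated phi(z) = {b0:.3f} + {b1:.3f} * z   (true: 0.5 + 0.3 * z)")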
  14. By: Orazio Attanasio (Yale University); Víctor Sancibrián (Bocconi University); Federica Ambrosio (Università di Napoli Federico II)
    Abstract: For a long time, the majority of economists doing empirical work relied on choice data, while data based on answers to hypothetical questions, stated preferences or measures of subjective beliefs were met with some skepticism. Although this has changed recently, much work needs to be done. In this paper, we emphasize the identifying content of new economic measures. In the first part of the paper, we discuss where the literature on measures in economics stands at the moment. We first consider how the design and use of new measures can help identify causal links and structural parameters under weaker assumptions than those required by approaches based exclusively on choice data. We then discuss how the availability of new measures can allow the study of richer models of human behavior that incorporate a wide set of factors. In the second part of the paper, we illustrate these issues with an application to the study of risk sharing and of deviations from perfect risk sharing.
    Date: 2025–11–01
    URL: https://d.repec.org/n?u=RePEc:cwl:cwldpp:2477
  15. By: Ting-Yen Wang
    Abstract: This study introduces a novel framework for estimating industrial heterogeneity by integrating maximum entropy (ME) estimation of production functions with Zonotope-based measures. Traditional production function estimations often rely on restrictive parametric models, failing to capture firm behavior under uncertainty. This research addresses these limitations by applying Hang K. Ryu's ME method to estimate production functions using World Bank Enterprise Survey (WBES) data from Bangladesh, Colombia, Egypt, and India. The study normalizes entropy values to quantify heterogeneity and compares these measures with a Zonotope-based Gini index. Results demonstrate the ME method's superiority in capturing nuanced, functional heterogeneity often missed by traditional techniques. Furthermore, the study incorporates a "Tangent Against Input Axes" method to dynamically assess technical change within industries. By integrating information theory with production economics, this unified framework quantifies structural and functional differences across industries using firm-level data, advancing both methodological and empirical understanding of heterogeneity. A numerical simulation confirms the ME regression functions can approximate actual industrial heterogeneity. The research also highlights the superior ability of the ME method to provide a precise and economically meaningful measure of industry heterogeneity, particularly for longitudinal analyses.
    Date: 2025–09
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2512.00002
  16. By: Rausser, Gordon; Villas-Boas, Sofia B.
    Abstract: Professor George Garrett Judge's body of work constitutes one of the most intellectually coherent and forward-looking research programs in modern quantitative economics, spanning three distinct but related domains: econometric estimation theory, spatial equilibrium operations research, and information-theoretic inference. What appears at first to be a diverse set of contributions is in fact organized around a single foundational question: How can economists recover reliable information about complex systems from noisy, incomplete, and imperfect data? Judge approached this challenge first by advancing new estimators and finite-sample theory, then by reformulating spatial general equilibrium through mathematical programming methods, and ultimately by developing an entropy-based framework that integrates information theory, statistical mechanics, and computational methods. His vision redefines quantitative economics for an information-rich but uncertainty-dominated world, emphasizing epistemological humility, out-of-sample predictive performance, and the dynamic recovery of information over static parameter estimation. Across more than 150 articles, 16 books, and decades of mentorship, Judge reshaped agricultural economics, applied economics, and econometrics more broadly.
    Keywords: Research Methods/ Statistical Methods
    Date: 2025
    URL: https://d.repec.org/n?u=RePEc:ags:assa26:379046
  17. By: Dai, Xiaowu; Mouti, Saad; do Vale, Marjorie Lima; Ray, Sumantra; Bohn, Jeffrey; Goldberg, Lisa
    Abstract: Two-point time-series data, characterized by baseline and follow-up observations, are frequently encountered in health research. We study a novel two-point time-series structure without a control group, driven by an observational routine clinical dataset collected to monitor key risk markers of type-2 diabetes (T2D) and cardiovascular disease (CVD). We propose a resampling approach called "I-Rand" for independently sampling one of the two time points for each individual and making inferences on the estimated causal effects based on matching methods. The proposed method is illustrated with data from a service-based dietary intervention to promote a low-carbohydrate diet (LCD), designed to impact the risk of T2D and CVD. Baseline data contain a pre-intervention health record of study participants, and health data after the LCD intervention are recorded at the follow-up visit, providing a two-point time-series pattern without a parallel control group. Using this approach we find that obesity is a significant risk factor for T2D and CVD, and that an LCD approach can significantly mitigate the risks of T2D and CVD. We provide code that implements our method.
    Keywords: Resampling, Matching method, Causal inference, Two-point time-series, Synthetic control, Type-2 diabetes, Cardiovascular disease, 3102 Bioinformatics and computational biology (for-2020), 4905 Statistics (for-2020)
    Date: 2025–01–01
    URL: https://d.repec.org/n?u=RePEc:cdl:econwp:qt1hr3447x
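    Illustrative sketch (not from the paper's code): the resampling step described above, in which one of the two time points is independently sampled for each individual, creating two pseudo-groups whose contrast recovers the average change; the paper additionally combines this with matching on covariates, which this sketch omits.
      # Hypothetical illustration; outcome scale and effect size below are assumptions.
      import numpy as np

      rng = np.random.default_rng(0)
      n = 200
      baseline = rng.normal(6.0, 1.0, n)                    # risk marker before the intervention
      followup = baseline - 0.4 + rng.normal(0, 0.5, n)     # after the intervention

      contrasts = []
      for _ in range(500):
          # I-Rand step: independently pick baseline or follow-up for each individual
          pick_followup = rng.integers(0, 2, n).astype(bool)
          y = np.where(pick_followup, followup, baseline)
          contrasts.append(y[pick_followup].mean() - y[~pick_followup].mean())

      print(f"mean resampled contrast: {np.mean(contrasts):+.3f} (simulated change: -0.4)")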

This nep-ecm issue is ©2025 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.