on Econometrics |
| By: | Artem Samiahulin |
| Abstract: | Regression discontinuity (RD) designs with multiple running variables arise in a growing number of empirical applications, including geographic boundaries and multi-score assignment rules. Although recent methodological work has extended estimation and inference tools to multivariate settings, far less attention has been devoted to developing global testing methods that formally assess whether a discontinuity exists anywhere along a multivariate treatment boundary. Existing approaches perform well in large samples, but can exhibit severe size distortions in moderate or small samples due to the sparsity of observations near any particular boundary point. This paper introduces a complementary global testing procedure that mitigates the small-sample weaknesses of existing multivariate RD methods by integrating multivariate machine learning estimators with a distance-based aggregation strategy, yielding a test statistic that remains reliable with limited data. Simulations demonstrate that the proposed method maintains near-nominal size and strong power, including in settings where standard multivariate estimators break down. The procedure is applied to an empirical setting to demonstrate its implementation and to illustrate how it can complement existing multivariate RD estimators. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.03819 |
| By: | Bellocca, Gian Pietro Enzo; Garrón Vedia, Ignacio; Rodríguez Caballero, Carlos Vladimir; Ruiz Ortega, Esther |
| Abstract: | The research question we answer in this paper is whether the asymptotic distribution derived by Bai (2003) for Principal Components (PC) factors in dynamic factor models (DFMs) can approximate the empirical distribution of the sequential Least Squares (SLS) estimator of global and group-specific factors in multi-level dynamic factor models (ML-DFMs). Monte Carlo experiments confirm that under general forms of the idiosyncratic covariance matrix, the finite-sample distribution of SLS global and group-specific factors can be well approximated using the asymptotic distribution of PC factors. We also analyse the performance of alternative estimators of the asymptotic mean squared error (MSE) of the SLS factors and show that the MSE estimator that allows for idiosyncratic cross-sectional correlation and accounts for estimation uncertainty of factor loadings is best. |
| Keywords: | Multi-Level Dynamic Factors Models; Principal Components; Sequential Least Squares; Subsampling |
| JEL: | C13 C32 C55 F47 |
| Date: | 2026–02–16 |
| URL: | https://d.repec.org/n?u=RePEc:cte:wsrepe:49336 |
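The paper above compares the sequential Least Squares estimator in multi-level DFMs with the asymptotic theory for Principal Components factors. As background, a minimal sketch of PC factor extraction via the SVD, under the common normalization F'F/T = I; the function name and toy panel are illustrative, and this is not the paper's SLS procedure:

```python
import numpy as np

def pc_factors(X, r):
    """Extract r principal-component factors from a T x N panel X.

    Uses the SVD of the column-demeaned panel and the common
    normalization F'F / T = I_r, so loadings are Lambda = X'F / T.
    """
    Xc = X - X.mean(axis=0)                # demean each series
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(Xc.shape[0]) * U[:, :r]    # T x r factors
    Lam = Xc.T @ F / Xc.shape[0]           # N x r loadings
    return F, Lam

# Toy panel: 200 periods, 30 series driven by 2 common factors
rng = np.random.default_rng(0)
T, N, r = 200, 30, 2
F0 = rng.standard_normal((T, r))
L0 = rng.standard_normal((N, r))
X = F0 @ L0.T + 0.5 * rng.standard_normal((T, N))
F, Lam = pc_factors(X, r)
```

In a multi-level setting, SLS alternates steps of this kind between global and group-specific factors; the paper's question is whether Bai's (2003) distribution for the PC step still approximates the resulting estimates.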
| By: | Johann Caro-Burnett |
| Abstract: | Standard instrumental variables (IV) methods identify a Local Average Treatment Effect under monotonicity, which rules out defiers. In many empirical environments, however, distinct instruments may induce heterogeneous and even opposing behavioral responses. This paper introduces the Difference-in-Instrumental-Variables (DIIV) estimand, which exploits two instruments with opposing compliance patterns to recover a point-identified and behaviorally interpretable causal effect without imposing monotonicity. The estimand yields a convex combination of the marginal treatment effects on compliers and defiers, with weights reflecting differential shifts in treatment take-up across instruments. When monotonicity holds, DIIV coincides with the standard IV estimand. The approach can be implemented using simple linear transformations and standard two-stage least squares procedures. Applications using replication data illustrate its applicability in practice. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.12504 |
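The abstract says DIIV can be implemented with simple linear transformations and standard 2SLS, but does not spell out the transformation. As a purely illustrative stand-in, the sketch below uses the difference of the two instruments in a textbook IV formula, in a setting with a homogeneous effect (where any valid instrument identifies β); the data-generating process and the combined instrument `w` are assumptions of this example, not the paper's construction:

```python
import numpy as np

# Illustrative only: with a homogeneous effect, any valid instrument
# identifies beta, so we can show the mechanics of combining two
# instruments with opposing first-stage responses via their difference.
# (The actual DIIV transformation is defined in the paper, not here.)
rng = np.random.default_rng(42)
n, beta = 200_000, 2.0
z1 = rng.standard_normal(n)
z2 = rng.standard_normal(n)
v = rng.standard_normal(n)
d = 0.8 * z1 - 0.5 * z2 + v                      # opposing responses to z1, z2
y = beta * d + 0.8 * v + rng.standard_normal(n)  # d endogenous through v

w = z1 - z2                                      # hypothetical combined instrument
beta_iv = np.cov(w, y)[0, 1] / np.cov(w, d)[0, 1]   # IV: consistent for beta
beta_ols = np.cov(d, y)[0, 1] / np.var(d)           # OLS: biased upward here
```

Because z2 shifts take-up in the opposite direction, differencing the instruments adds rather than cancels first-stage strength, which is the intuition behind exploiting opposing compliance patterns.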
| By: | Simar, Léopold (Université catholique de Louvain, LIDAM/ISBA, Belgium); Wilson, Paul (Clemson University) |
| Abstract: | Production theory is based on an economic model in which we define the production set, i.e. the set of combinations of inputs and outputs that are technically feasible. The efficiency of a particular unit is measured by its distance to the efficient frontier of the production set, along a selected direction. Nonparametric models are particularly appealing because they rely neither on restrictive assumptions about the shape of the efficient frontier nor on the processes that may give rise to inefficiencies. Since these quantities are typically unknown, they must be estimated from a sample of observed units. The most widely used nonparametric approaches are based on envelopment estimators such as Data Envelopment Analysis (DEA) or Free Disposal Hull (FDH), making the derived measures of efficiency for a given unit dependent on these envelopment estimators. In recent decades, substantial results have been derived regarding the statistical properties of these nonparametric estimators. These advances facilitate statistical inference on the efficiency scores of individual units across different contexts, efficiency comparisons between groups of units, and testing procedures concerning the shape of the attainable set (convex or non-convex) or assumptions about returns to scale. It is shown how crucial the assumptions made on the data-generating process (DGP) are: incorrect assumptions may lead to inconsistent estimators and invalid inference. These results have now been extended to dynamic settings, including inference on Malmquist Productivity Indices (and other well-known productivity indices) and their components. In this paper, we provide a comprehensive, up-to-date survey of these approaches. |
| Keywords: | Production Theory ; Nonparametric estimation ; Data envelopment analysis ; Conditional frontiers |
| JEL: | C1 C14 C13 D24 O47 |
| Date: | 2026–02–01 |
| URL: | https://d.repec.org/n?u=RePEc:aiz:louvad:2026002 |
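To make the envelopment idea concrete, a minimal sketch of the standard input-oriented FDH score: for unit i, take all units whose outputs weakly dominate i's, and for each such peer find the factor by which i's inputs must be scaled to envelop that peer; the score is the smallest such factor. The function and toy data are illustrative; the survey covers far richer variants (directional distances, conditional and robust frontiers):

```python
import numpy as np

def fdh_input_efficiency(X, Y):
    """Input-oriented FDH efficiency scores.

    X: (n, p) inputs, Y: (n, q) outputs, all strictly positive.
    theta_i = min over units j whose outputs weakly dominate unit i
              of max_k x_{jk} / x_{ik}.
    A score of 1 means unit i lies on the FDH frontier.
    """
    n = X.shape[0]
    theta = np.ones(n)
    for i in range(n):
        dominates = np.all(Y >= Y[i], axis=1)           # candidate peers (incl. i)
        ratios = np.max(X[dominates] / X[i], axis=1)    # input scaling per peer
        theta[i] = ratios.min()
    return theta

rng = np.random.default_rng(1)
X = rng.uniform(1, 2, size=(50, 2))                         # two inputs
Y = (X.sum(axis=1) * rng.uniform(0.5, 1.0, 50))[:, None]    # one output
scores = fdh_input_efficiency(X, Y)
```

Each unit dominates itself, so scores never exceed one; the survey's statistical results concern precisely how such estimated scores behave as the sample grows.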
| By: | Andy Snell (School of Economics, University of Edinburgh) |
| Abstract: | It is well known that if the regression coefficient of y on w (β say) has a constant probability limit but we only have two noisy measures of w - x and z - then we may obtain consistent estimates of β as long as a) the measurement errors are classical and b) the measurement errors are mutually uncorrelated. We propose a simple test of a) and a test for b) as part of a composite null. To effect the latter, we instrument x with z and functions of z, and vice versa, to obtain two sets of overidentifying restrictions tested via a standard J test of instrument validity. If no test in this sequence rejects, we then combine the orthogonality conditions to obtain a single efficient estimate of β. We discuss the likely prior validity of the various instruments and the pitfalls in using the test procedure. Unlike standard overidentification tests, which diverge in heterogeneous response settings even when each instrument is valid, our tests only diverge when one or more instruments are invalid. We apply the test sequence and estimation procedure to analyse i) the cyclical component of wages and ii) the effect of state-level unemployment on burglaries in the US. Correcting for measurement error raises the estimates of β in both applications. |
| Keywords: | Measurement Error, Instrumental Variables, Consistent OLS Estimation |
| Date: | 2025–06 |
| URL: | https://d.repec.org/n?u=RePEc:edn:esedps:321 |
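The core identity behind the paper's proposal can be sketched in a few lines: OLS of y on a noisy measure x is attenuated toward zero, but instrumenting x with a second noisy measure z restores consistency when both errors are classical and mutually uncorrelated. This shows only that identity, not the paper's J-test sequence or efficient combined estimator; all simulated quantities are illustrative:

```python
import numpy as np

# Sketch of the basic identity: with x = w + u1 and z = w + u2 and
# classical, mutually uncorrelated errors, Cov(z, y)/Cov(z, x) -> beta.
rng = np.random.default_rng(7)
n, beta = 100_000, 1.5
w = rng.standard_normal(n)             # true regressor, unobserved
x = w + rng.standard_normal(n)         # first noisy measure
z = w + rng.standard_normal(n)         # second noisy measure
y = beta * w + rng.standard_normal(n)

beta_ols = np.cov(x, y)[0, 1] / np.var(x)          # attenuated (~ beta/2 here)
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]  # consistent for beta
```

Swapping the roles of x and z gives the second set of orthogonality conditions the paper exploits; if neither set is rejected, the two can be pooled into one efficient estimate.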
| By: | Cazals, Catherine (Toulouse School of Economics); Florens, Jean-Pierre (Toulouse School of Economics); Simar, Léopold (Université catholique de Louvain, LIDAM/ISBA, Belgium) |
| Abstract: | In production theory, a lot of attention has been paid in the literature to the analysis of the effect of environmental variables on the efficiency of firms. The usual and natural way to investigate this issue is to consider conditional frontier models. For nonparametric approaches, this can create serious problems if the number of these potential environmental factors increases, exacerbating the curse of dimensionality characteristic of nonparametric models. In this paper, to address this issue, we investigate whether Single Index Models (SIM) could be used for modeling the effect of these variables on the production process. We propose a test for the SIM hypothesis and analyse its asymptotic properties. If the SIM model is not rejected, we obtain better rates of convergence of the conditional efficiency estimates. The paper investigates, through Monte Carlo experiments, the finite-sample properties of the proposed test and the properties of the resulting estimates of the SIM when it is not rejected. We illustrate the method with a real data set from the French national postal operator in charge of universal service. |
| Keywords: | Nonparametric conditional frontier ; Single-Index ; Robust frontier ; Environmental variables |
| JEL: | C10 C14 C51 D22 |
| Date: | 2025–11–28 |
| URL: | https://d.repec.org/n?u=RePEc:aiz:louvad:2025022 |
| By: | Candelon, Bertrand (Université catholique de Louvain, LIDAM/LFIN, Belgium); Luisi, Angelo (Ghent University) |
| Abstract: | When modeling the dynamics of a large cross section of interdependent variables, the employment of small-scale Vector Autoregressive (VAR) models augmented by a few factors, extracted as linear combinations of the variables under analysis, is the typical solution to avoid the proliferation of parameters in large heterogeneous VARs. The factors’ loadings/weights can be estimated or derived from the economic literature, and are usually interpreted as interconnection channels. We propose a novel Likelihood Ratio Test procedure to empirically evaluate the chosen set of weights, and show that testing is fundamental for valid inference. We exploit the intuition that, if the factors employed are empirically valid, no information from the cross section remains statistically significant. The proposed test is intuitive, easy to implement, and exhibits very good finite-sample properties. In the empirical exercise, we test several interconnection channels for the sovereign bond market in the euro area. |
| Keywords: | Global VARs ; FAVARs ; Likelihood Ratio Test ; Interdependence |
| Date: | 2025–11–30 |
| URL: | https://d.repec.org/n?u=RePEc:ajf:louvlf:2025005 |
| By: | Dan Ben-Moshe; David Genesove |
| Abstract: | This paper derives closed-form unbiased estimators of central moments in multilevel random-effects models with unbalanced group sizes. In a two-level model, we provide unbiased estimators for the second, third, and fourth central moments under both group-level and observation-level averaging. In a three-level model, we provide unbiased estimators for the second and third central moments. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.03469 |
| By: | Albert Tan; Sadegh Shirani; James Nordlund; Mohsen Bayati |
| Abstract: | Estimating total treatment effects in the presence of network interference typically requires knowledge of the underlying interaction structure. However, in many practical settings, network data is either unavailable, incomplete, or measured with substantial error. We demonstrate that causal message passing, a methodology that leverages temporal structure in outcome data rather than network topology, can recover total treatment effects comparable to network-aware approaches. We apply causal message passing to two large-scale field experiments where a recently developed bipartite graph methodology, which requires network knowledge, serves as a benchmark. Despite having no access to the interaction network, causal message passing produces effect estimates that match the network-aware approach in direction across all metrics and in statistical significance for the primary decision metric. Our findings validate the premise of causal message passing: that temporal variation in outcomes can serve as an effective substitute for network observation when estimating spillover effects. This has important practical implications: practitioners facing settings where network data is costly to collect, proprietary, or unreliable can instead exploit the temporal dynamics of their experimental data. |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:arx:papers:2602.04230 |
| By: | Lassance, Nathan (Université catholique de Louvain, LIDAM/LFIN, Belgium); Vanderveken, Rodolphe (Université catholique de Louvain, LIDAM/LFIN, Belgium); Vrins, Frédéric (Université catholique de Louvain, LIDAM/LFIN, Belgium) |
| Abstract: | We introduce analytical linear and nonlinear shrinkage estimators of the sample covariance matrix that are optimal for mean-variance portfolio choice. Unlike the classical estimators based on statistical loss functions like the mean squared error, our shrinkage covariance matrices optimize the expected out-of-sample portfolio utility and account for estimation errors in mean returns. Our estimators shrink the sample eigenvalues more intensively than conventional methods, and they especially diminish the contribution of principal components with small squared Sharpe ratios. By jointly estimating the covariance matrix and the optimal portfolio in one step, our method delivers significant empirical performance gains relative to the usual two-step shrinkage approach. Our portfolios also help reduce turnover and outperform recent regularized mean-variance portfolio strategies. |
| Keywords: | Estimation risk ; linear shrinkage ; mean-variance portfolio ; nonlinear shrinkage ; out-of-sample utility ; parameter uncertainty |
| JEL: | G11 |
| Date: | 2025–07–11 |
| URL: | https://d.repec.org/n?u=RePEc:ajf:louvlf:2025002 |
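For orientation, a minimal sketch of the two-step approach the paper improves on: linearly shrink the sample covariance toward a scaled identity, then plug the result into a mean-variance rule. The fixed intensity `alpha` here is an arbitrary illustrative constant; the paper instead derives intensities that are optimal for out-of-sample portfolio utility and that account for estimation error in mean returns:

```python
import numpy as np

def shrunk_covariance(R, alpha):
    """Linear shrinkage of the sample covariance toward a scaled identity.

    R: (T, N) return matrix; alpha in [0, 1] is the shrinkage intensity
    (fixed here for illustration; the paper derives utility-optimal
    intensities instead of statistical-loss-optimal ones).
    """
    S = np.cov(R, rowvar=False)
    mu = np.trace(S) / S.shape[0]          # average sample eigenvalue
    return (1 - alpha) * S + alpha * mu * np.eye(S.shape[0])

rng = np.random.default_rng(3)
T, N = 60, 20                              # short sample: S is noisy
R = rng.standard_normal((T, N)) * 0.02 + 0.005
Sigma = shrunk_covariance(R, alpha=0.5)
m = R.mean(axis=0)
w_raw = np.linalg.solve(Sigma, m)          # mean-variance direction
w = w_raw / w_raw.sum()                    # normalized to sum to one
```

Shrinkage keeps the inverse well-conditioned even when T is small relative to N, which is the practical failure mode of the raw sample covariance in this plug-in rule.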
| By: | Yilin Xiao; Jamie L. Cross |
| Abstract: | We propose a new class of Regularized Random Subspace Regressions (RRSRs) that combine the variance reduction benefits of regularized estimators with the non-linearities of random subspace ensembles. The approach introduces regularization in the selection of predictor subspaces, coefficient estimation within each subspace, or in both, yielding a flexible family of models that nest both RSR and standard penalized regressions as special cases. Using the FRED-MD database as a large predictor space, we show that RRSRs consistently outperform traditional RSR and several widely used econometric and machine learning benchmarks when forecasting four key macroeconomic indicators: inflation, output, unemployment, and the federal funds rate. The most systematic gains arise from the double-regularized specification, underscoring the value of applying shrinkage jointly to subspace selection and coefficient estimation. |
| Keywords: | big data, forecasting, machine learning, model averaging, random subspace, regularization |
| JEL: | C22 C53 C55 E37 |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:een:camaaa:2026-13 |
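The ensemble idea can be sketched compactly: draw random subsets of predictors, fit a regularized regression within each subspace, and average the forecasts. The sketch below implements only the "regularized within each subspace" member of the family (ridge inside random subspaces); the paper's double-regularized specification also regularizes the subspace selection itself, which is omitted here, and all names and toy data are illustrative:

```python
import numpy as np

def rrsr_forecast(X, y, X_new, k, B, lam, seed=0):
    """Random-subspace ridge ensemble (illustrative sketch).

    Draws B random subsets of k predictors, fits a ridge regression
    within each subspace, and averages the B forecasts.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    preds = np.zeros(X_new.shape[0])
    for _ in range(B):
        idx = rng.choice(p, size=k, replace=False)      # random subspace
        Xs = X[:, idx]
        coef = np.linalg.solve(Xs.T @ Xs + lam * np.eye(k), Xs.T @ y)
        preds += X_new[:, idx] @ coef
    return preds / B

# Toy forecasting exercise: 5 of 50 predictors carry signal
rng = np.random.default_rng(5)
n, p = 300, 50
X = rng.standard_normal((n, p))
y = X[:, :5].sum(axis=1) + 0.5 * rng.standard_normal(n)
X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]
yhat = rrsr_forecast(X_tr, y_tr, X_te, k=10, B=400, lam=5.0)
mse_rrsr = np.mean((y_te - yhat) ** 2)
mse_mean = np.mean((y_te - y_tr.mean()) ** 2)
```

Averaging over many low-dimensional fits trades some bias for a large variance reduction, which is the mechanism the paper strengthens by adding shrinkage at both stages.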
| By: | Barbagli, Matteo (Université catholique de Louvain, LIDAM/LFIN, Belgium); Vrins, Frédéric (Université catholique de Louvain, LIDAM/LFIN, Belgium) |
| Abstract: | In this paper we address the explicit exclusion of credit concentration risk from the Pillar 1 minimum capital requirement formulas of the Basel framework. Leveraging a well-established Gaussian multi-factor model, we introduce a novel control variate estimator of value-at-risk (VaR), suitable for measuring sector concentration risk under the Pillar 2 guidelines. This estimator combines the precision of Monte Carlo simulation with the speed and simplicity of the Large Pool approximation, aiming for a more efficient quantile estimation tool. We conduct numerical experiments in a setup with two systematic factors to test the validity of our methodology, achieving consistent variance reduction compared to the benchmark Monte Carlo estimator. Our results are robust across various pool parameters and an increasing number of Monte Carlo simulations. |
| Keywords: | Credit risk ; Factor model ; Control variate ; Value-at-risk ; Basel regulation |
| JEL: | G21 G28 G32 |
| Date: | 2025–08–06 |
| URL: | https://d.repec.org/n?u=RePEc:ajf:louvlf:2025003 |
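For readers unfamiliar with the technique, a toy sketch of generic control-variate mechanics: reduce Monte Carlo variance by subtracting a correlated quantity whose expectation is known in closed form. The paper applies this idea to a VaR quantile with the Large Pool approximation as the control; the toy below targets a simple mean instead, with an option-like payoff standing in for a portfolio loss:

```python
import numpy as np

# Generic control-variate mechanics: estimate E[f(Z)] using Z itself
# as the control, since E[Z] = 0 is known exactly. (The paper's
# estimator targets VaR with a Large Pool control; this is only a
# schematic of the variance-reduction principle.)
rng = np.random.default_rng(11)
Z = rng.standard_normal(100_000)
f = np.maximum(Z - 1.0, 0.0)            # option-like payoff in Z

b = np.cov(f, Z)[0, 1] / np.var(Z)      # in-sample optimal coefficient
f_cv = f - b * (Z - 0.0)                # control-variate adjusted draws

est_plain, est_cv = f.mean(), f_cv.mean()
var_plain, var_cv = f.var(), f_cv.var()
```

With the optimal coefficient, the adjusted draws have variance Var(f)(1 - ρ²), so the stronger the correlation between payoff and control, the larger the efficiency gain per simulation.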
| By: | Qian, Jingye; Marín Díazaraque, Juan Miguel; Veiga, Helena |
| Abstract: | We develop a structural VAR with Threshold Stochastic Volatility (VAR-TSV) to study state-dependent transmission among climate conditions, energy prices, and industrial activity. The model combines volatility-in-mean effects with a threshold in log-volatility dynamics that generates discrete shifts between low- and high-volatility states, while keeping VAR propagation and contemporaneous identification unchanged across regimes. The threshold is an observed Low Economic Growth indicator that shifts the level of industrial volatility. We estimate the model in a Bayesian framework and apply it to monthly data for seven European economies (1970s to 2023, varying according to availability) using temperature anomalies, energy CPI inflation, and industrial production growth. Volatility-shock impulse responses and volatility-state-conditional connectedness reveal strong cross-country heterogeneity, with high resilience in Northern Europe, high sensitivity in Central Europe, and high persistence in Southern Europe. |
| Keywords: | Bayesian VAR; Climate uncertainty; Connectedness; Energy transition; Stochastic threshold volatility; Volatility-in-mean; Volatility regimes |
| JEL: | Q43 Q54 C11 C32 |
| Date: | 2026–02–16 |
| URL: | https://d.repec.org/n?u=RePEc:cte:wsrepe:49327 |
| By: | Gril, Lorena; Hossain, Md Jamal; Tzavidis, Nikos; Rendtel, Ulrich |
| Abstract: | The availability of geocoordinates offers valuable insights into spatial patterns of economic, demographic and health outcomes. However, disclosing the exact geolocation of statistical units to secondary analysts contravenes the responsible use of data. To protect privacy, anonymisation methods are used. A commonly applied anonymisation method is the one used by Demographic and Health Surveys (DHS). The DHS anonymisation scheme works by first aggregating data at small spatial units, followed by random (donut) displacement of the geocoordinates. It is reasonable for secondary analysts to be concerned about the impact of anonymisation on their analyses. In this paper, the DHS anonymisation scheme is used as a basis for studying how anonymisation affects kernel density estimation. We propose methodology to account for the impact of the anonymisation process on density estimation. The proposed methodology is based on deriving the distribution of the true coordinates given the observed (anonymised) coordinates. Density estimation is then implemented by using the theoretical distribution and an iterative algorithm that accounts for both aggregation and displacement. The aim is to approximate the original population density using generated pseudo-coordinates under the assumption that the anonymisation process is known. The proposed method is illustrated by using DHS data from the Rajshahi Division in Bangladesh to estimate the density of households below the poverty line. The results show that accounting for measurement error due to anonymisation leads to a more accurate picture of the spatial distribution of poverty. |
| Keywords: | Aggregation, Confidentiality, Measurement error, Random (donut) displacement |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:fubsbe:336811 |
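To fix ideas, a schematic of the displacement step of such a scheme: each released point is the true point moved by a random angle and a random radius within an annulus. This is a planar toy with made-up radii; real DHS displacement operates on geographic coordinates, uses urban/rural-specific maximum distances, and is preceded by the aggregation step the paper also models:

```python
import numpy as np

def donut_displace(coords, r_min, r_max, seed=0):
    """Random 'donut' displacement of 2-D coordinates (schematic).

    Each point is moved by a uniform random angle and a radius drawn
    between r_min and r_max, so the true location lies somewhere in an
    annulus around the released point. Planar approximation only;
    real DHS displacement works on geographic coordinates.
    """
    rng = np.random.default_rng(seed)
    n = coords.shape[0]
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    r = rng.uniform(r_min, r_max, n)
    offsets = np.column_stack((r * np.cos(theta), r * np.sin(theta)))
    return coords + offsets

pts = np.random.default_rng(2).uniform(0, 10, size=(500, 2))
anon = donut_displace(pts, r_min=0.5, r_max=2.0)
dist = np.linalg.norm(anon - pts, axis=1)
```

The paper's contribution is the inverse problem: given released coordinates and knowledge of a mechanism like this one, recover the distribution of the true coordinates and correct the resulting density estimates.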
| By: | Ollech, Daniel; Stefan, Martin |
| Abstract: | Official statistics increasingly make use of higher-frequency time series. But when users are ultimately interested in a seasonally adjusted temporal aggregate of these data, we have to decide whether to perform seasonal adjustment or aggregation first. Consequently, we must weigh the benefits of richer informational content against the increased computational requirements and the challenges presented by more volatile and outlier-prone data. We examine this trade-off on simulated and real-world time series using a battery of diagnostics, including revision size, tests for residual seasonal and calendar effects, and linkage with target variables, applying leading adjustment procedures: DSA, WSA, X-13-ARIMA, and TRAMO-SEATS. We synthesise our findings into practical guidelines that help users choose the aggregation level that balances statistical quality and real-time usefulness. |
| Keywords: | higher frequency time series, temporal aggregation, official time series |
| JEL: | C13 C14 C22 C52 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:bubdps:336747 |