
on Econometrics 
By:  Yoichi Arai; Taisuke Otsu; Mengshan Xu 
Abstract:  Generalized least squares (GLS) is one of the most basic tools in regression analysis. A major issue in implementing GLS is estimation of the conditional variance function of the error term, which typically requires a restrictive functional form assumption for parametric estimation or tuning parameters for nonparametric estimation. In this paper, we propose an alternative approach to estimate the conditional variance function under nonparametric monotonicity constraints by utilizing the isotonic regression method. Our GLS estimator is shown to be asymptotically equivalent to the infeasible GLS estimator with knowledge of the conditional error variance, and is free from tuning parameters, not only for point estimation but also for interval estimation and hypothesis testing. Our analysis extends the scope of the isotonic regression method by showing that the isotonic estimates, possibly with generated variables, can be employed as first-stage estimates to be plugged in for semiparametric objects. Simulation studies illustrate the excellent finite-sample performance of the proposed method. As an empirical example, we revisit Acemoglu and Restrepo's (2017) study on the relationship between an aging population and economic growth to illustrate how our GLS estimator effectively reduces estimation errors. 
Keywords:  Generalized least squares, Monotonicity, Isotonic regression 
JEL:  C13 C14 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:cep:stiecm:625&r=ecm 
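The two-step idea above (isotonic fit of squared residuals, then weighted least squares) can be sketched in a few lines. This is a minimal illustration under an assumed data-generating process with variance increasing in a scalar covariate, not the authors' estimator or their asymptotic theory; the pool-adjacent-violators routine is a textbook implementation.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares monotone (non-decreasing) fit."""
    vals, wts, lens = [], [], []
    for v in y:
        v, w, l = float(v), 1.0, 1
        while vals and vals[-1] > v:          # merge blocks violating monotonicity
            v0, w0, l0 = vals.pop(), wts.pop(), lens.pop()
            v = (v0 * w0 + v * w) / (w0 + w)
            w += w0
            l += l0
        vals.append(v); wts.append(w); lens.append(l)
    return np.concatenate([np.full(l, v) for v, l in zip(vals, lens)])

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * np.sqrt(0.1 + x)  # error variance rises in x

X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]               # step 1: OLS residuals
order = np.argsort(x)
v_sorted = np.maximum(pava((y - X @ b_ols)[order] ** 2), 1e-6)  # step 2: isotonic variance
w = np.empty(n)
w[order] = 1.0 / v_sorted                                  # step 3: feasible GLS weights
Xw = X * w[:, None]
b_gls = np.linalg.solve(Xw.T @ X, Xw.T @ y)
```

Note that no bandwidth or functional form is chosen anywhere: monotonicity alone pins down the variance fit.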
By:  Giovanni Angelini; Giuseppe Cavaliere; Luca Fanelli 
Abstract:  When proxies (external instruments) used to identify target structural shocks are weak, inference in proxy-SVARs (SVAR-IVs) is nonstandard, and the construction of asymptotically valid confidence sets for the impulse responses of interest requires weak-instrument robust methods. In the presence of multiple target shocks, test inversion techniques require extra restrictions on the proxy-SVAR parameters, other than those implied by the proxies, that may be difficult to interpret and test. We show that frequentist asymptotic inference in these situations can be conducted through Minimum Distance estimation and standard asymptotic methods if the proxy-SVAR is identified by using proxies for the non-target shocks, i.e., the shocks which are not of primary interest in the analysis. The suggested identification strategy hinges on a novel pre-test for instrument relevance based on bootstrap resampling. This test is free from pre-testing issues, robust to conditional heteroskedasticity and/or zero-censored proxies, computationally straightforward, and applicable regardless of the number of shocks being instrumented. Some illustrative examples show the empirical usefulness of the suggested approach. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.04523&r=ecm 
By:  Clément de Chaisemartin; Xavier D'Haultfoeuille 
Abstract:  We study two-way fixed effects (TWFE) regressions with several treatment variables. Under a parallel trends assumption, we show that the coefficient on each treatment identifies a weighted sum of that treatment's effect, with possibly negative weights, plus a weighted sum of the effects of the other treatments. Thus, those estimators are not robust to heterogeneous effects and may be contaminated by other treatments' effects. When a treatment is omitted from the regression, we obtain a new omitted variable bias formula, where bias can arise even if the treatments are not correlated with each other, but can be smaller than in the TWFE regression with all treatments. We propose an alternative difference-in-differences estimator, robust to heterogeneous effects and immune to the contamination problem. In the application we consider, the TWFE regression identifies a highly non-convex combination of effects, with large contamination weights, and one of its coefficients significantly differs from our heterogeneity-robust estimator. 
JEL:  C21 C23 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:30564&r=ecm 
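The negative-weights phenomenon is easy to reproduce in a toy single-treatment example (the paper's contamination result for multiple treatments generalizes this). Below, a deterministic 2-group, 3-period staggered design with only positive, heterogeneous effects yields a negative TWFE coefficient; the numbers are illustrative, not taken from the paper.

```python
import numpy as np

D = np.array([[0., 1., 1.],     # group 1 treated from t = 2
              [0., 0., 1.]])    # group 2 treated from t = 3 (staggered adoption)
Y = np.array([[0., 1., 4.],     # outcomes = treatment effects only:
              [0., 0., 1.]])    # all effects positive, no trends, no noise

def demean(A):
    """Two-way (group and time) demeaning, equivalent to TWFE here."""
    return A - A.mean(1, keepdims=True) - A.mean(0, keepdims=True) + A.mean()

beta_twfe = (demean(D) * demean(Y)).sum() / (demean(D) ** 2).sum()
avg_effect = Y[D == 1].mean()   # average effect among treated cells: 2.0
# beta_twfe = -0.5: negative, even though every treatment effect is positive
```

The sign flip comes from already-treated units serving as controls for later adopters, which is exactly the weighting problem the proposed estimator avoids.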
By:  Matteo Barigozzi; Daniele Massacci 
Abstract:  We study a novel large-dimensional approximate factor model with regime changes in the loadings driven by a latent first-order Markov process. By exploiting the equivalent linear representation of the model, we first recover the latent factors by means of Principal Component Analysis. We then cast the model in state-space form, and we estimate loadings and transition probabilities through an EM algorithm based on a modified version of the Baum-Lindgren-Hamilton-Kim filter and smoother which makes use of the factors previously estimated. An important feature of our approach is that it provides closed-form expressions for all estimators. We derive the theoretical properties of the proposed estimation procedure and show its good finite-sample performance through a comprehensive set of Monte Carlo experiments. Another important feature of our methodology is that it does not require knowledge of the true number of factors. The empirical usefulness of our approach is illustrated through an application to a large portfolio of stocks. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.09828&r=ecm 
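The first step of the procedure, recovering latent factors by principal components, can be sketched as below. This is a minimal static illustration (single factor, single regime, simulated data), not the paper's EM algorithm or regime-switching filter.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, r = 200, 100, 1
f = rng.normal(size=(T, r))                    # latent factor
lam = rng.normal(size=(N, r))                  # loadings (one regime here)
X = f @ lam.T + 0.5 * rng.normal(size=(T, N))  # observed panel with noise

# principal components: factor estimate from the left singular vectors of X
U, s, Vt = np.linalg.svd(X, full_matrices=False)
f_hat = np.sqrt(T) * U[:, :r]                  # identified up to sign and scale
corr = np.corrcoef(f[:, 0], f_hat[:, 0])[0, 1]
```

In the paper, these estimated factors are then fed into the state-space EM step that recovers regime-specific loadings and transition probabilities.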
By:  Vladislav Morozov 
Abstract:  We develop a methodology for conducting inference on extreme quantiles of unobserved individual heterogeneity (heterogeneous coefficients, heterogeneous treatment effects, and other unobserved heterogeneity) in a panel data or meta-analysis setting. Examples of interest include the productivity of the most and least productive firms, or prediction intervals for study-specific treatment effects in meta-analysis. Inference in such a setting is challenging. Only noisy estimates of unobserved heterogeneity are available, and approximations based on the central limit theorem work poorly for extreme quantiles. For this situation, under weak assumptions we derive an extreme value theorem for noisy estimates and appropriate rate and moment conditions. In addition, we develop a theory for intermediate order statistics. Both extreme and intermediate order theorems are then used to construct confidence intervals for extremal quantiles. The limiting distribution is non-pivotal, and we show consistency of both subsampling and simulating from the limit distribution. Furthermore, we provide a novel self-normalized intermediate order theorem. In a Monte Carlo exercise, we show that the resulting extremal confidence intervals have favorable coverage properties in the tail. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.08524&r=ecm 
By:  Matias D. Cattaneo; Yingjie Feng; Filippo Palomba; Rocio Titiunik 
Abstract:  We propose principled prediction intervals to quantify the uncertainty of a large class of synthetic control predictions or estimators in settings with staggered treatment adoption, offering precise non-asymptotic coverage probability guarantees. From a methodological perspective, we provide a detailed discussion of different causal quantities to be predicted, which we call 'causal predictands', allowing for multiple treated units with treatment adoption at possibly different points in time. From a theoretical perspective, our uncertainty quantification methods improve on prior literature by (i) covering a large class of causal predictands in staggered adoption settings, (ii) allowing for synthetic control methods with possibly nonlinear constraints, (iii) proposing scalable robust conic optimization methods and principled data-driven tuning parameter selection, and (iv) offering valid uniform inference across post-treatment periods. We illustrate our methodology with a substantive empirical application studying the effects of economic liberalization in the 1990s on GDP for emerging European countries. Companion general-purpose software packages are provided in Python, R and Stata. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.05026&r=ecm 
By:  Nicholas Brown (Queen's University); Joakim Westerlund (Lund University and Deakin University) 
Abstract:  One of the most popular estimators of interactive effects panel data models is the common correlated effects (CCE) approach, which uses the cross-sectional averages of the observables as proxies of the unobserved factors. The present paper proposes a simple test that is suitable for testing hypotheses about the factors in CCE and that is valid provided only that the number of cross-sectional units is large. The new test can be used to test whether a subset of the averages is enough to proxy the factors, or whether there are observable variables that capture the factors. The test can also be used sequentially to determine the smallest set of averages needed to proxy the factors. 
Keywords:  Factor model selection, Interactive effects models, CCE estimation 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:qed:wpaper:1491&r=ecm 
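For readers unfamiliar with CCE, the mechanics of using cross-sectional averages as factor proxies can be sketched as follows. This is a simulated single-factor illustration of the CCE pooled estimator only; the paper's test of which averages suffice is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 100
f = rng.normal(size=T)                          # single common factor
gam = 1.0 + rng.normal(size=N)                  # factor loadings in y
dlt = 1.0 + rng.normal(size=N)                  # factor loadings in x
x = np.outer(f, dlt) + rng.normal(size=(T, N))
y = 1.0 * x + np.outer(f, gam) + rng.normal(size=(T, N))   # true slope = 1

# cross-sectional averages of the observables proxy the factor space
H = np.column_stack([np.ones(T), y.mean(axis=1), x.mean(axis=1)])
M = np.eye(T) - H @ np.linalg.pinv(H)           # annihilator of the averages

num = den = 0.0
for i in range(N):                              # CCE pooled estimator
    xi, yi = M @ x[:, i], M @ y[:, i]
    num += xi @ yi
    den += xi @ xi
b_cce = num / den
```

Projecting out the averages removes the common factor, so pooled OLS on the defactored data recovers the slope; the paper's test asks whether a smaller set of averages would have achieved the same.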
By:  Yong Cai 
Abstract:  This paper studies the properties of linear regression on centrality measures when network data is sparse (that is, when there are many more agents than links per agent) and when the networks are measured with error. We make three contributions in this setting: (1) We show that OLS estimators can become inconsistent under sparsity and characterize the threshold at which this occurs, with and without measurement error. This threshold depends on the centrality measure used; specifically, regression on eigenvector centrality is less robust to sparsity than regression on degree or diffusion centrality. (2) We develop distributional theory for OLS estimators under measurement error and sparsity, finding that OLS estimators are subject to asymptotic bias even when they are consistent. Moreover, the bias can be large relative to their variances, so that bias correction is necessary for inference. (3) We propose novel bias correction and inference methods for OLS with sparse noisy networks. Simulation evidence suggests that our theory and methods perform well, particularly in settings where the usual OLS estimators and heteroskedasticity-consistent/robust t-tests are deficient. Finally, we demonstrate the utility of our results in an application inspired by De Weerdt and Dercon (2006), in which we consider consumption smoothing and social insurance in Nyakatoke, Tanzania. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.10024&r=ecm 
By:  Anna Bykhovskaya; James A. Duffy 
Abstract:  This paper extends local-to-unity asymptotics to the nonlinear setting of the dynamic Tobit model, motivated by the application of this model to highly persistent censored time series. We show that the standardised process converges weakly to a nonstandard limiting process that is constrained (regulated) to be positive, and derive the limiting distributions of the OLS estimates of the model parameters. This allows inferences to be drawn on the overall persistence of a process (as measured by the sum of the autoregressive coefficients), and the null of a unit root to be tested in the presence of censoring. Our simulations illustrate that the conventional ADF test substantially over-rejects when the data are generated by a dynamic Tobit with a unit root. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.02599&r=ecm 
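The data-generating process in question is easy to simulate. The sketch below generates a dynamic Tobit series with a unit autoregressive root, the regulated process the paper studies; it does not implement the paper's limit theory or the ADF comparison.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
y = np.zeros(T)
for t in range(1, T):
    # dynamic Tobit with rho = 1: the random walk is censored at zero each period
    y[t] = max(0.0, y[t - 1] + rng.normal())
frac_censored = (y == 0.0).mean()   # share of periods at the censoring boundary
```

The resulting series is nonnegative by construction and spends a nontrivial fraction of time exactly at zero, which is what invalidates the standard ADF limit distribution.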
By:  Chen, Yunxiao; Lu, Yan; Moustaki, Irini 
Abstract:  The paper proposes a new latent variable model for the simultaneous (two-way) detection of outlying individuals and items for item-response-type data. The proposed model is a synergy between a factor model for binary responses and continuous response times that captures normal item response behaviour, and a latent class model that captures the outlying individuals and items. A statistical decision framework is developed under the proposed model that provides compound decision rules for controlling local false discovery/non-discovery rates of outlier detection. Statistical inference is carried out under a Bayesian framework, for which a Markov chain Monte Carlo algorithm is developed. The proposed method is applied to the detection of cheating in educational tests due to item leakage, using a case study of a computer-based non-adaptive licensure assessment. The performance of the proposed method is evaluated by simulation studies. 
Keywords:  Bayesian hierarchical model; outlier detection; false discovery rate; compound decision; test fairness; item response theory; latent class analysis 
JEL:  C1 
Date:  2022–09–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:112499&r=ecm 
By:  Christopher Harshaw; Fredrik Sävje; Yitan Wang 
Abstract:  We describe a new design-based framework for drawing causal inference in randomized experiments. Causal effects in the framework are defined as linear functionals evaluated at potential outcome functions. Knowledge and assumptions about the potential outcome functions are encoded as function spaces. This makes the framework expressive, allowing experimenters to formulate and investigate a wide range of causal questions. We describe a class of estimators for estimands defined using the framework and investigate their properties. The construction of the estimators is based on the Riesz representation theorem. We provide necessary and sufficient conditions for unbiasedness and consistency. Finally, we provide conditions under which the estimators are asymptotically normal, and describe a conservative variance estimator to facilitate the construction of confidence intervals for the estimands. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.08698&r=ecm 
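The framework's estimators nest familiar design-based constructions. As an aside for intuition only (this is the simplest classical instance, not the paper's general Riesz-representer machinery), the sketch below checks design-based unbiasedness of the difference-in-means estimator under complete randomization with fixed potential outcomes.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10
y0 = rng.normal(size=n)          # fixed (non-random) potential outcomes
y1 = y0 + 1.0                    # constant unit-level effect of 1
tau = (y1 - y0).mean()           # true average treatment effect = 1

ests = []
for _ in range(5000):            # average over the randomization distribution
    z = rng.permutation(n) < n // 2          # complete randomization, n/2 treated
    yobs = np.where(z, y1, y0)               # observed outcomes
    ests.append(yobs[z].mean() - yobs[~z].mean())
mc_mean = np.mean(ests)          # should be close to tau: unbiased by design
```

All randomness lives in the assignment, exactly the design-based viewpoint the paper formalizes and extends to general linear functionals of potential outcome functions.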
By:  Kazuhiko Kakamu 
Abstract:  This study proposes a reversible jump Markov chain Monte Carlo method for estimating the parameters of lognormal distribution mixtures for income. Using simulated data examples, we examine the proposed algorithm's performance and the accuracy of the posterior distributions of the Gini coefficients. The results suggest that the parameters are estimated accurately and that the posterior distributions are close to the true distributions even when a different data generating process is considered. The promising results for the Gini coefficients encouraged us to apply our method to real data from Japan. The empirical example indicates two subgroups in Japan (2020) and supports the integrity of the Gini coefficient estimates. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.05115&r=ecm 
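The link between lognormal parameters and the Gini coefficient that underlies the posterior computations has a well-known closed form for a single lognormal component, G = 2Φ(σ/√2) − 1. The sketch below checks it against a sample-based Gini estimate; it is an illustration of that formula only, not the paper's RJMCMC for mixtures.

```python
import numpy as np
from math import erf

def gini_lognormal(sigma):
    """Gini of LN(mu, sigma^2): 2*Phi(sigma/sqrt(2)) - 1, i.e. erf(sigma/2)."""
    return erf(sigma / 2.0)

rng = np.random.default_rng(5)
x = np.sort(rng.lognormal(mean=0.0, sigma=1.0, size=50_000))
n = x.size
i = np.arange(1, n + 1)
# sample Gini from sorted data: (2 * sum(i * x_(i))) / (n * sum(x)) - (n + 1)/n
gini_mc = 2.0 * (i * x).sum() / (n * x.sum()) - (n + 1) / n
```

Note the Gini does not depend on the location parameter mu, which is why posterior draws of the scale parameters alone determine the Gini posterior for each component.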
By:  Emil Aas Stoltenberg 
Abstract:  In this paper the regression discontinuity design is adapted to the survival analysis setting with right-censored data, studied in an intensity-based counting process framework. In particular, a local polynomial regression version of the Aalen additive hazards estimator is introduced as an estimator of the difference between two covariate-dependent cumulative hazard rate functions. Large-sample theory for this estimator is developed, including confidence intervals that take into account the uncertainty associated with bias correction. As is standard in the causality literature, the models and the theory are embedded in the potential outcomes framework. Two general results concerning potential outcomes and the multiplicative hazards model for survival data are presented. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.02548&r=ecm 
By:  Samuel Higbee 
Abstract:  I study the problem of a decision maker choosing a policy to allocate treatment to a heterogeneous population on the basis of experimental data that includes only a subset of possible treatment values. The effects of new treatments are partially identified based on shape restrictions on treatment response. I propose solving an empirical minimax regret problem to estimate the policy and show it has a tractable linear and integer programming formulation. I prove the maximum regret of the estimator converges to the lowest possible maximum regret at the rate at which heterogeneous treatment effects can be estimated in the experimental data or $N^{-1/2}$, whichever is slower. I apply my results to design targeted subsidies for electrical grid connections in rural Kenya, and estimate that $97\%$ of the population should be given a treatment not implemented in the experiment. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.04703&r=ecm 
By:  Alejandro Rodriguez Dominguez; David Stynes 
Abstract:  We present a geometric version of Quickest Change Detection (QCD) and Quickest Hub Discovery (QHD) tests in correlation structures that allows us to include and combine new information with distance metrics. The topic falls within the scope of sequential, nonparametric, high-dimensional QCD and QHD, for which state-of-the-art settings developed global and local summary statistics from asymptotic Random Matrix Theory (RMT) to detect changes in random matrix law. These settings work only for uncorrelated pre-change variables. With our geometric version of the tests via clustering, we can test the hypothesis that we can improve state-of-the-art settings for QHD by combining QCD and QHD simultaneously, as well as by including information about the pre-change time-evolution of correlations. We can work with correlated pre-change variables and test whether the time-evolution of correlations improves performance. We prove test consistency and design test hypotheses based on clustering performance. We apply this solution to financial time series correlations. Future developments on this topic are highly relevant in finance for Risk Management, Portfolio Management, and Market Shock Forecasting, which can save billions of dollars for the global economy. We introduce the Diversification Measure Distribution (DMD) for modeling the time-evolution of correlations as a function of individual variables; it consists of a Dirichlet-Multinomial distribution built from a distance matrix of rolling correlations with a threshold. Finally, we are able to verify all these hypotheses. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.03988&r=ecm 
By:  Fernando MorenoPino; Stefan Zohren 
Abstract:  Volatility forecasts play a central role among equity risk measures. Besides traditional statistical models, modern forecasting techniques based on machine learning can readily be employed when treating volatility as a univariate, daily time series. However, econometric studies have shown that increasing the number of daily observations with high-frequency intraday data helps to improve predictions. In this work, we propose DeepVol, a model based on Dilated Causal Convolutions to forecast day-ahead volatility using high-frequency data. We show that dilated convolutional filters are ideally suited to extracting relevant information from intraday financial data, thereby naturally mimicking (via a data-driven approach) the econometric models which incorporate realised measures of volatility into the forecast. This allows us to take advantage of the abundance of intraday observations and to avoid the limitations of models that use daily data, such as model misspecification or manually designed handcrafted features, whose design involves optimising the trade-off between accuracy and computational efficiency and makes models prone to failing to adapt to changing circumstances. In our analysis, we use two years of intraday data from the NASDAQ-100 to evaluate DeepVol's performance. The reported empirical results suggest that the proposed deep learning-based approach learns global features from high-frequency data, achieving more accurate predictions than traditional methodologies and yielding more appropriate risk measures. 
Date:  2022–09 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.04797&r=ecm 
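The building block of DeepVol, a dilated causal convolution, can be written in a few lines of numpy. This is a minimal single-filter sketch to show the mechanics (causal padding, dilation), not the trained multi-layer network from the paper.

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """y[t] = sum_k w[k] * x[t - k*dilation]; zero left-padding keeps it causal."""
    K = len(w)
    pad = np.concatenate([np.zeros((K - 1) * dilation), x])
    return sum(w[k] * pad[(K - 1 - k) * dilation:(K - 1 - k) * dilation + len(x)]
               for k in range(K))

# impulse response: with dilation 2, the second tap reaches 2 steps back
x = np.zeros(8)
x[3] = 1.0
y = dilated_causal_conv(x, np.array([1.0, 0.5]), dilation=2)
# y[3] = 1.0 (current value), y[5] = 0.5 (lag 2), and nothing before t = 3
```

Stacking such layers with dilations 1, 2, 4, ... grows the receptive field exponentially with depth, which is what lets the model digest long intraday histories without a huge parameter count.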
By:  Shlomo, Natalie; Skinner, Chris 
Abstract:  We review the influential research carried out by Chris Skinner in the area of statistical disclosure control, and in particular quantifying the risk of re-identification in sample microdata from a random survey drawn from a finite population. We use the sample microdata to infer population parameters when the population is unknown, and estimate the risk of re-identification based on the notion of population uniqueness using probabilistic modelling. We also introduce a new approach to measure the risk of re-identification for a subpopulation in a register that is not representative of the general population, for example a register of cancer patients. In addition, we can use the additional information from the register to measure the risk of re-identification for the sample microdata. This new approach was developed by the two authors and is published here for the first time. We demonstrate this approach in an application study based on UK census data where we can compare the estimated risk measures to the known truth. 
Keywords:  disclosure risks; key variables; loglinear models; model specification; probability scores estimation; registers; EP/K032208/1 
JEL:  C1 
Date:  2022–10–07 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:117168&r=ecm 
By:  Vitor Possebom; Flavio Riva 
Abstract:  This paper presents identification results for the probability of causation when there is sample selection. We show that the probability of causation is partially identified for individuals who are always observed regardless of treatment status and derive sharp bounds under three increasingly restrictive sets of assumptions. The first set imposes an exogenous treatment and a monotone sample selection mechanism. To tighten these bounds, the second set also imposes the monotone treatment response assumption, while the third set additionally imposes a stochastic dominance assumption. Finally, we use experimental data from the Colombian job training program Jóvenes en Acción to empirically illustrate our approach's usefulness. We find that, among individuals who are always employed regardless of treatment, at least 12% and at most 19% transition to the formal labor market because of this training program. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.01938&r=ecm 
By:  Seabrook, Isobel; Barucca, Paolo; Caccioli, Fabio 
Abstract:  A fundamental problem in the study of networks is the identification of important nodes. This is typically achieved using centrality metrics, which rank nodes in terms of their position in the network. This approach works well for static networks, which do not change over time, but does not consider the dynamics of the network. Here we propose instead to measure the importance of a node based on how much a change to its strength will impact the global structure of the network, which we measure in terms of the spectrum of its adjacency matrix. We apply our method to the identification of important nodes in equity transaction networks and show that, while it can still be computed from a static network, our measure is a good predictor of nodes subsequently transacting. This implies that static representations of temporal networks can contain information about their dynamics. 
Keywords:  Node predictability; Spectral perturbation; Temporal network 
JEL:  C1 F3 G3 
Date:  2022–12–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:117130&r=ecm 
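The core idea, ranking nodes by how much a perturbation of their strength moves the adjacency spectrum, can be sketched as follows. This is an illustrative implementation of the general idea on a toy star network, not the authors' exact perturbation scheme or their equity-transaction application.

```python
import numpy as np

def spectral_importance(A, eps=0.1):
    """Rank nodes by the spectral shift induced by bumping their edge weights."""
    base = np.linalg.eigvalsh(A)               # sorted eigenvalues of the adjacency
    shifts = []
    for i in range(len(A)):
        P = A.copy()
        P[i, :] *= 1 + eps                     # perturb node i's strength...
        P[:, i] = P[i, :]                      # ...keeping the matrix symmetric
        shifts.append(np.linalg.norm(np.linalg.eigvalsh(P) - base))
    return np.argsort(shifts)[::-1]            # most structurally important first

# star network: node 0 is the hub, nodes 1-5 are leaves
A = np.zeros((6, 6))
A[0, 1:] = A[1:, 0] = 1.0
rank = spectral_importance(A)                  # the hub should come out on top
```

On the star, perturbing the hub rescales every edge while perturbing a leaf touches only one, so the hub produces the largest spectral shift, matching the intuition that its strength change most affects global structure.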
By:  Sampi Bravo,James Robert Ezequiel; Jooste,Charl; Vostroknutova,Ekaterina 
Abstract:  This paper addresses several shortcomings in the productivity and markup estimation literature. Using Monte Carlo simulations, the analysis shows that the methods in Ackerberg, Caves and Frazer (2015) and De Loecker and Warzynski (2012) produce biased estimates of the impact of policy variables on markups and productivity. This bias stems from endogeneity due to the following: (1) the functional form of the production function; (2) the omission of demand shifters; (3) the absence of price information; (4) the violation of the Markov process for productivity; and (5) misspecification when marginal costs are excluded from the estimation. The paper addresses these concerns using a quasi-maximum likelihood approach and a generalized estimator for the production function. It produces unbiased estimates of the impact of regulation on markups and productivity. The paper therefore proposes a workaround solution for the identification problem identified in Bond, Hashemi, Kaplan and Zoch (2020), and an unbiased measure of productivity, by directly accounting for the joint impact of regulation on markups and productivity. 
Keywords:  International Trade and Trade Rules, Competition Policy, Competitiveness and Competition Policy, De Facto Governments, Democratic Government, State Owned Enterprise Reform, Public Sector Administrative and Civil Service Reform, Economics and Finance of Public Institution Development, Public Sector Administrative & Civil Service Reform, Macroeconomic Management, Governance Diagnostic Capacity Building, Economic Forecasting, Trade Policy 
Date:  2021–01–21 
URL:  http://d.repec.org/n?u=RePEc:wbk:wbrwps:9523&r=ecm 
By:  Carsten Chong; Marc Hoffmann; Yanghui Liu; Mathieu Rosenbaum; Grégoire Szymanski 
Abstract:  Rough volatility models have gained considerable interest in the quantitative finance community in recent years. In this paradigm, the volatility of the asset price is driven by a fractional Brownian motion with a small value for the Hurst parameter $H$. In this work, we provide a rigorous statistical analysis of these models. To do so, we establish minimax lower bounds for parameter estimation and design procedures based on wavelets attaining them. We notably obtain an optimal speed of convergence of $n^{-1/(4H+2)}$ for estimating $H$ based on $n$ sampled data, extending results known so far only for the easier case $H>1/2$. We therefore establish that the parameters of rough volatility models can be inferred with optimal accuracy in all regimes. 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2210.01214&r=ecm 
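The basic principle behind estimating $H$, that the variance of increments of a fractional Brownian motion at lag $k$ scales like $k^{2H}$, can be illustrated with a log-log regression. The sketch uses standard Brownian motion ($H = 1/2$), since exact fBm simulation is beyond a few lines; the paper's wavelet procedures are considerably more refined and attain the minimax rate for small $H$.

```python
import numpy as np

rng = np.random.default_rng(7)
B = np.cumsum(rng.normal(size=100_000))        # standard Brownian motion: H = 1/2

lags = np.array([1, 2, 4, 8, 16, 32])
# variance of increments at lag k scales like k^{2H}
v = np.array([np.var(B[k:] - B[:-k]) for k in lags])
slope, _ = np.polyfit(np.log(lags), np.log(v), 1)
H_hat = slope / 2.0                            # scaling estimator of H
```

The challenge the paper tackles is the regime $H < 1/2$, where volatility (not the price itself) is rough and only observed indirectly, which is what makes the optimal rate $n^{-1/(4H+2)}$ nontrivial to attain.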
By:  Florian Berg; Julian F. Koelbel; Anna Pavlova; Roberto Rigobon 
Abstract:  How does ESG (environmental, social, and governance) performance affect stock returns? Answering this question is difficult because existing measures of ESG performance (ESG ratings) are noisy and, therefore, standard regression estimates suffer from attenuation bias. To address the bias, we propose two noise-correction procedures, in which we instrument ESG ratings with ratings of other ESG rating agencies, as in the classical errors-in-variables problem. The corrected estimates demonstrate that the effect of ESG performance on stock returns is stronger than previously estimated: after correcting for attenuation bias, the coefficients increase on average by a factor of 2.6, implying an average noise-to-signal ratio of 61.7%. The attenuation bias is stable across horizons at which stock returns are measured. In simulations, our noise-correction procedures outperform the standard approaches followed by practitioners, such as averages or principal component analysis. 
JEL:  C26 G12 Q56 
Date:  2022–10 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:30562&r=ecm 
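The classical errors-in-variables logic behind the correction is easy to demonstrate: instrumenting one noisy rating with another agency's independent rating removes the attenuation. The numbers below are a simulated toy calibration, not the paper's data or estimates.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 20_000
g = rng.normal(size=n)                   # latent true ESG performance
r1 = g + rng.normal(size=n)              # rating from agency 1 (noisy regressor)
r2 = g + rng.normal(size=n)              # rating from agency 2 (instrument)
ret = 0.5 * g + rng.normal(size=n)       # returns load on true performance

# OLS on the noisy rating is attenuated toward zero (here by a factor of ~2)
b_ols = np.cov(ret, r1)[0, 1] / np.var(r1, ddof=1)
# IV: agency 2's rating correlates with the signal but not agency 1's noise
b_iv = np.cov(ret, r2)[0, 1] / np.cov(r1, r2)[0, 1]
```

With equal signal and noise variances the OLS slope converges to half the true coefficient, while the IV slope is consistent, mirroring the roughly 2.6-fold increase the authors report after correction.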