
on Econometrics 
By:  Tom Boot; Gianmaria Niccodemi; Tom Wansbeek 
Abstract:  When data are clustered, common practice has become to do OLS and use an estimator of the covariance matrix of the OLS estimator that comes close to unbiasedness. In this paper we derive an estimator that is unbiased when the random-effects model holds. We do the same for two more general structures. We study the usefulness of these estimators against others by simulation, the size of the $t$-test being the criterion. Our findings suggest that the choice of estimator hardly matters when the regressor has the same distribution over the clusters. But when the regressor is a cluster-specific treatment variable, the choice does matter and the unbiased estimator we propose for the random-effects model shows excellent performance, even when the clusters are highly unbalanced. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.09644&r= 
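A minimal numpy sketch of the standard cluster-robust ("sandwich") covariance estimator that this literature takes as its starting point. This is the basic CR0 form, not the unbiased random-effects estimator derived in the paper; the function name and structure are illustrative only.

```python
import numpy as np

def ols_cluster_vcov(X, y, clusters):
    """OLS coefficients plus a basic cluster-robust (CR0) covariance matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]   # per-cluster score vector
        meat += np.outer(score, score)  # "meat" of the sandwich
    return beta, bread @ meat @ bread
```

The paper's point is that estimators of this type are only approximately unbiased; its proposal adjusts the per-cluster terms to achieve exact unbiasedness under the random-effects model.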
By:  Timo Dimitriadis; Yannick Hoga 
Abstract:  The popular systemic risk measure CoVaR (conditional Value-at-Risk) is widely used in economics and finance. Formally, it is defined as an (extreme) quantile of one variable (e.g., losses in the financial system) conditional on some other variable (e.g., losses in a bank's shares) being in distress and, hence, measures the spillover of risks. In this article, we propose a dynamic "CoQuantile Regression", which jointly models VaR and CoVaR semiparametrically. We propose a two-step M-estimator drawing on recently proposed bivariate scoring functions for the pair (VaR, CoVaR). Among other things, this allows for the estimation of joint dynamic forecasting models for (VaR, CoVaR). We prove the asymptotic normality of the proposed estimator, and simulations illustrate its good finite-sample properties. We apply our CoQuantile regression to correct the statistical inference in the existing literature on CoVaR, and to generate CoVaR forecasts for real financial data, which are shown to be superior to existing methods. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.14275&r= 
By:  Simone Tonini; Francesca Chiaromonte; Alessandro Giovannelli 
Abstract:  This paper characterizes the impact of serial dependence on the non-asymptotic estimation error bound of penalized regressions (PRs). Focusing on the direct relationship between the degree of cross-correlation of covariates and the estimation error bound of PRs, we show that orthogonal or weakly cross-correlated stationary AR processes can exhibit high spurious cross-correlations caused by serial dependence. In this respect, we study analytically the density of sample cross-correlations in the simplest case of two orthogonal Gaussian AR(1) processes. Simulations show that our results can be extended to the general case of weakly cross-correlated non-Gaussian AR processes of any autoregressive order. To improve the estimation performance of PRs in a time series regime, we propose an approach based on applying PRs to the residuals of ARMA models fit on the observed time series. We show that under mild assumptions the proposed approach allows us both to reduce the estimation error and to develop an effective forecasting strategy. The estimation accuracy of our proposal is numerically evaluated through simulations. To assess the effectiveness of the forecasting strategy, we provide the results of an empirical application to monthly macroeconomic data relative to the Euro Area economy. 
Keywords:  Serial dependence; spurious correlation; minimum eigenvalue; penalized regressions; estimation accuracy. 
Date:  2022–07–27 
URL:  http://d.repec.org/n?u=RePEc:ssa:lemwps:2022/21&r= 
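A toy check (not the paper's analytical density) of the spurious-correlation phenomenon: the sample cross-correlation of two truly orthogonal Gaussian AR(1) series has a much larger spread when the autoregressive coefficient phi is high. The function name is illustrative.

```python
import numpy as np

def sample_crosscorr_ar1(phi, T, rng):
    """Sample cross-correlation of two independent Gaussian AR(1) paths."""
    def ar1():
        x = np.zeros(T)
        eps = rng.standard_normal(T)
        for t in range(1, T):
            x[t] = phi * x[t - 1] + eps[t]
        return x
    x, y = ar1(), ar1()
    x = x - x.mean()
    y = y - y.mean()
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))
```

Replicating this over many draws shows the dispersion of the (spurious) sample cross-correlation growing with phi even though the population cross-correlation is exactly zero.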
By:  Taisuke Otsu; Meghan Xu 
Abstract:  We propose a one-to-many matching estimator of the average treatment effect based on propensity scores estimated by isotonic regression. The method relies on the monotonicity assumption on the propensity score function, which can be justified in many applications in economics. We show that the nature of the isotonic estimator can help us to fix many problems of existing matching methods, including efficiency, choice of the number of matches, choice of tuning parameters, robustness to propensity score misspecification, and bootstrap validity. As a by-product, a uniformly consistent isotonic estimator is developed for our proposed matching method. 
Keywords:  Matching, Propensity score, Isotonic regression 
JEL:  C14 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:cep:stiecm:623&r= 
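A minimal sketch of the first step of this approach: estimating a monotone propensity score by isotonic regression of the binary treatment indicator on a scalar covariate, via the pool-adjacent-violators algorithm (PAVA). This is only the score-estimation step under the assumption of a single covariate; the paper's matching estimator builds further machinery on top of it, and the function names are illustrative.

```python
import numpy as np

def pava(y):
    """Isotonic (nondecreasing) least-squares fit via pool-adjacent-violators."""
    vals, wts = [], []
    for v in map(float, y):
        vals.append(v)
        wts.append(1.0)
        # Pool adjacent blocks while they violate monotonicity
        while len(vals) > 1 and vals[-2] > vals[-1]:
            w = wts[-2] + wts[-1]
            vals[-2:] = [(wts[-2] * vals[-2] + wts[-1] * vals[-1]) / w]
            wts[-2:] = [w]
    return np.repeat(vals, np.array(wts, dtype=int))

def isotonic_propensity(x, d):
    """Monotone propensity-score estimate at the sorted covariate values."""
    order = np.argsort(x)
    return pava(np.asarray(d, dtype=float)[order])
```

Because the fit averages the treatment indicator over data-driven blocks, the resulting score is automatically in [0, 1] and nondecreasing, with no bandwidth or tuning parameter to choose.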
By:  Dimitris Korobilis 
Abstract:  A comprehensive methodology for inference in vector autoregressions (VARs) using sign and other structural restrictions is developed. The reduced-form VAR disturbances are driven by a few common factors and structural identification restrictions can be incorporated in their loadings in the form of parametric restrictions. A Gibbs sampler is derived that allows for reduced-form parameters and structural restrictions to be sampled efficiently in one step. A key benefit of the proposed approach is that it allows for treating parameter estimation and structural inference as a joint problem. An additional benefit is that the methodology can scale to large VARs with multiple shocks, and it can be extended to accommodate nonlinearities, asymmetries, and numerous other interesting empirical features. The excellent properties of the new algorithm for inference are explored using synthetic data experiments, and by revisiting the role of financial factors in economic fluctuations using identification based on sign restrictions. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.06892&r= 
By:  Li, Kunpeng 
Abstract:  This paper considers the estimation and inferential issues of the threshold spatial autoregressive model, a hybrid of the threshold model and the spatial autoregressive model. We consider using the quasi maximum likelihood (QML) method to estimate the model. We prove tightness and a H\'{a}jek-R\'{e}nyi type inequality for a quadratic form, and establish a full inferential theory of the QML estimator under the setup that the threshold effect shrinks to zero as the sample size increases. We consider hypothesis testing on the presence of the threshold effect. Three sup-type statistics are proposed to perform this testing. Their asymptotic behaviors are studied under Pitman local alternatives. A bootstrap procedure is proposed to obtain asymptotically correct critical values. We also consider hypothesis testing on the threshold value being equal to some pre-specified one. We run Monte Carlo simulations to investigate the finite-sample performance of the QML estimators and find that they perform well. 
Keywords:  Spatial autoregressive models, Spillover effects, Threshold effect, Maximum likelihood estimation, Inferential theory. 
JEL:  C12 C31 
Date:  2022–06–27 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:113568&r= 
By:  Chen, Zezhun; Dassios, Angelos; Tzougas, George 
Abstract:  In this paper, we present a novel family of multivariate mixed Poisson-Generalized Inverse Gaussian INAR(1), MMPGIG-INAR(1), regression models for modelling time series of overdispersed count response variables in a versatile manner. The statistical properties associated with the proposed family of models are discussed, and we derive the joint distribution of innovations across all the sequences. Finally, for illustrative purposes, different members of the MMPGIG-INAR(1) class are fitted to Local Government Property Insurance Fund data from the state of Wisconsin via maximum likelihood estimation. 
Keywords:  count data time series; multivariate INAR(1) regression models; multivariate mixed Poisson Generalized Inverse Gaussian; correlated time series; maximum likelihood estimation 
JEL:  C1 
Date:  2022–07–09 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:115369&r= 
By:  Helton Saulo; Roberto Vila; Shayane S. Cordeiro 
Abstract:  The sample selection bias problem arises when a variable of interest is correlated with a latent variable, and involves situations in which the response variable has part of its observations censored. Heckman (1976) proposed a sample selection model based on the bivariate normal distribution that fits both the variable of interest and the latent variable. Recently, this assumption of normality has been relaxed by more flexible models such as the Student-t distribution (Marchenko and Genton, 2012; Lachos et al., 2021). The aim of this work is to propose generalized Heckman sample selection models based on symmetric distributions (Fang et al., 1990). This is a new class of sample selection models, in which variables are added to the dispersion and correlation parameters. A Monte Carlo simulation study is performed to assess the behavior of the parameter estimation method. Two real data sets are analyzed to illustrate the proposed approach. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.10054&r= 
By:  Jinyong Hahn; David W. Hughes; Guido Kuersteiner; Whitney K. Newey 
Abstract:  Bias correction can often improve the finite-sample performance of estimators. We show that the choice of bias correction method has no effect on the higher-order variance of semiparametrically efficient parametric estimators, so long as the estimate of the bias is asymptotically linear. It is also shown that bootstrap, jackknife, and analytical bias estimates are asymptotically linear for estimators with higher-order expansions of a standard form. In particular, we find that for a variety of estimators the straightforward bootstrap bias correction gives the same higher-order variance as more complicated analytical or jackknife bias corrections. In contrast, bias corrections that do not estimate the bias at the parametric rate, such as the split-sample jackknife, result in larger higher-order variances in the i.i.d. setting we focus on. For both a cross-sectional MLE and a panel model with individual fixed effects, we show that the split-sample jackknife has a higher-order variance term that is twice as large as that of the `leave-one-out' jackknife. 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2207.09943&r= 
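A sketch of the classical leave-one-out jackknife bias correction that the paper discusses as one of several asymptotically linear bias estimates: theta_bc = n * theta_hat - (n - 1) * mean_i theta_hat(-i). The function name is illustrative.

```python
import numpy as np

def jackknife_bias_corrected(estimator, x):
    """Leave-one-out jackknife bias-corrected version of a scalar estimator."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta = estimator(x)
    # Recompute the estimator on each leave-one-out sample
    loo = np.array([estimator(np.delete(x, i)) for i in range(n)])
    return n * theta - (n - 1) * loo.mean()
```

A textbook sanity check: applied to the biased plug-in variance, this correction recovers the unbiased sample variance exactly.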
By:  Christian Bongiorno; Damien Challet 
Abstract:  Symbolic transfer entropy is a powerful nonparametric tool to detect lead-lag relationships between time series. Because a closed-form expression of the distribution of Transfer Entropy is not known for finite-size samples, statistical testing is often performed with bootstraps, whose slowness prevents the inference of large lead-lag networks between long time series. On the other hand, the asymptotic distribution of Transfer Entropy between two time series is known. In this work, we derive the asymptotic distribution of the test for one time series having a larger Transfer Entropy than another one on a target time series. We then measure the convergence speed of both tests in the small-sample limit via benchmarks. We then introduce Transfer Entropy between time-shifted time series, which allows one to measure the timescale at which information transfer is maximal and vanishes. We finally apply these methods to tick-by-tick price changes of several hundred stocks, yielding nontrivial statistically validated networks. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.10173&r= 
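A minimal plug-in sketch of symbolic transfer entropy, the quantity underlying the tests above: each series is mapped to ordinal patterns (permutation symbols) and TE is the conditional mutual information I(target_next ; source_now | target_now) of the empirical symbol distribution. This is a toy estimator, not the paper's test statistic, and the function names are illustrative.

```python
import numpy as np
from collections import Counter
from itertools import permutations

def symbolize(x, m=3):
    """Map each length-m window to its ordinal-pattern index."""
    pats = {p: i for i, p in enumerate(permutations(range(m)))}
    return np.array([pats[tuple(np.argsort(x[t:t + m]))]
                     for t in range(len(x) - m + 1)])

def transfer_entropy(src, tgt, m=3):
    """Plug-in symbolic transfer entropy src -> tgt, in nats."""
    s, t = symbolize(src, m), symbolize(tgt, m)
    y1, y0, x0 = t[1:], t[:-1], s[:-1]
    n = len(y1)
    c_yyx = Counter(zip(y1, y0, x0))
    c_yx = Counter(zip(y0, x0))
    c_yy = Counter(zip(y1, y0))
    c_y = Counter(y0)
    te = 0.0
    for (a, b, c), k in c_yyx.items():
        # p(y1,y0,x0) * log[ p(y1|y0,x0) / p(y1|y0) ]
        te += (k / n) * np.log((k / c_yx[(b, c)]) / (c_yy[(a, b)] / c_y[b]))
    return te
```

Because this is an empirical conditional mutual information, it is nonnegative and should be clearly larger when the source genuinely drives the target.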
By:  Joshua Chan; Eric Eisenstat; Xuewen Yu 
Abstract:  Vector autoregressions (VARs) with multivariate stochastic volatility are widely used for structural analysis. Often the structural model identified through economically meaningful restrictions (e.g., sign restrictions) is supposed to be independent of how the dependent variables are ordered. But since the reduced-form model is not order invariant, results from the structural analysis depend on the order of the variables. We consider a VAR based on factor stochastic volatility that is constructed to be order invariant. We show that the presence of multivariate stochastic volatility allows for statistical identification of the model. We further prove that, with a suitable set of sign restrictions, the corresponding structural model is point-identified. An additional appeal of the proposed approach is that it can easily handle a large number of dependent variables as well as sign restrictions. We demonstrate the methodology through a structural analysis in which we use a 20-variable VAR with sign restrictions to identify 5 structural shocks. 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2207.03988&r= 
By:  Chang, Jinyuan; Cheng, Guanghui; Yao, Qiwei 
Abstract:  We propose a new unit-root test for a stationary null hypothesis H0 against a unit-root alternative H1. Our approach is nonparametric as H0 assumes only that the process concerned is I(0), without specifying any parametric forms. The new test is based on the fact that the sample autocovariance function converges to the finite population autocovariance function for an I(0) process, but diverges to infinity for a process with unit roots. Therefore, the new test rejects H0 for large values of the sample autocovariance function. To address the technical question of how large is large, we split the sample and establish an appropriate normal approximation for the null distribution of the test statistic. The substantial discriminative power of the new test statistic is due to the fact that it takes finite values under H0 and diverges to infinity under H1. This property allows one to truncate the critical values of the test so that it has asymptotic power 1; it also alleviates the loss of power due to the sample-splitting. The test is implemented in R. 
Keywords:  autocovariance; integrated processes; normal approximation; power-one test; sample-splitting 
JEL:  C1 
Date:  2022–06–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:114620&r= 
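A quick numerical illustration of the key fact behind this test: the lag-k sample autocovariance settles near its population value for an I(0) series but grows with the sample size (order T) for a unit-root process. This is only the motivating diagnostic, not the paper's sample-split test statistic; the function name is illustrative.

```python
import numpy as np

def sample_autocov(x, k=1):
    """Lag-k sample autocovariance (demeaned, divided by T)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return float(xc[: len(x) - k] @ xc[k:]) / len(x)
```

Simulating a stationary AR(1) and a random walk of the same length makes the divergence under the unit root immediately visible.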
By:  Weronika Ormaniec; Marcin Pitera; Sajad Safarveisi; Thorsten Schmidt 
Abstract:  Estimating value-at-risk on time series data with possibly heteroscedastic dynamics is a highly challenging task. Typically, we face a small-data problem in combination with a high degree of nonlinearity, causing difficulties for both classical and machine-learning estimation algorithms. In this paper, we propose a novel value-at-risk estimator using a long short-term memory (LSTM) neural network and compare its performance to benchmark GARCH estimators. Our results indicate that even for a relatively short time series, the LSTM could be used to refine or monitor risk-estimation processes and correctly identify the underlying risk dynamics in a nonparametric fashion. We evaluate the estimator on both simulated and market data with a focus on heteroscedasticity, finding that the LSTM exhibits performance similar to GARCH estimators on simulated data, whereas on real market data it is more sensitive to increasing or decreasing volatility and outperforms all existing estimators of value-at-risk in terms of exception rate and mean quantile score. 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2207.10539&r= 
By:  Mario P. Rothfelder; Otilia Boldea 
Abstract:  We show by simulation that the test for an unknown threshold in models with endogenous regressors, proposed in Caner and Hansen (2004), can exhibit severe size distortions both in small and in moderately large samples pertinent to empirical applications. We propose three new tests that rectify these size distortions. The first test is based on GMM estimators. The other two are based on unconventional 2SLS estimators that use additional information about the linearity (or lack of linearity) of the first stage. Just like the test in Caner and Hansen (2004), our tests are non-pivotal, and we prove their bootstrap validity. The empirical application revisits the question in Ramey and Zubairy (2018) of whether government spending multipliers are larger in recessions, but using tests for an unknown threshold. Consistent with Ramey and Zubairy (2018), we do not find strong evidence that these multipliers are larger in recessions. 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2207.10076&r= 
By:  Danyu Lin (University of North Carolina at Chapel Hill) 
Abstract:  Interval-censored data arise frequently in clinical, epidemiological, financial, and sociological studies, where the event or failure of interest is not observed at an exact time point but is rather known to occur within a time interval induced by periodic examinations. We formulate the effects of potentially time-dependent covariates on the failure time through the familiar Cox proportional hazards model, under which the failure time distribution is completely arbitrary. We consider nonparametric maximum-likelihood estimation with an arbitrary number of examination times for each study subject. We present an EM algorithm that involves very simple calculations and converges stably for any dataset, even in the presence of time-dependent covariates. The resulting estimators for the regression parameters are consistent, asymptotically normal, and asymptotically efficient with an easily estimated covariance matrix. In addition, we extend the EM algorithm and the theoretical results to multivariate failure time data, in which there are multiple events per subject or clustering of study subjects. Finally, we provide illustrations with real medical studies. 
Date:  2022–06–25 
URL:  http://d.repec.org/n?u=RePEc:boc:biep22:04&r= 
By:  Oliver R. Cutbill; Rami V. Tabri 
Abstract:  This paper discusses the statistical inference problem associated with testing for dependence between two continuous random variables using Kendall's τ in the context of the missing data problem. We prove that the worst-case identified set for this measure of association always includes zero. The consequence of this result is that robust inference for dependence using Kendall's τ, where robustness is with respect to the form of the missingness-generating process, is impossible. 
Keywords:  Impossible Inference; Statistical Dependence; Kendall's τ; Partial Identification; Missing Data 
Date:  2022–02 
URL:  http://d.repec.org/n?u=RePEc:syd:wpaper:202203&r= 
By:  Yan Liu 
Abstract:  This paper studies the statistical decision problem of learning an individualized intervention policy when data are obtained from observational studies or randomized experiments with imperfect compliance. Leveraging an instrumental variable, we provide a social welfare criterion that allows the policymaker to account for endogenous treatment selection. To this end, we incorporate the marginal treatment effects (MTE) when identifying treatment effect parameters and consider encouragement rules that affect social welfare through treatment take-up when designing policies. We focus on settings where encouragement rules are binary decisions on whether or not to offer a user-chosen manipulation of the instrument based on observable characteristics. We apply the representation of the social welfare criterion of encouragement rules via the MTE to the Empirical Welfare Maximization (EWM) method and derive convergence rates of the worst-case regret (welfare loss). We illustrate the EWM encouragement rule using data from the Indonesia Family Life Survey. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.09883&r= 
By:  Chiranjit Dutta; Nalini Ravishanker; Sumanta Basu 
Abstract:  In this paper we describe fast Bayesian statistical analysis of vector positive-valued time series, with application to interesting financial data streams. We discuss a flexible level correlated model (LCM) framework for building hierarchical models for vector positive-valued time series. The LCM allows us to combine marginal gamma distributions for the positive-valued component responses, while accounting for association among the components at a latent level. We use integrated nested Laplace approximation (INLA) for fast approximate Bayesian modeling via the \texttt{R-INLA} package, building custom functions to handle this setup. We use the proposed method to model interdependencies between realized volatility measures from several stock indexes. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.05374&r= 
By:  Timothy G. Conley; Bill Dupor; Mahdi Ebsim 
Abstract:  We develop a method that uses disaggregate data to conduct causal inference in macroeconomics. The approach permits one to infer the aggregate effect of a macro treatment using regional outcome data and a valid instrument. We estimate a macro effect without (sine) the aggregation (aggregatio) of the outcome variable. We exploit cross-series parameter restrictions to increase precision relative to traditional aggregate-series estimates and provide a method to assess robustness to modest departures from these restrictions. We illustrate our method by estimating the jobs effect of oil price changes using regional manufacturing employment data and an aggregate oil supply shock. 
Keywords:  aggregation; macroeconomic causal effect 
JEL:  E3 
Date:  2022–07–11 
URL:  http://d.repec.org/n?u=RePEc:fip:fedlwp:94511&r= 
By:  Ochoa Arellano, Maicol Jesús; Cascos Fernández, Ignacio 
Abstract:  For a univariate distribution, its M-quantiles are obtained as solutions to asymmetric minimization problems dealing with the distance of a random variable to a fixed point. The asymmetry refers to the different weights for the values of the random variable at either side of the fixed point. We focus on M-quantiles whose associated losses are given in terms of a power. In this setting, the classical quantiles are obtained for the first power, while the expectiles correspond to quadratic losses. The M-quantiles considered here are computed over distorted distributions, which allows one to tune the weight awarded to the more central or peripheral parts of the distribution. These distorted M-quantiles are used in the multivariate setting to introduce novel families of central regions and their associated depth functions, which are further extended to the multiple-output regression setting in the form of conditional regression regions and conditional depths. 
Keywords:  Bivariate Depth Algorithm; Data Depth; Distortion Function; Conditional Regression Region; M-Quantiles 
Date:  2022–07–14 
URL:  http://d.repec.org/n?u=RePEc:cte:wsrepe:35465&r= 
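A minimal sketch of the power-loss M-quantile described above: the minimizer of the asymmetric loss rho_alpha(u) = |alpha - 1{u < 0}| * |u|^p, so that p = 1 recovers the classical quantile and p = 2 the expectile. The distortion of the distribution, central to the paper, is omitted here; the loss is convex in theta for p >= 1, so a simple ternary search suffices. The function name is illustrative.

```python
import numpy as np

def m_quantile(x, alpha, p=2.0, tol=1e-8):
    """Level-alpha M-quantile of a sample under asymmetric power-p loss."""
    x = np.asarray(x, dtype=float)

    def loss(theta):
        u = x - theta
        w = np.where(u < 0, 1 - alpha, alpha)  # asymmetric weights
        return float(np.sum(w * np.abs(u) ** p))

    lo, hi = x.min(), x.max()
    while hi - lo > tol:  # ternary search over the convex loss
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if loss(m1) <= loss(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

With alpha = 0.5, p = 2 returns the sample mean and p = 1 the sample median, matching the special cases named in the abstract.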
By:  Collin Philipps (Department of Economics and Geosciences, US Air Force Academy) 
Abstract:  We show that Kolmogorov's classical strong law of large numbers applies to all expectiles uniformly. The expectiles of a random sample converge almost surely (uniformly) to the true expectiles if and only if the true data-generating process has a finite first moment. The result holds for expectile functions of scalar and vector-valued random variables and can be reformulated to state that the mean (or any expectile) of a random sample converges almost surely to the true mean (or expectile) if and only if any arbitrary expectile exists and is finite. 
Keywords:  Expectile Regression, Quantile Regression, Strong Law of Large Numbers 
JEL:  C0 C21 C46 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:ats:wpaper:wp20225&r= 
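A small sketch of the sample expectile, computed by iterating the asymmetric-least-squares fixed point: the tau-expectile e solves sum_i w_i (x_i - e) = 0 with w_i = tau when x_i >= e and 1 - tau otherwise, and tau = 0.5 gives the sample mean exactly. The function name is illustrative, and the large-sample behavior in the test simply mirrors the strong law stated in the abstract.

```python
import numpy as np

def expectile(x, tau, iters=100):
    """Sample tau-expectile via iteratively reweighted averaging."""
    x = np.asarray(x, dtype=float)
    e = x.mean()
    for _ in range(iters):
        w = np.where(x >= e, tau, 1 - tau)  # asymmetric squared-loss weights
        e = float(np.sum(w * x) / np.sum(w))
    return e
```

For a large i.i.d. sample with a finite first moment, the sample expectile at any level sits close to its population counterpart, which is exactly what the paper's uniform strong law guarantees.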
By:  Federico Bassetti; Roberto Casarin; Marco Del Negro 
Abstract:  We propose a nonparametric Bayesian approach for conducting inference on probabilistic surveys. We use this approach to study whether U.S. Survey of Professional Forecasters density projections for output growth and inflation are consistent with the noisy rational expectations hypothesis. We find that in contrast to theory, for horizons close to two years, there is no relationship whatsoever between subjective uncertainty and forecast accuracy for output growth density projections, both across forecasters and over time, and only a mild relationship for inflation projections. As the horizon shortens, the relationship becomes one-to-one, as the theory would predict. 
Keywords:  Bayesian inference; Bayesian nonparametrics; Survey of Professional Forecasters; noisy rational expectations 
JEL:  C11 C13 C15 C32 C58 G12 
Date:  2022–07–01 
URL:  http://d.repec.org/n?u=RePEc:fip:fednsr:94495&r= 
By:  Di Zhang; Qiang Niu; Youzhou Zhou 
Abstract:  Volatility clustering is a common phenomenon in financial time series. Typically, linear models are used to describe the temporal autocorrelation of the (logarithmic) variance of returns. Given the difficulty of estimating this model, we construct a Dynamic Bayesian Network that exploits the conjugate prior relations normal-gamma and gamma-gamma, so that the posterior form at each node locally remains unchanged. This makes it possible to quickly find approximate solutions using variational methods. Furthermore, we ensure that the volatility expressed by the model is an independent incremental process by inserting dummy gamma nodes between adjacent time steps. This model has two advantages: 1) it can provably express heavier tails than Gaussians, i.e., positive excess kurtosis, compared to popular linear models; 2) if variational inference (VI) is used for state estimation, it runs much faster than Monte Carlo (MC) methods, since the posterior calculation uses only basic arithmetic operations, and its convergence process is deterministic. We tested the model, named GamChain, using recent Crypto, Nasdaq, and Forex records of varying resolutions. The results show that: 1) when MC is used, this model can achieve state estimation results comparable to the regular lognormal chain; 2) when only VI is used, this model obtains accuracy slightly worse than MC, but still acceptable in practice; 3) using only VI, the running time of GamChain, under the most conservative settings, can be reduced to below 20% of that of the lognormal chain via MC. 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2207.01151&r= 
By:  Anthony Coache; Sebastian Jaimungal; Álvaro Cartea 
Abstract:  We propose a novel framework to solve risk-sensitive reinforcement learning (RL) problems where the agent optimises time-consistent dynamic spectral risk measures. Based on the notion of conditional elicitability, our methodology constructs (strictly consistent) scoring functions that are used as penalizers in the estimation procedure. Our contribution is threefold: we (i) devise an efficient approach to estimate a class of dynamic spectral risk measures with deep neural networks, (ii) prove that these dynamic spectral risk measures may be approximated to any arbitrary accuracy using deep neural networks, and (iii) develop a risk-sensitive actor-critic algorithm that uses full episodes and does not require any additional nested transitions. We compare our conceptually improved reinforcement learning algorithm with the nested simulation approach and illustrate its performance in two settings: statistical arbitrage and portfolio allocation, on both simulated and real data. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.14666&r= 
By:  Andrew Y. Chen 
Abstract:  I present two simple bounds for the false discovery rate (FDR) that account for publication bias. The first assumes that the publication process is no worse at finding predictability than atheoretical data mining. The second conservatively extrapolates by assuming that there are exponentially more file-drawer t-stats than published t-stats. Both methods find that at least 75% of findings in cross-sectional predictability are true. I show that, surprisingly, Harvey, Liu, and Zhu's (2016) estimates imply a similar FDR. I discuss interpretations and relate the results to the biostatistics literature. My analysis shows that carefully mapping multiple-testing statistics to economic interpretations is important. 
Date:  2022–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2206.15365&r= 
By:  Bryan T. Kelly; Semyon Malamud; Kangying Zhou 
Abstract:  The extant literature predicts market returns with “simple” models that use only a few parameters. Contrary to conventional wisdom, we theoretically prove that simple models severely understate return predictability compared to “complex” models in which the number of parameters exceeds the number of observations. We empirically document the virtue of complexity in US equity market return prediction. Our findings establish the rationale for modeling expected returns through machine learning. 
JEL:  C1 C45 G1 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:30217&r= 