
on Econometrics 
By:  Xiaohong Chen (Cowles Foundation, Yale University); Zhijie Xiao (Dept. of Economics, Boston College); Bo Wang (Dept. of Economics, Boston College) 
Abstract:  Economic and financial time series data can exhibit nonstationary and nonlinear patterns simultaneously. This paper studies copula-based time series models that capture both patterns. We propose a procedure where nonstationarity is removed via a filtration, and then the nonlinear temporal dependence in the filtered data is captured via a flexible Markov copula. We study the asymptotic properties of two estimators of the parametric copula dependence parameters: the parametric (two-step) copula estimator, where the marginal distribution of the filtered series is estimated parametrically; and the semiparametric (two-step) copula estimator, where the marginal distribution is estimated via a rescaled empirical distribution of the filtered series. We show that the limiting distribution of the parametric copula estimator depends on the nonstationary filtration and the parametric marginal distribution estimation, and may be non-normal. Surprisingly, the limiting distribution of the semiparametric copula estimator using the filtered data is shown to be the same as that without nonstationary filtration, which is normal and free of marginal distribution specification. The simple and robust properties of the semiparametric copula estimators extend to models with misspecified copulas, and facilitate statistical inferences, such as hypothesis testing and model selection tests, on semiparametric copula-based dynamic models in the presence of nonstationarity. Monte Carlo studies and real data applications are presented.
Keywords:  Residual copula, Cointegration, Unit Root, Nonstationarity, Nonlinearity, Tail Dependence, Semiparametric 
JEL:  C14 C22 
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:cwl:cwldpp:2242&r=all 
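A minimal sketch of the semiparametric two-step idea: form pseudo-observations from the rescaled empirical CDF of the filtered series, then estimate a Markov copula's dependence parameter. A first-order Gaussian copula is used purely for illustration (the paper allows general parametric copula families), and all function names here are invented for this sketch.

```python
import numpy as np
from statistics import NormalDist

def pseudo_obs(x):
    """Rescaled empirical CDF: ranks / (n + 1), keeping values strictly in (0, 1)."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1
    return ranks / (n + 1)

def gaussian_copula_rho(resid):
    """Estimate the Gaussian-copula dependence of (e_{t-1}, e_t) by mapping
    pseudo-observations through the standard normal quantile and correlating."""
    u = pseudo_obs(resid)
    z = np.array([NormalDist().inv_cdf(v) for v in u])
    return np.corrcoef(z[:-1], z[1:])[0, 1]

rng = np.random.default_rng(0)
# toy "filtered" series: a stationary Gaussian AR(1), standing in for residuals
e = np.zeros(2000)
for t in range(1, 2000):
    e[t] = 0.6 * e[t - 1] + rng.standard_normal()
rho_hat = gaussian_copula_rho(e)   # true Gaussian-copula parameter is 0.6
```

Because only ranks of the filtered series enter, the estimator is free of the marginal distribution specification, which is the robustness property the abstract highlights.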
By:  Zhishui Hu; Ioannis Kasparis; Qiying Wang 
Abstract:  A novel IV estimation method, which we term Locally Trimmed LS (LTLS), is developed that yields estimators with (mixed) Gaussian limit distributions in situations where the data may be weakly or strongly persistent. In particular, we allow for nonlinear predictive-type regressions where the regressor can be a stationary short/long memory process, a nonstationary long memory process, or a nearly integrated array. The resultant t-tests have conventional limit distributions (i.e. N(0,1)) free of (near-to-unity and long memory) nuisance parameters. In the case where the regressor is a fractional process, no preliminary estimator for the memory parameter is required. The practitioner can therefore conduct inference while being agnostic about the exact dependence structure in the data. The LTLS estimator is obtained by applying a certain chronological trimming to the OLS instrument via appropriate kernel functions of time trend variables. The finite sample performance of LTLS-based t-tests is investigated with the aid of a simulation experiment. An empirical application to the predictability of stock returns is also provided.
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.12595&r=all 
By:  Donggyu Kim; Xinyu Song; Yazhen Wang 
Abstract:  This paper introduces unified models for high-dimensional factor-based Ito processes, which can accommodate both continuous-time Ito diffusion and discrete-time stochastic volatility (SV) models by embedding the discrete SV model in the continuous instantaneous factor volatility process. We call this the SV-Ito model. Based on a series of daily integrated factor volatility matrix estimators, we propose quasi-maximum likelihood and least squares estimation methods and establish their asymptotic properties. We apply the proposed method to predicting future vast volatility matrices and study the predictor's asymptotic behavior. A simulation study is conducted to check the finite sample performance of the proposed estimation and prediction methods. An empirical analysis is carried out to demonstrate the advantage of the SV-Ito model in volatility prediction and portfolio allocation problems.
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.12039&r=all 
By:  David Benson; Matthew A. Masten; Alexander Torgovitsky 
Abstract:  We present the ivcrc command, which implements an instrumental variables (IV) estimator for the linear correlated random coefficients (CRC) model. This model is a natural generalization of the standard linear IV model that allows for endogenous, multivalued treatments and unobserved heterogeneity in treatment effects. The proposed estimator uses recent semiparametric identification results that allow for flexible functional forms and permit instruments that may be binary, discrete, or continuous. The command also allows for the estimation of varying coefficients regressions, which are closely related in structure to the proposed IV estimator. We illustrate this IV estimator and the ivcrc command by estimating the returns to education in the National Longitudinal Survey of Young Men. 
Keywords:  ivregress; Instrumental variables; Correlated random coefficients; Heterogeneous treatment effects; Varying coefficient models; Returns to schooling 
JEL:  C14 C51 I26 C26 
Date:  2020–06–16 
URL:  http://d.repec.org/n?u=RePEc:fip:fedgfe:202046&r=all 
By:  Deborah Gefang; Gary Koop; Aubrey Poon 
Abstract:  Mixed frequency Vector Autoregressions (MF-VARs) can be used to provide timely and high frequency estimates or nowcasts of variables for which data are available only at a low frequency. Bayesian methods are commonly used with MF-VARs to overcome over-parameterization concerns, but they typically rely on computationally demanding Markov Chain Monte Carlo (MCMC) methods. In this paper, we develop Variational Bayes (VB) methods for use with MF-VARs using Dirichlet-Laplace global-local shrinkage priors. We show that these methods are accurate and computationally much more efficient than MCMC in two empirical applications involving large MF-VARs.
Keywords:  Mixed Frequency, Variational inference, Vector Autoregression, Stochastic Volatility, Hierarchical Prior, Forecasting 
JEL:  C11 C32 C53 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:nsr:escoed:escoedp202007&r=all 
By:  Atsushi Inoue; Lutz Kilian 
Abstract:  Structural VAR models are routinely estimated by Bayesian methods. Several recent studies have voiced concerns about the common use of posterior median (or mean) response functions in applied VAR analysis. In this paper, we show that these response functions can be misleading because in empirically relevant settings there need not exist a posterior draw for the impulse response function that matches the posterior median or mean response function, even as the number of posterior draws approaches infinity. As a result, the use of these summary statistics may distort the shape of the impulse response function which is of foremost interest in applied work. The same concern applies to error bands based on the upper and lower quantiles of the marginal posterior distributions of the impulse responses. In addition, these error bands fail to capture the full uncertainty about the estimates of the structural impulse responses. In response to these concerns, we propose new estimators of impulse response functions under quadratic loss, under absolute loss and under Dirac delta loss that are consistent with Bayesian statistical decision theory, that are optimal in the relevant sense, that respect the dynamics of the impulse response functions and that are easy to implement. We also propose joint credible sets for these estimators derived under the same loss function. Our analysis covers a much wider range of structural VAR models than previous proposals in the literature including models that combine short-run and long-run exclusion restrictions and models that combine zero restrictions, sign restrictions and narrative restrictions.
Keywords:  Loss function; joint inference; median response function; mean response function; modal model 
JEL:  C22 C32 C52 
Date:  2020–07–17 
URL:  http://d.repec.org/n?u=RePEc:fip:feddwp:88408&r=all 
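The requirement that a summary IRF should itself correspond to a posterior draw (so that it respects the dynamics of the responses) can be sketched under quadratic loss by selecting the draw that minimizes the posterior expected squared distance to all draws. This is a simplified stand-in for the authors' estimators, not their exact procedure, and the function name is invented here.

```python
import numpy as np

def closest_draw_quadratic(draws):
    """Among posterior IRF draws (n_draws x horizon), return the draw that
    minimizes the average squared distance to all draws, i.e. the draw-
    restricted Bayes estimator under quadratic loss."""
    d2 = ((draws[:, None, :] - draws[None, :, :]) ** 2).sum(axis=2)
    return draws[d2.mean(axis=1).argmin()]

rng = np.random.default_rng(1)
horizon = 12
true_irf = 0.8 ** np.arange(horizon)          # toy decaying impulse response
draws = true_irf + 0.1 * rng.standard_normal((500, horizon))
irf_hat = closest_draw_quadratic(draws)       # a genuine posterior draw
```

Unlike a pointwise posterior median, the result is by construction one of the drawn response functions, so its shape is internally consistent across horizons.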
By:  Roberto Molinari (Auburn University); Gaetan Bakalli (University of Geneva  Geneva School of Economics and Management); Stéphane Guerrier (University of Geneva  Geneva School of Economics and Management); Cesare Miglioli (University of Geneva  Geneva School of Economics and Management); Samuel Orso (University of Geneva  Geneva School of Economics and Management); O. Scaillet (University of Geneva GSEM and GFRI; Swiss Finance Institute; University of Geneva  Research Center for Statistics) 
Abstract:  Predictive power has always been the main research focus of learning algorithms with the goal of minimizing the test error for supervised classification and regression problems. While the general approach for these algorithms is to consider all possible attributes in a dataset to best predict the response of interest, an important branch of research is focused on sparse learning in order to avoid overfitting, which can greatly affect the accuracy of out-of-sample prediction. However, in many practical settings we believe that only an extremely small combination of different attributes affects the response, whereas even sparse-learning methods can still preserve a high number of attributes in high-dimensional settings and possibly deliver inconsistent prediction performance. As a consequence, the latter methods can also be hard to interpret for researchers and practitioners, a problem which is even more relevant for the “black-box”-type mechanisms of many learning approaches. Finally, aside from needing to quantify prediction uncertainty, there is often a problem of replicability, since not all data-collection procedures measure (or observe) the same attributes and therefore cannot make use of proposed learners for testing purposes. To address all the previous issues, we propose to study a procedure that combines screening and wrapper methods and aims to find a library of extremely low-dimensional attribute combinations (with consequent low data collection and storage costs) in order to (i) match or improve the predictive performance of any particular learning method which uses all attributes as an input (including sparse learners); (ii) provide a low-dimensional network of attributes easily interpretable by researchers and practitioners; and (iii) increase the potential replicability of results due to a diversity of attribute combinations defining strong learners with equivalent predictive power. We call this algorithm the “Sparse Wrapper AlGorithm” (SWAG).
Keywords:  interpretable machine learning, big data, wrapper, sparse learning, meta learning, ensemble learning, greedy algorithm, feature selection, variable importance network 
JEL:  C45 C51 C52 C53 C55 C87 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:chf:rpseri:rp2049&r=all 
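A toy version of the screen-and-wrap idea: score single attributes, keep the best few, then repeatedly extend each kept subset by one attribute, keeping a library of the best subsets at each size. Here OLS holdout MSE stands in for the generic learner, and `swag` and `holdout_mse` are names invented for this sketch, not the authors' implementation.

```python
import numpy as np

def holdout_mse(X, y, cols, split=0.7):
    """Fit OLS on a training split using only `cols` and return test MSE."""
    n = len(y); m = int(split * n)
    A = np.column_stack([np.ones(n), X[:, cols]])
    beta = np.linalg.lstsq(A[:m], y[:m], rcond=None)[0]
    return float(np.mean((y[m:] - A[m:] @ beta) ** 2))

def swag(X, y, max_size=3, keep=5):
    """Greedy screen-and-wrap: keep the `keep` best single attributes, then
    grow each kept subset by one attribute at a time, retaining a library
    of the `keep` best subsets of each size."""
    p = X.shape[1]
    scored = sorted(range(p), key=lambda j: holdout_mse(X, y, [j]))
    library = [[j] for j in scored[:keep]]
    result = {1: library}
    for size in range(2, max_size + 1):
        cand = {tuple(sorted(s + [j])) for s in library for j in range(p) if j not in s}
        ranked = sorted(cand, key=lambda c: holdout_mse(X, y, list(c)))
        library = [list(c) for c in ranked[:keep]]
        result[size] = library
    return result

rng = np.random.default_rng(6)
n, p = 400, 30
X = rng.standard_normal((n, p))
y = 2 * X[:, 3] - 1.5 * X[:, 7] + 0.5 * rng.standard_normal(n)  # only 2 attributes matter
lib = swag(X, y)
```

The returned library of small subsets with similar predictive power is what gives the interpretability and replicability benefits described in the abstract.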
By:  Carolina Caetano; Gregorio Caetano; Juan Carlos Escanciano 
Abstract:  We study identification and estimation in the Regression Discontinuity Design (RDD) with a multivalued treatment variable. We also allow for the inclusion of covariates. We show that without additional information, treatment effects are not identified. We give necessary and sufficient conditions that lead to identification of LATEs as well as of weighted averages of the conditional LATEs. We show that if the first stage discontinuities of the multiple treatments conditional on covariates are linearly independent, then it is possible to identify multivariate weighted averages of the treatment effects with convenient identifiable weights. If, moreover, treatment effects do not vary with some covariates or a flexible parametric structure can be assumed, it is possible to identify (in fact, overidentify) all the treatment effects. The overidentification can be used to test these assumptions. We propose a simple estimator, which can be programmed in packaged software as a Two-Stage Least Squares regression, and packaged standard errors and tests can also be used. Finally, we implement our approach to identify the effects of different types of insurance coverage on health care utilization, as in Card, Dobkin and Maestas (2008).
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2007.00185&r=all 
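The "programmable as Two-Stage Least Squares" point can be illustrated on a toy fuzzy RDD with a single endogenous treatment, a linear running-variable control, and the side-of-cutoff dummy as instrument. This is a generic 2SLS sketch under simulated data, not the paper's multivalued-treatment estimator; `tsls` is a name invented here.

```python
import numpy as np

def tsls(y, X, Z):
    """Two-stage least squares: project X on Z, then regress y on the projection."""
    Xhat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first stage
    return np.linalg.lstsq(Xhat, y, rcond=None)[0]    # second stage

rng = np.random.default_rng(2)
n = 5000
running = rng.uniform(-1, 1, n)
above = (running >= 0).astype(float)          # side-of-cutoff instrument
v = rng.standard_normal(n)
d = 0.5 * above + 0.3 * v                     # treatment jumps at the cutoff
u = 0.5 * v + 0.3 * rng.standard_normal(n)    # endogeneity: u correlated with v
y = 2.0 * d + running + u                     # true treatment effect is 2.0
X = np.column_stack([np.ones(n), running, d])
Z = np.column_stack([np.ones(n), running, above])
beta = tsls(y, X, Z)                               # beta[2] estimates the effect
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]    # OLS is biased upward here
```

With several treatments, the same mechanics go through provided the first-stage discontinuities (conditional on covariates) are linearly independent, which is exactly the identification condition in the abstract.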
By:  Florian Huber; Luca Rossini 
Abstract:  Vector autoregressive (VAR) models assume linearity between the endogenous variables and their lags. This linearity assumption might be overly restrictive and could have a deleterious impact on forecasting accuracy. As a solution, we propose combining VAR with Bayesian additive regression tree (BART) models. The resulting Bayesian additive vector autoregressive tree (BAVART) model is capable of capturing arbitrary nonlinear relations between the endogenous variables and the covariates without much input from the researcher. Since controlling for heteroscedasticity is key for producing precise density forecasts, our model allows for stochastic volatility in the errors. Using synthetic and real data, we demonstrate the advantages of our methods. For Eurozone data, we show that our nonparametric approach improves upon commonly used forecasting models and that it produces impulse responses to an uncertainty shock that are consistent with established findings in the literature. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.16333&r=all 
By:  Alexander Giessing; Jianqing Fan 
Abstract:  This paper considers a new bootstrap procedure to estimate the distribution of high-dimensional $\ell_p$-statistics, i.e. the $\ell_p$-norms of the sum of $n$ independent $d$-dimensional random vectors with $d \gg n$ and $p \in [1, \infty]$. We provide a non-asymptotic characterization of the sampling distribution of $\ell_p$-statistics based on Gaussian approximation and show that the bootstrap procedure is consistent in the Kolmogorov-Smirnov distance under mild conditions on the covariance structure of the data. As an application of the general theory we propose a bootstrap hypothesis test for simultaneous inference on high-dimensional mean vectors. We establish its asymptotic correctness and consistency under high-dimensional alternatives, and discuss the power of the test as well as the size of associated confidence sets. We illustrate the bootstrap and testing procedure numerically on simulated data.
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.13099&r=all 
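For the $p = \infty$ case, a Gaussian multiplier bootstrap is a standard way to approximate the distribution of the sup-statistic in this $d \gg n$ setting. The sketch below is one such implementation under simulated null data; it is illustrative and not necessarily the exact procedure of the paper.

```python
import numpy as np

def multiplier_bootstrap_sup(X, n_boot=2000, rng=None):
    """Gaussian multiplier bootstrap for the sup-statistic
    max_j |n^{-1/2} sum_i X_ij|: reweight the centered rows with
    independent N(0,1) multipliers and recompute the statistic."""
    rng = rng or np.random.default_rng(0)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    G = rng.standard_normal((n, n_boot))              # multiplier draws
    return np.abs(Xc.T @ G).max(axis=0) / np.sqrt(n)  # one sup-stat per draw

rng = np.random.default_rng(3)
n, d = 200, 1000                       # high-dimensional: d >> n
X = rng.standard_normal((n, d))        # mean-zero null data
stat = np.abs(X.sum(axis=0)).max() / np.sqrt(n)
boot = multiplier_bootstrap_sup(X, rng=rng)
crit = float(np.quantile(boot, 0.95))  # bootstrap 5%-level critical value
```

Comparing `stat` to `crit` gives the simultaneous test for a high-dimensional mean vector mentioned in the abstract.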
By:  Masahiro Kato 
Abstract:  This study addresses the problem of off-policy evaluation (OPE) from dependent samples obtained via the bandit algorithm. The goal of OPE is to evaluate a new policy using historical data obtained from behavior policies generated by the bandit algorithm. Because the bandit algorithm updates the policy based on past observations, the samples are not independent and identically distributed (i.i.d.). However, several existing methods for OPE do not take this issue into account and are based on the assumption that samples are i.i.d. In this study, we address this problem by constructing an estimator from a standardized martingale difference sequence. To standardize the sequence, we consider using evaluation data or sample splitting with a two-step estimation. This technique produces an estimator with asymptotic normality without restricting a class of behavior policies. In an experiment, the proposed estimator performs better than existing methods, which assume that the behavior policy converges to a time-invariant policy.
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.06982&r=all 
By:  Angelica Gianfreda; Francesco Ravazzolo; Luca Rossini 
Abstract:  We study the importance of time-varying volatility in modelling hourly electricity prices when fundamental drivers are included in the estimation. This allows us to contribute to the literature on large Bayesian VARs by using well-known time series models in a huge dimension for the matrix of coefficients. Based on novel Bayesian techniques, we exploit the importance of both Gaussian and non-Gaussian error terms in stochastic volatility. We find that including regressors such as fuel prices, forecasted demand and forecasted renewable energy is essential in order to properly capture the volatility of these prices. Moreover, we show that the time-varying volatility models outperform the constant volatility models in both in-sample model fit and out-of-sample forecasting performance.
Keywords:  Electricity, Hourly Prices, Renewable Energy Sources, Non-Gaussian, Stochastic Volatility, Forecasting 
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:bny:wpaper:0088&r=all 
By:  Ulrich K. Mueller 
Abstract:  Standard inference about a scalar parameter estimated via GMM amounts to applying a t-test to a particular set of observations. If the number of observations is not very large, then moderately heavy tails can lead to poor behavior of the t-test. This is a particular problem under clustering, since the number of observations then corresponds to the number of clusters, and heterogeneity in cluster sizes induces a form of heavy tails. This paper combines extreme value theory for the smallest and largest observations with a normal approximation for the average of the remaining observations to construct a more robust alternative to the t-test. The new test is found to control size much more successfully in small samples compared to existing methods. Analytical results in the canonical problem of inference for the mean demonstrate that the new test provides a refinement over the full sample t-test under more than two but less than three moments, while the bootstrapped t-test does not.
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2007.07065&r=all 
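The paper's key move is to treat the few extreme observations separately (via extreme value theory) from the normal approximation for the middle observations. The sketch below only illustrates the "split off the extremes" side, computing a t-statistic from the middle observations after removing the k smallest and largest; the EVT correction for the removed extremes, which is the substance of the paper, is not implemented here.

```python
import numpy as np

def trimmed_t(x, k=2):
    """t-statistic for the mean using only the middle observations,
    dropping the k smallest and k largest. (The paper instead models the
    extremes with extreme value theory rather than discarding them; this
    sketch only shows why heavy tails destabilize the full-sample t-test.)"""
    xs = np.sort(x)[k:-k]
    n = len(xs)
    return np.sqrt(n) * xs.mean() / xs.std(ddof=1)

rng = np.random.default_rng(4)
# heavy-tailed sample: t(2.5) has more than two but less than three moments
x = rng.standard_t(df=2.5, size=50)
t_full = np.sqrt(len(x)) * x.mean() / x.std(ddof=1)
t_trim = trimmed_t(x)
```

In repeated samples, `t_full` is erratic because one or two extreme draws dominate the sample variance; the middle-sample statistic is far more stable, which is the behavior the combined EVT-plus-normal test exploits.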
By:  Clément de Chaisemartin; Xavier D'Haultfœuille 
Abstract:  We consider the estimation of a scalar parameter when two estimators are available. The first is always consistent. The second is inconsistent in general, but has a smaller asymptotic variance than the first, and may be consistent if an assumption is satisfied. We propose to use the weighted sum of the two estimators with the lowest estimated mean-squared error (MSE). We show that this third estimator dominates the other two from a minimax-regret perspective: the maximum asymptotic MSE gain from using this estimator rather than one of the other two is larger than the maximum asymptotic MSE loss.
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.14667&r=all 
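A minimal sketch of the MSE-minimizing weighted sum. Two simplifying assumptions are made here that the paper does not need: the squared bias of the inconsistent estimator is estimated by the squared difference between the two estimators, and the estimators are treated as uncorrelated. The function name is invented for this sketch.

```python
def combine(theta_c, var_c, theta_i, var_i):
    """Weighted sum of a consistent estimator theta_c and a lower-variance,
    possibly inconsistent estimator theta_i. The weight minimizes an
    estimated MSE in which the squared bias of theta_i is estimated by
    (theta_i - theta_c)^2 and the estimators are treated as uncorrelated --
    both simplifying assumptions of this sketch."""
    mse_i = var_i + (theta_i - theta_c) ** 2
    w = mse_i / (var_c + mse_i)        # weight on the consistent estimator
    return w * theta_c + (1 - w) * theta_i

# consistent estimate 1.0 (variance 0.04), efficient estimate 0.9 (variance 0.01)
est = combine(1.0, 0.04, 0.9, 0.01)
```

When the two estimates agree, all weight effectively shifts to the efficient estimator; as they diverge, the estimated bias grows and the weight moves back toward the consistent one.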
By:  Simon Freyaldenhoven 
Abstract:  Factor models are generally subject to a rotational indeterminacy, meaning that individual factors are only identified up to a rotation. In the presence of local factors, which only affect a subset of the outcomes, we show that the implied sparsity of the loading matrix can be used to resolve this rotational indeterminacy. We further prove that a rotation criterion based on the ℓ1-norm of the loading matrix can be used to achieve identification even under approximate sparsity in the loading matrix. This enables us to consistently estimate individual factors and to interpret them as structural objects. Monte Carlo simulations suggest that our criterion performs better than widely used heuristics, and we find strong evidence for the presence of local factors in financial and macroeconomic datasets.
Keywords:  identification; factor models; sparsity; local factors 
JEL:  C38 C55 
Date:  2020–06–22 
URL:  http://d.repec.org/n?u=RePEc:fip:fedpwp:88229&r=all 
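The ℓ1 rotation criterion can be sketched in the two-factor case by a grid search over rotation angles: among all rotations of an estimated loading matrix, pick the one with the smallest entrywise ℓ1-norm. With a local factor (sparse second column), this recovers the sparse orientation up to column sign/permutation. This toy version is for two factors only and is not the paper's general algorithm.

```python
import numpy as np

def l1_rotation(L, n_grid=721):
    """Grid-search two-factor rotations R(theta) and return L @ R(theta)
    with the smallest entrywise l1-norm (a toy version of the paper's
    l1-rotation criterion, for two factors only)."""
    best = np.inf
    for theta in np.linspace(-np.pi / 2, np.pi / 2, n_grid):
        c, s = np.cos(theta), np.sin(theta)
        LR = L @ np.array([[c, -s], [s, c]])
        if np.abs(LR).sum() < best:
            best, best_L = np.abs(LR).sum(), LR
    return best_L

# sparse "local factor" loadings: factor 2 loads only on the last 3 series
L0 = np.zeros((10, 2))
L0[:, 0] = 1.0
L0[7:, 1] = 1.0
a = np.pi / 6                              # arbitrary mixing rotation
Q = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
L_rec = l1_rotation(L0 @ Q)                # undo the mixing via the l1 criterion
```

Because the ℓ1-norm of the mixed loadings exceeds that of the sparse orientation, the criterion pins down the rotation that ordinary factor estimation leaves indeterminate.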
By:  Young Shin Kim; KumHwan Roh; Raphael Douady 
Abstract:  In this paper, we introduce a new time series model with a stochastic exponential tail. The model is constructed from the Normal Tempered Stable distribution with a time-varying parameter. It captures the stochastic exponential tail, which generates the volatility smile effect and volatility term structure in option pricing, and describes the time-varying volatility of volatility. We empirically demonstrate stochastic skewness and stochastic kurtosis by applying the model to S\&P 500 index return data. We present a Monte Carlo simulation technique for calibrating the model's parameters to S\&P 500 option prices. The calibration shows that the stochastic exponential tail improves the model's ability to fit market option prices.
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.07669&r=all 
By:  Subrato Banerjee; Benno Torgler 
Abstract:  Because the use of pvalues in statistical inference often involves the rejection of a hypothesis on the basis of a number that itself assumes the hypothesis to be true, many in the scientific community argue that inference should instead be based on the hypothesis’ actual probability conditional on supporting data. In this study, therefore, we propose a nonBayesian approach to achieving statistical inference independent of any prior beliefs about hypothesis probability, which are frequently subject to human bias. In doing so, we offer an important statistical tool to biology, medicine, and any other academic field that employs experimental methodology. 
Keywords:  Statistical inference; experimental science; hypothesis testing; conditional probability 
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:cra:wpaper:202014&r=all 
By:  Jozef Barunik; Michael Ellington 
Abstract:  We propose new measures to characterize dynamic network connections in large financial and economic systems. Our measures allow one to describe and understand causal network structures that evolve throughout time and over horizons, using variance decomposition matrices from time-varying parameter VAR (TVP VAR) models. These methods allow researchers and practitioners to examine network connections over any horizon of interest, whilst also being applicable to a wide range of economic and financial data. Our empirical application redefines the meaning of big in big data, in the context of TVP VAR models, and tracks dynamic connections among illiquidity ratios of all S\&P500 constituents. We then study the information content of these measures for the market return and real economy.
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2007.07842&r=all 
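The building block behind such connectedness measures is a variance decomposition table: the share of each variable's forecast error variance attributable to shocks in each other variable. The sketch below computes a generalized-FEVD connectedness table for a constant-parameter VAR(1), in the Diebold-Yilmaz style; the paper's measures replace the constant coefficients with time-varying (TVP VAR) estimates, which this sketch does not do.

```python
import numpy as np

def connectedness(A, Sigma, horizon=10):
    """Generalized FEVD connectedness table for a VAR(1)
    y_t = A y_{t-1} + e_t, e_t ~ (0, Sigma). Returns the row-normalized
    share table theta and the total (system-wide) connectedness in percent."""
    k = A.shape[0]
    sig = np.diag(Sigma)
    num = np.zeros((k, k))
    den = np.zeros(k)
    P = np.eye(k)
    for _ in range(horizon):                    # MA coefficients A^h
        PS = P @ Sigma
        num += (PS ** 2) / sig[None, :]         # scaled shock contributions
        den += np.einsum('ij,ij->i', PS, P)     # diag(P Sigma P')
        P = A @ P
    theta = num / den[:, None]
    theta /= theta.sum(axis=1, keepdims=True)   # row-normalize the shares
    total = 100 * (theta.sum() - np.trace(theta)) / k
    return theta, total

A = np.array([[0.5, 0.2], [0.1, 0.4]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
theta, total = connectedness(A, Sigma)
```

Off-diagonal entries of `theta` are the directional connections; recomputing the table from rolling or time-varying parameter estimates yields the dynamic networks studied in the paper.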
By:  Jungsik Hwang 
Abstract:  Extracting previously unknown patterns and information from time series is central to many real-world applications. In this study, we introduce a novel approach to modeling financial time series using a deep learning model. We use a Long Short-Term Memory (LSTM) network equipped with trainable initial hidden states. By learning to reconstruct time series, the proposed model can represent high-dimensional time series data with its parameters. An experiment with Korean stock market data showed that the model was able to capture the relative similarity between a large number of stock prices in its latent space. The model was also able to predict future stock trends from the latent space. The proposed method can help to identify relationships among many time series and could be applied to financial applications such as optimizing investment portfolios.
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2007.06848&r=all 
By:  Otsu, Taisuke; Pesendorfer, Martin; Sasaki, Yuya; Takahashi, Yuya 
Abstract:  We propose a multiplicity-robust estimation method for (static or dynamic) games. The method allows for distinct behaviors and strategies across markets by treating market-specific behaviors as correlated latent variables, with their conditional probability measure treated as an infinite-dimensional nuisance parameter. Instead of solving the intermediate problem, which requires optimization over an infinite-dimensional set, we consider the equivalent dual problem, which entails optimization over only a finite-dimensional Euclidean space. This property allows for a practically feasible characterization of the identified region for the structural parameters. We apply the estimation method to the newspaper market previously studied in Gentzkow et al. (2014) to characterize the identified region of marginal costs.
Date:  2020–01 
URL:  http://d.repec.org/n?u=RePEc:cpr:ceprdp:14342&r=all 
By:  Elena Andreou; Eric Ghysels 
Abstract:  This paper presents an innovative approach to extract Volatility Factors which predict the VIX, the S&P500 Realized Volatility (RV) and the Variance Risk Premium (VRP). The approach is innovative along two dimensions: (1) we extract Volatility Factors from panels of filtered volatilities, in particular large panels of univariate ARCH-type models, and propose methods to estimate common Volatility Factors in the presence of estimation error; and (2) we price equity volatility risk using factors which go beyond the equity class, namely Volatility Factors extracted from panels of volatilities of short-run funding spreads. The role of these Volatility Factors is compared with the corresponding factors extracted from the panels of the above spreads as well as related factors proposed in the literature. Our monthly short-run funding spread Volatility Factors provide both in- and out-of-sample predictive gains for forecasting the monthly VIX, RV and the equity premium, while the corresponding daily volatility factors via Mixed Data Sampling (MIDAS) models provide further improvements.
Keywords:  Factor asset pricing models; Volatility Factors; ARCH filters 
JEL:  C2 C5 G1 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:ucy:cypeua:042020&r=all 
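The MIDAS step, mixing daily factors into a monthly regression, is usually implemented with a parsimonious lag polynomial. The sketch below shows the common exponential Almon weighting scheme; the specific parameter values are illustrative, not taken from the paper, and the function names are invented here.

```python
import numpy as np

def exp_almon_weights(n_lags, theta1, theta2):
    """Exponential Almon lag weights used in MIDAS regressions:
    w_k proportional to exp(theta1*k + theta2*k^2), normalized to sum to 1."""
    k = np.arange(1, n_lags + 1)
    w = np.exp(theta1 * k + theta2 * k ** 2)
    return w / w.sum()

def midas_regressor(x_daily, w):
    """Weight the most recent len(w) daily observations (newest first)
    into a single low-frequency (e.g. monthly) regressor."""
    return float(w @ x_daily[-len(w):][::-1])

# ~one month of daily lags, with weight declining in the lag
w = exp_almon_weights(22, 0.01, -0.0099)
```

With only two shape parameters, the scheme lets daily volatility information enter a monthly forecasting regression without estimating 22 free lag coefficients.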
By:  Shankhyajyoti De; Arabin Kumar Dey; Deepak Gauda 
Abstract:  In this paper, we show an innovative way to construct a bootstrap confidence interval for a signal estimated with a univariate LSTM model. We consider three types of bootstrap methods for the dependent setup. We offer suggestions for selecting the optimal block length when bootstrapping the sample. We also propose a benchmark to compare confidence intervals measured through different bootstrap strategies. We illustrate the experimental results on stock price data sets.
Date:  2020–07 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2007.00254&r=all 
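One standard bootstrap for dependent data of the kind the paper considers is the moving block bootstrap: resample overlapping blocks with replacement and concatenate them to the original length before computing the statistic. The sketch below applies it to the mean of a toy AR(1) series; the paper's statistics (LSTM-based signals) and its block-length selection rule are not reproduced here.

```python
import numpy as np

def moving_block_bootstrap(x, block_len, n_boot=1000, stat=np.mean, rng=None):
    """Moving block bootstrap for a dependent series: draw overlapping
    blocks of length block_len with replacement and concatenate them
    back to the original length before computing the statistic."""
    rng = rng or np.random.default_rng(0)
    n = len(x)
    blocks = np.lib.stride_tricks.sliding_window_view(x, block_len)
    n_blocks = int(np.ceil(n / block_len))
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(blocks), n_blocks)
        out[b] = stat(np.concatenate(blocks[idx])[:n])
    return out

rng = np.random.default_rng(5)
x = np.zeros(500)
for t in range(1, 500):                 # dependent toy series: AR(1)
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
boot = moving_block_bootstrap(x, block_len=20, rng=rng)
lo, hi = np.quantile(boot, [0.025, 0.975])   # 95% percentile interval
```

Keeping whole blocks intact preserves the within-block dependence structure, which is what makes the interval valid for serially dependent data; the block length trades off bias (too short breaks dependence) against variance (too long leaves few distinct blocks).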