
on Econometrics 
By:  Kerem Tuzcuoglu 
Abstract:  Modeling and estimating persistent discrete data can be challenging. In this paper, we use an autoregressive panel probit model in which the autocorrelation in the discrete variable is driven by the autocorrelation in the latent variable. In such a nonlinear model, the autocorrelation in an unobserved variable results in an intractable likelihood containing high-dimensional integrals. To tackle this problem, we use composite likelihoods that involve a much lower order of integration. However, parameter identification becomes problematic, since the information employed in lower-dimensional distributions may not be rich enough for identification. Therefore, we characterize the types of composite likelihoods that are valid for this model and study conditions under which the parameters can be identified. Moreover, we provide consistency and asymptotic normality results for the pairwise composite likelihood estimator and conduct Monte Carlo simulations to assess its finite-sample performance. Finally, we apply our method to analyze credit ratings. The results indicate a significant improvement in the estimated transition probabilities between rating classes compared with static models. 
Keywords:  Credit risk management; Econometric and statistical methods; Economic models 
JEL:  C23 C25 C58 G24 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:bca:bocawp:1916&r=all 
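The pairwise composite likelihood idea can be sketched for a probit with an AR(1) latent error. This is a minimal illustration, not the paper's panel specification: the single-regressor model, the parameter values, and the unit-variance AR(1) parameterization are all assumptions made for the example. Each adjacent pair of observations contributes a bivariate normal orthant probability, so only two-dimensional integrals are needed.

```python
import numpy as np
from scipy.stats import multivariate_normal

def pairwise_loglik(beta, rho, y, X):
    """Pairwise composite log-likelihood for a probit whose latent error
    is a stationary AR(1) with unit marginal variance (assumed
    normalization): corr(u_t, u_{t+1}) = rho."""
    m = X @ beta                       # latent index
    s = 2 * y - 1                      # map {0,1} -> {-1,+1}
    ll = 0.0
    for t in range(len(y) - 1):
        r = s[t] * s[t + 1] * rho      # sign-adjusted pair correlation
        p = multivariate_normal(mean=[0.0, 0.0],
                                cov=[[1.0, r], [r, 1.0]]).cdf(
            [s[t] * m[t], s[t + 1] * m[t + 1]])
        ll += np.log(max(p, 1e-300))
    return ll

# simulate a toy series from the model and evaluate the objective
rng = np.random.default_rng(0)
T, rho, beta0 = 200, 0.6, np.array([0.5])
X = rng.normal(size=(T, 1))
u = np.zeros(T)
u[0] = rng.normal()
for t in range(1, T):
    u[t] = rho * u[t - 1] + np.sqrt(1 - rho**2) * rng.normal()
y = (X[:, 0] * beta0[0] + u > 0).astype(int)
ll = pairwise_loglik(beta0, rho, y, X)
print(ll)
```

In practice this objective would be maximized over (beta, rho); the sketch only shows how each pair replaces a T-dimensional integral with a bivariate one.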
By:  Martin Burda; Louis Belisle 
Abstract:  The Copula Multivariate GARCH (CMGARCH) model is based on a dynamic copula function with time-varying parameters. It is particularly suited for modelling the dynamic dependence of non-elliptically distributed financial returns series. The model allows for capturing more flexible dependence patterns than a multivariate GARCH model and also generalizes static copula dependence models. Nonetheless, the model is subject to a number of parameter constraints that ensure positivity of variances and covariance stationarity of the modeled stochastic processes. As such, the resulting distribution of parameters of interest is highly irregular, characterized by skewness, asymmetry, and truncation, hindering the applicability and accuracy of asymptotic inference. In this paper, we propose Bayesian analysis of the CMGARCH model based on Constrained Hamiltonian Monte Carlo (CHMC), which has been shown in other contexts to yield efficient inference on complicated constrained dependence structures. In the CMGARCH context, we contrast CHMC with the traditional random-walk sampling used in the previous literature and highlight the benefits of CHMC for applied researchers. We estimate the posterior mean, median and Bayesian confidence intervals for the coefficients of tail dependence. The analysis is performed in an application to a recent portfolio of S&P500 financial asset returns. 
Keywords:  Dynamic conditional volatility, varying correlation model, Markov Chain Monte Carlo 
JEL:  C11 C15 C32 C63 
Date:  2019–04–29 
URL:  http://d.repec.org/n?u=RePEc:tor:tecipa:tecipa638&r=all 
By:  Milda Norkuté; Vasilis Sarafidis; Takashi Yamagata; Guowei Cui 
Abstract:  This paper develops two instrumental variable (IV) estimators for dynamic panel data models with exogenous covariates and a multifactor error structure when both the cross-sectional and time series dimensions, N and T respectively, are large. Our approach initially projects out the common factors from the exogenous covariates of the model and constructs instruments based on these defactored covariates. For models with homogeneous slope coefficients, we propose a two-step IV estimator: the first-step IV estimator is obtained using the defactored covariates as instruments. In the second step, the entire model is defactored by the factors extracted from the residuals of the first-step estimation, and the final IV estimator is subsequently obtained. For models with heterogeneous slope coefficients, we propose a mean-group type estimator, which is the cross-sectional average of first-step IV estimators of cross-section-specific slopes. It is noteworthy that our estimators do not require us to seek instrumental variables outside the model. Furthermore, our estimators are linear and hence computationally robust and inexpensive. Moreover, they require no bias correction, and they are not subject to the small-sample bias of least-squares-type estimators. The finite-sample performance of the proposed estimators and associated statistical tests is investigated, and the results show that the estimators and the tests perform well even for small N and T. 
Date:  2018–02 
URL:  http://d.repec.org/n?u=RePEc:dpr:wpaper:1019r&r=all 
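A stylized version of the defactoring step can be sketched as follows. The static, single-regressor setup is a hypothetical simplification of the paper's dynamic panel model: factors are estimated by principal components of the covariate panel, projected out, and the defactored covariates serve as instruments in a just-identified IV estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, r = 50, 40, 1
F = rng.normal(size=(T, r))                 # common factors (unobserved)
lam_x = rng.normal(size=(N, r))
X = F @ lam_x.T + rng.normal(size=(T, N))   # covariate panel, T x N

# step 1 (illustrative): estimate factors by principal components of X,
# then project them out of the covariates
U, s, Vt = np.linalg.svd(X, full_matrices=False)
F_hat = U[:, :r] * np.sqrt(T)
M = np.eye(T) - F_hat @ np.linalg.solve(F_hat.T @ F_hat, F_hat.T)
X_defact = M @ X                            # defactored covariates

# step 2: use the defactored covariates as instruments (just-identified,
# static illustration -- not the paper's full dynamic specification)
beta0 = 1.0
lam_y = rng.normal(size=(N, r))
Y = beta0 * X + F @ lam_y.T + rng.normal(size=(T, N))
x, z, yv = X.ravel(), X_defact.ravel(), Y.ravel()
beta_iv = (z @ yv) / (z @ x)
print(beta_iv)
```

Because the instruments are (approximately) orthogonal to the factor component of the error, the IV estimate lands near the true slope despite the omitted common factors.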
By:  Wang, Dandan; Phillips, Garry David Alan 
Abstract:  We consider the bias of the 2SLS estimator in general dynamic simultaneous-equation models with g endogenous regressors. Using asymptotic expansion techniques, we approximate the 2SLS coefficient estimation bias under innovation errors, p lagged dependent variables and strongly exogenous explanatory variables. The large-T approximation of the structural-form bias is then used to construct corrected estimators for the parameters of interest in the general DSEM (C2SLS). Simulations show that C2SLS yields nearly unbiased estimates and low mean squared errors. Alternatively, the numerical bootstrap results suggest that the nonparametric bootstrap could be used with 2SLS to improve estimation in the general DSEM. 
Keywords:  C2SLS; 2SLS; Monte Carlo Simulations; Bootstrap; Bias Correction; Asymptotic Approximations; General Dynamic Simultaneous Equations Model 
JEL:  C32 C13 
Date:  2019–04–29 
URL:  http://d.repec.org/n?u=RePEc:cte:wsrepe:28322&r=all 
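The nonparametric bootstrap alternative mentioned in the abstract can be sketched in a toy just-identified IV model. The pairs bootstrap bias correction below is a simple stand-in, not the paper's analytic C2SLS correction; the model and parameter values are assumptions for the example.

```python
import numpy as np

def tsls(y, x, z):
    """Just-identified 2SLS: beta = (z'y)/(z'x)."""
    return (z @ y) / (z @ x)

rng = np.random.default_rng(2)
n, beta0 = 100, 1.0
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.5 * z + u + rng.normal(size=n)        # endogenous regressor
y = beta0 * x + u

b_hat = tsls(y, x, z)

# nonparametric (pairs) bootstrap bias correction:
# b_bc = 2 * b_hat - mean of the bootstrap replications
B = 500
b_star = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    b_star[b] = tsls(y[idx], x[idx], z[idx])
b_bc = 2 * b_hat - b_star.mean()
print(b_hat, b_bc)
```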
By:  Olivier Ledoit; Michael Wolf 
Abstract:  Many econometric and data-science applications, such as Markowitz portfolio selection, require a reliable estimate of the covariance matrix. When the number of variables is of the same magnitude as the number of observations, this constitutes a difficult estimation problem; the sample covariance matrix certainly will not do. In this paper, we review our work in this area going back 15+ years. We have promoted various shrinkage estimators, which can be classified into linear and nonlinear. Linear shrinkage is simpler to understand, to derive, and to implement. But nonlinear shrinkage can deliver another level of performance improvement, especially if overlaid with stylized facts such as time-varying co-volatility or factor models. 
Keywords:  Dynamic conditional correlations, factor models, large-dimensional asymptotics, Markowitz portfolio selection, rotation equivariance 
JEL:  C13 C58 G11 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:zur:econwp:323&r=all 
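The linear shrinkage idea is easy to sketch: blend the sample covariance with a scaled identity target. The fixed intensity below is an assumption for illustration; the Ledoit-Wolf estimators derive an optimal, data-driven intensity.

```python
import numpy as np

def linear_shrinkage(X, delta):
    """Shrink the sample covariance toward a scaled identity target:
    S_shrunk = (1 - delta) * S + delta * (tr(S)/p) * I.
    delta in [0, 1] is the shrinkage intensity (fixed by hand here)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    mu = np.trace(S) / p
    return (1 - delta) * S + delta * mu * np.eye(p)

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 50))               # p close to n: S ill-conditioned
S = np.cov(X, rowvar=False)
S_shrunk = linear_shrinkage(X, delta=0.5)
# shrinkage pulls extreme eigenvalues toward their grand mean, improving
# conditioning and guaranteeing invertibility
print(np.linalg.cond(S), np.linalg.cond(S_shrunk))
```

The same code with `delta` close to 1 recovers the identity-like target; with `delta = 0` it returns the raw sample covariance.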
By:  Feiyu Jiang; Dong Li; Ke Zhu 
Abstract:  This paper considers an augmented double autoregressive (DAR) model, which allows null volatility coefficients to circumvent the over-parameterization problem in the DAR model. Since the volatility coefficients might be on the boundary, statistical inference methods based on Gaussian quasi-maximum likelihood estimation (GQMLE) become nonstandard, and their asymptotics require the data to have a finite sixth moment, which narrows the applicable scope for studying heavy-tailed data. To overcome this deficiency, this paper develops a systematic statistical inference procedure based on the self-weighted GQMLE for the augmented DAR model. Except for the Lagrange multiplier test statistic, the Wald, quasi-likelihood ratio and portmanteau test statistics are all shown to have nonstandard asymptotics. The entire procedure is valid as long as the data are stationary, and its usefulness is illustrated by simulation studies and one real example. 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1905.01798&r=all 
By:  Felix Chan; Ágoston Reguly; László Mátyás 
Abstract:  This paper deals with econometric models where some (or all) explanatory variables (or covariates) are observed as discretized ordered choices. Such variables are continuous in theory but are never observed in that form; their distribution is unknown, and only a set of discrete choices is observed instead. We explore how such variables influence inference; more precisely, we show that this leads to a very special form of measurement error and, consequently, to endogeneity bias. We then propose appropriate subsampling and instrumental variables (IV) estimation methods to deal with the problem. 
Date:  2019–05–02 
URL:  http://d.repec.org/n?u=RePEc:ceu:econwp:2019_2&r=all 
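The distortion that discretization induces can be demonstrated in a few lines. The binary bracketing and parameter values are hypothetical, and the sketch only exhibits the bias; it does not implement the paper's subsampling or IV remedies.

```python
import numpy as np

rng = np.random.default_rng(4)
n, beta0 = 5000, 1.0
x = rng.normal(size=n)                      # latent continuous covariate
y = beta0 * x + rng.normal(scale=0.5, size=n)

# only a coarse ordered-choice version of x is observed,
# e.g. a "low"/"high" bracket
x_obs = (x > 0).astype(float)

# regressing y on the discretized covariate distorts the coefficient:
# the population slope is phi(0)/0.25 ~ 1.6 rather than beta0 = 1,
# a special form of measurement error
b_ols = np.cov(x_obs, y)[0, 1] / np.var(x_obs)
print(b_ols)
```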
By:  Xin Shi; Robert Qiu; Tiebin Mi 
Abstract:  In dealing with high-dimensional data, factor models are often used for reducing dimensions and extracting relevant information. The spectrum of covariance matrices from power data exhibits two aspects: 1) bulk, which arises from random noise or fluctuations, and 2) spikes, which represent factors caused by anomaly events. In this paper, we propose a new approach to the estimation of high-dimensional factor models that minimizes the distance between the empirical spectral density (ESD) of the covariance matrices of the residuals of power data, obtained by subtracting principal components, and the limiting spectral density (LSD) from a multiplicative covariance structure model. Free probability techniques from random matrix theory (RMT) are used to calculate the spectral density of the multiplicative covariance model, which efficiently resolves the computational difficulties. The proposed approach connects the estimation of the number of factors to the LSD of the covariance matrices of the residuals, which provides estimators of the number of factors and of the correlation structure in the residuals. Because power data contain substantial measurement noise and the residuals have a complex correlation structure, the approach matches the ESD of the residual covariance matrices to a multiplicative covariance model, which avoids crude assumptions or simplifications about the structure of the data. Theoretical studies show the proposed approach is robust to noise and sensitive to the presence of weak factors. Synthetic data from the IEEE 118-bus power system are used to validate the effectiveness of the approach. Furthermore, an application to the analysis of real-world online monitoring data in a power grid shows that the estimators in the approach can be used to indicate the system states. 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1905.02061&r=all 
By:  Jentsch, Carsen (TU Dortmund University); Lunsford, Kurt Graden (Federal Reserve Bank of Cleveland) 
Abstract:  Proxy structural vector autoregressions identify structural shocks in vector autoregressions with external variables that are correlated with the structural shocks of interest but uncorrelated with all other structural shocks. We provide asymptotic theory for this identification approach under mild α-mixing conditions that cover a large class of uncorrelated, but possibly dependent, innovation processes, including conditional heteroskedasticity. We prove consistency of a residual-based moving block bootstrap for inference on statistics such as impulse response functions and forecast error variance decompositions. Wild bootstraps are proven to be generally invalid for these statistics, and their coverage rates can be badly and persistently mis-sized. 
Keywords:  External Instruments; Mixing; Proxy Variables; Residual-Based Moving Block Bootstrap; Structural Vector Autoregression; Wild Bootstrap 
JEL:  C30 C32 
Date:  2019–05–03 
URL:  http://d.repec.org/n?u=RePEc:fip:fedcwq:190800&r=all 
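The residual-based moving block bootstrap can be sketched on a univariate AR(1), a deliberately simplified stand-in for the paper's VAR setting. Overlapping blocks of residuals are resampled so that short-run dependence (e.g. conditional heteroskedasticity) survives, which an i.i.d. or wild bootstrap would destroy.

```python
import numpy as np

def moving_block_bootstrap(resid, block_len, rng):
    """Resample residuals in overlapping blocks of length block_len,
    preserving their short-run dependence."""
    n = len(resid)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    draws = np.concatenate([resid[s:s + block_len] for s in starts])
    return draws[:n]

rng = np.random.default_rng(5)
T, phi = 300, 0.5
e = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + e[t]

# fit AR(1) by OLS, block-resample centered residuals, rebuild the series
phi_hat = (y[:-1] @ y[1:]) / (y[:-1] @ y[:-1])
resid = y[1:] - phi_hat * y[:-1]
resid = resid - resid.mean()
e_star = moving_block_bootstrap(resid, block_len=10, rng=rng)
y_star = np.zeros(T)
for t in range(1, T):
    y_star[t] = phi_hat * y_star[t - 1] + e_star[t - 1]
print(phi_hat, len(y_star))
```

Repeating the rebuild step B times and re-estimating on each `y_star` gives the bootstrap distribution used for impulse-response inference.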
By:  Harold D. Chiang; Yuya Sasaki 
Abstract:  This paper studies regression models with the lasso when data are sampled under multi-way clustering. First, we establish convergence rates for the lasso and post-lasso estimators. Second, we propose a novel inference method based on a post-double-selection procedure and show its asymptotic validity. Our procedure can be easily implemented with existing statistical packages. Simulation results demonstrate that the proposed procedure works well in finite samples. 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1905.02107&r=all 
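The post-double-selection step can be sketched with off-the-shelf tools. This toy version ignores multi-way clustering, which is the paper's actual contribution, and uses i.i.d. data with a hand-picked lasso penalty: lasso the outcome on the controls, lasso the treatment on the controls, then run OLS of the outcome on the treatment plus the union of selected controls.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, p, alpha0 = 200, 50, 1.0
X = rng.normal(size=(n, p))                 # high-dimensional controls
d = X[:, 0] + rng.normal(size=n)            # treatment depends on X[:, 0]
y = alpha0 * d + 2 * X[:, 0] + rng.normal(size=n)

# double selection: lasso of y on X and of d on X, take the union
sel_y = np.flatnonzero(Lasso(alpha=0.1).fit(X, y).coef_)
sel_d = np.flatnonzero(Lasso(alpha=0.1).fit(X, d).coef_)
sel = np.union1d(sel_y, sel_d)

# final step: OLS of y on d and the selected controls
W = np.column_stack([d, X[:, sel]])
coef = np.linalg.lstsq(W, y, rcond=None)[0]
print(coef[0])                              # treatment-effect estimate
```

Selecting from both equations guards against omitting a confounder that predicts the treatment strongly but the outcome only weakly (or vice versa).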
By:  Adam Golinski; Peter Spencer 
Abstract:  Linear estimators of the affine term structure model are inconsistent since they cannot reproduce the factors used in estimation. This is a serious handicap empirically, giving a worse fit than the conventional ML estimator that ensures consistency. We show that a simple self-consistent estimator can be constructed using the eigenvalue decomposition of a regression estimator. The remaining parameters of the model follow analytically. The fit of this model is virtually indistinguishable from that of the ML estimator. We apply the method to estimate various models of U.S. Treasury yields and a joint model of the U.S. and German yield curves. 
Keywords:  term structure, linear regression estimators, self-consistent model, estimation methods, two-country model 
JEL:  C13 G12 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:yor:yorken:19/05&r=all 
By:  Hyungsik Roger Moon 
Abstract:  In this paper, we derive a uniform stochastic bound of the operator norm (or equivalently, the largest singular value) of random matrices whose elements are indexed by parameters. As an application, we propose a new estimator that minimizes the operator norm of the matrix that consists of the moment functions. We show the consistency of the estimator. 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1905.01096&r=all 
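The operator norm of a matrix is its largest singular value, so the minimization the abstract describes can be sketched directly. The moment functions, instruments, and one-dimensional parameter below are hypothetical choices made for the example, not the paper's general setting.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def operator_norm(A):
    """Largest singular value of A (the spectral/operator norm)."""
    return np.linalg.svd(A, compute_uv=False)[0]

# illustrative moment matrix: rows g_i(theta) = z_i * (y_i - theta * x_i)
rng = np.random.default_rng(7)
n, theta0 = 500, 2.0
x = rng.normal(size=n)
z = np.column_stack([x, x**2, rng.normal(size=n)])
y = theta0 * x + rng.normal(size=n)

def objective(theta):
    G = z * (y - theta * x)[:, None] / np.sqrt(n)   # n x 3 moment matrix
    return operator_norm(G)

theta_hat = minimize_scalar(objective, bounds=(0.0, 4.0),
                            method='bounded').x
print(theta_hat)
```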
By:  Vanessa BerenguerRico (University of Oxford); Søren Johansen (University of Copenhagen and CREATES); Bent Nielsen (University of Oxford) 
Abstract:  An extended and improved theory is presented for marked and weighted empirical processes of residuals of time series regressions. The theory is motivated by 1-step Huber-skip estimators, where a set of good observations is selected using an initial estimator and an updated estimator is found by applying least squares to the selected observations. In this case, the weights and marks represent powers of the regressors and the regression errors, respectively. The inclusion of marks is a nontrivial extension of previous theory and requires refined martingale arguments. 
Keywords:  1-step Huber-skip, Nonstationarity, Robust Statistics, Stationarity 
JEL:  C13 
Date:  2019–04–29 
URL:  http://d.repec.org/n?u=RePEc:aah:create:201906&r=all 
By:  Máximo Camacho; María Dolores Gadea (University of Zaragoza); Ana Gómez Loscos (Banco de España) 
Abstract:  This paper proposes a new approach to the analysis of reference cycle turning points, defined on the basis of the specific turning points of a broad set of coincident economic indicators. Each individual pair of specific peaks and troughs from these indicators is viewed as a realization of a mixture of an unspecified number of separate bivariate Gaussian distributions whose means are the reference turning points. These dates break the sample into separate reference cycle phases, whose shifts are modeled by a hidden Markov chain. The transition probability matrix is constrained so that the specification is equivalent to a multiple change-point model. Bayesian estimation using finite Markov mixture modeling techniques is suggested to estimate the model. Several Monte Carlo experiments are used to show the model's accuracy in dating reference cycles that suffer from short phases, uncertain turning points, small samples and asymmetric cycles. In the empirical section, we show the high performance of our approach in identifying the US reference cycle, with little difference from the timing of the turning point dates established by the NBER. In a pseudo real-time analysis, we also show the good performance of this methodology in terms of accuracy and speed of detection of turning point dates. 
Keywords:  business cycles, turning points, finite mixture models 
JEL:  E32 C22 E27 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:bde:wpaper:1914&r=all 
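The core idea, treating (peak, trough) pairs as draws from a bivariate Gaussian mixture whose component means are the reference turning points, can be sketched with a plain EM fit. The dates and dispersion below are made up, the number of components is fixed in advance, and the Markov chain constraints and Bayesian estimation of the paper are omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# hypothetical specific (peak, trough) dates (in years) from many
# coincident indicators, scattered around two reference episodes
rng = np.random.default_rng(8)
ref = np.array([[1990.5, 1991.2], [2008.0, 2009.5]])   # "true" reference dates
pairs = np.vstack([r + rng.normal(scale=0.15, size=(40, 2)) for r in ref])

# each pair is a draw from a bivariate Gaussian mixture; the fitted
# component means recover the reference turning points
gm = GaussianMixture(n_components=2, random_state=0).fit(pairs)
means = gm.means_[np.argsort(gm.means_[:, 0])]
print(means)
```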
By:  Yang, Bill Huajian; Wu, Biao; Cui, Kaijie; Du, Zunwei; Fei, Glenn 
Abstract:  Estimation of portfolio expected credit loss is required for IFRS9 regulatory purposes. It starts with the estimation of scenario losses at the loan level, which are then aggregated and summed with scenario probability weights to obtain the portfolio expected loss. This estimated loss can vary significantly, depending on the levels of loss severity generated by the IFRS9 models and the probability weights chosen. There is thus a need for a quantitative approach to determining the weights for scenario losses. In this paper, we propose a model to estimate the expected portfolio losses brought about by recession risk, and a quantitative approach for determining the scenario weights. The model and approach are validated by an empirical example, in which we stress the portfolio expected loss by recession risk and calculate the scenario weights accordingly. 
Keywords:  Scenario weight, stressed expected credit loss, loss severity, recession probability, Vasicek distribution, probit mixed model 
JEL:  C02 C1 C10 C13 C18 C22 C32 C46 C51 C52 C53 G1 G18 G31 G32 G38 
Date:  2019–04–18 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:93634&r=all 
By:  Jaap H. Abbring; Tim Salimans 
Abstract:  We present a method for computing the likelihood of a mixed hitting-time model that specifies durations as the first time a latent Lévy process crosses a heterogeneous threshold. This likelihood is not generally known in closed form, but its Laplace transform is. Our approach to its computation relies on numerical methods for inverting Laplace transforms that exploit special properties of the first passage times of Lévy processes. We use our method to implement a maximum likelihood estimator of the mixed hitting-time model in MATLAB. We illustrate the application of this estimator with an analysis of Kennan's (1985) strike data. 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1905.03463&r=all 
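The Laplace-inversion idea can be sketched on the simplest Lévy process, a Brownian motion with drift, whose first-passage density is known in closed form (inverse Gaussian), so the numerical inversion can be checked against it. This uses mpmath's generic `invertlaplace` routine rather than the specialized methods of the paper; the level, drift, and evaluation point are assumptions for the example.

```python
import mpmath as mp

# first passage of a unit-variance Brownian motion with drift mu
# through level a > 0: Laplace transform of the hitting-time density
a, mu = 1.0, 0.5
lap = lambda s: mp.exp(a * mu - a * mp.sqrt(mu**2 + 2 * s))

def density_exact(t):
    """Closed-form inverse Gaussian first-passage density."""
    return a / mp.sqrt(2 * mp.pi * t**3) * mp.exp(-(a - mu * t)**2 / (2 * t))

t = 1.5
f_num = mp.invertlaplace(lap, t, method='talbot')
print(f_num, density_exact(t))
```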
By:  Amit Goyal (University of Lausanne); Zhongzhi Lawrence He (Brock University, Goodman School of Business); SahnWook Huh (State University of New York (SUNY)  Department of Finance) 
Abstract:  We propose a unified set of distance-based performance metrics that address the power and extreme-error problems inherent in traditional measures for asset-pricing tests. From a Bayesian perspective, the distance metrics coherently incorporate both pricing errors and their standard errors. Measured in units of return, they have an economic interpretation as the minimum cost of holding a dogmatic belief in a model. Our metrics identify the Fama and French (2015) factor model (augmented with the momentum factor and/or without the value factor) as the best model and thus highlight the importance of the momentum factor. In contrast, the traditional alpha-based statistics often lead to inconsistent and counterintuitive model rankings. 
Keywords:  Asset-Pricing Tests, Power Problem, Extreme-Error Problem, Distance-Based Metrics, Optimal Transport Theory, Bayesian Interpretations, Model Comparisons and Rankings 
JEL:  C11 G11 G12 
Date:  2018–12 
URL:  http://d.repec.org/n?u=RePEc:chf:rpseri:rp1878&r=all 
By:  Kai Feng; Han Hong; Ke Tang; Jingyuan Wang 
Abstract:  The Receiver Operating Characteristic (ROC) curve is a representation of the statistical information discovered in binary classification problems and is a key concept in machine learning and data science. This paper studies the statistical properties of ROC curves and their implications for model selection. We analyze the implications of different models of incentive heterogeneity and information asymmetry for the relation between human decisions and ROC curves. Our theoretical discussion is illustrated in the context of a large data set of pregnancy outcomes and doctor diagnoses from the Pre-Pregnancy Checkups of reproductive-age couples in Henan Province, provided by the Chinese Ministry of Health. 
Date:  2019–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1905.02810&r=all 
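The empirical ROC curve is short to compute from scratch: sort observations by classifier score and accumulate true- and false-positive rates as the threshold sweeps down. The simulated "diagnostic score" below is an assumption made for the example.

```python
import numpy as np

def roc_points(scores, labels):
    """Empirical ROC curve: sort by score (descending) and accumulate
    true/false positive rates as the threshold decreases."""
    order = np.argsort(-scores)
    lab = labels[order]
    tpr = np.cumsum(lab) / lab.sum()
    fpr = np.cumsum(1 - lab) / (1 - lab).sum()
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

rng = np.random.default_rng(9)
n = 2000
labels = rng.integers(0, 2, size=n).astype(float)
scores = labels + rng.normal(size=n)        # informative diagnostic score
fpr, tpr = roc_points(scores, labels)
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)   # trapezoid AUC
print(auc)
```

For this score model the population AUC is Φ(1/√2) ≈ 0.76, against which the empirical value can be checked.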
By:  Bhattacharya, Debopam; Dupas, Pascaline; Kanaya, Shin 
Abstract:  Many real-life settings of consumer choice involve social interactions, causing targeted policies to have spillover effects. This paper develops novel empirical tools for analyzing the demand and welfare effects of policy interventions in binary choice settings with social interactions. Examples include subsidies for health product adoption and vouchers for attending a high-achieving school. We establish the connection between the econometrics of large games and Brock-Durlauf-type interaction models, under both i.i.d. and spatially correlated unobservables. We develop new convergence results for associated beliefs and estimates of preference parameters under increasing-domain spatial asymptotics. Next, we show that even with fully parametric specifications and a unique equilibrium, choice data that are sufficient for counterfactual demand prediction under interactions are insufficient for welfare calculations. This is because distinct underlying mechanisms producing the same interaction coefficient can imply different welfare effects and deadweight loss from a policy intervention. Standard index restrictions imply distribution-free bounds on welfare. We illustrate our results using experimental data on mosquito-net adoption in rural Kenya. 
Date:  2019–04 
URL:  http://d.repec.org/n?u=RePEc:cpr:ceprdp:13707&r=all 
By:  Tarun Chordia (Emory University  Department of Finance); Amit Goyal (University of Lausanne); Alessio Saretto (University of Texas at Dallas  School of Management  Department of Finance & Managerial Economics) 
Abstract:  We implement a data mining approach to generate about 2.1 million trading strategies. This large set of strategies serves as a laboratory to evaluate the seriousness of p-hacking and data snooping in finance. We apply multiple hypothesis testing techniques that account for cross-correlations in signals and returns to produce t-statistic thresholds that control the proportion of false discoveries. We find that the difference in rejection rates produced by single and multiple hypothesis testing is such that most rejections of the null of no outperformance under single hypothesis testing are likely false (i.e., we find a very high rate of type I errors). Combining statistical criteria with economic considerations, we find that a remarkably small number of strategies survive our thorough vetting procedure. Even these surviving strategies have no theoretical underpinnings. Overall, p-hacking is a serious problem and, correcting for it, outperforming trading strategies are rare. 
Keywords:  Hypothesis testing, False discoveries, Trading strategies 
JEL:  G10 G11 G12 
Date:  2017–08 
URL:  http://d.repec.org/n?u=RePEc:chf:rpseri:rp1737&r=all 
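To see why single-hypothesis thresholds produce many false discoveries across millions of strategies, it helps to sketch a standard multiple-testing correction. Benjamini-Hochberg below is one common false-discovery-rate procedure; it is a simplified stand-in, since the paper uses methods that also account for cross-correlation in signals and returns.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of rejections controlling the false discovery rate
    at level q: reject the k smallest p-values, where k is the largest
    index with p_(k) <= q * k / m."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresh = q * np.arange(1, m + 1) / m
    below = p[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

# 950 true nulls (uniform p-values) plus 50 genuine signals
rng = np.random.default_rng(10)
pvals = np.concatenate([rng.uniform(size=950),
                        rng.uniform(0, 1e-4, size=50)])
reject = benjamini_hochberg(pvals, q=0.05)
print(reject.sum())        # close to the 50 real signals, few false hits
```

A naive 5% single-test cutoff would instead reject roughly 50 true nulls on top of the 50 signals, i.e. about half the "discoveries" would be false.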