
NEP: New Economics Papers on Econometrics 
By:  Adam M. Rosen; Takuya Ura 
Abstract:  We provide a finite sample inference method for the structural parameters of a semiparametric binary response model under a conditional median restriction originally studied by Manski (1975, 1985). Our inference method is valid for any sample size and irrespective of whether the structural parameters are point identified or partially identified, for example due to the lack of a continuously distributed covariate with large support. Our inference approach exploits distributional properties of observable outcomes conditional on the observed sequence of exogenous variables. Moment inequalities conditional on this size n sequence of exogenous covariates are constructed, and the test statistic is a monotone function of violations of sample moment inequalities. The critical value used for inference is provided by the appropriate quantile of a known function of n independent Rademacher random variables. We investigate power properties of the underlying test and provide simulation studies to support the theoretical findings. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.01511&r=all 
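The critical value construction, i.e. the (1 - alpha) quantile of a known function of n independent Rademacher random variables, can be sketched in a few lines. The statistic below (a maximum of standardized group sums) is only an illustrative stand-in for the function of moment-inequality violations used in the paper:

```python
import math
import random

def rademacher_critical_value(n, n_groups, alpha=0.05, n_draws=2000, seed=0):
    """Approximate the (1 - alpha) quantile of a simple statistic built
    from n independent Rademacher (+1/-1) random variables.

    The statistic here, the maximum standardized group sum, is only an
    illustrative stand-in for the known function used in the paper.
    """
    rng = random.Random(seed)
    group_size = n // n_groups
    draws = []
    for _ in range(n_draws):
        eps = [rng.choice((-1, 1)) for _ in range(n)]
        stat = max(
            abs(sum(eps[g * group_size:(g + 1) * group_size])) / math.sqrt(group_size)
            for g in range(n_groups)
        )
        draws.append(stat)
    draws.sort()
    return draws[int(math.ceil((1 - alpha) * n_draws)) - 1]

cv = rademacher_critical_value(n=200, n_groups=5)
```

Because the reference distribution is a known function of sign flips, the resulting test is valid for any sample size, which is the key finite-sample feature of the approach.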
By:  Demetrescu, Matei; Georgiev, Iliyan; Rodrigues, Paulo MM; Taylor, AM Robert 
Abstract:  Standard tests based on predictive regressions estimated over the full available sample have tended to find little evidence of predictability in stock returns. Recent approaches based on the analysis of subsamples of the data have been considered, suggesting that predictability, where it occurs, might exist only within so-called 'pockets of predictability' rather than across the entire sample. However, these methods are prone to the criticism that the subsample dates are endogenously determined, such that the use of standard critical values appropriate for full-sample tests will result in incorrectly sized tests, leading to spurious findings of stock return predictability. To avoid the problem of endogenously determined sample splits, we propose new tests derived from sequences of predictability statistics systematically calculated over subsamples of the data. Specifically, we base tests on the maximum of such statistics from sequences of forward and backward recursive, rolling, and double-recursive predictive subsample regressions. We develop our approach using the overidentified instrumental variable-based predictability test statistics of Breitung and Demetrescu (2015). This approach is based on partial-sum asymptotics and so, unlike many other popular approaches including, for example, those based on Bonferroni corrections, can be readily adapted to implementation over sequences of subsamples. We show that the limiting distributions of our proposed tests are robust to both the degree of persistence and endogeneity of the regressors in the predictive regression, but not to any heteroskedasticity present, even if the subsample statistics are based on heteroskedasticity-robust standard errors. We therefore develop fixed regressor wild bootstrap implementations of the tests, which we demonstrate to be first-order asymptotically valid. Finite sample behaviour against a variety of temporarily predictable processes is considered. 
An empirical application to US stock returns illustrates the usefulness of the new predictability testing methods we propose. 
Keywords:  predictive regression; rolling and recursive IV estimation; persistence; endogeneity; conditional and unconditional heteroskedasticity 
Date:  2019–02–27 
URL:  http://d.repec.org/n?u=RePEc:esy:uefcwp:24137&r=all 
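The idea of maximizing subsample statistics can be sketched as follows, with a plain OLS slope t-statistic over rolling windows standing in for the IV-based statistics of Breitung and Demetrescu (2015); the window length and data-generating process below are illustrative assumptions, not the paper's choices:

```python
import math
import random

def ols_slope_tstat(y, x):
    """t-statistic for the slope in a simple regression of y on x (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b / se

def max_rolling_tstat(y, x, window):
    """Maximum absolute slope t-statistic over all rolling subsamples of a
    predictive regression y_t = a + b * x_{t-1} + u_t."""
    stats = [
        abs(ols_slope_tstat(y[s + 1:s + window], x[s:s + window - 1]))
        for s in range(len(y) - window + 1)
    ]
    return max(stats), stats

# illustrative data: persistent predictor, unpredictable returns
rng = random.Random(1)
T = 200
x = [0.0]
for _ in range(T - 1):
    x.append(0.98 * x[-1] + rng.gauss(0, 1))
y = [rng.gauss(0, 1) for _ in range(T)]
mx, stats = max_rolling_tstat(y, x, window=60)
```

The endogeneity of the sample split is exactly why the maximum of this sequence, rather than any single window's statistic, must be compared against appropriately simulated (or bootstrapped) critical values.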
By:  Zongwu Cai (Department of Economics, The University of Kansas); Ying Fang (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China); Ming Lin (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China); Shengfang Tang (Department of Statistics, School of Economics, Xiamen University, Xiamen, China) 
Abstract:  In this paper, we propose an alternative test procedure for testing the conditional independence assumption, which is an important identification condition commonly imposed in the literature on program analysis and policy evaluation. We transform the conditional independence test into a nonparametric conditional moment test using an auxiliary variable which is independent of the treatment assignment variable conditional on potential outcomes and observable covariates. The proposed test statistic is shown to have a limiting normal distribution under the null hypothesis of conditional independence. Furthermore, the suggested method is shown to be valid in a time series framework, and the corresponding test statistic and its limiting distribution are also established. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed test statistics. Finally, the proposed test method is applied to test conditional independence in real examples: the 401(k) participation program and the return to college education. 
Keywords:  Conditional independence; Moment test; Nonparametric estimation; Selection on observables; Treatment effect. 
JEL:  C12 C13 C14 C23 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:kan:wpaper:201905&r=all 
By:  Koo, B.; La Vecchia, D.; Linton, O. 
Abstract:  We develop estimation methodology for an additive nonparametric panel model that is suitable for capturing the pricing of coupon-paying government bonds followed over many time periods. We use our model to estimate the discount function and yield curve of nominally riskless government bonds. The novelty of our approach is the combination of two different techniques: cross-sectional nonparametric methods and kernel estimation for time-varying dynamics in the time series context. The resulting estimator is able to capture the yield curve shapes and dynamics commonly observed in the fixed income markets. We establish the consistency, the rate of convergence, and the asymptotic normality of the proposed estimator. A Monte Carlo exercise illustrates the good performance of the method under different scenarios. We apply our methodology to the daily CRSP bond dataset, and compare with the popular Diebold and Li (2006) method. 
Keywords:  nonparametric inference, panel data, time varying, yield curve dynamics 
JEL:  C13 C14 C22 G12 
Date:  2019–02–27 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:1916&r=all 
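The time-series side of the estimation, kernel smoothing of a time-varying quantity over rescaled time, can be illustrated with a simple Nadaraya-Watson local-constant smoother. The paper's estimator additionally involves cross-sectional nonparametric steps and an additive panel structure, which this sketch omits; the signal and bandwidth below are illustrative assumptions:

```python
import math
import random

def kernel_smooth(t_grid, times, values, bandwidth):
    """Nadaraya-Watson local-constant estimate of a time-varying level,
    using a Gaussian kernel over rescaled time in [0, 1]."""
    out = []
    for t0 in t_grid:
        w = [math.exp(-0.5 * ((t - t0) / bandwidth) ** 2) for t in times]
        sw = sum(w)
        out.append(sum(wi * v for wi, v in zip(w, values)) / sw)
    return out

# recover a slowly varying signal from noisy observations
rng = random.Random(0)
n = 200
times = [i / (n - 1) for i in range(n)]
values = [math.sin(2 * math.pi * t) + rng.gauss(0, 0.3) for t in times]
fit = kernel_smooth([0.25, 0.75], times, values, bandwidth=0.05)
```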
By:  Sebastian Kripfganz (University of Exeter); Daniel C. Schneider (Max Planck Institute for Demographic Research) 
Abstract:  Single-equation conditional equilibrium correction models can be used to test for the existence of a level relationship among the variables of interest. The distributions of the respective test statistics are nonstandard under the null hypothesis of no such relationship, and critical values need to be obtained with stochastic simulations. We compute more than 95 billion F-statistics and 57 billion t-statistics for a large number of specifications of the Pesaran, Shin, and Smith (2001, Journal of Applied Econometrics 16: 289–326) bounds test. Our large-scale simulations enable us to draw smooth density functions and to estimate response surface models that improve upon and substantially extend the set of available critical values for the bounds test. Besides covering the full range of possible sample sizes and lag orders, our approach notably allows for any number of variables in the long-run level relationship by exploiting the diminishing effect on the distributions of adding another variable to the model. The computation of approximate p-values enables fine-grained statistical inference and allows us to quantify the finite-sample distortions from using asymptotic critical values. We find that the bounds test can easily be oversized by more than 5 percentage points in small samples. 
Keywords:  Bounds test, Cointegration, Error correction model, Generalized Dickey-Fuller regression, Level relationship, Unit roots 
JEL:  C12 C15 C32 C46 C63 
Date:  2019 
URL:  http://d.repec.org/n?u=RePEc:exe:wpaper:1901&r=all 
By:  Escribano Sáez, Álvaro; Blazsek, Szabolcs Istvan; Ayala, Astrid 
Abstract:  We introduce new dynamic conditional score (DCS) volatility models with dynamic scale and shape parameters for the effective measurement of volatility. In the new models, we use the EGB2 (exponential generalized beta of the second kind), NIG (normal-inverse Gaussian) and Skew-Gen-t (skewed generalized-t) probability distributions. Those distributions involve several shape parameters that control the dynamic skewness, tail shape and peakedness of financial returns. We use daily return data from the Standard & Poor's 500 (S&P 500) index for the period of January 4, 1950 to December 30, 2017. We estimate all models by using the maximum likelihood (ML) method, and we present the conditions of consistency and asymptotic normality of the ML estimates. We study those conditions for the S&P 500 and we also perform diagnostic tests for the residuals. The statistical performance of several DCS specifications with dynamic shape is superior to that of the DCS specification with constant shape. Outliers in the shape parameters are associated with important announcements that affected the United States (US) stock market. Our results motivate the application of the new DCS models to volatility measurement, pricing financial derivatives, or estimation of the value-at-risk (VaR) and expected shortfall (ES) metrics. 
Keywords:  score-driven shape parameters; Dynamic conditional score (DCS) models 
JEL:  C58 C52 C22 
Date:  2019–01–28 
URL:  http://d.repec.org/n?u=RePEc:cte:werepe:28133&r=all 
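A minimal constant-shape score-driven volatility recursion with a Student-t conditional density, in the spirit of DCS (Beta-t-EGARCH-style) models, looks as follows; the parameter values are illustrative assumptions, not estimates from the paper:

```python
import math
import random

def dcs_t_volatility(returns, omega=0.0, beta=0.97, alpha=0.05, nu=7.0):
    """Score-driven log-variance recursion with a Student-t conditional
    density; f_t is log sigma_t^2.

    The scaled score u_t is bounded, which downweights outliers relative
    to a squared-return (GARCH-type) update.
    """
    f = omega / (1 - beta)        # start at the unconditional level
    sigmas = []
    for y in returns:
        sigma2 = math.exp(f)
        sigmas.append(math.sqrt(sigma2))
        b = (y * y / nu) / (sigma2 + y * y / nu)   # lies in [0, 1)
        u = (nu + 1) * b - 1                       # scaled score, bounded below by -1
        f = omega + beta * f + alpha * u
    return sigmas

rng = random.Random(7)
returns = [rng.gauss(0, 1) for _ in range(500)]
sig = dcs_t_volatility(returns)
```

The paper's contribution is to let the shape parameters (here the fixed nu) evolve dynamically as well, with their own score-driven recursions.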
By:  Jiun-Hua Su 
Abstract:  The semiparametric maximum utility estimation proposed by Elliott and Lieli (2013) can be viewed as cost-sensitive binary classification; thus, its in-sample overfitting issue is similar to that of perceptron learning in the machine learning literature. Based on structural risk minimization, a utility-maximizing prediction rule (UMPR) is constructed to alleviate the in-sample overfitting of the maximum utility estimation. We establish non-asymptotic upper bounds on the difference between the maximal expected utility and the generalized expected utility of the UMPR. Simulation results show that the UMPR with an appropriate data-dependent penalty outperforms some common estimators in binary classification if the conditional probability of the binary outcome is misspecified, or a decision maker's preference is ignored. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.00716&r=all 
By:  Gregor Kastner; Sylvia Frühwirth-Schnatter 
Abstract:  The sampling efficiency of MCMC methods for Bayesian inference in stochastic volatility models depends heavily on the actual parameter values. While draws from the posterior using the standard centered parameterization break down when the volatility-of-volatility parameter in the latent state equation is small, noncentered versions of the model show deficiencies for highly persistent latent variable series. The novel approach of ancillarity-sufficiency interweaving has recently been shown to aid in overcoming these issues for a broad class of multilevel models. In this paper, we demonstrate how such an interweaving strategy can be applied to stochastic volatility models in order to greatly improve sampling efficiency for all parameters and throughout the entire parameter range. Moreover, this method of "combining the best of different worlds" allows for inference on parameter constellations that have previously been infeasible to estimate, without the need to select a particular parameterization beforehand. 
Date:  2017–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1706.05280&r=all 
By:  Denis Kojevnikov; Vadim Marmer; Kyungchul Song 
Abstract:  This paper considers a general form of network dependence, where dependence between two sets of random variables becomes weaker as their distance in a network grows. We show that such network dependence cannot be embedded as a random field on a lattice in a Euclidean space of fixed dimension when the maximum clique increases in size as the network grows. This paper applies the weak dependence notion of Doukhan and Louhichi (1999) to network dependence by measuring the strength of dependence by the covariance between nonlinearly transformed random variables. While this approach covers examples such as strong mixing random fields on a graph and conditional dependency graphs, it is most useful when dependence arises through a large functional-causal system of equations. The main results of our paper include a law of large numbers and a central limit theorem. We also propose a heteroskedasticity-autocorrelation consistent variance estimator and prove its consistency under regularity conditions. The finite sample performance of this latter estimator is investigated through a Monte Carlo simulation study. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.01059&r=all 
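A scalar sketch of such a network HAC estimator, assuming a Bartlett-type kernel in network distance (the paper's estimator and its regularity conditions are more general):

```python
def network_hac_variance(scores, dist, lag_cutoff):
    """Kernel-weighted variance estimate for the scaled sum of scores,
    where dist[i][j] is the network distance between units i and j.

    Pairs farther apart than lag_cutoff get zero weight; nearer pairs get
    Bartlett weights 1 - d / lag_cutoff. Not guaranteed PSD in general.
    """
    n = len(scores)
    total = 0.0
    for i in range(n):
        for j in range(n):
            w = max(0.0, 1.0 - dist[i][j] / lag_cutoff)
            total += w * scores[i] * scores[j]
    return total / n

# toy example: five units on a line, distance = |i - j|
scores = [1.0, 0.8, -0.2, 0.5, -0.4]
dist = [[abs(i - j) for j in range(5)] for i in range(5)]
v0 = network_hac_variance(scores, dist, 1)   # cutoff 1: only own products survive
```

With `lag_cutoff=1` the estimator collapses to the naive (independence) variance; larger cutoffs add covariance terms for network neighbours, which is the correction the estimator exists to make.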
By:  Ashesh Rambachan; Neil Shephard 
Abstract:  This paper uses potential outcome time series to provide a nonparametric framework for quantifying dynamic causal effects in macroeconometrics. The framework provides sufficient conditions for the nonparametric identification of dynamic causal effects and clarifies the causal content of several common assumptions and methods in macroeconomics. Our key identifying assumption is non-anticipating treatments, which enables nonparametric inference on dynamic causal effects. Next, we provide a formal definition of a `shock', which leads to a shocked potential outcome time series; this is a nonparametric statement of the Frisch-Slutzky paradigm. The common additional assumptions that the causal effects are additive and that the treatments are shocks place substantial restrictions on the underlying dynamic causal estimands. We use this structure to causally interpret several common estimation strategies. We provide sufficient conditions under which local projections are causally interpretable and show that the standard assumptions for local projections with an instrument are not sufficient to identify dynamic causal effects. Finally, we show that the structural vector moving average form is causally equivalent to a restricted potential outcome time series under the usual invertibility assumption. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.01637&r=all 
By:  Reza Hajargasht 
Abstract:  Variational Bayes (VB) is a recent approximate method for Bayesian inference. It has the merit of being a fast and scalable alternative to Markov chain Monte Carlo (MCMC), but its approximation error is often unknown. In this paper, we derive the approximation error of VB in terms of the mean, mode, variance, predictive density and KL divergence for the linear Gaussian multi-equation regression. Our results indicate that VB approximates the posterior mean perfectly. Factors affecting the magnitude of the underestimation of the posterior variance and mode are revealed. Importantly, we demonstrate that VB estimates predictive densities accurately. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.00617&r=all 
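The flavor of the posterior-mean result can be seen in the textbook coordinate-ascent VB for a Gaussian with unknown mean and precision; this is a standard single-equation illustration, not the paper's multi-equation regression setting:

```python
def mean_field_vb_normal(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, iters=50):
    """Coordinate-ascent mean-field VB for x_i ~ N(mu, 1/tau) with a
    Normal-Gamma prior: mu ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0).

    Returns the variational parameters of q(mu) = N(mu_n, var_mu) and
    q(tau) = Gamma(a_n, b_n).
    """
    n = len(x)
    xbar = sum(x) / n
    sx2 = sum(xi * xi for xi in x)
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)   # fixed across iterations
    a_n = a0 + (n + 1) / 2
    e_tau = a0 / b0
    for _ in range(iters):
        var_mu = 1.0 / ((lam0 + n) * e_tau)
        e_quad = (sx2 - 2 * mu_n * n * xbar + n * (mu_n ** 2 + var_mu)
                  + lam0 * ((mu_n - mu0) ** 2 + var_mu))
        b_n = b0 + 0.5 * e_quad
        e_tau = a_n / b_n
    return mu_n, var_mu, a_n, b_n

x = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]
mu_n, var_mu, a_n, b_n = mean_field_vb_normal(x)
```

Note that `mu_n` equals the exact conjugate posterior mean of mu regardless of the iterations, mirroring the paper's finding that VB recovers the posterior mean perfectly while the variance is understated.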
By:  Zeqin Liu (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China); Zongwu Cai (Department of Economics, The University of Kansas); Ying Fang (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China); Ming Lin (The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China) 
Abstract:  In this paper, we highlight some recent developments of a new route to evaluating macroeconomic policy effects, which is investigated under the framework of potential outcomes. First, the paper begins with a brief introduction of the basic model setup in modern econometric analysis of program evaluation. Second, primary attention goes to causal effect estimation of macroeconomic policy with single time series data, together with some extensions to multiple time series data. Furthermore, we examine the connection of this new approach to traditional macroeconomic models for policy analysis and evaluation. Finally, we conclude by addressing some possible future research directions in statistics and econometrics. 
Keywords:  Impulse response function; Macroeconomic causal inference; Macroeconomic policy evaluation; Multiple time series data; Potential outcomes; Treatment effect. 
JEL:  C12 C13 C14 C23 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:kan:wpaper:201904&r=all 
By:  Sergio Correia; Paulo Guimarães; Thomas Zylkin 
Abstract:  We expand on the observation of Santos Silva and Tenreyro (2010) that estimates from Poisson models are not guaranteed to exist by documenting necessary and sufficient conditions for the existence of estimates for a wide class of generalized linear models (GLMs). We show that some, but not all, GLMs can still deliver consistent, uniquely identified maximum likelihood estimates of at least some of the linear parameters at the boundary of the parameter space when these conditions fail to hold. We also demonstrate how to verify these conditions in the presence of high-dimensional fixed effects, which are often recommended in the international trade literature and in other common panel settings. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.01633&r=all 
By:  Kapetanios, George; Papailias, Fotis; Taylor, AM Robert 
Abstract:  A bootstrap methodology, first proposed in a restricted form by Kapetanios and Papailias (2011), suitable for use with stationary and nonstationary fractionally integrated time series is further developed in this paper. The resampling algorithm involves estimating the degree of fractional integration, applying the fractional differencing operator, resampling the resulting approximation to the underlying short memory series and, finally, cumulating to obtain a resample of the original fractionally integrated process. While a similar approach based on differencing has been independently proposed in the literature for stationary fractionally integrated processes using the sieve bootstrap by Poskitt, Grose and Martin (2015), we extend it to allow for general bootstrap schemes including blockwise bootstraps. Further, we show that it can also be validly used for nonstationary fractionally integrated processes. We establish asymptotic validity results for the general method and provide simulation evidence which highlights a number of favourable aspects of its finite sample performance, relative to other commonly used bootstrap methods. 
Date:  2019–02–27 
URL:  http://d.repec.org/n?u=RePEc:esy:uefcwp:24136&r=all 
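The difference-resample-cumulate idea can be sketched with truncated fractional filters; here a plain iid resample of the differenced series stands in for the general (e.g. blockwise) schemes the paper allows, and the estimation of the memory parameter d is taken as given:

```python
import random

def frac_diff_weights(d, n):
    """Coefficients of the truncated expansion of (1 - L)^d."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (k - 1 - d) / k)
    return w

def apply_filter(x, w):
    """Expanding (truncated) filter: y_t = sum_k w_k * x_{t-k}."""
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]

def fractional_bootstrap_resample(x, d, rng):
    """Fractionally difference x by d, iid-resample the (approximately
    short-memory) increments, then re-integrate by -d."""
    eps = apply_filter(x, frac_diff_weights(d, len(x)))
    eps_star = [rng.choice(eps) for _ in eps]
    return apply_filter(eps_star, frac_diff_weights(-d, len(x)))

# the truncated filters are exact inverses over a finite sample
x = [0.3, -1.0, 2.0, 0.5, -0.7, 1.2, 0.0, -0.3, 0.8, 1.5]
d = 0.3
xr = apply_filter(apply_filter(x, frac_diff_weights(d, len(x))),
                  frac_diff_weights(-d, len(x)))
boot = fractional_bootstrap_resample(x, d, random.Random(0))
```

The exact invertibility of the truncated difference and integration filters is what lets the resampled increments be cumulated back into a valid resample of the original fractionally integrated process.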
By:  László Csató; Dóra Gréta Petróczy 
Abstract:  Pairwise comparisons are used in a wide variety of decision situations where the importance of different alternatives should be measured by numerical weights. One popular method to derive these priorities is based on the right eigenvector of a multiplicative pairwise comparison matrix. We introduce an axiom called monotonicity: increasing an arbitrary entry of a pairwise comparison matrix should increase the weight of the favoured alternative (which is in the corresponding row) by the greatest factor and should decrease the weight of the disfavoured alternative (which is in the corresponding column) by the greatest factor. It is proved that the eigenvector method violates this natural requirement. We also investigate the relationship between non-monotonicity and the Saaty inconsistency index. It turns out that the violation of monotonicity is not a problem in the case of nearly consistent matrices. On the other hand, the eigenvector method remains a dubious choice for inherently inconsistent large matrices such as the ones that emerge in sports applications. 
Date:  2019–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1902.10790&r=all 
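The eigenvector weights themselves are easy to compute by power iteration, which also makes it straightforward to probe monotonicity numerically by perturbing a single entry (small illustrative matrix below; the paper's counterexamples are constructed analytically):

```python
def eigenvector_weights(A, iters=500):
    """Principal right eigenvector of a positive pairwise comparison
    matrix via power iteration, normalized to sum to one."""
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(v)
        w = [vi / s for vi in v]
    return w

# a consistent 3x3 matrix built from known weights: A[i][j] = w_i / w_j
true_w = [0.5, 0.3, 0.2]
A = [[true_w[i] / true_w[j] for j in range(3)] for i in range(3)]
w = eigenvector_weights(A)
```

For a consistent matrix the eigenvector method recovers the generating weights exactly; the monotonicity violations the paper proves arise once the matrix is inconsistent, e.g. after increasing a single entry `A[i][j]` (with `A[j][i]` set to its reciprocal) and recomputing.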
By:  Nikkil Sudharsanan; Maarten J. Bijlsma (Max Planck Institute for Demographic Research, Rostock, Germany) 
Abstract:  One central aim of the population sciences is to understand why one population has different levels of health and wellbeing than another. Various methods, such as the Oaxaca-Blinder and Kitagawa decompositions, have been used to decompose population differences in a wide range of outcomes. We provide a way of implementing an alternative decomposition method that, under certain assumptions, adds a causal interpretation to the decomposition by building upon counterfactual-driven estimation methods. In addition, the approach has the advantage of flexibility to accommodate different types of outcome and explanatory variables and any population contrast. Because it uses Monte Carlo methods, our approach does not rely on closed-form approximate solutions and can be applied to any parametric model without having to derive any decomposition equations. We demonstrate our approach through two motivating examples using data from the Mexican Health and Aging Study and the 1970 British Birth Cohort Study. The first example uses a cross-sectional binary outcome (disability), a contrast of prevalence rates, and considers a binary mediator (stroke), while the second example uses a count outcome (age at first birth), a contrast of median ages, and considers a count mediator (women's own years of education). Together, our two examples outline a very general decomposition procedure that is theoretically grounded in counterfactual theory but still easy to apply to a wide range of situations. We provide example R code and an R function [package in development]. 
Keywords:  methods of analysis 
JEL:  J1 Z0 
Date:  2019–02 
URL:  http://d.repec.org/n?u=RePEc:dem:wpaper:wp2019004&r=all 
By:  Miriam Steurer (University of Graz, Austria); Robert Hill (University of Graz, Austria) 
Abstract:  With the rapid growth of machine learning (ML) methods and datasets to which they can be applied, the question of how to compare the predictive performance of competing models is becoming increasingly important. The existing literature is interdisciplinary, making it hard for users to locate and evaluate the set of available metrics. In this article we collect a number of such metrics from various sources. We classify them by type and then evaluate them with respect to two novel symmetry conditions. While none of these metrics satisfy both conditions, we propose a number of new metrics that do. In total we consider a portfolio of 56 performance metrics. To illustrate the problem of choosing between them, we provide an application in which five ML methods are used to predict apartment prices. We show that the most popular metrics for evaluating performance in the automated valuation model (AVM) literature generate misleading results. A different picture emerges when the full set of metrics is considered, and especially when we focus on the class of metrics with the best symmetry properties. We conclude by recommending four key metrics for evaluating model predictive performance. 
Keywords:  Machine learning; Performance metric; Prediction error; Automated valuation model 
JEL:  C45 C53 
Date:  2019–02 
URL:  http://d.repec.org/n?u=RePEc:grz:wpaper:201902&r=all 
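One plausible reading of such a symmetry requirement, invariance under interchanging actual and predicted values, can be illustrated with MAPE versus a log-ratio error; the paper's two conditions are defined formally there and may differ from this sketch:

```python
import math

def mape(actual, pred):
    """Mean absolute percentage error: asymmetric in (actual, pred)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, pred)) / len(actual)

def mean_abs_log_ratio(actual, pred):
    """Mean |log(pred/actual)|: symmetric under swapping the arguments."""
    return sum(abs(math.log(p / a)) for a, p in zip(actual, pred)) / len(actual)

actual = [100.0, 200.0, 150.0]
pred = [110.0, 180.0, 165.0]
```

Evaluating both metrics with the arguments swapped makes the difference concrete: MAPE changes value, while the log-ratio error does not, which is the kind of property a symmetry-respecting metric portfolio would screen for.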
By:  Christopher F. Baum (Boston College; DIW Berlin; CESIS, KTH Royal Institute of Technology); Arthur Lewbel (Boston College) 
Abstract:  Lewbel (2012) provides a heteroscedasticity-based estimator for linear regression models containing an endogenous regressor when no external instruments or other such information is available. The estimator is implemented in the Stata module ivreg2h by Baum and Schaffer (2012). This note gives some advice and instructions to researchers who want to use this estimator. 
Keywords:  instrumental variables, linear regression, endogeneity, identification, heteroscedasticity 
JEL:  C26 C13 C87 
Date:  2018–02–12 
URL:  http://d.repec.org/n?u=RePEc:boc:bocoec:975&r=all 
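A stylized scalar version of the construction, where the generated instrument is the mean-centered exogenous variable times the first-stage residual, can be simulated as follows. The DGP is an illustrative assumption (not the ivreg2h implementation, which handles the general multivariate case):

```python
import random

def lewbel_iv_slope(y, x, z):
    """Just-identified IV slope using a Lewbel-style generated instrument
    (z - zbar) * e2, where e2 is the residual of x on a constant.
    In the general case z also enters both equations; omitted here."""
    n = len(y)
    zbar, xbar, ybar = sum(z) / n, sum(x) / n, sum(y) / n
    e2 = [xi - xbar for xi in x]          # first-stage residual (constant only)
    h = [(zi - ei * 0 + (zi - zbar) * ei - zi) + (zi - zbar) * 0 for zi, ei in zip(z, e2)]
    h = [(zi - zbar) * ei for zi, ei in zip(z, e2)]
    num = sum(hi * (yi - ybar) for hi, yi in zip(h, y))
    den = sum(hi * (xi - xbar) for hi, xi in zip(h, x))
    return num / den

# DGP: endogeneity via a common shock u; heteroskedasticity driven by z
rng = random.Random(42)
n, beta = 20000, 0.7
y, x, z = [], [], []
for _ in range(n):
    zi = rng.uniform(0.2, 1.8)           # exogenous, positive: cov(z, e2^2) != 0
    u = rng.gauss(0, 1)                  # common shock creates endogeneity
    e2 = u + zi * rng.gauss(0, 1)        # heteroskedastic first-stage error
    e1 = u + rng.gauss(0, 1)
    xi = 1.0 + e2
    x.append(xi); z.append(zi); y.append(beta * xi + e1)

bhat = lewbel_iv_slope(y, x, z)
xbar, ybar = sum(x) / n, sum(y) / n
ols = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
       / sum((xi - xbar) ** 2 for xi in x))
```

Here OLS is badly biased by the common shock, while the heteroskedasticity-generated instrument recovers the structural slope without any external instrument.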
By:  Jonas Rothfuss; Fabio Ferreira; Simon Walther; Maxim Ulrich 
Abstract:  Given a set of empirical observations, conditional density estimation aims to capture the statistical relationship between a conditional variable $\mathbf{x}$ and a dependent variable $\mathbf{y}$ by modeling their conditional probability $p(\mathbf{y}\mid\mathbf{x})$. The paper develops best practices for conditional density estimation for finance applications with neural networks, grounded in mathematical insights and empirical evaluations. In particular, we introduce a noise regularization and data normalization scheme, alleviating problems with overfitting, initialization and hyperparameter sensitivity of such estimators. We compare our proposed methodology with popular semi- and nonparametric density estimators, underpin its effectiveness in various benchmarks on simulated and Euro Stoxx 50 data, and show its superior performance. Our methodology allows us to obtain high-quality estimators for statistical expectations of higher moments, quantiles and nonlinear return transformations, with very few assumptions about the return dynamics. 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1903.00954&r=all 
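The estimation target, p(y|x), can be formed nonparametrically as a ratio of joint to marginal kernel estimates; this plain-KDE sketch (bandwidths and DGP are illustrative assumptions) shows the object, while the paper's contribution is a neural estimator whose noise regularization acts, in spirit, like training on jittered data:

```python
import math
import random

def gauss_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def conditional_density(x0, y_grid, xs, ys, h=0.3):
    """Kernel estimate of p(y | x = x0) as joint / marginal.
    Integrates to one over y by construction."""
    wx = [gauss_kernel((x0 - xi) / h) for xi in xs]
    sw = sum(wx)
    return [sum(wi * gauss_kernel((y0 - yi) / h) / h for wi, yi in zip(wx, ys)) / sw
            for y0 in y_grid]

# illustrative data: y depends linearly on x with Gaussian noise
rng = random.Random(3)
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [0.5 * xi + rng.gauss(0, 0.5) for xi in xs]
grid = [-4 + 8 * k / 400 for k in range(401)]
dens = conditional_density(0.0, grid, xs, ys)
```

Because the estimate is a weighted mixture of Gaussian kernels, it is a proper density in y, which is the baseline any neural conditional density estimator is benchmarked against.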
By:  Sariev, Eduard; Germano, Guido 
Abstract:  Support vector machines (SVM) have been extensively used for classification problems in many areas such as gene, text and image recognition. However, SVM have rarely been used to estimate the probability of default (PD) in credit risk. In this paper, we advocate the application of SVM, rather than the popular logistic regression (LR) method, for the estimation of both corporate and retail PD. Our results indicate that most of the time SVM outperforms LR in terms of classification accuracy for the corporate and retail segments. We propose a new wrapper feature selection based on maximizing the distance of the support vectors from the separating hyperplane and apply it to identify the main PD drivers. We used three datasets to test the PD estimation, containing (1) retail obligors from Germany, (2) corporate obligors from Eastern Europe, and (3) corporate obligors from Poland. Total assets, total liabilities, and sales are identified as frequent default drivers for the corporate datasets, whereas current account status and duration of the current account are frequent default drivers for the retail dataset. 
Keywords:  default risk; logistic regression; support vector machines; ES/K002309/1 
JEL:  C10 C13 
Date:  2018–11–28 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:100211&r=all 
By:  Stelios Arvanitis (Athens University of Economics and Business); Alexandros Louka (Athens University of Economics and Business) 
Abstract:  We derive the limit theory of the Gaussian QMLE in the nonstationary GARCH(1,1) model when the squared innovation process lies in the domain of attraction of a stable law. Analogously to the stationary case, when the stability parameter lies in (1, 2], we find regularly varying rates and stable limits for the QMLE of the ARCH and GARCH parameters. 
Keywords:  Martingale Limit Theorem, Domain of Attraction, Stable Distribution, Slowly Varying Sequence, Non-Stationarity, Gaussian QMLE, Regularly Varying Rate. 
JEL:  C32 
Date:  2017–05 
URL:  http://d.repec.org/n?u=RePEc:aeb:wpaper:201705:y:2017&r=all 