
on Econometrics 
By:  David Kaplan (Department of Economics, University of Missouri-Columbia)
Abstract:  Estimation of a sample quantile's variance requires estimation of the probability density at the quantile. The common quantile spacing method involves a smoothing parameter m. When m, n → ∞, the corresponding Studentized test statistic asymptotically follows a standard normal distribution. Holding m fixed asymptotically yields a nonstandard distribution dependent on m that contains the Edgeworth expansion term capturing the variance of the quantile spacing. Consequently, the fixed-m distribution is more accurate than the standard normal under both asymptotic frameworks. For the fixed-m test, I propose an m to maximize power subject to size control, as calculated via Edgeworth expansion. Compared with similar methods, the new method controls size better and maintains good or better power in simulations. Results for two-sample quantile treatment effect inference are given in parallel.
Keywords:  Edgeworth expansion, fixed-smoothing asymptotics, inference, quantile, studentize, testing-optimal
JEL:  C01 C12 C21 
Date:  2013–07–05 
URL:  http://d.repec.org/n?u=RePEc:umc:wpaper:1313&r=ecm 
By:  David Kaplan (Department of Economics, University of Missouri-Columbia); Matt Goldman
Abstract:  We propose new methods of inference on distributions based on the finite-sample joint (ordered) Dirichlet distribution of uniform order statistics. The commonly used Kolmogorov-Smirnov test is known to have low sensitivity to deviations in the tails. Weighting by the inverse pointwise asymptotic standard deviation is known to suffer the opposite problem: sensitivity in the middle of the distribution is much lower than in the tails. Our Dirichlet-based method finally succeeds in having equal pointwise type I error across the entire distribution, even in finite samples, while maintaining exact overall type I error. Our method may alternatively be interpreted as a family of tests (one at each order statistic) that controls the familywise error rate, or as constructing a uniform confidence band for the unknown distribution function. We also propose two-sample tests, which also have exact finite-sample size, for equality or first-order stochastic dominance, based on uniform confidence bands for the two distribution functions. Simulations and empirical examples demonstrate our new methods. Fully operational code is provided.
JEL:  C01 C12 C21
Keywords:  fractional order statistics, nonparametric statistics, quantile inference, quantile treatment effect 
Date:  2013–10–31 
URL:  http://d.repec.org/n?u=RePEc:umc:wpaper:1319&r=ecm 
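The key distributional fact behind this paper is standard: the k-th order statistic of n iid Uniform(0,1) draws follows a Beta(k, n+1-k) law, so a band with equal pointwise type I error at every order statistic can be built and its joint (familywise) coverage measured. The sketch below is a hedged Monte Carlo illustration of that idea only, not the authors' exact Dirichlet-based construction; all sample sizes and levels are illustrative.

```python
import random

# Minimal sketch (not the authors' code): approximate equal-pointwise-level
# bands for each uniform order statistic by Monte Carlo, then measure the
# joint (familywise) coverage of the resulting uniform band.
random.seed(0)

n, sims = 20, 4000
# simulate sorted uniform samples: rows = simulations, columns = order stats
draws = [sorted(random.random() for _ in range(n)) for _ in range(sims)]

def band(alpha_pt):
    """Equal pointwise-level band: for each order statistic k, take the
    alpha_pt/2 and 1-alpha_pt/2 empirical quantiles across simulations."""
    lo, hi = [], []
    for k in range(n):
        col = sorted(d[k] for d in draws)
        lo.append(col[int(sims * alpha_pt / 2)])
        hi.append(col[int(sims * (1 - alpha_pt / 2)) - 1])
    return lo, hi

def joint_coverage(lo, hi):
    """Fraction of simulated samples lying inside the band at ALL k."""
    inside = sum(all(lo[k] <= d[k] <= hi[k] for k in range(n)) for d in draws)
    return inside / sims

lo, hi = band(0.01)            # 1% pointwise level at every order statistic
cov = joint_coverage(lo, hi)   # joint coverage < 99% due to multiplicity
print(f"joint coverage at 1% pointwise level: {cov:.3f}")
```

Calibrating the pointwise level so that the joint coverage hits an exact target is the step the paper solves in finite samples.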
By:  David Kaplan (Department of Economics, University of Missouri-Columbia); Matt Goldman
Abstract:  The literature has two types of fractional order statistics: an `ideal' (unobserved) type based on a beta distribution, and an observable type linearly interpolated between consecutive order statistics. We show convergence in distribution of the two types at an O(n^{-1}) rate, which we also show holds for joint vectors and linear combinations of fractional order statistics. This connection justifies use of the linearly interpolated type in practice when sampling theory is based on the `ideal' type. For example, the coverage probability error (CPE) has the same O(n^{-1}) magnitude for one-sample nonparametric joint confidence intervals over multiple quantiles. For a single quantile, our new analytic calibration reduces the CPE to nearly O(n^{-3/2}), and our new inference method on linear combinations of quantiles has O(n^{-2/3}) CPE. With additional theoretical work, we propose a new method for two-sample quantile treatment effect inference, which has two-sided CPE of order O(n^{-2/3}), or O(n^{-1}) under exchangeability, and one-sided CPE of order O(n^{-1/2}). In an application of our method to data from a recent paper on "gift exchange," we reveal interesting heterogeneity in the treatment effect of "gift wages." In simulations, our quantile treatment effect hypothesis test compares favorably with existing methods in both size and power properties. Along the way, we provide high-order approximations of the PDF and PDF derivative of a Dirichlet distribution in terms of the normal.
Keywords:  fractional order statistics, nonparametric statistics, quantile inference, quantile treatment effect 
JEL:  C01 C12 C21 
Date:  2013–09–05 
URL:  http://d.repec.org/n?u=RePEc:umc:wpaper:1315&r=ecm 
By:  Heng Chen 
Abstract:  Estimation of the quantile model, especially with a large data set, can be computationally burdensome. This paper proposes using the Gaussian approximation, also known as quantile coupling, to estimate a quantile model. The intuition of quantile coupling is to divide the original observations into bins with an equal number of observations and then compute order statistics within these bins. Quantile coupling allows one to apply standard Gaussian-based estimation and inference to the transformed data set. The resulting estimator is asymptotically normal with a parametric convergence rate. A key advantage of this method is that it is faster than the conventional check function approach when handling a sizable data set.
Keywords:  Econometric and statistical methods 
JEL:  C13 C14 C21 
Date:  2014 
URL:  http://d.repec.org/n?u=RePEc:bca:bocawp:1424&r=ecm 
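The binning step the abstract describes can be sketched directly: split the sample into equal-size bins, take the tau-quantile order statistic within each bin, and treat the transformed (approximately Gaussian) data with standard methods. The code below is a hedged toy version of that idea, not the paper's estimator; the function name, bin size, and plain averaging of the bin statistics are illustrative choices.

```python
import math, random

# Minimal sketch of the quantile-coupling binning step described in the
# abstract: within-bin order statistics are computed and then treated with a
# standard Gaussian-style estimator (here, a simple average).
random.seed(1)

def coupled_quantile(data, tau, m):
    """Estimate the tau-quantile from within-bin order statistics.
    m = bin size (smoothing parameter); assumes len(data) divisible by m."""
    data = list(data)
    random.shuffle(data)                  # bins should be exchangeable
    n_bins = len(data) // m
    rank = max(1, math.ceil(tau * m))     # order-statistic rank within a bin
    stats = []
    for b in range(n_bins):
        bin_obs = sorted(data[b * m:(b + 1) * m])
        stats.append(bin_obs[rank - 1])
    return sum(stats) / len(stats)        # Gaussian-style estimator

sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]
est = coupled_quantile(sample, tau=0.5, m=25)
print(f"median estimate: {est:.3f}")      # near the true median of 0
```

The computational gain comes from sorting many small bins instead of optimizing a check function over the full sample.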
By:  David Kaplan (Department of Economics, University of Missouri-Columbia)
Abstract:  The literature has two types of fractional order statistics: an `ideal' (unobserved) type based on a beta distribution, and an observable type linearly interpolated between consecutive order statistics. From the nonparametric perspective of local smoothing, we examine inference on conditional quantiles, as well as linear combinations of conditional quantiles and conditional quantile treatment effects. This paper develops a framework for translating the powerful, high-order accurate IDEAL results (Goldman and Kaplan, 2012) from their original unconditional context into a conditional context, via a uniform kernel. Under mild smoothness assumptions, our new conditional IDEAL method's two-sided pointwise coverage probability error is O(n^{-2/(2+d)}), where d is the dimension of the conditioning vector and n is the total sample size. For d ≤ 2, this is better than conventional inference based on asymptotic normality or a standard bootstrap. It is also better for other d depending on smoothness assumptions. For example, conditional IDEAL is more accurate for d = 3 unless 11 or more derivatives of the unknown function exist and a corresponding local polynomial of degree 11 is used (which has 364 terms since interactions are required). Even as d → ∞, conditional IDEAL is more accurate unless the number of derivatives is at least four, and the number of terms in the corresponding local polynomial goes to infinity as d → ∞. The tradeoff between the effective (local) sample size and bias determines the optimal bandwidth rate, and we propose a feasible plug-in bandwidth. Simulations show that IDEAL is more accurate than popular current methods, significantly reducing size distortion in some cases while substantially increasing power (while still controlling size) in others. Computationally, our new method runs much more quickly than existing methods for medium and large datasets (roughly n ≥ 1000). We also examine health outcomes in Indonesia for an empirical example.
Keywords:  fractional order statistics, nonparametric statistics, quantile inference, quantile treatment effect 
JEL:  C01 C12 C21 
Date:  2013–09–05 
URL:  http://d.repec.org/n?u=RePEc:umc:wpaper:1316&r=ecm 
By:  David Kaplan (Department of Economics, University of MissouriColumbia); Yixiao Sun 
Abstract:  The moment conditions or estimating equations for instrumental variables quantile regression involve the discontinuous indicator function. We instead use smoothed estimating equations, with bandwidth h. This is known to allow higher-order expansions that justify bootstrap refinements for inference. Computation of the estimator also becomes simpler and more reliable, especially with (more) endogenous regressors. We show that the mean squared error of the vector of estimating equations is minimized for some h > 0, which also reduces the mean squared error of the parameter estimators. The same h also minimizes higher-order type I error for a χ² test, leading to improved size-adjusted power. Our plug-in bandwidth consistently reproduces all of these properties in simulations.
Keywords:  bandwidth choice, higher-order properties, instrumental variables, quantile regression, smoothing
JEL:  C01 C12 C13 C21 C26 
Date:  2013–09–05 
URL:  http://d.repec.org/n?u=RePEc:umc:wpaper:1314&r=ecm 
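The smoothing idea the abstract relies on can be shown in its simplest form: in a quantile estimating equation, the indicator 1{y ≤ q} is replaced by a smooth kernel CDF G((q - y)/h) with bandwidth h, making the equation continuous in the parameter and easy to solve. The sketch below applies this to a one-dimensional location quantile (no instruments), so it is a hedged illustration of the smoothing device only, not the authors' IV estimator.

```python
import math, random

# Minimal sketch: solve a smoothed quantile estimating equation by bisection.
# The standard normal CDF plays the role of the smoothing function G.
random.seed(2)

def G(u):
    """Standard normal CDF used as the smoothing function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def smoothed_quantile(y, tau, h, lo=-10.0, hi=10.0, iters=80):
    """Solve (1/n) * sum_i G((q - y_i)/h) = tau for q by bisection;
    the left side is continuous and increasing in q, unlike the indicator."""
    def ee(q):
        return sum(G((q - yi) / h) for yi in y) / len(y) - tau
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ee(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y = [random.gauss(0.0, 1.0) for _ in range(5000)]
q = smoothed_quantile(y, tau=0.75, h=0.2)
print(f"smoothed 0.75-quantile: {q:.3f}")  # near the N(0,1) value of ~0.674
```

The bandwidth h trades smoothing bias against the numerical and higher-order benefits the paper quantifies; h=0.2 here is purely illustrative.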
By:  Markku Lanne (University of Helsinki and CREATES); Henri Nyberg (University of Helsinki) 
Abstract:  We propose a new generalized forecast error variance decomposition with the property that the proportions of the impact accounted for by innovations in each variable sum to unity. Our decomposition is based on the well-established concept of the generalized impulse response function. The use of the new decomposition is illustrated with an empirical application to U.S. output growth and interest rate spread data.
Keywords:  Forecast error variance decomposition, generalized impulse response function, output growth, term spread 
JEL:  C13 C32 C53 
Date:  2014–05–19 
URL:  http://d.repec.org/n?u=RePEc:aah:create:201417&r=ecm 
By:  Harald Oberhofer; Michael Pfaffermayr (WIFO) 
Abstract:  This paper discusses two alternative two-part models for fractional response variables that are defined as ratios of integers. The first two-part model assumes a Binomial distribution and known group size. It nests the one-part fractional response model proposed by Papke and Wooldridge (1996) and thus allows one to apply Wald, LM, and/or LR tests in order to discriminate between the two models. The second model extends the first by allowing for overdispersion. Monte Carlo studies reveal that, for both models, the proposed tests have sufficient power and are properly sized. Finally, we demonstrate the usefulness of the proposed two-part models for data on the 401(k) pension plan participation rates used in Papke and Wooldridge (1996).
Keywords:  Fractional response models for ratios of integers, one-part versus two-part models, Wald test, LM test, LR test
Date:  2014–06–17 
URL:  http://d.repec.org/n?u=RePEc:wfo:wpaper:y:2014:i:472&r=ecm 
By:  Peter Ganong; Simon Jäger
Abstract:  The Regression Kink (RK) design is an increasingly popular empirical method, with more than 20 studies circulated using RK in the 5 years since the initial circulation of Card, Lee, Pei and Weber (2012). We document empirically that these estimates, which typically use local linear regression, are highly sensitive to curvature in the underlying relationship between the outcome and the assignment variable. As an alternative inference procedure, motivated by randomization inference, we propose that researchers construct a distribution of placebo estimates in regions without a policy kink. We apply our procedure to three empirical RK applications – two administrative UI datasets with true policy kinks and the 1980 Census, which has no policy kinks – and we find that statistical significance based on conventional p-values may be spurious. In contrast, our permutation test reinforces the asymptotic inference results of a recent Regression Discontinuity study and a Difference-in-Differences study. Finally, we propose estimating RK models with a modified cubic splines framework and test the performance of different estimators in a simulation exercise. Cubic specifications – in particular recently proposed robust estimators (Calonico, Cattaneo and Titiunik 2014) – yield short interval lengths with good coverage rates.
Date:  2014–01 
URL:  http://d.repec.org/n?u=RePEc:qsh:wpaper:174531&r=ecm 
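The placebo procedure the abstract proposes can be sketched in miniature: estimate the slope change at the policy kink with local linear regressions on each side, recompute the same statistic at placebo points where no kink exists, and compare the actual estimate to the placebo distribution. The code below is a hedged toy version with simulated data; the bandwidths, placebo locations, and helper names are illustrative, and the authors' implementation differs in detail.

```python
import random

# Minimal sketch of placebo-based RK inference: slope change at the true kink
# versus the same statistic computed at kink-free placebo points.
random.seed(3)

def ols_slope(pts):
    """Simple OLS slope for a list of (x, y) pairs."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    return sxy / sxx

def kink_estimate(data, c, bw):
    """Local linear slope above c minus slope below c, within bandwidth bw."""
    left = [(x, y) for x, y in data if c - bw <= x < c]
    right = [(x, y) for x, y in data if c <= x <= c + bw]
    return ols_slope(right) - ols_slope(left)

# simulated data: true kink of size 1.0 at x = 0, noisy outcome
data = [(x, (x if x > 0 else 0.0) + random.gauss(0, 0.3))
        for x in (random.uniform(-2, 2) for _ in range(4000))]

actual = kink_estimate(data, c=0.0, bw=0.5)
placebos = [kink_estimate(data, c=c, bw=0.3)
            for c in (-1.4, -1.0, 1.0, 1.4)]   # regions without a kink
p_rank = sum(abs(pl) >= abs(actual) for pl in placebos) / len(placebos)
print(f"kink estimate: {actual:.2f}; placebo exceedance rate: {p_rank:.2f}")
```

In practice one would use many more placebo points; the exceedance rate of the placebo statistics then plays the role of a randomization-inference p-value.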
By:  Ishanu Chattopadhyay 
Abstract:  While correlation measures are used to discern statistical relationships between observed variables in almost all branches of data-driven scientific inquiry, what we are really interested in is the existence of causal dependence. Designing an efficient causality test that may be carried out in the absence of restrictive presuppositions on the underlying dynamical structure of the data at hand is nontrivial. Nevertheless, the ability to computationally infer statistical prima facie evidence of causal dependence may yield a far more discriminative tool for data analysis compared to the calculation of simple correlations. In the present work, we present a new nonparametric test of Granger causality for quantized or symbolic data streams generated by ergodic stationary sources. In contrast to state-of-the-art binary tests, our approach makes precise and computes the degree of causal dependence between data streams, without making any restrictive assumptions, linearity or otherwise. Additionally, without any a priori imposition of specific dynamical structure, we infer explicit generative models of causal cross-dependence, which may then be used for prediction. These explicit models are represented as generalized probabilistic automata, referred to as crossed automata, and are shown to be sufficient to capture a fairly general class of causal dependence. The proposed algorithms are computationally efficient in the PAC sense; i.e., we find good models of cross-dependence with high probability, with polynomial run times and sample complexities. The theoretical results are applied to weekly search-frequency data from the Google Trends API for a chosen set of socially "charged" keywords. The causality network inferred from this dataset reveals, quite expectedly, the causal importance of certain keywords. It is also illustrated that correlation analysis fails to gather such insight.
Date:  2014–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1406.6651&r=ecm 
By:  Juan Carlos MartínezOvando; Sergio I. OlivaresGuzmán; Adriana RoldánRodríguez 
Abstract:  In this paper, we develop a new model-based method for inference on totals and averages of finite populations segmented into planned domains or strata. Within each stratum, we decompose the total as the sum of its sampled and unsampled parts, making inference on the unsampled part using Bayesian nonparametric methods. Additionally, we extend this method to make inference on totals of unplanned domains, simultaneously modelling, within each stratum, the underlying uncertainty about the composition of the population and the totals across unplanned domains. Making inference on population averages is straightforward in both frameworks. To illustrate these methods, we develop a simulation exercise and evaluate the uncertainty surrounding the gender wage gap in Mexico.
Keywords:  Survey methods, robustness, species-sampling models
JEL:  C11 C14 C42 C81 C88 J31 
Date:  2014–02 
URL:  http://d.repec.org/n?u=RePEc:bdm:wpaper:201404&r=ecm 
By:  Barbara Rossi; Tatevik Sekhposyan 
Abstract:  This paper proposes a framework to implement regression-based tests of predictive ability in unstable environments, including, in particular, forecast unbiasedness and efficiency tests, commonly referred to as tests of forecast rationality. Our framework is general: it can be applied to model-based forecasts obtained with either recursive or rolling window estimation schemes, as well as to forecasts that are model-free. The proposed tests provide more evidence against forecast rationality than previously found in the Federal Reserve's Greenbook forecasts as well as in survey-based private forecasts. They confirm, however, that the Federal Reserve has additional information about current and future states of the economy relative to market participants.
Keywords:  Forecasting, forecast rationality, regression-based tests of forecasting ability, Greenbook forecasts, survey forecasts, real-time data
JEL:  C22 C52 C53 
Date:  2014–06 
URL:  http://d.repec.org/n?u=RePEc:upf:upfgen:1426&r=ecm 
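The classical regression-based rationality check that this paper generalizes to unstable environments is the Mincer-Zarnowitz regression: regress realizations on forecasts, and rationality implies an intercept of 0 and a slope of 1. The sketch below illustrates only that baseline regression on simulated data; it is not the paper's instability-robust procedure, and the function name and parameter values are illustrative.

```python
import random

# Minimal sketch of the baseline Mincer-Zarnowitz rationality regression:
# under rationality, outcome = forecast + unpredictable noise, so the OLS
# intercept and slope should be close to 0 and 1 respectively.
random.seed(4)

def mz_regression(forecasts, outcomes):
    """OLS of outcome on forecast; returns (intercept, slope)."""
    n = len(forecasts)
    mf = sum(forecasts) / n
    mo = sum(outcomes) / n
    sxy = sum((f - mf) * (o - mo) for f, o in zip(forecasts, outcomes))
    sxx = sum((f - mf) ** 2 for f in forecasts)
    slope = sxy / sxx
    return mo - slope * mf, slope

# rational forecasts: the forecast error is pure noise
f = [random.gauss(2.0, 1.0) for _ in range(20_000)]
y = [fi + random.gauss(0.0, 0.5) for fi in f]
a, b = mz_regression(f, y)
print(f"intercept: {a:.3f}, slope: {b:.3f}")  # near (0, 1) under rationality
```

The paper's contribution is to make tests of these restrictions valid when the forecast-outcome relationship is unstable over time.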
By:  Dirk Tasche 
Abstract:  How does one forecast next year's portfolio-wide credit default rate based on last year's default observations and the current score distribution? A classical approach to this problem consists of fitting a mixture of the conditional score distributions observed last year to the current score distribution. This is a special (simple) case of a finite mixture model in which the mixture components are fixed and only the weights of the components are estimated. The optimum weights provide a forecast of next year's portfolio-wide default rate. We point out that the maximum-likelihood (ML) approach to fitting the mixture distribution not only gives an optimum but even an exact fit if we allow the mixture components to vary but keep their density ratio fixed. From this observation we can conclude that the standard default rate forecast based on last year's conditional default rates will always be located between last year's portfolio-wide default rate and the ML forecast for next year. We also discuss how the mixture-model-based estimation methods can be used to forecast total loss. This involves the reinterpretation of an individual classification problem as a collective quantification problem.
Date:  2014–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1406.6038&r=ecm 
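The fixed-components mixture fit described in the abstract can be sketched directly: hold last year's conditional score densities (defaulters vs. non-defaulters) fixed and choose only the mixture weight w, the portfolio-wide default-rate forecast, to maximize the likelihood of this year's scores. In the hedged demo below the Gaussian components, grid search, and sample sizes are all illustrative assumptions, not part of the paper.

```python
import math, random

# Minimal sketch of ML estimation of a mixture weight with fixed components.
random.seed(5)

def npdf(x, mu, sigma):
    """Normal density, used here as an assumed component score density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def ml_weight(scores, pdf_bad, pdf_good, grid=200):
    """Maximize sum log(w*pdf_bad + (1-w)*pdf_good) over w by grid search."""
    best_w, best_ll = 0.0, -float("inf")
    for i in range(1, grid):
        w = i / grid
        ll = sum(math.log(w * pdf_bad(x) + (1 - w) * pdf_good(x)) for x in scores)
        if ll > best_ll:
            best_w, best_ll = w, ll
    return best_w

# this year's scores: 20% defaulters (low scores), 80% good obligors
scores = [random.gauss(-1.0, 1.0) if random.random() < 0.2 else random.gauss(1.0, 1.0)
          for _ in range(2000)]
w_hat = ml_weight(scores, lambda x: npdf(x, -1.0, 1.0), lambda x: npdf(x, 1.0, 1.0))
print(f"ML default-rate forecast: {w_hat:.3f}")  # near the true 20%
```

In production one would replace the grid search with EM or a one-dimensional optimizer, but the objective being maximized is the same.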
By:  Malikov, Emir; Kumbhakar, Subal C. 
Abstract:  This paper considers a generalized panel data model of polychotomous and/or sequential switching which can also accommodate the dependence between unobserved effects and covariates in the model. We showcase our model using an empirical illustration in which we estimate scope economies for the publicly owned electric utilities in the U.S. during the period from 2001 to 2003. 
Keywords:  Correlated Effects, Multinomial Logit, Nested Logit, Panel Data, Polychotomous, Selection 
JEL:  C33 C34 
Date:  2014–05–28 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:56770&r=ecm 
By:  Casasnovas, Valero L.; Aldanondo, Ana M. 
Abstract:  We extend the Tauer (2001) and Färe et al. (2004) analyses of aggregation bias in technical efficiency measurement to multiple criteria decision analysis. We show input aggregation conditions consistent with multiple criteria evaluation of overall efficiency in conjunction with variation in aggregation bias. 
Keywords:  Data Envelopment Analysis, Input aggregation, multiple objectives 
JEL:  C61 D20 
Date:  2014–06–12 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:56778&r=ecm 
By:  Karapanagiotidis, Paul 
Abstract:  This paper reviews the general state-space modeling framework. The discussion focuses heavily on the three prediction problems of forecasting, filtering, and smoothing within the state-space context. Numerous examples are provided detailing special cases of the state-space model and its use in solving a number of modeling issues. Independent sections are also devoted to the topics of factor models and Harvey's unobserved components framework.
Keywords:  state-space models, signal extraction, unobserved components
JEL:  C10 C32 C51 C53 C58 
Date:  2014–06–03 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:56807&r=ecm 