
on Econometrics 
By:  Fabrizio Iacone (Università degli Studi di Milano); Morten Ørregaard Nielsen (Queen's University and CREATES); A.M. Robert Taylor (University of Essex) 
Abstract:  Lobato and Robinson (1998) develop semiparametric tests for the null hypothesis that a series is weakly autocorrelated, or I(0), about a constant level, against fractionally integrated alternatives. These tests have the advantage that the user is not required to specify a parametric model for any weak autocorrelation present in the series. We extend this approach in two distinct ways. First, we show that it can be generalised to allow for testing of the null hypothesis that a series is I(\delta) for any \delta lying in the usual stationary and invertible region of the parameter space. Second, it is well known in the literature that long memory and level breaks can be mistaken for one another, with unmodelled level breaks rendering fractional integration tests highly unreliable. We therefore extend the Lobato and Robinson (1998) approach to allow for the possibility of changes in level at unknown points in the series. We show that the resulting statistics have standard limiting null distributions, and that the tests based on these statistics attain the same asymptotic local power functions as infeasible tests based on the unobserved errors, and hence there is no loss in asymptotic local power from allowing for level breaks, even where none is present. We report results from a Monte Carlo study into the finite-sample behaviour of our proposed tests, as well as several empirical examples. 
Keywords:  fractional integration, level breaks, Lagrange multiplier testing principle, spurious long memory, local Whittle likelihood, conditional heteroskedasticity 
JEL:  C22 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:qed:wpaper:1431&r=all 
By:  Linton, O.; Tang, H. 
Abstract:  We propose a new estimator, the quadratic form estimator, of the Kronecker product model for covariance matrices. We show that this estimator has good properties in the large dimensional case (i.e., the cross-sectional dimension n is large relative to the sample size T). In particular, the quadratic form estimator is consistent in a relative Frobenius norm sense provided log^3 n/T → 0. We obtain the limiting distributions of the Lagrange multiplier (LM) and Wald tests under both the null and local alternatives concerning the mean vector μ. Testing linear restrictions of μ is also investigated. Finally, our methodology performs well in finite-sample situations both when the Kronecker product model is true and when it is not true. 
Keywords:  Covariance matrix, Kronecker product, Quadratic form, Lagrange multiplier test, Wald test 
Date:  2020–06–01 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:2050&r=all 
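The abstract above concerns the authors' quadratic form estimator, which is not reproduced here. As a hedged sketch of the Kronecker product model itself, the classical nearest-Kronecker-product projection (Van Loan's rearrangement, an assumption of this sketch rather than the paper's estimator) fits S ≈ A ⊗ B via a rank-one SVD of a rearranged matrix:

```python
import numpy as np

def nearest_kronecker(S, n1, n2):
    """Best Frobenius-norm approximation S ~ kron(A, B) with A (n1 x n1)
    and B (n2 x n2). Van Loan's rearrangement stacks vec of each
    (n2 x n2) block of S into a matrix R; if S = kron(A, B) exactly,
    R = vec(A) vec(B)^T is rank one, so the leading singular pair
    recovers A and B up to a shared scale."""
    R = np.empty((n1 * n1, n2 * n2))
    for i in range(n1):
        for j in range(n1):
            R[i * n1 + j] = S[i*n2:(i+1)*n2, j*n2:(j+1)*n2].reshape(-1)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(n1, n1)
    B = np.sqrt(s[0]) * Vt[0].reshape(n2, n2)
    return A, B

# Sanity check: recover an exact Kronecker structure.
A0 = np.eye(3) + 0.3 * np.ones((3, 3))
B0 = np.diag([1.0, 2.0])
S = np.kron(A0, B0)
A, B = nearest_kronecker(S, 3, 2)
print(np.allclose(np.kron(A, B), S))  # True
```

In the large-dimensional setting of the paper, S would be a sample covariance matrix with n = n1 × n2 rows and columns.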
By:  Christoph Breunig; Xiaohong Chen 
Abstract:  This paper proposes simple, data-driven, optimal rate-adaptive inferences on a structural function in semi-nonparametric conditional moment restrictions. We consider two types of hypothesis tests based on leave-one-out sieve estimators. A structure-space test (ST) uses a quadratic distance between the structural functions of endogenous variables; while an image-space test (IT) uses a quadratic distance of the conditional moment from zero. For both tests, we analyze their respective classes of nonparametric alternative models that are separated from the null hypothesis by the minimax rate of testing. That is, the sum of the type I and the type II errors of the test, uniformly over the class of nonparametric alternative models, cannot be improved by any other test. Our new minimax rate of ST differs from the known minimax rate of estimation in nonparametric instrumental variables (NPIV) models. We propose computationally simple and novel exponential scan data-driven choices of sieve regularization parameters and adjusted chi-squared critical values. The resulting tests attain the minimax rate of testing, and hence optimally adapt to the unknown smoothness of functions and are robust to the unknown degree of ill-posedness (endogeneity). Data-driven confidence sets are easily obtained by inverting the adaptive ST. Monte Carlo studies demonstrate that our adaptive ST has good size and power properties in finite samples for testing monotonicity or equality restrictions in NPIV models. Empirical applications to nonparametric multi-product demands with endogenous prices are presented. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.09587&r=all 
By:  Rustam Ibragimov; Jihyun Kim; Anton Skrobotov 
Abstract:  We propose two robust methods for testing hypotheses on unknown parameters of predictive regression models under heterogeneous and persistent volatility as well as endogenous, persistent and/or fat-tailed regressors and errors. The proposed robust testing approaches are applicable both in the case of discrete and continuous time models. Both of the methods use the Cauchy estimator to effectively handle the problems of endogeneity, persistence and/or fat-tailedness in regressors and errors. The difference between our two methods is how the heterogeneous volatility is controlled. The first method relies on robust t-statistic inference using group estimators of a regression parameter of interest proposed in Ibragimov and Müller (2010). It is simple to implement, but requires the exogenous volatility assumption. To relax the exogenous volatility assumption, we propose another method which relies on the nonparametric correction of volatility. The proposed methods perform well compared with widely used alternative inference procedures in terms of their finite-sample properties. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.01191&r=all 
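The paper's robust t-statistic grouping and nonparametric volatility corrections are not reproduced here; the common core of both methods is the Cauchy estimator, which instruments a persistent regressor with its own sign. A minimal sketch under an assumed zero-intercept predictive regression:

```python
import numpy as np

def cauchy_estimator(y, x_lag):
    """Cauchy (sign-instrument) estimator of b in y_t = b * x_{t-1} + e_t.
    Using sign(x_{t-1}) as an instrument tames persistence, endogeneity
    and fat tails in the regressor."""
    z = np.sign(x_lag)
    return np.sum(z * y) / np.sum(z * x_lag)

rng = np.random.default_rng(1)
T = 5000
x = np.cumsum(rng.standard_normal(T))        # persistent (unit-root) regressor
y = 0.5 * x[:-1] + rng.standard_normal(T - 1)
b_hat = cauchy_estimator(y, x[:-1])
print(round(b_hat, 2))  # close to the true value 0.5
```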
By:  Jochmans, K. 
Abstract:  We consider inference in linear regression models that is robust to heteroskedasticity and the presence of many control variables. When the number of control variables increases at the same rate as the sample size, the usual heteroskedasticity-robust estimators of the covariance matrix are inconsistent. Hence, tests based on these estimators are size distorted even in large samples. An alternative covariance-matrix estimator for such a setting is presented that complements recent work by Cattaneo, Jansson and Newey (2018). We provide high-level conditions for our approach to deliver (asymptotically) size-correct inference as well as more primitive conditions for three special cases. Simulation results and an empirical illustration to inference on the union premium are also provided. 
Keywords:  heteroskedasticity, inference, many regressors, statistical leverage 
JEL:  C12 
Date:  2020–04–28 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:2033&r=all 
By:  Zhang, Haoran; Chen, Yunxiao; Li, Xiaoou 
Abstract:  We revisit a singular value decomposition (SVD) algorithm given in Chen et al. (2019b) for exploratory Item Factor Analysis (IFA). This algorithm estimates a multidimensional IFA model by SVD and was used to obtain a starting point for joint maximum likelihood estimation in Chen et al. (2019b). Thanks to the analytic and computational properties of SVD, this algorithm guarantees a unique solution and has a computational advantage over other exploratory IFA methods. Its computational advantage becomes significant when the numbers of respondents, items, and factors are all large. This algorithm can be viewed as a generalization of principal component analysis (PCA) to binary data. In this note, we provide the statistical underpinning of the algorithm. In particular, we show its statistical consistency under the same double asymptotic setting as in Chen et al. (2019b). We also demonstrate how this algorithm provides a scree plot for investigating the number of factors and provide its asymptotic theory. Further extensions of the algorithm are discussed. Finally, simulation studies suggest that the algorithm has good finite sample performance. 
Keywords:  exploratory item factor analysis; IFA; singular value decomposition; double asymptotics; generalised PCA for binary data 
JEL:  C1 
Date:  2020–05–26 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:104166&r=all 
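The exact algorithm of Chen et al. (2019b) is not reproduced here. The following rough sketch conveys only the generic "PCA for binary data" idea behind it: a rank truncation of the 0/1 matrix estimates the probability matrix, an entrywise logit maps it to the natural-parameter scale, and a final truncation gives the low-rank estimate. Retaining one extra component at the probability stage (to absorb the mean level) is an assumption of this sketch:

```python
import numpy as np

def binary_svd_estimate(Y, k, eps=0.01):
    """Sketch of SVD-based low-rank estimation for an N x J binary matrix:
    (1) rank-(k+1) SVD of Y approximates the probability matrix,
    (2) clip to [eps, 1-eps] and apply the entrywise logit,
    (3) rank-k truncation yields the natural-parameter estimate."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    r = k + 1                                     # extra component for the mean level
    P = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]        # step 1: probabilities
    P = np.clip(P, eps, 1 - eps)
    M = np.log(P / (1 - P))                       # step 2: entrywise logit
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]     # step 3: rank-k estimate

rng = np.random.default_rng(2)
N, J, k = 2000, 100, 2
M0 = rng.standard_normal((N, k)) @ rng.standard_normal((k, J))  # true low-rank logits
Y = (rng.random((N, J)) < 1 / (1 + np.exp(-M0))).astype(float)
M_hat = binary_svd_estimate(Y, k)
c = float(np.corrcoef(M_hat.ravel(), M0.ravel())[0, 1])
print(c > 0.5)
```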
By:  Sobin Joseph; Lekhapriya Dheeraj Kashyap; Shashi Jain 
Abstract:  The multidimensional Hawkes process (MHP) is a class of self- and mutually exciting point processes that finds a wide range of applications, from the prediction of earthquakes to the modelling of order books in high-frequency trading. This paper makes two major contributions. First, we find an unbiased estimator of the log-likelihood of the Hawkes process, enabling efficient use of the stochastic gradient descent method for maximum likelihood estimation. Second, we propose a specific single-hidden-layer neural network for the nonparametric estimation of the underlying kernels of the MHP. We evaluate the proposed model on both synthetic and real datasets, and find that the method has comparable or better performance than existing estimation methods. The use of a shallow neural network ensures that we do not compromise the interpretability of the Hawkes model, while retaining the flexibility to estimate any non-standard Hawkes excitation kernel. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.02460&r=all 
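The paper's unbiased estimator and neural-network kernels are not reproduced here. As background, the exact log-likelihood that such estimators approximate can be evaluated in linear time for a univariate Hawkes process with exponential kernel alpha*exp(-beta*t), via the standard recursion:

```python
import math

def hawkes_loglik(times, T, mu, alpha, beta):
    """Exact log-likelihood of a univariate Hawkes process with intensity
    mu + alpha * sum_{t_i < t} exp(-beta*(t - t_i)), observed on [0, T].
    `times` must be sorted; R accumulates the excitation recursively."""
    ll = 0.0
    R = 0.0
    prev = None
    for t in times:
        if prev is not None:
            R = math.exp(-beta * (t - prev)) * (1.0 + R)
        ll += math.log(mu + alpha * R)
        prev = t
    # compensator: integral of the intensity over [0, T]
    comp = mu * T + (alpha / beta) * sum(1.0 - math.exp(-beta * (T - t)) for t in times)
    return ll - comp

# Degenerate check: with alpha = 0 this is a Poisson(mu) log-likelihood.
times = [0.5, 1.2, 3.0]
print(hawkes_loglik(times, 5.0, mu=1.0, alpha=0.0, beta=1.0))  # 3*log(1) - 5 = -5.0
```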
By:  Paolo Frumento; Matteo Bottai; Iván Fernández-Val 
Abstract:  In ordinary quantile regression, quantiles of different order are estimated one at a time. An alternative approach, which is referred to as quantile regression coefficients modeling (QRCM), is to model quantile regression coefficients as parametric functions of the order of the quantile. In this paper, we describe how the QRCM paradigm can be applied to longitudinal data. We introduce a two-level quantile function, in which two different quantile regression models are used to describe the (conditional) distribution of the within-subject response and that of the individual effects. We propose a novel type of penalized fixed-effects estimator, and discuss its advantages over standard methods based on $\ell_1$ and $\ell_2$ penalization. We provide model identifiability conditions, derive asymptotic properties, describe goodness-of-fit measures and model selection criteria, present simulation results, and discuss an application. The proposed method has been implemented in the R package qrcm. 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.00160&r=all 
By:  Yuya Sasaki; Yulong Wang 
Abstract:  Common approaches to statistical inference for structural and reduced-form parameters in empirical economic analysis are based on the root-n asymptotic normality of the GMM and M estimators. The canonical root-n asymptotic normality for these classes of estimators requires at least the second moment of the score to be bounded. In this article, we present a method of testing this condition for the asymptotic normality of the GMM and M estimators. Our test has a uniform size control over the set of data generating processes compatible with the root-n asymptotic normality. Simulation studies support this theoretical result. Applying the proposed test to the market share data from the Dominick's Finer Foods retail chain, we find that a common ad hoc procedure to deal with zero market shares results in a failure of the root-n asymptotic normality. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.02541&r=all 
By:  Joris Pinkse; Karl Schurter 
Abstract:  We estimate the density and its derivatives using a local polynomial approximation to the logarithm of an unknown density $f$. The estimator is guaranteed to be nonnegative and achieves the same optimal rate of convergence in the interior as well as at the boundary of the support of $f$. The estimator is therefore well-suited to applications in which nonnegative density estimates are required, such as in semiparametric maximum likelihood estimation. In addition, we show that our estimator compares favorably with other kernel-based methods, both in terms of asymptotic performance and computational ease. Simulation results confirm that our method can perform similarly in finite samples to these alternative methods when they are used with optimal inputs, i.e. an Epanechnikov kernel and an optimally chosen bandwidth sequence. Further simulation evidence demonstrates that, if the researcher modifies the inputs and chooses a larger bandwidth, our approach can even improve upon these optimized alternatives, asymptotically. We provide code in several languages. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.01328&r=all 
By:  Bao-Gen Li; Dian-Yi Ling; Zu-Guo Yu 
Abstract:  When common factors strongly influence two cross-correlated time series recorded in complex natural and social systems, the results will be biased if we use multifractal detrended cross-correlation analysis (MF-DXA) without considering these common factors. Based on multifractal temporally weighted detrended cross-correlation analysis (MF-TWXDFA) proposed by our group and multifractal partial cross-correlation analysis (MF-DPXA) proposed by Qian et al., we propose a new method, multifractal temporally weighted detrended partial cross-correlation analysis (MF-TWDPCCA), to quantify the intrinsic power-law cross-correlation of two non-stationary time series affected by common external factors. We use MF-TWDPCCA to characterize the intrinsic cross-correlations between two simultaneously recorded time series by removing the effects of other potential time series. To test the performance of MF-TWDPCCA, we apply it, MF-TWXDFA and MF-DPXA to artificially simulated series. These numerical tests demonstrate that MF-TWDPCCA can accurately detect the intrinsic cross-correlations between two simultaneously recorded series. To further show its utility, we apply MF-TWDPCCA to time series from stock markets and find significant multifractal power-law cross-correlations between stock returns. A new partial cross-correlation coefficient is defined to quantify the level of intrinsic cross-correlation between two time series. 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.09154&r=all 
By:  Nicklas Werge (LPSM); Olivier Wintenberger (LPSM) 
Abstract:  The Quasi-Maximum Likelihood (QML) procedure is widely used for statistical inference due to its robustness against overdispersion. However, while there are extensive references on non-recursive QML estimation, recursive QML estimation has attracted little attention until recently. In this paper, we investigate the convergence properties of the QML procedure in a general conditionally heteroscedastic time series model, extending the classical offline optimization routines to recursive approximation. We propose an adaptive recursive estimation routine for GARCH models using the technique of Variance Targeting Estimation (VTE) to alleviate the convergence difficulties encountered in the usual QML estimation. Finally, empirical results demonstrate a favorable trade-off between the ability to adapt to time-varying estimates and the stability of the estimation routine. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.02077&r=all 
By:  Miranda Gualdrón, Karen Alejandra; Ruiz Ortega, Esther; Poncela Blanco, Maria Pilar 
Abstract:  Dynamic Factor Models, which assume the existence of a small number of unobserved latent factors that capture the comovements in a system of variables, are the main "big data" tool used by empirical macroeconomists during the last 30 years. One important tool to extract the factors is based on Kalman filter and smoothing procedures that can cope with missing data, mixed-frequency data, time-varying parameters, nonlinearities, nonstationarity and many other characteristics often observed in real systems of economic variables. This paper surveys the literature on latent common factors extracted using Kalman filter and smoothing procedures in the context of Dynamic Factor Models. Signal extraction and parameter estimation issues are analyzed separately. Identification issues are also tackled in both stationary and nonstationary models. Finally, empirical applications are surveyed in both cases. 
Keywords:  State-Space Model; Identification; EM Algorithm; Dynamic Factor Model 
Date:  2020–06–25 
URL:  http://d.repec.org/n?u=RePEc:cte:wsrepe:30644&r=all 
By:  Juan Carlos Escanciano 
Abstract:  This paper provides new uniform rate results for kernel estimators of absolutely regular stationary processes that are uniform in the bandwidth and in infinite-dimensional classes of dependent variables and regressors. Our results are useful for establishing asymptotic theory for two-step semiparametric estimators in time series models. We apply our results to obtain nonparametric estimates and their rates for Expected Shortfall processes. 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2005.09951&r=all 
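As an illustration of the kind of two-step nonparametric object such rate results cover, here is a simple kernel-weighted (Nadaraya-Watson style) conditional Expected Shortfall estimate; this is a hedged sketch, not the estimator analyzed in the paper:

```python
import numpy as np

def nw_expected_shortfall(x, y, x0, alpha=0.05, h=0.3):
    """Kernel estimate of the conditional Expected Shortfall of y given
    x = x0: weight observations with a Gaussian kernel in x, find the
    kernel-weighted alpha-quantile, and average the tail below it."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    w /= w.sum()
    order = np.argsort(y)
    cw = np.cumsum(w[order])
    tail = order[cw <= alpha]
    if tail.size == 0:          # degenerate case: keep the single worst outcome
        tail = order[:1]
    return float(np.average(y[tail], weights=w[tail]))

# Homoskedastic check: for y ~ N(0,1) independent of x, ES_0.05 is about -2.06.
rng = np.random.default_rng(6)
n = 20000
x = rng.random(n)
y = rng.standard_normal(n)
es = nw_expected_shortfall(x, y, x0=0.5)
print(round(es, 2))
```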
By:  Ghattas Badih (I2M - Institut de Mathématiques de Marseille - AMU - Aix Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique); Michel Pierre (I2M - Institut de Mathématiques de Marseille - AMU - Aix Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique; CEReSS - Centre d'études et de recherche sur les services de santé et la qualité de vie - AMU - Aix Marseille Université); Boyer Laurent (CEReSS - Centre d'études et de recherche sur les services de santé et la qualité de vie - AMU - Aix Marseille Université) 
Abstract:  We consider different approaches for assessing variable importance in clustering. We focus on clustering using binary decision trees (CUBT), which is a nonparametric top-down hierarchical clustering method designed for both continuous and nominal data. We suggest a measure of variable importance for this method similar to the one used in Breiman's classification and regression trees. This score is useful to rank the variables in a dataset, to determine which variables are the most important or to detect the irrelevant ones. We analyze both stability and efficiency of this score on different data simulation models in the presence of noise, and compare it to other classical variable importance measures. Our experiments show that variable importance based on CUBT is much more efficient than other approaches in a large variety of situations. 
Keywords:  Variable ranking, Variable importance, Unsupervised learning, CUBT, Deviance 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:hal:journl:hal02007388&r=all 
By:  Xu Cheng (University of Pennsylvania); Winston Wei Dou (University of Pennsylvania); Zhipeng Liao (University of California, Los Angeles) 
Abstract:  This paper shows that robust inference under weak identification is important to the evaluation of many influential macro asset pricing models, including long-run risk models, disaster risk models, and multifactor linear asset pricing models. Building on recent developments in the conditional inference literature, we provide a new specification test by simulating the critical value conditional on a sufficient statistic. This sufficient statistic can be intuitively interpreted as a measure capturing the macroeconomic information decoupled from the underlying content of asset pricing theories. Macro-finance decoupling is an effective way to improve the power of our specification test when asset pricing theories are difficult to refute due to an imbalance in the information content about the key model parameters between macroeconomic moment restrictions and asset pricing cross-equation restrictions. 
Keywords:  Asset Pricing, Conditional Inference, Disaster Risk, Long-Run Risk, Factor Models, Specification Test, Weak Identification 
JEL:  C12 C32 C52 G12 
Date:  2020–05–24 
URL:  http://d.repec.org/n?u=RePEc:pen:papers:20019&r=all 
By:  Kaeding, Matthias 
Abstract:  We model the log-cumulative baseline hazard for the Cox model via Bayesian, monotonic P-splines. This approach permits fast computation, accounting for arbitrary censorship and the inclusion of nonparametric effects. We leverage the computational efficiency to simplify effect interpretation for metric and non-metric variables by combining the restricted mean survival time approach with partial dependence plots. This allows effect interpretation in terms of survival times. Monte Carlo simulations indicate that the proposed methods work well. We illustrate our approach using a large data set of real estate advertisements. 
Keywords:  Bayesian survival analysis, nonparametric modeling, penalized spline, restricted mean survival time 
JEL:  C11 C14 C41 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:zbw:rwirep:850&r=all 
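The Bayesian P-spline machinery is not sketched here; however, the restricted mean survival time the authors use for effect interpretation is simply the area under the survival curve up to a horizon t*, which takes a few lines given any fitted step-function survival curve:

```python
import numpy as np

def rmst(times, surv, t_star):
    """Restricted mean survival time up to t_star: the area under the
    survival curve S(t), given here as a right-continuous step function
    with value surv[i] on [times[i], times[i+1]); S(t) = 1 before the
    first time point."""
    grid = np.concatenate([[0.0], times, [t_star]])
    vals = np.concatenate([[1.0], surv])
    grid = np.clip(grid, 0.0, t_star)
    return float(np.sum(vals * np.diff(grid)))

# Exponential survival S(t) = exp(-t) on a fine grid: RMST(2) ~ 1 - exp(-2).
t = np.linspace(0.001, 2.0, 4000)
S = np.exp(-t)
print(round(rmst(t, S, 2.0), 3))  # ~ 0.865 (exact value 1 - e^{-2} ~ 0.8647)
```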
By:  Martin Huber 
Abstract:  The estimation of the causal effect of an endogenous treatment based on an instrumental variable (IV) is often complicated by attrition, sample selection, or nonresponse in the outcome of interest. To tackle the latter problem, the latent ignorability (LI) assumption imposes that attrition/sample selection is independent of the outcome conditional on the treatment compliance type (i.e. how the treatment behaves as a function of the instrument), the instrument, and possibly further observed covariates. As a word of caution, this note formally discusses the strong behavioral implications of LI in rather standard IV models. We also provide an empirical illustration based on the Job Corps experimental study, in which the sensitivity of the estimated program effect to LI and alternative assumptions about outcome attrition is investigated. 
Date:  2020–06 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2006.01703&r=all 
By:  Marin Drlje 
Abstract:  A large literature estimates various school admission and graduation effects by employing variation in student admission scores around schools’ admission cutoffs, assuming (quasi-)random school assignment close to the cutoffs. In this paper, I present evidence suggesting that the samples corresponding to typical applications of the regression discontinuity design (RDD) fail to satisfy these assumptions. I distinguish ex-post randomization (as in admission lotteries applicable to those at the margin of admission) from ex-ante randomization, reflecting uncertainty about the market structure of applicants, which can be naturally quantified by resampling from the applicant population. Using data from the Croatian centralized college-admission system, I show that these ex-ante admission probabilities differ dramatically between treated and non-treated students within typical RDD bandwidths. Such unbalanced admission probability distributions suggest that bandwidths (and sample sizes) should be drastically reduced to avoid selection bias. I also show that a sizeable fraction of quasi-randomized assignments occur outside of the typical RDD bandwidths, suggesting that these bandwidths are also inefficient. As an alternative, I propose a new estimator, the Propensity Score Discontinuity Design (PSDD), based on all observations with random assignments, which compares outcomes of applicants matched on ex-ante admission probabilities, conditional on admission scores. 
Keywords:  RDD; PSDD; school admission effects; lottery 
JEL:  C01 C51 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:cer:papers:wp658&r=all 
By:  Marcin Chlebus (Faculty of Economic Sciences, University of Warsaw); Maciej Stefan Świtała (Faculty of Economic Sciences, University of Warsaw) 
Abstract:  The paper considers the broad idea of topic modelling and its applications. The aim of the research was to identify mutual tendencies in econometric and machine learning abstracts. Different topic models were compared in terms of their performance and interpretability; the former was measured with a newly introduced approach. Summaries collected from esteemed journals were analysed with the LSA, LDA and CTM algorithms. The obtained results make it possible to identify similar trends in both corpora. The probabilistic models (LDA and CTM) outperform the semantic alternative (LSA). It appears that econometrics and machine learning are fields that consider problems that are rather homogeneous at the conceptual level. However, they differ in terms of the tools used and their dominance in particular areas. 
Keywords:  abstracts, comparison, interpretability, tendencies, topics 
JEL:  A12 C18 C38 C52 C61 
Date:  2020 
URL:  http://d.repec.org/n?u=RePEc:war:wpaper:202016&r=all 
By:  Paul Hünermund; Beyers Louw 
Abstract:  Control variables are included in regression analyses to estimate the causal effect of a treatment variable of interest on an outcome. In this note we argue that control variables are unlikely to have a causal interpretation themselves, however. We therefore suggest refraining from discussing their marginal effects in the results sections of empirical research papers. 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2005.10314&r=all 
By:  Jochmans, K. 
Abstract:  Identification of peer effects is complicated by the fact that the individuals under study may self-select their peers. Random assignment to peer groups has proven useful to sidestep such a concern. In the absence of a formal randomization mechanism it needs to be argued that assignment is `as good as' random. This paper introduces a simple yet powerful test to do so. We provide theoretical results for this test and explain why it dominates existing alternatives. Asymptotic power calculations and an analysis of the assignment mechanism of players to playing partners in tournaments of the Professional Golfers' Association are used to illustrate these claims. Our approach can equally be used to test for the presence of peer effects. To illustrate this we test for the presence of peer effects in the classroom using kindergarten data collected within Project STAR. We find no evidence of peer effects once we control for classroom fixed effects and a set of student characteristics. 
Keywords:  asymptotic power, bias, peer effects, random assignment 
JEL:  C12 C21 
Date:  2020–04–06 
URL:  http://d.repec.org/n?u=RePEc:cam:camdae:2024&r=all 
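The paper's test, which it argues dominates existing alternatives, is not reproduced here. A generic permutation diagnostic conveys the basic logic: under "as good as" random assignment, the dispersion of group means of any predetermined characteristic should look like a draw from its permutation distribution (the statistic and group sizes below are illustrative assumptions):

```python
import numpy as np

def random_assignment_pvalue(x, groups, n_perm=2000, seed=0):
    """Permutation p-value for 'as good as random' assignment: compare the
    variance of group means of a predetermined characteristic x with its
    distribution under random reshuffling of x across individuals."""
    rng = np.random.default_rng(seed)

    def stat(xv):
        return np.var([xv[groups == g].mean() for g in np.unique(groups)])

    t_obs = stat(x)
    perm = np.array([stat(rng.permutation(x)) for _ in range(n_perm)])
    return float(np.mean(perm >= t_obs))

rng = np.random.default_rng(5)
groups = np.repeat(np.arange(20), 10)         # 20 groups of 10 individuals
x_random = rng.standard_normal(200)           # assignment unrelated to x
x_sorted = np.sort(rng.standard_normal(200))  # x clusters within groups: non-random
p_random = random_assignment_pvalue(x_random, groups)
p_sorted = random_assignment_pvalue(x_sorted, groups)
print(p_random, p_sorted)  # the sorted (non-random) assignment is rejected
```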
By:  Henrik Kleven 
Abstract:  This paper reviews and generalizes the sufficient statistics approach to policy evaluation. The idea of the approach is that the welfare effect of policy changes can be expressed in terms of estimable reduced-form elasticities, allowing for policy evaluation without estimating the structural primitives of fully specified models. The approach relies on three assumptions: that policy changes are small, that government policy is the only source of market imperfection, and that a set of high-level restrictions on the environment and on preferences can be used to reduce the number of elasticities to be estimated. We generalize the approach in all three dimensions. It is possible to develop transparent sufficient statistics formulas under very general conditions, but the estimation requirements increase greatly. Starting from such general formulas makes clear that feasible empirical implementations are in fact structural approaches. 
JEL:  D01 D04 D1 D6 H0 H2 H3 J08 J2 J38 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:27242&r=all 
By:  Patrick Chang; Etienne Pienaar; Tim Gebbie 
Abstract:  On different time intervals it can be useful to empirically determine whether the measurement process being observed is fundamental and representative of actual discrete events, or whether these measurements can still be faithfully represented as random samples of some underlying continuous process. As sampling timescales become smaller for a continuous-time process, one can expect to continue to measure correlations, even as the sampling intervals become very small. With a discrete event process, however, one can expect the correlation measurements to break down quickly. This is a theoretically well-explored problem. Here we concern ourselves with a simulation-based empirical investigation that uses the Epps effect as a discriminator between situations where the underlying system is discrete, e.g. a D-type Hawkes process, and situations where it can still be appropriate to represent the problem with a continuous-time random process that is being asynchronously sampled, e.g. an asynchronously sampled set of correlated Brownian motions. We derive a method to compensate for the Epps effect arising from asynchrony and then use this to discriminate. We compare the correction on a simple continuous Brownian price path model and a Hawkes price model when the sampling is either a simple homogeneous Poisson process or a Hawkes sampling process. This suggests that Epps curves can sometimes provide insight into whether discrete data are in fact observables realised from fundamental codependent discrete processes, or merely samples of some correlated continuous-time process. 
Date:  2020–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2005.10568&r=all 
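The paper's compensation method is not reproduced here, but the Epps effect itself is easy to generate: measured correlations between asynchronously observed, truly correlated Brownian motions shrink toward zero as the sampling interval shrinks. A minimal sketch (Bernoulli-thinned observation times stand in for Poisson sampling):

```python
import numpy as np

rng = np.random.default_rng(4)
n, rho, lam = 100_000, 0.8, 0.05   # fine grid, true correlation, observation rate

# Correlated Brownian increments on the fine grid
z = rng.standard_normal((2, n))
dx = z[0]
dy = rho * z[0] + np.sqrt(1 - rho**2) * z[1]
X, Y = np.cumsum(dx), np.cumsum(dy)

# Asynchronous observation times: keep each grid point independently with prob lam
obs_x = rng.random(n) < lam
obs_y = rng.random(n) < lam

def prev_tick(path, observed, delta):
    """Previous-tick-interpolate an asynchronously observed path onto a
    regular grid with spacing delta, then return its increments."""
    idx = np.where(observed)[0]
    grid = np.arange(0, n, delta)
    pos = np.searchsorted(idx, grid, side="right") - 1   # last obs at or before t
    vals = np.where(pos >= 0, path[idx[np.maximum(pos, 0)]], 0.0)
    return np.diff(vals)

corrs = {}
for delta in (5, 50, 500):
    rx = prev_tick(X, obs_x, delta)
    ry = prev_tick(Y, obs_y, delta)
    corrs[delta] = np.corrcoef(rx, ry)[0, 1]
print(corrs)  # correlation is biased toward 0 at small delta: the Epps effect
```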