
on Econometrics 
By:  Ruiqi Liu; Ben Boukai; Zuofeng Shang 
Abstract:  A new statistical procedure, based on a modified spline basis, is proposed to identify the linear components in the panel data model with fixed effects. Under some mild assumptions, the proposed procedure is shown to consistently estimate the underlying regression function, correctly select the linear components, and effectively conduct statistical inference. When compared to existing methods for detection of linearity in the panel model, our approach is demonstrated to be theoretically justified as well as practically convenient. We provide a computational algorithm that implements the proposed procedure along with a path-based solution method for linearity detection, which avoids the burden of selecting the tuning parameter for the penalty term. Monte Carlo simulations are conducted to examine the finite sample performance of our proposed procedure with detailed findings that confirm our theoretical results in the paper. Applications to Aggregate Production and Environmental Kuznets Curve data also illustrate the necessity for detecting linearity in the partially linear panel model. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.08830&r=all 
By:  Michael P. Leung 
Abstract:  This paper studies causal inference in randomized experiments under network interference. Most existing models of interference posit that treatments assigned to alters only affect the ego's response through a low-dimensional exposure mapping, which only depends on units within some known network radius around the ego. We propose a substantially weaker "approximate neighborhood interference" (ANI) assumption, which allows treatments assigned to alters far from the ego to have a small, but potentially nonzero, impact on the ego's response. Unlike the exposure mapping model, we can show that ANI is satisfied in well-known models of social interactions. Despite its generality, inference in a single-network setting is still possible under ANI, as we prove that standard inverse-probability weighting estimators can consistently estimate treatment and spillover effects and are asymptotically normal. For practical inference, we propose a new conservative variance estimator based on a network bootstrap and suggest a data-dependent bandwidth using the network diameter. Finally, we illustrate our results in a simulation study and empirical application. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.07085&r=all 
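The inverse-probability-weighting idea in Leung's abstract can be sketched for the simplest possible design: a Bernoulli(p) randomization with a plain "own-treatment" exposure. The function name and this reduced exposure mapping are illustrative assumptions, not the paper's estimator, which handles general exposure contrasts under ANI.

```python
import numpy as np

def ipw_exposure_effect(y, d, p):
    """Horvitz-Thompson / IPW contrast E[Y(1)] - E[Y(0)] under a known
    Bernoulli(p) assignment: each outcome is weighted by the inverse
    probability of the exposure actually realized."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    return np.mean(d * y / p) - np.mean((1.0 - d) * y / (1.0 - p))
```

The paper's contribution is showing that such estimators remain consistent and asymptotically normal on a single network even when distant treatments exert small nonzero effects.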
By:  Kiviet, Jan 
Abstract:  A fully-fledged alternative to Two-Stage Least-Squares (TSLS) inference is developed for general linear models with endogenous regressors. This alternative approach does not require the adoption of external instrumental variables. It generalizes earlier results which basically assumed all variables in the model to be normally distributed and their observational units to be stochastically independent. Now the chosen underlying framework corresponds completely to that of most empirical cross-section or time-series studies using TSLS. This enables empirically relevant replication studies, in part because the new technique allows testing the exclusion restrictions that were previously untestable when applying TSLS. For three illustrative case studies, a new perspective on their empirical findings emerges. The new technique is computationally undemanding: it involves scanning least-squares-based results over all compatible values of the nuisance parameters established by the correlations between regressors and disturbances. 
Keywords:  endogeneity robust inference, instrument validity tests, replication studies, sensitivity analysis, two-stage least-squares. 
JEL:  C12 C13 C21 C22 C26 
Date:  2019–11–06 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:96839&r=all 
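Kiviet's scanning idea can be sketched for a single endogenous regressor. For each postulated correlation rho between the regressor and the disturbance, the OLS slope is bias-corrected using plim b_OLS = beta + rho * sigma_u / sigma_x together with the fact that the OLS residual variance estimates sigma_u^2 * (1 - rho^2). This one-regressor formula is a simplification for illustration, not the paper's general multi-regressor procedure.

```python
import numpy as np

def kls_scan(y, x, rhos):
    """Bias-corrected slope b(rho) = b_OLS - rho * s_e / (s_x * sqrt(1 - rho^2)),
    scanned over postulated regressor-disturbance correlations rho."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    b_ols = (xc @ yc) / (xc @ xc)
    e = yc - b_ols * xc                       # OLS residuals
    s_e, s_x = e.std(ddof=1), x.std(ddof=1)
    rhos = np.asarray(rhos, float)
    return b_ols - rhos * s_e / (s_x * np.sqrt(1.0 - rhos ** 2))
```

Plotting the returned curve over a grid of rho values is the "scanning" step: inference is then reported over all correlations deemed compatible with the application.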
By:  Abhimanyu Gupta; Myung Hwan Seo 
Abstract:  We develop a class of tests for the structural stability of infinite-order models such as the infinite-order autoregressive model and the nonparametric sieve regression. When the number $p$ of restrictions diverges, the traditional tests based on the suprema of Wald, LM and LR statistics or their exponentially weighted averages diverge as well. We introduce a suitable transformation of these tests and obtain proper weak limits under the condition that $p$ grows to infinity as the sample size $n$ goes to infinity. In general, this limit distribution is different from the sequential limit, which can be obtained by increasing the order of the standardized tied-down Bessel process in Andrews (1993). In particular, our joint asymptotic analysis uncovers a nonlinear higher-order serial correlation, for which we provide a consistent estimator. Our Monte Carlo simulation illustrates the importance of robustifying the structural break test against this nonlinear serial correlation even when $p$ is moderate. Furthermore, we establish a weighted power optimality property of our tests under some regularity conditions. We examine finite-sample performance in a Monte Carlo study and illustrate the test with a number of empirical examples. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.08637&r=all 
By:  Das, Tirthatanmoy (Indian Institute of Management); Polachek, Solomon (Binghamton University, New York) 
Abstract:  This paper proposes a new strategy to identify causal effects. Instead of finding a conventional instrumental variable correlated with the treatment but not with the confounding effects, we propose an approach which employs an instrument correlated with the confounders, but which itself is not causally related to the direct effect of the treatment. Utilizing such an instrument enables one to estimate the confounding endogeneity bias. This bias can then be utilized in subsequent regressions first to obtain a "binding" causal effect for observations unaffected by institutional barriers that eliminate a treatment's effectiveness, and second to obtain a population-wide treatment effect for all observations independent of institutional restrictions. Both are computed whether the treatment effects are homogeneous or heterogeneous. To illustrate the technique, we apply the approach to estimate sheepskin effects. We find the bias to be approximately equal to the OLS coefficient, meaning that the sheepskin effect is near zero. This result is consistent with Flores-Lagunes and Light (2010) and Clark and Martorell (2014). Our technique expands the econometrician's toolkit by introducing an alternative method that can be used to estimate causality. Further, one potentially can use both the conventional instrumental variable approach in tandem with our alternative approach to test the equality of the two estimators for a conventionally exactly identified causal model, should one claim to already have a valid conventional instrument. 
Keywords:  causality, OLS biases, sheepskin effects 
JEL:  C18 C36 I26 J24 J33 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:iza:izadps:dp12766&r=all 
By:  Komarova, Tatiana; Sanches, Fábio Adriano; Silva Junior, Daniel; Srisuma, Sorawoot 
Abstract:  Most empirical and theoretical econometric studies of dynamic discrete choice models assume the discount factor to be known. We show that knowledge of the discount factor is not necessary to identify parts, or all, of the payoff function. We show the discount factor can be generically identified jointly with the payoff parameters. It is known that the payoff function cannot be nonparametrically identified without any a priori restrictions. Our identification of the discount factor is robust to any normalization choice on the payoff parameters. In IO applications normalizations are usually made on switching costs, such as entry costs and scrap values. We also show that switching costs can be nonparametrically identified, in closed form, independently of the discount factor and other parts of the payoff function. Our identification strategies are constructive. They lead to easy-to-compute estimands that are global solutions. We illustrate with a Monte Carlo study and the dataset from Ryan (2012). 
Keywords:  discount factor; dynamic discrete choice problem; identification; estimation; switching costs 
JEL:  C14 C25 C51 
Date:  2018–11–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:86858&r=all 
By:  Fotouhi, Babak; Rytina, Steven 
Abstract:  The structure of social networks is usually inferred from limited sets of observations via suitable network sampling designs. In offline social network sampling, for practical considerations, researchers sometimes build in a cap on the number of social ties any respondent may claim. It is commonly known in the literature that using a cap on the degrees begets methodologically undesirable features because it discards information about the network connections. In this paper, we consider a mathematical model of this sampling procedure and seek analytical solutions to recover some of the lost information about the underlying network. We obtain closed-form expressions for several network statistics, including the first and second moments of the degree distribution, network density, number of triangles, and clustering. We corroborate the accuracy of these estimators via simulated and empirical network data. Our contribution highlights notable room for improvement in the analysis of some existing social network data sets. 
Date:  2018–11–29 
URL:  http://d.repec.org/n?u=RePEc:osf:socarx:5kez8&r=all 
By:  Anna Conte (Sapienza University of Rome); Peter G Moffatt (University of East Anglia); Mary Riddel (University of Nevada) 
Abstract:  The use of Multiple Price Lists to elicit individuals' risk preferences is widespread. To model data collected through this method, we introduce the Multivariate Random Preference (MRP) estimator, specifically designed for the "switching" variant of such lists. This is a new estimation approach that enables us to exploit all available information derived from subjects' switch points in the lists. Monte Carlo simulations show that our estimator is consistent and has good small-sample properties. The estimator is derived for a two-parameter model in a risky context. 
Keywords:  Risk Preference; Monte Carlo Simulations; Importance Sampling 
JEL:  C51 C52 C91 D81 
Date:  2019–11–21 
URL:  http://d.repec.org/n?u=RePEc:uea:ueaeco:2019_04&r=all 
By:  Samuele Centorrino; Aman Ullah; Jing Xue 
Abstract:  We study a linear random coefficient model where slope parameters may be correlated with some continuous covariates. Such a model specification may occur in empirical research, for instance, when quantifying the effect of a continuous treatment observed at two time periods. We show that identification and estimation can be carried out without instruments. We propose a semiparametric estimator of average partial effects and of average treatment effects on the treated. We showcase the small-sample properties of our estimator in an extensive simulation study. Among other things, we reveal that it compares favorably with a control function estimator. We conclude with an application to the effect of malaria eradication on economic development in Colombia. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.06857&r=all 
By:  Duxbury, Scott W 
Abstract:  Statistical network methods have grown increasingly popular in the social sciences. However, like other nonlinear probability models, statistical network model parameters can only be identified to a scale and cannot be compared between groups or models fit to the same network. This study addresses these issues by developing methods for mediation and moderation analyses in exponential random graph models (ERGM). It first discusses ERGM as an autologistic regression to illustrate that ERGM estimates can be affected by unobserved heterogeneity. Second, it develops methods for mediation analysis for both discrete and continuous mediators. Third, it provides recommendations and methods for interpreting interactions in ERGM. Finally, it considers scenarios where interactions are implicated in mediation analysis. The methodological discussion is accompanied by empirical applications, and extensions to other classes of statistical network models are discussed. 
Date:  2019–07–17 
URL:  http://d.repec.org/n?u=RePEc:osf:socarx:9bs4u&r=all 
By:  Bruno Ferman; Cristine Pinto 
Abstract:  We analyze the properties of the Synthetic Control (SC) and related estimators when the pre-treatment fit is imperfect. In this framework, we show that these estimators are generally biased if treatment assignment is correlated with unobserved confounders, even when the number of pre-treatment periods goes to infinity. Still, we also show that a modified version of the SC method can offer substantial improvements in bias and variance relative to the difference-in-differences estimator. We also consider the properties of these estimators in settings with nonstationary common factors. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.08521&r=all 
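The synthetic control problem the abstract refers to can be sketched in its standard form: choose nonnegative donor weights summing to one that best reproduce the treated unit's pre-treatment path. Projected gradient descent with a textbook simplex projection is one simple solver; this is an illustrative implementation of the baseline SC weighting step, not the authors' modified estimator.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}
    via the sort-based algorithm of Duchi et al. (2008)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def sc_weights(Y0, y1, iters=1000):
    """Minimize ||y1 - Y0 @ w||^2 over the simplex by projected gradient
    descent. Y0: T x J donor pre-treatment outcomes; y1: treated unit's path."""
    J = Y0.shape[1]
    lr = 1.0 / (2.0 * np.linalg.norm(Y0, 2) ** 2)   # step 1/L for this quadratic
    w = np.full(J, 1.0 / J)
    for _ in range(iters):
        grad = 2.0 * Y0.T @ (Y0 @ w - y1)
        w = project_simplex(w - lr * grad)
    return w
```

When the pre-treatment fit is imperfect, no weight vector sets the objective to zero, which is exactly the regime in which the paper shows the standard estimator can be biased.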
By:  Wodtke, Geoffrey; Zhou, Xiang 
Abstract:  Analyses of causal mediation are often complicated by treatment-induced confounders of the mediator-outcome relationship. In the presence of such confounders, the natural direct and indirect effects of treatment on the outcome, into which the total effect can be additively decomposed, are not identified. An alternative but similar set of effects, known as randomized intervention analogues to the natural direct effect (RNDE) and the natural indirect effect (RNIE), can still be identified in this situation, but existing estimators for these effects require a complicated weighting procedure that is difficult to use in practice. In this paper, we introduce a new method for estimating the RNDE and RNIE that involves only a minor adaptation of the comparatively simple regression methods used to perform effect decomposition in the absence of treatment-induced confounding. It involves fitting linear models for (a) the conditional mean of the mediator given treatment and a set of baseline confounders and (b) the conditional mean of the outcome given the treatment, mediator, baseline confounders, and the treatment-induced confounders after first residualizing them with respect to the observed past. The RNDE and RNIE are simple functions of the parameters in these models when they are correctly specified and when there are no unobserved variables that confound the treatment-outcome, treatment-mediator, or mediator-outcome relationships. We illustrate the method by decomposing the effect of education on depression symptoms at midlife into components operating through income versus alternative factors. R and Stata packages are available for implementing the proposed method. 
Date:  2019–05–15 
URL:  http://d.repec.org/n?u=RePEc:osf:socarx:86d2k&r=all 
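In the linear case the abstract describes, the decomposition reduces to sums and products of regression coefficients. The sketch below, with hypothetical variable names (treatment d, treatment-induced confounder l, mediator m, outcome y) and no baseline confounders, residualizes l with respect to d and reads the RNDE and RNIE off the two fitted models. It is a schematic of the regression recipe, not the authors' R/Stata packages.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients with an intercept prepended as the first element."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def rnde_rnie(d, l, m, y):
    """(a) mediator model m ~ d; (b) outcome model y ~ d + m + l_resid,
    where the treatment-induced confounder l is first residualized on d."""
    a, b = ols(d, l)
    l_resid = l - (a + b * d)
    beta = ols(d, m)                                   # [const, d]
    theta = ols(np.column_stack([d, m, l_resid]), y)   # [const, d, m, l_resid]
    rnde = theta[1]                # direct effect, absorbing paths through l
    rnie = beta[1] * theta[2]      # effect operating through the mediator
    return rnde, rnie
```

The two components add up to the total effect of d on y, which is the decomposition property the randomized-intervention analogues preserve.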
By:  Kosaku Takanashi; Kenichiro McAlinn 
Abstract:  This paper studies the theoretical predictive properties of classes of forecast combination methods. The study is motivated by the recently developed Bayesian framework for synthesizing predictive densities: Bayesian predictive synthesis. A novel strategy based on continuous-time stochastic processes is proposed and developed, where the combined predictive error processes are expressed as stochastic differential equations, evaluated using Ito's lemma. We show that a subclass of synthesis functions under Bayesian predictive synthesis, which we categorize as nonlinear synthesis, entails an extra term that "corrects" the bias from misspecification and dependence in the predictive error process, effectively improving forecasts. We show that, under mild conditions, this subclass improves the expected squared forecast error over any linear combination, average, or ensemble of forecasts. We discuss the conditions under which this subclass outperforms others, and the implications for developing forecast combination methods. A finite-sample simulation study is presented to illustrate our results. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.08662&r=all 
By:  Michael P. Leung 
Abstract:  This paper studies inference in models of discrete choice with social interactions when the data consists of a single large network. We provide theoretical justification for the use of spatial and network HAC variance estimators in applied work, the latter constructed by using network path distance in place of spatial distance. Toward this end, we prove new central limit theorems for network moments in a large class of social interactions models. The results are applicable to discrete games on networks and dynamic models where social interactions enter through lagged dependent variables. We illustrate our results in an empirical application and simulation study. 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:1911.07106&r=all 
By:  Jakob Kapeller (Institute for SocioEconomics, University of DuisburgEssen, Germany; Institute for Comprehensive Analysis of the Economy, Johannes Kepler University Linz, Austria); Rafael Wildauer (Department of International Business and Economics, University of Greenwich) 
Abstract:  This paper develops a new approach for dealing with the underreporting of wealth in household survey data (differential nonresponse). Researchers relying on household wealth survey data currently take one of three approaches. First, simply ignore the problem. Second, fit a Pareto distribution to the tail of the survey data and use that distribution. Third, add rich list data to the sample and fit a Pareto distribution to the combined data (Vermeulen, 2018). We propose a fourth approach, the rank correction approach, which improves on the first two and does not require information drawn from publicly available rich lists. We show by means of Monte Carlo simulations that this rank correction approach substantially reduces nonresponse bias in the Pareto tail estimates. Applying the procedure to wealth survey data (HFCS, SCF, WAS) yields substantial increases in aggregate wealth and top wealth shares, which are closely in line with wealth summary statistics from other sources such as the World Inequality Database. As such, the rank correction approach can serve as a complement and robustness check to Vermeulen's (2018) rich list approach and as an attractive alternative to the second approach in situations where rich list data are not available or of poor quality. 
Keywords:  Wealth distribution, differential nonresponse, Pareto distribution 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:ico:wpaper:101&r=all 
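The Pareto-tail step shared by the second, third, and fourth approaches can be sketched with the standard log-rank regression of Gabaix and Ibragimov (2011): regress log(rank − 1/2) on log(wealth) above a threshold and read the tail exponent off the slope. The rank correction approach then adjusts the ranks for differential nonresponse; that adjustment is the paper's contribution and is not reproduced here.

```python
import numpy as np

def pareto_tail_exponent(w, w_min):
    """Estimate the Pareto tail exponent alpha by regressing log(rank - 1/2)
    on log(wealth) for observations at or above w_min (rank 1 = richest)."""
    w = np.asarray(w, float)
    tail = np.sort(w[w >= w_min])[::-1]          # descending order
    ranks = np.arange(1, tail.size + 1)
    slope = np.polyfit(np.log(tail), np.log(ranks - 0.5), 1)[0]
    return -slope
```

Because wealthy households are underrepresented in surveys, the observed ranks understate true ranks in the population, which is precisely why an uncorrected fit of this kind is biased.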
By:  Susan Athey; Raj Chetty; Guido W. Imbens; Hyunseung Kang 
Abstract:  A common challenge in estimating the long-term impacts of treatments (e.g., job training programs) is that the outcomes of interest (e.g., lifetime earnings) are observed with a long delay. We address this problem by combining several short-term outcomes (e.g., short-run earnings) into a "surrogate index," the predicted value of the long-term outcome given the short-term outcomes. We show that the average treatment effect on the surrogate index equals the treatment effect on the long-term outcome under the assumption that the long-term outcome is independent of the treatment conditional on the surrogate index. We then characterize the bias that arises from violations of this assumption, deriving feasible bounds on the degree of bias and providing simple methods to validate the key assumption using additional outcomes. Finally, we develop efficient estimators for the surrogate index and show that even in settings where the long-term outcome is observed, using a surrogate index can increase precision. We apply our method to analyze the long-term impacts of a multi-site job training experiment in California. Using short-term employment rates as surrogates, one could have estimated the program's impacts on mean employment rates over a 9-year horizon within 1.5 years, with a 35% reduction in standard errors. Our empirical results suggest that the long-term impacts of programs on labor market outcomes can be predicted accurately by combining their short-term treatment effects into a surrogate index. 
JEL:  C01 J0 
Date:  2019–11 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:26463&r=all 
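The surrogate index construction is straightforward to sketch: fit a prediction of the long-term outcome from the short-term outcomes in an auxiliary sample where both are observed, then compare the mean predicted index across experimental arms. The linear predictor below is an illustrative simplification; the paper's framework allows general prediction methods and develops efficient estimators.

```python
import numpy as np

def surrogate_index_effect(S_obs, y_obs, S_exp, d_exp):
    """Fit y ~ surrogates by OLS in the observational sample, then estimate
    the treatment effect as the arm difference in the mean predicted index."""
    X = np.column_stack([np.ones(len(y_obs)), S_obs])
    beta = np.linalg.lstsq(X, y_obs, rcond=None)[0]
    idx = np.column_stack([np.ones(len(d_exp)), S_exp]) @ beta
    return idx[d_exp == 1].mean() - idx[d_exp == 0].mean()
```

The key identifying assumption from the abstract is that the long-term outcome is independent of treatment conditional on the surrogates, so that the predicted index carries the full treatment effect.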
By:  Emilio Zanetti Chini 
Abstract:  We provide a new frequentist methodology that detects forecasting bias due to strategic interaction. It is based on a new environment, named the "Scoring Structure", in which a Forecast User interacts with a Forecast Producer and Reality. A formal test for the null hypothesis of linearity in the Scoring Structure is introduced. Linearity implies that forecasts are strategically coherent with evaluations and vice versa. The new test has good small-sample properties and behaves consistently with theoretical requirements. We illustrate the use of the Scoring Structure and the coherence test via two case studies: the assessment of the probability of recessions for the U.S. economy and the evaluation of Norges Bank's Fan Charts of the Output Gap. Both support the endemic nature of strategic judgment in macroeconomics. Finally, we discuss the economic interpretation of the results obtained by our approach. 
Keywords:  Business Cycle, Predictive Density, Forecast Evaluation, Coherence Testing, Scoring Rules and Structures 
JEL:  C12 C22 C44 C53 
Date:  2019–10 
URL:  http://d.repec.org/n?u=RePEc:sap:wpaper:wp190&r=all 
By:  Manuel Ammann; Alexander Feser; 
Abstract:  This study provides an in-depth analysis of how to estimate risk-neutral moments robustly. A simulation and an empirical study show that estimating risk-neutral moments presents a trade-off between (1) the bias of estimates caused by a limited strike price domain and (2) the variance of estimates induced by microstructural noise. The best trade-off is offered by option-implied quantile moments estimated from a volatility surface interpolated with a local-linear kernel regression and extrapolated linearly. A similarly good trade-off is achieved by estimating regular central option-implied moments from a volatility surface interpolated with a cubic smoothing spline and flat extrapolation. 
Keywords:  risk-neutral moments, risk-neutral distribution 
JEL:  C14 G10 G13 G17 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:usg:sfwpfi:2019:02&r=all 
By:  Sariev, Eduard; Germano, Guido 
Abstract:  Artificial neural networks (ANN) have been extensively used for classification problems in many areas such as gene, text and image recognition. Although ANN are also popular for estimating the probability of default in credit risk, they have drawbacks; a major one is their tendency to overfit the data. Here we propose an improved Bayesian regularization approach to train ANN and compare it to the classical regularization that relies on the backpropagation algorithm for training feedforward networks. We investigate different network architectures and test the classification accuracy on three data sets. Profitability, leverage and liquidity emerge as important financial default driver categories. 
Keywords:  Artificial neural networks; Bayesian regularization; Credit risk; Probability of default 
JEL:  C11 C13 
Date:  2019–10–31 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:101029&r=all 
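Bayesian regularization, in its simplest form, places a Gaussian prior on the network weights, which for a one-layer "network" (logistic regression) amounts to an L2 penalty on the negative log-likelihood. The MAP sketch below illustrates only that basic idea; it is not the authors' improved training algorithm or their multi-layer architectures.

```python
import numpy as np

def fit_logistic_map(X, y, lam=0.1, lr=0.1, steps=500):
    """MAP estimate of logistic weights under a Gaussian prior:
    minimize average log-loss plus (lam / 2n) * ||w||^2 by gradient descent."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(steps):
        prob = 1.0 / (1.0 + np.exp(-X @ w))        # predicted default probability
        grad = X.T @ (prob - y) / n + lam * w / n  # likelihood + prior gradients
        w -= lr * grad
    return w
```

The prior term shrinks the weights toward zero, which is the mechanism that counteracts the overfitting tendency the abstract highlights.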
By:  Hirschauer, Norbert; Gruener, Sven; Mußhoff, Oliver; Becker, Claudia; Jantsch, Antje 
Abstract:  Besides the inferential errors that abound in the interpretation of p-values, the probabilistic preconditions (i.e. random sampling or equivalent) for using them at all are often not met by observational studies in the social sciences. This paper systematizes different sampling designs and discusses the restrictive requirements of data collection that are the sine qua non for using p-values. 
Date:  2019–08–15 
URL:  http://d.repec.org/n?u=RePEc:osf:socarx:yazr8&r=all 