
on Econometrics 
By:  Louise Laage 
Abstract:  This paper studies a class of linear panel models with random coefficients. We do not restrict the joint distribution of the time-invariant unobserved heterogeneity and the covariates. We investigate identification of the average partial effect (APE) when fixed-effect techniques cannot be used to control for the correlation between the regressors and the time-varying disturbances. Relying on control variables, we develop a constructive two-step identification argument. The first step identifies nonparametrically the conditional expectation of the disturbances given the regressors and the control variables, and the second step uses "between-group" variations, correcting for endogeneity, to identify the APE. We propose a natural semiparametric estimator of the APE, show its $\sqrt{n}$ asymptotic normality and compute its asymptotic variance. The estimator is computationally easy to implement, and Monte Carlo simulations show favorable finite sample properties. Control variables arise in various economic and econometric models, and we provide variations of our argument to obtain identification in some applications. As an empirical illustration, we estimate the average elasticity of intertemporal substitution in a labor supply model with random coefficients. 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.09367&r=all 
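The control-variable idea behind the two-step argument can be illustrated in a stylized cross-sectional simulation (a sketch only: variable names are ours, and the paper's estimator handles panels with random coefficients, not this simple linear model). Step 1 recovers the control variable as a first-stage residual; step 2 corrects the outcome regression for endogeneity by conditioning on it.

```python
import random

random.seed(1)

def ols_slope(x, y):
    """Univariate OLS slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

def residuals(x, y):
    """Residuals from the univariate OLS fit of y on x."""
    b = ols_slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

n = 5000
z = [random.gauss(0, 1) for _ in range(n)]          # instrument
v = [random.gauss(0, 1) for _ in range(n)]          # first-stage error
u = [0.8 * vi + random.gauss(0, 0.5) for vi in v]   # disturbance, correlated with v
x = [zi + vi for zi, vi in zip(z, v)]               # endogenous regressor
beta = 2.0
y = [beta * xi + ui for xi, ui in zip(x, u)]

beta_naive = ols_slope(x, y)       # biased upward since cov(x, u) > 0

# Step 1: control variable = first-stage residual v_hat
v_hat = residuals(z, x)
# Step 2: regress y on x controlling for v_hat (Frisch-Waugh partialling out)
beta_cf = ols_slope(residuals(v_hat, x), residuals(v_hat, y))
```

Conditioning on the first-stage residual absorbs the endogenous part of the disturbance, so `beta_cf` recovers the true slope while the naive regression does not.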
By:  Songnian Chen (HKUST); Shakeeb Khan (Boston College); Xun Tang (Rice University) 
Abstract:  We study the identification and estimation of treatment effect parameters in weakly separable models. In their seminal work, Vytlacil and Yildiz (2007) showed how to identify and estimate the average treatment effect of a dummy endogenous variable when the outcome is weakly separable in a single index. Their identification result builds on a monotonicity condition with respect to this single index. In comparison, we consider similar weakly separable models with multiple indices, and relax the monotonicity condition for identification. Unlike Vytlacil and Yildiz (2007), we exploit the full information in the distribution of the outcome variable, instead of just its mean. Indeed, when the outcome distribution function is more informative than the mean, our method is applicable to more general settings than theirs; in particular we do not rely on their monotonicity assumption and at the same time we also allow for multiple indices. To illustrate the advantage of our approach, we provide examples of models where our approach can identify parameters of interest whereas existing methods would fail. These examples include models with multiple unobserved disturbance terms such as the Roy model and multinomial choice models with dummy endogenous variables, as well as potential outcome models with endogenous random coefficients. Our method is easy to implement and can be applied to a wide class of models. We establish standard asymptotic properties such as consistency and asymptotic normality. 
Keywords:  Weak Separability, Treatment Effects, Monotonicity, Endogeneity 
JEL:  C14 C31 C35 
Date:  2020–04–01 
URL:  http://d.repec.org/n?u=RePEc:boc:bocoec:996&r=all 
By:  Dmitry Arkhangelsky (CEMFI, Centro de Estudios Monetarios y Financieros) 
Abstract:  I construct a nonlinear model for causal inference in empirical settings where researchers observe individual-level data for a few large clusters over at least two time periods. It allows for identification (sometimes partial) of the counterfactual distribution, in particular identifying average treatment effects and quantile treatment effects. The model is flexible enough to handle multiple outcome variables, multidimensional heterogeneity, and multiple clusters. It applies to settings where the new policy is introduced in some of the clusters and the researcher additionally has information about the pre-treatment periods. I argue that in such environments we need to deal with two different sources of bias: selection and technological. In my model, I employ standard methods of causal inference to address the selection problem and use pre-treatment information to eliminate the technological bias. In the case of one-dimensional heterogeneity, identification is achieved under natural monotonicity assumptions. The situation is considerably more complicated in the case of multidimensional heterogeneity, where I propose three different approaches to identification using results from transportation theory. 
Keywords:  Treatment effects, difference-in-differences, multidimensional heterogeneity, optimal transportation. 
JEL:  C14 
Date:  2019–03 
URL:  http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2019_1903&r=all 
By:  Andrii Babii 
Abstract:  This paper introduces a high-dimensional linear IV regression for data sampled at mixed frequencies. We show that the high-dimensional slope parameter of a high-frequency covariate can be identified and accurately estimated by leveraging a low-frequency instrumental variable. The distinguishing feature of the model is that it can handle high-dimensional datasets without imposing approximate sparsity restrictions. We propose a Tikhonov-regularized estimator and derive the convergence rate of its mean integrated squared error for time series data. The estimator has a closed-form expression that is easy to compute and demonstrates excellent performance in our Monte Carlo experiments. We estimate the real-time price elasticity of supply on the Australian electricity spot market. Our estimates suggest that the supply is relatively inelastic and that its elasticity is heterogeneous throughout the day. 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.13478&r=all 
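The closed-form character of a Tikhonov-regularized estimator, and its stability on an ill-posed problem, can be sketched on a tiny 2x2 system (a generic illustration of the technique, not the paper's mixed-frequency estimator): the unregularized solution swings wildly under a tiny data perturbation, while the regularized solution barely moves.

```python
def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def tikhonov(A, b, alpha):
    """Closed-form Tikhonov solution (A'A + alpha I)^{-1} A'b."""
    AtA = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    Atb = [sum(A[k][i] * b[k] for k in range(2)) for i in range(2)]
    AtA[0][0] += alpha
    AtA[1][1] += alpha
    return solve2(AtA, Atb)

A = [[1.0, 1.0], [1.0, 1.0001]]    # nearly singular design
b_clean = [2.0, 2.0001]            # exact solution [1, 1]
b_noisy = [2.0, 2.0002]            # tiny perturbation of the data

x_plain = solve2(A, b_noisy)       # unregularized: flips to roughly [0, 2]
x_tik_clean = tikhonov(A, b_clean, 0.01)
x_tik_noisy = tikhonov(A, b_noisy, 0.01)
```

The regularization parameter `alpha` trades a small bias for a large reduction in variance, which is the mechanism that replaces the sparsity restrictions mentioned in the abstract.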
By:  Takamitsu Kurita (Faculty of Economics, Fukuoka University); B. Nielsen (Nuffield College, University of Oxford) 
Abstract:  This paper proposes a class of partial cointegrated models allowing for structural breaks in their deterministic terms. Details of the proposed models and their moving-average representations are examined. It is then shown that, under the assumption of martingale difference innovations, the limit distributions of partial quasi-likelihood ratio tests for cointegrating rank have a close connection to those for standard full models. This connection facilitates a response surface analysis which is required to extract critical information about moments from large-scale simulation studies. An empirical illustration of the proposed methodology is also provided. This paper renders partial cointegrated models more flexible and reliable devices for the study of nonstationary time series data with structural breaks. 
Keywords:  Partial cointegrated vector autoregressive models, Structural breaks, Deterministic terms, Weak exogeneity, Cointegrating rank, Response surface. 
JEL:  C12 C32 C50 
Date:  2018–10–22 
URL:  http://d.repec.org/n?u=RePEc:nuf:econwp:1803&r=all 
By:  Pedro H. C. Sant'Anna; Xiaojun Song 
Abstract:  This paper proposes a new class of nonparametric tests for the correct specification of generalized propensity score models. The test procedure is based on two different projection arguments, which lead to test statistics with several appealing properties. They accommodate high-dimensional covariates; are asymptotically invariant to the estimation method used for the nuisance parameters and do not require those estimators to be root-n asymptotically linear; are fully data-driven and require no tuning parameters; and can be written in closed form, facilitating an easy-to-use multiplier bootstrap procedure. We show that our proposed tests are able to detect a broad class of local alternatives converging to the null at the parametric rate. Monte Carlo simulation studies indicate that our double-projected tests have much higher power than other tests available in the literature, highlighting their practical appeal. 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.13803&r=all 
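The multiplier bootstrap mentioned in the abstract can be sketched on a generic sup-type statistic (an illustration of the bootstrap mechanics only, using a simple CUSUM-of-residuals statistic rather than the paper's double-projected statistic): the observed statistic is compared against replicates in which the residuals are perturbed by i.i.d. Rademacher weights.

```python
import math
import random

random.seed(2)

def sup_stat(e):
    """sup_k |n^{-1/2} sum_{i<=k} e_i| -- a Kolmogorov-type statistic."""
    n, s, m = len(e), 0.0, 0.0
    for ei in e:
        s += ei
        m = max(m, abs(s))
    return m / math.sqrt(n)

n = 500
e = [random.gauss(0, 1) for _ in range(n)]   # residuals under the null
t_obs = sup_stat(e)

# Multiplier bootstrap: perturb residuals with i.i.d. Rademacher weights
B = 999
t_boot = []
for _ in range(B):
    xi = [1.0 if random.random() < 0.5 else -1.0 for _ in range(n)]
    t_boot.append(sup_stat([w * ei for w, ei in zip(xi, e)]))

t_boot.sort()
cv_95 = t_boot[int(0.95 * (B + 1)) - 1]      # bootstrap 5% critical value
p_value = sum(tb >= t_obs for tb in t_boot) / B
```

Because only the multiplier weights are redrawn, the bootstrap avoids re-estimating any nuisance parameters, which is what makes the procedure easy to use.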
By:  Gluschenko, Konstantin 
Abstract:  A central issue in studies of economic development is whether economies (countries, regions of a country, etc.) converge to one another in terms of per capita income. In this paper, the convergence process is modeled by nonlinear, asymptotically subsiding trends of the income gap in a pair of economies. A few specific forms of such trends are proposed: a log-exponential trend, an exponential trend, and a fractional trend. A pair of economies is deemed converging if the time series of their income gap is stationary around any of these trends. To test for stationarity, standard unit root tests are applied with nonstandard test statistics that are estimated for each kind of trend. 
Keywords:  income convergence; time series econometrics; nonlinear time series model; unit root 
JEL:  C32 C51 
Date:  2020–03–28 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:99316&r=all 
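Because the exponential trend $\gamma e^{-\theta t}$ is linear in $\gamma$ for a given $\theta$, it can be fitted by profiling $\gamma$ out and grid-searching over $\theta$; a sketch on simulated data (illustrative only: the paper also covers log-exponential and fractional trends, and supplies the trend-specific unit-root critical values that a real test would need):

```python
import math
import random

random.seed(3)

T = 100
theta_true, gamma_true = 0.10, 2.0
gap = [gamma_true * math.exp(-theta_true * t) + random.gauss(0, 0.02)
       for t in range(T)]

best = None
for k in range(1, 51):                       # grid over theta = 0.01, ..., 0.50
    theta = 0.01 * k
    w = [math.exp(-theta * t) for t in range(T)]
    # given theta, the least-squares gamma has a closed form
    gamma = sum(wi * gi for wi, gi in zip(w, gap)) / sum(wi * wi for wi in w)
    ssr = sum((gi - gamma * wi) ** 2 for wi, gi in zip(w, gap))
    if best is None or ssr < best[0]:
        best = (ssr, theta, gamma)

ssr_hat, theta_hat, gamma_hat = best
resid = [gi - gamma_hat * math.exp(-theta_hat * t) for t, gi in enumerate(gap)]
# a unit root test would now be applied to `resid`, using the
# nonstandard critical values estimated in the paper
```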
By:  Timothy B. Armstrong; Michal Kolesár; Mikkel Plagborg-Møller 
Abstract:  We construct robust empirical Bayes confidence intervals (EBCIs) in a normal means problem. The intervals are centered at the usual empirical Bayes estimator, but use a larger critical value to account for the effect of shrinkage. We show that in this setting, parametric EBCIs based on the assumption that the means are normally distributed (Morris, 1983) can have coverage substantially below the nominal level when the normality assumption is violated, and we derive a simple rule of thumb for gauging the potential coverage distortion. In contrast, while our EBCIs remain close in length to the parametric EBCIs when the means are indeed normally distributed, they achieve correct coverage regardless of the means distribution. If the means are treated as fixed, our EBCIs have an average coverage guarantee: the coverage probability is at least $1-\alpha$ on average across the $n$ EBCIs for each of the means. We illustrate our methods with applications to effects of U.S. neighborhoods on intergenerational mobility, and structural changes in factor loadings in a large dynamic factor model for the Eurozone. Our approach generalizes to the construction of intervals with average coverage guarantees in other regularized estimation settings. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.03448&r=all 
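The shrinkage-then-widen logic can be sketched as follows. This is a toy illustration under assumptions we introduce ourselves: the robust critical value `cva_robust` is a placeholder constant, whereas the paper derives the exact robust value from moments of the estimated means.

```python
import random
import statistics

random.seed(4)

n = 1000
theta = [random.gauss(0, 2) for _ in range(n)]     # unobserved means
y = [t + random.gauss(0, 1) for t in theta]        # noisy estimates, sigma = 1

# Empirical Bayes shrinkage toward the grand mean
mu = statistics.fmean(y)
s2 = max(statistics.variance(y) - 1.0, 0.0)        # estimated variance of theta
w = s2 / (s2 + 1.0)                                # shrinkage weight
eb = [mu + w * (yi - mu) for yi in y]              # EB point estimates

z = 1.96
half_param = z * (w ** 0.5)                        # parametric EBCI half-length
# Placeholder for the robust critical value: the paper computes it from
# moments of the means; here we only illustrate that it exceeds z, so
# robust EBCIs are somewhat wider than parametric ones.
cva_robust = 2.2                                   # hypothetical value > 1.96
half_robust = cva_robust * (w ** 0.5)

# average coverage of the robust intervals across the n means
covered = sum(abs(t - e) <= half_robust for t, e in zip(theta, eb)) / n
```

The intervals stay centered at the shrunk estimates; only the critical value is enlarged, which is why the robust EBCIs remain close in length to the parametric ones when normality in fact holds.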
By:  Benjamin Avanzi; Greg Taylor; Bernard Wong; Alan Xian 
Abstract:  The Markov-modulated Poisson process is utilised for count modelling in a variety of areas such as queueing, reliability, network and insurance claims analysis. In this paper, we extend the Markov-modulated Poisson process framework through the introduction of a flexible frequency perturbation measure. This contribution enables known information about observed event arrivals to be incorporated naturally and tractably, while the hidden Markov chain captures the effect of unobservable drivers of the data. In addition to increasing accuracy and interpretability, this method supplements analysis of the latent factors. The procedure also naturally accommodates data features such as overdispersion and autocorrelation, and additional insights can be generated to assist analysis, including a procedure for iterative model improvement. Implementation difficulties are addressed as well, with a focus on large data sets, where latent models are especially advantageous because the large number of observations facilitates identification of hidden factors. Computational issues such as numerical underflow and high processing cost arise in this context, and we provide procedures to overcome them. The modelling framework is demonstrated on a large insurance data set to illustrate its theoretical, practical and computational contributions, and an empirical comparison with other count models highlights the advantages of the proposed approach. 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.13888&r=all 
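The numerical underflow problem mentioned in the abstract, and the standard log-sum-exp remedy, can be demonstrated on a discrete-time hidden Markov approximation of a Markov-modulated count process (a generic sketch, not the paper's procedure): the naive forward recursion multiplies thousands of small probabilities and collapses to zero, while the log-space recursion stays finite.

```python
import math
import random

random.seed(5)

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_poisson_pmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def rpois(lam):
    """Knuth's Poisson sampler."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

# Two hidden states with different Poisson intensities
lams = [2.0, 8.0]
P = [[0.95, 0.05], [0.05, 0.95]]                   # transition matrix
s, counts = 0, []
for _ in range(1000):                              # simulate the chain
    counts.append(rpois(lams[s]))
    s = 0 if random.random() < P[s][0] else 1

# Naive forward algorithm: the likelihood product underflows to 0.0
a = [0.5 * math.exp(log_poisson_pmf(counts[0], lams[j])) for j in range(2)]
for k in counts[1:]:
    a = [sum(a[i] * P[i][j] for i in range(2)) *
         math.exp(log_poisson_pmf(k, lams[j])) for j in range(2)]
naive_lik = sum(a)

# Log-space forward algorithm stays finite
la = [math.log(0.5) + log_poisson_pmf(counts[0], lams[j]) for j in range(2)]
for k in counts[1:]:
    la = [logsumexp([la[i] + math.log(P[i][j]) for i in range(2)]) +
          log_poisson_pmf(k, lams[j]) for j in range(2)]
loglik = logsumexp(la)
```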
By:  C. Tortù; L. Forastiere; I. Crimaldi; F. Mealli 
Abstract:  Policy evaluation studies, which aim to assess the effect of an intervention, pose several statistical challenges: real-world scenarios provide treatments which have not been assigned randomly, and the analysis might be further complicated by the presence of interference between units. Researchers have started to develop novel methods that allow spillover mechanisms to be handled in observational studies under binary treatments. But many policy evaluation studies require complex treatments, such as multivalued treatments. For instance, in political science, evaluating the impact of policies implemented by administrative entities often calls for a multivalued approach, as the general political stance towards a specific issue varies over many dimensions. In this work, we extend the statistical framework for causal inference under network interference in observational studies, allowing for a multivalued individual treatment and an interference structure shaped by a weighted network. Under a multivalued treatment, each unit is exposed to all levels of the treatment, according to the network weights, through the influence of its neighbors. The estimation strategy is based on a joint multiple generalized propensity score and allows estimation of direct effects, controlling for both individual and network covariates. We follow the proposed methodology to analyze the impact of national immigration policy on crime rates. We define a multivalued characterization of political attitudes towards migrants and assume that the extent to which each country can be influenced by another is modeled by an appropriate indicator, which we call the Interference Compound Index (ICI). Results suggest that implementing highly restrictive immigration policies leads to an increase in crime rates, and the magnitude of the estimated effects is stronger once multivalued interference is taken into account. 
Date:  2020–02 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.10525&r=all 
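The idea that, under a weighted network, each unit is partially exposed to every treatment level through its neighbors can be sketched as follows (a toy illustration: the names `weights`, `treatment` and `exposure` are ours, and the paper's estimator additionally builds a joint multiple generalized propensity score on top of such exposures):

```python
# Toy weighted network: weights[i][j] = influence of unit j on unit i
weights = [
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [0.2, 0.8, 0.0],
]
treatment = [0, 2, 1]        # each unit's own level of a 3-valued treatment
K = 3                        # number of treatment levels

def exposure(i):
    """Share of unit i's weight-normalised neighbourhood at each level."""
    total = sum(weights[i])
    shares = [0.0] * K
    for j, w in enumerate(weights[i]):
        if j != i:
            shares[treatment[j]] += w / total
    return shares

exposures = [exposure(i) for i in range(3)]
```

For unit 0, for instance, 70% of its network influence comes from a neighbor at level 2 and 30% from a neighbor at level 1, so it is simultaneously (and partially) exposed to both levels.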
By:  Wei-Zhen Li (ECUST); Jin-Rui Zhai (ECUST); Zhi-Qiang Jiang (ECUST); Gang-Jin Wang (HNU); Wei-Xing Zhou (ECUST) 
Abstract:  Predicting the occurrence of tail events is of great importance in financial risk management. By employing the peak-over-threshold (POT) method to identify financial extremes, we perform a recurrence interval analysis (RIA) on these extremes. We find that the waiting times between consecutive extremes (recurrence intervals) follow a $q$-exponential distribution and that the sizes of extremes above the thresholds (exceeding sizes) conform to a generalized Pareto distribution. We also find a significant correlation between recurrence intervals and exceeding sizes. We thus model the joint distribution of recurrence intervals and exceeding sizes by connecting the two marginal distributions with the Frank and AMH copula functions, and apply this joint distribution to estimate the hazard probability of observing another extreme in $\Delta t$ time given that the last extreme happened $t$ time ago. Furthermore, an extreme prediction model based on RIA-EVT-Copula is proposed by applying a decision-making algorithm to the hazard probability. Both in-sample and out-of-sample tests reveal that this new forecasting framework predicts better than a model based on the hazard probability estimated only from the distribution of recurrence intervals. Our results not only shed new light on the occurrence pattern of extremes in financial markets, but also improve the accuracy of predicting financial extremes for risk management. 
Date:  2020–04 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2004.03190&r=all 
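The first stage of the analysis, extracting recurrence intervals and exceeding sizes via peak-over-threshold, can be sketched on simulated data (illustrative only: we stop at the raw quantities; the paper then fits a $q$-exponential to the intervals, a generalized Pareto to the sizes, and couples the two margins with a Frank or AMH copula):

```python
import random

random.seed(6)

# Simulated daily losses (absolute returns as a stand-in)
n = 5000
losses = [abs(random.gauss(0, 1)) for _ in range(n)]

# Peak-over-threshold: threshold at the 95% empirical quantile
u = sorted(losses)[int(0.95 * n)]
times = [t for t, x in enumerate(losses) if x > u]
sizes = [losses[t] - u for t in times]                 # exceeding sizes
intervals = [b - a for a, b in zip(times, times[1:])]  # recurrence intervals

mean_interval = sum(intervals) / len(intervals)
```

With a 95% threshold, roughly one observation in twenty exceeds it, so the mean recurrence interval is close to 20 periods.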
By:  Lorenzo Trapani; Emily Whitehouse 
Abstract:  We develop monitoring procedures for cointegrating regressions, testing the null of no breaks against the alternatives that there is either a change in the slope or a change to non-cointegration. After observing the regression over a calibration sample of size m, we study a CUSUM-type statistic to detect the presence of a change during a monitoring horizon m+1,...,T. Our procedures use a class of boundary functions which depend on a parameter whose value affects the delay in detecting a possible break. Technically, these procedures are based on almost-sure limit theorems whose derivation is not straightforward. We therefore define a monitoring function which, at every point in time, diverges to infinity under the null and drifts to zero under the alternatives. We cast this sequence in a randomised procedure to construct an i.i.d. sequence, which we then employ to define the detector function. Our monitoring procedure rejects the null of no break with small probability when the null is correct, whilst it rejects with probability one over the monitoring horizon in the presence of breaks. 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.12182&r=all 
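A stylized version of sequential CUSUM monitoring with a boundary function can be sketched as follows (a generic illustration: the linear boundary and the constant 4.0 are our choices, and the paper's detector is the rather different randomised construction described in the abstract):

```python
import math
import random

random.seed(7)

m = 200                        # calibration sample
horizon = 600
break_at = 400                 # structural break during monitoring
drift = 1.0

e = [random.gauss(0, 1) for _ in range(horizon)]
for t in range(break_at, horizon):
    e[t] += drift              # post-break shift in the residual mean

# Location and scale estimated on the calibration sample only
mu = sum(e[:m]) / m
sd = math.sqrt(sum((x - mu) ** 2 for x in e[:m]) / (m - 1))

detect_at, s = None, 0.0
for t in range(m, horizon):
    s += (e[t] - mu) / sd                  # CUSUM of standardised residuals
    boundary = 4.0 * math.sqrt(t - m + 1)  # illustrative boundary function
    if abs(s) > boundary:
        detect_at = t
        break
```

A flatter boundary detects the break sooner but raises the chance of a false alarm, which is exactly the delay-versus-size trade-off governed by the boundary-function parameter in the abstract.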
By:  Don Harding 
Abstract:  We study the puzzle that econometric tests reject the great ratios hypothesis but economic growth theorists and quantitative macroeconomic model builders continue to embed that hypothesis in their work. We develop an econometric framework for the great ratios hypothesis and apply that framework to investigate the commonly used econometric techniques that produce rejection of the great ratios hypothesis. We prove that these methods cannot produce valid inference on the great ratios hypothesis. Thus we resolve the puzzle in favour of the growth theorists and quantitative macroeconomic model builders. We apply our framework to investigate the econometric basis for an influential paper that uses unit root and cointegration tests to reject the great ratios hypothesis for a vector that comprises consumption, financial wealth and labour income. 
Keywords:  Great ratios hypothesis, Cointegration, Likelihood ratio, Inference 
JEL:  C12 C18 C32 E00 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:cop:wpaper:g300&r=all 
By:  Jennifer L. Castle (Magdalen College, University of Oxford); Jurgen A. Doornik (Nuffield College, University of Oxford); David Hendry (Nuffield College, University of Oxford) 
Abstract:  Economic forecasting is difficult, largely because of the many sources of nonstationarity. The M4 competition aims to improve the practice of economic forecasting by providing a large data set on which the efficacy of forecasting methods can be evaluated. We consider the general principles that seem to be the foundation for successful forecasting, and show how these are relevant for methods that do well in M4. We establish some general properties of the M4 data set, which we use to improve the basic benchmark methods, as well as the Card method that we created for our submission to the M4 competition. A data generation process is proposed that captures the salient features of the annual data in M4. 
Keywords:  Automatic forecasting, Calibration, Prediction intervals, Regression, M4, Seasonality, Software, Time series, Unit roots 
Date:  2019–01–09 
URL:  http://d.repec.org/n?u=RePEc:nuf:econwp:1901&r=all 
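One of the basic benchmark methods used in the M4 competition (and improved upon in the paper) is the seasonal naive forecast, which simply repeats the last observed seasonal cycle; a minimal sketch (the paper's own Card method is not reproduced here):

```python
def seasonal_naive(series, season, h):
    """Forecast h steps ahead by repeating the last full seasonal cycle."""
    last_cycle = series[-season:]
    return [last_cycle[i % season] for i in range(h)]

# Quarterly toy series with a clear period-4 pattern
y = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
fc = seasonal_naive(y, season=4, h=6)
```

Despite its simplicity, this kind of benchmark is hard to beat on many M4 series, which is why improving the benchmarks themselves is informative.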
By:  Zoë Fannon (Somerville College, University of Oxford); B. Nielsen (Nuffield College, University of Oxford) 
Abstract:  Outcomes of interest often depend on the age, period, or cohort of the individual observed, where cohort and age add up to period. An example is consumption: consumption patterns change over the life cycle (age) but are also affected by the availability of products at different times (period) and by birth-cohort-specific habits and preferences (cohort). Age-period-cohort (APC) models are additive models where the predictor is a sum of three time effects, which are functions of age, period and cohort, respectively. Variations of these models are available for data aggregated over age, period, and cohort, and for data drawn from repeated cross-sections, where the time effects can be combined with individual covariates. The age, period and cohort time effects are intertwined. Inclusion of an indicator variable for each level of age, period, and cohort results in perfect collinearity, which is referred to as “the age-period-cohort identification problem”. Estimation can be done by dropping indicator variables. However, this has the adverse consequence that the time effects are not individually interpretable and inference becomes complicated. These consequences are avoided by decomposing the time effects into linear and nonlinear components and noting that the identification problem relates to the linear components, whereas the nonlinear components are identifiable. Thus, confusion is avoided by keeping the identifiable nonlinear components of the time effects and the unidentifiable linear components apart. A variety of hypotheses of practical interest can be expressed in terms of the nonlinear components. 
Date:  2018–11–28 
URL:  http://d.repec.org/n?u=RePEc:nuf:econwp:1804&r=all 
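The identification problem can be verified numerically: even after reference-coding the dummies (dropping one level of each effect) and adding an intercept, the identity age + cohort − period = 0 leaves exactly one linear dependency in the design matrix. A sketch, assuming a 5x5 age-cohort grid:

```python
def rank(mat, tol=1e-9):
    """Matrix rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in mat]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            f = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= f * m[r][j]
        r += 1
    return r

A, C = 5, 5                      # age and cohort levels; period = age + cohort
design = []
for a in range(A):
    for c in range(C):
        p = a + c                # 9 period levels, 0..8
        row = [1.0]                                            # intercept
        row += [1.0 if a == k else 0.0 for k in range(1, A)]   # age dummies
        row += [1.0 if p == k else 0.0 for k in range(1, 9)]   # period dummies
        row += [1.0 if c == k else 0.0 for k in range(1, C)]   # cohort dummies
        design.append(row)

n_cols = len(design[0])          # 1 + 4 + 8 + 4 = 17 columns
r = rank(design)                 # 16: exactly one collinearity remains
```

The one missing rank corresponds to the unidentifiable linear components; the nonlinear components, which span the remaining dimensions, are identified.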
By:  Adeniyi, Isaac Adeola; Yahya, Waheed Babatunde 
Abstract:  A standard assumption is that the random effects of Generalized Linear Mixed Effects Models (GLMMs) follow the normal distribution. However, this assumption has been found to be quite unrealistic and sometimes too restrictive, as revealed in many real-life situations. A common case of departure from normality is the presence of outliers, leading to heavy-tailed distributed random effects. This work therefore aims to develop a robust GLMM framework by replacing the normality assumption on the random effects with distributions belonging to the Normal-Independent (NI) class. The resulting models are called Normal-Independent GLMMs (NI-GLMMs). The four special cases of the NI class considered in these models’ formulations are the normal, Student-t, slash and contaminated normal distributions. A full Bayesian technique was adopted for estimation and inference. A real-life data set on cotton bolls was used to demonstrate the performance of the proposed NI-GLMM methodology. 
Keywords:  Generalized Linear Mixed Effects Models, Normal-Independent class, Normal density, Student-t, Slash density, Bayesian Method. 
JEL:  C11 C53 C63 
Date:  2020–03–18 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:99165&r=all 
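Members of the Normal-Independent class can all be written as a normal draw divided by the square root of an independent mixing variable $W$; a sketch of the four special cases (under common parameterisations we assume here, which may differ in detail from the paper's):

```python
import math
import random

random.seed(8)

def ni_draw(kind, nu=3.0):
    """One draw b = Z / sqrt(W) from the Normal-Independent class."""
    z = random.gauss(0, 1)
    if kind == "normal":
        w = 1.0                                       # degenerate mixing
    elif kind == "student-t":
        w = random.gammavariate(nu / 2.0, 2.0 / nu)   # W ~ chi2_nu / nu
    elif kind == "slash":
        w = random.random() ** (2.0 / nu)             # W ~ Beta(nu/2, 1)
    elif kind == "contaminated":
        w = 0.1 if random.random() < 0.1 else 1.0     # inflated-variance mix
    else:
        raise ValueError(kind)
    return z / math.sqrt(w)

n = 20000
normal_draws = [ni_draw("normal") for _ in range(n)]
t_draws = [ni_draw("student-t") for _ in range(n)]

# Heavier tails under Student-t mixing: far more draws beyond 4 s.d.
tail_normal = sum(abs(b) > 4 for b in normal_draws)
tail_t = sum(abs(b) > 4 for b in t_draws)
```

The heavier tails produced by the non-degenerate mixing distributions are exactly what makes the resulting random effects robust to outliers.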
By:  Vladimír Holý; Petra Tomanová 
Abstract:  We investigate the computational issues related to memory size in the estimation of quadratic covariation using financial ultra-high-frequency data. In the multivariate price process, we consider both contamination by market microstructure noise and non-synchronous observations. We express the multi-scale, flat-top realized kernel, non-flat-top realized kernel, pre-averaging and modulated realized covariance estimators in quadratic form and fix their bandwidth parameter at a constant value. This allows us to operate with limited memory and to formulate such an estimation approach as a streaming algorithm. We compare the performance of the fixed-bandwidth estimators in a simulation study. We find that the estimators ensuring positive semidefiniteness require a much higher bandwidth than the estimators without such a constraint. 
Date:  2020–03 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2003.13062&r=all 
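Fixing the bandwidth at a constant is what turns a kernel estimator into a streaming algorithm: only the last H returns need to be held in memory. A univariate sketch with the Parzen weight function (illustrative only: the paper treats the multivariate, noise- and asynchronicity-robust estimators):

```python
import random
from collections import deque

random.seed(9)

def parzen(x):
    """Parzen weight function commonly used in realized kernels."""
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

H = 10                              # fixed (constant) bandwidth
gamma = [0.0] * (H + 1)             # running autocovariances of returns
recent = deque(maxlen=H)            # only the last H returns are kept

n = 10000
rv = 0.0
for _ in range(n):                  # stream of high-frequency returns
    r = random.gauss(0, 0.01)
    rv += r * r                     # plain realized variance, for comparison
    gamma[0] += r * r
    for h, past in enumerate(reversed(recent), start=1):
        gamma[h] += r * past        # gamma[h] accumulates sum of r_t * r_{t-h}
    recent.append(r)

# Realized kernel estimate as a weighted (quadratic-form) sum
rk = gamma[0] + sum(2.0 * parzen((h - 1) / H) * gamma[h]
                    for h in range(1, H + 1))
```

Memory use is O(H) regardless of the sample size, which is the point of fixing the bandwidth; on these noise-free simulated returns the kernel estimate stays close to the plain realized variance.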
By:  Dadakas, Dimitrios 
Abstract:  Advances in the gravity literature have produced econometric approaches for the theoretically consistent estimation of structural gravity. When estimating the impact of policy shocks on trade values, however, researchers are confronted with two problems. Once multilateral resistances are taken into account through time-varying importer and exporter fixed effects, these fixed effects absorb the effect of policy-shock indicator variables; hence, we cannot obtain a coefficient for the impact of policy. The second problem is rooted in the panel data dimensions that structural gravity requires, namely multiple exporters and multiple importers. The (at least) three-dimensional panel implies that any estimated coefficients apply to the whole set of exporters rather than to the country related to the scope of the research. I propose a method to approach these two problems, estimate the impact of policy-shock variables on trade, and differentiate the results for the country or countries related to the scope of the research. A short application to the impact of the Global Financial Crisis on trade values is presented. 
Keywords:  Trade, Structural Gravity, PPML, Poisson Pseudo Maximum Likelihood, Global Financial Crisis 
JEL:  C1 C10 C2 C23 F10 F14 
Date:  2020–03–05 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:98956&r=all 
By:  Ian Crawford (Nuffield College, University of Oxford) 
Abstract:  This paper studies labour supply in panel data by means of random fields. In doing so it describes a way of uniting classical revealed preference techniques and econometric prediction through a best linear unbiased prediction procedure based on Goldberger (1962), known as Wiener-Kolmogorov prediction or Universal Kriging in the spatial statistics literature. This, it is argued, retains the best features of both the revealed preference and statistical approaches. In an application to the consumption and labour supply decisions of NYC taxi drivers, this paper makes a number of empirical points: first, behaviour which is rational on the basis of conventional revealed-preference measures can be shown to be highly economically implausible; secondly, modelling labour supply with parsimonious, relevant conditioning can solve the puzzle by providing predictions which match the data, are theoretically consistent, and yet are behaviourally and economically sensible; thirdly, modelling behaviour at the level at which the theory is designed to apply (that is, at the level of the individual) can give greater insights into behaviour and heterogeneity than modelling population moments or quantiles; lastly, the practice of assuming monotonic scalar heterogeneity when modelling cross-sectional data may give a strongly misleading impression of both behaviour and preference heterogeneity. 
Date:  2019–10–11 
URL:  http://d.repec.org/n?u=RePEc:nuf:econwp:1906&r=all 