NEP: New Economics Papers on Econometrics |
By: | Jinyuan Chang; Zhentao Shi; Jia Zhang |
Abstract: | Models defined by moment conditions are at the center of structural econometric estimation, but economic theory is mostly silent about moment selection. A large pool of valid moments can potentially improve estimation efficiency, whereas a few invalid ones may undermine consistency. This paper investigates the empirical likelihood estimation of these moment-defined models in high-dimensional settings. We propose a penalized empirical likelihood (PEL) estimator and show that it achieves the oracle property, under which the invalid moments can be consistently detected. While the PEL estimator is asymptotically normally distributed, a projected PEL procedure can further eliminate its asymptotic bias and provide a more accurate normal approximation to the finite sample distribution. Simulation exercises demonstrate the excellent numerical performance of these methods in estimation and inference. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.03382&r= |
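A minimal sketch of the empirical-likelihood machinery behind a PEL estimator of the kind described above: the inner dual problem profiles out the Lagrange multiplier at a fixed parameter value, and a SCAD penalty (the usual choice for oracle-type selection) can then be attached to the parameters governing moment validity. The function names, the moment-function interface, and the BFGS inner solver are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_el(theta, g_fun, data):
    """Negative empirical log-likelihood ratio at theta.

    g_fun(theta, data) returns an n x m matrix of moment contributions
    g_i(theta). The dual of the inner EL problem is
    max over lam of sum_i log(1 + lam' g_i(theta)).
    """
    g = g_fun(theta, data)
    n, m = g.shape

    def dual(lam):
        arg = 1.0 + g @ lam
        if np.any(arg <= 1e-8):      # keep implied probabilities positive
            return 1e10              # (in practice a damped log is used here)
        return -np.sum(np.log(arg))

    res = minimize(dual, np.zeros(m), method="BFGS")
    return -res.fun

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty evaluated elementwise at |t|."""
    t = np.abs(t)
    return np.where(
        t <= lam,
        lam * t,
        np.where(
            t <= a * lam,
            (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1)),
            lam ** 2 * (a + 1) / 2,
        ),
    )
```

A PEL-type objective would then minimize neg_log_el(theta) plus n times a sum of such penalties over an outer search; the paper's high-dimensional analysis adds structure this sketch omits.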
By: | Hanno Reuvers; Etienne Wijler |
Abstract: | We consider sparse estimation of a class of high-dimensional spatio-temporal models. Unlike classical spatial autoregressive models, we do not rely on a predetermined spatial interaction matrix. Instead, under the assumption of sparsity, we estimate the relationships governing both the spatial and temporal dependence in a fully data-driven way by penalizing a set of Yule-Walker equations. While this regularization can be left unstructured, we also propose a customized form of shrinkage to further exploit diagonally structured forms of sparsity that follow intuitively when observations originate from spatial grids such as satellite images. We derive finite sample error bounds for this estimator, as well as estimation consistency in an asymptotic framework wherein the sample size and the number of spatial units diverge jointly. A simulation exercise shows strong finite sample performance compared to competing procedures. As an empirical application, we model satellite-measured NO2 concentrations in London. Our approach delivers forecast improvements over a competitive benchmark and we discover evidence for strong spatial interactions between sub-regions. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.02864&r= |
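The penalized Yule-Walker idea above can be sketched compactly for a first-order model: with Gamma(1) = A Gamma(0), each row of the autoregressive matrix A solves a lasso problem whose design matrix is the lag-0 autocovariance. This is a stylized VAR(1) version; the paper's diagonally structured shrinkage for grid data and its joint asymptotics are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_yule_walker(X, penalty):
    """Row-wise lasso on the Yule-Walker equations Gamma(1) = A Gamma(0).

    X: T x N array (e.g., vectorized grid cells over time).
    Returns an N x N estimate of the spatio-temporal matrix A.
    """
    T, N = X.shape
    Xc = X - X.mean(axis=0)
    gamma0 = Xc[:-1].T @ Xc[:-1] / (T - 1)   # lag-0 autocovariance
    gamma1 = Xc[1:].T @ Xc[:-1] / (T - 1)    # lag-1 autocovariance
    A = np.zeros((N, N))
    for j in range(N):
        # Row j satisfies Gamma(1)[j, :] = A[j, :] @ Gamma(0); Gamma(0) symmetric
        fit = Lasso(alpha=penalty, fit_intercept=False, max_iter=10_000)
        fit.fit(gamma0, gamma1[j, :])
        A[j, :] = fit.coef_
    return A
```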
By: | Stephen Coussens; Jann Spiess |
Abstract: | Instrumental variables (IV) regression is widely used to estimate causal treatment effects in settings where receipt of treatment is not fully random, but there exists an instrument that generates exogenous variation in treatment exposure. While IV can recover consistent treatment effect estimates, the resulting estimates are often noisy. Building upon earlier work in biostatistics (Joffe and Brensinger, 2003) and relating to an evolving literature in econometrics (including Abadie et al., 2019; Huntington-Klein, 2020; Borusyak and Hull, 2020), we study how to improve the efficiency of IV estimates by exploiting the predictable variation in the strength of the instrument. In the case where both the treatment and instrument are binary and the instrument is independent of baseline covariates, we study weighting each observation according to its estimated compliance (that is, its conditional probability of being affected by the instrument), which we motivate from a (constrained) solution of the first-stage prediction problem implicit in IV. The resulting estimator can leverage machine learning to estimate compliance as a function of baseline covariates. We derive the large-sample properties of a specific implementation of a weighted IV estimator in the potential outcomes and local average treatment effect (LATE) frameworks, and provide tools for inference that remain valid even when the weights are estimated nonparametrically. With both theoretical results and a simulation study, we demonstrate that compliance weighting meaningfully reduces the variance of IV estimates when first-stage heterogeneity is present, and that this improvement often outweighs any difference between the compliance-weighted and unweighted IV estimands. These results suggest that in a variety of applied settings, the precision of IV estimates can be substantially improved by incorporating compliance estimation. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.03726&r= |
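A hedged sketch of the compliance-weighting idea, assuming a binary treatment and instrument with the instrument independent of covariates, as in the abstract. The gradient-boosting first stage and the weighted Wald form are illustrative choices; the paper's estimator and inference tools (e.g., handling of nonparametrically estimated weights) are more careful than this.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def compliance_weighted_iv(y, d, z, X):
    """Weight observations by estimated compliance, then run a weighted Wald/IV step.

    y: outcome, d: binary treatment, z: binary instrument, X: covariate matrix.
    """
    # First stage: flexible model for P(D = 1 | Z, X)
    fs = GradientBoostingClassifier().fit(np.column_stack([z, X]), d)
    p1 = fs.predict_proba(np.column_stack([np.ones_like(z), X]))[:, 1]
    p0 = fs.predict_proba(np.column_stack([np.zeros_like(z), X]))[:, 1]
    w = np.clip(p1 - p0, 0.0, None)          # estimated compliance score
    # Weighted Wald estimator: cov_w(z, y) / cov_w(z, d)
    zbar = np.average(z, weights=w)
    num = np.average((z - zbar) * y, weights=w)
    den = np.average((z - zbar) * d, weights=w)
    return num / den
```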
By: | Tadao Hoshino; Takahide Yanagi |
Abstract: | In this paper, we investigate a treatment effect model in which individuals interact in a social network and may not comply with the assigned treatments. We introduce a new concept of exposure mapping, which summarizes spillover effects into a fixed-dimensional statistic of instrumental variables, and we call this mapping the instrumental exposure mapping (IEM). We investigate identification conditions for the intention-to-treat effect and the average causal effect for compliers, while explicitly considering the possibility of misspecification of the IEM. Based on our identification results, we develop nonparametric estimation procedures for the treatment parameters. Their asymptotic properties, including consistency and asymptotic normality, are investigated using the approximate neighborhood interference framework of Leung (2021). For an empirical illustration of our proposed method, we revisit Paluck et al.'s (2016) experimental data on an anti-conflict intervention program in schools. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.07455&r= |
By: | Kenwin Maung |
Abstract: | Maximum likelihood estimation of large Markov-switching vector autoregressions (MS-VARs) can be challenging or infeasible due to parameter proliferation. To accommodate situations where dimensionality may be of comparable order to, or exceed, the sample size, we adopt a sparse framework and propose two penalized maximum likelihood estimators with either the Lasso or the smoothly clipped absolute deviation (SCAD) penalty. We show that both estimators are estimation consistent, while the SCAD estimator also selects relevant parameters with probability approaching one. A modified EM algorithm is developed for the case of Gaussian errors, and simulations show that the algorithm exhibits desirable finite sample performance. In an application to short-horizon return predictability in the US, we estimate a 15-variable, 2-state MS-VAR(1) and obtain the often-reported counter-cyclicality in predictability. The variable selection property of our estimators helps to identify predictors that contribute strongly to predictability during economic contractions but are otherwise irrelevant in expansions. Furthermore, out-of-sample analyses indicate that large MS-VARs can significantly outperform "hard-to-beat" predictors like the historical average. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.12552&r= |
By: | Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Xinyue Bei (Duke University); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros) |
Abstract: | We propose a multivariate normality test against skew normal distributions using higher-order log-likelihood derivatives, which is asymptotically equivalent to the likelihood ratio but only requires estimation under the null. Numerically, it is the supremum of the univariate skewness coefficient test over all linear combinations of the variables. We can simulate its exact finite sample distribution for any multivariate dimension and sample size. Our Monte Carlo exercises confirm its power advantages over alternative approaches. Finally, we apply it to the joint distribution of US city sizes in two consecutive censuses, finding that non-normality is very clearly seen in their growth rates. |
Keywords: | City size distribution, exact test, extremum test, Gibrat's law, skew normal distribution. |
JEL: | C46 R11 |
Date: | 2021–05 |
URL: | http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2021_2104&r= |
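The statistic described above is the supremum of the univariate skewness test over all linear combinations, and its exact null distribution can be simulated for any dimension and sample size. A rough sketch approximates the supremum by scanning a fixed grid of random directions; the n/6 scaling of the one-dimensional skewness statistic and the direction scan are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def sup_skewness_stat(X, n_dirs=2000, seed=0):
    """Approximate sup over directions a of the skewness test for a'X."""
    rng = np.random.default_rng(seed)    # fixed grid, shared by data and simulations
    n, k = X.shape
    dirs = rng.standard_normal((n_dirs, k))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = (X - X.mean(axis=0)) @ dirs.T
    skew = ((proj / proj.std(axis=0)) ** 3).mean(axis=0)
    return (n / 6.0) * np.max(skew ** 2)

def exact_critical_value(n, k, alpha=0.05, n_sim=499, seed=1):
    """Simulate the null (multivariate normal) distribution for given n, k."""
    rng = np.random.default_rng(seed)
    null = [sup_skewness_stat(rng.standard_normal((n, k))) for _ in range(n_sim)]
    return np.quantile(null, 1 - alpha)
```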
By: | Ruoxuan Xiong; Allison Koenecke; Michael Powell; Zhu Shen; Joshua T. Vogelstein; Susan Athey |
Abstract: | Analyzing observational data from multiple sources can be useful for increasing statistical power to detect a treatment effect; however, practical constraints such as privacy considerations may restrict individual-level information sharing across data sets. This paper develops federated methods that only utilize summary-level information from heterogeneous data sets. Our federated methods provide doubly-robust point estimates of treatment effects as well as variance estimates. We derive the asymptotic distributions of our federated estimators, which are shown to be asymptotically equivalent to the corresponding estimators from the combined, individual-level data. We show that to achieve these properties, federated methods should be adjusted based on conditions such as whether models are correctly specified and stable across heterogeneous data sets. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.11732&r= |
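A stylized stand-in for the aggregation step: each site shares only a summary-level (estimate, variance) pair, for instance from a local doubly-robust fit, and the coordinator combines them by inverse-variance weighting. The paper's adjustments for misspecified or unstable models across heterogeneous sites go well beyond this sketch.

```python
import numpy as np

def federated_combine(site_estimates, site_variances):
    """Inverse-variance-weighted pooling of K site-level summaries."""
    est = np.asarray(site_estimates, dtype=float)
    var = np.asarray(site_variances, dtype=float)
    w = (1.0 / var) / np.sum(1.0 / var)     # precision weights
    pooled = np.sum(w * est)
    pooled_var = 1.0 / np.sum(1.0 / var)
    return pooled, pooled_var
```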
By: | Guido Imbens; Nathan Kallus; Xiaojie Mao |
Abstract: | We develop a new approach for identifying and estimating average causal effects in panel data under a linear factor model with unmeasured confounders. Compared to other methods tackling factor models, such as synthetic controls and matrix completion, our method does not require the number of time periods to grow infinitely. Instead, we draw inspiration from the two-way fixed effect model as a special case of the linear factor model, where a simple difference-in-differences transformation identifies the effect. We show that analogous, albeit more complex, transformations exist in the more general linear factor model, providing a new means to identify the effect in that model. In fact, many such transformations exist, called bridge functions, all identifying the same causal effect estimand. This poses a unique challenge for estimation and inference, which we solve by targeting the minimal bridge function using a regularized estimation approach. We prove that our resulting average causal effect estimator is root-N consistent and asymptotically normal, and we provide asymptotically valid confidence intervals. Finally, we provide extensions for the case of a linear factor model with time-varying unmeasured confounders. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.03849&r= |
By: | Sakae Oya; Teruo Nakatsuma |
Abstract: | Harvey et al. (2010) extended the Bayesian estimation method of Sahu et al. (2003) to a multivariate skew-elliptical distribution with a general skewness matrix, and applied it to Bayesian portfolio optimization with higher moments. Although their method is epochal in the sense that it can handle the skewness dependency among asset returns and incorporate higher moments into portfolio optimization, it cannot identify all elements in the skewness matrix due to label switching in the Gibbs sampler. To deal with this identification issue, we propose to modify their sampling algorithm by imposing a positive lower-triangular constraint on the skewness matrix of the multivariate skew-elliptical distribution, which also improves interpretability. Furthermore, we propose a Bayesian sparse estimation of the skewness matrix with the horseshoe prior to further improve accuracy. In the simulation study, we demonstrate that the proposed method with the identification constraint can successfully estimate the true structure of the skewness dependency, while the existing method suffers from the identification issue. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.04019&r= |
By: | Amadou Barry; Karim Oualkacha; Arthur Charpentier |
Abstract: | The fixed-effects model estimates the regressor effects on the mean of the response, which is inadequate to summarize the variable relationships in the presence of heteroscedasticity. In this paper, we adapt the asymmetric least squares (expectile) regression to the fixed-effects model and propose a new model: expectile regression with fixed effects (ERFE). The ERFE model applies the within-transformation strategy to concentrate out the incidental parameters and estimates the regressor effects on the expectiles of the response distribution. The ERFE model captures the data heteroscedasticity and eliminates any bias resulting from the correlation between the regressors and the omitted factors. We derive the asymptotic properties of the ERFE estimators and suggest robust estimators of its covariance matrix. Our simulations show that the ERFE estimator is unbiased and outperforms its competitors. Our real data analysis shows its ability to capture data heteroscedasticity (see our R package at github.com/AmBarry/erfe). |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.04737&r= |
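A minimal sketch of the ERFE idea: demean within panel units, then solve the asymmetric least squares problem by iteratively reweighted least squares. Demeaning before (rather than jointly with) the asymmetric step is a simplification of the paper's within-transformation strategy; the authors' R package linked above is the reference implementation.

```python
import numpy as np

def erfe(y, X, ids, tau=0.75, n_iter=200, tol=1e-9):
    """Expectile regression with fixed effects (simplified sketch).

    y: (n,) outcome, X: (n, p) regressors, ids: (n,) panel unit labels,
    tau: asymmetry level (tau = 0.5 reduces to the usual within estimator).
    """
    def demean(v):
        out = np.asarray(v, dtype=float).copy()
        for g in np.unique(ids):
            m = ids == g
            out[m] -= out[m].mean(axis=0)   # within transformation per unit
        return out

    yd, Xd = demean(y), demean(X)
    beta = np.linalg.lstsq(Xd, yd, rcond=None)[0]
    for _ in range(n_iter):
        w = np.where(yd - Xd @ beta >= 0, tau, 1.0 - tau)  # asymmetric weights
        WX = Xd * w[:, None]
        beta_new = np.linalg.solve(Xd.T @ WX, WX.T @ yd)   # weighted LS step
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```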
By: | Nicolas Debarsy (LEM - Lille économie management - UMR 9221 - UA - Université d'Artois - UCL - Université catholique de Lille - Université de Lille - CNRS - Centre National de la Recherche Scientifique); James Lesage (Texas State University) |
Abstract: | There is a great deal of literature regarding the use of non-geographically based connectivity matrices, or combinations of geographic and non-geographic structures, in spatial econometric models. We focus on convex combinations of weight matrices that result in a single weight matrix reflecting multiple types of connectivity, where the coefficients of the convex combination can be used for inference regarding the relative importance of each type of connectivity. This type of model specification raises the question of which connectivity matrices should be used and which should be ignored. For example, in the case of L candidate weight matrices, there are M = 2^L - L - 1 possible ways to employ two or more of the L weight matrices in alternative model specifications. When L = 5, we have M = 26 possible models involving two or more weight matrices, and for L = 10, M = 1,013. We use Metropolis-Hastings guided Monte Carlo integration during MCMC estimation of the models to produce log-marginal likelihoods and associated posterior model probabilities for the set of M possible models, which allows for Bayesian model-averaged estimates. We focus on MCMC estimation for a set of M models, estimates of posterior model probabilities, model-averaged estimates of the parameters, scalar summary measures of the non-linear partial derivative impacts, and associated empirical measures of dispersion for the impacts. |
Keywords: | cross-sectional dependence, SAR, block sampling parameters for a convex combination, Markov Chain Monte Carlo estimation, hedonic price model |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:hal-03046651&r= |
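Two small helpers make the combinatorics above concrete: forming a single connectivity matrix from a convex combination, and enumerating the M = 2^L - L - 1 candidate specifications (the counts for L = 5 and L = 10 match the abstract). The MCMC and model-averaging machinery is not sketched here.

```python
import numpy as np
from itertools import combinations

def convex_combination(weight_mats, gammas):
    """Single weight matrix from a convex combination of candidates."""
    g = np.asarray(gammas, dtype=float)
    assert np.all(g >= 0) and np.isclose(g.sum(), 1.0)
    return sum(gi * Wi for gi, Wi in zip(g, weight_mats))

def candidate_models(L):
    """All subsets of two or more of L candidate matrices: M = 2^L - L - 1."""
    models = [c for r in range(2, L + 1) for c in combinations(range(L), r)]
    assert len(models) == 2 ** L - L - 1
    return models

# candidate_models(5) has 26 elements; candidate_models(10) has 1013.
```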
By: | Bastian Schäfer (Paderborn University); Yuanhua Feng (Paderborn University) |
Abstract: | This paper examines data-driven estimation of the mean surface in nonparametric regression for huge functional time series. In this framework, we consider the use of the double conditional smoothing (DCS), an equivalent but much faster translation of the 2D-kernel regression. An even faster, but again equivalent, functional DCS (FCDS) scheme and a boundary correction method for the DCS/FCDS are proposed. The asymptotically optimal bandwidths are obtained and selected by an IPI (iterative plug-in) algorithm. We show that the IPI algorithm works well in practice in a simulation study and apply the proposals to estimate the spot-volatility and trading-volume surface in high-frequency financial data under a functional representation. Our proposals also apply to large lattice spatial or spatial-temporal data from any research area. |
Keywords: | Spatial nonparametric regression, boundary correction, functional double conditional smoothing, bandwidth selection, spot volatility surface |
JEL: | C14 C51 |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:pdn:ciepap:143&r= |
By: | Zhaonan Qu; Ruoxuan Xiong; Jizhou Liu; Guido Imbens |
Abstract: | In many observational studies in social science and medical applications, subjects or individuals are connected, and one unit's treatment and attributes may affect another unit's treatment and outcome, violating the stable unit treatment value assumption (SUTVA) and resulting in interference. To enable feasible inference, many previous works assume the "exchangeability" of interfering units, under which the effect of interference is captured by the number or ratio of treated neighbors. However, in many applications with distinctive units, interference is heterogeneous. In this paper, we focus on the partial interference setting, and restrict units to be exchangeable conditional on observable characteristics. Under this framework, we propose generalized augmented inverse propensity weighted (AIPW) estimators for general causal estimands that include direct treatment effects and spillover effects. We show that they are consistent, asymptotically normal, semiparametric efficient, and robust to heterogeneous interference as well as model misspecifications. We also apply our method to the Add Health dataset and find that smoking behavior exhibits interference on academic outcomes. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.12420&r= |
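For reference, the classical AIPW building block that the paper generalizes to exposure-specific contrasts under partial interference; this textbook form ignores interference entirely and is shown only to fix ideas.

```python
import numpy as np

def aipw_ate(y, d, ps, mu1, mu0):
    """Textbook AIPW estimate of an average treatment effect.

    y: outcomes, d: binary treatment, ps: estimated propensity scores,
    mu1, mu0: estimated outcome regressions under treatment and control.
    The paper's generalized estimators replace the single treatment
    indicator with (own treatment, neighborhood exposure) pairs.
    """
    psi = mu1 - mu0 + d * (y - mu1) / ps - (1 - d) * (y - mu0) / (1 - ps)
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(y))
```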
By: | Igor L. Kheifets; Peter C. B. Phillips |
Abstract: | Multicointegration is traditionally defined as a particular long run relationship among variables in a parametric vector autoregressive model that introduces additional cointegrating links between these variables and partial sums of the equilibrium errors. This paper departs from the parametric model, using a semiparametric formulation that reveals the explicit role that singularity of the long run conditional covariance matrix plays in determining multicointegration. The semiparametric framework has the advantage that short run dynamics do not need to be modeled and estimation by standard techniques such as fully modified least squares (FM-OLS) on the original I(1) system is straightforward. The paper derives FM-OLS limit theory in the multicointegrated setting, showing how faster rates of convergence are achieved in the direction of singularity and that the limit distribution depends on the distribution of the conditional one-sided long run covariance estimator used in FM-OLS estimation. Wald tests of restrictions on the regression coefficients have nonstandard limit theory which depends on nuisance parameters in general. The usual tests are shown to be conservative when the restrictions are isolated to the directions of singularity and, under certain conditions, are invariant to singularity otherwise. Simulations show that approximations derived in the paper work well in finite samples. The findings are illustrated empirically in an analysis of fiscal sustainability of the US government over the post-war period. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.03486&r= |
By: | Subhadeep Mukhopadhyay |
Abstract: | This article introduces a general statistical modeling principle called "Density Sharpening" and applies it to the analysis of discrete count data. The underlying foundation is based on a new theory of nonparametric approximation and smoothing methods for discrete distributions which play a useful role in explaining and uniting a large class of applied statistical methods. The proposed modeling framework is illustrated using several real applications, from seismology to healthcare to physics. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.07372&r= |
By: | Zheng Fang |
Abstract: | This paper makes the following econometric contributions. First, we develop a unifying framework for testing shape restrictions based on the Wald principle. Second, we examine the applicability and usefulness of some prominent shape-enforcing operators in implementing our test, including rearrangement and the greatest convex minorization (or the least concave majorization). In particular, the influential rearrangement operator is inapplicable due to a lack of convexity, while the greatest convex minorization is shown to enjoy the analytic properties required to employ our framework. The importance of convexity in establishing size control has been noted elsewhere in the literature. Third, we show that, even though the projection operator may not be well defined or well behaved in general non-Hilbert parameter spaces (e.g., ones defined by uniform norms), one may nonetheless devise a powerful distance-based test by applying our framework. The finite sample performance of our test is evaluated through Monte Carlo simulations, and its empirical relevance is showcased by investigating the relationship between weekly working hours and annual wage growth in the high-end labor market. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.12494&r= |
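The two operators discussed above are easy to state on a grid: rearrangement sorts the estimated function values, while the greatest convex minorant is the lower convex hull of the graph. A sketch assuming a one-dimensional ordered grid (the paper works in far greater generality):

```python
import numpy as np

def rearrangement(f_vals):
    """Monotone (increasing) rearrangement of function values on a grid."""
    return np.sort(f_vals)

def greatest_convex_minorant(x, f_vals):
    """Greatest convex function below the points (x, f_vals), evaluated at x.

    Computed as the lower convex hull of the graph (monotone-chain style).
    """
    pts = sorted(zip(x, f_vals))
    hull = []
    for p in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle point if it lies on or above the chord
            if (y2 - y1) * (p[0] - x1) >= (p[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    hx, hy = zip(*hull)
    return np.interp(x, hx, hy)
```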
By: | Michael Keane (School of Economics); Timothy Neal (UNSW School of Economics) |
Abstract: | We provide a simple survey of the weak instrument literature, aimed at giving practical advice to applied researchers. It is well known that 2SLS has poor properties if instruments are exogenous but weak. We clarify these properties, explain weak instrument tests, and examine how the behavior of 2SLS depends on instrument strength. A common standard for acceptable instruments is a first-stage F-statistic of at least 10. But 2SLS has poor properties in that context: it has very little power, and it generates artificially low standard errors precisely in those samples where it generates estimates most contaminated by endogeneity. This causes standard t-tests to give misleading results. In fact, one-tailed 2SLS t-tests suffer from severe size distortions unless F is in the thousands. Anderson-Rubin and conditional t-tests alleviate this problem, and should be used even with strong instruments. A first-stage F of 50 or more is necessary to give reasonable confidence that 2SLS will outperform OLS. Otherwise, OLS combined with controls for sources of endogeneity may be a superior research strategy to IV. |
Keywords: | Instrumental variables, weak instruments, 2SLS, endogeneity, F-test, size distortions of tests, Anderson-Rubin test, conditional t-test, Fuller, JIVE |
Date: | 2021–06 |
URL: | http://d.repec.org/n?u=RePEc:swe:wpaper:2021-05a&r= |
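A minimal Anderson-Rubin test of the kind recommended above, for one endogenous regressor and no included exogenous controls (controls would be partialled out of y, d, and Z first, which this sketch omits). Inverting the test over a grid of beta0 values gives a confidence set that remains valid however weak the instruments are.

```python
import numpy as np
from scipy import stats

def anderson_rubin_test(y, d, Z, beta0):
    """AR test of H0: beta = beta0 in y = d * beta + u, instruments Z (n x k)."""
    n, k = Z.shape
    e = y - d * beta0                          # structural residual under H0
    b = np.linalg.solve(Z.T @ Z, Z.T @ e)      # regress residual on instruments
    ess = e @ (Z @ b)                          # explained sum of squares
    rss = e @ e - ess
    F = (ess / k) / (rss / (n - k))            # joint F that all coefs are zero
    pval = 1.0 - stats.f.cdf(F, k, n - k)
    return F, pval
```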
By: | Yuanhua Feng (Paderborn University); Bastian Schäfer (Paderborn University) |
Abstract: | This paper discusses the suitable choice of the weighting function at a boundary point in local polynomial regression and introduces two new boundary modification methods by adapting known ideas for generating boundary kernels. Continuous estimates at the endpoints are now achievable. Under given conditions, the use of those quite different weighting functions at an interior point is equivalent. At a boundary point, the use of different methods will lead to different estimates. It is also shown that the optimal weighting function at the endpoints is a natural extension of one of the optimal weighting functions in the interior. Furthermore, it is shown that the most well-known boundary kernels proposed in the literature can be generated by local polynomial regression using corresponding weighting functions. The proposals are particularly useful when one-sided smoothing or detection of change points in nonparametric regression is considered. |
Keywords: | Local polynomial regression, equivalent weighting methods, boundary modification, boundary kernels, finite sample property |
JEL: | C14 C51 |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:pdn:ciepap:144&r= |
By: | Serena Ng |
Abstract: | The coronavirus is a global event of historical proportions, and in just a few months it changed the time series properties of the data in ways that make many pre-covid forecasting models inadequate. It also creates a new problem for estimation of economic factors and dynamic causal effects, because the variations around the outbreak can be interpreted as outliers, as shifts to the distribution of existing shocks, or as the addition of new shocks. I take the latter view and use covid indicators as controls to 'de-covid' the data prior to estimation. I find that economic uncertainty remains high at the end of 2020 even though real economic activity has recovered and covid uncertainty has receded. Dynamic responses of variables to shocks in a VAR similar in magnitude and shape to the ones identified before 2020 can be recovered by directly or indirectly modeling covid and treating it as exogenous. These responses to economic shocks are distinctly different from those to a covid shock, and distinguishing between the two types of shocks can be important in macroeconomic modeling post-covid. |
JEL: | C18 E0 E32 |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:29060&r= |
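The 'de-covid' step described above amounts to partialling covid indicators out of the series before factor or VAR estimation. A minimal sketch, assuming the indicators enter linearly as exogenous controls:

```python
import numpy as np

def de_covid(Y, C):
    """Residualize a T x k data matrix Y on covid indicators C (T x m).

    Returns Y minus its least-squares projection on [constant, C],
    to be used in place of Y in downstream factor or VAR estimation.
    """
    T = Y.shape[0]
    X = np.column_stack([np.ones(T), C])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ coef
```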
By: | Muhammed Taher Al-Mudafer; Benjamin Avanzi; Greg Taylor; Bernard Wong |
Abstract: | Neural networks offer a versatile, flexible and accurate approach to loss reserving. However, such applications have focused primarily on the (important) problem of fitting accurate central estimates of the outstanding claims. In practice, properties regarding the variability of outstanding claims are equally important (e.g., quantiles for regulatory purposes). In this paper we fill this gap by applying a Mixture Density Network ("MDN") to loss reserving. The approach combines a neural network architecture with a mixture Gaussian distribution to achieve simultaneously an accurate central estimate along with flexible distributional choice. Model fitting is done using a rolling-origin approach. Our approach consistently outperforms the classical cross-classified over-dispersed Poisson (ccODP) model, both for central estimates and for quantiles of interest, when applied to a wide range of simulated environments of various complexity and specifications. We further propose two extensions of the MDN approach. Firstly, we present a hybrid GLM-MDN approach called "ResMDN". This hybrid approach balances the tractability and ease of understanding of a traditional GLM model on one hand with the additional accuracy and distributional flexibility provided by the MDN on the other. We show that it can successfully improve on the errors of the baseline ccODP, although there is generally a loss of performance when compared to the MDN in the examples we considered. Secondly, we allow for explicit projection constraints, so that actuarial judgement can be directly incorporated in the modelling process. Throughout, we focus on aggregate loss triangles, and show that our methodologies are tractable and that they outperform traditional approaches even with relatively limited amounts of data. We use both simulated data (to validate properties) and real data (to illustrate and ascertain the practicality of the approaches). |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.07924&r= |
By: | Emmanouil Sfendourakis; Ioane Muni Toke |
Abstract: | A point process model for order flows in limit order books is proposed, in which the conditional intensity is the product of a Hawkes component and a state-dependent factor. In the LOB context, state observations may include the observed imbalance or the observed spread. Full technical details for the computationally efficient estimation of such a process are provided, using either direct likelihood maximization or EM-type estimation. Applications include models for bid and ask market orders, or for upward and downward price movements. Empirical results on multiple stocks traded on Euronext Paris underline the benefits of state-dependent formulations for LOB modeling, e.g. in terms of goodness-of-fit to financial data. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.12872&r= |
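The product-form conditional intensity described above can be written down directly. A sketch assuming an exponential Hawkes kernel and a user-supplied state factor (e.g., a function of the observed imbalance or spread bucket); the likelihood and EM estimation details are not reproduced.

```python
import numpy as np

def intensity(t, event_times, mu, alpha, beta, state_factor):
    """State-dependent Hawkes intensity:
    lambda(t) = state_factor(t) * (mu + sum_i alpha * exp(-beta * (t - t_i)))
    over past event times t_i < t.
    """
    past = event_times[event_times < t]
    hawkes = mu + np.sum(alpha * np.exp(-beta * (t - past)))
    return state_factor(t) * hawkes
```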
By: | Jing Tian; Jan P.A.M. Jacobs; Denise R. Osborn |
Abstract: | Multivariate analysis can help to focus on economic phenomena, including trend and cyclical movements. To allow for potential correlation with seasonality, the present paper studies a three component multivariate unobserved component model, focusing on the case of quarterly data and showing that economic restrictions, including common trends and common cycles, can ensure identification. Applied to seasonal aggregate gender employment in Australia, a bivariate male/female model with a common cycle is preferred to both univariate correlated component and bivariate uncorrelated component specifications. This model evidences distinct gender-based seasonal patterns with seasonality declining over time for females and increasing for males. |
Keywords: | trend-cycle-seasonal decomposition, multivariate unobserved components models, correlated component models, identification, gender employment, Australia |
JEL: | C22 E24 E32 E37 F01 |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:een:camaaa:2021-72&r= |
By: | Arie Beresteanu |
Abstract: | We provide a sharp identification region for discrete choice models in which consumers' preferences are not necessarily complete and only aggregate choice data is available to the analyst. Behavior under non-complete preferences is modeled using an upper and a lower utility for each alternative, so that non-comparability can arise. The identification region places intuitive bounds on the probability distribution of upper and lower utilities. We show that the existence of an instrumental variable can be used to reject the hypothesis that all consumers' preferences are complete, while attention sets can be used to rule out the hypothesis that no individual can compare any two alternatives. We apply our methods to data from the 2018 mid-term elections in Ohio. |
Date: | 2021–01 |
URL: | http://d.repec.org/n?u=RePEc:pit:wpaper:7145&r= |