on Econometrics |
By: | Matteo Barigozzi; Giuseppe Cavaliere; Lorenzo Trapani |
Abstract: | We study inference on the common stochastic trends in a non-stationary, $N$-variate time series $y_{t}$, in the possible presence of heavy tails. We propose a novel methodology which does not require any knowledge or estimation of the tail index, or even knowledge as to whether certain moments (such as the variance) exist or not, and develop an estimator of the number of stochastic trends $m$ based on the eigenvalues of the sample second moment matrix of $y_{t}$. We study the rates of such eigenvalues, showing that the first $m$ ones diverge, as the sample size $T$ passes to infinity, at a rate faster by $O\left(T \right)$ than the remaining $N-m$ ones, irrespective of the tail index. We thus exploit this eigen-gap by constructing, for each eigenvalue, a test statistic which diverges to positive infinity or drifts to zero according to whether the relevant eigenvalue belongs to the set of the first $m$ eigenvalues or not. We then construct a randomised statistic based on this, using it as part of a sequential testing procedure, ensuring consistency of the resulting estimator of $m$. We also discuss an estimator of the common trends based on principal components and show that, up to an invertible linear transformation, this estimator is consistent in the sense that the estimation error is of smaller order than the trend itself. Finally, we also consider the case in which we relax the standard assumption of \textit{i.i.d.} innovations by allowing for heterogeneity of a very general form in the scale of the innovations. A Monte Carlo study shows that the proposed estimator for $m$ performs particularly well, even in samples of small size. We complete the paper by presenting four illustrative applications covering commodity prices, interest rates data, long run PPP and cryptocurrency markets. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.13894&r= |
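The following toy sketch (not from the paper) visualises the eigen-gap that the abstract above exploits: the first $m$ eigenvalues of the sample second moment matrix of an $N$-variate series driven by $m$ random-walk trends dominate the remaining ones. The randomised test statistic and the sequential testing procedure of the paper are not reproduced; all numbers are illustrative.

```python
# Visualise the eigen-gap: with m common random-walk trends, the first m
# eigenvalues of the sample second moment matrix grow faster (by roughly a
# factor of order T) than the remaining N - m. Not the authors' estimator.
import numpy as np

rng = np.random.default_rng(0)
N, m, T = 10, 3, 2000

# m common random-walk trends loaded onto N series, plus stationary noise
trends = np.cumsum(rng.standard_normal((T, m)), axis=0)
loadings = rng.standard_normal((N, m))
y = trends @ loadings.T + rng.standard_normal((T, N))

# sample second moment matrix and its ordered eigenvalues
S = y.T @ y / T
eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]
print(np.round(eigvals, 1))  # the first m entries dominate the rest
```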
By: | Budhi Surya |
Abstract: | Although it has been well accepted that the asymptotic covariance matrix of maximum likelihood estimates (MLE) for complete data is given by the inverse Fisher information, this paper shows that when the MLE for incomplete data is derived using the EM algorithm, the asymptotic covariance matrix is, however, no longer specified by the inverse Fisher information. In general, the new information is smaller than the latter in the sense of Loewner partial ordering. A sandwich estimator of the covariance matrix is developed based on the observed information of the incomplete data and a consistent estimator of the complete-data information matrix. The observed information simplifies the calculation of the conditional expectation of the outer product of the complete-data score function that appears in the Louis (1982) general matrix formula. The proposed sandwich estimator takes a different form than the Huber sandwich estimator under the model misspecification framework (Freedman, 2006 and Little and Rubin, 2020). Moreover, it does not involve the inverse observed Fisher information of the incomplete data, which is an appealing feature for applications. Recursive algorithms for the MLE and the sandwich estimator of the covariance matrix are presented. Application to parameter estimation of a regime-switching conditional Markov jump process is considered to verify the results. The simulation study confirms that the MLEs are accurate, consistent, and asymptotically normal. The sandwich estimator produces standard errors of the MLE which are closer to their analytic values than those provided by the inverse observed Fisher information. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.01243&r= |
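As background for the abstract above, the snippet below shows the generic "bread-meat-bread" sandwich covariance computation. The paper's estimator, which combines the observed information of the incomplete data with a consistent estimate of the complete-data information and differs from the Huber sandwich, is not reproduced; the matrices here are purely hypothetical.

```python
# Generic sandwich covariance pattern: A^{-1} B A^{-T} / n.
# Only an illustration of the sandwich idea, not the paper's estimator.
import numpy as np

def sandwich_cov(bread, meat, n):
    """bread: information-type matrix A; meat: variance of the scores B."""
    A_inv = np.linalg.inv(bread)
    return A_inv @ meat @ A_inv.T / n

A = np.array([[2.0, 0.3], [0.3, 1.5]])   # hypothetical information-type matrix
B = np.array([[2.2, 0.1], [0.1, 1.4]])   # hypothetical outer-product-of-scores matrix
print(sandwich_cov(A, B, n=500))
```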
By: | Pesaran, M. H.; Xie, Y. |
Abstract: | In a recent paper, Juodis and Reese (2021) (JR) show that the application of the CD test proposed by Pesaran (2004) to residuals from panels with latent factors results in over-rejection; they propose a randomized test statistic to correct for over-rejection and add a screening component to achieve power. This paper considers the same problem but from a different perspective and shows that the standard CD test remains valid if the latent factors are weak, and proposes a simple bias-corrected CD test, labelled CD*, which is shown to be asymptotically normal irrespective of whether the latent factors are weak or strong. This result holds for pure latent factor models as well as for panel regressions with latent factors. Small sample properties of the CD* test are investigated by Monte Carlo experiments, which show that it has the correct size and satisfactory power for both Gaussian and non-Gaussian errors. In contrast, JR's test is found to over-reject in the case of panels with non-Gaussian errors and to have low power against spatial network alternatives. The use of the CD* test is illustrated with two empirical applications from the literature. |
Keywords: | Latent factor models, strong and weak factors, error cross-sectional dependence, spatial and network alternatives |
JEL: | C18 C23 C55 |
Date: | 2021–08–05 |
URL: | http://d.repec.org/n?u=RePEc:cam:camdae:2158&r= |
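For reference, a minimal implementation of the standard CD statistic of Pesaran (2004) that the paper above starts from is sketched below; the bias-corrected CD* statistic proposed in the paper is not reproduced here.

```python
# Standard CD statistic: sqrt(2T/(N(N-1))) times the sum of all pairwise
# residual correlations. Approximately N(0,1) under cross-sectional independence.
import numpy as np

def cd_statistic(residuals):
    """residuals: T x N array of panel residuals (T periods, N units)."""
    T, N = residuals.shape
    corr = np.corrcoef(residuals, rowvar=False)       # N x N pairwise correlations
    iu = np.triu_indices(N, k=1)                      # i < j pairs
    return np.sqrt(2.0 * T / (N * (N - 1))) * corr[iu].sum()

rng = np.random.default_rng(1)
print(cd_statistic(rng.standard_normal((200, 30))))   # close to 0 under independence
```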
By: | Jason R. Blevins; Minhae Kim |
Abstract: | We introduce a sequential estimator for continuous time dynamic discrete choice models (single-agent models and games) by adapting the nested pseudo likelihood (NPL) estimator of Aguirregabiria and Mira (2002, 2007), developed for discrete time models with discrete time data, to the continuous time case with data sampled either discretely (i.e., uniformly-spaced snapshot data) or continuously. We establish conditions for consistency and asymptotic normality of the estimator, a local convergence condition, and, for single agent models, a zero Jacobian property assuring local convergence. We carry out a series of Monte Carlo experiments using an entry-exit game with five heterogeneous firms to confirm the large-sample properties and demonstrate finite-sample bias reduction via iteration. In our simulations we show that the convergence issues documented for the NPL estimator in discrete time models are less likely to affect comparable continuous-time models. We also show that there can be large bias in economically-relevant parameters, such as the competitive effect and entry cost, from estimating a misspecified discrete time model when in fact the data generating process is a continuous time model. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.02182&r= |
By: | Li, Erqian; Härdle, Wolfgang; Dai, Xiaowen; Tian, Maozai |
Abstract: | The proportional subdistribution hazards (PSH) model is popularly used to deal with competing risks data. Censored quantile regression provides an important supplement, and variable selection methods are needed because of the large number of irrelevant covariates encountered in practice. In this paper, we study variable selection procedures based on penalized weighted quantile regression for competing risks models, which can be conveniently applied by researchers. Asymptotic properties of the proposed estimators are established, including consistency and asymptotic normality of the non-penalized estimator and consistency of variable selection. Monte Carlo simulation studies are conducted, showing that the proposed methods are considerably stable and efficient. A real data set on bone marrow transplants (BMT) is also analyzed to illustrate the application of the proposed procedure. |
Keywords: | Competing risks,Cumulative incidence function,Kaplan-Meier estimator,Redistribution method |
JEL: | C00 |
Date: | 2021 |
URL: | http://d.repec.org/n?u=RePEc:zbw:irtgdp:2021013&r= |
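A hedged sketch of the basic building block described above, penalized weighted quantile regression fitted by minimizing the check (pinball) loss plus an L1 penalty, is given below. The competing-risks weighting based on the cumulative incidence function and the redistribution method is not included, and the helper names are illustrative.

```python
# Penalized weighted quantile regression via direct minimization of the
# check loss plus an L1 penalty (intercept unpenalized). Illustrative only.
import numpy as np
from scipy.optimize import minimize

def penalized_qr(X, y, tau=0.5, lam=0.1, weights=None):
    n, p = X.shape
    w = np.ones(n) if weights is None else weights
    def objective(beta):
        u = y - X @ beta
        check = np.sum(w * u * (tau - (u < 0)))             # pinball loss
        return check / n + lam * np.sum(np.abs(beta[1:]))   # skip intercept penalty
    return minimize(objective, np.zeros(p), method="Powell").x

rng = np.random.default_rng(8)
X = np.column_stack([np.ones(300), rng.standard_normal((300, 5))])
y = X @ np.array([1.0, 2.0, 0.0, 0.0, -1.5, 0.0]) + rng.standard_normal(300)
print(np.round(penalized_qr(X, y, tau=0.5, lam=0.05), 2))   # sparse-ish estimates
```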
By: | Chang, Jinyuan; Kolaczyk, Eric D.; Yao, Qiwei |
Abstract: | While it is common practice in applied network analysis to report various standard network summary statistics, these numbers are rarely accompanied by uncertainty quantification. Yet any error inherent in the measurements underlying the construction of the network, or in the network construction procedure itself, necessarily must propagate to any summary statistics reported. Here we study the problem of estimating the density of an arbitrary subgraph, given a noisy version of some underlying network as data. Under a simple model of network error, we show that consistent estimation of such densities is impossible when the rates of error are unknown and only a single network is observed. Accordingly, we develop method-of-moment estimators of network subgraph densities and error rates for the case where a minimal number of network replicates are available. These estimators are shown to be asymptotically normal as the number of vertices increases to infinity. We also provide confidence intervals for quantifying the uncertainty in these estimates based on the asymptotic normality. To construct the confidence intervals, a new and nonstandard bootstrap method is proposed to compute asymptotic variances, which is infeasible otherwise. We illustrate the proposed methods in the context of gene coexpression networks. Supplementary materials for this article are available online. |
Keywords: | bootstrap; edge density; graph; method of moments; triangles; two-stars |
JEL: | C1 |
Date: | 2020–07–20 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:104684&r= |
By: | Sungwon Lee |
Abstract: | This paper considers identification and inference for the distribution of treatment effects conditional on a set of observable covariates. Since the conditional distribution of treatment effects is not point identified without strong assumptions on the joint distribution of potential outcomes, we obtain bounds on the conditional distribution of treatment effects by using the Fr\'echet-Hoeffding bounds. We also consider the case where the treatment is endogenous and propose two stochastic dominance assumptions that are consistent with many economic theories to tighten the bounds. We develop a nonparametric framework to estimate the bounds on the conditional distribution of treatment effects and establish the asymptotic theory for uniform inference over the support of treatment effects. Two empirical examples are presented to illustrate the usefulness of the methods, with a focus on heterogeneous treatment effects. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.00723&r= |
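The snippet below sketches the unconditional Makarov-type bounds on the distribution of treatment effects implied by the Fréchet-Hoeffding bounds that the abstract above builds on; the paper's conditional (on covariates) bounds, stochastic-dominance tightenings and uniform inference procedures are not reproduced.

```python
# Makarov-type bounds on P(Y1 - Y0 <= delta) from the two marginal samples,
# without any assumption on the joint distribution of potential outcomes.
import numpy as np

def makarov_bounds(y1, y0, delta, grid):
    """Lower/upper bounds on the CDF of the treatment effect at delta."""
    F1 = np.array([np.mean(y1 <= y) for y in grid])
    F0 = np.array([np.mean(y0 <= y - delta) for y in grid])
    lower = max(np.max(F1 - F0), 0.0)
    upper = 1.0 + min(np.min(F1 - F0), 0.0)
    return lower, upper

rng = np.random.default_rng(2)
y0 = rng.normal(0.0, 1.0, 5000)
y1 = rng.normal(1.0, 1.0, 5000)            # marginals only; joint law unknown
grid = np.linspace(-5, 6, 400)
print(makarov_bounds(y1, y0, delta=1.0, grid=grid))
```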
By: | Dmitry Arkhangelsky; Guido W. Imbens; Lihua Lei; Xiaoman Luo |
Abstract: | We propose a new estimator for the average causal effects of a binary treatment with panel data in settings with general treatment patterns. Our approach augments the two-way-fixed-effects specification with unit-specific weights that arise from a model for the assignment mechanism. We show how to construct these weights in various settings, including situations where units opt into the treatment sequentially. The resulting estimator converges to an average (over units and time) treatment effect under the correct specification of the assignment model. We show that our estimator is more robust than the conventional two-way estimator: it remains consistent if either the assignment mechanism or the two-way regression model is correctly specified, and it performs better than the two-way-fixed-effects estimator if both are locally misspecified. This strong double robustness property quantifies the benefits from modeling the assignment process and motivates using our estimator in practice. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.13737&r= |
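To fix ideas, the sketch below shows the general shape of the estimator described above: a two-way fixed-effects regression augmented with weights. How the weights are constructed from the assignment model is the paper's contribution and is not reproduced; the weights below are placeholders.

```python
# Weighted two-way fixed-effects regression. The weights here are random
# placeholders; the paper derives unit-specific weights from an assignment model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
units, periods = 50, 10
df = pd.DataFrame([(i, t) for i in range(units) for t in range(periods)],
                  columns=["unit", "time"])
df["treated"] = (rng.random(len(df)) < 0.3).astype(int)
df["y"] = 2.0 * df["treated"] + rng.standard_normal(len(df))
df["w"] = rng.uniform(0.5, 1.5, len(df))    # placeholder weights

fit = smf.wls("y ~ treated + C(unit) + C(time)", data=df, weights=df["w"]).fit()
print(fit.params["treated"])                # weighted TWFE treatment coefficient
```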
By: | Harold D Chiang; Yukitoshi Matsushita; Taisuke Otsu |
Abstract: | This paper develops a general methodology to conduct statistical inference for observations indexed by multiple sets of entities. We propose a novel multiway empirical likelihood statistic that converges to a chi-square distribution under the non-degenerate case, where the corresponding Hoeffding-type decomposition is dominated by linear terms. Our methodology is related to the notion of jackknife empirical likelihood, but the leave-out pseudo values are constructed by leaving out columns or rows. We further develop a modified version of our multiway empirical likelihood statistic, which converges to a chi-square distribution regardless of the degeneracy, and discover its desirable higher-order property compared to the t-ratio based on the conventional Eicker-White type variance estimator. The proposed methodology is illustrated by several important statistical problems, such as bipartite networks, two-stage sampling, generalized estimating equations, and three-way observations. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.04852&r= |
By: | Masahiro Kato; Haruo Kakehi; Kenichiro McAlinn; Shota Yasui |
Abstract: | We consider learning causal relationships under conditional moment conditions. Unlike causal inference under unconditional moment conditions, conditional moment conditions pose serious challenges for causal inference, especially in complex, high-dimensional settings. To address this issue, we propose a method that transforms conditional moment conditions into unconditional moment conditions through importance weighting using the conditional density ratio. Using this transformation, we then propose a method that successfully approximates conditional moment conditions. Our proposed approach allows us to employ methods for estimating causal parameters from unconditional moment conditions, such as the generalized method of moments, in a straightforward manner. In experiments, we confirm that our proposed method performs well compared to existing methods. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.01312&r= |
By: | Daniel R. Kowal |
Abstract: | Functional data are frequently accompanied by parametric templates that describe the typical shapes of the functions. Although the templates incorporate critical domain knowledge, parametric functional data models can incur significant bias, which undermines the usefulness and interpretability of these models. To correct for model misspecification, we augment the parametric templates with an infinite-dimensional nonparametric functional basis. Crucially, the nonparametric factors are regularized with an ordered spike-and-slab prior, which implicitly provides rank selection and satisfies several appealing theoretical properties. This prior is accompanied by a parameter-expansion scheme customized to boost MCMC efficiency, and is broadly applicable for Bayesian factor models. The nonparametric basis functions are learned from the data, yet constrained to be orthogonal to the parametric template in order to preserve distinctness between the parametric and nonparametric terms. The versatility of the proposed approach is illustrated through applications to synthetic data, human motor control data, and dynamic yield curve data. Relative to parametric alternatives, the proposed semiparametric functional factor model eliminates bias, reduces excessive posterior and predictive uncertainty, and provides reliable inference on the effective number of nonparametric terms--all with minimal additional computational costs. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.02151&r= |
By: | Hidalgo, Javier |
Abstract: | The aim of the paper is to describe a bootstrap which, in contrast to the sieve bootstrap, is valid under either long memory (LM) or short memory (SM) dependence. One of the reasons for the failure of the sieve bootstrap in our context is that under LM dependence, the sieve bootstrap may not be able to capture the true covariance structure of the original data. We also describe and examine the validity of the bootstrap scheme for the least squares estimator of the parameter in a regression model and for model specification. The motivation for the latter example comes from the observation that the asymptotic distribution of the test is intractable. |
Keywords: | Long memory; bootstrap methods; aggregation; semiparametric model |
JEL: | J1 C1 |
Date: | 2020–07–21 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:106149&r= |
By: | Li Li; Yanfei Kang; Feng Li |
Abstract: | In this work, we propose a novel framework for density forecast combination that constructs time-varying weights based on time series features, called Feature-based Bayesian Forecasting Model Averaging (FEBAMA). Our framework estimates the weights in the forecast combination via Bayesian log predictive scores, so that the optimal forecast combination is determined by time series features computed from historical information. In particular, we use an automatic Bayesian variable selection method to weigh the importance of the different features. As a result, our approach has better interpretability compared to other black-box forecast combination schemes. We apply our framework to stock market data and the M3 competition data. Based on our structure, a simple maximum-a-posteriori scheme outperforms benchmark methods, and Bayesian variable selection can further enhance the accuracy of both point and density forecasts. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.02082&r= |
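The toy sketch below illustrates feature-based density-forecast pooling in the spirit of the framework described above: combination weights over candidate models are a softmax of time-series features, and the pool is evaluated by its log predictive score. The Bayesian estimation and variable-selection machinery of the paper is not reproduced, and all names and numbers are illustrative.

```python
# Feature-driven pooling of Gaussian predictive densities, scored by the
# log predictive density of the pool. Purely illustrative.
import numpy as np
from scipy import stats

def log_score(y, features, Beta, means, sds):
    """features: (q,) features of the series; Beta: (K, q), one row per model;
    means, sds: the K candidate models' Gaussian forecasts."""
    w = np.exp(Beta @ features)
    w /= w.sum()                                        # softmax weights over models
    dens = sum(wk * stats.norm.pdf(y, m, s) for wk, m, s in zip(w, means, sds))
    return np.log(dens)

features = np.array([0.8, -0.1])                        # e.g. trend and entropy features
Beta = np.array([[0.2, 1.0], [-0.3, 0.5]])              # two candidate models
print(log_score(0.3, features, Beta, means=[0.0, 0.5], sds=[1.0, 0.8]))
```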
By: | Katsikatsou, Myrsini; Moustaki, Irini; Md Jamil, Haziq |
Abstract: | Methods for the treatment of item non-response in attitudinal scales and in large-scale assessments under the pairwise likelihood (PL) estimation framework and under a missing at random (MAR) mechanism are proposed. Under a full information likelihood estimation framework and MAR, ignorability of the missing data mechanism does not lead to biased estimates. However, this is not the case for pseudo-likelihood approaches such as the PL. We develop and study the performance of three strategies for incorporating missing values into confirmatory factor analysis (CFA) under the PL framework, the complete-pairs (CP), the available-cases (AC) and the doubly robust (DR) approaches. The CP and AC require only a model for the observed data and standard errors are easy to compute. Doubly-robust versions of the PL estimation require a predictive model for the missing responses given the observed ones and are computationally more demanding than the AC and CP. A simulation study is used to compare the proposed methods. The proposed methods are employed to analyze the UK data on numeracy and literacy collected as part of the OECD Survey of Adult Skills. |
Keywords: | composite likelihood; item non-response; latent variable models |
JEL: | C1 |
Date: | 2021–04–15 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:108933&r= |
By: | Éric Gautier (TSE - Toulouse School of Economics - UT1 - Université Toulouse 1 Capitole - Université Fédérale Toulouse Midi-Pyrénées - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement) |
Abstract: | This paper considers endogenous selection models, in particular nonparametric ones. Estimating the unconditional law of the outcomes is possible when one uses instrumental variables. Using a selection equation which is additively separable in a one-dimensional unobservable has the sometimes undesirable property of instrument monotonicity. We present models which allow for nonmonotonicity and are based on nonparametric random coefficients indices. We discuss their nonparametric identification and apply these results to inference on nonlinear statistics such as the Gini index in surveys when the nonresponse is not missing at random. |
Date: | 2021–06 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:hal-03306234&r= |
By: | Alberto Abadie; Jinglong Zhao |
Abstract: | This article studies experimental design in settings where the experimental units are large aggregate entities (e.g., markets), and only one or a small number of units can be exposed to the treatment. In such settings, randomization of the treatment may induce large estimation biases under many or all possible treatment assignments. We propose a variety of synthetic control designs as experimental designs to select treated units in non-randomized experiments with large aggregate units, as well as the untreated units to be used as a control group. Average potential outcomes are estimated as weighted averages of treated units for potential outcomes with treatment, and control units for potential outcomes without treatment. We analyze the properties of such estimators and propose inferential techniques. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.02196&r= |
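For context, the snippet below computes plain synthetic-control weights of the kind the designs above build on: nonnegative weights summing to one that make a weighted average of control units track the treated unit's pre-treatment path. The experimental-design and inference layers of the paper are not reproduced; the data are simulated.

```python
# Synthetic-control weights: simplex-constrained least squares fit of the
# treated unit's pre-treatment outcomes by a weighted average of controls.
import numpy as np
from scipy.optimize import minimize

def sc_weights(treated_pre, controls_pre):
    """treated_pre: (T0,) vector; controls_pre: (T0, J) matrix of control units."""
    J = controls_pre.shape[1]
    loss = lambda w: np.sum((treated_pre - controls_pre @ w) ** 2)
    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]
    res = minimize(loss, np.full(J, 1.0 / J), bounds=[(0.0, 1.0)] * J,
                   constraints=cons, method="SLSQP")
    return res.x

rng = np.random.default_rng(4)
controls = rng.standard_normal((20, 8)).cumsum(axis=0)
treated = controls[:, :3] @ np.array([0.5, 0.3, 0.2]) + 0.05 * rng.standard_normal(20)
print(np.round(sc_weights(treated, controls), 3))   # recovers weights close to truth
```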
By: | Karol Binkowski; Peilun He; Nino Kordzakhia; Pavel Shevchenko |
Abstract: | The two unobservable state variables representing the short and long term factors introduced by Schwartz and Smith in [16] for risk-neutral pricing of futures contracts are modelled as two correlated Ornstein-Uhlenbeck processes. The Kalman Filter (KF) method has been implemented to estimate the short and long term factors jointly with unknown model parameters. The parameter identification problem arising within the likelihood function in the KF has been addressed by introducing an additional constraint. The obtained model parameter estimates are the conditional Maximum Likelihood Estimators (MLEs) evaluated within the KF. Consistency of the conditional MLEs is studied. The methodology has been tested on simulated data. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.01881&r= |
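A generic linear-Gaussian Kalman filter recursion of the type used above is sketched below; the specific Ornstein-Uhlenbeck transition matrices, the futures measurement equation, the additional identification constraint and the likelihood-based parameter estimation are not reproduced, and the matrices in the usage example are placeholders.

```python
# Generic Kalman filter: predict/update recursion for a linear-Gaussian
# state-space model x_{t+1} = A x_t + w_t, y_t = C x_t + v_t.
import numpy as np

def kalman_filter(y, A, C, Q, R, x0, P0):
    """y: (T, p) observations; returns the filtered state means."""
    x, P, out = x0, P0, []
    for yt in y:
        x, P = A @ x, A @ P @ A.T + Q                 # predict
        S = C @ P @ C.T + R                           # innovation covariance
        K = P @ C.T @ np.linalg.inv(S)                # Kalman gain
        x = x + K @ (yt - C @ x)                      # update mean
        P = P - K @ C @ P                             # update covariance
        out.append(x.copy())
    return np.array(out)

A = np.array([[0.9, 0.0], [0.0, 0.98]])               # placeholder transition matrix
C = np.array([[1.0, 1.0]])                            # observation loads on both factors
Q, R = 0.05 * np.eye(2), np.array([[0.1]])
y = np.cumsum(0.3 * np.random.default_rng(9).standard_normal((100, 1)), axis=0)
print(kalman_filter(y, A, C, Q, R, x0=np.zeros(2), P0=np.eye(2))[-1])
```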
By: | Fernando E. Alvarez; Katarína Borovičková; Robert Shimer |
Abstract: | We develop an estimator and tests of a discrete time mixed proportional hazard (MPH) model of duration with unobserved heterogeneity. We allow for competing risks, observable characteristics, and censoring, and we use linear GMM, making estimation and inference straightforward. With repeated spell data, our estimator is consistent and robust to the unknown shape of the frailty distribution. We apply our estimator to the duration of price spells in weekly store data from IRI. We find substantial unobserved heterogeneity, accounting for a large fraction of the decrease in the Kaplan-Meier hazard with elapsed duration. Still, we show that the estimated baseline hazard rate is decreasing and a homogeneous firm model can accurately capture the response of the economy to a monetary policy shock even if there is significant strategic complementarity in pricing. Using competing risks and spell-specific observable characteristics, we separately estimate the model for regular and temporary price changes and find that the MPH structure describes regular price changes better than temporary ones. |
JEL: | C14 C41 E31 E50 |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:29112&r= |
By: | Ronald Richman; Mario V. W\"uthrich |
Abstract: | Deep learning models have gained great popularity in statistical modeling because they lead to very competitive regression models, often outperforming classical statistical models such as generalized linear models. The disadvantage of deep learning models is that their solutions are difficult to interpret and explain, and variable selection is not easily possible because deep learning models solve feature engineering and variable selection internally in a nontransparent way. Inspired by the appealing structure of generalized linear models, we propose a new network architecture that shares similar features with generalized linear models but provides superior predictive power, benefiting from the art of representation learning. This new architecture allows for variable selection in tabular data and for interpretation of the calibrated deep learning model; in fact, our approach provides an additive decomposition in the spirit of Shapley values and integrated gradients. |
Date: | 2021–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2107.11059&r= |
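The toy forward pass below illustrates the general idea of a GLM-style network: the linear predictor keeps a GLM-like form, but the coefficients are produced by a small neural network, which is what enables interpretation and variable selection. This is a conceptual sketch under assumed dimensions, not the authors' architecture or implementation.

```python
# GLM-style network sketch: linear predictor beta_0 + x' beta(x) with a
# log link, where beta(x) comes from a small feed-forward network.
import numpy as np

rng = np.random.default_rng(5)
d, h = 4, 8                                  # assumed input and hidden dimensions
W1, b1 = rng.standard_normal((h, d)), np.zeros(h)
W2, b2 = rng.standard_normal((d, h)), np.zeros(d)
beta0 = 0.1

def forward(x):
    hidden = np.tanh(W1 @ x + b1)
    beta_x = W2 @ hidden + b2                # feature-dependent "GLM coefficients"
    return np.exp(beta0 + x @ beta_x)        # GLM-like prediction with log link

print(forward(rng.standard_normal(d)))
```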
By: | William Torous; Florian Gunsilius; Philippe Rigollet |
Abstract: | We propose a method based on optimal transport theory for causal inference in classical treatment and control study designs. Our approach sheds a new light on existing approaches and generalizes them to settings with high-dimensional data. The implementation of our method leverages recent advances in computational optimal transport to produce an estimate of high-dimensional counterfactual outcomes. The benefits of this extension are demonstrated both on synthetic and real data that are beyond the reach of existing methods. In particular, we revisit the classical Card & Krueger dataset on the effect of a minimum wage increase on employment in fast food restaurants and obtain new insights about the impact of raising the minimum wage on employment of full- and part-time workers in the fast food industry. |
Date: | 2021–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2108.05858&r= |
By: | Marc Grossouvre (URBS); Didier Rullière (Mines Saint-Étienne MSE - École des Mines de Saint-Étienne - IMT - Institut Mines-Télécom [Paris], FAYOL-ENSMSE - Institut Henri Fayol - Mines Saint-Étienne MSE - École des Mines de Saint-Étienne - IMT - Institut Mines-Télécom [Paris], LIMOS - Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes - Ecole Nationale Supérieure des Mines de St Etienne - CNRS - Centre National de la Recherche Scientifique - UCA - Université Clermont Auvergne - INP Clermont Auvergne - Institut national polytechnique Clermont Auvergne - UCA - Université Clermont Auvergne, FAYOL-ENSMSE - Département Génie mathématique et industriel - Ecole Nationale Supérieure des Mines de St Etienne - Institut Henri Fayol) |
Abstract: | This paper deals with three related problems in a geostatistical context. First, some data are available for given areas of the space, rather than for some specific locations, which creates specific problems of multiscale areal data. Second, some uncertainties rely both on the input locations and on measured quantities at these locations, which creates specific uncertainty propagation problems. Third, multidimensional outputs can be observed, with sometimes missing data. These three problems are addressed simultaneously here by considering mixtures of multivariate random fields, and by adapting standard Kriging methodology to this context. While the usual Gaussian setting is lost, we show that conditional mean, variance and covariances can be derived from this specific setting. A numerical illustration on simulated data is given. |
Keywords: | Mixture Kriging,granular data,ecological inference,disaggregation,change of support,block Kriging,areal data,area-to-point,regional Kriging,multiscale processes |
Date: | 2021–07–01 |
URL: | http://d.repec.org/n?u=RePEc:hal:wpaper:hal-03276127&r= |
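As background, a bare-bones simple-Kriging predictor with a Gaussian covariance is sketched below; the mixture, areal-data and uncertainty-propagation extensions developed in the paper are not reproduced, and the covariance parameters are illustrative.

```python
# Simple Kriging at one new location: weights solve K w = k, prediction is
# the weighted sum of centred observations plus the known mean.
import numpy as np

def simple_kriging(X, y, x_new, mean=0.0, length=1.0, sigma2=1.0):
    """Gaussian-covariance simple Kriging prediction at x_new."""
    def cov(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return sigma2 * np.exp(-d2 / (2.0 * length ** 2))
    K = cov(X, X) + 1e-10 * np.eye(len(X))    # jitter for numerical stability
    k = cov(X, x_new[None, :])[:, 0]
    w = np.linalg.solve(K, k)                 # Kriging weights
    return mean + w @ (y - mean)

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, (30, 2))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(30)
print(simple_kriging(X, y, np.array([0.5, 0.5])))
```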
By: | Nguyen, Hoang (Örebro University School of Business); Nguyen, Trong-Nghia (The University of Sydney Business School); Tran, Minh-Ngoc (The University of Sydney Business School) |
Abstract: | Stock returns are considered as a convolution of two random processes: the return innovation and the volatility innovation. The correlation of these two processes tends to be negative, which is the so-called leverage effect. In this study, we propose a dynamic leverage stochastic volatility (DLSV) model where the correlation structure between the return innovation and the volatility innovation is assumed to follow a generalized autoregressive score (GAS) process. We find that the leverage effect is reinforced in the market downturn period and weakened in the market upturn period. |
Keywords: | Dynamic leverage; GAS; stochastic volatility (SV) |
JEL: | C11 C52 C58 |
Date: | 2021–05–20 |
URL: | http://d.repec.org/n?u=RePEc:hhs:oruesi:2021_014&r= |
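The toy simulation below generates a stochastic-volatility path with a constant negative correlation between the return and volatility innovations, i.e. a fixed leverage effect, as background for the dynamic-leverage model described above; the GAS updating of the correlation itself is not reproduced, and all parameter values are illustrative.

```python
# Simulate a basic SV model with leverage: correlated return (eps) and
# log-volatility (eta) innovations, with constant correlation rho < 0.
import numpy as np

rng = np.random.default_rng(7)
T, mu, phi, sigma_eta, rho = 1000, -1.0, 0.95, 0.2, -0.5

h = np.empty(T)                                        # log-volatility
r = np.empty(T)                                        # returns
h[0] = mu
for t in range(T - 1):
    eps, eta = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]])
    r[t] = np.exp(h[t] / 2.0) * eps                    # return innovation
    h[t + 1] = mu + phi * (h[t] - mu) + sigma_eta * eta
r[-1] = np.exp(h[-1] / 2.0) * rng.standard_normal()

print(np.corrcoef(r[:-1], np.diff(h))[0, 1])           # negative => leverage effect
```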