New Economics Papers on Econometrics
By: | Myrto Kalouptsidi; Paul T. Scott; Eduardo Souza-Rodrigues |
Abstract: | In structural dynamic discrete choice models, the presence of serially correlated unobserved states and state variables that are measured with error may lead to biased parameter estimates and misleading inference. In this paper, we show that instrumental variables can address these issues, as long as measurement problems involve state variables that evolve exogenously from the perspective of individual agents (i.e., market-level states). We define a class of linear instrumental variables estimators that rely on Euler equations expressed in terms of conditional choice probabilities (ECCP estimators). These estimators do not require observing or modeling the agent’s entire information set, nor solving or simulating a dynamic program. As such, they are simple to implement and computationally light. We provide constructive identification arguments to identify the model primitives, and establish the consistency and asymptotic normality of the estimator. A Monte Carlo study demonstrates the good finite-sample performance of the ECCP estimator in the context of a dynamic demand model for durable goods. |
JEL: | C13 C35 C36 C51 C61 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:25134&r=ecm |
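The estimators described above are, in their final step, linear instrumental-variables regressions. As a loose illustration of that step only (not the authors' ECCP construction, which builds the dependent variable from conditional choice probabilities), here is a minimal two-stage least squares sketch; the data, instruments, and coefficient values are purely hypothetical.

```python
import numpy as np

def tsls(y, X, Z):
    """Two-stage least squares: regress y on X using instruments Z.

    y: (n,) outcome; X: (n, k) regressors; Z: (n, l) instruments, l >= k.
    """
    # First stage: project X onto the column space of Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the fitted values.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Purely illustrative simulated data with an endogenous regressor:
rng = np.random.default_rng(0)
n = 5_000
z = rng.normal(size=(n, 2))                    # instruments (e.g. lagged exogenous market states)
u = rng.normal(size=n)                         # structural error
x = z @ np.array([1.0, -0.5]) + 0.8 * u + rng.normal(size=n)  # mismeasured / endogenous state
y = 2.0 * x + u                                # linear Euler-equation-style relation
print(tsls(y, x[:, None], z))                  # close to 2.0; plain OLS would be biased
```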
By: | Chang, Jinyuan; Guo, Bin; Yao, Qiwei |
Abstract: | We extend the principal component analysis (PCA) to second-order stationary vector time series in the sense that we seek a contemporaneous linear transformation for a p-variate time series such that the transformed series is segmented into several lower-dimensional subseries, and those subseries are uncorrelated with each other both contemporaneously and serially. Therefore, those lower-dimensional series can be analyzed separately as far as the linear dynamic structure is concerned. Technically, it boils down to an eigenanalysis for a positive definite matrix. When p is large, an additional step is required to perform a permutation in terms of either maximum cross-correlations or FDR based on multiple tests. The asymptotic theory is established for both fixed p and diverging p when the sample size n tends to infinity. Numerical experiments with both simulated and real data sets indicate that the proposed method is an effective initial step in analyzing multiple time series data, which leads to substantial dimension reduction in modelling and forecasting high-dimensional linear dynamical structures. Unlike PCA for independent data, there is no guarantee that the required linear transformation exists. When it does not, the proposed method provides an approximate segmentation which leads to advantages in, for example, forecasting future values. The method can also be adapted to segment multiple volatility processes.
JEL: | C1 |
Date: | 2017–07–09 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:84106&r=ecm |
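A rough sketch of the eigenanalysis at the core of the segmentation described above, using numpy. It skips the prewhitening of the series, the choice of lag window, the permutation step, and all inference; the data are simulated and the lag truncation `k0` is an arbitrary illustrative choice.

```python
import numpy as np

def ts_pca_transform(X, k0=5):
    """Eigenanalysis of W = sum_{k=0..k0} Sigma_k Sigma_k', where Sigma_k is the
    lag-k autocovariance of the centred series X (n, p).

    Returns the (p, p) loading matrix A and the transformed series X_c @ A.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    W = np.zeros((p, p))
    for k in range(k0 + 1):
        Sk = Xc[k:].T @ Xc[: n - k] / n      # lag-k autocovariance estimate
        W += Sk @ Sk.T                        # positive semi-definite by construction
    _, A = np.linalg.eigh(W)                  # eigenanalysis of a symmetric PSD matrix
    return A, Xc @ A

# Hypothetical usage: inspect (contemporaneous and lagged) cross-correlations of the
# transformed series to group components into approximately uncorrelated sub-series.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4)).cumsum(axis=0) * 0.01 + rng.normal(size=(500, 4))
A, Z = ts_pca_transform(X)
print(np.round(np.corrcoef(Z, rowvar=False), 2))
```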
By: | Valentin Zelenyuk (CEPA - School of Economics, The University of Queensland); Robin C. Sickles (Department of Economics, Rice University); Wonho Song (School of Economics, Chung-Ang University) |
Abstract: | Our chapter details a wide variety of approaches used in estimating productivity and efficiency based on methods developed to estimate frontier production using Stochastic Frontier Analysis (SFA) and Data Envelopment Analysis (DEA). The estimators utilize panel, single cross-section, and time series data sets. The R programs include approaches to estimate firm efficiency such as the time-invariant fixed effects, correlated random effects, and uncorrelated random effects panel stochastic frontier estimators; time-varying fixed effects, correlated random effects, and uncorrelated random effects estimators; semi-parametric efficient panel frontier estimators; factor models for cross-sectional and time-varying efficiency; bootstrapping methods to develop confidence intervals for index-number-based productivity estimates and their decompositions; and DEA and Free Disposal Hull estimators. The chapter provides the professional researcher, analyst, statistician, and regulator with the most up-to-date efficiency modeling methods in the easily accessible open-source programming language R.
Keywords: | Production (technical) efficiency; Stochastic frontier analysis; Data envelopment analysis; Panel data; Index numbers; Non-parametric analysis; Bootstrapping |
Date: | 2018–09 |
URL: | http://d.repec.org/n?u=RePEc:qld:uqcepa:129&r=ecm |
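The chapter's programs are in R; as a language-agnostic illustration of the DEA piece only, here is a minimal input-oriented, constant-returns-to-scale DEA score computed with scipy's linear-programming routine (the SFA, panel, and bootstrap estimators discussed above are not sketched). The input/output data are made up.

```python
import numpy as np
from scipy.optimize import linprog

def dea_crs_input(X, Y):
    """Input-oriented CRS DEA efficiency scores.

    X: (n, m) inputs, Y: (n, s) outputs for n decision-making units.
    For each unit o: min theta s.t. X' lam <= theta * x_o, Y' lam >= y_o, lam >= 0.
    """
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        c = np.r_[1.0, np.zeros(n)]                      # variables: [theta, lam_1..lam_n]
        A_in = np.hstack([-X[o][:, None], X.T])          # X' lam - theta * x_o <= 0
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])      # -Y' lam <= -y_o
        res = linprog(c,
                      A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.r_[np.zeros(m), -Y[o]],
                      bounds=[(0, None)] * (1 + n),
                      method="highs")
        scores[o] = res.x[0]
    return scores

# Hypothetical two-input, one-output example:
X = np.array([[2.0, 4.0], [3.0, 3.0], [6.0, 2.0], [4.0, 6.0]])
Y = np.array([[1.0], [1.0], [1.0], [1.0]])
print(np.round(dea_crs_input(X, Y), 3))   # 1.0 for frontier units, < 1 otherwise
```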
By: | Jia Chen; Degui Li; Oliver Linton |
Abstract: | This paper studies the estimation of large dynamic covariance matrices with multiple conditioning variables. We introduce an easy-to-implement semiparametric method to estimate each entry of the covariance matrix via model averaging marginal regression, and then apply a shrinkage technique to obtain the dynamic covariance matrix estimator. Under some regularity conditions, we derive the asymptotic properties of the proposed estimators, including uniform consistency with general convergence rates. We further consider extending our methodology to deal with two scenarios: (i) the number of conditioning variables diverges as the sample size increases, and (ii) the large covariance matrix is conditionally sparse relative to contemporaneous market factors. We provide a simulation study that illustrates the finite-sample performance of the developed methodology. We also provide an application to financial portfolio choice using daily stock returns.
Keywords: | Dynamic covariance matrix, MAMAR, Semiparametric estimation, Sparsity, Uniform consistency. |
JEL: | C13 C14 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:yor:yorken:18/14&r=ecm |
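A much-simplified sketch of the two ingredients named in the abstract above, conditioning and shrinkage: a single conditioning variable, a kernel-weighted covariance, and soft-thresholding of off-diagonal entries. The MAMAR averaging over multiple conditioning variables and the asymptotic theory are not reproduced; bandwidth, threshold, and data are hypothetical.

```python
import numpy as np

def kernel_cov(R, u, u0, h):
    """Kernel-weighted covariance of returns R (n, p) at conditioning value u0,
    with conditioning variable u (n,) and Gaussian-kernel bandwidth h."""
    w = np.exp(-0.5 * ((u - u0) / h) ** 2)
    w /= w.sum()
    mu = w @ R
    Rc = R - mu
    return (Rc * w[:, None]).T @ Rc

def soft_threshold(S, lam):
    """Shrink off-diagonal entries of S towards zero (soft-thresholding)."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

# Hypothetical usage with simulated returns whose volatility rises in u:
rng = np.random.default_rng(2)
n, p = 1_000, 8
u = rng.uniform(size=n)
R = rng.normal(size=(n, p)) * (0.5 + u[:, None])
Sigma_hat = soft_threshold(kernel_cov(R, u, u0=0.8, h=0.1), lam=0.02)
print(np.round(Sigma_hat[:3, :3], 3))
```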
By: | Thomas M. Russell |
Abstract: | Many inference procedures in the literature on partial identification are designed for settings in which the inferential object of interest is the entire (partially identified) vector of parameters. However, when the researcher's inferential object of interest is a subvector or functional of the parameter vector, these inference procedures can be highly conservative, especially when the dimension of the parameter vector is large. This paper considers uniformly valid inference for continuous functionals of partially identified parameters in cases where the identified set is defined by convex (in the parameter) moment inequalities. Using a functional delta method, we propose a method for constructing uniformly valid confidence sets for a (possibly stochastic) convex functional of a partially identified parameter. The proposed method amounts to bootstrapping the Lagrangian of a convex optimization problem, and subsumes subvector inference as a special case. Unlike other proposed subvector inference procedures, our procedure does not require the researcher to repeatedly invert a hypothesis test. Finally, we discuss sufficient conditions on the moment functions that ensure uniform validity.
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.03180&r=ecm |
By: | Sokbae Lee; Yuan Liao; Myung Hwan Seo; Youngki Shin |
Abstract: | We propose a novel two-regime regression model where the switching between the regimes is driven by a vector of possibly unobservable factors. When the factors are latent, we estimate them by the principal component analysis of a much larger panel data set. Our approach enriches conventional threshold models in that a vector of factors may represent economy-wide shocks more realistically than a scalar observed random variable. Estimating our model brings new challenges as well as opportunities in terms of both computation and asymptotic theory. We show that the optimization problem can be reformulated as mixed integer optimization and present two alternative computational algorithms. We derive the asymptotic distributions of the resulting estimators under the scheme that the threshold effect shrinks to zero. In particular, with latent factors, not only do we establish the conditions on factor estimation for a strong oracle property, which are different from those for smooth factor augmented models, but we also identify semi-strong and weak oracle cases and establish a phase transition that describes the effect of first stage factor estimation as the cross-sectional dimension of panel data increases relative to the time-series dimension. Moreover, we develop a consistent factor selection procedure with a penalty term on the number of factors and present a complementary bootstrap testing procedure for linearity with the aid of efficient computational algorithms. Finally, we illustrate our methods via Monte Carlo experiments and by applying them to factor-driven threshold autoregressive models of US macro data. |
Keywords: | threshold regression, factors, mixed integer optimization, panel data, phase transition, oracle properties, l0-penalization. |
JEL: | C13 C51 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:mcm:deptwp:2018-14&r=ecm |
By: | Jia Chen |
Abstract: | This paper studies the estimation of latent group structures in heterogeneous time-varying coefficient panel data models. While allowing the coefficient functions to vary over cross sections provides a good way to model cross-sectional heterogeneity, it reduces the degrees of freedom and leads to poor estimation accuracy when the time-series length is short. On the other hand, in many empirical studies it is not uncommon to find that heterogeneous coefficients exhibit group structures, where coefficients belonging to the same group are similar or identical. This paper provides a simple and straightforward approach for estimating the underlying latent groups, based on hierarchical agglomerative clustering (HAC) of kernel estimates of the heterogeneous time-varying coefficients when the number of groups is known. We establish the consistency of this clustering method and also propose a generalised information criterion for estimating the number of groups when it is unknown. Simulation studies are carried out to examine the finite-sample properties of the proposed clustering method as well as the post-clustering estimation of the group-specific time-varying coefficients. The simulation results show that our methods perform comparably to the penalised-sieve-estimation-based classifier-Lasso approach of Su et al. (2018), but are computationally simpler. An application to a cross-country growth study is also provided.
Keywords: | Hierarchical agglomerative clustering; Generalised information criterion; Kernel estimation; Panel data; Time-varying coefficients. |
JEL: | C13 C14 C23 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:yor:yorken:18/15&r=ecm |
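A minimal sketch of the two-step idea: kernel estimation of unit-specific time-varying slopes followed by hierarchical agglomerative clustering with a known number of groups, using scipy. The generalised information criterion for selecting the number of groups and the post-clustering estimation are omitted, and the panel data, bandwidth, and grid are simulated/illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def tv_coef_kernel(y, x, h, grid):
    """Local-constant kernel estimate of a time-varying slope for one unit.

    y, x: (T,) series; grid: evaluation points in [0, 1]; h: bandwidth.
    Returns the estimated coefficient path on the grid.
    """
    T = len(y)
    t = np.arange(T) / T
    path = np.empty(len(grid))
    for j, tau in enumerate(grid):
        w = np.exp(-0.5 * ((t - tau) / h) ** 2)
        path[j] = np.sum(w * x * y) / np.sum(w * x * x)   # kernel-weighted LS slope
    return path

# Hypothetical panel: units 0-4 share one coefficient path, units 5-9 another.
rng = np.random.default_rng(3)
T, grid = 200, np.linspace(0.05, 0.95, 20)
paths = []
for i in range(10):
    x = rng.normal(size=T)
    beta_t = np.sin(np.linspace(0, 3, T)) if i < 5 else np.ones(T)
    y = beta_t * x + 0.3 * rng.normal(size=T)
    paths.append(tv_coef_kernel(y, x, h=0.1, grid=grid))

# Hierarchical agglomerative clustering of the estimated coefficient paths,
# assuming the number of groups (2) is known.
labels = fcluster(linkage(np.vstack(paths), method="ward"), t=2, criterion="maxclust")
print(labels)
```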
By: | Xi Chen; Weidong Liu; Yichen Zhang |
Abstract: | This paper studies the inference problem in quantile regression (QR) for a large sample size $n$ but under a limited memory constraint, where the memory can only store a small batch of data of size $m$. A natural method is the naïve divide-and-conquer approach, which splits the data into batches of size $m$, computes the local QR estimator for each batch, and then aggregates the estimators via averaging. However, this method only works when $n=o(m^2)$ and is computationally expensive. This paper proposes a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregation. Theoretically, as long as $n$ grows polynomially in $m$, we establish the asymptotic normality of the obtained estimator and show that our estimator with only a few rounds of aggregation achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the dimensionality $p$ to go to infinity. The proposed method can also be applied to address the QR problem in a distributed computing environment (e.g., a large-scale sensor network) or for real-time streaming data.
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.08264&r=ecm |
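For concreteness, a sketch of the naïve divide-and-conquer baseline described in the abstract above (split, fit local quantile regressions, average), using statsmodels; the paper's refinement via successive rounds of aggregation is not implemented here, and the simulated data are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm

def dc_quantile(y, X, m, q=0.5):
    """Naive divide-and-conquer quantile regression: split the data into
    batches of size m, fit a local QR on each batch, and average the
    coefficient vectors. This is the baseline the paper improves upon."""
    n = len(y)
    betas = []
    for start in range(0, n, m):
        yb, Xb = y[start:start + m], X[start:start + m]
        betas.append(sm.QuantReg(yb, Xb).fit(q=q).params)
    return np.mean(betas, axis=0)

# Hypothetical usage on simulated data with heavy-tailed errors:
rng = np.random.default_rng(4)
n, p = 20_000, 3
X = sm.add_constant(rng.normal(size=(n, p)))
beta = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta + rng.standard_t(df=3, size=n)
print(np.round(dc_quantile(y, X, m=2_000), 3))  # median-regression coefficients
```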
By: | Olivier Ledoit; Michael Wolf |
Abstract: | Applied researchers often want to make inference for the difference of a given performance measure for two investment strategies. In this paper, we consider the class of performance measures that are smooth functions of population means of the underlying returns; this class is very rich and contains many performance measures of practical interest (such as the Sharpe ratio and the variance). Unfortunately, many of the inference procedures that have been suggested previously in the applied literature make unreasonable assumptions that do not apply to real-life return data, such as normality and independence over time. We will discuss inference procedures that are asymptotically valid under very general conditions, allowing for heavy tails and time dependence in the return data. In particular, we will promote a studentized time series bootstrap procedure. A simulation study demonstrates the improved finite-sample performance compared to existing procedures. Applications to real data are also provided. |
Keywords: | Bootstrap, HAC inference, kurtosis, Sharpe ratio, skewness, variance
JEL: | C12 C14 C22 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:zur:econwp:305&r=ecm |
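A simplified sketch of a block-bootstrap confidence interval for a Sharpe-ratio difference: a circular block bootstrap with a percentile interval rather than the studentized procedure the paper promotes. Block length, number of replications, and the simulated return series are arbitrary illustrative choices.

```python
import numpy as np

def sharpe(r):
    return r.mean() / r.std(ddof=1)

def block_bootstrap_diff(ra, rb, block=20, B=2_000, alpha=0.05, seed=0):
    """Circular block bootstrap CI for the difference of Sharpe ratios of two
    return series (percentile version; the studentized variant is omitted)."""
    rng = np.random.default_rng(seed)
    n = len(ra)
    diffs = np.empty(B)
    for b in range(B):
        starts = rng.integers(0, n, size=int(np.ceil(n / block)))
        idx = (starts[:, None] + np.arange(block)[None, :]).ravel()[:n] % n
        diffs[b] = sharpe(ra[idx]) - sharpe(rb[idx])   # same blocks for both series
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return sharpe(ra) - sharpe(rb), (lo, hi)

# Hypothetical usage with simulated, mildly autocorrelated returns:
rng = np.random.default_rng(5)
n = 1_000
eps = rng.normal(size=(n, 2))
ra = 0.03 + 0.9 * np.convolve(eps[:, 0], [0.6, 0.4])[:n]   # strategy A
rb = 0.01 + 0.9 * np.convolve(eps[:, 1], [0.6, 0.4])[:n]   # strategy B
print(block_bootstrap_diff(ra, rb))
```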
By: | Elvin Isufi; Andreas Loukas; Nathanael Perraudin; Geert Leus |
Abstract: | Graph-based techniques have emerged as a choice for dealing with dimensionality issues in modeling multivariate time series. However, there is not yet a complete understanding of how the underlying structure can be exploited to ease this task. This work contributes in this direction by considering the forecasting of a process evolving over a graph. We make use of the (approximate) time-vertex stationarity assumption, i.e., time-varying graph signals whose first and second order statistical moments are invariant over time and correlated with a known graph topology. The latter is combined with VAR and VARMA models to tackle the dimensionality issues present in predicting the temporal evolution of multivariate time series. We find that by projecting the data onto the graph spectral domain: (i) the multivariate model estimation reduces to fitting a number of uncorrelated univariate ARMA models, and (ii) an optimal low-rank data representation can be exploited to further reduce the estimation costs. When the multivariate process can be observed only at a subset of nodes, the proposed models extend naturally to Kalman filtering on graphs, allowing for optimal tracking. Numerical experiments with both synthetic and real data validate the proposed approach and highlight its benefits over state-of-the-art alternatives.
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.08581&r=ecm |
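A bare-bones sketch of the spectral-domain idea: project the graph signal onto the Laplacian eigenbasis and fit an independent univariate AR(1) to each graph frequency, here on a hypothetical ring graph with simulated data. The joint time-vertex models (VARMA, low-rank representations, Kalman filtering on graphs) discussed above are not implemented.

```python
import numpy as np

def ring_laplacian(N):
    """Combinatorial Laplacian of a ring graph with N nodes."""
    A = np.zeros((N, N))
    for i in range(N):
        A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def graph_spectral_ar1_forecast(Y, L):
    """One-step forecast of a graph signal time series Y (T, N): project onto the
    Laplacian eigenbasis, fit an AR(1) per spectral component by least squares,
    forecast, and transform back to the vertex domain."""
    _, U = np.linalg.eigh(L)           # graph Fourier basis
    Z = Y @ U                          # (T, N) spectral coefficients
    z_next = np.empty(Z.shape[1])
    for j in range(Z.shape[1]):
        zj = Z[:, j]
        phi = np.dot(zj[:-1], zj[1:]) / np.dot(zj[:-1], zj[:-1])   # AR(1) coefficient
        z_next[j] = phi * zj[-1]
    return U @ z_next

# Hypothetical usage: a process diffusing over a 10-node ring graph.
rng = np.random.default_rng(6)
N, T = 10, 300
L = ring_laplacian(N)
Y = np.zeros((T, N))
for t in range(1, T):
    Y[t] = Y[t - 1] - 0.1 * (L @ Y[t - 1]) + 0.1 * rng.normal(size=N)
print(np.round(graph_spectral_ar1_forecast(Y, L), 3))
```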
By: | Szabolcs Majoros; András Zempléni
Abstract: | In this paper we extend the known methodology for fitting stable distributions to the multivariate case and apply the suggested method to the modelling of daily cryptocurrency-return data. The investigated time period is cut into 10 non-overlapping sections, so that changes over time can also be observed. We apply bootstrap tests for checking the models and compare our approach to the more traditional extreme-value and copula models.
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.09521&r=ecm |
By: | Tadao Hoshino; Takahide Yanagi |
Abstract: | This study develops identification and estimation methods for treatment effect models with strategic interaction in treatment decisions. We consider models where one's treatment choice and outcome can be endogenously affected by others' treatment choices. We formulate the interaction of treatment decisions as a two-player complete information game with potentially multiple equilibria. For this model, under the assumption of a stochastic equilibrium selection rule, we prove that the marginal treatment effect (MTE) from one's own treatment and that from his/her partner's can be separately point-identified using a latent index framework. Based on our constructive identification results, we propose a two-step semiparametric procedure for estimating the MTE parameters using series approximation. We show that the proposed estimator is uniformly consistent with the optimal convergence rate and asymptotically normal.
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.08350&r=ecm |
By: | Hannah Druckenmiller; Solomon Hsiang |
Abstract: | We propose a simple cross-sectional research design to identify causal effects that is robust to unobservable heterogeneity. When many observational units are adjacent, it may be sufficient to regress the "spatial first differences" (SFD) of the outcome on the treatment and omit all covariates. This approach is conceptually similar to first differencing approaches in time-series or panel models, except that the index for time is replaced with an index for locations in space. The SFD approach identifies plausibly causal effects so long as local changes in the treatment and unobservable confounders are not systematically correlated between immediately adjacent neighbors. We illustrate how this approach can mitigate omitted variables bias through simulation and by estimating returns to schooling along 10th Avenue in New York and I-90 in Chicago. We then more fully explore the benefits of this approach by estimating the effects of climate and soil on maize yields across US counties. In each case, we demonstrate the performance of the research design by withholding important covariates during estimation. SFD has multiple appealing features: it admits internal robustness checks that exploit rotation of the coordinate system or double-differencing across space, it is immediately applicable to spatially gridded data sets, and it can be easily implemented in statistical packages by replacing a single index in pre-existing time-series functions.
JEL: | C21 Q15 Q51 Q54 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:25177&r=ecm |
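A minimal simulation of the spatial-first-differences idea on a hypothetical grid: a smooth spatial confounder biases the naive cross-sectional regression, while differencing between adjacent cells and omitting all covariates recovers the effect. Grid size and parameter values are made up.

```python
import numpy as np

def sfd_estimate(y, x):
    """Spatial first differences along one axis of a grid: regress
    Delta y on Delta x between immediately adjacent cells, no covariates."""
    dy = np.diff(y, axis=1).ravel()     # differences between east-west neighbours
    dx = np.diff(x, axis=1).ravel()
    return np.sum(dx * dy) / np.sum(dx * dx)

# Hypothetical gridded example with a spatially smooth confounder:
rng = np.random.default_rng(7)
nr, nc = 40, 60
rows, cols = np.meshgrid(np.arange(nr), np.arange(nc), indexing="ij")
confounder = np.sin(rows / 8.0) + np.cos(cols / 10.0)        # varies slowly in space
x = confounder + rng.normal(size=(nr, nc))                   # treatment correlated with it
y = 2.0 * x + 3.0 * confounder + rng.normal(size=(nr, nc))   # true effect = 2
ols_naive = np.sum(x * y) / np.sum(x * x)                     # biased upward
print(round(ols_naive, 2), round(sfd_estimate(y, x), 2))      # SFD is close to 2
```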
By: | Xinwei Ma; Jingshen Wang |
Abstract: | Inverse Probability Weighting (IPW) is widely used in program evaluation and other empirical economics applications. As Gaussian approximations perform poorly in the presence of "small denominators," trimming is routinely employed as a regularization strategy. However, ad hoc trimming of the observations renders usual inference procedures invalid for the target estimand, even in large samples. In this paper, we propose an inference procedure that is robust not only to small probability weights entering the IPW estimator, but also to a wide range of trimming threshold choices. Our inference procedure employs resampling with a novel bias correction technique. Specifically, we show that both the IPW and trimmed IPW estimators can have different (Gaussian or non-Gaussian) limiting distributions, depending on how "close to zero" the probability weights are and on the trimming threshold. Our method provides more robust inference for the target estimand by adapting to these different limiting distributions. This robustness is partly achieved by correcting a non-negligible trimming bias. We demonstrate the finite-sample accuracy of our method in a simulation study, and we illustrate its use by revisiting a dataset from the National Supported Work program. |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.11397&r=ecm |
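A plain plug-in IPW estimator with ad hoc trimming, shown only to fix ideas about the estimator and trimming practice the paper takes as its starting point; the bias-corrected resampling inference proposed in the paper is not implemented. Propensity model, trimming threshold, and data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(y, d, X, trim=0.0):
    """Plug-in inverse probability weighting estimate of the average treatment
    effect, discarding observations whose estimated propensity scores lie within
    `trim` of 0 or 1 (ad hoc trimming)."""
    p = LogisticRegression(max_iter=1_000).fit(X, d).predict_proba(X)[:, 1]
    keep = (p > trim) & (p < 1 - trim)
    y, d, p = y[keep], d[keep], p[keep]
    return np.mean(d * y / p - (1 - d) * y / (1 - p))

# Hypothetical usage with limited overlap (propensities near 0 and 1):
rng = np.random.default_rng(8)
n = 5_000
X = rng.normal(size=(n, 2))
p_true = 1 / (1 + np.exp(-(2.5 * X[:, 0] + 2.0 * X[:, 1])))
d = rng.binomial(1, p_true)
y = 1.0 * d + X @ np.array([1.0, -1.0]) + rng.normal(size=n)   # true ATE = 1
print(round(ipw_ate(y, d, X, trim=0.0), 3), round(ipw_ate(y, d, X, trim=0.05), 3))
```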
By: | Arturo Lamadrid-Contreras (Citibanamex); N.R. Ramírez-Rondán (Universidad del Pacífico) |
Abstract: | We develop threshold estimation methods for panel data models with two threshold variables and individual-specific fixed effects, covering short time periods. In the static panel data model, we propose least squares estimation of the threshold and regression slopes using fixed effects transformations, while in the dynamic panel data model, we propose maximum likelihood estimation of the threshold and slope parameters using first-difference transformations. In both models, we propose estimating the threshold parameters sequentially. We apply the methods to a 15-year sample of 565 U.S. firms to test whether financial constraints affect investment decisions.
Keywords: | Threshold model, panel data, capital market imperfections |
JEL: | C13 C23 G11 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:apc:wpaper:128&r=ecm |
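A simplified sketch of the static case: the within (fixed effects) transformation plus a least-squares grid search, for a single threshold variable rather than the paper's two sequentially estimated thresholds. The simulated panel and the grid of candidate thresholds are illustrative.

```python
import numpy as np

def within(A, ids):
    """Subtract individual means (fixed effects transformation)."""
    out = A.astype(float)
    for i in np.unique(ids):
        out[ids == i] -= out[ids == i].mean(axis=0)
    return out

def panel_threshold_ls(y, x, q, ids):
    """Least squares grid search for a single threshold in the static panel model
    y_it = b1 * x_it * 1{q_it <= g} + b2 * x_it * 1{q_it > g} + a_i + e_it."""
    best = (np.inf, None, None)
    for g in np.quantile(q, np.linspace(0.1, 0.9, 81)):     # candidate thresholds
        Xg = np.column_stack([x * (q <= g), x * (q > g)])
        Xw, yw = within(Xg, ids), within(y, ids)
        beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
        resid = yw - Xw @ beta
        best = min(best, (resid @ resid, g, beta), key=lambda t: t[0])
    return best[1], best[2]

# Hypothetical usage with a simulated short panel (N=200 firms, T=8 years):
rng = np.random.default_rng(9)
N, T = 200, 8
ids = np.repeat(np.arange(N), T)
alpha = np.repeat(rng.normal(size=N), T)
x, q = rng.normal(size=N * T), rng.normal(size=N * T)
y = np.where(q <= 0.5, 1.0, 2.0) * x + alpha + 0.5 * rng.normal(size=N * T)
print(panel_threshold_ls(y, x, q, ids))    # threshold near 0.5, slopes near (1, 2)
```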
By: | Tzougas, George; Karlis, Dimitris; Frangos, Nicholas |
Abstract: | In view of the economic importance of motor third-party liability insurance in developed countries, the construction of optimal bonus-malus systems (BMS) has received considerable interest. However, a major drawback in the construction of optimal BMS is that they fail to account for the variability of premium calculations, which are treated as point estimates. The present study addresses this issue. Specifically, nonparametric mixtures of Poisson laws are used to construct an optimal BMS with a finite number of classes. The mixing distribution is estimated by nonparametric maximum likelihood (NPML). The main contribution of this paper is the use of the NPML estimator for the construction of confidence intervals for the premium rates derived by updating the posterior mean claim frequency. Furthermore, we go one step further by improving the performance of the confidence intervals through a bootstrap procedure in which the estimated mixture is used for resampling. The construction of confidence intervals for the individual premiums based on asymptotic maximum likelihood theory is beneficial for the insurance company, as it can lead to accurate and effective adjustments of its premium rating policies from a practical point of view.
JEL: | C1 |
Date: | 2017–04–25 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:70926&r=ecm |
By: | Manganelli, Simone |
Abstract: | A statistical decision rule incorporating judgment does not perform worse than a judgmental decision with a given probability. Under model misspecification, this probability is unknown. The best model is the least misspecified, as it is the one whose probability of underperforming the judgmental decision is closest to the chosen probability. It is identified by the statistical decision rule incorporating judgment with the lowest in-sample loss. Averaging decision rules according to their asymptotic performance results in decisions that are weakly better than the best decision rule. The model selection criterion is applied to a vector autoregression model for euro area inflation.
JEL: | C1 C11 C12 C13
Keywords: | inflation forecasting, model selection criteria, statistical decision theory |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:ecb:ecbwps:20182188&r=ecm |
By: | Ollech, Daniel |
Abstract: | Currently, the methods used by producers of official statistics do not facilitate the seasonal and calendar adjustment of daily time series, even though an increasing number of series with daily observations are available. The aim of this paper is to develop a procedure to estimate and adjust for periodically recurring systematic effects and the influence of moving holidays in time series with daily observations. To this end, an iterative STL-based seasonal adjustment routine is combined with a RegARIMA model for the estimation of calendar and outlier effects. The procedure is illustrated and validated using the currency in circulation in Germany and a set of simulated time series. A comparison with established methods used for the adjustment of monthly data shows that the procedures estimate similar seasonally adjusted series. Thus, the developed procedure closes a gap by facilitating the seasonal and calendar adjustment of daily time series.
Keywords: | Seasonal adjustment, STL, Daily time series, Seasonality
JEL: | C14 C22 C53 |
Date: | 2018 |
URL: | http://d.repec.org/n?u=RePEc:zbw:bubdps:412018&r=ecm |
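A very reduced illustration of the STL ingredient: one STL pass at a weekly period on a simulated daily series. The paper's actual routine iterates STL across several periodicities and combines it with a RegARIMA model for calendar, moving-holiday, and outlier effects, none of which is reproduced here; the series below is a made-up stand-in.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Hypothetical daily series with a weekly pattern and a trend (a stand-in for
# a series such as currency in circulation).
rng = np.random.default_rng(10)
idx = pd.date_range("2015-01-01", periods=3 * 365, freq="D")
weekly = np.tile([0.0, 0.1, 0.1, 0.2, 0.5, -0.4, -0.5], len(idx) // 7 + 1)[: len(idx)]
series = pd.Series(
    np.linspace(100, 120, len(idx)) + weekly + rng.normal(scale=0.2, size=len(idx)),
    index=idx,
)

# One STL pass extracting the day-of-week component; an iterative routine would
# re-apply STL to the remainder for longer periodicities (e.g. intra-yearly effects)
# and handle moving holidays and outliers with a RegARIMA step.
res = STL(series, period=7, robust=True).fit()
adjusted = series - res.seasonal
print(adjusted.head())
```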
By: | Timothy B. Armstrong (Cowles Foundation, Yale University) |
Abstract: | We derive bounds on the scope for a confidence band to adapt to the unknown regularity of a nonparametric function that is observed with noise, such as a regression function or density, under the self-similarity condition proposed by Giné and Nickl (2010). We find that adaptation can only be achieved up to a term that depends on the choice of the constant used to define self-similarity, and that this term becomes arbitrarily large for conservative choices of the self-similarity constant. We construct a confidence band that achieves this bound, up to a constant term that does not depend on the self-similarity constant. Our results suggest that care must be taken in choosing and interpreting the constant that defines self-similarity, since the dependence of adaptive confidence bands on this constant cannot be made to disappear asymptotically.
Keywords: | Adaptation, Nonparametric inference, Self-similarity |
JEL: | C14 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:cwl:cwldpp:2146&r=ecm |
By: | Ji Hyung Lee; Zhentao Shi; Zhan Gao |
Abstract: | A typical predictive regression employs a multitude of potential regressors with various degrees of persistence, while their signal strength in explaining the dependent variable is often low. Variable selection in such a context is of great importance. In this paper, we explore the pitfalls and possibilities of the LASSO methods in this predictive regression framework with mixed degrees of persistence. In the presence of stationary, unit root and cointegrated predictors, we show that the adaptive LASSO maintains consistent variable selection and the oracle property due to its penalty scheme that accommodates the system of regressors. On the contrary, the conventional LASSO does not have this desirable feature, as its penalty is imposed according to the marginal behavior of each individual regressor. We demonstrate this theoretical property via extensive Monte Carlo simulations, and evaluate its empirical performance for short- and long-horizon stock return predictability.
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1810.03140&r=ecm |
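A small sketch of the adaptive LASSO via the usual rescaling trick (first-stage OLS coefficients as penalty weights), using scikit-learn on simulated data. It does not reflect the mixed-persistence (unit-root and cointegrated) regressors that are the focus of the paper, and the tuning parameters are arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def adaptive_lasso(X, y, alpha=0.02, gamma=1.0):
    """Adaptive LASSO via rescaling: weight each column by |b_ols|^gamma from a
    first-stage OLS fit, run a standard Lasso, then undo the rescaling."""
    b_ols = LinearRegression(fit_intercept=False).fit(X, y).coef_
    w = np.abs(b_ols) ** gamma + 1e-8             # avoid division by zero
    Xw = X * w                                     # column-wise rescaling
    b_w = Lasso(alpha=alpha, fit_intercept=False, max_iter=50_000).fit(Xw, y).coef_
    return b_w * w                                 # coefficients on the original scale

# Hypothetical predictive-regression-style example: many candidate predictors,
# only two carrying (weak) signal.
rng = np.random.default_rng(11)
n, p = 500, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[0], beta[1] = 0.5, -0.5
y = X @ beta + rng.normal(size=n)
print(np.nonzero(np.abs(adaptive_lasso(X, y)) > 1e-6)[0])   # indices of selected predictors
```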
By: | Jinook Jeong (Yonsei University); Hyunwoo Lee (Yonsei University) |
Abstract: | In the public construction procurement market, 'abnormally low bids (ALB)' are prevalent and cause many social and economic problems. Moreover, when procurement bids are collusive, ALB cause the competitive price to be systematically underestimated. Although many countries regulate ALB, their criteria for identifying ALB are not homogeneous. Most of the criteria are based on construction cost, which is usually inaccurate, vulnerable to accounting manipulation, and limited to the supply-side information of the market. We propose an econometric identification process for ALB using a discriminant analysis. It is based on a switching regression with incomplete separation information and is easily estimable by MLE. Through a Monte Carlo simulation, we show that our new method works well. We apply our method to Korean public construction bidding data from 2007 to 2016. The estimation results identify the determinants of the bid prices, along with the determinants of ALB, and present a more accurate assessment of the collusion damage.
Keywords: | abnormally low bids, discriminant analysis, public procurement market |
JEL: | H57 L40 L70 |
Date: | 2018–10 |
URL: | http://d.repec.org/n?u=RePEc:yon:wpaper:2018rwp-129&r=ecm |
By: | Valentin Zelenyuk (CEPA - School of Economics, The University of Queensland); Léopold Simar (Institut de Statistique, Biostatistique et Sciences Actuarielles, Université Catholique de Louvain.) |
Abstract: | We propose an improvement of the finite sample approximation of the central limit theorems (CLTs) that were recently derived for statistics involving production efficiency scores estimated via Data Envelopment Analysis (DEA) or Free Disposal Hull (FDH) approaches. The improvement is very easy to implement since it involves a simple correction of the already employed statistics without any additional computational burden, and it preserves the original asymptotic results such as consistency and asymptotic normality. The proposed approach consistently showed improvement in all the scenarios we tried in various Monte Carlo experiments, especially for relatively small samples or relatively large dimensions (measured by the total number of inputs and outputs) of the underlying production model. This approach is therefore expected to be valuable, at almost no additional computational cost, for practitioners wishing to perform statistical inference about production efficiency using DEA or FDH approaches.
Keywords: | Data Envelopment Analysis, DEA; Free Disposal Hull, FDH; Statistical Inference; Production Efficiency; Productivity
Date: | 2018–09 |
URL: | http://d.repec.org/n?u=RePEc:qld:uqcepa:128&r=ecm |