Econometrics
http://lists.repec.org/mailman/listinfo/nep-ecm
2018-10-15
Covariate Distribution Balance via Propensity Scores
http://d.repec.org/n?u=RePEc:arx:papers:1810.01370&r=ecm
The propensity score plays an important role in causal inference with observational data. Once the propensity score is available, one can use it to estimate a variety of causal effects in a unified setting. Despite this appeal, a main practical difficulty arises because the propensity score is usually unknown, has to be estimated, and extreme propensity score estimates can lead to distorted inference procedures. To address these limitations, this article proposes to estimate the propensity score by fully exploiting its covariate balancing property. We call the resulting estimator the integrated propensity score (IPS) as it is based on integrated moment conditions. In sharp contrast with other methods that balance only some specific moments of covariates, the IPS aims to balance \textit{all} functions of covariates. Further, the IPS estimator is data-driven, does not rely on tuning parameters such as bandwidths, admits an asymptotic linear representation, and is $\sqrt{n}$-consistent and asymptotically normal. We derive the asymptotic properties of inverse probability weighted estimators for the average, distributional and quantile treatment effects based on the IPS, and illustrate their relative performance via Monte Carlo simulations and three empirical applications. An implementation of the proposed methods is provided in the new package \texttt{IPS} for \texttt{R}.
Pedro H. C. Sant'Anna
Xiaojun Song
Qi Xu
2018-10
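As a point of reference for the abstract above, the following is a minimal sketch of a textbook inverse probability weighted (Horvitz-Thompson type) estimator of the average treatment effect, given propensity scores that have already been estimated. It is not the IPS estimator and does not use the authors' R package; the function name `ipw_ate` and the toy inputs are assumptions made for illustration only.

```python
def ipw_ate(treated, outcome, pscore):
    """Inverse probability weighted estimate of the average treatment effect.

    treated: 0/1 treatment indicators
    outcome: observed outcomes
    pscore:  propensity scores P(D = 1 | X), assumed to be estimated already
    """
    n = len(outcome)
    if not (len(treated) == len(pscore) == n) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(not 0.0 < p < 1.0 for p in pscore):
        # Scores at 0 or 1 make the weights undefined; scores close to
        # them produce the extreme weights the abstract warns about.
        raise ValueError("propensity scores must lie strictly between 0 and 1")
    treated_mean = sum(d * y / p for d, y, p in zip(treated, outcome, pscore)) / n
    control_mean = sum((1 - d) * y / (1 - p) for d, y, p in zip(treated, outcome, pscore)) / n
    return treated_mean - control_mean


# Toy data (hypothetical): constant propensity score of 0.5.
estimate = ipw_ate([1, 0, 1, 0], [2.0, 1.0, 4.0, 3.0], [0.5, 0.5, 0.5, 0.5])
```

The explicit range check on the scores is a design choice of this sketch: it makes the sensitivity to extreme propensity score estimates visible instead of silently returning a distorted estimate.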
Parameter Estimation of Heavy-Tailed AR Model with Missing Data via Stochastic EM
http://d.repec.org/n?u=RePEc:arx:papers:1809.07203&r=ecm
The autoregressive (AR) model is widely used to understand time series data. Traditionally, the innovation noise of the AR model is modeled as Gaussian. However, many time series, for example financial time series, are non-Gaussian, so an AR model with more general heavy-tailed innovations is preferred. Another issue that frequently occurs in time series is missing values, due to failures in the data recording system or unexpected data loss. Although there are numerous works on Gaussian AR time series with missing values, as far as we know, no existing work addresses missing data for the heavy-tailed AR model. In this paper, we consider this issue for the first time and propose an efficient framework for parameter estimation from incomplete heavy-tailed time series, based on stochastic approximation expectation maximization (SAEM) coupled with a Markov chain Monte Carlo (MCMC) procedure. The proposed algorithm is computationally cheap and easy to implement. The convergence of the proposed algorithm to a stationary point of the observed data likelihood is rigorously proved. Extensive simulations on synthetic and real datasets demonstrate the efficacy of the proposed framework.
Junyan Liu
Sandeep Kumar
Daniel P. Palomar
2018-09
Focused econometric estimation for noisy and small datasets: A Bayesian Minimum Expected Loss estimator approach
http://d.repec.org/n?u=RePEc:arx:papers:1809.06996&r=ecm
Central to many inferential situations is the estimation of rational functions of parameters. The mainstream in statistics and econometrics estimates these quantities based on the plug-in approach without consideration of the main objective of the inferential situation. We propose the Bayesian Minimum Expected Loss (MELO) approach focusing explicitly on the function of interest, and calculating its frequentist variability. Asymptotic properties of the MELO estimator are similar to the plug-in approach. Nevertheless, simulation exercises show that our proposal is better in situations characterized by small sample sizes and noisy models. In addition, we observe in the applications that our approach gives lower standard errors than frequently used alternatives when datasets are not very informative.
Andres Ramirez-Hassan
Manuel Correa-Giraldo
2018-09
Bootstrapping tail statistics: Tail quantile process, Hill estimator, and confidence intervals for high quantiles of heavy tailed distributions
http://d.repec.org/n?u=RePEc:msh:ebswps:2018-12&r=ecm
In risk management areas such as reinsurance, the need often arises to construct a confidence interval for a quantile in the tail of the distribution. While different methods are available for this purpose, doubts have been raised about the validity of the full-sample bootstrap. In this paper, we first obtain some general results on the validity of the full-sample bootstrap for the tail quantile process. This opens the possibility of developing bootstrap methods based on tail statistics. Second, we develop a bootstrap method for constructing confidence intervals for high quantiles of heavy-tailed distributions and show that it is consistent. In our simulation study, the bootstrap method for constructing confidence intervals for high quantiles performed overall better than the data tilting method, but none was uniformly the best; the data tilting method appears to be currently the preferred choice. Since the two methods are based on quite different approaches, we recommend that both methods be used side by side in applications.
Svetlana Litvinova
Mervyn J Silvapulle
Full-sample bootstrap, intermediate order statistic, Hill estimator, extreme value index, tail empirical process, tail quantile process
2018
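For readers unfamiliar with the Hill estimator named in the entry above, a minimal sketch of its standard textbook form follows: the average log-excess of the k largest observations over the (k+1)-th largest. It illustrates the tail statistic only, not the authors' bootstrap procedure; the function name and the toy sample are assumptions.

```python
import math


def hill_estimator(data, k):
    """Hill estimate of the extreme value index from the k largest observations."""
    x = sorted(data, reverse=True)
    if not 1 <= k < len(x):
        raise ValueError("k must satisfy 1 <= k < len(data)")
    threshold = x[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    # Average log-ratio of the k upper order statistics to the threshold.
    return sum(math.log(x[i] / threshold) for i in range(k)) / k


# Toy sample (hypothetical): top two values 8 and 4, threshold 2.
gamma_hat = hill_estimator([1.0, 2.0, 4.0, 8.0], k=2)
```

The choice of k (the number of upper order statistics) is left to the caller here; in practice it drives the bias-variance trade-off of the estimate.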
Nonparametric Regression with Selectively Missing Covariates
http://d.repec.org/n?u=RePEc:arx:papers:1810.00411&r=ecm
We consider the problem of regressions with selectively observed covariates in a nonparametric framework. Our approach relies on instrumental variables that explain variation in the latent covariates but have no direct effect on selection. The regression function of interest is shown to be a weighted version of observed conditional expectation where the weighting function is a fraction of selection probabilities. Nonparametric identification of the fractional probability weight (FPW) function is achieved via a partial completeness assumption. We provide primitive functional form assumptions for partial completeness to hold. The identification result is constructive for the FPW series estimator. We derive the rate of convergence and also the pointwise asymptotic distribution. In both cases, the asymptotic performance of the FPW series estimator does not suffer from the inverse problem which derives from the nonparametric instrumental variable approach. In a Monte Carlo study, we analyze the finite sample properties of our estimator and we demonstrate the usefulness of our method in analyses based on survey data. In the empirical application, we estimate the association between income and health using linked data from the SHARE survey data and administrative pension information. The pension information which is a function of the full earnings history is used as an instrument. We show that income is selectively missing and we demonstrate that standard methods that do not account for the nonrandom selection process are strongly downward biased, in particular for high income individuals.
Christoph Breunig
Peter Haan
2018-09
Jensen-Shannon Divergence as a Goodness-of-Fit Measure for Maximum Likelihood Estimation and Curve Fitting
http://d.repec.org/n?u=RePEc:arx:papers:1809.11052&r=ecm
The coefficient of determination, known as $R^2$, is commonly used as a goodness-of-fit criterion for fitting linear models. $R^2$ is somewhat controversial when fitting nonlinear models, although it may be generalised on a case-by-case basis to deal with specific models such as the logistic model. Assume we are fitting a parametric distribution to a data set using, say, the maximum likelihood estimation method. A general approach to measure the goodness-of-fit of the fitted parameters, which we advocate herein, is to use a nonparametric measure for model comparison between the raw data and the fitted model. In particular, for this purpose we put forward the Jensen-Shannon divergence ($JSD$) as a metric, which is bounded and has an intuitive information-theoretic interpretation. We demonstrate, via a straightforward procedure making use of the $JSD$, that it can be used as part of maximum likelihood estimation or curve fitting as a measure of goodness-of-fit, including the construction of a confidence interval for the fitted parametric distribution. We also propose that the $JSD$ can be used more generally in nonparametric hypothesis testing for model selection.
Mark Levene
Aleksejus Kononovicius
2018-09
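To make the measure in the entry above concrete, here is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions on a common support, for example binned empirical frequencies and the probabilities a fitted model assigns to the same bins. It uses base-2 logarithms so the value lies in [0, 1]. The binning step and the confidence interval construction described in the abstract are not reproduced; the function names are assumptions.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    # The mixture m is positive wherever p or q is, so both KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


identical = jensen_shannon_divergence([0.5, 0.5], [0.5, 0.5])  # no divergence
disjoint = jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0])   # maximal divergence
```

Working through the mixture distribution is what keeps the measure bounded: unlike the plain Kullback-Leibler divergence, it stays finite even when one distribution assigns zero probability to a bin the other uses.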
Bootstrap Assisted Tests of Symmetry for Dependent Data
http://d.repec.org/n?u=RePEc:svk:wpaper:1058&r=ecm
The paper considers the problem of testing for symmetry (about an unknown centre) of the marginal distribution of a strictly stationary and weakly dependent stochastic process. The possibility of using the autoregressive sieve bootstrap and stationary bootstrap procedures to obtain critical values and P-values for symmetry tests is explored. Bootstrap-assisted tests for symmetry are straightforward to implement and require no prior estimation of asymptotic variances. The small-sample properties of a wide variety of tests are investigated using Monte Carlo experiments. A bootstrap-assisted version of the triples test is found to have the best overall performance.
Zacharias Psaradakis
Marian Vavra
Autoregressive sieve bootstrap; Stationary bootstrap; Symmetry; Weak dependence
2018-10
Nonparametric Estimation and Identification in Non-Separable Models Using Panel Data
http://d.repec.org/n?u=RePEc:arx:papers:1810.00283&r=ecm
We present non-parametric identification results for panel models in the presence of a vector of unobserved heterogeneity that is not additively separable in the structural function. We exploit the time-invariance and finite dimension of the heterogeneity to achieve identification of a number of objects of interest with the panel length fixed. Identification does not require that the researcher have access to an instrument that is uncorrelated with the unobserved heterogeneity. Instead the identification strategy relies on an assumption that some lags and leads of observables are independent conditional on the unobserved heterogeneity and some controls. The identification strategy motivates an estimation procedure based on penalized sieve minimum distance estimation in the non-parametric instrumental variables framework. We give conditions under which the estimator is consistent and derive its rate of convergence. We present Monte Carlo evidence of its efficacy in finite samples.
Ben Deaner
2018-09
Large mixed-frequency VARs with a parsimonious time-varying parameter structure
http://d.repec.org/n?u=RePEc:zbw:bubdps:402018&r=ecm
To simultaneously consider mixed-frequency time series, their joint dynamics, and possible structural changes, we introduce a time-varying parameter mixed-frequency VAR. To keep our approach from becoming too complex, we implement time variation parsimoniously: only the intercepts and a common factor in the error variances vary over time. We can therefore estimate moderately large systems in a reasonable amount of time, which makes our modifications appealing for practical use. For eleven U.S. variables, we examine the performance of our model and compare the results to the time-constant MF-VAR of Schorfheide and Song (2015). Our results demonstrate the feasibility and usefulness of our method.
Götz, Thomas B.
Hauzenberger, Klemens
Mixed Frequencies, Time-Varying Intercepts, Common Stochastic Volatility, Bayesian VAR, Forecasting
2018
Modelling Structural Zeros in Compositional Data
http://d.repec.org/n?u=RePEc:crt:wpaper:1803&r=ecm
Inspired by Butler and Glasbey (2008) we propose a model that treats the zero values for compositional data in a different manner.
Michail Tsagris
compositional data, a-transformation, structural zeros
2018-10-04
Prediction Regions for Interval-valued Time Series
http://d.repec.org/n?u=RePEc:ucr:wpaper:201817&r=ecm
We approximate probabilistic forecasts for interval-valued time series by offering alternative approaches to construct bivariate prediction regions of the interval center and range (or lower/upper bounds). We estimate a bivariate system of the center/log-range, which may not be normally distributed. Implementing analytical or bootstrap methods, we directly transform prediction regions for center/log-range into those for center/range and upper/lower bounds systems. We propose new metrics to evaluate the regions' performance. Monte Carlo simulations show bootstrap methods being preferred even in Gaussian systems. For daily SP500 low/high return intervals, we build joint conditional prediction regions of the return level and return volatility.
Gloria Gonzalez-Rivera
Yun Luo
Esther Ruiz
Bootstrap, Constrained Regression, Coverage Rates, Logarithmic Transformation, QML estimation
2018-10
Seasonality Detection in Small Samples using Score-Driven Nonlinear Multivariate Dynamic Location Models
http://d.repec.org/n?u=RePEc:cte:werepe:27483&r=ecm
We suggest a new mechanism to detect stochastic seasonality of multivariate macroeconomic variables, by using an extension of the score-driven first-order multivariate t-distribution model. We name the new model as the quasi-vector autoregressive (QVAR) model. QVAR is a nonlinear extension of Gaussian VARMA (VAR moving average). The location of dependent variables for QVAR is updated by the score function, thus QVAR is robust to extreme observations. For QVAR, we present the econometric formulation, computation of the impulse response function (IRF), maximum likelihood (ML) estimation, and conditions of the asymptotic properties of ML that include invertibility. We use quarterly macroeconomic data for the period of 1987:Q1 to 2013:Q2 inclusive, which include extreme observations from three I(0) variables: percentage change in crude oil real price, United States (US) inflation rate, and US real gross domestic product (GDP) growth. The sample size of these data is relatively small, which occurs frequently in macroeconomic analyses. The statistical performance of QVAR is superior to that of VARMA and VAR. Annual seasonality effects are identified for QVAR, whereas those effects are not identified for VARMA and VAR. Our results suggest that QVAR may be used as a practical tool for seasonality detection in small macroeconomic datasets.
Licht, Adrian
Escribano Sáez, Álvaro
Blazsek, Szabolcs Istvan
2018-09-12
Ergodicity conditions for a double mixed Poisson autoregression
http://d.repec.org/n?u=RePEc:pra:mprapa:88843&r=ecm
We propose a double mixed Poisson autoregression in which the intensity, scaled by a unit mean independent and identically distributed (iid) mixing process, has different regime specifications according to the state of a finite unobserved iid chain. Under some contraction in mean conditions, we show that the proposed model is strictly stationary and ergodic with a finite mean. Applications to various count time series models are given.
Aknouche, Abdelhakim
Demouche, Nacer
Double mixed Poisson autoregression, negative binomial mixture INGARCH model, ergodicity, weak dependence, contraction in mean
2018-03-03
Multivariate Stochastic Volatility with Co-Heteroscedasticity
http://d.repec.org/n?u=RePEc:ngi:dpaper:18-12&r=ecm
This paper develops a new methodology that decomposes shocks into homoscedastic and heteroscedastic components. This specification implies there exist linear combinations of heteroscedastic variables that eliminate heteroscedasticity. That is, these linear combinations are homoscedastic; a property we call co-heteroscedasticity. The heteroscedastic part of the model uses a multivariate stochastic volatility inverse Wishart process. The resulting model is invariant to the ordering of the variables, which we show is important for impulse response analysis but is generally important for, e.g., volatility estimation and variance decompositions. The specification allows estimation in moderately high dimensions. The computational strategy uses a novel particle filter algorithm, a reparameterization that substantially improves algorithmic convergence and an alternating-order particle Gibbs that reduces the number of particles needed for accurate estimation. We provide two empirical applications; one to exchange rate data and another to a large Vector Autoregression (VAR) of US macroeconomic variables. We find strong evidence for co-heteroscedasticity and, in the second application, estimate the impact of monetary policy on the homoscedastic and heteroscedastic components of macroeconomic variables.
Joshua Chan
Arnaud Doucet
Roberto Leon-Gonzalez
Rodney W. Strachan
2018-10
Efficient generation of time series with diverse and controllable characteristics
http://d.repec.org/n?u=RePEc:msh:ebswps:2018-15&r=ecm
The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires a diverse collection of time series data to enable reliable comparisons against alternative approaches. We propose the use of mixture autoregressive (MAR) models to generate collections of time series with diverse features. We simulate sets of time series using MAR models and investigate the diversity and coverage of the simulated time series in a feature space. An efficient method is also proposed for generating new time series with controllable features by tuning the parameters of the MAR models. The simulated data based on our method can be used as evaluation tool for tasks such as time series classification and forecasting.
Yanfei Kang
Rob J Hyndman
Feng Li
time series features, time series generation, mixture autoregressive models
2018
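The generation step described in the entry above can be sketched as follows, assuming a Gaussian mixture autoregressive (MAR) model in which each observation is drawn from one of several AR components chosen at random with fixed mixing weights. This is an illustrative simulator only, not the authors' implementation; the function name, the burn-in length, and all parameter values are hypothetical.

```python
import random


def simulate_mar(n, weights, coefs, sigmas, seed=None, burn_in=100):
    """Simulate n observations from a Gaussian mixture autoregressive model.

    weights: mixing probabilities of the components
    coefs:   one list per component, [intercept, phi_1, ..., phi_p]
    sigmas:  innovation standard deviation of each component
    """
    rng = random.Random(seed)
    p = max(len(c) - 1 for c in coefs)  # largest AR order among components
    series = [0.0] * p                  # zero start-up values, dropped below
    for _ in range(n + burn_in):
        # Pick the component that generates this observation.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        intercept, *phis = coefs[k]
        mean = intercept + sum(phi * series[-(i + 1)] for i, phi in enumerate(phis))
        series.append(mean + sigmas[k] * rng.gauss(0.0, 1.0))
    return series[p + burn_in:]


# Hypothetical two-component example: an AR(1) and an AR(2) component.
sample = simulate_mar(
    200,
    weights=[0.6, 0.4],
    coefs=[[0.0, 0.5], [1.0, -0.3, 0.2]],
    sigmas=[1.0, 0.5],
    seed=1,
)
```

Varying the weights, coefficients and standard deviations changes the features of the generated series, which is the lever the abstract refers to when it speaks of tuning the MAR parameters.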
Combining Uncertainty with Uncertainty to Get Certainty? Efficiency Analysis for Regulation Purposes
http://d.repec.org/n?u=RePEc:mia:wpaper:2018-02&r=ecm
Data envelopment analysis (DEA) and stochastic frontier analysis (SFA), as well as combinations thereof, are widely applied in incentive regulation practice, where the assessment of efficiency plays a major role in regulation design and benchmarking. Using a Monte Carlo simulation experiment, this paper compares the performance of six alternative methods commonly applied by regulators. Our results demonstrate that combination approaches, such as taking the maximum or the mean over DEA and SFA efficiency scores, have certain practical merits and might offer a useful alternative to strict reliance on a singular method. In particular, the results highlight that taking the maximum not only minimizes the risk of underestimation, but can also improve the precision of efficiency estimation. Based on our results, we give recommendations for the estimation of individual efficiencies for regulation purposes and beyond.
Mark Andor
Christopher F. Parmeter
Stephan Sommer
Data Envelopment Analysis; Stochastic Frontier Analysis; Efficiency Analysis; Regulation; Network Operators
Publication Status: Forthcoming
2018-09-29
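The combination rules mentioned in the entry above (taking the maximum or the mean over DEA and SFA efficiency scores) amount to a firm-by-firm operation on two score vectors. A minimal sketch follows, assuming the scores have already been estimated elsewhere; the function name and the example scores are hypothetical.

```python
def combine_efficiency(dea_scores, sfa_scores, rule="max"):
    """Combine per-firm DEA and SFA efficiency scores by maximum or mean."""
    if len(dea_scores) != len(sfa_scores):
        raise ValueError("both methods must score the same firms")
    if rule == "max":
        return [max(d, s) for d, s in zip(dea_scores, sfa_scores)]
    if rule == "mean":
        return [(d + s) / 2 for d, s in zip(dea_scores, sfa_scores)]
    raise ValueError("rule must be 'max' or 'mean'")


# Hypothetical scores for three firms.
dea = [0.80, 0.60, 0.95]
sfa = [0.70, 0.90, 0.85]
best_of_two = combine_efficiency(dea, sfa, rule="max")
average = combine_efficiency(dea, sfa, rule="mean")
```

Under the maximum rule a firm is never scored below either individual method, which is why the abstract associates it with a lower risk of underestimating efficiency.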
Forecast Density Combinations of Dynamic Models and Data Driven Portfolio Strategies
http://d.repec.org/n?u=RePEc:bno:worpap:2018_10&r=ecm
A dynamic asset-allocation model is specified in probabilistic terms as a combination of return distributions resulting from multiple pairs of dynamic models and portfolio strategies based on momentum patterns in US industry returns. The nonlinear state space representation of the model allows efficient and robust simulation-based Bayesian inference using a novel non-linear filter. Combination weights can be cross-correlated and correlated over time using feedback mechanisms. Diagnostic analysis gives insight into model and strategy misspecification. Empirical results show that a smaller flexible model-strategy combination performs better in terms of expected return and risk than a larger basic model-strategy combination. Dynamic patterns in combination weights and diagnostic learning provide useful signals for improved modelling and policy, in particular, from a risk-management perspective.
Nalan Basturk
Agnieszka Borowska
Stefano Grassi
Lennart Hoogerheide
Herman K. van Dijk
2018-10-08
A Note on Specification Testing in Some Structural Regression Models
http://d.repec.org/n?u=RePEc:bbk:bbkefp:1809&r=ecm
There is a useful but not widely known framework for jointly implementing Durbin-Wu-Hausman exogeneity and Sargan-Hansen overidentification tests, as a single artificial regression. This note sets out the framework for linear models and discusses its extension to non-linear models.
Walter Beckert
endogeneity, identification, testing, artificial regression.
2018-08
Optimal Asset Allocation with Multivariate Bayesian Dynamic Linear Models
http://d.repec.org/n?u=RePEc:brd:wpaper:123&r=ecm
We introduce a simulation-free method to model and forecast multiple asset returns and employ it to investigate the optimal ensemble of features to include when jointly predicting monthly stock and bond excess returns. Our approach builds on the Bayesian Dynamic Linear Models of West and Harrison (1997), and it can objectively determine, through a fully automated procedure, both the optimal set of regressors to include in the predictive system and the degree to which the model coefficients, volatilities, and covariances should vary over time. When applied to a portfolio of five stock and bond returns, we find that our method leads to large forecast gains, both in statistical and economic terms. In particular, we find that relative to a standard no-predictability benchmark, the optimal combination of predictors, stochastic volatility, and time-varying covariances increases the annualized certainty equivalent returns of a leverage-constrained power utility investor by more than 500 basis points.
Carlos Carvalho
Jared D. Fisher
Davide Pettenuzzo
Optimal asset allocation, Bayesian econometrics, Dynamic Linear models
2018-09
Seasonal Quasi-Vector Autoregressive Models with an Application to Crude Oil Production and Economic Activity in the United States and Canada
http://d.repec.org/n?u=RePEc:cte:werepe:27484&r=ecm
We introduce the Seasonal-QVAR (quasi-vector autoregressive) model that we apply to study the relationship between oil production and economic activity. Seasonal-QVAR is a score-driven nonlinear model for the multivariate t distribution. It is an alternative to the basic structural model that disentangles local level and stochastic seasonality. Seasonal-QVAR is robust to extreme observations and it is an extension of Seasonal-VARMA (VAR moving average). We use monthly data from world crude oil production growth, global real economic activity growth and the industrial production growths of the United States and Canada. We address an important economic question about the influence of world crude oil production on the industrial productions of the United States and Canada. We find that the effects of industrial production growth of the United States on world crude oil production growth are about six times higher for the basic structural model and Seasonal-VARMA than for Seasonal-QVAR. We also find that the effects of world crude oil production growth on the industrial production growth of Canada are positive for Seasonal-QVAR, but those effects are negative for Seasonal-VARMA. Likelihood-based performance metrics and transitivity arguments support the estimates of Seasonal-QVAR, as opposed to the basic structural model and Seasonal-VARMA.
Licht, Adrian
Escribano Sáez, Álvaro
Blazsek, Szabolcs Istvan
Vector autoregressive moving average (VARMA) model; Basic structural model; Nonlinear multivariate dynamic location models; Score-driven stochastic seasonality; Dynamic conditional score (DCS)
2018-09-12
Term Structure Models During the Global Financial Crisis: A Parsimonious Text Mining Approach
http://d.repec.org/n?u=RePEc:cfi:fseres:cf446&r=ecm
This work develops and estimates a three-factor term structure model with explicit sentiment factors in a period including the global financial crisis, where market confidence was said to erode considerably. It utilizes a large text data set of real-time, relatively high-frequency market news and takes account of the difficulties in incorporating market sentiment into the models. To the best of our knowledge, this is the first attempt to use this category of data in term-structure models. Although market sentiment or market confidence is often regarded as an important driver of asset markets, it is not explicitly incorporated in traditional empirical factor models for daily yield curve data because it is unobservable. To overcome this problem, we use a text mining approach to generate observable variables which are driven by otherwise unobservable sentiment factors. Then, applying the Monte Carlo filter as a filtering method in a state space Bayesian filtering approach, we estimate the dynamic stochastic structure of these latent factors from observable variables driven by these latent variables. As a result, the three-factor model with text mining is able to distinguish (1) a spread-steepening factor which is driven by pessimists' view and explains the spreads related to ultra-long term yields from (2) a spread-flattening factor which is driven by optimists' view and influences the long and medium term spreads. Also, the three-factor model with text mining fits the observed yields better than the model without text mining. Moreover, we collect market participants' views about specific spreads in the term structure and find that the movement of the identified sentiment factors is consistent with the market participants' views, and thus market sentiment.
Kiyohiko G. Nishimura
Seisho Sato
Akihiko Takahashi
2018-10
Discriminant Analysis with Spherical Data
http://d.repec.org/n?u=RePEc:crt:wpaper:1804&r=ecm
Discriminant analysis for spherical data, or directional data in general, has not been extensively studied, and most papers focus on one distribution, the von Mises-Fisher.
Michail Tsagris
Abdulaziz Alenazi
spherical data, rotationally non symmetric, classification
2018-10-04
Detecting Regimes of Predictability in the U.S. Equity Premium
http://d.repec.org/n?u=RePEc:esy:uefcwp:23198&r=ecm
We investigate the stability of predictive regression models for the U.S. equity premium. A new approach for detecting regimes of temporary predictability is proposed using sequential implementations of standard (heteroskedasticity-robust) regression t-statistics for predictability applied over relatively short time periods. Critical values for each test in the sequence are provided using subsampling methods. Our primary focus is to develop a real-time monitoring procedure for the emergence of predictive regimes using tests based on end-of-sample data in the sequential procedure, although the procedure could be used for an historical analysis of predictability. Our proposed method is robust to both the degree of persistence and endogeneity of the regressors in the predictive regression and to certain forms of heteroskedasticity in the shocks. We discuss how the detection procedure can be designed such that the false positive rate is pre-set by the practitioner at the start of the monitoring period. We use our approach to investigate for the presence of regime changes in the predictability of the U.S. equity premium at the one-month horizon by traditional macroeconomic and financial variables, and by binary technical analysis indicators. Our results suggest that the one-month ahead equity premium has temporarily been predictable (displaying so-called ‘pockets of predictability’), and that these episodes of predictability could have been detected in real-time by practitioners using our proposed methodology.
Harvey, David I
Leybourne, Stephen J
Sollis, Robert
Taylor, AM Robert
2018-05-15