nep-ecm 2016-03-10 papers

on Econometrics

Issue of 2016‒03‒10
fifteen papers chosen by
Sune Karlsson
Örebro universitet

New Distribution Theory for the Estimation of Structural Break Point in Mean By Jiang Liang; Wang Xiaohu; Jun Yu
Big data analytics: a new perspective By Chudik, Alexander; Kapetanios, George; Pesaran, M. Hashem
A dynamic component model for forecasting high-dimensional realized covariance matrices By BAUWENS, L.; BRAIONE, M.; STORTI, G.
Reject inference in application scorecards: evidence from France By Ha-Thu Nguyen
Score-Based Tests of Differential Item Functioning in the Two-Parameter Model By Ting Wang; Carolin Strobl; Achim Zeileis; Edgar C. Merkle
Stationarity of Heterogeneity in Production Technology using Latent Class Modelling By AGRELL, P; BREA-SOLÍS, H.
Revisiting the transitional dynamics of business-cycle phases with mixed frequency data By Marie Bessec
Modified Profile Likelihood Inference and Interval Forecast of the Burst of Financial Bubbles By Vladimir Filimonov; Guilherme Demos; Didier Sornette
Identifying the Discount Factor in Dynamic Discrete Choice Models By Abbring, Jaap H; Daljord, Øystein
Semiparametric Analysis of Network Formation By Koen Jochmans
Measuring poverty with the Foster, Greer and Thorbecke indexes based on the Gamma distribution By Fernández-Morales, Antonio
Sparse Change-Point Time Series Models By Dufays, A.; Rombouts, V.
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) By Steven M. Boker; Timothy R. Brick; Joschua N. Pritikin; Yang Wang; Timo von Oertzen; Donald Brown; John Lach; Ryne Estabrook; Michael D. Hunter; Hermine H. Maes; Michael C. Neale
Forecasting Daily Stock Volatility Using GARCH-CJ Type Models with Continuous and Jump Variation By BOUSALAM, Issam; HAMZAOUI, Moustapha; ZOUHAYR, Otman
Interaction matrix selection in spatial econometrics with an application to growth theory By Nicolas Debarsy; Cem Ertur

New Distribution Theory for the Estimation of Structural Break Point in Mean

By:	Jiang Liang (Singapore Management University); Wang Xiaohu (The Chinese University of Hong Kong); Jun Yu (Singapore Management University)
Abstract:	Based on the Girsanov theorem, this paper first obtains the exact distribution of the maximum likelihood estimator of structural break point in a continuous time model. The exact distribution is asymmetric and tri-modal, indicating that the estimator is seriously biased. These two properties are also found in the finite sample distribution of the least squares estimator of structural break point in the discrete time model. The paper then builds a continuous time approximation to the discrete time model and develops an in-fill asymptotic theory for the least squares estimator. The obtained in-fill asymptotic distribution is asymmetric and tri-modal and delivers good approximations to the finite sample distribution. In order to reduce the bias in the estimation of both the continuous time model and the discrete time model, a simulation-based method based on the indirect estimation approach is proposed. Monte Carlo studies show that the indirect estimation method achieves substantial bias reductions. However, since the binding function has a slope less than one, the variance of the indirect estimator is larger than that of the original estimator.
Keywords:	Structural break, Bias reduction, Indirect estimation, Exact distribution, In-fill asymptotics
JEL:	C11 C46
Date:	2016–01
URL:	http://d.repec.org/n?u=RePEc:siu:wpaper:01-2016&r=ecm
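
For readers unfamiliar with the estimator the abstract refers to, the following is a minimal sketch of a least squares break-point estimator for a single shift in mean in a discrete time series. The function name `ls_break_point` and the `trim` argument are illustrative assumptions, not taken from the paper, and the sketch does not implement the paper's in-fill asymptotics or indirect estimation.

```python
def ls_break_point(y, trim=1):
    """Least squares estimate of a single break point in the mean.

    Returns (k_hat, ssr), where the first k_hat observations form the
    pre-break segment and ssr is the minimised sum of squared residuals.
    `trim` (an assumed tuning choice) keeps at least that many
    observations in each segment.
    """
    n = len(y)
    best_k, best_ssr = None, float("inf")
    for k in range(trim, n - trim + 1):
        left, right = y[:k], y[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        # Sum of squared residuals when each segment is fitted by its own mean.
        ssr = (sum((v - mean_left) ** 2 for v in left)
               + sum((v - mean_right) ** 2 for v in right))
        if ssr < best_ssr:
            best_k, best_ssr = k, ssr
    return best_k, best_ssr
```

Calling `ls_break_point` on a series whose mean shifts once returns the split index with the smallest two-segment residual sum of squares; ties are resolved in favour of the earliest candidate.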

Big data analytics: a new perspective

By:	Chudik, Alexander (Federal Reserve Bank of Dallas); Kapetanios, George (King's College of); Pesaran, M. Hashem (University of Southern California)
Abstract:	Model specification and selection are recurring themes in econometric analysis. Both topics become considerably more complicated in the case of large-dimensional data sets where the set of specification possibilities can become quite large. In the context of linear regression models, penalised regression has become the de facto benchmark technique used to trade off parsimony and fit when the number of possible covariates is large, often much larger than the number of available observations. However, issues such as the choice of a penalty function and tuning parameters associated with the use of penalised regressions remain contentious. In this paper, we provide an alternative approach that considers the statistical significance of the individual covariates one at a time, whilst taking full account of the multiple testing nature of the inferential problem involved. We refer to the proposed method as One Covariate at a Time Multiple Testing (OCMT) procedure. The OCMT has a number of advantages over the penalised regression methods: It is based on statistical inference and is therefore easier to interpret and relate to the classical statistical analysis, it allows working under more general assumptions, it is computationally simple and considerably faster, and it performs better in small samples for almost all of the five different sets of experiments considered in this paper. Despite its simplicity, the theory behind the proposed approach is quite complicated. We provide extensive theoretical and Monte Carlo results in support of adding the proposed OCMT model selection procedure to the toolbox of applied researchers.
JEL:	C52
Date:	2016–02–29
URL:	http://d.repec.org/n?u=RePEc:fip:feddgw:268&r=ecm
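
The idea of testing covariates one at a time while accounting for multiple testing can be illustrated with a simplified sketch. It runs a separate simple regression for each candidate covariate and keeps those whose t-ratio exceeds a Bonferroni-style normal critical value. The threshold, the function names, and the single-pass structure are all assumptions made for illustration; the paper's actual OCMT procedure and critical value function may differ.

```python
from statistics import NormalDist


def t_stat_single(y, x):
    """t-ratio of the slope in an OLS regression of y on an intercept and x."""
    n_obs = len(y)
    x_bar = sum(x) / n_obs
    y_bar = sum(y) / n_obs
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    # Standard error of the slope; requires a non-zero residual sum of squares.
    se = (rss / (n_obs - 2) / sxx) ** 0.5
    return beta / se


def one_at_a_time_select(y, covariates, p=0.05):
    """Indices of covariates whose individual t-ratio passes the threshold.

    The critical value shrinks the nominal size p by the number of
    covariates (a Bonferroni-style adjustment, assumed here).
    """
    n_cov = len(covariates)
    crit = NormalDist().inv_cdf(1 - p / (2 * n_cov))
    return [j for j, x in enumerate(covariates)
            if abs(t_stat_single(y, x)) > crit]
```

Each regression involves only one covariate, which is why an approach of this kind stays computationally cheap even when the number of candidates is large.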

A dynamic component model for forecasting high-dimensional realized covariance matrices

By:	BAUWENS, L. (Université catholique de Louvain, CORE, Belgium); BRAIONE, M. (Université catholique de Louvain, CORE, Belgium); STORTI, G. (Université catholique de Louvain, CORE, Belgium)
Abstract:	The Multiplicative MIDAS Realized DCC (MMReDCC) model of Bauwens et al. [5] decomposes the dynamics of the realized covariance matrix of returns into short-run transitory and long-run secular components where the latter reflects the effect of the continuously changing economic conditions. The model allows to obtain positive-definite forecasts of the realized covariance matrices but, due to the high number of parameters involved, estimation becomes unfeasible for large cross-sectional dimensions. Our contribution in this paper is twofold. First, in order to obtain a computationally feasible estimation procedure, we propose an algorithm that relies on the maximization of an iteratively re-computed moment-based profile likelihood function. We assess the finite sample properties of the proposed algorithm via a simulation study. Second, we propose a bootstrap procedure for generating multi-step ahead forecasts from the MMReDCC model. In an empirical application on realized covariance matrices for fifty equities, we find that the MMReDCC not only statistically outperforms the selected benchmarks in-sample, but also improves the out-of-sample ability to generate accurate multi-step ahead forecasts of the realized covariances.
Keywords:	Realized covariance, dynamic component models, multi-step forecasting, MIDAS, targeting, model confidence set
Date:	2016–02–01
URL:	http://d.repec.org/n?u=RePEc:cor:louvco:2016001&r=ecm

Reject inference in application scorecards: evidence from France

By:	Ha-Thu Nguyen
Abstract:	Credit scoring models are commonly developed using only accepted Known Good/Bad (G/B) applications, called KGB model, because we only know the performance of those accepted in the past. Obviously, the KGB model is not indicative of the entire through-the-door population, and reject inference precisely attempts to address the bias by assigning an inferred G/B status to rejected applications. In this paper, we discuss the pros and cons of various reject inference techniques, and pitfalls to avoid when using them. We consider a real dataset of a major French consumer finance bank to assess the effectiveness of the practice of using reject inference. To do that, we rely on the logistic regression framework to model probabilities to become good/bad, and then validate the model performance with and without sample selection bias correction. Our main results can be summarized as follows. First, we show that the best reject inference technique is not necessarily the most complicated one: reweighting and parceling provide more accurate and relevant results than fuzzy augmentation and Heckman’s two-stage correction. Second, disregarding rejected applications significantly impacts the forecast accuracy of the scorecard. Third, as the sum of standard errors dramatically reduces when the sample size increases, reject inference turns out to produce an improved representation of the population. Finally, reject inference appears to be an effective way to reduce overfitting in model selection.
Keywords:	Reject inference, sample selection, selection bias, logistic regression, reweighting, parceling, fuzzy augmentation, Heckman’s two-stage correction.
JEL:	C51 C52 C53 G21
Date:	2016
URL:	http://d.repec.org/n?u=RePEc:drm:wpaper:2016-10&r=ecm

Score-Based Tests of Differential Item Functioning in the Two-Parameter Model

By:	Ting Wang; Carolin Strobl; Achim Zeileis; Edgar C. Merkle
Abstract:	Measurement invariance is a fundamental assumption in item response theory models, where the relationship between a latent construct (ability) and observed item responses is of interest. Violation of this assumption would render the scale misinterpreted or cause systematic bias against certain groups of people. While a number of methods have been proposed to detect measurement invariance violations, they typically require advance definition of problematic item parameters and respondent grouping information. However, these pieces of information are typically unknown in practice. As an alternative, this paper focuses on a family of recently-proposed tests based on stochastic processes of casewise derivatives of the likelihood function (i.e., scores). These score-based tests only require estimation of the null model (when measurement invariance is assumed to hold), and they have been previously applied in factor-analytic, continuous data contexts as well as in models of the Rasch family. In this paper, we aim to extend these tests to two parameter item response models estimated via maximum likelihood. The tests' theoretical background and implementation are detailed, and the tests' abilities to identify problematic item parameters are studied via simulation. An empirical example illustrating the tests' use in practice is also provided.
Keywords:	measurement invariance, item response theory, factor analysis, 2PL model, differential item functioning
JEL:	C30 C52 C87
Date:	2016–03
URL:	http://d.repec.org/n?u=RePEc:inn:wpaper:2016-05&r=ecm

Stationarity of Heterogeneity in Production Technology using Latent Class Modelling

By:	AGRELL, P (Université catholique de Louvain, CORE, Belgium); BREA-SOLÍS, H. (HEC Management School, University of Liege)
Abstract:	Latent class modelling (LC) has been advanced as a promising alternative for addressing heterogeneity in frontier analysis models, in particular those where the individual scores are used in regulatory settings. If the production possibility set contains multiple distinct technologies, pooled approaches would result in biased results. We revisit the fundamentals of production theory and formulate a set of criteria for identification of heterogeneity: completeness (the inclusion of all data in the analysis), stationarity (the temporal stability of the identified production technologies), and endogeneity (no ad hoc determination of the cardinality of the classes). We also distinguish between the identification of a sporadic idiosyncratic shock, an outlier observation, and the identification of a time-persistent technology. Using a representative data set for regulation (a panel for Swedish electricity distributors 2000-2006), we test LC modelling for a Cobb-Douglas production function using the defined criteria. The LC results are compared to the pooled stochastic frontier analysis (SFA) model as a benchmark. Outliers are detected using an adjusted DEA super-efficiency procedure. Our results show that about 78% of the distributors are assigned to a single class, the remaining 22% split into two smaller classes that are non-stationary and largely composed of outliers. It is hardly conceivable that a production technology could change over this short horizon, implying that LC should be seen more as an enhanced outlier analysis than as a solid identification method for heterogeneity in the production set. More generally, we argue that the claim for heterogeneity in reference set deserves a more rigorous investigation to control for the multiple effects of sample size bias, specification error and the impact on functional form assumptions.
Keywords:	Frontier analysis, latent class models, SFA, DEA, outliers, regulation
JEL:	D72 L51
Date:	2015–11–06
URL:	http://d.repec.org/n?u=RePEc:cor:louvco:2015047&r=ecm

Revisiting the transitional dynamics of business-cycle phases with mixed frequency data

By:	Marie Bessec (LEDa - Laboratoire d'Economie de Dauphine - Université Paris IX - Paris Dauphine)
Abstract:	This paper introduces a Markov-Switching model where transition probabilities depend on higher frequency indicators and their lags, through polynomial weighting schemes. The MSV-MIDAS model is estimated via maximum likelihood methods. The estimation relies on a slightly modified version of Hamilton’s recursive filter. We use Monte Carlo simulations to assess the robustness of the estimation procedure and related test-statistics. The results show that ML provides accurate estimates, but they suggest some caution in the tests on the parameters involved in the transition probabilities. We apply this new model to the detection and forecast of business cycle turning points. We properly detect recessions in the United States and the United Kingdom by exploiting the link between GDP growth and higher frequency variables from financial and energy markets. The term spread is a particularly useful indicator to predict recessions in the United States, while stock returns have the strongest explanatory power around British turning points.
Keywords:	Markov-Switching, mixed frequency data, business cycles
Date:	2015–06–22
URL:	http://d.repec.org/n?u=RePEc:hal:journl:hal-01276824&r=ecm
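
A transition probability driven by lagged high-frequency data through a polynomial weighting scheme could be sketched as follows. The exponential Almon weights and the logistic link are common choices in the MIDAS and Markov-switching literatures and are assumed here; the abstract does not state which forms the paper uses, and all function and parameter names are hypothetical.

```python
import math


def exp_almon_weights(theta1, theta2, n_lags):
    """Exponential Almon lag weights, normalised to sum to one."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def stay_probability(x_lags, intercept, slope, theta1, theta2):
    """Probability of remaining in the current regime.

    x_lags holds the high-frequency indicator, most recent lag first.
    The weighted lags are passed through a logistic link (an assumption).
    """
    weights = exp_almon_weights(theta1, theta2, len(x_lags))
    z = sum(w * x for w, x in zip(weights, x_lags))
    return 1.0 / (1.0 + math.exp(-(intercept + slope * z)))
```

With `theta1 = theta2 = 0` the weights are equal, so the indicator enters as a simple average of its lags; only two weighting parameters are needed regardless of the number of lags.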

Modified Profile Likelihood Inference and Interval Forecast of the Burst of Financial Bubbles

By:	Vladimir Filimonov; Guilherme Demos; Didier Sornette
Abstract:	We present a detailed methodological study of the application of the modified profile likelihood method for the calibration of nonlinear financial models characterised by a large number of parameters. We apply the general approach to the Log-Periodic Power Law Singularity (LPPLS) model of financial bubbles. This model is particularly relevant because one of its parameters, the critical time $t_c$ signalling the burst of the bubble, is arguably the target of choice for dynamical risk management. However, previous calibrations of the LPPLS model have shown that the estimation of $t_c$ is in general quite unstable. Here, we provide a rigorous likelihood inference approach to determine $t_c$, which takes into account the impact of the other nonlinear (so-called "nuisance") parameters for the correct adjustment of the uncertainty on $t_c$. This provides a rigorous interval estimation for the critical time, rather than a point estimation in previous approaches. As a bonus, the interval estimations can also be obtained for the nuisance parameters ($m,\omega$, damping), which can be used to improve filtering of the calibration results. We show that the use of the modified profile likelihood method dramatically reduces the number of local extrema by constructing much simpler smoother log-likelihood landscapes. The remaining distinct solutions can be interpreted as genuine scenarios that unfold as the time of the analysis flows, which can be compared directly via their likelihood ratio. Finally, we develop a multi-scale profile likelihood analysis to visualize the structure of the financial data at different scales (typically from 100 to 750 days). We test the methodology successfully on synthetic price time series and on three well-known historical financial bubbles.
Date:	2016–02
URL:	http://d.repec.org/n?u=RePEc:arx:papers:1602.08258&r=ecm
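
As a reminder of how the critical time $t_c$ enters the model, here is a sketch of the LPPLS expected log-price in the form commonly written in the literature, with exponent $m$ and angular log-frequency $\omega$. The parameter names `A`, `B`, `C` and `phi` follow that common convention and are assumptions with respect to this abstract; the sketch does not implement the modified profile likelihood itself.

```python
import math


def lppls_log_price(t, tc, m, omega, A, B, C, phi):
    """Expected log-price under the commonly used LPPLS form.

    ln p(t) = A + B*(tc - t)**m + C*(tc - t)**m * cos(omega*ln(tc - t) - phi)
    The expression is only defined before the critical time tc.
    """
    if t >= tc:
        raise ValueError("t must be strictly smaller than the critical time tc")
    dt = tc - t
    power = dt ** m
    return A + B * power + C * power * math.cos(omega * math.log(dt) - phi)
```

Because `tc`, `m` and `omega` enter nonlinearly while `A`, `B` and `C` enter linearly for fixed nonlinear parameters, profiling over a subset of parameters is a natural calibration strategy for a model of this shape.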

Identifying the Discount Factor in Dynamic Discrete Choice Models

By:	Abbring, Jaap H; Daljord, Øystein
Abstract:	The identification of the discount factor in dynamic discrete models is important for counterfactual analysis, but hard. Existing approaches either take the discount factor to be known or rely on high level exclusion restrictions that are difficult to interpret and hard to satisfy in applications, in particular in industrial organization. We provide identification results under an exclusion restriction on primitive utility that is more directly useful to applied researchers. We also show that our and existing exclusion restrictions limit the choice and state transition probability data in different ways; that is, they give the model nontrivial and distinct empirical content.
Keywords:	discount factor; dynamic discrete choice; empirical content; identification
JEL:	C14 C25 D91 D92
Date:	2016–02
URL:	http://d.repec.org/n?u=RePEc:cpr:ceprdp:11133&r=ecm

Semiparametric Analysis of Network Formation

By:	Koen Jochmans (Département d'économie)
Abstract:	We consider a statistical model for network formation that features both node-specific heterogeneity parameters and common parameters that reflect homophily among nodes. The goal is to perform statistical inference on the homophily parameters while allowing the distribution of the node heterogeneity to be unrestricted, that is, by treating the node-specific parameters as fixed effects. Jointly estimating all the parameters leads to asymptotic bias that renders conventional confidence intervals incorrectly centered. As an alternative, we develop an approach based on a sufficient statistic that separates inference on the homophily parameters from estimation of the fixed effects. This estimator is easy to compute and is shown to have desirable asymptotic properties. In numerical experiments we find that the asymptotic results provide a good approximation to the small-sample behavior of the estimator. As an empirical illustration, the technique is applied to explain the import and export patterns in a cross-section of countries.
Keywords:	conditional inference, degree heterogeneity, directed random graph, fixed effects, homophily, U-statistic.
Date:	2016–02
URL:	http://d.repec.org/n?u=RePEc:spo:wpecon:info:hdl:2441/dpido2upv86tqc7td18fd2mna&r=ecm

Measuring poverty with the Foster, Greer and Thorbecke indexes based on the Gamma distribution

By:	Fernández-Morales, Antonio
Abstract:	The purpose of this paper is the estimation of the Foster, Greer and Thorbecke family of poverty indexes using the Gamma distribution as a continuous representation of the distribution of incomes. The expressions of this family of poverty indexes associated with the Gamma probability model and their asymptotic distributions are derived in the text, both for an exogenous and a relative (to the mean) poverty line. Finally, a Monte Carlo experiment is performed to compare three different methods of estimation for grouped data.
Keywords:	Poverty indexes; Income distribution; Gamma distribution
JEL:	C13 C46 I32
Date:	2016
URL:	http://d.repec.org/n?u=RePEc:pra:mprapa:69648&r=ecm
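
For reference, the empirical (sample) version of the Foster, Greer and Thorbecke index can be computed as in the sketch below. This is the standard discrete formula rather than the Gamma-based expressions derived in the paper, and the function name `fgt_index` is illustrative.

```python
def fgt_index(incomes, poverty_line, alpha):
    """Sample Foster-Greer-Thorbecke poverty index.

    FGT(alpha) = (1/N) * sum over poor units of ((z - y_i) / z) ** alpha,
    where z is the poverty line and a unit is poor if y_i < z.
    alpha = 0 gives the headcount ratio, alpha = 1 the poverty gap,
    alpha = 2 the squared poverty gap.
    """
    n = len(incomes)
    total = 0.0
    for y in incomes:
        if y < poverty_line:
            total += ((poverty_line - y) / poverty_line) ** alpha
    return total / n
```

For a relative poverty line, `poverty_line` can be set to a chosen fraction of the sample mean before calling the function.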

Sparse Change-Point Time Series Models

By:	Dufays, A. (Université catholique de Louvain, CORE, Belgium); Rombouts, V. (ESSEC Business School)
Abstract:	Change-point time series specifications constitute flexible models that capture unknown structural changes by allowing for switches in the model parameters. Nevertheless most models suffer from an over-parametrization issue since typically only one latent state variable drives the breaks in all parameters. This implies that all parameters have to change when a break happens. We introduce sparse change-point processes, a new approach for detecting which parameters change over time. We propose shrinkage prior distributions that control model parsimony by limiting the number of parameters which evolve from one structural break to another. We also give clear rules with respect to the choice of the hyperparameters of the new prior distributions. Well-known applications are re-visited to emphasize that many popular breaks are, in fact, due to a change in only a subset of the model parameters. It also turns out that sizeable forecasting improvements are made over recent change-point models.
Keywords:	Time series, Shrinkage prior, Change-point model, Online forecasting
JEL:	C11 C15 C22 C51
Date:	2015–07–10
URL:	http://d.repec.org/n?u=RePEc:cor:louvco:2015032&r=ecm

Maintained Individual Data Distributed Likelihood Estimation (MIDDLE)

By:	Steven M. Boker; Timothy R. Brick; Joschua N. Pritikin; Yang Wang; Timo von Oertzen; Donald Brown; John Lach; Ryne Estabrook; Michael D. Hunter; Hermine H. Maes; Michael C. Neale
Abstract:	Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) is a novel paradigm for research in the behavioral, social, and health sciences. The MIDDLE approach is based on the seemingly-impossible idea that data can be privately maintained by participants and never revealed to researchers, while still enabling statistical models to be fit and scientific hypotheses tested. MIDDLE rests on the assumption that participant data should belong to, be controlled by, and remain in the possession of the participants themselves. Distributed likelihood estimation refers to fitting statistical models by sending an objective function and vector of parameters to each participant’s personal device (e.g., smartphone, tablet, computer), where the likelihood of that individual’s data is calculated locally. Only the likelihood value is returned to the central optimizer. The optimizer aggregates likelihood values from responding participants and chooses new vectors of parameters until the model converges. A MIDDLE study provides significantly greater privacy for participants, automatic management of opt-in and opt-out consent, lower cost for the researcher and funding institute, and faster determination of results. Furthermore, if a participant opts into several studies simultaneously and opts into data sharing, these studies automatically have access to individual-level longitudinal data linked across all studies.
Keywords:	Ethics committees, Research ethics, Governance, Medicine, Methodological pluralism, Regulation, Social sciences
Date:	2016
URL:	http://d.repec.org/n?u=RePEc:rsw:rswwps:rswwps254&r=ecm
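
The estimation loop described in the abstract can be illustrated with a toy, single-process sketch: each simulated device keeps its data private and returns only a likelihood value for the parameters it is sent, and a central routine sums those values and searches for the optimum. The normal model with unit variance, the golden-section optimizer, and all class and function names are assumptions for illustration; a real deployment would involve networking, consent handling and a general-purpose optimizer.

```python
import math


class ParticipantDevice:
    """Stands in for a participant's personal device holding private data."""

    def __init__(self, private_values):
        self._values = list(private_values)

    def neg_log_likelihood(self, mu):
        # Normal(mu, 1) negative log-likelihood of this participant's data.
        # Only this scalar leaves the device, never the raw values.
        const = 0.5 * math.log(2 * math.pi)
        return sum(0.5 * (v - mu) ** 2 + const for v in self._values)


def central_objective(devices, mu):
    """Aggregate the likelihood values returned by responding devices."""
    return sum(d.neg_log_likelihood(mu) for d in devices)


def golden_section_minimise(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2
```

Passing `lambda mu: central_objective(devices, mu)` to `golden_section_minimise` recovers the pooled sample mean without the central routine ever reading an individual value.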

Forecasting Daily Stock Volatility Using GARCH-CJ Type Models with Continuous and Jump Variation

By:	BOUSALAM, Issam; HAMZAOUI, Moustapha; ZOUHAYR, Otman
Abstract:	In this paper we decompose the realized volatility of the GARCH-RV model into continuous sample path variation and discontinuous jump variation to provide a practical and robust framework for non-parametrically measuring the jump component in asset return volatility. Using 5-minute high-frequency data on the MASI Index in Morocco for the period January 15, 2010 to January 29, 2016, we estimate parameters of the constructed GARCH and EGARCH-type models (namely, GARCH, GARCH-RV, GARCH-CJ, EGARCH, EGARCH-RV, and EGARCH-CJ) and evaluate their predictive power to forecast future volatility. The results show that the realized volatility and the continuous sample path variation have certain predictive power for future volatility while the discontinuous jump variation contains relatively less information for forecasting volatility. More interestingly, the findings show that the GARCH-CJ-type models have stronger predictive power for future volatility than the other two types of models. These results are relevant to financial practices such as financial derivatives pricing, capital asset pricing, and risk measures.
Keywords:	GARCH-CJ; Jumps variation; Realized volatility; MASI Index; Morocco.
JEL:	C22 F37 F47 G17
Date:	2016–01–20
URL:	http://d.repec.org/n?u=RePEc:pra:mprapa:69636&r=ecm
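
A common non-parametric way to split realized variance into continuous and jump parts uses bipower variation, as sketched below. The abstract does not say which estimator the authors use, so the bipower approach, the truncation of the jump part at zero, and the function names are assumptions; no jump significance test is applied.

```python
import math


def realized_variance(returns):
    """Sum of squared intraday returns for one day."""
    return sum(r * r for r in returns)


def bipower_variation(returns):
    """Bipower variation: (pi/2) * sum of |r_i| * |r_{i-1}|."""
    pairs = zip(returns[1:], returns[:-1])
    return (math.pi / 2) * sum(abs(a) * abs(b) for a, b in pairs)


def continuous_jump_split(returns):
    """Return (continuous, jump) components of one day's realized variance.

    The jump part is max(RV - BV, 0); the continuous part is the remainder,
    so the two components always add up to the realized variance.
    """
    rv = realized_variance(returns)
    bv = bipower_variation(returns)
    jump = max(rv - bv, 0.0)
    return rv - jump, jump
```

Applied to each trading day's 5-minute returns, the two components could then enter a volatility model as separate regressors in place of total realized variance.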

Interaction matrix selection in spatial econometrics with an application to growth theory

By:	Nicolas Debarsy (Laboratoire d'Economie d'Orléans - LEO - Laboratoire d'économie d'Orleans - UO - Université d'Orléans - CNRS - Centre National de la Recherche Scientifique); Cem Ertur (Econométrie - LEO - Laboratoire d'économie d'Orleans - UO - Université d'Orléans - CNRS - Centre National de la Recherche Scientifique)
Abstract:	The interaction matrix, or spatial weight matrix, is the fundamental tool to model cross-sectional interdependence between observations in spatial econometric models. However, it is most of the time not derived from theory, as it ideally should be, but chosen on an ad hoc basis. In this paper, we propose a modified version of the J test to formally select the interaction matrix. Our methodology is based on the GMM estimation method robust to unknown heteroskedasticity developed by Lin & Lee (2010). We then implement the testing procedure developed by Hagemann (2012) to overcome the decision problem inherent in tests of non-nested models. An application is presented for the Schumpeterian growth model with worldwide interactions (Ertur & Koch 2011) using three different types of interaction matrix: genetic distance, linguistic distance and bilateral trade flows. We find that the interaction matrix based on trade flows is the most adequate. Furthermore, we propose an innovative network-based representation of spatial econometric results.
Keywords:	Bootstrap, GMM, Interaction matrix, J tests, Non-nested models, Heteroscedasticity, Spatial autoregressive models
Date:	2016–02–24
URL:	http://d.repec.org/n?u=RePEc:hal:wpaper:halshs-01278545&r=ecm

This nep-ecm issue is ©2016 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.

General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.