nep-ecm New Economics Papers
on Econometrics
Issue of 2021‒10‒11
nineteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Hierarchical Gaussian Process Models for Regression Discontinuity/Kink under Sharp and Fuzzy Designs By Ximing Wu
  2. Conditional inference and bias reduction for partial effects estimation of fixed-effects logit models By Bartolucci, Francesco; Pigini, Claudia; Valentini, Francesco
  3. MCMC Conditional Maximum Likelihood for the two-way fixed-effects logit By Bartolucci, Francesco; Pigini, Claudia; Valentini, Francesco
  4. Identification-Robust Nonparametric Inference in a Linear IV Model By Bertille Antoine; Pascal Lavergne
  5. Tests for random coefficient variation in vector autoregressive models By Dante Amengual; Gabriele Fiorentini; Enrique Sentana
  6. Shrinkage for Gaussian and t Copulas in Ultra-High Dimensions By Stanislav Anatolyev; Vladimir Pyrlik
  7. RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests By Victor Chernozhukov; Whitney K. Newey; Victor Quintas-Martinez; Vasilis Syrgkanis
  8. A Time-Varying Endogenous Random Coefficient Model with an Application to Production Functions By Ming Li
  9. Data Sharpening for improving CLT approximations for DEA-type efficiency estimators By Bao Hoang Nguyen; Léopold Simar; Valentin Zelenyuk
  10. Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments By Phillip Heiler; Michael C. Knaus
  11. Heterogeneous Overdispersed Count Data Regressions via Double Penalized Estimations By Shaomin Li; Haoyu Wei; Xiaoyu Lei
  12. Kernel-based Time-Varying IV estimation: handle with care By Lucchetti, Riccardo; Valentini, Francesco
  13. A Method for Predicting VaR by Aggregating Generalized Distributions Driven by the Dynamic Conditional Score By Shijia Song; Handong Li
  14. Bridging the Divide? Bayesian Artificial Neural Networks for Frontier Efficiency Analysis By Mike Tsionas; Christopher F. Parmeter; Valentin Zelenyuk
  15. Value-at-Risk forecasting model based on normal inverse Gaussian distribution driven by dynamic conditional score By Shijia Song; Handong Li
  16. Probabilistic Prediction for Binary Treatment Choice: with focus on personalized medicine By Charles F. Manski
  17. Investigating Growth at Risk Using a Multi-country Non-parametric Quantile Factor Model By Todd E. Clark; Florian Huber; Gary Koop; Massimiliano Marcellino; Michael Pfarrhofer
  18. Stochastic volatility model with range-based correction and leverage By Yuta Kurose
  19. Feature Selection by a Mechanism Design By Xingwei Hu

  1. By: Ximing Wu
    Abstract: We propose nonparametric Bayesian estimators for causal inference exploiting Regression Discontinuity/Kink (RD/RK) under sharp and fuzzy designs. Our estimators are based on Gaussian Process (GP) regression and classification. The GP methods are powerful probabilistic modeling approaches that are advantageous in terms of derivative estimation and uncertainty quantification, facilitating RK estimation and inference in RD/RK models. These estimators are extended to hierarchical GP models with an intermediate Bayesian neural network layer and can be characterized as hybrid deep learning models. Monte Carlo simulations show that our estimators perform similarly to, and often better than, competing estimators in terms of precision, coverage and interval length. The hierarchical GP models improve substantially upon one-layer GP models. An empirical application of the proposed estimators is provided.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.00921&r=
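    Illustration: The paper's estimators are hierarchical Bayesian GP models; purely as a point of reference, the following is a minimal sketch of GP-based estimation for a sharp RD design with a scalar running variable, using scikit-learn's GaussianProcessRegressor (the Bayesian neural network layer, the fuzzy and kink extensions, and the paper's actual priors are not shown; the data and cutoff are simulated for illustration).
      # Minimal sketch: sharp RD via two Gaussian Process regressions,
      # one on each side of the cutoff; the effect is the gap between the
      # predicted conditional means at the cutoff.
      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, WhiteKernel

      rng = np.random.default_rng(0)
      cutoff, tau_true = 0.0, 0.5
      x = rng.uniform(-1, 1, 400)
      y = np.sin(2 * x) + tau_true * (x >= cutoff) + 0.2 * rng.standard_normal(400)

      kernel = 1.0 * RBF(length_scale=0.3) + WhiteKernel(noise_level=0.05)

      def fit_side(mask):
          gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
          gp.fit(x[mask].reshape(-1, 1), y[mask])
          return gp

      gp_left, gp_right = fit_side(x < cutoff), fit_side(x >= cutoff)
      m_l, s_l = gp_left.predict([[cutoff]], return_std=True)
      m_r, s_r = gp_right.predict([[cutoff]], return_std=True)

      tau_hat = m_r[0] - m_l[0]
      se_hat = np.sqrt(s_l[0] ** 2 + s_r[0] ** 2)  # treats the two fits as independent
      print(f"RD effect at cutoff: {tau_hat:.3f} (+/- {1.96 * se_hat:.3f})")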
  2. By: Bartolucci, Francesco; Pigini, Claudia; Valentini, Francesco
    Abstract: We propose a multiple-step procedure to compute average partial effects (APEs) for fixed-effects panel logit models estimated by Conditional Maximum Likelihood (CML). As the individual effects are eliminated by conditioning on suitable sufficient statistics, we propose evaluating the APEs at the ML estimates of the unobserved heterogeneity, along with the fixed-T consistent estimator of the slope parameters, and then reducing the induced bias in the APEs by an analytical correction. The proposed estimator has bias of order O(T^{-2}); it performs well in finite samples and, when the dynamic logit model is considered, outperforms alternative plug-in strategies based on bias-corrected estimates of the slopes, especially with small n and T. We provide a real data application based on the labour supply of married women.
    Keywords: Average partial effects, Bias reduction, Binary panel data, Conditional Maximum Likelihood
    JEL: C12 C23 C25
    Date: 2021–10–06
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:110031&r=
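    Illustration: As a reference point for the plug-in step described above, the generic APE expression for a continuous covariate j in a static fixed-effects logit, evaluated at the CML slope estimates and the ML estimates of the individual effects, is (the paper's analytical bias correction is then applied on top of this plug-in formula; notation here is assumed, not taken from the paper)
      \[
      \widehat{\mathrm{APE}}_j = \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}
      \hat\beta_j\,\Lambda\!\big(\hat\alpha_i + x_{it}'\hat\beta\big)
      \Big[1-\Lambda\!\big(\hat\alpha_i + x_{it}'\hat\beta\big)\Big],
      \qquad \Lambda(u)=\frac{e^{u}}{1+e^{u}}.
      \]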
  3. By: Bartolucci, Francesco; Pigini, Claudia; Valentini, Francesco
    Abstract: We propose a Markov chain Monte Carlo Conditional Maximum Likelihood (MCMC-CML) estimator for two-way fixed-effects logit models for dyadic data. The proposed MCMC approach, based on a Metropolis algorithm, allows us to overcome the computational issues of evaluating the probability of the outcome conditional on the nodes' in- and out-degrees, which are sufficient statistics for the incidental parameters. Under mild regularity conditions, the MCMC-CML estimator converges to the exact CML one and is asymptotically normal. Moreover, it is more efficient than the existing pairwise CML estimator. We study the finite-sample properties of the proposed approach by means of a simulation study and three empirical applications, where we also show that the MCMC-CML estimator can be applied to binary logit models for panel data with both subject and time fixed effects. Results confirm the expected theoretical advantage of the proposed approach, especially with small and sparse networks or with rare events in panel data.
    Keywords: Directed network, Fixed effects, Link formation, Metropolis algorithm, Panel data
    JEL: C23 C25 C63
    Date: 2021–10–06
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:110034&r=
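    Illustration: To fix ideas, the conditional likelihood whose normalizing constant the Metropolis step approximates has the following stylized form (notation assumed here): for a directed network y = (y_{ij}) with dyad-specific index x_{ij}'\beta, conditioning on the out-degree and in-degree sequences (r, c),
      \[
      \Pr\big(Y=y \mid r,c;\beta\big)
      = \frac{\exp\!\big(\sum_{i\neq j} y_{ij}\,x_{ij}'\beta\big)}
             {\sum_{y^{*}\in\mathcal{Y}_{r,c}}\exp\!\big(\sum_{i\neq j} y^{*}_{ij}\,x_{ij}'\beta\big)},
      \]
    where \mathcal{Y}_{r,c} is the set of binary adjacency matrices with the same degree sequences. The denominator is intractable for all but tiny networks, which is the computational issue the MCMC-CML estimator addresses.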
  4. By: Bertille Antoine (Simon Fraser University); Pascal Lavergne (Toulouse School of Economics)
    Abstract: For a linear IV regression, we propose two new inference procedures on parameters of endogenous variables that are robust to any identification pattern, do not rely on a linear first-stage equation, and account for heteroskedasticity of unknown form. Building on Bierens (1982), we first propose an Integrated Conditional Moment (ICM) type statistic constructed by setting the parameters to their value under the null hypothesis. The ICM procedure tests at the same time the value of the coefficient and the specification of the model. We then adopt a conditionality principle to condition on a set of ICM statistics that is informative about identification strength. Our two procedures uniformly control size irrespective of identification strength. They are powerful irrespective of the nonlinear form of the link between instruments and endogenous variables and are competitive with existing procedures in simulations and in an empirical application.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:sfu:sfudps:dp21-12&r=
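    Illustration: As a rough sketch of a Bierens-type ICM statistic of the kind the abstract builds on, the code below evaluates, for a hypothesized coefficient value beta0, a V-statistic form obtained by integrating the complex-exponential moment conditions against a standard normal weight over the instruments. The actual statistics, the conditioning on identification-strength information, and the heteroskedasticity-robust critical values in the paper are not reproduced here; data and tuning choices are illustrative.
      # Sketch of an Integrated Conditional Moment (ICM) statistic for
      # H0: beta = beta0 in y = x*beta + u, with instruments z.
      # Integrating |n^{-1/2} sum_i u_i exp(i t'z_i)|^2 against a N(0, I)
      # weight yields the double sum below with a Gaussian kernel in z.
      import numpy as np

      def icm_statistic(y, x, z, beta0):
          u = y - x * beta0                      # residuals imposed by the null
          z = (z - z.mean(0)) / z.std(0)         # standardize instruments
          d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
          K = np.exp(-0.5 * d2)                  # Fourier transform of the N(0, I) weight
          return (u[:, None] * u[None, :] * K).sum() / len(y)

      # Illustrative data: weak-ish instruments, endogenous regressor, true beta = 1
      rng = np.random.default_rng(1)
      n = 500
      z = rng.standard_normal((n, 2))
      v = rng.standard_normal(n)
      x = 0.3 * z[:, 0] + v
      y = 1.0 * x + 0.8 * v + rng.standard_normal(n)

      print("ICM stat at beta0=1.0:", icm_statistic(y, x, z, 1.0))
      print("ICM stat at beta0=0.0:", icm_statistic(y, x, z, 0.0))
      # The null distribution is nonstandard; the paper derives critical values
      # that remain valid whatever the identification strength.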
  5. By: Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Gabriele Fiorentini (Università di Firenze and RCEA); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros)
    Abstract: We propose the information matrix test to assess the constancy of mean and variance parameters in vector autoregressions. We additively decompose it into several orthogonal components: conditional heteroskedasticity and asymmetry of the innovations, and their unconditional skewness and kurtosis. Our Monte Carlo simulations explore both its finite-sample size properties and its power against i.i.d. coefficient variation, persistent but stationary variation, and regime switching. Our procedures detect variation in the autoregressive coefficients and residual covariance matrix of a VAR for the US GDP growth rate and the statistical discrepancy, but they fail to detect any covariation between those two sets of coefficients.
    Keywords: GDP, GDI, Hessian matrix, information matrix test, outer product of the score.
    JEL: C32 C52 E01
    Date: 2021–09
    URL: http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2021_2108&r=
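    Illustration: For context, the information matrix equality that the test exploits can be written in its generic form as follows (the paper's contribution is the orthogonal decomposition of the resulting statistic in the VAR setting):
      \[
      \mathbb{E}\!\left[\frac{\partial^{2}\ell_t(\theta_0)}{\partial\theta\,\partial\theta'}
      + \frac{\partial\ell_t(\theta_0)}{\partial\theta}\,
        \frac{\partial\ell_t(\theta_0)}{\partial\theta'}\right]=\mathbf{0},
      \]
    so the test checks whether the sample average of the bracketed term (Hessian plus outer product of the score) is significantly different from zero, which holds under correct specification with constant parameters.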
  6. By: Stanislav Anatolyev; Vladimir Pyrlik
    Abstract: Copulas are a convenient framework to synthesize joint distributions, particularly in higher dimensions. Currently, copula-based high-dimensional settings are used for as many as a few hundred variables and require large data samples for estimation to be precise. In this paper, we employ shrinkage techniques for large covariance matrices in the problem of estimating Gaussian and t copulas whose dimensionality goes well beyond what is typical in the literature. Specifically, we use the covariance matrix shrinkage of Ledoit and Wolf to estimate large matrix parameters of Gaussian and t copulas for up to thousands of variables, using sample sizes up to 20 times smaller. The simulation study shows that the shrinkage estimation significantly outperforms traditional estimators, both in low and especially in high dimensions. We also apply this approach to the problem of allocation of large portfolios.
    Keywords: Gaussian copula; t copula; high dimensionality; large covariance matrices; shrinkage; portfolio allocation;
    JEL: C31 C46 C55 C58
    Date: 2021–08
    URL: http://d.repec.org/n?u=RePEc:cer:papers:wp699&r=
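    Illustration: A minimal sketch of the estimation step for a Gaussian copula with a Ledoit-Wolf-shrunk correlation matrix might look as follows; scikit-learn's LedoitWolf estimator is used as a stand-in for the shrinkage estimators studied in the paper, and the t-copula case (which also requires estimating the degrees of freedom) is not shown.
      # Sketch: Gaussian copula correlation matrix in high dimensions via
      # rank-based normal scores + Ledoit-Wolf covariance shrinkage.
      import numpy as np
      from scipy.stats import norm, rankdata
      from sklearn.covariance import LedoitWolf

      def gaussian_copula_corr_lw(X):
          """X: (n, p) data matrix. Returns a shrunk correlation matrix estimate."""
          n, p = X.shape
          # 1. Pseudo-observations: ranks mapped to (0, 1), then to normal scores.
          U = np.apply_along_axis(rankdata, 0, X) / (n + 1)
          Z = norm.ppf(U)
          # 2. Shrinkage estimate of the covariance of the normal scores.
          S = LedoitWolf().fit(Z).covariance_
          # 3. Convert to a correlation matrix (the copula parameter).
          d = np.sqrt(np.diag(S))
          return S / np.outer(d, d)

      # Illustration: p comparable to n, where the sample correlation is ill-conditioned.
      rng = np.random.default_rng(2)
      n, p = 300, 250
      A = rng.standard_normal((p, p)) / np.sqrt(p)
      X = rng.standard_normal((n, p)) @ A
      R = gaussian_copula_corr_lw(X)
      print(R.shape, np.linalg.cond(R))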
  7. By: Victor Chernozhukov; Whitney K. Newey; Victor Quintas-Martinez; Vasilis Syrgkanis
    Abstract: Many causal and policy effects of interest are defined by linear functionals of high-dimensional or non-parametric regression functions. $\sqrt{n}$-consistent and asymptotically normal estimation of the object of interest requires debiasing to reduce the effects of regularization and/or model selection on the object of interest. Debiasing is typically achieved by adding to the plug-in estimator of the functional a correction term derived from a functional-specific theoretical characterization of what is known as the influence function, which leads to properties such as double robustness and Neyman orthogonality. We instead implement an automatic debiasing procedure based on automatically learning the Riesz representation of the linear functional using Neural Nets and Random Forests. Our method solely requires value query oracle access to the linear functional. We propose a multi-tasking Neural Net debiasing method with stochastic gradient descent minimization of a combined Riesz representer and regression loss, while sharing representation layers for the two functions. We also propose a Random Forest method which learns a locally linear representation of the Riesz function. Even though our methodology applies to arbitrary functionals, we experimentally find that it outperforms the previous state-of-the-art neural-net-based estimator of Shi et al. (2019) for the case of the average treatment effect functional. We also evaluate our method on the more challenging problem of estimating average marginal effects with continuous treatments, using semi-synthetic data on the effect of gasoline price changes on gasoline demand.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.03031&r=
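    Illustration: A compact PyTorch sketch of the multi-tasking idea for the average treatment effect functional is given below: a shared representation feeds a regression head g(d, w) and a Riesz-representer head alpha(d, w), trained on a combined loss. The architecture, loss weights, data, and variable names are illustrative assumptions, not the paper's implementation.
      # Sketch: joint training of a regression head g(d, w) and a Riesz
      # representer head alpha(d, w) for the ATE functional m(Z; g) = g(1, w) - g(0, w).
      # The Riesz loss E[alpha(X)^2 - 2 m(X; alpha)] is minimized at the true representer.
      import torch
      import torch.nn as nn

      class RieszNetSketch(nn.Module):
          def __init__(self, dim_w, hidden=64):
              super().__init__()
              self.trunk = nn.Sequential(nn.Linear(dim_w + 1, hidden), nn.ReLU(),
                                         nn.Linear(hidden, hidden), nn.ReLU())
              self.g_head = nn.Linear(hidden, 1)      # outcome regression
              self.a_head = nn.Linear(hidden, 1)      # Riesz representer

          def forward(self, d, w):
              h = self.trunk(torch.cat([d, w], dim=1))
              return self.g_head(h).squeeze(1), self.a_head(h).squeeze(1)

      def combined_loss(model, d, w, y, riesz_weight=1.0):
          g, a = model(d, w)
          g1, a1 = model(torch.ones_like(d), w)
          g0, a0 = model(torch.zeros_like(d), w)
          reg_loss = ((y - g) ** 2).mean()
          riesz_loss = (a ** 2 - 2.0 * (a1 - a0)).mean()
          return reg_loss + riesz_weight * riesz_loss

      # Illustrative synthetic data (true ATE = 2.0) and full-batch training
      torch.manual_seed(0)
      n, dim_w = 2000, 5
      w = torch.randn(n, dim_w)
      d = (torch.rand(n, 1) < torch.sigmoid(w[:, :1])).float()
      y = (2.0 * d + w.sum(dim=1, keepdim=True) + 0.5 * torch.randn(n, 1)).squeeze(1)

      model = RieszNetSketch(dim_w)
      opt = torch.optim.Adam(model.parameters(), lr=1e-3)
      for _ in range(500):
          opt.zero_grad()
          loss = combined_loss(model, d, w, y)
          loss.backward()
          opt.step()

      # Debiased (doubly robust) ATE estimate
      with torch.no_grad():
          g, a = model(d, w)
          g1, _ = model(torch.ones_like(d), w)
          g0, _ = model(torch.zeros_like(d), w)
          ate = (g1 - g0 + a * (y - g)).mean()
      print(f"debiased ATE estimate: {ate.item():.3f} (true effect 2.0)")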
  8. By: Ming Li
    Abstract: This paper proposes a random coefficient panel model where the regressors are correlated with the time-varying random coefficients in each period, a critical feature in many economic applications. We model the random coefficients as unknown functions of a fixed effect of arbitrary dimensions, a time-varying random shock that affects the choice of regressors, and an exogenous idiosyncratic shock. A sufficiency argument is used to control for the fixed effect, which enables one to construct a feasible control function for the random shock and subsequently identify the moments of the random coefficients. We propose a three-step series estimator and prove an asymptotic normality result. Simulation results show that the method can accurately estimate both the mean and the dispersion of the random coefficients. As an application, we estimate the average output elasticities for a sample of Chinese manufacturing firms.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.00982&r=
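    Illustration: Based only on the description above, one stylized way to write the setup is (notation assumed here, not taken from the paper)
      \[
      y_{it}=x_{it}'\beta_{it},\qquad
      \beta_{it}=g\big(\lambda_i,\,u_{it},\,\varepsilon_{it}\big),
      \]
    where \lambda_i is a fixed effect of arbitrary dimension, u_{it} is a time-varying shock that also affects the choice of x_{it} (the source of endogeneity), and \varepsilon_{it} is an exogenous idiosyncratic shock; the sufficiency and control-function steps are designed to identify the moments of \beta_{it}.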
  9. By: Bao Hoang Nguyen (School of Economics, University of Queensland, Brisbane, Qld 4072, Australia); Léopold Simar (Institut de Statistique, Biostatistique et Sciences Actuarielles, Université Catholique de Louvain.); Valentin Zelenyuk (School of Economics and Centre for Efficiency and Productivity Analysis (CEPA) at The University of Queensland, Australia)
    Abstract: Asymptotic statistical inference on productivity and production efficiency, using nonparametric envelopment estimators, is now available thanks to the basic central limit theorems (CLTs) developed in Kneip et al. (2015). They provide asymptotic distributions of averages of Data Envelopment Analysis (DEA) and Free Disposal Hull (FDH) estimators of production efficiency. As shown in their Monte-Carlo experiments, due to the curse of dimensionality, the accuracy of the normal approximation is disappointing when the sample size is not large enough. Simar & Zelenyuk (2020) have suggested a simple way to improve the approximation by using a more appropriate estimator of the variances. In this paper we suggest another way to improve the approximation, by smoothing out the spurious values of efficiency estimates when they are in a neighborhood of 1. This results in sharpening the data for observations near the estimated efficient frontier. The method is very easy to implement and does not require more computations than the original method. Using Monte-Carlo experiments, we compare our approach both with the basic method and with the improved method suggested in Simar & Zelenyuk (2020), and in both cases we observe significant improvements. We also show that the Simar & Zelenyuk (2020) idea can be adapted to our sharpening method, bringing additional improvements. We illustrate the method with some real data sets.
    Keywords: Data Envelopment Analysis (DEA), Free Disposal Hull (FDH), Production Efficiency, Statistical Inference
    Date: 2021–09
    URL: http://d.repec.org/n?u=RePEc:qld:uqcepa:168&r=
  10. By: Phillip Heiler; Michael C. Knaus
    Abstract: Binary treatments in empirical practice are often (i) ex-post aggregates of multiple treatments or (ii) can be disaggregated into multiple treatment versions after assignment. In such cases it is unclear whether estimated heterogeneous effects are driven by effect heterogeneity or by treatment heterogeneity. This paper provides estimands that decompose canonical effect heterogeneity into the effect heterogeneity driven by different responses to the underlying multiple treatments and by potentially different compositions of these underlying effective treatments. This makes it possible to avoid spurious discovery of heterogeneous effects, to detect potentially masked heterogeneity, and to evaluate the underlying assignment mechanism of treatment versions. A nonparametric method for estimation and statistical inference of the decomposition parameters is proposed. The framework allows for the use of machine learning techniques to adjust for high-dimensional confounding of the effective treatments. It can be used to conduct simple joint hypothesis tests for effect heterogeneity that consider all effective treatments simultaneously and circumvent multiple testing procedures. It requires weaker overlap assumptions than conventional multi-valued treatment effect analysis. The method is applied to a reevaluation of heterogeneous effects of smoking on birth weight. We find that part of the differences between ethnic and age groups can be explained by different smoking intensities. We further reassess the gender gap in the effectiveness of the Job Corps training program and find that it is largely explained by gender differences in the type of vocational training received.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.01427&r=
  11. By: Shaomin Li; Haoyu Wei; Xiaoyu Lei
    Abstract: This paper studies the non-asymptotic merits of double $\ell_1$-regularized negative binomial regressions for heterogeneous overdispersed count data. Under restricted eigenvalue conditions, we prove, for the first time, oracle inequalities for the Lasso estimators of the two partial regression coefficient vectors, using concentration inequalities for empirical processes. Furthermore, the consistency and convergence rates derived from the oracle inequalities provide theoretical guarantees for further statistical inference. Finally, both simulations and a real data analysis demonstrate that the new methods are effective.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.03552&r=
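    Illustration: A rough sketch of a doubly $\ell_1$-penalized negative binomial regression with covariate-dependent overdispersion is given below; the optimizer, the smooth surrogate for the $\ell_1$ penalty, and all tuning constants are assumptions made for illustration rather than the estimator analyzed in the paper.
      # Sketch: NB2 regression with mean mu_i = exp(x_i'beta) and dispersion
      # alpha_i = exp(z_i'gamma), estimated by minimizing the negative
      # log-likelihood plus L1-type penalties on both beta and gamma
      # (a smooth |.| surrogate keeps the problem amenable to L-BFGS-B).
      import numpy as np
      from scipy.optimize import minimize
      from scipy.special import gammaln

      def nb2_negloglik(y, mu, alpha):
          r = 1.0 / alpha
          return -np.sum(gammaln(y + r) - gammaln(r) - gammaln(y + 1)
                         + y * np.log(mu / (mu + r)) + r * np.log(r / (mu + r)))

      def objective(theta, y, X, Z, lam1, lam2, eps=1e-6):
          p, q = X.shape[1], Z.shape[1]
          beta, gamma = theta[:p], theta[p:p + q]
          mu = np.exp(X @ beta)
          alpha = np.exp(Z @ gamma)
          smooth_l1 = lambda v: np.sum(np.sqrt(v ** 2 + eps))
          return (nb2_negloglik(y, mu, alpha)
                  + lam1 * smooth_l1(beta[1:])      # intercepts left unpenalized
                  + lam2 * smooth_l1(gamma[1:]))

      # Illustrative data: sparse mean and dispersion equations
      rng = np.random.default_rng(3)
      n, p, q = 800, 6, 4
      X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
      Z = np.column_stack([np.ones(n), rng.standard_normal((n, q - 1))])
      beta_true = np.array([1.0, 0.8, 0.0, 0.0, -0.5, 0.0])
      gamma_true = np.array([-0.5, 0.7, 0.0, 0.0])
      mu, alpha = np.exp(X @ beta_true), np.exp(Z @ gamma_true)
      y = rng.negative_binomial(n=1.0 / alpha, p=1.0 / (1.0 + alpha * mu))

      res = minimize(objective, np.zeros(p + q), args=(y, X, Z, 5.0, 5.0), method="L-BFGS-B")
      print("beta_hat :", np.round(res.x[:p], 2))
      print("gamma_hat:", np.round(res.x[p:], 2))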
  12. By: Lucchetti, Riccardo; Valentini, Francesco
    Abstract: Giraitis, Kapetanios, and Marcellino (Journal of Econometrics, 2020) proposed a kernel-based time-varying coefficients IV estimator. Using entirely different code, we broadly replicate the simulation results and the empirical application on the Phillips Curve, but we note that a small coding mistake might have affected some of the reported results. Further, we extend the results by using a different sample and many kernel functions; we find that the estimator is remarkably robust across a wide range of smoothing choices, but the effect of outliers may be less obvious than expected.
    Keywords: Instrumental variables, Time-varying parameters, Hausman test, Phillips curve
    JEL: C14 C26 C51
    Date: 2021–10–06
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:110033&r=
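    Illustration: For readers unfamiliar with the estimator being replicated, a bare-bones version of a kernel-weighted time-varying IV estimator in the just-identified scalar case looks roughly like this; the kernel, bandwidth rule, and data are placeholders, and the paper compares many kernels and documents the sensitivity to these choices.
      # Sketch: kernel-based time-varying IV estimation, just-identified case.
      # beta_t = (sum_s K((t-s)/H) z_s x_s)^{-1} sum_s K((t-s)/H) z_s y_s
      import numpy as np

      def tv_iv(y, x, z, H):
          T = len(y)
          beta = np.empty(T)
          t_idx = np.arange(T)
          for t in range(T):
              w = np.exp(-0.5 * ((t_idx - t) / H) ** 2)   # Gaussian kernel weights
              beta[t] = (w * z * y).sum() / (w * z * x).sum()
          return beta

      # Illustrative data: slowly drifting coefficient, endogenous regressor
      rng = np.random.default_rng(4)
      T = 600
      beta_path = 1.0 + 0.5 * np.sin(np.linspace(0, 2 * np.pi, T))
      z = rng.standard_normal(T)
      v = rng.standard_normal(T)
      x = z + v
      y = beta_path * x + 0.8 * v + 0.3 * rng.standard_normal(T)

      beta_hat = tv_iv(y, x, z, H=T ** 0.5)   # H ~ sqrt(T), a common smoothing choice
      print(np.round(beta_hat[::100], 2))
      print(np.round(beta_path[::100], 2))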
  13. By: Shijia Song; Handong Li
    Abstract: Constructing a more effective value at risk (VaR) prediction model has long been a goal in financial risk management. In this paper, we propose a novel parametric approach and provide a standard paradigm to demonstrate the modeling. We establish a dynamic conditional score (DCS) model based on high-frequency data and a generalized distribution (GD), namely the GD-DCS model, to improve forecasts of daily VaR. The model assumes that intraday returns at different moments are independent of each other and obey the same kind of GD, whose dynamic parameters are driven by the DCS. By predicting the motion law of the time-varying parameters, the conditional distribution of intraday returns is determined; then, the bootstrap method is used to simulate daily returns. An empirical analysis using data from the Chinese stock market shows that the Weibull-Pareto-DCS model incorporating high-frequency data is superior to traditional benchmark models, such as RGARCH, in the prediction of VaR at high risk levels, demonstrating that this approach contributes to the improvement of risk measurement tools.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.02953&r=
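    Illustration: The bootstrap step described above can be sketched as follows, assuming one already has fitted conditional distributions for each intraday interval of the next day; the generalized distribution, the DCS parameter dynamics, and the aggregation of intraday log returns by summation are assumptions for illustration rather than the paper's exact procedure.
      # Sketch of the final step: simulate daily returns by aggregating draws
      # from the fitted conditional intraday-return distributions, then read
      # off the VaR as a lower quantile of the simulated daily distribution.
      # A Student-t stands in for the fitted generalized distribution.
      import numpy as np
      from scipy.stats import t as student_t

      rng = np.random.default_rng(5)
      n_intraday = 48          # e.g. 5-minute returns over a trading day (assumed)
      n_boot = 10000

      # Placeholder "fitted" scale path for each intraday interval
      # (in the paper these parameters come from the DCS dynamics).
      scale_path = 0.001 * (1 + 0.5 * np.sin(np.linspace(0, np.pi, n_intraday)))
      df = 5.0

      daily = np.zeros(n_boot)
      for j in range(n_intraday):
          daily += student_t.rvs(df, scale=scale_path[j], size=n_boot, random_state=rng)

      var_99 = -np.quantile(daily, 0.01)   # 99% one-day VaR (loss as a positive number)
      print(f"Simulated 99% daily VaR: {var_99:.4f}")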
  14. By: Mike Tsionas (Montpellier Business School Université de Montpellier, Montpellier Research in Management and Lancaster University Management School); Christopher F. Parmeter (Miami Herbert Business School, University of Miami, Miami FL); Valentin Zelenyuk (School of Economics and Centre for Efficiency and Productivity Analysis (CEPA) at The University of Queensland, Australia)
    Abstract: The literature on firm efficiency has seen its share of research comparing and contrasting Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA), the two workhorse estimators. These studies rely on both Monte Carlo experiments and actual data sets to examine a range of performance issues which can be used to elucidate insights on the benefits or weaknesses of one method over the other. As can be imagined, neither method is universally better than the other. The present paper proposes an alternative approach that is quite flexible in terms of functional form and distributional assumptions and it amalgamates the benefits of both DEA and SFA. Specifically, we bridge these two popular approaches via Bayesian Artificial Neural Networks. We examine the performance of this new approach using Monte Carlo experiments. The performance is found to be very good, comparable or often better than the current standards in the literature. To illustrate the new techniques, we provide an application of this approach to a recent data set of large US banks.
    Keywords: Simulation; OR in Banking; Stochastic Frontier Models; Data Envelopment Analysis; Flexible Functional Forms.
    Date: 2021–06
    URL: http://d.repec.org/n?u=RePEc:qld:uqcepa:162&r=
  15. By: Shijia Song; Handong Li
    Abstract: Under the framework of the dynamic conditional score, we propose a parametric forecasting model for Value-at-Risk based on the normal inverse Gaussian distribution (hereinafter NIG-DCS-VaR), which creatively incorporates intraday information into the daily VaR forecast. The NIG distribution provides an appropriate specification for returns, and the semi-additivity of the NIG parameters makes it feasible to improve the estimation of daily returns in light of intraday returns; thus, the VaR can be explicitly obtained by calculating the quantile of the re-estimated distribution of daily returns. We conduct an empirical analysis using two main indexes of the Chinese stock market, and a variety of backtesting approaches as well as the model confidence set approach show that the VaR forecasts of the NIG-DCS model generally gain an advantage over those of realized GARCH (RGARCH) models. Especially when the risk level is relatively high, NIG-DCS-VaR beats RGARCH-VaR in terms of coverage ability and independence.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.02492&r=
  16. By: Charles F. Manski
    Abstract: This paper extends my research applying statistical decision theory to treatment choice with sample data, using maximum regret to evaluate the performance of treatment rules. The specific new contribution is to study as-if optimization using estimates of illness probabilities in clinical choice between surveillance and aggressive treatment. Beyond its specifics, the paper sends a broad message. Statisticians and computer scientists have addressed conditional prediction for decision making in indirect ways, the former applying classical statistical theory and the latter measuring prediction accuracy in test samples. Neither approach is satisfactory. Statistical decision theory provides a coherent, generally applicable methodology.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.00864&r=
  17. By: Todd E. Clark; Florian Huber; Gary Koop; Massimiliano Marcellino; Michael Pfarrhofer
    Abstract: We develop a Bayesian non-parametric quantile panel regression model. Within each quantile, the response function is a convex combination of a linear model and a non-linear function, which we approximate using Bayesian Additive Regression Trees (BART). Cross-sectional information at the pth quantile is captured through a conditionally heteroscedastic latent factor. The non-parametric feature of our model enhances flexibility, while the panel feature, by exploiting cross-country information, increases the number of observations in the tails. We develop Bayesian Markov chain Monte Carlo (MCMC) methods for estimation and forecasting with our quantile factor BART model (QF-BART), and apply them to study growth at risk dynamics in a panel of 11 advanced economies.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.03411&r=
  18. By: Yuta Kurose
    Abstract: This study presents contemporaneous modeling of asset return and price range within the framework of stochastic volatility with leverage. A new representation of the probability density function for the price range is provided, and an accurate sampling algorithm for it is developed. Bayesian estimation using the Markov chain Monte Carlo (MCMC) method is provided for the model parameters and unobserved variables. MCMC samples can be generated rigorously, despite the estimation procedure requiring sampling from a density function involving the sum of an infinite series. The empirical results obtained using data on U.S. market indices are consistent with stylized facts of the financial market, such as the existence of the leverage effect. In addition, to explore the model's predictive ability, a model comparison based on volatility forecast performance is conducted.
    Date: 2021–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.00039&r=
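    Illustration: For readers less familiar with the baseline, a canonical stochastic volatility model with leverage (before the paper's range-based correction) can be written as
      \[
      y_t=\exp(h_t/2)\,\epsilon_t,\qquad
      h_{t+1}=\mu+\phi\,(h_t-\mu)+\eta_t,\qquad
      \begin{pmatrix}\epsilon_t\\ \eta_t\end{pmatrix}\sim
      N\!\left(\mathbf{0},
      \begin{pmatrix}1 & \rho\sigma_\eta\\ \rho\sigma_\eta & \sigma_\eta^{2}\end{pmatrix}\right),
      \]
    where a negative correlation \rho captures the leverage effect. The paper adds the daily price range as a further source of information on h_t, whose density involves an infinite series and requires the new sampling algorithm mentioned above.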
  19. By: Xingwei Hu
    Abstract: In constructing an econometric or statistical model, we pick relevant features or variables from many candidates. A coalitional game is set up to study the selection problem, in which the players are the candidate features and the payoff function is a performance measurement across all possible modeling scenarios. Thus, in theory, an irrelevant feature is equivalent to a dummy player in the game, which contributes nothing in any modeling situation. A hypothesis test of zero mean contribution is the rule for deciding whether a feature is irrelevant. In our mechanism design, the end goal perfectly matches the expected model performance with the expected sum of individual marginal effects. Within a class of noninformative likelihoods over all modeling opportunities, the matching equation results in a specific valuation for each feature. After estimating the valuation and its standard deviation, we drop any candidate feature whose valuation is not significantly different from zero. In the simulation studies, our new approach significantly outperforms several popular methods used in practice, and its accuracy is robust to the choice of the payoff function.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.02419&r=
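    Illustration: A rough sketch of the dummy-player test idea, namely valuing a feature by its average marginal contribution across randomly sampled modeling scenarios and dropping it if that value is not significantly different from zero, is given below; the payoff function, the sampling of coalitions, and the significance rule are simplifications of the mechanism actually designed in the paper.
      # Sketch: feature valuation by average marginal contribution to a payoff
      # (out-of-sample R^2 here) over randomly sampled coalitions of the other
      # features; a one-sample t-test checks the "dummy player" null of zero value.
      import numpy as np
      from scipy.stats import ttest_1samp
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import train_test_split

      def payoff(features, Xtr, Xte, ytr, yte):
          if not features:
              return 0.0
          m = LinearRegression().fit(Xtr[:, features], ytr)
          return m.score(Xte[:, features], yte)        # out-of-sample R^2

      def feature_values(X, y, n_draws=200, seed=0):
          rng = np.random.default_rng(seed)
          Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=seed)
          p = X.shape[1]
          contrib = np.zeros((n_draws, p))
          for b in range(n_draws):
              for j in range(p):
                  others = [k for k in range(p) if k != j]
                  coalition = [k for k in others if rng.random() < 0.5]
                  contrib[b, j] = (payoff(coalition + [j], Xtr, Xte, ytr, yte)
                                   - payoff(coalition, Xtr, Xte, ytr, yte))
          return contrib

      # Illustrative data: only the first two features are relevant
      rng = np.random.default_rng(6)
      X = rng.standard_normal((400, 5))
      y = 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.standard_normal(400)

      contrib = feature_values(X, y)
      for j in range(X.shape[1]):
          stat, pval = ttest_1samp(contrib[:, j], 0.0)
          keep = "keep" if pval < 0.05 else "drop"
          print(f"feature {j}: mean contribution {contrib[:, j].mean():+.3f}, p={pval:.3f} -> {keep}")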

This nep-ecm issue is ©2021 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.