nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒01‒09
37 papers chosen by
Sune Karlsson
Örebro universitet

  1. Double Robust Bayesian Inference on Average Treatment Effects By Christoph Breunig; Ruixuan Liu; Zhengfei Yu
  2. A Unified Framework for Dynamic Treatment Effect Estimation in Interactive Fixed Effect Models By Nicholas Brown; Kyle Butts
  3. Panel Threshold Regression with Unobserved Individual-Specific Threshold Effects By Ping Yu; Shengjie Hong; Peter C. B. Phillips
  4. On consistency and sparsity for high-dimensional functional time series with application to autoregressions By Guo, Shaojun; Qiao, Xinghao
  5. A Generalized Poisson-Pseudo Maximum Likelihood Estimator By Kwon, Ohyun; Yoon, Jangsu; Yotov, Yoto
  6. Estimation of continuous-time linear DSGE models from discrete-time measurements By Bent Jesper Christensen; Luca Neri; Juan Carlos Parra-Alvarez
  7. Unified Factor Model Estimation and Inference under Short and Long Memory By Shuyao Ke; Liangjun Su; Peter C. B. Phillips
  8. Estimation and Testing in a Perturbed Multivariate Long Memory Framework By Less, Vivien; Sibbertsen, Philipp
  9. A smooth transition autoregressive model for matrix-variate time series By Andrea Bucci
  10. Inference in Cluster Randomized Trials with Matched Pairs By Yuehao Bai; Jizhou Liu; Azeem M. Shaikh; Max Tabord-Meehan
  11. Incorporating Prior Knowledge of Latent Group Structure in Panel Data Models By Boyuan Zhang
  12. Extreme Changes in Changes By Yuya Sasaki; Yulong Wang
  13. High-dimensional principal component analysis with heterogeneous missingness By Zhu, Ziwei; Wang, Tengyao; Samworth, Richard J.
  14. A Complete Framework for Model-Free Difference-In-Differences Estimation By Henderson, Daniel J.; Sperlich, Stefan
  15. Information Equivalence Among Transformations of Semiparametric Nonlinear Panel Data Models By Nicholas Brown
  16. Are Bartik Regressions Always Robust to Heterogeneous Treatment Effects? By Clément de Chaisemartin; Ziteng Lei
  17. Explosion Bubble Testing: An Overview By Anton Skrobotov
  18. Bayesian Multivariate Quantile Regression with alternative Time-varying Volatility Specifications By Matteo Iacopini; Francesco Ravazzolo; Luca Rossini
  19. Aggregation Trees By Riccardo Di Francesco
  20. Why Transform Y? A Critical Assessment of Dependent-Variable Transformations in Regression Models for Skewed and Sometimes-Zero Outcomes By John Mullahy; Edward C. Norton
  21. Strict stationarity of Poisson integer-valued ARCH processes of order infinity By Mawuli Segnon
  22. Maximum Likelihood vs. Bayesian estimation of uncertainty By Zuckerman, Daniel
  23. Finite Sample Comparison of Alternative Estimators for Fractional Gaussian Noise By Shi, Shuping; Yu, Jun; Zhang, Chen
  24. Score-based calibration testing for multivariate forecast distributions By Malte Knüppel; Fabian Krüger; Marc-Oliver Pohle
  25. A Nonparametric Finite Mixture Approach to Difference-in-Difference Estimation, with an Application to On-the-job Training and Wages By Oliver Cassagneau-Francis; Robert Gary-Bobo; Julie Pernaudet; Jean-Marc Robin
  26. Dominant Drivers of National Inflation By Jan Ditzen; Francesco Ravazzolo
  27. Identification of Unobservables in Observations By Yingyao Hu
  28. The Falsification Adaptive Set in Linear Models with Instrumental Variables that Violate the Exogeneity or Exclusion Restriction By Nicolas Apfel; Frank Windmeijer
  29. Bayesian inference for non-anonymous Growth Incidence Curves using Bernstein polynomials: an application to academic wage dynamics By Edwin Fourrier-Nicolai; Michel Lubrano
  30. External Instrument SVAR Analysis for Noninvertible Shocks By Forni, Mario; Gambetti, Luca; Ricco, Giovanni
  31. Parameter Estimation of the Heston Volatility Model with Jumps in the Asset Prices By Jarosław Gruszka; Janusz Szwabiński
  32. Maximum Likelihood Estimation for a Markov-Modulated Jump-Diffusion Model By Laura Eslava; Fernando Baltazar-Larios; Bor Reynoso
  33. Time series : entropy and informational energy By George Daniel Mateescu
  34. On the Non-Identification of Revenue Production Functions By David Van Dijcke
  35. An Optimal Bandwidth For Difference-in-Difference Estimation with a Continuous Treatment and an Heterogeneous Adoption Design By Clément de Chaisemartin; Xavier d'Haultfoeuille
  36. A Time Series Approach to Explainability for Neural Nets with Applications to Risk-Management and Fraud Detection By Marc Wildi; Branka Hadji Misheva
  37. Statistical inference of the value function for reinforcement learning in infinite-horizon settings By Shi, Chengchun; Zhang, Shengxing; Lu, Wenbin; Song, Rui

  1. By: Christoph Breunig; Ruixuan Liu; Zhengfei Yu
    Abstract: We study a double robust Bayesian inference procedure on the average treatment effect (ATE) under unconfoundedness. Our Bayesian approach involves a correction term for prior distributions adjusted by the propensity score. We prove asymptotic equivalence of our Bayesian estimator and efficient frequentist estimators by establishing a new semiparametric Bernstein-von Mises theorem under double robustness; i.e., the lack of smoothness of conditional mean functions can be compensated by high regularity of the propensity score and vice versa. Consequently, the resulting Bayesian point estimator internalizes the bias correction as the frequentist-type doubly robust estimator, and the Bayesian credible sets form confidence intervals with asymptotically exact coverage probability. In simulations, we find that this corrected Bayesian procedure leads to significant bias reduction of point estimation and accurate coverage of confidence intervals, especially when the dimensionality of covariates is large relative to the sample size and the underlying functions become complex. We illustrate our method in an application to the National Supported Work Demonstration.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.16298&r=ecm
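    Illustration: a minimal Python sketch of the frequentist AIPW (doubly robust) point estimator to which the Bayesian procedure is shown asymptotically equivalent; the logistic propensity model and linear outcome regressions are placeholder nuisance choices, not the paper's.
      # Doubly robust (AIPW) ATE on simulated data; nuisance models are illustrative.
      import numpy as np
      from sklearn.linear_model import LinearRegression, LogisticRegression

      rng = np.random.default_rng(0)
      n = 2000
      X = rng.normal(size=(n, 5))
      D = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))     # treatment indicator
      Y = X[:, 0] + 2 * D + rng.normal(size=n)            # true ATE = 2

      ps = LogisticRegression().fit(X, D).predict_proba(X)[:, 1]
      m1 = LinearRegression().fit(X[D == 1], Y[D == 1]).predict(X)
      m0 = LinearRegression().fit(X[D == 0], Y[D == 0]).predict(X)

      # regression contrast plus inverse-propensity-weighted residual corrections
      psi = m1 - m0 + D * (Y - m1) / ps - (1 - D) * (Y - m0) / (1 - ps)
      print(psi.mean(), psi.std(ddof=1) / np.sqrt(n))     # estimate and std. error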
  2. By: Nicholas Brown (Queen's University); Kyle Butts (University of Colorado Boulder, Economics Department)
    Abstract: We present a unifying identification strategy of dynamic average treatment effect parameters for staggered interventions when parallel trends are valid only after controlling for interactive fixed effects. This setting nests the usual parallel trends assumption, but allows treated units to have heterogeneous exposure to unobservable macroeconomic trends. We show that any estimator that is consistent for the unobservable trends up to a non-singular rotation can be used to consistently estimate heterogeneous dynamic treatment effects. This result can apply to data sets with either many or few pre-treatment time periods. We also demonstrate the robustness of two-way fixed effects imputation to certain parallel trends violations and provide a test for its consistency. A quasi-long-differencing estimator is proposed and implemented to estimate the effect of Walmart openings on local economic conditions.
    Keywords: factor model, panel treatment effect, causal inference, fixed-T
    JEL: C13 C21 C23 C26
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:qed:wpaper:1495&r=ecm
  3. By: Ping Yu (University of Hong Kong); Shengjie Hong (Tsinghua University, China); Peter C. B. Phillips (Cowles Foundation, Yale University)
    Abstract: This paper studies estimation and inference in panel threshold regression with unobserved individual-specific threshold effects, which are important from a practical perspective and distinguish the model from traditional linear panel data models. It is shown that the within-regime differencing in the static model or the within-regime first-differencing in the dynamic model cannot generate consistent estimators of the threshold, so correlated random effects models are suggested to handle the endogeneity in such general panel threshold models. We provide a unified estimation and inference framework that is valid for both the static and dynamic models, regardless of whether the unobserved individual-specific threshold effects exist or not. In particular, we propose alternative inference methods for the model parameters, which have better theoretical properties than the existing methods. Simulation studies and an empirical application illustrate the usefulness of our new estimation and inference methodology in practice.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:cwl:cwldpp:2352&r=ecm
  4. By: Guo, Shaojun; Qiao, Xinghao
    Abstract: Modelling a large collection of functional time series arises in a broad spectrum of real applications. Under such a scenario, not only can the number of functional variables diverge with, or even exceed, the number of temporally dependent functional observations, but each function itself is an infinite-dimensional object, posing a challenging task. In this paper, we propose a three-step procedure to estimate high-dimensional functional time series models. To provide theoretical guarantees for the three-step procedure, we focus on multivariate stationary processes and propose a novel functional stability measure based on their spectral properties. Such a stability measure facilitates the development of some useful concentration bounds on sample (auto)covariance functions, which serve as a fundamental tool for further convergence analysis in high-dimensional settings. As functional principal component analysis (FPCA) is one of the key dimension reduction techniques in the first step, we also investigate the non-asymptotic properties of the relevant estimated terms under an FPCA framework. To illustrate with an important application, we consider vector functional autoregressive models and develop a regularization approach to estimate autoregressive coefficient functions under the sparsity constraint. Using our derived non-asymptotic results, we investigate convergence properties of the regularized estimate under high-dimensional scaling. Finally, the finite-sample performance of the proposed method is examined through both simulations and a public financial dataset.
    Keywords: functional principal component analysis; functional stability measure; high-dimensional functional time series; non-asymptotics; sparsity; vector functional autoregression
    JEL: C1
    Date: 2023–02–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:114638&r=ecm
  5. By: Kwon, Ohyun (Drexel University); Yoon, Jangsu (University of Wisconsin-Milwaukee); Yotov, Yoto (Drexel University)
    Abstract: We propose a Generalized Poisson-Pseudo Maximum Likelihood (G-PPML) estimator that relaxes the PPML estimator’s assumption that the dependent variable’s conditional variance is proportional to its conditional mean. Instead, we employ an iterated Generalized Method of Moments (iGMM) to estimate the conditional variance of the dependent variable directly from the data, thus encompassing the standard estimators in the international trade literature (i.e., PPML, Gamma-PML, and OLS) as special cases. With conditional variance estimates, G-PPML generates coefficient estimates that are more efficient and robust to the underlying data generating process. After establishing the consistency and the asymptotic properties of the G-PPML estimator, we use Monte Carlo simulations to demonstrate that G-PPML shows decent finite-sample performance regardless of the underlying assumption about the conditional variance. Estimation of a canonical gravity model with trade data reinforces the properties of G-PPML and validates the practical importance of our methods.
    Keywords: Poisson-Pseudo Maximum Likelihood; Iterated GMM; Gravity Models
    JEL: C13 C50 F10
    Date: 2022–12–06
    URL: http://d.repec.org/n?u=RePEc:ris:drxlwp:2022_013&r=ecm
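    Illustration: the PPML special case that G-PPML nests can be reproduced with a Poisson GLM and robust standard errors; the iGMM variance-estimation step of G-PPML itself is not sketched here.
      # PPML as a Poisson GLM: consistent for the conditional-mean parameters
      # even though the simulated outcome is overdispersed (variance != mean).
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      n = 500
      x = rng.normal(size=n)
      y = rng.poisson(np.exp(0.5 + 0.8 * x) * rng.exponential(size=n))

      ppml = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit(cov_type="HC1")
      print(ppml.params)   # close to (0.5, 0.8)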
  6. By: Bent Jesper Christensen (Aarhus University, Dale T. Mortensen Center, Danish Finance Institute, CREATES); Luca Neri (University of Bologna, Dale T. Mortensen Center, Ca’ Foscari University of Venice, CREATES); Juan Carlos Parra-Alvarez (Aarhus University, Dale T. Mortensen Center, Danish Finance Institute and CREATES)
    Abstract: We provide a general state space framework for estimation of the parameters of continuous-time linear DSGE models from data that are only available at discrete points in time. Our approach relies on the exact discrete-time representation of the equilibrium dynamics, which allows avoiding discretization errors. Using the Kalman filter, we construct the exact likelihood for data sampled either as stocks or flows, and estimate frequency-invariant parameters by maximum likelihood. We address the aliasing problem arising in multivariate settings and provide conditions for precluding it, which is required for local identification of the parameters in the continuous-time economic model. We recover the unobserved structural shocks at measurement times from the reduced-form residuals in the state space representation by exploiting the underlying causal links imposed by the economic theory and the information content of the discrete-time observations. We illustrate our approach using an off-the-shelf real business cycle model. We conduct extensive Monte Carlo experiments to study the finite sample properties of the estimator based on the exact discrete-time representation, and show they are superior to those based on a naive Euler-Maruyama discretization of the economic model. Finally, we estimate the model using postwar U.S. macroeconomic data, and offer examples of applications of our approach, including historical shock decomposition at different frequencies, and estimation based on mixed-frequency data.
    Keywords: DSGE models, continuous time, exact discrete-time representation, stock and flow variables, Kalman filter, maximum likelihood, aliasing, structural shocks
    JEL: C13 C32 C68 E13 E32 J22
    Date: 2022–12–20
    URL: http://d.repec.org/n?u=RePEc:aah:create:2022-12&r=ecm
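    Illustration: the core of the exact discrete-time representation for a linear continuous-time state equation dx = Ax dt + B dW is the matrix exponential; a hypothetical two-state example contrasts it with the Euler-Maruyama transition matrix used as the paper's Monte Carlo benchmark.
      # Exact transition matrix e^{Ah} at sampling interval h vs. Euler approximation.
      import numpy as np
      from scipy.linalg import expm

      A = np.array([[-0.5, 0.2],
                    [0.0, -1.0]])      # stable continuous-time drift matrix
      h = 0.25                         # observation interval
      print(expm(A * h))               # exact: x_{t+h} = expm(A h) x_t + noise
      print(np.eye(2) + A * h)         # Euler: biased for any fixed h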
  7. By: Shuyao Ke (College of Economics, Jinan University, China); Liangjun Su (School of Economics and Management, Tsinghua University, China); Peter C. B. Phillips (Cowles Foundation, Yale University)
    Abstract: This paper studies a linear panel data model with interactive fixed effects wherein regressors, factors and idiosyncratic error terms are all stationary but with potential long memory. The setup involves a new factor model formulation for which weakly dependent regressors, factors and innovations are embedded as a special case. Standard methods based on principal component decomposition and least squares estimation, as in Bai (2009), are found to suffer bias correction failure because the order of magnitude of the bias is determined in a complex manner by the memory parameters. To cope with this failure and to provide a simple implementable estimation procedure, frequency domain least squares estimation is proposed. The limit distribution of this frequency domain approach is established and a hybrid selection method is developed to determine the number of factors. Simulations show that the frequency domain estimator is robust to short memory and outperforms the time domain estimator when long range dependence is present. An empirical illustration of the approach is provided, examining the long-run relationship between stock return and realized volatility.
    Date: 2022–10
    URL: http://d.repec.org/n?u=RePEc:cwl:cwldpp:2351&r=ecm
  8. By: Less, Vivien; Sibbertsen, Philipp
    Abstract: We propose a semiparametric multivariate estimator and a multivariate score-type testing procedure under a perturbed multivariate fractional process. The estimator is based on the periodogram and uses a local Whittle criterion function which is generalised by an additional constant to capture the perturbation given in the long memory process. Explicitly addressing the noise term when approximating the spectral density near the origin results in a bias reduction, but at the cost of an increase in the asymptotic variance of the estimator. Further, we introduce a multivariate testing procedure to detect spurious long memory under a perturbed fractional framework. The test statistic is based on the weighted sum of the partial derivatives of the multivariate local Whittle with noise estimator. We show consistency of the test against the alternatives of smooth trend and random level shift processes. In addition, we prove consistency and asymptotic normality of the local Whittle estimator and we derive the limiting distribution of the test. An empirical example on the squared returns and realised volatilities of the BEL 20, S&P BSE SENSEX, and Spanish IBEX indices shows the usefulness of the procedures.
    Keywords: Signal-plus-noise; Multivariate local Whittle; Perturbation; Spurious long memory; Semi-parametric estimation; Stochastic volatility
    JEL: C12 C13 C32
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:han:dpaper:dp-704&r=ecm
  9. By: Andrea Bucci
    Abstract: In many applications, data are observed as matrices with temporal dependence. Matrix-variate time series modeling is a new branch of econometrics. Although smooth regime changes are a stylized fact in several fields, the existing models do not account for regime switches in the dynamics of matrices that are not abrupt. In this paper, we extend linear matrix-variate autoregressive models by introducing a regime-switching model capable of accounting for smooth changes, the matrix smooth transition autoregressive model. We present the estimation procedure together with the asymptotic properties, and demonstrate both with simulated and real data.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.08615&r=ecm
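    Illustration: a simulation of one plausible matrix smooth transition autoregression, assuming a bilinear matrix-AR term and a logistic transition function; this exact specification is an assumption for illustration, not taken from the paper.
      # X_t = G_t * A1 X_{t-1} B1' + (1 - G_t) * A2 X_{t-1} B2' + E_t,
      # with logistic weight G_t driven by a transition variable s_t.
      import numpy as np

      rng = np.random.default_rng(2)
      m, k, T = 3, 2, 200
      A1, B1 = 0.5 * np.eye(m), 0.8 * np.eye(k)
      A2, B2 = -0.3 * np.eye(m), 0.6 * np.eye(k)
      gamma, c = 5.0, 0.0                              # smoothness and location

      X = np.zeros((T, m, k))
      for t in range(1, T):
          s = X[t - 1].mean()                          # transition variable
          G = 1 / (1 + np.exp(-gamma * (s - c)))       # smooth weight in (0, 1)
          X[t] = (G * A1 @ X[t - 1] @ B1.T
                  + (1 - G) * A2 @ X[t - 1] @ B2.T
                  + 0.1 * rng.normal(size=(m, k)))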
  10. By: Yuehao Bai; Jizhou Liu; Azeem M. Shaikh; Max Tabord-Meehan
    Abstract: This paper considers the problem of inference in cluster randomized trials where treatment status is determined according to a "matched pairs" design. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by a "matched pairs" design we mean that a sample of clusters is paired according to baseline, cluster-level covariates and, within each pair, one cluster is selected at random for treatment. We study the large sample behavior of a weighted difference-in-means estimator and derive two distinct sets of results depending on whether the matching procedure does or does not match on cluster size. We then propose a variance estimator which is consistent in either case. We also study the behavior of a randomization test which permutes the treatment status for clusters within pairs, and establish its finite sample and asymptotic validity for testing specific null hypotheses.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.14903&r=ecm
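    Illustration: the within-pair randomization test has a simple implementation, sketched here with equal-weighted cluster means; the paper's weighted difference-in-means estimator and cluster-size matching are omitted.
      # Permuting treatment within matched pairs is equivalent to random sign
      # flips of the within-pair differences of cluster-level outcome means.
      import numpy as np

      rng = np.random.default_rng(3)

      def pair_permutation_test(y_treat, y_ctrl, n_perm=5000):
          diffs = y_treat - y_ctrl                      # one difference per pair
          obs = diffs.mean()
          signs = rng.choice([-1, 1], size=(n_perm, diffs.size))
          perm = (signs * diffs).mean(axis=1)
          return obs, (np.abs(perm) >= abs(obs)).mean() # two-sided p-value

      y_treat = rng.normal(0.3, 1.0, size=30)           # 30 pairs of clusters
      y_ctrl = rng.normal(0.0, 1.0, size=30)
      print(pair_permutation_test(y_treat, y_ctrl))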
  11. By: Boyuan Zhang
    Abstract: The assumption of group heterogeneity has become popular in panel data models. We develop a constrained Bayesian grouped estimator that exploits researchers' prior beliefs on groups in the form of pairwise constraints, indicating whether a pair of units is likely to belong to the same group or to different groups. We propose a prior to incorporate the pairwise constraints with varying degrees of confidence. The whole framework is built on the nonparametric Bayesian method, which implicitly specifies a distribution over the group partitions, so the posterior analysis takes the uncertainty of the latent group structure into account. Monte Carlo experiments reveal that adding prior knowledge yields more accurate coefficient estimates and scores predictive gains over alternative estimators. We apply our method to two empirical applications. In the first, forecasting U.S. CPI inflation, we illustrate that prior knowledge of groups improves density forecasts when the data are not entirely informative. The second revisits the relationship between a country's income and its democratic transition; we identify heterogeneous income effects on democracy, with five distinct groups over ninety countries.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.16714&r=ecm
  12. By: Yuya Sasaki; Yulong Wang
    Abstract: Policy analysts are often interested in treating subpopulations in the limit, such as infants with extremely low birth weights. Existing changes-in-changes (CIC) estimators are tailored to middle quantiles and do not work well for such subpopulations. This paper proposes a new CIC estimator to accurately estimate treatment effects at extreme quantiles. With its asymptotic normality, we also propose a method of statistical inference, which is simple to implement. Based on simulation studies, we propose to use our extreme CIC estimator for extreme quantiles, such as those below 5% and above 95%, while the conventional CIC estimator should be used for intermediate quantiles. Applying the proposed method, we study the effects of income gains from the 1993 EITC reform on infant birth weights for those in the most critical conditions.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.14870&r=ecm
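    Illustration: the conventional CIC counterfactual mapping (Athey-Imbens) that the paper recommends for intermediate quantiles; the extreme-quantile estimator itself relies on extreme value theory and is not reproduced here.
      # Counterfactual for treated units in period 2: y -> Q_{Y01}(F_{Y00}(y)).
      import numpy as np

      def cic_qte(y00, y01, y10, y11, taus):
          ranks = np.searchsorted(np.sort(y00), y10) / y00.size   # F_00(Y10)
          ranks = np.clip(ranks, 1e-6, 1 - 1e-6)
          y_cf = np.quantile(y01, ranks)                          # counterfactuals
          return np.quantile(y11, taus) - np.quantile(y_cf, taus)

      rng = np.random.default_rng(4)
      y00, y01 = rng.normal(0.0, 1, 800), rng.normal(0.5, 1, 800)
      y10, y11 = rng.normal(0.2, 1, 800), rng.normal(1.2, 1, 800)
      print(cic_qte(y00, y01, y10, y11, taus=[0.50, 0.95, 0.99]))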
  13. By: Zhu, Ziwei; Wang, Tengyao; Samworth, Richard J.
    Abstract: We study the problem of high-dimensional Principal Component Analysis (PCA) with missing observations. In a simple, homogeneous observation model, we show that an existing observed-proportion weighted (OPW) estimator of the leading principal components can (nearly) attain the minimax optimal rate of convergence, which exhibits an interesting phase transition. However, deeper investigation reveals that, particularly in more realistic settings where the observation probabilities are heterogeneous, the empirical performance of the OPW estimator can be unsatisfactory; moreover, in the noiseless case, it fails to provide exact recovery of the principal components. Our main contribution, then, is to introduce a new method, which we call primePCA, that is designed to cope with situations where observations may be missing in a heterogeneous manner. Starting from the OPW estimator, primePCA iteratively projects the observed entries of the data matrix onto the column space of our current estimate to impute the missing entries, and then updates our estimate by computing the leading right singular space of the imputed data matrix. We prove that the error of primePCA converges to zero at a geometric rate in the noiseless case, and when the signal strength is not too small. An important feature of our theoretical guarantees is that they depend on average, as opposed to worst-case, properties of the missingness mechanism. Our numerical studies on both simulated and real data reveal that primePCA exhibits very encouraging performance across a wide range of scenarios, including settings where the data are not Missing Completely At Random.
    Keywords: heterogeneous missingness; high-dimensional statistics; iterative projections; missing data; principal component analysis
    JEL: C1
    Date: 2022–11–20
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:117647&r=ecm
  14. By: Henderson, Daniel J. (University of Alabama); Sperlich, Stefan (University of Geneva)
    Abstract: We propose a complete framework for model-free difference-in-differences analysis with covariates, where model-free means data-driven, in particular nonparametric estimation and testing, variable and scale choice. We start with searching for the preferred data setup by simultaneously choosing confounders and a scale of the outcome variable along identification conditions. The treatment effects themselves are estimated in two steps: first, the heterogeneous effects stratified along the covariates, then the average treatment effect(s) for the population(s) of interest. We provide the asymptotic statistics as well as the finite sample behavior of our methods, and suggest bootstrap procedures to calculate standard errors and p-values of significance tests. The pertinence of our methods is shown with a study of the impact of the Deferred Action for Childhood Arrivals program on human capital responses of non-citizen immigrants. We show that past results underestimated the positive impact on school attendance for individuals aged 14-18, and the positive impact on high school completion. Moreover, we find that the parametric methods fail to identify the negative impact on school attendance of college aged individuals. Practical issues including bandwidth selection, sample weights, and implementation are given in the supplement.
    Keywords: nonparametrics, causal analysis, difference-in-differences estimators, heterogeneous treatment effects
    JEL: C14 A2
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:iza:izadps:dp15799&r=ecm
  15. By: Nicholas Brown (Queen's University)
    Abstract: This paper considers transformations of nonlinear semiparametric mean functions that yield moment conditions for estimation. Such transformations are said to be information equivalent if they yield the same asymptotic efficiency bound. I derive a unified theory of algebraic equivalence for moment conditions created by a given linear transformation. The main equivalence result states that under standard regularity conditions, transformations that create conditional moment restrictions in a given empirical setting need only have equal rank to reach the same efficiency bound. Examples are included, where I compare feasible and infeasible transformations of both nonlinear models with multiplicative heterogeneity and linear models with arbitrary unobserved factor structures.
    Keywords: Semiparametric efficiency, nonlinear regression, generalized least squares, fixed-T
    JEL: C14 C33 C36
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:qed:wpaper:1494&r=ecm
  16. By: Clément de Chaisemartin (ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique); Ziteng Lei (Renmin University of China)
    Abstract: Bartik regressions use locations' differential exposure to nationwide sector-level shocks as an instrument to estimate the effect of a location-level treatment on an outcome. We show that under parallel-trends assumptions, Bartik regressions may estimate weighted sums of location-and-period-specific treatment effects, with some negative weights. Accordingly, they may not be robust to heterogeneous effects across locations or periods. We provide simple diagnostic tools researchers may use to assess the robustness of their regression. Finally, we propose alternative correlated-random-coefficient estimators that are more robust to heterogeneous effects than Bartik regressions. We use our results to revisit two empirical applications.
    Date: 2022–07–21
    URL: http://d.repec.org/n?u=RePEc:hal:wpaper:hal-03873913&r=ecm
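    Illustration: the Bartik instrument whose robustness the paper examines is a shift-share product of baseline shares and national shocks; a hypothetical construction:
      # B_l = sum_s share_{l,s} * shock_s for each location l.
      import numpy as np

      rng = np.random.default_rng(5)
      L, S = 100, 8
      shares = rng.dirichlet(np.ones(S), size=L)   # baseline sector shares by location
      shocks = rng.normal(size=S)                  # nationwide sector-level shocks
      bartik = shares @ shocks                     # location-level instrument
      print(bartik[:5])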
  17. By: Anton Skrobotov (Russian Presidential Academy of National Economy and Public Administration)
    Abstract: This paper provides an overview of methods of testing for explosive bubbles in time series. Various issues associated with the inclusion of constants in the regression, the problem of the initial condition, and the problem of possible non-stationary volatility in the dynamics of a time series are considered. Methods for dating the explosive bubbles are also discussed.
    Keywords: unit roots, explosive bubble, non-stationary volatility, dating, bootstrap, rejection pooling, explosive process
    Date: 2021–01
    URL: http://d.repec.org/n?u=RePEc:rnp:wpaper:s21130&r=ecm
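    Illustration: the workhorse statistic in this literature is the sup-ADF (SADF) of Phillips, Wu, and Yu, computed over forward-expanding windows; a bare-bones sketch (critical values must be simulated, and the refinements surveyed in the paper are omitted):
      import numpy as np

      def adf_t(y, lags=1):
          # t-stat on rho in: dy_t = a + rho*y_{t-1} + b*dy_{t-1} + e_t
          dy = np.diff(y)
          z = dy[lags:]
          X = np.column_stack([np.ones(z.size), y[lags:-1]]
                              + [dy[lags - j - 1:dy.size - j - 1] for j in range(lags)])
          beta, *_ = np.linalg.lstsq(X, z, rcond=None)
          u = z - X @ beta
          s2 = u @ u / (z.size - X.shape[1])
          return beta[1] / np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])

      def sadf(y, r0=0.2):
          return max(adf_t(y[:k]) for k in range(int(r0 * y.size), y.size + 1))

      rng = np.random.default_rng(6)
      print(sadf(np.cumsum(rng.normal(size=300))))   # unit-root null: no bubble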
  18. By: Matteo Iacopini; Francesco Ravazzolo; Luca Rossini
    Abstract: This article proposes a novel Bayesian multivariate quantile regression to forecast the tail behavior of US macro and financial indicators, where the homoskedasticity assumption is relaxed to allow for time-varying volatility. In particular, we exploit the mixture representation of the multivariate asymmetric Laplace likelihood and the Cholesky-type decomposition of the scale matrix to introduce stochastic volatility and GARCH processes, and we provide an efficient MCMC to estimate them. The proposed models outperform the homoskedastic benchmark mainly when predicting the distribution's tails. We provide a model combination using a quantile score-based weighting scheme, which leads to improved performances, notably when no single model uniformly outperforms the others across quantiles, time, or variables.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.16121&r=ecm
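    Illustration: the mixture representation of the asymmetric Laplace distribution that the MCMC exploits, in the Kozumi-Kobayashi parametrization (a univariate check, not the multivariate model):
      # Y = mu + theta*W + sqrt(tau2*W)*Z with W ~ Exp(1), Z ~ N(0,1)
      # has P(Y <= mu) = p, the targeted quantile level.
      import numpy as np

      rng = np.random.default_rng(7)
      p = 0.1
      theta = (1 - 2 * p) / (p * (1 - p))
      tau2 = 2 / (p * (1 - p))
      W = rng.exponential(size=200_000)
      Z = rng.normal(size=200_000)
      Y = theta * W + np.sqrt(tau2 * W) * Z      # mu = 0
      print((Y <= 0).mean())                     # approximately 0.1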
  19. By: Riccardo Di Francesco (DEF, University of Rome "Tor Vergata")
    Abstract: In this paper, I propose a data-driven approach to discover heterogeneous subpopulations in a selection-on-observables framework that avoids the risk of data snooping and the drawbacks of pre-analysis plans. The approach constructs partitions of the population in a completely nonparametric fashion and can handle covariate spaces of arbitrary dimensions and arbitrary patterns of interaction among covariates. I exploit estimated unit-level treatment effects to grow and prune an “aggregation tree” that aggregates observations into groups. This approach formalizes the trade-off between parsimony and granularity implicit in the aggregation process. By varying the key parameter of the assumed cost-complexity criterion, a sequence of “optimal” partitions is generated, one for each level of granularity. The resulting sequence is nested, as previous groupings are never undone when moving to coarser levels. I illustrate the use of the proposed methodology through an empirical exercise that revisits the effects of maternal smoking on infants’ weight.
    Keywords: Causality, conditional average treatment effects, recursive partitioning, subgroup discovery, subgroup analysis
    JEL: C29 C45 C55
    Date: 2022–12–15
    URL: http://d.repec.org/n?u=RePEc:rtv:ceisrp:546&r=ecm
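    Illustration: a rough analogue of growing and pruning an aggregation tree with off-the-shelf tools, fitting a regression tree to already-estimated unit-level effects; the cost-complexity path yields the nested sequence of partitions (the paper's estimation and inference details are omitted):
      import numpy as np
      from sklearn.tree import DecisionTreeRegressor

      rng = np.random.default_rng(8)
      n = 2000
      X = rng.normal(size=(n, 4))
      cate_hat = 1.0 + 0.8 * (X[:, 0] > 0) + 0.1 * rng.normal(size=n)

      tree = DecisionTreeRegressor(min_samples_leaf=100)
      path = tree.cost_complexity_pruning_path(X, cate_hat)
      for alpha in path.ccp_alphas[-4:]:          # from finer to coarser groupings
          t = DecisionTreeRegressor(min_samples_leaf=100, ccp_alpha=alpha)
          print(alpha, t.fit(X, cate_hat).get_n_leaves())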
  20. By: John Mullahy; Edward C. Norton
    Abstract: Dependent variables that are non-negative, follow right-skewed distributions, and have large probability mass at zero arise often in empirical economics. Two classes of models that transform the dependent variable y — the natural logarithm of y plus a constant and the inverse hyperbolic sine — have been widely used in empirical work. We show that these two classes of models share several features that raise concerns about their application. The concerns are particularly prominent when dependent variables are frequently observed at zero, which in many instances is the main motivation for using them in the first place. The crux of the concern is that these models have an extra parameter that is generally not determined by theory but whose values have enormous consequences for point estimates. As these parameters go to extreme values, estimated marginal effects on outcomes' natural scales approach those of either an untransformed linear regression or a normed linear probability model. Across a wide variety of simulated data, two-part models yield correct marginal effects, as do OLS on the untransformed y and Poisson regression. If researchers care about estimating marginal effects, we recommend using these simpler models that do not rely on transformations.
    JEL: C18 C20 I10
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:30735&r=ecm
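    Illustration: the sensitivity the paper warns about is easy to see numerically; the scale constants below are arbitrary:
      # log(y + c) and arcsinh(theta * y) for a sometimes-zero outcome: the
      # "free" constant reshapes differences across the distribution, so implied
      # marginal effects swing between linear-regression-like and LPM-like values.
      import numpy as np

      y = np.array([0.0, 1.0, 10.0, 100.0])
      for c in [0.01, 1.0, 100.0]:
          print("log(y+c),        c =", c, "->", np.round(np.log(y + c), 2))
      for theta in [0.01, 1.0, 100.0]:
          print("arcsinh(theta*y), theta =", theta, "->", np.round(np.arcsinh(theta * y), 2))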
  21. By: Mawuli Segnon
    Abstract: This paper establishes necessary and sufficient conditions for the existence of a unique strictly stationary and ergodic solution for integer-valued autoregressive conditional heteroscedasticity (INARCH) processes. We also provide conditions that guarantee the existence of higher order moments. The results apply to the integer-valued GARCH model and its long-memory versions with hyperbolically decaying coefficients, and turn out to be instrumental in deriving large sample properties of the maximum likelihood estimators of the model parameters.
    Keywords: INARCH processes; Stationarity; Ergodicity; Lyapunov exponent; Maximum likelihood estimation
    JEL: C1 C4 C5
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:cqe:wpaper:10222&r=ecm
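    Illustration: the simplest member of the class, a Poisson INARCH(1), where alpha < 1 delivers a strictly stationary solution; a simulation consistent with the stationary mean omega/(1 - alpha):
      import numpy as np

      rng = np.random.default_rng(9)
      omega, alpha, T = 1.0, 0.6, 5000
      y = np.zeros(T, dtype=int)
      lam = omega / (1 - alpha)                 # start at the stationary mean
      y[0] = rng.poisson(lam)
      for t in range(1, T):
          lam = omega + alpha * y[t - 1]        # conditional intensity
          y[t] = rng.poisson(lam)
      print(y.mean(), omega / (1 - alpha))      # both near 2.5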
  22. By: Zuckerman, Daniel
    Abstract: When a physical or mathematical model is inferred from experimental data, it is essential to assess uncertainties in model parameters, if only because highly uncertain parameters effectively have not been learned from the data. This discussion compares two frameworks for estimating uncertainty: maximum likelihood (ML) and Bayesian inference (BI). We see that the ML framework is an approximation to the BI approach, in that ML uses a subset of the likelihood information whereas BI uses all of it. Interestingly, both approaches start from the same likelihood-based probabilistic framework. Both approaches require prior assumptions, which may only remain implicit in the case of ML. Both approaches require numerical care in complex systems with rough parameter-space landscapes.
    Date: 2022–11–23
    URL: http://d.repec.org/n?u=RePEc:osf:osfxxx:ajuvf&r=ecm
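    Illustration: the contrast in the simplest possible setting, a Bernoulli rate with few observations; the ML error bar uses only the local curvature of the log-likelihood, while the Bayesian posterior uses the whole likelihood (a uniform prior is assumed here):
      import numpy as np
      from scipy import stats

      k, n = 2, 10                                  # 2 successes in 10 trials
      p_ml = k / n
      se_ml = np.sqrt(p_ml * (1 - p_ml) / n)        # inverse observed information
      print(p_ml, se_ml)                            # symmetric, can spill past 0

      post = stats.beta(k + 1, n - k + 1)           # uniform prior -> Beta posterior
      print(post.ppf([0.025, 0.975]))               # asymmetric, respects (0, 1)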
  23. By: Shi, Shuping (Macquarie University); Yu, Jun (Singapore Management University); Zhang, Chen (Singapore Management University)
    Abstract: The fractional Brownian motion (fBm) process is a continuous-time Gaussian process with its increment being the fractional Gaussian noise (fGn). It has enjoyed widespread empirical applications across many fields, from science to economics and finance. The dynamics of fBm and fGn are governed by a fractional parameter H ∈ (0, 1). This paper first derives an analytical expression for the spectral density of fGn and investigates the accuracy of various approximation methods for the spectral density. Next, we conduct an extensive Monte Carlo study comparing the finite sample performance and computational cost of alternative estimation methods for H under the fGn specification. These methods include the log periodogram regression method, the local Whittle method, the time-domain maximum likelihood (ML) method, the Whittle ML method, and the change-of-frequency method. We implement two versions of the Whittle method, one based on the analytical expression for the spectral density and the other based on Paxson’s approximation. Special attention is paid to highly anti-persistent processes with H close to zero, which are of empirical relevance to financial volatility modelling. Considering the trade-off between statistical and computational efficiency, we recommend using either the Whittle ML method based on Paxson’s approximation or the time-domain ML method. We model the log realized volatility dynamics of 40 financial assets in the US market from 2012 to 2019 with fBm. Although all estimation methods suggest rough volatility, the implied degree of roughness varies substantially with the estimation methods, highlighting the importance of understanding the finite sample performance of various estimation methods.
    Keywords: Fractional Brownian motion; Fractional Gaussian noise; Semiparametric method; Maximum likelihood; Whittle likelihood; Change-of-frequency; Realised volatility
    JEL: C12 C22 G01
    Date: 2022–11–22
    URL: http://d.repec.org/n?u=RePEc:ris:smuesw:2022_013&r=ecm
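    Illustration: one of the compared semiparametric methods, the local Whittle estimator, fits the low-frequency periodogram slope; for fGn the Hurst parameter is H = d + 1/2 (the bandwidth rule below is an arbitrary convention):
      import numpy as np
      from scipy.optimize import minimize_scalar

      def local_whittle_H(x, m=None):
          n = x.size
          m = m or int(n ** 0.65)                              # low-frequency band
          lam = 2 * np.pi * np.arange(1, m + 1) / n
          I = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2 * np.pi * n)
          R = lambda d: np.log(np.mean(lam ** (2 * d) * I)) - 2 * d * np.mean(np.log(lam))
          d = minimize_scalar(R, bounds=(-0.49, 0.49), method="bounded").x
          return d + 0.5

      rng = np.random.default_rng(10)
      print(local_whittle_H(rng.normal(size=4096)))            # near 0.5 for white noise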
  24. By: Malte Kn\"uppel; Fabian Kr\"uger; Marc-Oliver Pohle
    Abstract: Multivariate distributional forecasts have become widespread in recent years. To assess the quality of such forecasts, suitable evaluation methods are needed. In the univariate case, calibration tests based on the probability integral transform (PIT) are routinely used. However, multivariate extensions of PIT-based calibration tests face various challenges. We therefore introduce a general framework for calibration testing in the multivariate case and propose two new tests that arise from it. Both approaches use proper scoring rules and are simple to implement even in large dimensions. The first employs the PIT of the score. The second is based on comparing the expected performance of the forecast distribution (i.e., the expected score) to its actual performance based on realized observations (i.e., the realized score). The tests have good size and power properties in simulations and solve various problems of existing tests. We apply the new tests to forecast distributions for macroeconomic and financial time series data.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.16362&r=ecm
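    Illustration: a Monte Carlo sketch of the first test's ingredient, the PIT of the score: under a calibrated forecast, the realized score's rank among scores of draws from the forecast distribution is uniform (a multivariate normal forecast and the log score are assumed here):
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(11)
      T, d, S = 300, 3, 400
      F = stats.multivariate_normal(np.zeros(d), np.eye(d))

      pits = []
      for _ in range(T):
          y = F.rvs(random_state=rng)               # calibrated case: y ~ F
          draws = F.rvs(size=S, random_state=rng)
          s_real = -F.logpdf(y)                     # realized log score
          s_sim = -F.logpdf(draws)                  # score distribution under F
          pits.append((s_sim <= s_real).mean())
      print(stats.kstest(pits, "uniform"))          # uniformity should not be rejected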
  25. By: Oliver Cassagneau-Francis (UCL - University College of London [London]); Robert Gary-Bobo (UP1 UFR02 - Université Paris 1 Panthéon-Sorbonne - École d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne, CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, CREST-THEMA - CREST - Centre de Recherche en Économie et Statistique - ENSAI - Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] - X - École polytechnique - ENSAE Paris - École Nationale de la Statistique et de l'Administration Économique - CNRS - Centre National de la Recherche Scientifique - THEMA - Théorie économique, modélisation et applications - CNRS - Centre National de la Recherche Scientifique - CY - CY Cergy Paris Université); Julie Pernaudet (University of Chicago); Jean-Marc Robin (ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique)
    Abstract: We develop a finite-mixture framework for nonparametric difference-in-differences analysis with unobserved heterogeneity correlating treatment and outcome. Our framework includes an instrumental variable for the treatment, and we demonstrate that this allows us to relax the common-trend assumption. Outcomes can be modeled as first-order Markovian, provided at least two post-treatment observations of the outcome are available. We provide a nonparametric identification proof. We apply our framework to evaluate the effect of on-the-job training on wages, using novel French linked employee-employer data. Estimating our model using an EM-algorithm, we find small ATEs and ATTs on hourly wages, around 1%.
    Keywords: Finite Mixtures, Unobserved Heterogeneity, EM Algorithm, Wage Distributions, Training, Matched Employer-Employee Data
    JEL: E24 E32 J63 J64
    Date: 2022–10–10
    URL: http://d.repec.org/n?u=RePEc:hal:wpaper:hal-03869547&r=ecm
  26. By: Jan Ditzen (Free University of Bozen-Bolzano, Italy); Francesco Ravazzolo (Free University of Bozen-Bolzano, Italy)
    Abstract: For western economies a long-forgotten phenomenon is on the horizon: rising inflation rates. We propose a novel approach christened D^{2}ML to identify drivers of national inflation. D^{2}ML combines machine learning for model selection with time dependent data and graphical models to estimate the inverse of the covariance matrix, which is then used to identify dominant drivers. Using a dataset of 33 countries, we find that the US inflation rate and oil prices are dominant drivers of national inflation rates. For a more general framework, we carry out Monte Carlo simulations to show that our estimator correctly identifies dominant drivers.
    Keywords: Time Series, Machine Learning, LASSO, High dimensional data, Dominant Units, Inflation.
    JEL: C22 C23 C55
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:bzn:wpaper:bemps97&r=ecm
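    Illustration: a loose sketch of the graphical-model ingredient of D^{2}ML, estimating a sparse inverse covariance across country series and flagging the unit with the heaviest off-diagonal connections; this is a stand-in, not the authors' full procedure.
      import numpy as np
      from sklearn.covariance import GraphicalLasso

      rng = np.random.default_rng(12)
      T, N = 400, 10
      x0 = rng.normal(size=T)                                 # dominant unit 0
      X = np.column_stack([x0] + [0.7 * x0 + rng.normal(size=T) for _ in range(N - 1)])
      X = (X - X.mean(0)) / X.std(0)

      K = GraphicalLasso(alpha=0.2).fit(X).precision_
      strength = np.abs(K - np.diag(np.diag(K))).sum(axis=0)  # off-diagonal mass
      print(strength.argmax())                                # flags unit 0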
  27. By: Yingyao Hu
    Abstract: In empirical studies, the data usually do not include all the variables of interest in an economic model. This paper shows the identification of unobserved variables in observations at the population level. When the observables are distinct in each observation, there exists a function mapping from the observables to the unobservables. Such a function guarantees the uniqueness of the latent value in each observation. The key lies in the identification of the joint distribution of observables and unobservables from the distribution of observables. The joint distribution of observables and unobservables then reveals the latent value in each observation. Three examples of this result are discussed.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.02585&r=ecm
  28. By: Nicolas Apfel; Frank Windmeijer
    Abstract: For the classical linear model with an endogenous variable estimated by the method of instrumental variables (IVs) with multiple instruments, Masten and Poirier (2021) introduced the falsification adaptive set (FAS). When a model is falsified, the FAS reflects the model uncertainty that arises from falsification of the baseline model. It is the set of just-identified IV estimands, where each relevant instrument is considered as the just-identifying instrument in turn, whilst all other instruments are included as controls. It therefore applies to the case where the exogeneity assumption holds and invalid instruments violate the exclusion assumption only. We propose a generalized FAS that reflects the model uncertainty when some instruments violate the exogeneity assumption and/or some instruments violate the exclusion assumption. This FAS is the set of all possible just-identified IV estimands where the just-identifying instrument is relevant. There are a maximum of $k_{z}2^{k_{z}-1}$ such estimands, where $k_{z}$ is the number of instruments. If there is at least one relevant instrument that is valid in the sense that it satisfies the exogeneity and exclusion assumptions, then this generalized FAS is guaranteed to contain $\beta$ and therefore to be the identified set for $\beta$.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.04814&r=ecm
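    Illustration: the generalized FAS can be enumerated directly; with k_z instruments there are k_z * 2^(k_z - 1) just-identified estimands (each instrument just-identifies in turn, with every subset of the others as controls). A numerical sketch:
      import numpy as np
      from itertools import combinations

      def resid(a, W):                      # partial out controls W (incl. constant)
          return a - W @ np.linalg.lstsq(W, a, rcond=None)[0]

      def generalized_fas(y, x, Z):
          n, k = Z.shape
          out = []
          for j in range(k):
              others = [i for i in range(k) if i != j]
              for r in range(k):
                  for S in combinations(others, r):
                      W = np.column_stack([np.ones(n), Z[:, list(S)]])
                      yt, xt, zt = (resid(v, W) for v in (y, x, Z[:, j]))
                      out.append((zt @ yt) / (zt @ xt))       # just-identified IV
          return np.array(out)

      rng = np.random.default_rng(13)
      n = 1000
      Z = rng.normal(size=(n, 3))
      x = Z.sum(axis=1) + rng.normal(size=n)
      y = 0.5 * x + rng.normal(size=n)              # all instruments valid here
      est = generalized_fas(y, x, Z)
      print(est.size, est.min(), est.max())         # 12 estimands, all near 0.5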
  29. By: Edwin Fourrier-Nicolai (University of Trento [Trento]); Michel Lubrano (AMSE - Aix-Marseille Sciences Economiques - EHESS - École des hautes études en sciences sociales - AMU - Aix Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique)
    Abstract: This paper examines the question of non-anonymous Growth Incidence Curves (na-GIC) from a Bayesian inferential point of view. Building on the notion of conditional quantiles of Barnett (1976), we show that removing the anonymity axiom leads to a non-parametric inference problem. From a Bayesian point of view, an approach using Bernstein polynomials provides a simple solution and immediate confidence intervals, tests and a way to compare two na-GICs. The paper illustrates the approach on the question of academic wage formation and tries to shed some light on whether academic recruitment leads to a superstar phenomenon, that is, a large increase in top wages. Equipped with Bayesian na-GICs, we show that wages at Michigan State University experienced a top compression leading to a shrinking of the wage scale. Finally, we analyse gender and ethnicity questions in order to detect whether the implemented pro-active policies were effective.
    Keywords: Conditional quantiles, non-anonymous GIC, Bayesian inference, wage formation, gender policy, ethnic discrimination
    Date: 2022–11–30
    URL: http://d.repec.org/n?u=RePEc:hal:wpaper:hal-03880243&r=ecm
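    Illustration: the Bernstein-polynomial building block, smoothing an empirical quantile function on a grid; a na-GIC would then contrast such curves across two periods at the same ranks (a sketch only, not the full Bayesian machinery):
      import numpy as np
      from scipy.special import comb

      def bernstein_quantile(sample, grid, n=30):
          k = np.arange(n + 1)
          q = np.quantile(sample, k / n)            # quantile values at nodes k/n
          B = comb(n, k) * grid[:, None] ** k * (1 - grid[:, None]) ** (n - k)
          return B @ q                              # smooth quantile curve on grid

      rng = np.random.default_rng(14)
      wages = rng.lognormal(mean=3.0, sigma=0.5, size=2000)
      u = np.linspace(0.01, 0.99, 99)
      print(bernstein_quantile(wages, u)[:3])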
  30. By: Forni, Mario (University of Modena and Reggio Emilia, CEPR and RECent); Gambetti, Luca (University of Barcelona, BSE, University of Turin & CCA); Ricco, Giovanni (University of Warwick, OFCE-SciencesPo, and CEPR)
    Abstract: We propose a novel external-instrument SVAR procedure to identify and estimate the impulse response functions, regardless of the shock being invertible or recoverable. When the shock is recoverable, we also show how to estimate the unit variance shock and the ‘absolute’ response functions. When the shock is invertible, the method collapses to the standard proxy-SVAR procedure. We show how to test for recoverability and invertibility. We apply our techniques to a monetary policy VAR. It turns out that, using standard specifications, the monetary policy shock is not invertible, but is recoverable. When using our procedure, results are plausible even in a parsimonious specification, not including financial variables. Monetary policy has significant and sizeable effects on prices.
    Keywords: Proxy-SVAR ; SVAR-IV ; Impulse response functions ; Variance Decomposition ; Historical Decomposition ; Monetary Policy Shock
    JEL: C32 E32
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:wrk:warwec:1444&r=ecm
  31. By: Jaros{\l}aw Gruszka; Janusz Szwabi\'nski
    Abstract: Parametric estimation of stochastic differential equations (SDEs) has been a subject of intense studies for several decades already. The Heston model, for instance, is driven by two coupled SDEs and is often used in financial mathematics for the dynamics of asset prices and their volatility. Calibrating it to real data would be very useful in many practical scenarios. It is very challenging, however, since the volatility is not directly observable. In this paper, a complete estimation procedure of the Heston model without and with jumps in the asset prices is presented. Bayesian regression combined with the particle filtering method is used as the estimation framework. Within the framework, we propose a novel approach to handle jumps in order to neutralise their negative impact on the estimates of the key parameters of the model. An improvement of the sampling in the particle filtering method is discussed as well. Our analysis is supported by numerical simulations of the Heston model to investigate the performance of the estimators. A practical follow-along recipe is also given for finding adequate estimates from any given data.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.14814&r=ecm
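    Illustration: a data-generating sketch of the model being calibrated, a full-truncation Euler scheme for Heston dynamics with Laplace jumps added to the log-price (parameter values are arbitrary):
      import numpy as np

      rng = np.random.default_rng(15)
      T, dt = 2500, 1 / 252
      mu, kappa, theta, xi, rho = 0.05, 3.0, 0.04, 0.5, -0.7
      lam_j, scale_j = 5.0, 0.02                    # jump intensity and size

      v, s = np.empty(T), np.empty(T)
      v[0], s[0] = theta, 0.0                       # s is the log-price
      for t in range(1, T):
          z1 = rng.normal()
          z2 = rho * z1 + np.sqrt(1 - rho ** 2) * rng.normal()
          vp = max(v[t - 1], 0.0)                   # full truncation: variance >= 0
          v[t] = v[t - 1] + kappa * (theta - vp) * dt + xi * np.sqrt(vp * dt) * z2
          jump = rng.laplace(0.0, scale_j) if rng.random() < lam_j * dt else 0.0
          s[t] = s[t - 1] + (mu - 0.5 * vp) * dt + np.sqrt(vp * dt) * z1 + jump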
  32. By: Laura Eslava; Fernando Baltazar-Larios; Bor Reynoso
    Abstract: We propose a method for obtaining maximum likelihood estimates (MLEs) of a Markov-Modulated Jump-Diffusion Model (MMJDM) when the data is a discrete time sample of the diffusion process, the jumps follow a Laplace distribution, and the parameters of the diffusion are controlled by a Markov Jump Process (MJP). The data can be viewed as incomplete observation of a model with a tractable likelihood function. Therefore, we use the EM-algorithm to obtain MLEs of the parameters. We validate our method with simulated data. The motivation for estimating this model is that stock prices have distinct drift and volatility at distinct periods of time. The assumption is that these phases are modulated by macroeconomic environments whose changes are given by discontinuities or jumps in prices. This model improves on the stock price representation of classical models such as the model of Black and Scholes or Merton's Jump-Diffusion Model (JDM). We fit the model to the stock prices of Amazon and Netflix over a 15-year period and use our method to estimate the MLEs.
    Date: 2022–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2211.17220&r=ecm
  33. By: George Daniel Mateescu (Institute for Economic Forecasting, Romanian Academy)
    Abstract: In the present work, we investigate the use, in the analysis of data series, of notions analogous to the entropy and the informational energy of a random distribution. These tools make it possible to assess certain characteristics of a data series, exemplified in the article by the linear model.
    Keywords: entropy, frequency, histogram, informational energy
    JEL: C22 C40
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:rjr:wpiecf:221001&r=ecm
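    Illustration: both quantities can be computed from a histogram of the series; informational energy (Onicescu) is the sum of squared cell probabilities, so it moves opposite to entropy:
      import numpy as np

      def entropy_and_energy(x, bins=20):
          counts, _ = np.histogram(x, bins=bins)
          p = counts[counts > 0] / x.size
          return -np.sum(p * np.log(p)), np.sum(p ** 2)   # H, E

      rng = np.random.default_rng(16)
      print(entropy_and_energy(rng.normal(size=10_000)))
      print(entropy_and_energy(rng.uniform(size=10_000)))  # higher H, lower E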
  34. By: David Van Dijcke
    Abstract: It is well-known that production functions are potentially misspecified when revenue is used as a proxy for output. In this paper, I formalize and strengthen this common knowledge by showing that neither the production function nor Hicks-neutral productivity can be identified when revenue is used as a proxy for physical output. This result holds under the standard assumptions used in the literature for a large class of production functions, including all commonly used parametric forms. Among the prevalent approaches to address this issue, I show that only those which impose assumptions on the underlying demand system can possibly identify the production function.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.04620&r=ecm
  35. By: Clément de Chaisemartin (ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique); Xavier d'Haultfoeuille (CREST - Centre de Recherche en Économie et Statistique - ENSAI - Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] - X - École polytechnique - ENSAE Paris - École Nationale de la Statistique et de l'Administration Économique - CNRS - Centre National de la Recherche Scientifique)
    Abstract: We propose a difference-in-difference estimator with a continuous treatment. We consider a heterogeneous adoption design where no unit is treated at period one, all units receive a strictly positive treatment dose at period two, and there are units with a treatment dose close to zero at period two. Our estimator uses "quasi-stayers", namely units with a period-two treatment below a bandwidth, as a control group to infer the counterfactual outcome evolution without treatment. We propose an optimal bandwidth that minimizes an asymptotic approximation of the estimator's mean-squared-error.
    Date: 2022–11–23
    URL: http://d.repec.org/n?u=RePEc:hal:wpaper:hal-03873937&r=ecm
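    Illustration: one plausible form of the quasi-stayer idea (an assumption for illustration, not necessarily the paper's exact estimator): units with a period-two dose below the bandwidth pin down the counterfactual trend, and the remaining outcome change is scaled by the average dose.
      import numpy as np

      rng = np.random.default_rng(17)
      n = 5000
      D = rng.exponential(0.5, size=n)             # period-2 doses, mass near zero
      dY = 1.0 + 2.0 * D + rng.normal(size=n)      # trend 1.0, effect 2.0 per dose

      h = np.quantile(D, 0.05)                     # illustrative bandwidth choice
      trend_hat = dY[D <= h].mean()                # trend from quasi-stayers
      print((dY.mean() - trend_hat) / D.mean())    # close to 2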
  36. By: Marc Wildi; Branka Hadji Misheva
    Abstract: Artificial intelligence is creating one of the biggest revolutions across technology-driven application fields. For the finance sector, it offers many opportunities for significant market innovation, and yet broad adoption of AI systems heavily relies on our trust in their outputs. Trust in technology is enabled by understanding the rationale behind the predictions made. To this end, the concept of eXplainable AI (XAI) emerged, introducing a suite of techniques attempting to explain to users how complex models arrive at a certain decision. For cross-sectional data, classical XAI approaches can lead to valuable insights about the models' inner workings, but these techniques generally cannot cope well with longitudinal data (time series) in the presence of dependence structure and non-stationarity. We here propose a novel XAI technique for deep learning methods which preserves and exploits the natural time ordering of the data.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.02906&r=ecm
  37. By: Shi, Chengchun; Zhang, Shengxing; Lu, Wenbin; Song, Rui
    Abstract: Reinforcement learning is a general technique that allows an agent to learn an optimal policy and interact with an environment in sequential decision-making problems. The goodness of a policy is measured by its value function starting from some initial state. The focus of this paper is to construct confidence intervals (CIs) for a policy’s value in infinite horizon settings where the number of decision points diverges to infinity. We propose to model the state-action value function (Q-function) associated with a policy based on series/sieve methods to derive its confidence interval. When the target policy depends on the observed data as well, we propose a SequentiAl Value Evaluation (SAVE) method to recursively update the estimated policy and its value estimator. As long as either the number of trajectories or the number of decision points diverges to infinity, we show that the proposed CI achieves nominal coverage even in cases where the optimal policy is not unique. Simulation studies are conducted to back up our theoretical findings. We apply the proposed method to a dataset from mobile health studies and find that reinforcement learning algorithms could help improve patients' health status. A Python implementation of the proposed procedure is available at https://github.com/shengzhang37/SAVE.
    Keywords: bidirectional asymptotics; confidence interval; infinite horizons; reinforcement learning; value function
    JEL: C1
    Date: 2022–07–01
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:110882&r=ecm
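    Illustration: a simplified linear-sieve flavour of the Q-function step, where least-squares temporal difference (LSTD) estimation solves the empirical Bellman equation; the paper's CI construction and sequential updating are omitted.
      import numpy as np

      def lstd(phi, phi_next, r, gamma=0.9):
          A = phi.T @ (phi - gamma * phi_next)     # empirical Bellman system
          return np.linalg.solve(A, phi.T @ r)

      rng = np.random.default_rng(18)
      n, p, gamma = 5000, 4, 0.9
      phi = rng.normal(size=(n, p))                # features of (state, action)
      phi_next = 0.8 * phi + 0.2 * rng.normal(size=(n, p))
      theta_true = np.array([1.0, -0.5, 0.0, 0.3])
      r = (phi - gamma * phi_next) @ theta_true + 0.1 * rng.normal(size=n)
      print(lstd(phi, phi_next, r, gamma))         # recovers theta_true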

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.