nep-ecm New Economics Papers
on Econometrics
Issue of 2023‒01‒23
twenty papers chosen by
Sune Karlsson
Örebro universitet

  1. Score-type tests for normal mixtures By Dante Amengual; Xinyue Bei; Marine Carrasco; Enrique Sentana
  2. Missing Data in Asset Pricing Panels By Joachim Freyberger; Björn Höppner; Andreas Neuhierl; Michael Weber
  3. Semiparametric Distribution Regression with Instruments and Monotonicity By Dominik Wied
  4. Simultaneous Inference of Trend in Partially Linear Time Series By Jiaqi Li; Likai Chen; Kun Ho Kim; Tianwei Zhou
  5. Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations By Aurelien Bibaut; Nathan Kallus; Michael Lindon
  6. On LASSO for High Dimensional Predictive Regression By Ziwei Mei; Zhentao Shi
  7. Local Projection Based Inference under General Conditions By Ke-Li Xu
  8. Supercompliers By Matthew L. Comey; Amanda R. Eng; Zhuan Pei
  9. Causal identification with subjective outcomes By Leonard Goff
  10. What Estimators Are Unbiased For Linear Models? By Lihua Lei; Jeffrey Wooldridge
  11. Gaussian Heteroskedastic Empirical Bayes without Independence By Jiafeng Chen
  12. Estimation and forecasting using mixed-frequency DSGE models By Meyer-Gohde, Alexander; Shabalina, Ekaterina
  13. A Note on the Estimation of Job Amenities and Labor Productivity By Arnaud Dupuy; Alfred Galichon
  14. Forward Orthogonal Deviations GMM and the Absence of Large Sample Bias By Robert F. Phillips
  15. Efficient Sampling for Realized Variance Estimation in Time-Changed Diffusion Models By Timo Dimitriadis; Roxana Halbleib; Jeannine Polivka; Sina Streicher
  16. Measuring Poverty Dynamics with Synthetic Panels Based on Repeated Cross-Sections By Hai-Anh Dang; Peter Lanjouw
  17. Probabilistic quantile factor analysis By Dimitris Korobilis; Maximilian Schröder
  18. The acceptable R-square in empirical modelling for social science research By Ozili, Peterson K
  19. Machine learning methods in finance: Recent applications and prospects By Hoang, Daniel; Wiegratz, Kevin
  20. How Much Should we Trust Estimates of Firm Effects and Worker Sorting? By Stephane Bonhomme; Kerstin Holzheu; Thibaut Lamadon; Elena Manresa; Magne Mogstad; Bradley Setzler

  1. By: Dante Amengual (CEMFI, Centro de Estudios Monetarios y Financieros); Xinyue Bei (Duke University); Marine Carrasco (Université de Montréal); Enrique Sentana (CEMFI, Centro de Estudios Monetarios y Financieros)
    Abstract: Testing normality against discrete normal mixtures is complex because some parameters turn increasingly underidentified along alternative ways of approaching the null, others are inequality constrained, and several higher-order derivatives become identically 0. These problems make the maximum of the alternative model log-likelihood function numerically unreliable. We propose score-type tests asymptotically equivalent to the likelihood ratio as the largest of two simple intuitive statistics that only require estimation under the null. One novelty of our approach is that we treat symmetrically both ways of writing the null hypothesis without excluding any region of the parameter space. We derive the asymptotic distribution of our tests under the null and sequences of local alternatives. We also show that their asymptotic distribution is the same whether applied to observations or standardized residuals from heteroskedastic regression models. Finally, we study their power in simulations and apply them to the residuals of Mincer earnings functions.
    Keywords: Generalized extremum tests, higher-order identifiability, likelihood ratio test, Mincer equations.
    JEL: C12 C46
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:cmf:wpaper:wp2022_2213&r=ecm
  2. By: Joachim Freyberger; Björn Höppner; Andreas Neuhierl; Michael Weber
    Abstract: Missing data for return predictors is a common problem in cross-sectional asset pricing. Most papers do not explicitly discuss how they deal with missing data, but conventional treatments focus on the subset of firms with no missing data for any predictor or impute the unconditional mean. Both methods have undesirable properties: they are either inefficient or lead to biased estimators and incorrect inference. We propose a simple and computationally attractive alternative using conditional mean imputations and weighted least squares, cast in a generalized method of moments (GMM) framework. This method allows us to use all observations with observed returns, it results in valid inference, and it can be applied in non-linear and high-dimensional settings. In Monte Carlo simulations, we find that it performs almost as well as the efficient but computationally costly GMM estimator in many cases. We apply our procedure to a large panel of return predictors and find that it leads to improved out-of-sample predictability.
    JEL: C14 C58 G12
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:30761&r=ecm
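The conditional-mean-imputation idea can be sketched in a few lines. The toy example below is ours, not the authors' GMM/WLS estimator: a predictor with values missing at random is imputed with its conditional mean given an always-observed predictor, and plain OLS is then run on all observations; every name and the data-generating process are assumptions for illustration.

```python
# Toy sketch of conditional mean imputation (not the paper's estimator).
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)                      # always-observed predictor
x2 = 0.6 * x1 + rng.normal(size=n)           # predictor with missing values
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

miss = rng.random(n) < 0.3                   # 30% of x2 missing at random
x2_obs = x2.copy()
x2_obs[miss] = np.nan

# Conditional mean imputation: fit E[x2 | x1] on complete cases only
complete = ~miss
fit = np.polyfit(x1[complete], x2_obs[complete], 1)
x2_imp = np.where(miss, np.polyval(fit, x1), x2_obs)

# OLS on the imputed design matrix: all n observations are used
X = np.column_stack([np.ones(n), x1, x2_imp])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)                                  # slopes close to 1.0 and 0.5
```

Because the imputation is the true conditional mean, the composite error stays uncorrelated with the regressors and the slope estimates remain consistent; the paper's WLS/GMM refinement addresses the efficiency loss this naive version leaves on the table.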
  3. By: Dominik Wied
    Abstract: This paper proposes IV-based estimators for the semiparametric distribution regression model in the presence of an endogenous regressor, which are based on an extension of IV probit estimators. We discuss the causal interpretation of the estimators and two methods (monotone rearrangement and isotonic regression) to ensure a monotonically increasing distribution function. Asymptotic properties and simulation evidence are provided. An application to wage equations reveals statistically significant and heterogeneous differences to the inconsistent OLS-based estimator.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.03704&r=ecm
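The two monotonization devices mentioned in the abstract are easy to illustrate. The sketch below is our own toy example (the CDF values and function names are assumed, not the paper's code): monotone rearrangement simply sorts the estimated distribution-function values, while isotonic regression projects them onto the set of non-decreasing sequences via a basic pool-adjacent-violators pass.

```python
# Toy illustration of monotone rearrangement vs. isotonic regression.
import numpy as np

def rearrange(F):
    """Monotone rearrangement: sort the estimated CDF values."""
    return np.sort(F)

def isotonic(F):
    """Pool-adjacent-violators algorithm for the L2 isotonic projection."""
    out = []                                  # stack of [block mean, block size]
    for v in F:
        out.append([float(v), 1])
        while len(out) > 1 and out[-2][0] > out[-1][0]:
            m2, s2 = out.pop()
            m1, s1 = out.pop()
            out.append([(m1 * s1 + m2 * s2) / (s1 + s2), s1 + s2])
    return np.concatenate([[m] * s for m, s in out])

F_hat = np.array([0.1, 0.35, 0.3, 0.5, 0.45, 0.8])   # non-monotone estimate
print(rearrange(F_hat))
print(isotonic(F_hat))                               # violators pooled into flat blocks
```

Both outputs are non-decreasing; rearrangement reshuffles values across evaluation points, whereas isotonic regression averages adjacent violators in place.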
  4. By: Jiaqi Li; Likai Chen; Kun Ho Kim; Tianwei Zhou
    Abstract: We introduce a new methodology to conduct simultaneous inference of non-parametric trend in a partially linear time series regression model where the trend is a multivariate unknown function. In particular, we construct a simultaneous confidence region (SCR) for the trend function by extending the high-dimensional Gaussian approximation to dependent processes with continuous index sets. Our results allow for a more general dependence structure compared to previous works and are widely applicable to a variety of linear and non-linear auto-regressive processes. We demonstrate the validity of our proposed inference approach by examining the finite-sample performance in the simulation study. The method is also applied to a real example in time series: the forward premium regression, where we construct the SCR for the foreign exchange risk premium in the exchange rate data.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.10359&r=ecm
  5. By: Aurelien Bibaut; Nathan Kallus; Michael Lindon
    Abstract: Sequential testing, always-valid $p$-values, and confidence sequences promise flexible statistical inference and on-the-fly decision making. However, unlike fixed-$n$ inference based on asymptotic normality, existing sequential tests either make parametric assumptions and end up under-covering/over-rejecting when these fail or use non-parametric but conservative concentration inequalities and end up over-covering/under-rejecting. To circumvent these issues, we sidestep exact at-least-$\alpha$ coverage and focus on asymptotically exact coverage and asymptotic optimality. That is, we seek sequential tests whose probability of ever rejecting a true hypothesis asymptotically approaches $\alpha$ and whose expected time to reject a false hypothesis approaches a lower bound on all tests with asymptotic coverage at least $\alpha$, both under an appropriate asymptotic regime. We permit observations to be both non-parametric and dependent and focus on testing whether the observations form a martingale difference sequence. We propose the universal sequential probability ratio test (uSPRT), a slight modification to the normal-mixture sequential probability ratio test, where we add a burn-in period and adjust thresholds accordingly. We show that even in this very general setting, the uSPRT is asymptotically optimal under mild generic conditions. We apply the results to stabilized estimating equations to test means, treatment effects, etc. Our results also provide corresponding guarantees for the implied confidence sequences. Numerical simulations verify our guarantees and the benefits of the uSPRT over alternatives.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.14411&r=ecm
  6. By: Ziwei Mei; Zhentao Shi
    Abstract: In a high dimensional linear predictive regression where the number of potential predictors can be larger than the sample size, we consider using LASSO, a popular L1-penalized regression method, to estimate the sparse coefficients when many unit root regressors are present. Consistency of LASSO relies on two building blocks: the deviation bound of the cross product of the regressors and the error term, and the restricted eigenvalue of the Gram matrix of the regressors. In our setting where unit root regressors are driven by temporally dependent non-Gaussian innovations, we establish original probabilistic bounds for these two building blocks. The bounds imply that the rates of convergence of LASSO are different from those in the familiar cross sectional case. In practical applications, given a mixture of stationary and nonstationary predictors, the asymptotic guarantee of LASSO is preserved if all predictors are scale-standardized. In an empirical example of forecasting the unemployment rate with many macroeconomic time series, strong performance is delivered by LASSO when the initial specification is guided by macroeconomic domain expertise.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.07052&r=ecm
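The practical prescription of scale-standardizing a mix of stationary and unit-root predictors before LASSO can be sketched as follows. This is our minimal illustration, not the paper's procedure: the coordinate-descent solver, penalty level, and data-generating process are all assumptions.

```python
# Sketch: standardize mixed stationary/unit-root predictors, then LASSO.
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ b + X[:, j] * b[j]         # partial residual
            rho = X[:, j] @ r_j / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
    return b

rng = np.random.default_rng(1)
n, p = 200, 20
Z = rng.normal(size=(n, p))
X = Z.copy()
X[:, :5] = np.cumsum(Z[:, :5], axis=0)               # first 5 predictors are unit roots
y = 0.8 * X[:, 0] + 0.5 * X[:, 6] + rng.normal(size=n)

Xs = (X - X.mean(0)) / X.std(0)                      # scale-standardization
b = lasso_cd(Xs, y - y.mean(), lam=0.3)
print(np.nonzero(np.abs(b) > 1e-6)[0])               # selected predictors
```

Without the standardization step, the unit-root columns would dominate the penalized objective simply because of their exploding scale.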
  7. By: Ke-Li Xu (Indiana University, Department of Economics)
    Abstract: This paper provides the uniform asymptotic theory for local projection (LP) regression when the true lag order of the model is unknown, possibly infinite. The theory allows for various persistence levels of the data, growing response horizons, and general conditionally heteroskedastic shocks. Based on the theory, we make two contributions. First, we show that LPs are semiparametrically efficient under classical assumptions on data and horizons if the controlled lag order diverges. Thus the commonly perceived efficiency loss of running LPs is asymptotically negligible with many controls. Second, we propose LP-based inferences for (individual and cumulated) impulse responses with robustness properties not shared by other existing methods. Inference methods using two different standard errors are considered, and neither involves HAR-type correction. The uniform validity for the first method depends on a zero fourth moment condition on shocks, while the validity for the second holds more generally for martingale-difference heteroskedastic shocks.
    Keywords: Impulse response; local projection; persistence; semiparametric efficiency; uniform inference
    Date: 2023–01
    URL: http://d.repec.org/n?u=RePEc:inu:caeprp:2023001&r=ecm
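The basic local projection regression underlying the paper can be sketched in a few lines. The example below is our illustration only: the horizon-h impulse response is the slope from regressing y at t+h on the shock at t with lagged controls; the AR(1) design and function names are assumed, and none of the paper's standard-error constructions are shown.

```python
# Minimal local projection sketch on a simulated AR(1) with observed shocks.
import numpy as np

def local_projection(y, x, h, n_lags=4):
    """LP estimate of the horizon-h response of y to the shock x."""
    T = len(y)
    rows = range(n_lags, T - h)
    Y = np.array([y[t + h] for t in rows])
    X = np.array([[1.0, x[t]] + [y[t - l] for l in range(1, n_lags + 1)]
                  for t in rows])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta[1]                       # coefficient on the shock

rng = np.random.default_rng(5)
T = 2000
x = rng.normal(size=T)                   # observed shock
eps = 0.5 * rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + x[t] + eps[t]

irf = [local_projection(y, x, h) for h in range(5)]
print(np.round(irf, 2))                  # decays roughly like 0.5**h
```

In this design the true impulse response is 0.5 to the power h, and the LP slopes recover it horizon by horizon without iterating a fitted model forward.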
  8. By: Matthew L. Comey; Amanda R. Eng; Zhuan Pei
    Abstract: In a binary-treatment instrumental variable framework, we define supercompliers as the subpopulation whose treatment take-up positively responds to eligibility and whose outcome positively responds to take-up. Supercompliers are the only subpopulation to benefit from treatment eligibility and, hence, are of great policy interest. Given a set of jointly testable assumptions and a binary outcome, we can completely identify the characteristics of supercompliers. Specifically, we require the standard assumptions from the local average treatment effect literature along with an outcome monotonicity assumption (i.e., treatment is weakly beneficial). We can estimate and conduct inference on supercomplier characteristics using standard instrumental variable regression.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.14105&r=ecm
  9. By: Leonard Goff
    Abstract: Many survey questions elicit responses on ordered scales for which the definitions of the categories are subjective, possibly varying by individual. This paper clarifies what is learned when these subjective reports are used as an outcome in regression-based causal inference. When a continuous treatment variable is statistically independent of both i) potential outcomes; and ii) heterogeneity in reporting styles, a nonparametric regression of numerical subjective reports on that variable uncovers a positively-weighted linear combination of local causal responses, among individuals who are on the margin between adjacent response categories. Though the weights do not aggregate to unity, the ratio of regression derivatives with respect to two such explanatory variables remains quantitatively meaningful. When results are extended to discrete regressors (e.g. a binary treatment), different weighting schemes apply to different regressors, making a comparison of their magnitudes more difficult. I obtain a partial identification result for ratios that holds when there are many categories and individual reporting functions are linear. I also provide results for identification using instrumental variables.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.14622&r=ecm
  10. By: Lihua Lei; Jeffrey Wooldridge
    Abstract: The recent thought-provoking paper by Hansen [2022, Econometrica] proved that the Gauss-Markov theorem continues to hold without the requirement that competing estimators are linear in the vector of outcomes. Despite the elegant proof, it was shown by the authors and other researchers that the main result in the earlier version of Hansen's paper does not extend the classic Gauss-Markov theorem because no nonlinear unbiased estimator exists under his conditions. To address the issue, Hansen [2022] added statements in the latest version with new conditions under which nonlinear unbiased estimators exist. Motivated by the lively discussion, we study a fundamental problem: what estimators are unbiased for a given class of linear models? We first review a line of highly relevant work dating back to the 1960s, which, unfortunately, has not drawn enough attention. Then, we introduce notation that allows us to restate and unify results from earlier work and Hansen [2022]. The new framework also allows us to highlight differences among previous conclusions. Lastly, we establish new representation theorems for unbiased estimators under different restrictions on the linear model, allowing the coefficients and covariance matrix to take only a finite number of values, the higher moments of the estimator and the dependent variable to exist, and the error distribution to be discrete, absolutely continuous, or dominated by another probability measure. Our results substantially generalize the claims of parallel commentaries on Hansen [2022] and a remarkable result by Koopmann [1982].
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.14185&r=ecm
  11. By: Jiafeng Chen
    Abstract: In this note, we propose empirical Bayes methods under heteroskedastic Gaussian location models, without assuming that the unknown location parameters are independent from the known scale parameters. We derive the finite-sample convergence rate of the mean-squared error regret of our method. We also derive a minimax regret lower bound that matches the upper bound up to logarithmic factors. Moreover, we link decision objectives of other economic problems to mean-squared error control. We illustrate our method with a simulation calibrated to the Opportunity Atlas (Chetty, Friedman, Hendren, Jones and Porter, 2018) and Creating Moves to Opportunity (Bergman, Chetty, DeLuca, Hendren, Katz and Palmer, 2019).
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.14444&r=ecm
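For context, here is a sketch of the classic precision-weighted empirical Bayes shrinkage rule under independence of locations and scales, i.e. the benchmark that this note relaxes. The method-of-moments prior fit and all names are our assumptions, not the paper's method.

```python
# Classic heteroskedastic EB shrinkage under independence (the benchmark
# the note departs from), with a method-of-moments prior variance.
import numpy as np

def eb_shrink(theta_hat, s2):
    """Shrink noisy estimates theta_hat (known variances s2) toward the
    grand mean, with a common prior variance fit by method of moments."""
    mu = np.mean(theta_hat)
    tau2 = max(np.var(theta_hat) - np.mean(s2), 0.0)   # prior variance estimate
    w = tau2 / (tau2 + s2)                             # precision weights
    return w * theta_hat + (1 - w) * mu

rng = np.random.default_rng(4)
theta = rng.normal(0.0, 1.0, size=1000)                # true effects
s2 = rng.uniform(0.5, 2.0, size=1000)                  # known noise variances
theta_hat = theta + rng.normal(size=1000) * np.sqrt(s2)

mse_raw = np.mean((theta_hat - theta) ** 2)
mse_eb = np.mean((eb_shrink(theta_hat, s2) - theta) ** 2)
print(mse_raw, mse_eb)                                 # shrinkage lowers MSE
```

This benchmark is only valid when the true effects are independent of the noise variances; the note's contribution is precisely to drop that assumption.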
  12. By: Meyer-Gohde, Alexander; Shabalina, Ekaterina
    Abstract: In this paper, we propose a new method to forecast macroeconomic variables that combines two existing approaches to mixed-frequency data in DSGE models. The first existing approach estimates the DSGE model in a quarterly frequency and uses higher frequency auxiliary data only for forecasting (see Giannone, Monti and Reichlin (2016)). The second method transforms a quarterly state space into a monthly frequency and applies, e.g., the Kalman filter when faced missing observations (see Foroni and Marcellino (2014)). Our algorithm combines the advantages of these two existing approaches, using the information from monthly auxiliary variables to inform in-between quarter DSGE estimates and forecasts. We compare our new method with the existing methods using simulated data from the textbook 3-equation New Keynesian model (see, e.g., Galí (2008)) and real-world data with the Smets and Wouters (2007) model. With the simulated data, our new method outperforms all other methods, including forecasts from the standard quarterly model. With real world data, incorporating auxiliary variables as in our method substantially decreases forecasting errors for recessions, but casting the model in a monthly frequency delivers better forecasts in normal times.
    Keywords: Mixed-frequency data, DSGE models, Forecasting, Estimation, Temporal aggregation
    JEL: E12 E17 E37 E44 C61 C68
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:zbw:imfswp:175&r=ecm
  13. By: Arnaud Dupuy (CREA - Center for Research in Economic Analysis - Uni.lu - Université du Luxembourg, University of Luxemburg, IZA - Forschungsinstitut zur Zukunft der Arbeit - Institute of Labor Economics); Alfred Galichon (NYU - New York University [New York] - NYU - NYU System, ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique)
    Abstract: This paper introduces a maximum likelihood estimator of the value of job amenities and labor productivity in a single matching market based on the observation of equilibrium matches and wages. The estimation procedure simultaneously fits both the matching patterns and the wage curve. While our estimator is suited for a wide range of assignment problems, we provide an application to the estimation of the Value of a Statistical Life using compensating wage differentials for the risk of fatal injury on the job. Using US data for 2017, we estimate the Value of Statistical Life at $6.3 million ($2017).
    Keywords: Matching, Observed transfers, Structural estimation, Value of statistical life
    Date: 2022–01
    URL: http://d.repec.org/n?u=RePEc:hal:spmain:hal-03893167&r=ecm
  14. By: Robert F. Phillips
    Abstract: It is well-known that generalized method of moments (GMM) estimators of dynamic panel data models can have asymptotic bias if the number of time periods (T) and the number of cross-sectional units (n) are both large (Alvarez and Arellano, 2003). This conclusion, however, follows when all available instruments are used. This paper provides results supporting a more optimistic conclusion when fewer than all of the available instruments are used. If the number of instruments used per period increases with T sufficiently slowly, the bias of GMM estimators based on the forward orthogonal deviations transformation (FOD-GMM) disappears as T and n increase, regardless of the relative rate of increase in T and n. Monte Carlo evidence is provided that corroborates this claim. Moreover, a large-n, large-T distribution result is provided for FOD-GMM.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.14075&r=ecm
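The forward orthogonal deviations transform itself is simple to state. The sketch below is our illustration of the standard Arellano-Bover (1995) formula, not the paper's estimator: each observation is replaced by its deviation from the mean of the remaining future observations, rescaled so that homoskedastic errors stay homoskedastic.

```python
# Forward orthogonal deviations (FOD) of a single time series.
import numpy as np

def fod(x):
    """FOD transform of a length-T series; returns a length T-1 series."""
    T = len(x)
    out = np.empty(T - 1)
    for t in range(T - 1):
        fwd_mean = x[t + 1:].mean()              # mean of future observations
        c = np.sqrt((T - t - 1) / (T - t))       # rescaling factor
        out[t] = c * (x[t] - fwd_mean)
    return out

x = np.array([1.0, 2.0, 4.0, 8.0])
print(fod(x))
print(fod(np.ones(5)))                           # fixed effects are wiped out
```

A constant series maps to zeros, which is why the transform removes individual fixed effects while, unlike first differencing, keeping serially uncorrelated errors uncorrelated.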
  15. By: Timo Dimitriadis; Roxana Halbleib; Jeannine Polivka; Sina Streicher
    Abstract: This paper illustrates the benefits of sampling intraday returns in intrinsic time for the estimation of integrated variance through the realized variance (RV) estimator. The intrinsic time transforms the clock time in accordance with the market's activity, which we measure by trading intensity (transaction time) or spot variance (business time). We theoretically show that the RV estimator is unbiased for all sampling schemes, but most efficient under business time, even under independent market microstructure noise. Our analysis builds on the flexible assumption that asset prices follow a diffusion process that is time-changed with a doubly stochastic Poisson process. This provides a flexible stochastic model for the prices together with their transaction times that allows for separate and stochastically varying trading intensity and tick variance processes that jointly govern the spot variance. These separate model components are particularly advantageous over, e.g., standard diffusion models, as they allow us to exploit and disentangle the effects of the two different sources of intraday information on the theoretical properties of RV. Extensive simulations confirm our theoretical results and show that business time remains superior under different noise specifications and for noise-corrected RV estimators. An empirical application to stock data provides further evidence for the benefits of using intrinsic sampling to get efficient RV estimators.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.11833&r=ecm
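The RV estimator and the idea of sampling on a transaction-time grid can be sketched as follows. This is our toy illustration with a constant-volatility price path and no microstructure noise, so it shows only the estimator itself, not the paper's efficiency comparison across sampling schemes.

```python
# Realized variance: sum of squared log returns over a sampling grid.
import numpy as np

def realized_variance(prices):
    """RV = sum of squared log returns between successive sampled prices."""
    r = np.diff(np.log(prices))
    return float(np.sum(r ** 2))

rng = np.random.default_rng(2)
sigma2 = 1e-4                                    # daily integrated variance
n_ticks = 10_000
log_p = np.cumsum(np.sqrt(sigma2 / n_ticks) * rng.normal(size=n_ticks))
prices = np.exp(log_p)

rv_all = realized_variance(prices)               # sample at every tick
rv_sparse = realized_variance(prices[::20])      # sample every 20th transaction
print(rv_all, rv_sparse)                         # both estimate sigma2 = 1e-4
```

Both grids are unbiased here; the paper's point is that when trading intensity and spot variance vary stochastically, grids aligned with market activity deliver the smaller estimation variance.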
  16. By: Hai-Anh Dang (World Bank); Peter Lanjouw (VU University Amsterdam)
    Abstract: Panel data are rarely available for developing countries. Departing from traditional pseudo-panel methods that require multiple rounds of cross-sectional data to study poverty mobility at the cohort level, we develop a procedure that works with as few as two survey rounds and produces point estimates of transitions along the welfare distribution at the more disaggregated household level. Validation exercises using Monte Carlo simulations and actual cross-sectional and panel survey data from several countries, spanning different income levels and geographical regions, show that the method performs well under various deviations from model assumptions. The method could also inform investigation of other welfare outcome dynamics.
    Keywords: transitory and chronic poverty, income mobility, consumption, cross sections, synthetic panels, household surveys
    JEL: C53 D31 I32 O15
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:inq:inqwps:ecineq2022-632&r=ecm
  17. By: Dimitris Korobilis; Maximilian Schröder
    Abstract: This paper extends quantile factor analysis to a probabilistic variant that incorporates regularization and computationally efficient variational approximations. By means of synthetic and real data experiments it is established that the proposed estimator can achieve, in many cases, better accuracy than a recently proposed loss-based estimator. We contribute to the literature on measuring uncertainty by extracting new indexes of low, medium and high economic policy uncertainty, using the probabilistic quantile factor methodology. Medium and high indexes have clear contractionary effects, while the low index is benign for the economy, showing that not all manifestations of uncertainty are the same.
    Date: 2022–12
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2212.10301&r=ecm
  18. By: Ozili, Peterson K
    Abstract: This commentary article examines the acceptable R-square in social science empirical modelling, with particular focus on why a low R-square model is acceptable in empirical social science research. The paper shows that a low R-square model is not necessarily bad. This is because the goal of most social science research modelling is not to predict human behaviour. Rather, the goal is often to assess whether specific predictors or explanatory variables have a significant effect on the dependent variable. Therefore, a low R-square of at least 0.1 (or 10 percent) is acceptable on the condition that some or most of the predictors or explanatory variables are statistically significant. If this condition is not met, the low R-square model cannot be accepted. A high R-square model is also acceptable provided that there is no spurious causation in the model and there is no multicollinearity among the explanatory variables.
    Keywords: R-square, low R-square, social science, research, empirical model, modelling, regression.
    JEL: C10 C14 C15 C30 C50 C51
    Date: 2023
    URL: http://d.repec.org/n?u=RePEc:pra:mprapa:115769&r=ecm
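The paper's central point, that a low R-square can coexist with a clearly significant predictor, is easy to demonstrate. The simulation below is our illustration, not the author's: a weak signal buried in heavy noise yields an R-square near 0.1 while the slope's t-statistic is far beyond conventional critical values.

```python
# Low R-square with a highly significant slope: a one-regressor example.
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=1.5, size=n)      # weak signal, heavy noise

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid.var() / y.var()                   # population R-square is 0.1

# Classical standard error and t-statistic for the slope
s2 = resid.var() * n / (n - 2)
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
t_stat = beta[1] / se
print(round(r2, 3), round(t_stat, 2))            # low R-square, large |t|
```

Here the model "explains" only about a tenth of the variance in y, yet the slope is estimated precisely, which is exactly the situation in which the paper argues a low R-square is acceptable.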
  19. By: Hoang, Daniel; Wiegratz, Kevin
    Abstract: We study how researchers can apply machine learning (ML) methods in finance. We first establish that the two major categories of ML (supervised and unsupervised learning) address fundamentally different problems than traditional econometric approaches. Then, we review the current state of research on ML in finance and identify three archetypes of applications: i) the construction of superior and novel measures, ii) the reduction of prediction error, and iii) the extension of the standard econometric toolset. With this taxonomy, we give an outlook on potential future directions for both researchers and practitioners. Our results suggest large benefits of ML methods compared to traditional approaches and indicate that ML holds great potential for future research in finance.
    Keywords: Machine Learning, Artificial Intelligence, Big Data
    JEL: C45 G00
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:zbw:kitwps:158&r=ecm
  20. By: Stephane Bonhomme (University of Chicago); Kerstin Holzheu (ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique); Thibaut Lamadon (University of Chicago, NBER - National Bureau of Economic Research [New York] - NBER - The National Bureau of Economic Research, IFS - Laboratory of the Institute for Fiscal Studies - Institute for Fiscal Studies); Elena Manresa (NYU - New York University [New York] - NYU - NYU System); Magne Mogstad (University of Chicago, NBER - National Bureau of Economic Research [New York] - NBER - The National Bureau of Economic Research, IFS - Laboratory of the Institute for Fiscal Studies - Institute for Fiscal Studies); Bradley Setzler (Penn State - Pennsylvania State University - Penn State System)
    Abstract: Many studies use matched employer-employee data to estimate a statistical model of earnings determination with worker and firm fixed effects. Estimates based on this model have produced influential yet controversial conclusions. The objective of this paper is to assess the sensitivity of these conclusions to the biases that arise because of limited mobility of workers across firms. We use employer-employee data from the US and several European countries while taking advantage of both fixed-effects and random-effects methods for bias correction. We find that limited mobility bias is severe and that bias correction is important.
    Date: 2022
    URL: http://d.repec.org/n?u=RePEc:hal:spmain:hal-03882713&r=ecm

This nep-ecm issue is ©2023 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.