on Econometrics |
By: | Paul L. E. Grieco; Charles Murry; Joris Pinkse; Stephan Sagl |
Abstract: | We propose a conformant likelihood estimator with exogeneity restrictions (CLEER) for random coefficients discrete choice demand models that is applicable in a broad range of data settings. It combines the likelihoods of two mixed logit estimators—one for consumer level data, and one for product level data—with product level exogeneity restrictions. Our estimator is both efficient and conformant: its rates of convergence will be the fastest possible given the variation available in the data. The researcher does not need to pre-test or adjust the estimator and the inference procedure is valid across a wide variety of scenarios. Moreover, it can be tractably applied to large datasets. We illustrate the features of our estimator by comparing it to alternatives in the literature. |
JEL: | C13 C18 L0 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:33397 |
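A minimal sketch of the core idea of combining a consumer-level likelihood with a product-level likelihood for a shared parameter. This is not the authors' CLEER estimator: there are no random coefficients and no product-level exogeneity restrictions, and the DGP, market sizes, and all names are illustrative assumptions. It only shows how pooling two log-likelihoods lets the richer data source dominate.

```python
# Stylized sketch: sum a consumer-level and a product-level log-likelihood
# for a shared logit coefficient. NOT the CLEER estimator (no random
# coefficients, no exogeneity restrictions); it only illustrates pooling
# the two likelihoods for one common parameter.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit

rng = np.random.default_rng(0)
beta_true = 1.5

# Consumer-level data: individual choices y_i given covariate x_i
n_micro = 500
x_micro = rng.normal(size=n_micro)
y_micro = rng.binomial(1, expit(beta_true * x_micro))

# Product-level data: aggregate shares s_m out of N_m consumers per market
n_markets, N_m = 40, 1000
x_macro = rng.normal(size=n_markets)
s_macro = rng.binomial(N_m, expit(beta_true * x_macro)) / N_m

def bernoulli_ll(y, p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def neg_joint_loglik(beta):
    ll_micro = bernoulli_ll(y_micro, expit(beta * x_micro))
    # Binomial likelihood for aggregate shares (market size known)
    ll_macro = N_m * bernoulli_ll(s_macro, expit(beta * x_macro))
    return -(ll_micro + ll_macro)

fit = minimize_scalar(neg_joint_loglik, bounds=(-10, 10), method="bounded")
print("joint estimate of beta:", round(fit.x, 3))
```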
By: | Bernard M. S. van Praag; J. Peter Hop; William H. Greene |
Abstract: | In the last few decades, the study of ordinal data, in which the variable of interest is not exactly observed but only known to lie in a specific ordinal category, has become important. In Psychometrics such variables are analysed under the heading of item response models (IRM). In Econometrics, in subjective well-being (SWB) and self-assessed health (SAH) studies, and in marketing research, Ordered Probit, Ordered Logit, and Interval Regression models are common research platforms. To emphasize that the problem is not specific to any single discipline we will use the neutral term coarsened observation. For single-equation models, estimation of the latent linear model by Maximum Likelihood (ML) is routine. For higher-dimensional multivariate models, however, it is computationally cumbersome, as estimation requires the evaluation of multivariate normal distribution functions on a large scale. Our proposed alternative estimation method, based on the Generalized Method of Moments (GMM), circumvents this multivariate integration problem. The method is based on the assumed zero correlations between explanatory variables and generalized residuals. This is more general than ML but coincides with ML if the error distribution is multivariate normal. It can be implemented by repeated application of standard techniques. GMM provides a simpler and faster approach than the usual ML approach. It is applicable to multiple-equation models with higher-dimensional error correlation matrices and an arbitrary number of response categories per equation. It also yields a simple method to estimate polyserial and polychoric correlations. Comparison of our method with the outcomes of the Stata ML procedure cmp yields estimates that are not statistically different, while estimation by our method requires only a fraction of the computing time. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.10726 |
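The zero-correlation condition between regressors and generalized residuals is easy to illustrate in the simplest coarsened-data setting. The sketch below is a single-equation interval regression with known cut-points estimated by GMM, where the generalized residual is E[eps | bracket] under normality; it is a building block only, not the authors' multi-equation procedure, and the cut-points, DGP, and names are illustrative assumptions.

```python
# Sketch: single-equation GMM with generalized residuals for coarsened data.
# The paper's contribution is the multi-equation case; this only shows the
# building block E[x * generalized residual] = 0 under known cut-points.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, beta_true = 2000, np.array([0.5, 1.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y_star = X @ beta_true + rng.normal(size=n)

cuts = np.array([-np.inf, -1.0, 0.0, 1.0, 2.0, np.inf])   # known brackets
cat = np.searchsorted(cuts, y_star, side="right") - 1      # observed category only
lo, hi = cuts[cat], cuts[cat + 1]                           # bracket bounds

def gen_residual(beta):
    a, b = lo - X @ beta, hi - X @ beta                     # standardized bounds
    num = norm.pdf(a) - norm.pdf(b)                         # E[eps | a < eps <= b]
    den = np.clip(norm.cdf(b) - norm.cdf(a), 1e-12, None)   # guard tiny denominators
    return num / den

def gmm_objective(beta):
    g = X.T @ gen_residual(beta) / n                        # sample moments E[x * gr]
    return g @ g                                            # identity weight

fit = minimize(gmm_objective, x0=np.zeros(2), method="Nelder-Mead")
print("GMM estimate:", fit.x.round(3))
```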
By: | Emily Breza; Arun G. Chandrasekhar; Davide Viviano |
Abstract: | When studying policy interventions, researchers are often interested in two related goals: i) learning for which types of individuals the program has the largest effects (heterogeneity) and ii) understanding whether those patterns of treatment effects have predictive power across environments (generalizability). To that end, we develop a framework to learn from the data how to partition observations into groups of individual and environmental characteristics whose effects are generalizable for others - a set of generalizable archetypes. Our view is that implicit in the task of archetypal discovery is detecting those contexts where effects do not generalize and where researchers should collect more evidence before drawing inference on treatment effects. We introduce a method that jointly estimates when and how a prediction can be formed and when, instead, researchers should admit ignorance and elicit further evidence before making predictions. We provide both a decision-theoretic and Bayesian foundation of our procedure. We derive finite-sample (frequentist) regret guarantees, asymptotic theory for inference, and discuss computational properties. We illustrate the benefits of our procedure over existing alternatives that would fail to admit ignorance and force pooling across all units by re-analyzing a multifaceted program targeted towards the poor across six different countries. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.13355 |
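One way to make the pool-or-abstain logic concrete, as a toy only: for each covariate-defined group, compare effect estimates across environments and report a pooled prediction only when a simple homogeneity check passes; otherwise "admit ignorance". This is a caricature of the decision problem the paper formalizes, not their estimator; the threshold rule, the chi-square cutoff, and all names are assumptions.

```python
# Toy pool-or-abstain rule (a caricature of archetypal discovery, not the
# paper's procedure): pool a group's treatment effect across environments
# only when the estimates are mutually consistent given their standard errors.
import numpy as np

def pooled_or_ignorant(effects, ses, crit=3.84):
    """effects, ses: per-environment estimates for one covariate group.
    Returns (pooled effect, pooled se) or None to signal 'admit ignorance'."""
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / ses**2
    pooled = np.sum(w * effects) / np.sum(w)
    # Cochran-style homogeneity statistic; crit is an arbitrary per-contrast cutoff
    Q = np.sum(w * (effects - pooled) ** 2)
    if Q > crit * (len(effects) - 1):
        return None                      # heterogeneity too large: abstain
    return pooled, np.sqrt(1.0 / np.sum(w))

# A group whose effect replicates across environments, and one that does not
print(pooled_or_ignorant([0.20, 0.25, 0.22], [0.05, 0.06, 0.05]))
print(pooled_or_ignorant([0.60, -0.10, 0.35], [0.05, 0.06, 0.05]))
```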
By: | Matias D. Cattaneo; Gregory Fletcher Cox; Michael Jansson; Kenichi Nagasawa |
Abstract: | An increasingly important class of estimators has members whose asymptotic distribution is non-Gaussian, yet characterizable as the argmax of a Gaussian process. This paper presents high-level sufficient conditions under which such asymptotic distributions admit a continuous distribution function. The plausibility of the sufficient conditions is demonstrated by verifying them in three prominent examples, namely maximum score estimation, empirical risk minimization, and threshold regression estimation. In turn, the continuity result buttresses several recently proposed inference procedures whose validity seems to require a result of the kind established herein. A notable feature of the high-level assumptions is that one of them is designed to enable us to employ the celebrated Cameron-Martin theorem. In a leading special case, the assumption in question is demonstrably weak and appears to be close to minimal. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.13265 |
By: | Clément de Chaisemartin (Sciences Po Paris); Xavier D’Haultfoeuille (CREST-ENSAE) |
Abstract: | Consider a parameter of interest, which can be consistently estimated under some conditions. Suppose also that we can at least partly test these conditions with specification tests. We consider the common practice of conducting inference on the parameter of interest conditional on not rejecting these tests. We show that if the tested conditions hold, conditional inference is valid, though possibly conservative. This holds generally, without imposing any assumption on the asymptotic dependence between the estimator of the parameter of interest and the specification test. |
Date: | 2025–01–24 |
URL: | https://d.repec.org/n?u=RePEc:crs:wpaper:2025-03 |
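The coverage claim is easy to probe in a toy Monte Carlo (an illustrative design, not one from the paper): condition a confidence interval for a mean on not rejecting a specification test whose null holds in the DGP, and check that conditional coverage stays at or above the nominal level even when the test statistic is correlated with the estimator.

```python
# Toy Monte Carlo for inference conditional on passing a specification test.
# The DGP satisfies the tested condition (E[w] = 0), so per the paper the
# conditional coverage of the CI for E[y] should be >= 95%, possibly conservative.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, reps, rho = 200, 20000, 0.8
covered, kept = 0, 0
for _ in range(reps):
    y = 1.0 + rng.normal(size=n)
    w = rho * (y - 1.0) + np.sqrt(1 - rho**2) * rng.normal(size=n)  # E[w] = 0 holds
    # Specification test: t-test of H0: E[w] = 0 at the 5% level
    t_spec = np.sqrt(n) * w.mean() / w.std(ddof=1)
    if abs(t_spec) > norm.ppf(0.975):
        continue                                   # rejected: inference not reported
    kept += 1
    se = y.std(ddof=1) / np.sqrt(n)
    covered += abs(y.mean() - 1.0) <= norm.ppf(0.975) * se
print("share of samples passing the test:", round(kept / reps, 3))
print("conditional coverage of 95% CI:   ", round(covered / kept, 3))
```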
By: | Christian Bayer; Luis Calderon; Moritz Kuhn |
Abstract: | We develop a new method for deriving high-frequency synthetic distributions of consumption, income, and wealth. Modern theories of macroeconomic dynamics identify the joint distribution of consumption, income, and wealth as a key determinant of aggregate dynamics. Our novel method allows us to study their distributional dynamics over time. The method can incorporate different microdata sources, regardless of their frequency and coverage of variables, to generate high-frequency synthetic distributional data. We extend existing methods by allowing for more flexible data inputs. The core of the method is to treat the distributional data as a time series of functions whose underlying factor structure follows a state-space model, which we estimate using Bayesian techniques. We show that the novel method provides the high-frequency distributional data needed to better understand the dynamics of consumption and its distribution over the business cycle. |
Keywords: | Consumption, income, and wealth inequality; Macroeconomic dynamics; Dynamic state-space model; Functional time-series data; Bayesian statistics |
JEL: | E21 E32 E37 D31 C32 C55 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:bon:boncrc:crctr224_2025_625 |
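A much-simplified, non-Bayesian analogue of the functional state-space idea (an illustrative construction, not the authors' estimator): treat each period's cross-sectional distribution as a quantile curve, extract a low-dimensional factor structure by SVD, and fit a simple AR(1) transition to the factors. The Bayesian state-space estimation and the combination of mixed-frequency data sources in the paper are not reproduced.

```python
# Simplified sketch of "distributions as functional time series": quantile
# curves -> SVD factors -> AR(1) transition, on simulated data only.
import numpy as np

rng = np.random.default_rng(3)
T, n_per_period = 120, 5000
grid = np.linspace(0.05, 0.95, 19)                 # quantile levels

# Simulate incomes whose dispersion drifts over time (the distributional dynamics)
sigma_t = 0.6 + 0.2 * np.sin(np.arange(T) / 10.0)
curves = np.array([np.quantile(np.exp(rng.normal(0, s, n_per_period)), grid)
                   for s in sigma_t])              # T x 19 matrix of quantile curves

# Factor structure: demean and take the leading singular vectors
mean_curve = curves.mean(axis=0)
U, S, Vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
k = 2
factors = U[:, :k] * S[:k]                         # T x k factor scores
loadings = Vt[:k]                                  # k x 19 functional loadings

# AR(1) transition for each factor (stand-in for the state equation)
for j in range(k):
    f = factors[:, j]
    phi = np.dot(f[1:], f[:-1]) / np.dot(f[:-1], f[:-1])
    print(f"factor {j}: AR(1) coefficient {phi:.2f}, "
          f"share of variation {S[j]**2 / np.sum(S**2):.2f}")
```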
By: | Nikolaishvili, Giorgi (Wake Forest University, Economics Department) |
Abstract: | I propose the pass-through impulse response function (PT-IRF) as a novel reduced-form empirical approach to measuring transmission channel dynamics. In essence, a PT-IRF quantifies the propagation of a shock through the Granger causality of a specified set of endogenous variables within a dynamical system. This approach has fewer informational requirements than alternative methods, such as structural parameter and empirical policy counterfactual exercises. A PT-IRF only requires the specification of a reduced-form VAR and identification of a shock of interest, bypassing the need to either build a structural model or identify multiple shocks. I demonstrate the flexibility of PT-IRFs by empirically analyzing the indirect dynamic transmission of oil price shocks to inflation and output via interest rates, as well as the indirect dynamic effect of monetary policy shocks on output via changes in credit supply. |
Keywords: | Directed graph; dynamic propagation; Granger causality; vector autoregression |
JEL: | C10 C32 C50 E52 |
Date: | 2025–01–27 |
URL: | https://d.repec.org/n?u=RePEc:ris:wfuewp:0121 |
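The abstract does not spell out the PT-IRF construction, so the sketch below shows only one stylized way to isolate an indirect channel in a reduced-form VAR: compute a baseline impulse response under a recursive identification (an assumption, not necessarily the paper's), recompute it after muting the coefficient that lets the intermediate variable feed into the outcome, and read the difference as the part of the response passing through that variable. Variable names, the VAR(1) restriction, and the shutdown rule are all illustrative.

```python
# Stylized channel-muting exercise in a VAR(1); NOT the paper's PT-IRF
# definition, only an illustration of isolating an indirect transmission
# channel (shock -> intermediate variable -> outcome) in reduced form.
import numpy as np

rng = np.random.default_rng(4)
names = ["oil", "rate", "output"]                  # illustrative ordering
A_true = np.array([[0.5, 0.0, 0.0],
                   [0.3, 0.6, 0.0],
                   [0.0, -0.4, 0.7]])              # oil moves rates, rates move output
T = 400
Y = np.zeros((T, 3))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + rng.normal(scale=0.5, size=3)

# Estimate the reduced-form VAR(1) by OLS
X, Z = Y[:-1], Y[1:]
A_hat = np.linalg.lstsq(X, Z, rcond=None)[0].T     # Z_t ~ A_hat @ Z_{t-1}
impact = np.linalg.cholesky(np.cov((Z - X @ A_hat.T).T))[:, 0]  # recursive oil shock

A_muted = A_hat.copy()
A_muted[2, 1] = 0.0                                # mute the rate -> output feed

def irf(A, impact, h):
    return np.array([np.linalg.matrix_power(A, s) @ impact for s in range(h)])

base, muted = irf(A_hat, impact, 12), irf(A_muted, impact, 12)
print(f"{names[2]} response passing through {names[1]} (baseline - muted):")
print((base[:, 2] - muted[:, 2]).round(3))
```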
By: | Byunghoon Kang; Seojeong Lee; Juha Song |
Abstract: | The asymptotic behavior of GMM estimators depends critically on whether the underlying moment condition model is correctly specified. Hong and Li (2023, Econometric Theory) showed that GMM estimators with nonsmooth (non-directionally differentiable) moment functions are at best $n^{1/3}$-consistent under misspecification. Through simulations, we verify the slower convergence rate of GMM estimators in such cases. For the two-step GMM estimator with an estimated weight matrix, our results align with theory. However, for the one-step GMM estimator with the identity weight matrix, the convergence rate remains $\sqrt{n}$, even under severe misspecification. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.09540 |
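A compact Monte Carlo in the spirit of this comparison, with an illustrative DGP rather than the paper's designs: a location parameter with one nonsmooth median-type moment and one mean-type moment, made incompatible by a skewed error, estimated by one-step GMM with the identity weight and by two-step GMM; the slope of log dispersion against log n gauges the convergence rate.

```python
# Illustrative rate check for GMM with a nonsmooth moment under
# misspecification (own DGP, not the paper's): moments g1 = 1{y<=theta}-0.5
# (median-type, nonsmooth) and g2 = y - theta (mean-type) are incompatible
# because the data are exponential (mean 1, median log 2).
import numpy as np

rng = np.random.default_rng(5)
grid = np.linspace(0.3, 1.5, 4001)                  # candidate theta values

def estimate(y, two_step):
    y_sorted = np.sort(y)
    g1 = np.searchsorted(y_sorted, grid, side="right") / len(y) - 0.5
    g2 = y.mean() - grid
    W = np.eye(2)
    if two_step:
        th1 = grid[np.argmin(g1**2 + g2**2)]        # first step: identity weight
        m = np.column_stack([(y <= th1) - 0.5, y - th1])
        W = np.linalg.inv(np.cov(m.T))              # estimated weight matrix
    obj = W[0, 0] * g1**2 + 2 * W[0, 1] * g1 * g2 + W[1, 1] * g2**2
    return grid[np.argmin(obj)]

ns, reps = [500, 2000, 8000], 300
for two_step in (False, True):
    spreads = []
    for n in ns:
        est = [estimate(rng.exponential(size=n), two_step) for _ in range(reps)]
        spreads.append(np.std(est))
    slope = np.polyfit(np.log(ns), np.log(spreads), 1)[0]
    label = "two-step" if two_step else "one-step (identity W)"
    print(f"{label}: dispersion shrinks like n^({slope:.2f})")
```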
By: | Rainey, Carlisle |
Abstract: | Recent work emphasizes the importance of statistical power and shows that power in the social sciences tends to be extremely low. In this paper, I offer simple rules that make statistical power more approachable for substantive researchers. The rules describe how researchers can compute power using (1) features of a reference population, (2) an existing study with a similar design and outcome, and/or (3) a pilot study. In the case of balanced, between-subjects designs (perhaps controlling for pre-treatment variables), these rules are sufficient for a complete and compelling power analysis for treatment effects and interactions using only paper-and-pencil. For more complex designs, these rules can provide a useful ballpark prediction before turning to specialized software or complex simulations. Most importantly, these rules help researchers develop a sharp intuition about statistical power. For example, it can be helpful for readers and researchers to know that experiments have 80% power to detect effects that are 2.5 times larger than the standard error and how to easily form a conservative prediction of the standard error using pilot data. These rules lower the barrier to entry for researchers new to thinking carefully about statistical power and help researchers design powerful, informative experiments. |
Date: | 2025–01–24 |
URL: | https://d.repec.org/n?u=RePEc:osf:osfxxx:5am9q |
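The back-of-the-envelope calculations the paper advocates reduce to a normal-approximation formula. The sketch below uses generic textbook versions, not the paper's exact rules: it computes power from an effect expressed in standard-error units and forms a simple standard-error prediction for a balanced two-arm design from a pilot's outcome standard deviation (the paper additionally discusses how to make such predictions conservative). The exact multiplier that delivers 80% power depends on the test convention (one- versus two-sided, nominal level), so the numbers are illustrative.

```python
# Generic normal-approximation power calculations in the spirit of the
# paper's paper-and-pencil rules (textbook formulas, not the paper's text).
import numpy as np
from scipy.stats import norm

def power(effect_over_se, alpha=0.05, two_sided=True):
    """Power of a z-test when the true effect is `effect_over_se` standard errors."""
    z_crit = norm.ppf(1 - alpha / 2) if two_sided else norm.ppf(1 - alpha)
    return norm.cdf(effect_over_se - z_crit) + (norm.cdf(-effect_over_se - z_crit)
                                                if two_sided else 0.0)

def se_two_arm(sd_outcome, n_per_arm):
    """Standard error of a difference in means, balanced two-arm design."""
    return sd_outcome * np.sqrt(2.0 / n_per_arm)

# Illustrative numbers: pilot suggests sd = 10; planned design has 250 units per arm
se = se_two_arm(sd_outcome=10.0, n_per_arm=250)
print(f"predicted SE of the treatment effect: {se:.2f}")
for k in (2.0, 2.5, 2.8):
    print(f"power for an effect of {k} x SE: {power(k):.2f}")
```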
By: | Patrick Osatohanmwen (Free University of Bozen-Bolzano, Italy) |
Abstract: | In many real-life processes, data with high positive skewness are very common. Moreover, these data tend to exhibit heterogeneous characteristics in such a manner that a single parametric univariate probability distribution becomes inadequate to model them. When this heterogeneity can be appropriately separated into two components, a main innovation component where the bulk of the data is centered and a tail component containing a few extreme observations, such that (without loss of generality) the data are highly right-skewed, hybrid models become a viable way to model the data. In this paper, we propose a new two-component hybrid model which joins the half-normal distribution for the main innovation of highly right-skewed data with the generalized Pareto distribution (GPD) for the observations in the data above a certain threshold. To enhance efficiency in the estimation of the parameters of the hybrid model, an unsupervised iterative algorithm (UIA) is adopted. An application of the hybrid model to the absolute log returns of the S&P500 index and to the intensity of rainfall which triggered some debris flow events in the South Tyrol region of Italy is carried out. |
Keywords: | Estimation algorithm; Generalized Pareto distribution; Half-normal distribution; Hybrid model; S&P500. |
JEL: | C02 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:bzn:wpaper:bemps108 |
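A minimal version of a two-component spliced density, using one common splicing construction rather than the paper's exact hybrid: a half-normal body truncated at a threshold u, a GPD for exceedances above u, a tail weight fixed at the empirical exceedance share, and the remaining parameters fit by maximum likelihood with the threshold held fixed. The unsupervised iterative algorithm (UIA) of the paper is not reproduced; data, threshold choice, and starting values are illustrative.

```python
# Sketch of a half-normal body + generalized Pareto tail spliced at a fixed
# threshold u (a standard splicing construction; the paper additionally
# estimates the split with an unsupervised iterative algorithm).
import numpy as np
from scipy.stats import halfnorm, genpareto
from scipy.optimize import minimize

rng = np.random.default_rng(6)
x = np.abs(rng.standard_t(df=3, size=3000))        # right-skewed, heavy-tailed data
u = np.quantile(x, 0.90)                           # threshold held fixed here
p_tail = np.mean(x > u)                            # tail weight = exceedance share
body, tail = x[x <= u], x[x > u] - u

def neg_loglik(params):
    sigma, xi, beta = params
    if sigma <= 0 or beta <= 0:
        return np.inf
    ll_body = (np.sum(halfnorm.logpdf(body, scale=sigma))
               - len(body) * halfnorm.logcdf(u, scale=sigma))   # truncated body
    ll_tail = np.sum(genpareto.logpdf(tail, c=xi, scale=beta))  # exceedances
    return -(len(body) * np.log(1 - p_tail) + ll_body
             + len(tail) * np.log(p_tail) + ll_tail)

fit = minimize(neg_loglik, x0=[1.0, 0.1, 1.0], method="Nelder-Mead")
sigma_hat, xi_hat, beta_hat = fit.x
print(f"half-normal scale {sigma_hat:.2f}, GPD shape {xi_hat:.2f}, GPD scale {beta_hat:.2f}")
```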
By: | Mathias Silva (Aix Marseille Univ, CNRS, AMSE, Marseille, France); Michel Lubrano (Aix Marseille Univ, CNRS, AMSE, Marseille, France) |
Abstract: | When estimated from survey data alone, the distribution of high incomes in a population may be misrepresented, as surveys typically provide detailed coverage of the lower part of the income distribution, but offer limited information on top incomes. Tax data, in contrast, better capture top incomes, but lack contextual information. To combine these data sources, Pareto models are often used to represent the upper tail of the income distribution. In this paper, we propose a Bayesian approach for this purpose, building on extreme value theory. Our method integrates a Pareto II tail with a semi-parametric model for the central part of the income distribution, and it selects the income threshold separating them endogenously. We incorporate external tax data through an informative prior on the Pareto II coefficient to complement survey micro-data. We find that Bayesian inference can yield a wide range of threshold estimates, which are sensitive to how the central part of the distribution is modelled. Applying our methodology to the EU-SILC micro-data set for 2008 and 2018, we find that using tax-data information from WID introduces no changes to inequality estimates for Nordic countries or The Netherlands, which rely on administrative registers for income data. However, tax data significantly revise survey-based inequality estimates in new EU member states. |
Keywords: | top income correction, Pareto II, Bayesian inference, extreme value theory, EU-SILC |
JEL: | C11 D31 D63 I31 |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:aim:wpaimx:2429 |
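A toy version of the tail step only (an illustrative construction): a Pareto II (Lomax) likelihood for incomes above a fixed threshold, combined with an informative prior on the tail coefficient standing in for external tax-data information, evaluated on a grid. The semi-parametric model for the central part of the distribution and the endogenous threshold selection in the paper are not reproduced; the synthetic data and prior are assumptions.

```python
# Toy Bayesian Pareto II (Lomax) tail fit with an informative prior on the
# tail coefficient standing in for external (e.g. tax-data) information.
import numpy as np
from scipy.stats import lomax, norm

rng = np.random.default_rng(7)
incomes = (np.exp(rng.normal(10.0, 0.7, 20000))            # log-normal body
           * np.where(rng.uniform(size=20000) < 0.05,      # heavier synthetic top
                      rng.pareto(2.5, 20000) + 1.0, 1.0))
u = np.quantile(incomes, 0.95)                             # threshold fixed here
z = incomes[incomes > u] - u                               # exceedances

alpha_grid = np.linspace(1.2, 6.0, 121)
lam_grid = np.linspace(0.2 * u, 3.0 * u, 61)
log_prior = norm.logpdf(alpha_grid, loc=2.5, scale=0.4)    # informative prior on alpha

log_post = np.empty((alpha_grid.size, lam_grid.size))
for i, a in enumerate(alpha_grid):
    log_post[i] = log_prior[i] + lomax.logpdf(z[:, None], c=a,
                                              scale=lam_grid[None, :]).sum(axis=0)

post = np.exp(log_post - log_post.max())
post /= post.sum()
alpha_mean = np.sum(post.sum(axis=1) * alpha_grid)
print(f"posterior mean of the Pareto II tail coefficient: {alpha_mean:.2f}")
```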
By: | Avner Seror |
Abstract: | This paper introduces a network-based method to capture unobserved heterogeneity in consumer microdata. We develop a permutation-based approach that repeatedly samples subsets of choices from each agent and partitions agents into jointly rational types. Aggregating these partitions yields a network that characterizes the unobserved heterogeneity, as edges denote the fraction of times two agents belong to the same type across samples. To evaluate how observable characteristics align with the heterogeneity, we implement permutation tests that shuffle covariate labels across network nodes, thereby generating a null distribution of alignment. We further introduce various network-based measures of alignment that assess whether nodes sharing the same observable values are disproportionately linked or clustered, and introduce standardized effect sizes that measure how strongly each covariate "tilts" the entire network away from random assignment. These non-parametric effect sizes capture the global influence of observables on the heterogeneity structure. We apply the method to grocery expenditure data from the Stanford Basket Dataset. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.13721 |
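The aggregation and permutation-testing machinery can be sketched independently of the revealed-preference step, which is the substantive part of the paper. In the sketch below, assign_types is a placeholder (here a stub that recovers a noisy latent type) standing in for the paper's partition of agents into jointly rational types from resampled choices, and the alignment statistic is a simple within-group edge share rather than the paper's full set of measures and effect sizes.

```python
# Sketch of the co-membership network and covariate-alignment permutation
# test. assign_types is a PLACEHOLDER for the paper's partition of agents
# into jointly rational types from resampled choices.
import numpy as np

rng = np.random.default_rng(8)
n_agents, n_draws = 100, 200
latent_type = rng.integers(0, 3, n_agents)          # unobserved heterogeneity
covariate = np.where(rng.uniform(size=n_agents) < 0.8, latent_type,
                     rng.integers(0, 3, n_agents))  # observable aligned with types

def assign_types(rng):
    """Placeholder: noisy version of the latent type from one resampling draw."""
    flip = rng.uniform(size=n_agents) < 0.2
    return np.where(flip, rng.integers(0, 3, n_agents), latent_type)

# Aggregate co-membership frequencies into a weighted network
W = np.zeros((n_agents, n_agents))
for _ in range(n_draws):
    t = assign_types(rng)
    W += (t[:, None] == t[None, :])
W /= n_draws
np.fill_diagonal(W, 0.0)

def alignment(labels):
    """Average edge weight among agents sharing the same covariate label."""
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)
    return W[same].mean()

observed = alignment(covariate)
null = np.array([alignment(rng.permutation(covariate)) for _ in range(2000)])
print(f"observed within-group edge weight: {observed:.3f}")
print(f"permutation p-value: {np.mean(null >= observed):.3f}")
```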
By: | Bryan T. Kelly (Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)); Semyon Malamud (Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute); Emil Siriwardane (Harvard Business School - Finance Unit; National Bureau of Economic Research (NBER)); Hongyu Wu (Yale School of Management) |
Abstract: | We develop the concept of a Behavioral Impulse Response (BIR), which uses the dynamics of forecast errors to trace out how deviations from full-information rational expectations (FIRE) are corrected over time. BIRs based on professional forecasts of macroeconomic outcomes and corporate earnings imply that violations of FIRE occur much more frequently than suggested by existing tests. These deviations tend to correct gradually, often over several quarters, with sizable variation in correction speeds across different forecast targets and forecasters. Our theoretical analysis highlights why BIRs provide a simple yet powerful set of moments that can be used to discipline models of belief formation. |
JEL: | C52 C53 D83 D84 E7 E17 G17 G14 |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2504 |
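The abstract ties BIRs to the dynamics of forecast errors. One stylized reading, which is an interpretation rather than the paper's definition, is a local projection of future forecast errors on the current error, whose decay traces how quickly a deviation from FIRE is corrected. The simulated sticky-updating forecasts, names, and projection specification below are all illustrative assumptions.

```python
# Stylized "error-on-error" local projections as one reading of how deviations
# from FIRE are corrected (an interpretation, not the paper's BIR construction).
# Forecasts are simulated with sticky (partial) updating.
import numpy as np

rng = np.random.default_rng(9)
T, rho, stickiness = 600, 0.7, 0.6
x = np.zeros(T)
for t in range(1, T):                               # AR(1) target variable
    x[t] = rho * x[t - 1] + rng.normal()

forecast = np.zeros(T)                              # one-step-ahead forecasts
for t in range(1, T):                               # sticky updating toward RE forecast
    forecast[t] = stickiness * forecast[t - 1] + (1 - stickiness) * rho * x[t - 1]
errors = x - forecast                               # under FIRE these are unpredictable

for h in range(1, 7):                               # projection of e_{t+h} on e_t
    e_now, e_future = errors[:-h], errors[h:]
    denom = np.dot(e_now - e_now.mean(), e_now - e_now.mean())
    beta_h = np.dot(e_future - e_future.mean(), e_now - e_now.mean()) / denom
    print(f"h={h}: coefficient of e_t+h on e_t = {beta_h:.2f}")
```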
By: | Philippe Goulet Coulombe; Karin Klieber |
Abstract: | The use of moving averages is pervasive in macroeconomic monitoring, particularly for tracking noisy series such as inflation. The choice of the look-back window is crucial. A window that is too long is not timely enough when economic conditions evolve rapidly; one that is too short is noisy, limiting signal extraction. As is well known, this is a bias-variance trade-off. However, it is a time-varying one: the optimal size of the look-back window depends on current macroeconomic conditions. In this paper, we introduce a simple adaptive moving average estimator based on a Random Forest that uses a time trend as its sole predictor. We then compare the narratives inferred from the new estimator to those derived from common alternatives across series such as headline inflation, core inflation, and real activity indicators. Notably, we find that this simple tool provides a different account of the post-pandemic inflation acceleration and subsequent deceleration. |
Date: | 2025–01 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2501.13222 |
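The core device is easy to prototype: fit a Random Forest with the time index as its only feature, so each tree's splits define data-driven look-back windows and the fitted values act as an adaptive moving average. The hyperparameters, the simulated regime-shift series, and the comparison below are illustrative choices, not the authors'.

```python
# Prototype of an adaptive moving average from a Random Forest whose sole
# predictor is a time trend; tree splits create data-driven local windows.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(10)
T = 300
signal = np.concatenate([np.full(200, 2.0), np.full(100, 6.0)])  # regime shift
y = signal + rng.normal(scale=1.5, size=T)                        # noisy series

t = np.arange(T).reshape(-1, 1)                                   # time trend only
rf = RandomForestRegressor(n_estimators=500, min_samples_leaf=15,
                           max_features=1, random_state=0)
adaptive = rf.fit(t, y).predict(t)

fixed_12 = pd.Series(y).rolling(12).mean().to_numpy()             # fixed trailing MA
for idx in (190, 205, 230):                                       # around the break
    print(f"t={idx}: truth {signal[idx]:.1f}  adaptive {adaptive[idx]:.2f}  "
          f"12-period MA {fixed_12[idx]:.2f}")
```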
By: | Yoosoon Chang; Soyoung Kim; Joon Y. Park |
Abstract: | This paper investigates the interactions between macroeconomic aggregates and income distribution by developing a structural VAR model with functional variables. With this novel empirical approach, we are able to identify and analyze the effects of various shocks to the income distribution on macro aggregates, as well as the effects of macroeconomic shocks on the income distribution. Our main findings are as follows: First, contractionary monetary policy shocks reduce income inequality when focusing solely on the redistributive effects, without considering the negative impact on aggregate income levels. This improvement is achieved by reducing the number of low- and high-income families while increasing the proportion of middle-income families. However, when the aggregate income shift is also taken into account, contractionary monetary policy shocks worsen income inequality. Second, shocks to the income distribution have a substantial effect on output fluctuations. For example, income distribution shocks identified to maximize future output levels have a significant and persistent positive effect on output, contributing up to 30% at long horizons and over 50% for the lowest income percentiles. However, alternative income distribution shocks identified to minimize the future Gini index do not have any significant negative effects on output. This finding, combined with the positive effect of output-maximizing income distribution shocks on equality, suggests that properly designed redistributive policies are not subject to the often-claimed trade-off between growth and equality. Moreover, variations in income distribution are primarily explained by shocks to the income distribution itself, rather than by aggregate shocks, including monetary shocks. This highlights the need for redistributive policies to substantially alter the income distribution and reduce inequality. |
Keywords: | monetary policy, income distribution, redistributive effects, structural vector autoregression, functional time series |
JEL: | E52 D31 C32 |
Date: | 2025–02 |
URL: | https://d.repec.org/n?u=RePEc:een:camaaa:2025-07 |
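A heavily simplified stand-in for the functional VAR (an illustrative construction, not the authors' identification or estimation): summarize each period's income distribution by a quantile-based statistic, stack it with a macro aggregate in an ordinary VAR, and compute recursive-identification impulse responses of the distribution summary to the aggregate shock. The functional treatment, the output-maximizing and Gini-minimizing shock identification, and the variance decompositions in the paper are not reproduced.

```python
# Simplified stand-in for a VAR with distributional variables: a quantile-based
# spread plus an aggregate in a VAR(1), with a recursively identified aggregate
# shock (illustrative only; not the paper's functional SVAR or data).
import numpy as np

rng = np.random.default_rng(11)
T, n_hh = 160, 4000
output = np.zeros(T)
q10, q90 = np.zeros(T), np.zeros(T)
q10[0] = q90[0] = 1.0                               # placeholder, dropped below
for t in range(1, T):
    output[t] = 0.8 * output[t - 1] + rng.normal(scale=0.5)
    # Simulate household incomes whose spread responds to the aggregate
    incomes = np.exp(rng.normal(loc=0.02 * output[t],
                                scale=max(0.1, 0.6 - 0.03 * output[t - 1]),
                                size=n_hh))
    q10[t], q90[t] = np.quantile(incomes, [0.1, 0.9])

spread = np.log(q90) - np.log(q10)                  # simple distribution summary
data = np.column_stack([output, spread])[1:]        # drop the initialization period

X, Z = data[:-1], data[1:]                          # VAR(1) by OLS
A = np.linalg.lstsq(X, Z, rcond=None)[0].T
resid = Z - X @ A.T
impact = np.linalg.cholesky(np.cov(resid.T))[:, 0]  # recursive aggregate shock

irf = np.array([np.linalg.matrix_power(A, h) @ impact for h in range(13)])
print("response of the 90/10 log-quantile spread to an aggregate shock:")
print(irf[:, 1].round(3))
```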