New Economics Papers: on Econometrics
By: | James G. MacKinnon; Morten Ørregaard Nielsen; Matthew D. Webb
Abstract: | For linear regression models with cross-section or panel data, it is natural to assume that the disturbances are clustered in two dimensions. However, the finite-sample properties of two-way cluster-robust tests and confidence intervals are often poor. We discuss several ways to improve inference with two-way clustering. Two of these are existing methods for avoiding, or at least ameliorating, the problem of undefined standard errors when a cluster-robust variance matrix estimator (CRVE) is not positive definite. One is a new method that always avoids the problem. More importantly, we propose a family of new two-way CRVEs based on the cluster jackknife. Simulations for models with two-way fixed effects suggest that, in many cases, the cluster-jackknife CRVE combined with our new method yields surprisingly accurate inferences. We provide a simple software package, twowayjack for Stata, that implements our recommended variance estimator. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.08880&r= |
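The cluster-jackknife idea in the MacKinnon, Nielsen, and Webb abstract above is easy to sketch: refit the regression with each cluster deleted and form a jackknife variance matrix from the leave-one-cluster-out coefficients. Below is a minimal one-way illustration in NumPy; the paper's estimator jackknifes over two clustering dimensions and is implemented in twowayjack, so the function name and data here are illustrative assumptions only.

```python
import numpy as np

def cluster_jackknife_crve(X, y, cluster_ids):
    """Delete-one-cluster jackknife variance for OLS coefficients.

    One-way illustration only: the paper's two-way CRVE jackknifes
    over both clustering dimensions.
    """
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    clusters = np.unique(cluster_ids)
    G = len(clusters)
    betas = []
    for g in clusters:
        keep = cluster_ids != g           # drop cluster g entirely
        Xg, yg = X[keep], y[keep]
        betas.append(np.linalg.solve(Xg.T @ Xg, Xg.T @ yg))
    betas = np.array(betas)
    dev = betas - betas.mean(axis=0)
    V = (G - 1) / G * dev.T @ dev         # jackknife variance matrix
    return beta_hat, V
```

By construction `V` is a sum of outer products, hence always positive semidefinite, which is the appeal of jackknife-based CRVEs relative to plug-in two-way estimators that can fail to be positive definite.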
By: | Kensuke Sakamoto |
Abstract: | This paper addresses the sample selection problem in panel dyadic regression analysis. Dyadic data often include many zeros in the main outcomes due to the underlying network formation process. This not only contaminates popular estimators used in practice but also complicates the inference due to the dyadic dependence structure. We extend Kyriazidou (1997)'s approach to dyadic data and characterize the asymptotic distribution of our proposed estimator. The convergence rates are $\sqrt{n}$ or $\sqrt{n^{2}h_{n}}$, depending on the degeneracy of the Hájek projection part of the estimator, where $n$ is the number of nodes and $h_{n}$ is a bandwidth. We propose a bias-corrected confidence interval and a variance estimator that adapts to the degeneracy. A Monte Carlo simulation shows the good finite-sample performance of our estimator and highlights the importance of bias correction in both asymptotic regimes when the fraction of zeros in outcomes varies. We illustrate our procedure using data from Moretti and Wilson (2017)'s paper on migration.
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.17787&r= |
By: | Pedro Picchetti; Cristine C. X. Pinto; Stephanie T. Shinoki |
Abstract: | This paper investigates the econometric theory behind the newly developed difference-in-discontinuities design (DiDC). Despite its increasing use in applied research, there are currently limited studies of its properties. The method combines elements of regression discontinuity (RDD) and difference-in-differences (DiD) designs, allowing researchers to eliminate the effects of potential confounders at the discontinuity. We formalize the difference-in-discontinuities theory by stating the identification assumptions and proposing a nonparametric estimator, deriving its asymptotic properties and examining the scenarios in which the DiDC has desirable bias properties when compared to the standard RDD. We also provide comprehensive tests for one of the identification assumptions of the DiDC. Monte Carlo simulation studies show that the estimators have good performance in finite samples. Finally, we revisit Grembi et al. (2016), which studies the effects of relaxing fiscal rules on public finance outcomes in Italian municipalities. The results show that the proposed estimator exhibits substantially smaller confidence intervals for the estimated effects.
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.18531&r= |
By: | Federico A. Bugni; Mengsi Gao; Filip Obradovic; Amilcar Velez |
Abstract: | Randomized controlled trials (RCTs) frequently utilize covariate-adaptive randomization (CAR) (e.g., stratified block randomization) and commonly suffer from imperfect compliance. This paper studies the identification and inference for the average treatment effect (ATE) and the average treatment effect on the treated (ATT) in such RCTs with a binary treatment. We first develop characterizations of the identified sets for both estimands. Since data are generally not i.i.d. under CAR, these characterizations do not follow from existing results. We then provide consistent estimators of the identified sets and asymptotically valid confidence intervals for the parameters. Our asymptotic analysis leads to concrete practical recommendations regarding how to estimate the treatment assignment probabilities that enter the estimated bounds. In the case of the ATE, using sample analog assignment frequencies is more efficient than using the true assignment probabilities. In contrast, using the true assignment probabilities is preferable for the ATT.
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.08419&r= |
By: | Bin Peng; Liangjun Su; Yayi Yan |
Abstract: | In this paper, we propose an easy-to-implement residual-based specification testing procedure for detecting structural changes in factor models, which is powerful against both smooth and abrupt structural changes with unknown break dates. The proposed test is robust against an over-specified number of factors, and against serially and cross-sectionally correlated error processes. A new central limit theorem is given for the quadratic forms of panel data with dependence over both dimensions, thereby filling a gap in the literature. We establish the asymptotic properties of the proposed test statistic, and accordingly develop a simulation-based scheme to select critical values in order to improve finite sample performance. Through extensive simulations and a real-world application, we confirm our theoretical results and demonstrate that the proposed test exhibits desirable size and power in practice.
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.00941&r= |
By: | Tadao Hoshino; Takahide Yanagi |
Abstract: | This paper proposes a statistical inference method for assessing treatment effects with dyadic data. Under the assumption that the treatments follow an exchangeable distribution, our approach allows for the presence of any unobserved confounding factors that potentially cause endogeneity of treatment choice without requiring additional information other than the treatments and outcomes. Building on the literature of graphon estimation in network data analysis, we propose a neighborhood kernel smoothing method for estimating dyadic average treatment effects. We also develop a permutation inference method for testing the sharp null hypothesis. Under certain regularity conditions, we derive the rate of convergence of the proposed estimator and demonstrate the size control property of our test. We apply our method to international trade data to assess the impact of free trade agreements on bilateral trade flows. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.16547&r= |
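The permutation side of the Hoshino–Yanagi procedure above is simple to illustrate: under the sharp null and exchangeable treatments, jointly permuting the node labels of the dyadic treatment matrix leaves the distribution of the test statistic unchanged. The sketch below uses a plain treated-minus-control mean over dyads rather than the authors' neighborhood kernel smoothing estimator; the function name, statistic, and data are illustrative assumptions.

```python
import numpy as np

def permutation_pvalue(D, Y, n_perm=999, seed=0):
    """Permutation test of the sharp null of no treatment effect on dyads.

    D: (n, n) binary dyadic treatment matrix; Y: (n, n) dyadic outcomes.
    Node labels are permuted jointly (rows and columns together), which
    is valid under exchangeability of the treatment assignment.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    mask = ~np.eye(n, dtype=bool)         # ignore self-loops

    def stat(d):
        treated = Y[mask & (d == 1)]
        control = Y[mask & (d == 0)]
        return treated.mean() - control.mean()

    t_obs = stat(D)
    count = 1                             # include the observed draw
    for _ in range(n_perm):
        perm = rng.permutation(n)
        if abs(stat(D[np.ix_(perm, perm)])) >= abs(t_obs):
            count += 1
    return count / (n_perm + 1)
```

Because only treatment labels are permuted while outcomes stay fixed, the test is exact under the sharp null regardless of unobserved dyad-level confounding, which mirrors the logic of the paper's design-based inference.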
By: | Francq, Christian; Zakoian, Jean-Michel |
Abstract: | We investigate the problem of testing the finiteness of moments for a class of semi-parametric time series encompassing many commonly used specifications. The existence of positive-power moments of the strictly stationary solution is characterized by the Moment Determining Function (MDF) of the model, which depends on the parameter driving the dynamics and on the distribution of the innovations. We establish the asymptotic distribution of the empirical MDF, from which tests of moments are deduced. Alternative tests based on estimation of the Maximal Moment Exponent (MME) are studied. Power comparisons based on local alternatives and the Bahadur approach are proposed. We provide an illustration on real financial data and show that semi-parametric estimation of the MME provides an interesting alternative to Hill's nonparametric estimator of the tail index. |
Keywords: | Efficiency comparisons of tests; maximal moment exponent; stochastic recurrence equation; tail index |
JEL: | C12 C32 C58 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:pra:mprapa:121193&r= |
By: | Duarte Gonçalves; Bruno A. Furtado
Abstract: | This paper tackles challenges in pricing and revenue projections due to consumer uncertainty. We propose a novel data-based approach for firms facing unknown consumer type distributions. Unlike existing methods, we assume firms only observe a finite sample of consumers' types. We introduce \emph{empirically optimal mechanisms}, a simple and intuitive class of sample-based mechanisms with strong finite-sample revenue guarantees. Furthermore, we leverage our results to develop a toolkit for statistical inference on profits. Our approach allows one to reliably estimate the profits associated with any particular mechanism, to construct confidence intervals, and, more generally, to conduct valid hypothesis testing.
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.17178&r= |
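A stylized instance of a sample-based mechanism, in the spirit of the Gonçalves–Furtado abstract above, is a posted price chosen to maximize empirical revenue on the observed sample of types. The snippet is only a toy special case (the paper's empirically optimal mechanisms are a more general class), and the function name is made up for illustration.

```python
import numpy as np

def empirical_posted_price(values):
    """Posted price maximizing empirical revenue on a finite sample of
    consumer valuations (toy sample-based mechanism)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = len(v)
    # Posting price v[i] sells to everyone with valuation >= v[i],
    # i.e. the top n - i sampled consumers.
    revenues = v * (n - np.arange(n)) / n
    i = np.argmax(revenues)
    return v[i], revenues[i]
```

For example, on the sample [1, 2, 3, 4] the empirical-revenue-maximizing posted price is 2, with per-consumer revenue 1.5.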
By: | Chang, Jinyuan; Chen, Cheng; Qiao, Xinghao; Yao, Qiwei |
Abstract: | Many scientific and economic applications involve the statistical learning of high-dimensional functional time series, where the number of functional variables is comparable to, or even greater than, the number of serially dependent functional observations. In this paper, we model observed functional time series, which are subject to errors in the sense that each functional datum arises as the sum of two uncorrelated components, one dynamic and one white noise. Motivated by the fact that the autocovariance function of observed functional time series automatically filters out the noise term, we propose a three-step framework: first performing autocovariance-based dimension reduction, then formulating a novel autocovariance-based block regularized minimum distance estimation to produce block sparse estimates, from which the final functional sparse estimates are obtained. We investigate theoretical properties of the proposed estimators, and illustrate the proposed estimation procedure with the corresponding convergence analysis via three sparse high-dimensional functional time series models. We demonstrate via both simulated and real datasets that our proposed estimators significantly outperform their competitors.
Keywords: | block regularized minimum distance estimation; dimension reduction; functional time series; high-dimensional data; non-asymptotics; sparsity
JEL: | C50 C13 C32 |
Date: | 2023–02–23 |
URL: | https://d.repec.org/n?u=RePEc:ehl:lserod:117910&r= |
By: | Sho Miyaji |
Abstract: | Many studies run two-way fixed effects instrumental variable (TWFEIV) regressions, leveraging variation in the timing of policy adoption across units as an instrument for treatment. This paper studies the properties of the TWFEIV estimator in staggered instrumented difference-in-differences (DID-IV) designs. We show that in settings with the staggered adoption of the instrument across units, the TWFEIV estimator can be decomposed into a weighted average of all possible two-group/two-period Wald-DID estimators. Under staggered DID-IV designs, a causal interpretation of the TWFEIV estimand hinges on the stable effects of the instrument on the treatment and the outcome over time. We illustrate the use of our decomposition theorem for the TWFEIV estimator through an empirical application. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.16467&r= |
By: | Andrea Bucci |
Abstract: | This paper proposes a sequential test procedure for determining the number of regimes in nonlinear multivariate autoregressive models. The procedure relies on linearity and no additional nonlinearity tests for both multivariate smooth transition and threshold autoregressive models. We conduct a simulation study to evaluate the finite-sample properties of the proposed test in small samples. Our findings indicate that the test exhibits satisfactory size properties, with the rescaled version of the Lagrange Multiplier test statistics demonstrating the best performance in most simulation settings. The sequential procedure is also applied to two empirical cases, the US monthly interest rates and Icelandic river flows. In both cases, the detected number of regimes aligns well with the existing literature. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.02152&r= |
By: | James G. MacKinnon; Morten Ørregaard Nielsen; Matthew D. Webb
Abstract: | We study cluster-robust inference for binary response models. Inference based on the most commonly-used cluster-robust variance matrix estimator (CRVE) can be very unreliable. We study several alternatives. Conceptually the simplest of these, but also the most computationally demanding, involves jackknifing at the cluster level. We also propose a linearized version of the cluster-jackknife variance matrix estimator as well as linearized versions of the wild cluster bootstrap. The linearizations are based on empirical scores and are computationally efficient. Throughout we use the logit model as a leading example. We also discuss a new Stata software package called logitjack which implements these procedures. Simulation results strongly favor the new methods, and two empirical examples suggest that it can be important to use them in practice. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.00650&r= |
By: | Bijan Mazaheri; Chandler Squires; Caroline Uhler |
Abstract: | Modern data analysis frequently relies on the use of large datasets, often constructed as amalgamations of diverse populations or data sources. Heterogeneity across these smaller datasets constitutes two major challenges for causal inference: (1) the source of each sample can introduce latent confounding between treatment and effect, and (2) diverse populations may respond differently to the same treatment, giving rise to heterogeneous treatment effects (HTEs). The issues of latent confounding and HTEs have been studied separately but not in conjunction. In particular, previous works only report the conditional average treatment effect (CATE) among similar individuals (with respect to the measured covariates). CATEs cannot resolve mixtures of potential treatment effects driven by latent heterogeneity, which we call mixtures of treatment effects (MTEs). Inspired by method-of-moments approaches to mixture models, we propose "synthetic potential outcomes" (SPOs). Our new approach deconfounds heterogeneity while also guaranteeing the identifiability of MTEs. This technique bypasses full recovery of a mixture, which significantly simplifies its requirements for identifiability. We demonstrate the efficacy of SPOs on synthetic data.
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.19225&r= |
By: | Johannes Hoelzemann; Ryan Webb; Erhao Xie |
Abstract: | We study the falsifiability and identification of Quantal Response Equilibrium (QRE) when each player’s utility and error distribution are relaxed to be unknown non-parametric functions. Using variations of players’ choices across a series of games, we first show that both the utility function and the distribution of errors are non-parametrically over-identified. This result further suggests a straightforward testing procedure for QRE that achieves the desired type-1 error and maintains a small type-2 error. To apply this methodology, we conduct an experimental study of the matching pennies game. Our non-parametric estimates strongly reject the conventional logit choice probability. Moreover, when the utility and the error distribution are sufficiently flexible and heterogeneous, the quantal response hypothesis cannot be rejected for 70% of participants. However, strong assumptions such as risk neutrality, logistically distributed errors and homogeneity lead to substantially higher rejection rates. |
Keywords: | Econometric and statistical methods; Economic models |
JEL: | C14 C57 C92 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:bca:bocawp:24-24&r= |
By: | David Kohns; Noa Kallionen; Yann McLatchie; Aki Vehtari |
Abstract: | We present the ARR2 prior, a joint prior over the auto-regressive components in Bayesian time-series models and their induced $R^2$. Compared to other priors designed for time-series models, the ARR2 prior allows for flexible and intuitive shrinkage. We derive the prior for pure auto-regressive models, and extend it to auto-regressive models with exogenous inputs, and state-space models. Through both simulations and real-world modelling exercises, we demonstrate the efficacy of the ARR2 prior in improving sparse and reliable inference, while showing greater inference quality and predictive performance than other shrinkage priors. An open-source implementation of the prior is provided.
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.19920&r= |
By: | Demian Pouzo; Zacharias Psaradakis; Martín Sola |
Abstract: | We consider general hidden Markov models that may include exogenous covariates and whose discrete-state-space regime sequence has transition probabilities that are functions of observable variables. We show that the parameters of the observation conditional distribution are consistently estimated by quasi-maximum-likelihood even if the Markov dependence of the hidden regime sequence is not taken into account. Some related numerical results are also discussed. |
Keywords: | Consistency; covariate-dependent transition probabilities; hidden Markov model; mixture model; quasi-maximum-likelihood; misspecified model. |
JEL: | C22 C32 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:udt:wpecon:2024_04&r= |
By: | Jin Seo Cho (Yonsei University) |
Abstract: | This study examines the large sample behavior of an ordinary least squares (OLS) estimator when a nonlinear autoregressive distributed lag (NARDL) model is correctly specified for nonstationary data. Although the OLS estimator suffers from an asymptotically singular matrix problem, it is consistent for unknown model parameters, and follows a mixed normal distribution asymptotically. We also examine the large sample behavior of the standard Wald test defined by the OLS estimator for asymmetries in long- and short-run NARDL parameters, and further supplement it by noting that the long-run parameter estimator is not super-consistent. Using Monte Carlo simulations, we then affirm the theory on the Wald test. Finally, using the U.S. GDP and exogenous fiscal shock data provided by Romer and Romer (2010, American Economic Review), we find statistical evidence for long- and short-run symmetries between tax increases and decreases in relation to U.S. GDP.
Keywords: | Nonlinear autoregressive distributed lag model; OLS estimation; Singular matrix; Limit distribution; Wald test; Exogenous fiscal shocks; GDP. |
JEL: | C12 C13 C22 E62 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:yon:wpaper:2024rwp-227&r= |
By: | Aristide Houndetoungan |
Abstract: | This paper develops a micro-founded peer effect model for count responses using a game of incomplete information. The model incorporates heterogeneity in peer effects through agents' groups based on observed characteristics. Parameter identification is established using the identification condition of linear models, which relies on the presence of friends' friends who are not direct friends in the network. I show that this condition extends to a large class of nonlinear models. The model parameters are estimated using the nested pseudo-likelihood approach, controlling for network endogeneity. I present an empirical application on students' participation in extracurricular activities. I find that females are more responsive to their peers than males, whereas male peers do not influence male students. An easy-to-use R package, CDatanet, is available for implementing the model.
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.17290&r= |
By: | Xuxing Chen; Abhishek Roy; Yifan Hu; Krishnakumar Balasubramanian |
Abstract: | We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional stochastic optimization problem. In the context of least-squares instrumental variable regression, our algorithms neither require matrix inversions nor mini-batches, and provide a fully online approach for performing instrumental variable regression with streaming data. When the true model is linear, we derive rates of convergence in expectation of order $\mathcal{O}(\log T/T)$ and $\mathcal{O}(1/T^{1-\iota})$ for any $\iota>0$ under the availability of two-sample and one-sample oracles, respectively, where $T$ is the number of iterations. Importantly, under the availability of the two-sample oracle, our procedure avoids explicitly modeling and estimating the relationship between the confounder and the instrumental variables, demonstrating the benefit of the proposed approach over recent works that reformulate the problem as a minimax optimization problem. Numerical experiments are provided to corroborate the theoretical results.
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.19463&r= |
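The two-sample-oracle idea in the abstract above can be sketched directly: the gradient of the moment objective is a product of two expectations, so two independent draws per iteration yield an unbiased stochastic gradient with no matrix inversion. The NumPy sketch below is a stylized rendering of this setting under assumed linear data-generating streams, not the authors' exact algorithm or step-size schedule.

```python
import numpy as np

def streaming_iv(stream1, stream2, dim, lr=0.1, T=5000):
    """Fully online least-squares IV regression with a two-sample oracle.

    Minimizes ||E[Z(Y - X'b)]||^2 by SGD. The gradient
    -E[X Z'] E[Z(Y - X'b)] is a product of two expectations, so each
    step uses two independent draws (x1, z1) and (x2, z2, y2) to form
    an unbiased gradient estimate; no matrices are ever inverted.
    """
    b = np.zeros(dim)
    b_bar = np.zeros(dim)
    tail = T - T // 2
    for t in range(1, T + 1):
        x1, z1, _ = next(stream1)
        x2, z2, y2 = next(stream2)
        grad = -np.outer(x1, z1) @ (z2 * (y2 - x2 @ b))
        b -= (lr / np.sqrt(t)) * grad
        if t > T // 2:
            b_bar += b / tail      # average the tail iterates
    return b_bar
```

On a toy endogenous design (instrument z, regressor x = z + 0.8u, outcome y = 2x + u), the averaged iterate lands near the structural coefficient 2, whereas OLS on (x, y) would be biased upward by the confounder u.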
By: | Alessandra Amendola; Vincenzo Candila; Antonio Naimoli; Giuseppe Storti |
Abstract: | In order to meet the increasingly stringent global standards of banking management and regulation, several methods have been proposed in the literature for forecasting tail risk measures such as the Value-at-Risk (VaR) and Expected Shortfall (ES). However, regardless of the approach used, there are several sources of uncertainty, including model specifications, data-related issues and the estimation procedure, which can significantly affect the accuracy of VaR and ES measures. Aiming to mitigate the influence of these sources of uncertainty and improve the predictive performance of individual models, we propose novel forecast combination strategies based on the Model Confidence Set (MCS). In particular, consistent joint VaR and ES loss functions within the MCS framework are used to adaptively combine forecasts generated by a wide range of parametric, semi-parametric, and non-parametric models. Our results reveal that the proposed combined predictors provide a suitable alternative for forecasting risk measures, passing the usual backtests, entering the set of superior models of the MCS, and usually exhibiting lower standard deviations than other model specifications. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.06235&r= |
By: | Ruoxuan Xiong; Alex Chin; Sean J. Taylor |
Abstract: | We study the design and analysis of switchback experiments conducted on a single aggregate unit. The design problem is to partition the continuous time space into intervals and switch treatments between intervals, in order to minimize the estimation error of the treatment effect. We show that the estimation error depends on four factors: carryover effects, periodicity, serially correlated outcomes, and impacts from simultaneous experiments. We derive a rigorous bias-variance decomposition and show the tradeoffs of the estimation error from these factors. The decomposition provides three new insights in choosing a design: First, balancing the periodicity between treated and control intervals reduces the variance; second, switching less frequently reduces the bias from carryover effects while increasing the variance from correlated outcomes, and vice versa; third, randomizing interval start and end points reduces both bias and variance from simultaneous experiments. Combining these insights, we propose a new empirical Bayes design approach. This approach uses prior data and experiments for designing future experiments. We illustrate this approach using real data from a ride-sharing platform, yielding a design that reduces MSE by 33% compared to the status quo design used on the platform. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.06768&r= |
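The design side of the switchback paper above is easy to make concrete: draw random switch points on the experiment's time horizon and flip a coin for each interval's treatment, reflecting the insight that randomizing interval start and end points mitigates bias and variance from simultaneous experiments. The helper below is a stylized sketch; the name and defaults are assumptions, not the paper's empirical Bayes procedure.

```python
import numpy as np

def switchback_design(horizon, n_switches, seed=0):
    """Randomized switchback design on a single aggregate unit.

    Returns a list of (start, end, treatment) intervals partitioning
    [0, horizon), with uniformly random switch points and an i.i.d.
    coin flip assigning treatment to each interval.
    """
    rng = np.random.default_rng(seed)
    cuts = np.sort(rng.uniform(0, horizon, n_switches))
    bounds = np.concatenate([[0.0], cuts, [horizon]])
    treat = rng.integers(0, 2, len(bounds) - 1)
    return list(zip(bounds[:-1], bounds[1:], treat))
```

The number of switch points then encodes the bias-variance tradeoff the paper analyzes: fewer switches reduce carryover bias, more switches reduce variance from serially correlated outcomes.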
By: | Jay Lu; Yao Luo; Kota Saito; Yi Xin |
Abstract: | This paper proposes an empirical model of dynamic discrete choice to allow for non-separable time preferences, generalizing the well-known Rust (1987) model. Under weak conditions, we show the existence of value functions and hence well-defined optimal choices. We construct a contraction mapping of the value function and propose an estimation method similar to Rust's nested fixed point algorithm. Finally, we apply the framework to the bus engine replacement data. We improve the fit of the data with our general model and reject the null hypothesis that Harold Zuercher has separable time preferences. Misspecifying an agent's preference as time-separable when it is not leads to biased inferences about structure parameters (such as the agent's risk attitudes) and misleading policy recommendations. |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2406.07809&r= |
By: | Joshua Nielsen (University of Colorado, Boulder); Didier Sornette (Risks-X, Southern University of Science and Technology (SUSTech); Swiss Finance Institute; ETH Zürich - Department of Management, Technology, and Economics (D-MTEC); Tokyo Institute of Technology); Maziar Raissi (University of California, Riverside) |
Abstract: | The Log-Periodic Power Law Singularity (LPPLS) model offers a general framework for capturing dynamics and predicting transition points in diverse natural and social systems. In this work, we present two calibration techniques for the LPPLS model using deep learning. First, we introduce the Mono-LPPLS-NN (M-LNN) model; for any given empirical time series, a unique M-LNN model is trained and shown to outperform state-of-the-art techniques in estimating the nonlinear parameters (tc, m, ω) of the LPPLS model as evidenced by the comprehensive distribution of parameter errors. Second, we extend the M-LNN model to a more general model architecture, the Poly-LPPLS-NN (P-LNN), which is able to quickly estimate the nonlinear parameters of the LPPLS model for any given time series of a fixed length, including time series unseen during training. The Poly class of models trains on many synthetic LPPLS time series augmented with various noise structures in a supervised manner. Given enough training examples, the P-LNN models also outperform state-of-the-art techniques for estimating the parameters of the LPPLS model as evidenced by the comprehensive distribution of parameter errors. Additionally, this class of models is shown to substantially reduce the time to obtain parameter estimates. Finally, we present applications to the diagnosis and prediction of two financial bubble peaks (followed by their crashes) and of a famous rockslide. These contributions provide a bridge between deep learning and the study of the prediction of transition times in complex time series.
Keywords: | log-periodicity, finite-time singularity, prediction, change of regime, financial bubbles, landslides, deep learning |
JEL: | C00 C13 C69 G01 |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:chf:rpseri:rp2433&r= |
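For reference, the LPPLS model in the abstract above posits an expected log-price of the form A + B(tc − t)^m + C(tc − t)^m cos(ω ln(tc − t) − φ) for t < tc, with (tc, m, ω) the nonlinear parameters targeted by the M-LNN and P-LNN networks. A direct NumPy transcription (parameter names are the standard ones from this literature, not code from the paper):

```python
import numpy as np

def lppls(t, tc, m, w, A, B, C, phi):
    """LPPLS expected log-price:
    ln E[p(t)] = A + B(tc-t)^m + C(tc-t)^m cos(w ln(tc-t) - phi),
    defined for t < tc, where tc is the critical (transition) time,
    m the power-law exponent, and w the log-periodic frequency.
    """
    dt = tc - t
    return A + np.power(dt, m) * (B + C * np.cos(w * np.log(dt) - phi))
```

Calibration then amounts to fitting (tc, m, w) jointly with the linear parameters (A, B, C, phi appears nonlinearly too) to an observed log-price series, which is the nonconvex problem the paper's networks are trained to solve quickly.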
By: | Enrico Wegner; Lenard Lieb; Stephan Smeekes; Ines Wilms |
Abstract: | We propose a framework for the analysis of transmission channels in a large class of dynamic models. To this end, we formulate our approach both using graph theory and potential outcomes, which we show to be equivalent. Our method, labelled Transmission Channel Analysis (TCA), allows for the decomposition of total effects captured by impulse response functions into the effects flowing along transmission channels, thereby providing a quantitative assessment of the strength of various transmission channels. We establish that this requires no additional identification assumptions beyond the identification of the structural shock whose effects the researcher wants to decompose. Additionally, we prove that impulse response functions are sufficient statistics for the computation of transmission effects. We also demonstrate the empirical relevance of TCA for policy evaluation by decomposing the effects of various monetary policy shock measures into instantaneous implementation effects and effects that likely relate to forward guidance. |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2405.18987&r= |
By: | Ippei Fujiwara; Adrian Pagan |
Abstract: | McKay and Wolf (2023) describe a method for finding counterfactuals which only requires that one know the impulse responses of shocks from a baseline structural model generating the data. A key feature in their work is the use of news shocks. This is an elegant piece of theory, and they indicate it can be applied empirically. We argue that one cannot recover the impulse responses from data generated by the structural model when there are news shocks, as there are then more shocks than observables. We investigate an alternative proposal whereby some off-model variables are used to find the requisite impulse responses, and find that there are issues with doing so. Their theoretical result also relies upon the baseline structural model only having monetary policy operate via the interest rate channel, so it excludes models that might be thought relevant for capturing the data.
Keywords: | counterfactual, news shocks, shock recovery, local projection |
JEL: | C3 E3 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:een:camaaa:2024-44&r= |
By: | LeRoy, Stephen F |
Abstract: | Recent applied work in economics has displayed renewed interest in the problem of characterizing the causal relations that link economic variables. However, many discussions avoid explicit specification of what has to be true about a formal model to justify an assertion that one variable in it causes another. Such specification is supplied here. Related topics, such as determining whether correlation implies causation, or vice versa, and when causal coefficients can be estimated using ordinary least squares or instrumental variables regressions, are discussed.
Keywords: | Social and Behavioral Sciences |
Date: | 2024–06–24 |
URL: | https://d.repec.org/n?u=RePEc:cdl:ucsbec:qt12q3t2vd&r= |
By: | Dongwoo Kim; Young Jun Lee |
Abstract: | This paper proposes empirically tractable multidimensional matching models, focusing on worker-job matching. We generalize the parametric model proposed by Lindenlaub (2017), which relies on the assumption of joint normality of observed characteristics of workers and jobs. In our paper, we allow unrestricted distributions of characteristics and show identification of the production technology, and equilibrium wage and matching functions using tools from optimal transport theory. Given identification, we propose efficient, consistent, asymptotically normal sieve estimators. We revisit Lindenlaub's empirical application and show that, between 1990 and 2010, the U.S. economy experienced much larger technological progress favoring cognitive abilities than the original findings suggest. Furthermore, our flexible model specifications provide a significantly better fit for patterns in the evolution of wage inequality. |
Date: | 2024–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2405.18089&r= |
By: | Milen Arro-Cannarsa; Dr. Rolf Scheufele |
Abstract: | We compare several machine learning methods for nowcasting GDP. A large mixed-frequency data set is used to investigate different algorithms such as regression-based methods (LASSO, ridge, elastic net), regression trees (bagging, random forest, gradient boosting), and SVR. As benchmarks, we use univariate models, a simple forward selection algorithm, and a principal components regression. The analysis accounts for publication lags and treats monthly indicators as quarterly variables combined via blocking. Our data set consists of more than 1,100 time series. For the period after the Great Recession, which is particularly challenging in terms of nowcasting, we find that all considered machine learning techniques beat the univariate benchmark by up to 28% in terms of out-of-sample RMSE. Ridge, elastic net, and SVR are the most promising algorithms in our analysis, significantly outperforming principal components regression.
Keywords: | Nowcasting, Forecasting, Machine learning, Ridge, LASSO, Elastic net, Random forest, Bagging, Boosting, SVM, SVR, Large data sets
JEL: | C53 C55 C32 |
Date: | 2024 |
URL: | https://d.repec.org/n?u=RePEc:snb:snbwpa:2024-06&r= |
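Among the methods compared in the nowcasting paper above, ridge has a closed form that fits in a few lines. The sketch below is a minimal stand-in on made-up data; the paper's exercise uses more than 1,100 mixed-frequency series with blocking and publication-lag handling, none of which is reproduced here.

```python
import numpy as np

def ridge_nowcast(X_train, y_train, X_test, alpha=10.0):
    """Closed-form ridge regression prediction: solves
    (X'X + alpha*I) beta = X'y, which is well-defined even when the
    number of predictors exceeds the number of observations."""
    k = X_train.shape[1]
    beta = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(k),
                           X_train.T @ y_train)
    return X_test @ beta
```

Even in a p > n setting like the paper's, regularization lets the predictor beat a naive univariate (historical-mean) benchmark out of sample, which is the comparison the abstract reports.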
By: | Pötscher, Benedikt M. |
Abstract: | In Pötscher and Preinerstorfer (2022) and in the abridged version Pötscher and Preinerstorfer (2024, published in Econometrica) we have tried to clear up the confusion introduced in Hansen (2022a) and in the earlier versions Hansen (2021a, b). Unfortunately, Hansen's (2024) reply to Pötscher and Preinerstorfer (2024) further adds to the confusion. While we are already somewhat tired of the matter, for the sake of the econometrics community we feel compelled to provide clarification. We also add a comment on Portnoy (2023), a "correction" to Portnoy (2022), as well as on Lei and Wooldridge (2022).
Keywords: | Gauss-Markov Theorem, Aitken Theorem |
JEL: | C13 C20 |
Date: | 2024–06 |
URL: | https://d.repec.org/n?u=RePEc:pra:mprapa:121144&r= |
By: | Askitas, Nikos (IZA) |
Abstract: | This paper addresses the steep learning curve in Machine Learning faced by non-computer scientists, particularly social scientists, stemming from the absence of a primer on its fundamental principles. I adopt a pedagogical strategy inspired by the adage "once you understand OLS, you can work your way up to any other estimator, " and apply it to Machine Learning. Focusing on a single-hidden-layer artificial neural network, the paper discusses its mathematical underpinnings, including the pivotal Universal Approximation Theorem—an essential "existence theorem". The exposition extends to the algorithmic exploration of solutions, specifically through "feed forward" and "back-propagation", and rounds up with the practical implementation in Python. The objective of this primer is to equip readers with a solid elementary comprehension of first principles and fire some trailblazers to the forefront of AI and causal machine learning.
Keywords: | machine learning, deep learning, supervised learning, artificial neural network, perceptron, Python, keras, tensorflow, universal approximation theorem |
JEL: | C01 C87 C00 C60 |
Date: | 2024–05 |
URL: | https://d.repec.org/n?u=RePEc:iza:izadps:dp17014&r= |
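The feed-forward/back-propagation loop for the single-hidden-layer network described in the primer above fits in a short NumPy function. This is a from-scratch sketch with sigmoid activations and squared loss; the primer's own implementation uses keras/tensorflow, and the hyperparameters here are arbitrary assumptions.

```python
import numpy as np

def train_nn(X, y, hidden=8, lr=1.0, epochs=10000, seed=0):
    """Train a single-hidden-layer network by full-batch gradient
    descent, returning a prediction function. X: (n, k), y: (n, 1)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 1, (hidden, 1))
    b2 = np.zeros(1)
    sig = lambda z: 1 / (1 + np.exp(-z))
    n = len(X)
    for _ in range(epochs):
        # feed forward: input -> hidden -> output
        h = sig(X @ W1 + b1)
        out = sig(h @ W2 + b2)
        # back-propagation of the squared-loss gradient
        d_out = (out - y) * out * (1 - out)     # error at output layer
        d_h = (d_out @ W2.T) * h * (1 - h)      # error at hidden layer
        W2 -= lr * h.T @ d_out / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_h / n
        b1 -= lr * d_h.mean(axis=0)
    return lambda Xn: sig(sig(Xn @ W1 + b1) @ W2 + b2)
```

On a toy binary pattern such as logical AND, the trained network drives the squared loss close to zero, which is the "work your way up from OLS" exercise the primer builds toward.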