on Econometrics |
By: | Puyi Fang; Zhaoxing Gao; Ruey S. Tsay |
Abstract: | This paper proposes a new approach to identifying the effective cointegration rank in high-dimensional unit-root (HDUR) time series from a prediction perspective using reduced-rank regression. For an HDUR process $\mathbf{x}_t\in \mathbb{R}^N$ and a stationary series $\mathbf{y}_t\in \mathbb{R}^p$ of interest, our goal is to predict future values of $\mathbf{y}_t$ using $\mathbf{x}_t$ and lagged values of $\mathbf{y}_t$. The proposed framework consists of a two-step estimation procedure. First, principal component analysis is used to identify all cointegrating vectors of $\mathbf{x}_t$. Second, the cointegrated stationary series are used as regressors, together with some lagged variables of $\mathbf{y}_t$, to predict $\mathbf{y}_t$. The estimated reduced rank is then defined as the effective cointegration rank of $\mathbf{x}_t$. When the autoregressive coefficient matrices are sparse (or of low rank), we apply the Least Absolute Shrinkage and Selection Operator (or reduced-rank techniques) to estimate the autoregressive coefficients in high dimensions. Theoretical properties of the estimators are established as the dimensions $p$ and $N$ and the sample size $T$ tend to infinity. Both simulated and real examples are used to illustrate the proposed framework, and the empirical application suggests that the proposed procedure fares well in predicting stock returns. |
Date: | 2023–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2304.12134&r=ecm |
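A minimal sketch of the two-step idea described in the abstract above, on simulated data and not the authors' code: for a unit-root panel, the eigenvectors of the sample covariance with the smallest eigenvalues approximate the cointegrating (stationary) directions, and the resulting stationary series then enter a LASSO prediction step. All names and dimensions are illustrative.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
T, N, r = 400, 20, 3                      # sample size, dimension, number of cointegrating relations

# Simulate x_t driven by (N - r) random-walk trends, so r stationary combinations exist.
trends = np.cumsum(rng.normal(size=(T, N - r)), axis=0)
loadings = rng.normal(size=(N - r, N))
x = trends @ loadings + rng.normal(size=(T, N))

# Step 1: eigen-decomposition of the sample covariance; directions with the r
# smallest eigenvalues are (approximately) stationary linear combinations.
eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
z = x @ eigvec[:, :r]                     # estimated cointegrated (stationary) series

# Step 2: predict a stationary target y_{t+1} from z_t and lagged y_t via LASSO.
y = 0.5 * z[:, 0] + rng.normal(size=T)
X = np.column_stack([z[:-1], y[:-1]])     # regressors dated t
fit = LassoCV(cv=5).fit(X, y[1:])         # target dated t+1
print("selected coefficients:", np.round(fit.coef_, 2))
```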
By: | Yassine Sbai Sassi |
Abstract: | We propose a rate optimal estimator for the linear regression model on network data with interacted (unobservable) individual effects. The estimator achieves a faster rate of convergence $N$ compared to the standard estimators' $\sqrt{N}$ rate and is efficient in cases that we discuss. We observe that the individual effects alter the eigenvalue distribution of the data's matrix representation in significant and distinctive ways. We subsequently offer a correction for the \textit{ordinary least squares}' objective function to attenuate the statistical noise that arises due to the individual effects, and in some cases, completely eliminate it. The new estimator is asymptotically normal and we provide a valid estimator for its asymptotic covariance matrix. While this paper only considers models accounting for first-order interactions between individual effects, our estimation procedure is naturally extendable to higher-order interactions and more general specifications of the error terms. |
Date: | 2023–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2304.12554&r=ecm |
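A toy simulation, not the paper's estimator, illustrating the observation in the abstract above that interacted individual effects alter the eigenvalue distribution of the data's matrix representation: an outer-product term $a_i b_j$ adds a low-rank "spike" to an otherwise noise-like spectrum.

```python
import numpy as np

rng = np.random.default_rng(1)
N, beta = 200, 1.0
x = rng.normal(size=(N, N))               # observed dyadic regressor
eps = rng.normal(size=(N, N))             # idiosyncratic noise
a, b = rng.normal(size=N), rng.normal(size=N)

y_no_fe = beta * x + eps                  # no individual effects
y_fe = beta * x + np.outer(a, b) + eps    # first-order interacted individual effects

# Singular values of the de-meaned matrices y - beta * x:
sv_no_fe = np.linalg.svd(y_no_fe - beta * x, compute_uv=False)
sv_fe = np.linalg.svd(y_fe - beta * x, compute_uv=False)
print("largest singular value without effects:", round(float(sv_no_fe[0]), 1))
print("largest singular value with effects:   ", round(float(sv_fe[0]), 1))  # clear spike
```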
By: | Daniele Girolimetto; George Athanasopoulos; Tommaso Di Fonzo; Rob J Hyndman |
Abstract: | Forecast reconciliation is a post-forecasting process that involves transforming a set of incoherent forecasts into coherent forecasts which satisfy a given set of linear constraints for a multivariate time series. In this paper we extend the current state-of-the-art cross-sectional probabilistic forecast reconciliation approach to encompass a cross-temporal framework, where temporal constraints are also applied. Our proposed methodology employs both parametric Gaussian and non-parametric bootstrap approaches to draw samples from an incoherent cross-temporal distribution. To improve the estimation of the forecast error covariance matrix, we propose using multi-step residuals, especially in the time dimension where the usual one-step residuals fail. To address high-dimensionality issues, we present four alternatives for the covariance matrix, where we exploit the twofold nature (cross-sectional and temporal) of the cross-temporal structure, and introduce the idea of overlapping residuals. We evaluate the proposed methods through a detailed simulation study that investigates their theoretical and empirical properties. We further assess the effectiveness of the proposed cross-temporal reconciliation approach by applying it to two empirical forecasting experiments, using the Australian GDP and the Australian Tourism Demand datasets. For both applications, we show that the optimal cross-temporal reconciliation approaches significantly outperform the incoherent base forecasts in terms of the Continuous Ranked Probability Score and the Energy Score. Overall, our study expands and unifies the notation for cross-sectional, temporal and cross-temporal reconciliation, thus extending and deepening the probabilistic cross-temporal framework. The results highlight the potential of the proposed cross-temporal forecast reconciliation methods in improving the accuracy of probabilistic forecasting models. |
Keywords: | coherent, GDP, linear constraints, multivariate time series, temporal aggregation, tourism flows |
Date: | 2023 |
URL: | http://d.repec.org/n?u=RePEc:msh:ebswps:2023-6&r=ecm |
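A minimal cross-sectional illustration of the core reconciliation step described above (a simplification of the paper's cross-temporal, probabilistic setting): incoherent base forecasts are projected onto the subspace of forecasts satisfying the linear aggregation constraints.

```python
import numpy as np

# Toy hierarchy: total = A + B; the "summing" matrix maps the bottom series
# (A, B) to all series (total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

base = np.array([105.0, 60.0, 50.0])      # incoherent base forecasts: 60 + 50 != 105

# OLS reconciliation: y_tilde = S (S'S)^{-1} S' y_hat, a projection onto the
# coherent subspace; MinT-type variants use S (S' W^{-1} S)^{-1} S' W^{-1}
# with a residual covariance estimate W.
P = S @ np.linalg.solve(S.T @ S, S.T)
coherent = P @ base
print(coherent, "sum check:", coherent[1] + coherent[2], "=", coherent[0])
```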
By: | Paul Ho; Thomas A. Lubik; Christian Matthes |
Abstract: | Macroeconomists construct impulse responses using many competing time series models and different statistical paradigms (Bayesian or frequentist). We adapt optimal linear prediction pools to efficiently combine impulse response estimators for the effects of the same economic shock from this vast class of possible models. We thus alleviate the need to choose one specific model, obtaining weights that are typically positive for more than one model. Three Monte Carlo simulations and two monetary shock empirical applications illustrate how the weights leverage the strengths of each model by (i) trading off properties of each model depending on variable, horizon, and application and (ii) accounting for the full predictive distribution rather than being restricted to specific moments. |
Keywords: | prediction pools; model averaging; impulse responses; misspecification |
JEL: | C32 C52 |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedrwp:95601&r=ecm |
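A sketch of an optimal linear prediction pool for two models, the building block behind the combination scheme described above: the weight is chosen to maximize the average log predictive score of the convex combination of the models' predictive densities. The two Gaussian "models" here are placeholders.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
y = rng.normal(loc=0.3, scale=1.0, size=500)          # realized data

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

dens_a = normal_pdf(y, 0.0, 1.0)                      # model A predictive densities
dens_b = normal_pdf(y, 0.5, 1.2)                      # model B predictive densities

def neg_log_score(w):
    # negative average log score of the pooled predictive density
    return -np.mean(np.log(w * dens_a + (1 - w) * dens_b))

res = minimize_scalar(neg_log_score, bounds=(0.0, 1.0), method="bounded")
print("optimal pool weight on model A:", round(res.x, 3))
```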
By: | Virbickaite, Audrone (CUNEF Universidad); Nguyen, Hoang (Örebro University School of Business); Tran, Minh-Ngoc (Discipline of Business Analytics, The University of Sydney Business School) |
Abstract: | This study explores the benefits of incorporating fat-tailed innovations, asymmetric volatility response, and an extended information set into crude oil return modeling and forecasting. To this end, we utilize standard volatility models such as Generalized Autoregressive Conditional Heteroskedastic (GARCH), Generalized Autoregressive Score (GAS), and Stochastic Volatility (SV), along with Mixed Data Sampling (MIDAS) regressions, which enable us to incorporate the impacts of relevant financial/macroeconomic news into asset price movements. For inference and prediction, we employ an innovative Bayesian estimation approach called the density-tempered sequential Monte Carlo method. Our findings indicate that the inclusion of exogenous variables is beneficial for GARCH-type models while offering only a marginal improvement for GAS and SV-type models. Notably, GAS-family models exhibit superior performance in terms of in-sample fit, out-of-sample forecast accuracy, as well as Value-at-Risk and Expected Shortfall prediction. |
Keywords: | ES; GARCH; GAS; log marginal likelihood; MIDAS; SV; VaR |
JEL: | C22 C52 C58 G32 |
Date: | 2023–04–14 |
URL: | http://d.repec.org/n?u=RePEc:hhs:oruesi:2023_007&r=ecm |
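A sketch of how an extended information set can enter a volatility model, assuming a GARCH(1,1)-X variance equation with one exogenous regressor fitted by Gaussian quasi-maximum likelihood; the paper's GAS/SV/MIDAS variants and the density-tempered sequential Monte Carlo estimation are not reproduced here, and the data are simulated.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
T = 1000
x_exo = np.abs(rng.normal(size=T))                      # stand-in for a financial/macro news variable
r = rng.normal(size=T) * np.sqrt(0.1 + 0.05 * x_exo)    # toy returns

def neg_loglik(params, r, x_exo):
    omega, alpha, beta, gamma = params
    sigma2 = np.empty(len(r))
    sigma2[0] = np.var(r)
    for t in range(1, len(r)):
        # GARCH(1,1)-X recursion for the conditional variance
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1] + gamma * x_exo[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + r ** 2 / sigma2)

res = minimize(neg_loglik, x0=[0.05, 0.05, 0.8, 0.01], args=(r, x_exo),
               bounds=[(1e-6, None), (0.0, 1.0), (0.0, 1.0), (0.0, None)])
print("omega, alpha, beta, gamma:", np.round(res.x, 3))
```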
By: | Jarek Duda |
Abstract: | Real-life time series are usually nonstationary, raising the difficult question of model adaptation. Classical approaches like GARCH assume an arbitrary type of dependence. To prevent such bias, we focus on the recently proposed agnostic philosophy of the moving estimator: at time $t$, finding parameters that optimize a moving log-likelihood such as $F_t=\sum_{\tau<t} \eta^{t-\tau} \ln \rho_\theta(x_\tau)$, with the parameter estimates evolving in time. |
Date: | 2023–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2304.03069&r=ecm |
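A minimal Gaussian example of the moving-estimator idea in the abstract above, under the assumption that maximizing the exponentially weighted moving log-likelihood $F_t$ at each $t$ reduces to exponential moving averages of the first two moments; the forgetting factor and toy series are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
# Nonstationary toy series: the mean drifts over time.
x = rng.normal(loc=np.linspace(0.0, 3.0, 2000), scale=1.0)

eta = 0.97                                # forgetting factor in F_t
mu, m2 = x[0], 1.0                        # running EMA estimates of E[x] and E[x^2]
mus, sigmas = [], []
for xt in x:
    mu = eta * mu + (1 - eta) * xt        # argmax of the moving Gaussian log-likelihood
    m2 = eta * m2 + (1 - eta) * xt ** 2
    mus.append(mu)
    sigmas.append(np.sqrt(max(m2 - mu ** 2, 1e-12)))

print("final adaptive mean and std:", round(mus[-1], 2), round(sigmas[-1], 2))
```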
By: | Hernandez Amaro, Pavel; Durbán Reguera, María Luz; Aguilera Morillo, Maria Del Carmen; Esteban Gonzalez, Cristobal; Arostegui, Inma |
Abstract: | Motivated by increasingly common data-collection technology, such as cellphones and smartwatches, functional data analysis has been intensively studied in recent decades, and along with it, functional regression models. However, most functional data methods in general, and functional regression models in particular, assume that the observed data share the same domain. When the data have variable domains, they need to be aligned or registered before being fitted with the usual modelling techniques, which adds computational burden. To avoid this, a model that accommodates the variable-domain features of the data is needed, but such models are scarce and their estimation methods present some limitations. In this article, we propose a new scalar-on-function regression model for variable domain functional data that eludes the need for alignment, and a new estimation methodology that we extend to other variable domain regression models. |
Keywords: | Variable Domain Functional Data; B-Splines; Mixed Models; COPD |
Date: | 2025–05–05 |
URL: | http://d.repec.org/n?u=RePEc:cte:wsrepe:37255&r=ecm |
By: | Kyungsub Lee |
Abstract: | This study examines the use of a recurrent neural network for estimating the parameters of a Hawkes model based on high-frequency financial data, and, subsequently, for computing volatility. Neural networks have shown promising results in various fields, and interest in their use in finance is also growing. Our approach demonstrates significantly faster computational performance compared to traditional maximum likelihood estimation methods while yielding comparable accuracy in both simulation and empirical studies. Furthermore, we demonstrate the application of this method for real-time volatility measurement, enabling the continuous estimation of financial volatility as new price data arrive from the market. |
Date: | 2023–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2304.11883&r=ecm |
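For reference, a sketch of the exponential-kernel Hawkes log-likelihood that traditional maximum likelihood estimation maximizes and that the RNN approach above is meant to replace as an estimation engine (a baseline sketch, not the paper's network): the intensity is $\lambda(t) = \mu + \alpha \sum_{t_i < t} e^{-\beta (t - t_i)}$, and the sum over past events can be computed recursively.

```python
import numpy as np

def hawkes_loglik(params, times, T_end):
    mu, alpha, beta = params
    A = 0.0                               # recursive sum of exp(-beta (t_i - t_j)) over j < i
    loglik = 0.0
    prev_t = None
    for t in times:
        if prev_t is not None:
            A = np.exp(-beta * (t - prev_t)) * (1.0 + A)
        loglik += np.log(mu + alpha * A)
        prev_t = t
    # Compensator: integral of the intensity over [0, T_end].
    loglik -= mu * T_end + (alpha / beta) * np.sum(1.0 - np.exp(-beta * (T_end - np.asarray(times))))
    return loglik

# Example call on hypothetical event times (e.g. trade or quote timestamps):
print(hawkes_loglik((0.5, 0.8, 1.5), [0.2, 0.5, 0.9, 2.1, 2.2], T_end=3.0))
```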
By: | Carranza, Aldo Gael (Stanford U); Krishnamurthy, Sanath Kumar (Stanford U); Athey, Susan (Stanford U) |
Abstract: | Contextual bandit algorithms often estimate reward models to inform decision-making. However, true rewards can contain action-independent redundancies that are not relevant for decision-making. We show it is more data-efficient to estimate any function that explains the reward differences between actions, that is, the treatment effects. Motivated by this observation, building on recent work on oracle-based bandit algorithms, we provide the first reduction of contextual bandits to general-purpose heterogeneous treatment effect estimation, and we design a simple and computationally efficient algorithm based on this reduction. Our theoretical and experimental results demonstrate that heterogeneous treatment effect estimation in contextual bandits offers practical advantages over reward estimation, including more efficient model estimation and greater flexibility to model misspecification. |
Date: | 2023–02 |
URL: | http://d.repec.org/n?u=RePEc:ecl:stabus:4081&r=ecm |
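A simple illustration, not the paper's oracle-based reduction, of estimating the between-action reward difference $\tau(x)$ directly rather than the full reward function, using an IPW-transformed outcome from uniformly logged actions; the data-generating process and feature names are made up.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
n = 5000
x = rng.uniform(-1, 1, size=(n, 2))
a = rng.integers(0, 2, size=n)                    # logged actions, uniform propensity 0.5
baseline = 5.0 + 3.0 * x[:, 0]                    # large action-independent reward component
tau = x[:, 1]                                     # true treatment effect
r = baseline + a * tau + rng.normal(scale=0.5, size=n)

# Transformed outcome: E[psi | x] = tau(x) when actions are assigned with probability 0.5.
psi = (2 * a - 1) * r / 0.5
model = GradientBoostingRegressor().fit(x, psi)
print("estimated vs true effect at x = (0, 0.7):",
      round(model.predict([[0.0, 0.7]])[0], 2), "vs 0.7")
```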
By: | Mathias Silva (Aix-Marseille Univ, CNRS, AMSE, Marseille, France.) |
Abstract: | Several representativeness issues affect the data sources available for studying populations' income distributions. High-income under-reporting and non-response have been shown to be particularly significant in the literature, because they lead to underestimation of income growth and inequality. This paper bridges several past parametric modelling attempts to account for high-income data issues when making parametric inference on income distributions at the population level. A unified parametric framework integrating parametric income distribution models with popular data-replacing and reweighting corrections is developed. To exploit this framework for empirical analysis, an Approximate Bayesian Computation approach is developed. This approach updates prior beliefs on the population income distribution and on the high-income data issues presumably affecting the available data by attempting to reproduce the observed income distribution under simulations from the parametric model. Applications to simulated and EU-SILC data illustrate the performance of the approach in studying population-level mean incomes and inequality from data potentially affected by these high-income issues. |
Keywords: | 'Missing rich', GB2, Bayesian inference |
JEL: | D31 C18 C11 |
Date: | 2023–05 |
URL: | http://d.repec.org/n?u=RePEc:aim:wpaimx:2311&r=ecm |
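A toy Approximate Bayesian Computation sampler in the spirit of the approach above, assuming a lognormal income model (the paper works with the GB2 distribution) and a simple top-income non-response mechanism as the high-income data issue; the priors, summaries, and acceptance rule are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000

def simulate_survey(mu, sigma, p_miss_top):
    income = rng.lognormal(mu, sigma, size=n)
    top = income > np.quantile(income, 0.95)            # top 5% of incomes
    drop = top & (rng.uniform(size=n) < p_miss_top)     # rich non-response
    return income[~drop]

def summaries(sample):
    return np.array([np.median(sample), np.mean(sample), np.quantile(sample, 0.99)])

s_obs = summaries(simulate_survey(10.0, 0.8, 0.6))      # pretend these are the observed survey data

draws, dists = [], []
for _ in range(4000):
    theta = (rng.uniform(9, 11), rng.uniform(0.4, 1.2), rng.uniform(0, 0.9))
    dists.append(np.max(np.abs(summaries(simulate_survey(*theta)) / s_obs - 1.0)))
    draws.append(theta)

best = np.array(draws)[np.argsort(dists)[:100]]          # keep the 100 closest draws
print("ABC posterior means (mu, sigma, p_miss):", np.round(best.mean(axis=0), 2))
```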
By: | Lihua Lei; Roshni Sahoo; Stefan Wager |
Abstract: | Practitioners often use data from a randomized controlled trial to learn a treatment assignment policy that can be deployed on a target population. A recurring concern in doing so is that, even if the randomized trial was well-executed (i.e., internal validity holds), the study participants may not represent a random sample of the target population (i.e., external validity fails)--and this may lead to policies that perform suboptimally on the target population. We consider a model where observable attributes can impact sample selection probabilities arbitrarily but the effect of unobservable attributes is bounded by a constant, and we aim to learn policies with the best possible performance guarantees that hold under any sampling bias of this type. In particular, we derive the partial identification result for the worst-case welfare in the presence of sampling bias and show that the optimal max-min, max-min gain, and minimax regret policies depend on both the conditional average treatment effect (CATE) and the conditional value-at-risk (CVaR) of potential outcomes given covariates. To avoid finite-sample inefficiencies of plug-in estimates, we further provide an end-to-end procedure for learning the optimal max-min and max-min gain policies that does not require the separate estimation of nuisance parameters. |
Date: | 2023–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2304.11735&r=ecm |
By: | Csaba Burger (Magyar Nemzeti Bank (the Central Bank of Hungary)); Mihály Berndt (Clarity Consulting Kft) |
Abstract: | Supervised machine learning methods are increasingly popular for identifying potential data errors even when no explicit error labels are present. Such algorithms rely on the tenet of a ‘ground truth’ in the data, that is, the assumption that the majority of cases are correct. Points deviating from such relationships, outliers, are flagged as potential data errors. This paper implements an outlier-based error-spotting algorithm using gradient boosting and presents a blueprint for the modelling pipeline. More specifically, it underpins three main modelling hypotheses with empirical evidence, related to (1) missing-value imputation, (2) the choice of loss function and (3) the location of the error. To do so, it takes a cross-sectional view of the loan-to-value ratio and related columns of the Credit Registry (Hitelregiszter) of the Central Bank of Hungary (MNB), and introduces a set of synthetic error types to test its hypotheses. The paper shows that gradient boosting is not materially affected by the choice of imputation method; hence, replacement with a constant, the computationally most efficient option, is recommended. Second, the Huber loss function, which is quadratic up to the Huber-slope parameter and linear above it, is better suited to coping with outlier values and is therefore better at capturing data errors. Finally, errors in the target variable are captured best, while errors in the predictors are hardly found at all. These empirical results may generalize to other cases, depending on data specificities, and the modelling pipeline described underscores significant modelling decisions. |
Keywords: | data quality, machine learning, gradient boosting, central banking, loss functions, missing values |
JEL: | C5 C81 E58 |
Date: | 2023 |
URL: | http://d.repec.org/n?u=RePEc:mnb:opaper:2023/148&r=ecm |
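A sketch of the outlier-based error-spotting idea described above: fit a gradient boosting model with Huber loss and flag records with unusually large residuals as potential data errors. The column names, error mechanism, and thresholds are illustrative, not those of the MNB Credit Registry pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
n = 5000
collateral_value = rng.lognormal(mean=12, sigma=0.5, size=n)
loan_amount = 0.7 * collateral_value * rng.uniform(0.8, 1.2, size=n)

# Inject synthetic errors into the target column (loan amount) for 1% of records.
is_error = rng.uniform(size=n) < 0.01
loan_amount[is_error] *= 100                          # e.g. a misplaced decimal point

X = np.log(collateral_value).reshape(-1, 1)
y = np.log(loan_amount)
model = GradientBoostingRegressor(loss="huber", alpha=0.9).fit(X, y)

residuals = np.abs(y - model.predict(X))
flagged = residuals > np.quantile(residuals, 0.99)    # flag the largest 1% of residuals
print("share of injected errors flagged:", round(float(np.mean(flagged[is_error])), 2))
```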
By: | Monica Billio (University of Ca’ Foscari [Venice, Italy]); Lorenzo Frattarolo (JRC - European Commission - Joint Research Centre [Ispra]); Dominique Guégan (University of Ca’ Foscari [Venice, Italy], CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, UP1 - Université Paris 1 Panthéon-Sorbonne) |
Abstract: | We use a recently proposed fast test of copula radial symmetry based on multiplier bootstrap and obtain an equivalent randomization test. The literature shows the statistical superiority of the randomization approach in the bivariate case. We extend the comparison of statistical performance focusing on the high-dimensional regime in a simulation study. We document radial asymmetry in the joint distribution of the percentage changes of sectorial industrial production indices of the European Union. |
Keywords: | copula, reflection symmetry, radial symmetry, empirical process, test |
Date: | 2022–01–07 |
URL: | http://d.repec.org/n?u=RePEc:hal:cesptp:hal-04085236&r=ecm |
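A sketch of a radial-symmetry check via randomization, illustrating the idea rather than reproducing the authors' multiplier-bootstrap statistic: under radial symmetry the pseudo-observations $U$ and their reflections $1-U$ have the same distribution, so randomly reflecting observations should not change the discrepancy between the empirical copulas of $U$ and $1-U$. The Clayton sample, evaluation grid, and number of randomizations are illustrative.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(8)

def empirical_copula(U, grid):
    # C_n(u) = share of observations that are componentwise <= u
    return np.array([(U <= u).all(axis=1).mean() for u in grid])

def asymmetry_stat(U, grid):
    return np.max(np.abs(empirical_copula(U, grid) - empirical_copula(1.0 - U, grid)))

# Pseudo-observations from a radially asymmetric (Clayton, theta = 1) sample.
n, d = 300, 3
v = rng.exponential(size=(n, 1))
x = (1.0 + rng.exponential(size=(n, d)) / v) ** (-1.0)
U = rankdata(x, axis=0) / (n + 1)

grid = rng.uniform(size=(50, d))                      # evaluation points
stat = asymmetry_stat(U, grid)

# Randomization: reflect each observation U_i -> 1 - U_i with probability 1/2.
null_stats = []
for _ in range(200):
    flip = rng.integers(0, 2, size=(n, 1)).astype(bool)
    null_stats.append(asymmetry_stat(np.where(flip, 1.0 - U, U), grid))

p_value = np.mean(np.array(null_stats) >= stat)
print("radial asymmetry statistic:", round(stat, 3), "randomization p-value:", round(p_value, 3))
```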
By: | Oliver Cassagneau-Francis (ECON - Département d'économie (Sciences Po) - Sciences Po - Sciences Po - CNRS - Centre National de la Recherche Scientifique) |
Abstract: | Recent work has highlighted the significant variation in returns to higher education across individuals. We develop a novel methodology, exploiting recent advances in the identification of mixture models, which groups individuals according to their prior ability and estimates the wage returns to a university degree by group. We prove the non-parametric identification of our model. Applying our method to data from a UK cohort study, our findings reflect recent evidence that skills and ability are multidimensional. Our flexible model allows the returns to university to vary across the (multidimensional) ability distribution, a flexibility missing from commonly used additive models but which we show is empirically important. The returns to higher education are 3 to 4 times larger than the returns to prior cognitive and non-cognitive abilities. Returns are generally increasing in ability for both men and women, but vary non-monotonically across the ability distribution. |
Keywords: | Mixture models, Distributions, Treatment effects, Higher education, Wages, Human capital, Cognitive and non-cognitive abilities |
Date: | 2022–05–19 |
URL: | http://d.repec.org/n?u=RePEc:hal:spmain:hal-04067399&r=ecm |
By: | Hennessy, Christopher A.; Goodhart, C. A. E. |
Abstract: | We develop a simple structural model to illustrate how penalized regressions generate Goodhart bias when training data are clean but covariates are manipulated at known cost by future agents. With quadratic (extremely steep) manipulation costs, bias is proportional to Ridge (Lasso) penalization. If costs depend on absolute or percentage manipulation, the following algorithm yields manipulation-proof prediction: Within training data, evaluate candidate coefficients at their respective incentive-compatible manipulation configuration. We derive analytical coefficient adjustments: slopes (intercept) shift downward if costs depend on percentage (absolute) manipulation. Statisticians ignoring manipulation costs select socially suboptimal penalization. Model averaging reduces these manipulation costs. |
JEL: | J1 |
Date: | 2023–03–21 |
URL: | http://d.repec.org/n?u=RePEc:ehl:lserod:118656&r=ecm |
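A hedged sketch of the manipulation-proof principle quoted above, under an assumed quadratic manipulation cost so that the agent's best response has a closed form: within the clean training data, each candidate coefficient vector is evaluated at the covariate configuration that score-maximizing agents would choose given those coefficients (here, the best response to a cost of $\tfrac{c}{2}\lVert\Delta x\rVert^2$ is $\Delta x = \beta / c$). The cost specification and data are illustrative, not the paper's derivation for absolute or percentage costs.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(9)
n, k, c = 500, 3, 4.0
X = rng.normal(size=(n, k))
beta_true = np.array([1.0, -0.5, 0.25])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def manipulation_proof_loss(beta):
    dx = beta / c                          # incentive-compatible manipulation per covariate
    return np.mean((y - (X + dx) @ beta) ** 2)

naive = np.linalg.lstsq(X, y, rcond=None)[0]
robust = minimize(manipulation_proof_loss, x0=naive).x
print("OLS coefficients:               ", np.round(naive, 3))
print("manipulation-proof coefficients:", np.round(robust, 3))
```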
By: | Ahmad W. Bitar (UTT - Université de Technologie de Troyes, CentraleSupélec); Nathan de Carvalho (UPCité - Université Paris Cité, CentraleSupélec, Engie Global Markets); Valentin Gatignol (Qube Research and Technologies, CentraleSupélec) |
Abstract: | In this technical report, we aim to combine different portfolio allocation techniques with covariance matrix estimators to meet two types of clients' requirements: client A, who wants to invest money wisely, not take too much risk, and not pay too much in rebalancing fees; and client B, who wants to make money quickly, benefit from the market's short-term volatility, and is ready to pay rebalancing fees. Four portfolio techniques are considered (mean-variance, robust portfolio, minimum-variance, and equi-risk budgeting), and four covariance estimators are applied (sample covariance, ordinary least squares (OLS) covariance, cross-validated eigenvalue shrinkage covariance, and eigenvalue clipping). Comparisons between the covariance estimators in terms of eigenvalue stability and four metrics (expected risk, gross leverage, Sharpe ratio, and effective diversification) exhibit the superiority of the eigenvalue clipping estimator. Experiments on the Russell 1000 dataset show that minimum-variance with eigenvalue clipping is the model suitable for client A, whereas robust portfolio with eigenvalue clipping is the one suitable for client B. |
Keywords: | Robust portfolio, minimum-variance, eigenvalue clipping, OLS covariance |
Date: | 2023–03–26 |
URL: | http://d.repec.org/n?u=RePEc:hal:wpaper:hal-04046454&r=ecm |
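A sketch of the eigenvalue clipping covariance estimator mentioned above, in the standard random-matrix form: eigenvalues of the sample correlation matrix below the Marchenko-Pastur upper edge are treated as noise and replaced by their average before rebuilding the covariance. The simulated returns are illustrative.

```python
import numpy as np

rng = np.random.default_rng(10)
T, N = 500, 100                            # observations x assets
returns = rng.normal(size=(T, N))

corr = np.corrcoef(returns, rowvar=False)
eigval, eigvec = np.linalg.eigh(corr)

lambda_max = (1.0 + np.sqrt(N / T)) ** 2   # Marchenko-Pastur upper edge for q = N / T
noise = eigval < lambda_max
eigval_clipped = eigval.copy()
eigval_clipped[noise] = eigval[noise].mean()   # preserve the trace of the noise bulk

corr_clipped = eigvec @ np.diag(eigval_clipped) @ eigvec.T
vols = returns.std(axis=0)
cov_clipped = corr_clipped * np.outer(vols, vols)   # back to a covariance matrix
print("kept", int((~noise).sum()), "signal eigenvalues out of", N)
```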
By: | Wei Tian (School of Economics, UNSW Business School, UNSW); Seojeong Lee (Department of Economics, Seoul National University); Valentyn Panchenko (School of Economics, UNSW Business School, UNSW) |
Abstract: | We propose a generalization of the synthetic control method to a multiple-outcome framework, which improves the reliability of treatment effect estimation. This is done by supplementing the conventional pre-treatment time dimension with the extra dimension of related outcomes in computing the synthetic control weights. Our generalization can be particularly useful for studies evaluating the effect of a treatment on multiple outcome variables. To illustrate our method, we estimate the effects of non-pharmaceutical interventions (NPIs) on various outcomes in Sweden in the first 3 quarters of 2020. Our results suggest that if Sweden had implemented stricter NPIs like the other European countries by March, then there would have been about 70% fewer cumulative COVID-19 infection cases and deaths by July, and 20% fewer deaths from all causes in early May, whereas the impacts of the NPIs were relatively mild on the labor market and economic outcomes. |
Keywords: | Synthetic control, Policy evaluation, Causal inference, Public health |
JEL: | C32 C54 I18 |
Date: | 2023–03 |
URL: | http://d.repec.org/n?u=RePEc:swe:wpaper:2023-05&r=ecm |
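A sketch of the multiple-outcome synthetic control idea described above, on simulated data rather than the Swedish NPI study: pre-treatment observations of several outcomes are stacked for the treated unit and the donor pool, and nonnegative weights summing to one are chosen to reproduce the treated unit across all outcomes jointly. Comparable scaling of the stacked outcomes is assumed.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(11)
T_pre, n_outcomes, n_donors = 12, 3, 8

donors = rng.normal(size=(T_pre * n_outcomes, n_donors))      # stacked donor outcomes
true_w = np.array([0.5, 0.3, 0.2] + [0.0] * 5)
treated = donors @ true_w + rng.normal(scale=0.05, size=T_pre * n_outcomes)

def loss(w):
    # pre-treatment fit across all stacked outcomes
    return np.sum((treated - donors @ w) ** 2)

cons = ({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},)
bounds = [(0.0, 1.0)] * n_donors
res = minimize(loss, x0=np.full(n_donors, 1.0 / n_donors),
               bounds=bounds, constraints=cons)
print("estimated donor weights:", np.round(res.x, 2))
```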
By: | Ajit Desai |
Abstract: | This article provides a curated review of selected papers published in prominent economics journals that use machine learning (ML) tools for research and policy analysis. The review focuses on three key questions: (1) when ML is used in economics, (2) what ML models are commonly preferred, and (3) how they are used for economic applications. The review highlights that ML is particularly used in processing nontraditional and unstructured data, capturing strong nonlinearity, and improving prediction accuracy. Deep learning models are suitable for nontraditional data, whereas ensemble learning models are preferred for traditional datasets. While traditional econometric models may suffice for analyzing low-complexity data, the increasing complexity of economic data due to rapid digitalization and the growing literature suggest that ML is becoming an essential addition to the econometrician's toolbox. |
Date: | 2023–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2304.00086&r=ecm |
By: | Simon Smith; Allan Timmermann; Jonathan H. Wright |
Abstract: | We revisit time-variation in the Phillips curve, applying new Bayesian panel methods with breakpoints to US and European Union disaggregate data. Our approach allows us to accurately estimate both the number and timing of breaks in the Phillips curve. It further allows us to determine the existence of clusters of industries, cities, or countries whose Phillips curves display similar patterns of instability and to examine lead-lag patterns in how individual inflation series change. We find evidence of a marked flattening in the Phillips curves for US sectoral data and among EU countries, particularly poorer ones. Conversely, evidence of a flattening is weaker for MSA-level data and for the wage Phillips curve. US regional data and EU data point to a kink in the price Phillips curve which remains relatively steep when the economy is running hot. |
JEL: | C11 C22 E51 E52 |
Date: | 2023–04 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:31153&r=ecm |