nep-ecm New Economics Papers
on Econometrics
Issue of 2019‒11‒04
23 papers chosen by
Sune Karlsson
Örebro universitet

  1. The Second-order Asymptotic Properties of Asymmetric Least Squares Estimation By Tae-Hwy Lee; Aman Ullah; He Wang
  2. Center-Outward R-Estimation for Semiparametric VARMA Models By Marc Hallin; Davide La Vecchia; H Liu
  3. Stein-like Shrinkage Estimation of Panel Data Models with Common Correlated Effects By Tae-Hwy Lee; Bai Huang; Aman Ullah
  4. Nonparametric Estimation of Marginal Effects in Regression-spline Random Effects Models By Aman Ullah; Shujie Ma; Jeffrey Racine
  5. Testing for Attrition Bias in Field Experiments By Sarojini Hirshleifer; Dalia Ghanem; Karen Ortiz-Becerra
  6. Time–Varying Coefficient Spatial Autoregressive Panel Data Model with Fixed Effects By Xuan Liang; Jiti Gao; Xiaodong Gong
  7. Estimating a Large Covariance Matrix in Time-varying Factor Models By Jaeheon Jung
  8. Find what you are looking for: A data-driven covariance matrix estimation By Sven Husmann; Antoniya Shivarova; Rick Steinert
  9. Nonparametric Estimation of the Marginal Effect in Fixed-Effect Panel Data Models By Aman Ullah; Yoonseok Lee; Debasri Mukherjee
  10. A Combined Random Effect and Fixed Effect Forecast for Panel Data Models By Tae-Hwy Lee; Bai Huang; Aman Ullah
  11. Implications of Partial Information for Applied Macroeconomic Modelling By Adrian Pagan; Tim Robinson
  12. Combined Estimation of Semiparametric Panel Data Models By Tae-Hwy Lee; Bai Huang; Aman Ullah
  13. Rising to the Challenge: Bayesian Estimation and Forecasting Techniques for Macroeconomic Agent-Based Models By Domenico Delli Gatti; Jakob Grazzini
  14. Variable Selection in Sparse Semiparametric Single Index Models By Tae-Hwy Lee; Jianghao Chu; Aman Ullah
  15. What Time Use Surveys Can (And Cannot) Tell Us about Labor Supply By Ruoyao Shi; Cheng Chou
  16. Bootstrap Aggregating and Random Forest By Tae-Hwy Lee; Aman Ullah; Ran Wang
17. Component-wise AdaBoost Algorithms for High-dimensional Binary Classification and Class Probability Prediction By Tae-Hwy Lee; Jianghao Chu; Aman Ullah
  18. How Informative is High-Frequency data for Tail Risk Estimation and Forecasting? By Halbleib, Roxana; Dimitriadis, Timo
  19. Dual IV: A Single Stage Instrumental Variable Regression By Krikamol Muandet; Arash Mehrjou; Si Kai Lee; Anant Raj
  20. Truncated priors for tempered hierarchical Dirichlet process vector autoregression By Sergei Seleznev
  21. Boosting By Tae-Hwy Lee; Jianghao Chu; Aman Ullah; Ran Wang
  22. Testing Forecast Rationality for Measures of Central Tendency By Timo Dimitriadis; Andrew J. Patton; Patrick Schmidt
  23. Sparsity and Stability for Minimum-Variance Portfolios By Sven Husmann; Antoniya Shivarova; Rick Steinert

  1. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Aman Ullah (UCR); He Wang (UCR)
    Abstract: Higher-order asymptotic properties provide a better approximation of the bias for a class of estimators. The first-order asymptotic properties of the asymmetric least squares (ALS) estimator were investigated by Newey and Powell (1987). This paper develops the second-order asymptotic properties (bias and mean squared error) of the ALS estimator, extending the second-order asymptotic results for the symmetric least squares (LS) estimators of Rilstone, Srivastava and Ullah (1996). LS gives the mean regression function, while ALS gives the "expectile" regression function, a generalization of the usual regression function. The second-order bias result enables an improved bias correction and thus improved ALS estimation in finite samples. In particular, we show that the second-order bias grows as the asymmetry becomes stronger, so the benefit of the second-order bias correction is greater when we are interested in extreme expectiles, which are used as a risk measure in financial economics. The higher-order MSE result for ALS estimation also enables us to better understand the sources of estimation uncertainty. A Monte Carlo simulation confirms the benefits of the second-order asymptotic theory and indicates that the second-order bias is larger at the extreme low and high expectiles.
    Keywords: asymmetric least squares, expectile, delta function, second-order bias, Monte Carlo.
    JEL: C13 C33 C52
    Date: 2018–12
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201910&r=all
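A minimal illustrative sketch of the ALS/expectile idea in paper 1: the tau-expectile regression coefficients solve a weighted least squares problem with asymmetric weights |tau - 1(residual < 0)|, which suggests a simple iteratively reweighted least squares (IRLS) routine. The data-generating process below is hypothetical.

```python
import numpy as np

def als_expectile(X, y, tau=0.5, n_iter=100, tol=1e-8):
    """Estimate tau-expectile regression coefficients by IRLS."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # LS start (the tau = 0.5 case)
    for _ in range(n_iter):
        resid = y - X @ beta
        w = np.where(resid < 0, 1 - tau, tau)        # asymmetric squared-loss weights
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            break
        beta = beta_new
    return beta

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
print(als_expectile(X, y, tau=0.9))   # intercept shifts up at high expectiles
```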
  2. By: Marc Hallin; Davide La Vecchia; H Liu
    Abstract: We propose a new class of estimators for semiparametric VARMA models with the innovation density playing the role of nuisance parameter. Our estimators are R-estimators based on the multivariate concepts of center-outward ranks and signs recently proposed by Hallin (2017). We show how these concepts, combined with Le Cam's asymptotic theory of statistical experiments, yield a robust yet flexible and powerful class of estimation procedures for multivariate time series. We develop the relevant asymptotic theory of our R-estimators, establishing their root-n consistency and asymptotic normality under a broad class of innovation densities including, e.g., multimodal mixtures of Gaussians and multivariate skew-t distributions. An implementation algorithm is provided in the supplementary material, available online. A Monte Carlo study compares our R-estimators with the routinely applied Gaussian quasi-likelihood ones; the latter appear to be quite significantly outperformed away from elliptical innovations. Numerical results also provide evidence of considerable robustness gains. Two real data examples conclude the paper.
    Keywords: Multivariate ranks, Distribution-freeness, Local asymptotic normality, Measure transportation, Quasi likelihood estimation, Skew innovation density
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:eca:wpaper:2013/294809&r=all
  3. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Bai Huang (CUFE); Aman Ullah (UCR)
    Abstract: This paper examines the asymptotic properties of Stein-type shrinkage combined (averaging) estimation of panel data models. We introduce a combined estimator for the case where the fixed effects (FE) estimator is inconsistent due to endogeneity arising from common effects that are correlated with both the regression error and the regressors. In this case the FE estimator and the CCEP estimator of Pesaran (2006) are combined. This can be viewed as the panel data version of shrinkage combining the OLS and 2SLS estimators, as the CCEP estimator is a 2SLS or control function estimator that controls for the endogeneity arising from the correlated common effects. The asymptotic theory, Monte Carlo simulations, and empirical applications are presented. According to our calculation of the asymptotic risk, the Stein-like shrinkage estimator is more efficient than the CCEP estimator.
    Keywords: Endogeneity, Panel data, Fixed effect, Common correlated effects, Shrinkage, Model averaging, Local asymptotics, Hausman test.
    JEL: C13 C33 C52
    Date: 2018–09
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201905&r=all
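An illustrative Hansen-type shrinkage combination in the spirit of paper 3, not the paper's exact formula: a Hausman-type statistic on the contrast between an efficient but possibly inconsistent estimator (the FE role here) and a consistent one (the CCEP role) drives the combining weight. The function names and the tuning constant c are hypothetical.

```python
import numpy as np

def stein_combine(b_eff, b_cons, V_eff, V_cons, c=1.0):
    """Combine an efficient (possibly inconsistent) and a consistent estimator."""
    d = b_cons - b_eff
    V_d = V_cons - V_eff                        # Hausman variance of the contrast
    H = float(d @ np.linalg.solve(V_d, d))      # Hausman-type statistic
    w = min(1.0, c / H)                         # weight on the efficient estimator
    return w * b_eff + (1.0 - w) * b_cons, H, w
```

When H is small (little evidence of endogeneity) the weight leans on the efficient estimator; when H is large it reverts to the consistent one.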
  4. By: Aman Ullah (Department of Economics, University of California Riverside); Shujie Ma (Department of Statistics, University of California Riverside); Jeffrey Racine (Department of Economics, McMaster University)
    Abstract: We consider a B-spline regression approach to nonparametric modelling of a random effects (error component) model. We focus our attention on the estimation of marginal effects (derivatives) and their asymptotic properties. Theoretical underpinnings are provided, finite-sample performance is evaluated via Monte Carlo simulation, and an application examines the contribution of different types of public infrastructure to private production, using panel data on the 48 contiguous US states over the period 1970-1986.
    JEL: C14 C23
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201920&r=all
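A self-contained sketch of the key computational step in paper 4: fit a regression spline and read the marginal effect off the analytic derivative of the fitted function. The paper uses B-splines within a random-effects model; for brevity this hypothetical example uses a cubic truncated power basis and plain OLS.

```python
import numpy as np

def tpb(x, knots):
    """Cubic truncated power basis: [1, x, x^2, x^3, (x - k)^3_+, ...]."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def tpb_deriv(x, knots):
    """Analytic derivative of the basis above."""
    cols = [np.zeros_like(x), np.ones_like(x), 2 * x, 3 * x**2]
    cols += [3 * np.clip(x - k, 0, None) ** 2 for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=1000)
y = np.sin(x) + rng.normal(scale=0.2, size=1000)
knots = np.quantile(x, [0.25, 0.5, 0.75])
beta = np.linalg.lstsq(tpb(x, knots), y, rcond=None)[0]
grid = np.linspace(-1.5, 1.5, 7)
print(tpb_deriv(grid, knots) @ beta)   # estimated marginal effects, roughly cos(grid)
```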
  5. By: Sarojini Hirshleifer (Department of Economics, University of California Riverside); Dalia Ghanem (UC Davis); Karen Ortiz-Becerra (UC Davis)
    Abstract: We approach attrition in field experiments with baseline outcome data as an identification problem in a panel model. A systematic review of the literature indicates that there is no consensus on how to test for attrition bias. We establish identifying assumptions for treatment effects for both the respondent subpopulation and the study population. We then derive their sharp testable implications on the baseline outcome distribution and propose randomization procedures to test them. We demonstrate that the most commonly used test does not control size in general when internal validity holds. Simulations and applications illustrate the empirical relevance of our analysis.
    Keywords: attrition, field experiments, randomized experiments, randomized controlled trials, internal validity, Kolmogorov-Smirnov, Cramer-von-Mises, randomization tests
    JEL: C12 C21 C33 C93
    Date: 2019–08
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201919&r=all
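A hedged sketch in the spirit of paper 5's randomization tests: compare the baseline-outcome distributions of respondents across treatment arms with a Kolmogorov-Smirnov statistic and obtain a p-value by re-randomizing treatment labels. This simplified version holds respondent status fixed while permuting labels; the paper derives the sharp testable implications and the valid procedures.

```python
import numpy as np
from scipy.stats import ks_2samp

def attrition_rand_test(y0, treat, respond, n_perm=999, seed=0):
    """y0: baseline outcomes; treat, respond: 0/1 arrays of the same length."""
    rng = np.random.default_rng(seed)
    keep = respond.astype(bool)                    # restrict to non-attritors

    def ks(t):
        return ks_2samp(y0[keep & (t == 1)], y0[keep & (t == 0)]).statistic

    stat = ks(treat)
    exceed = sum(ks(rng.permutation(treat)) >= stat for _ in range(n_perm))
    return stat, (1 + exceed) / (1 + n_perm)       # statistic and permutation p-value
```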
  6. By: Xuan Liang; Jiti Gao; Xiaodong Gong
    Abstract: This paper develops a time-varying coefficient spatial autoregressive panel data model with individual fixed effects to capture nonlinear effects of the regressors that vary over time. To estimate the model effectively, we propose a method that combines the nonparametric local linear method with the concentrated quasi-maximum likelihood estimation method to obtain consistent estimators for the spatial coefficient and the time-varying coefficient function. The asymptotic properties of these estimators are derived as well, showing the regular sqrt(NT)-rate of convergence for the parametric parameters and the common sqrt(NTh)-rate of convergence for the nonparametric component, respectively. Monte Carlo simulations are conducted to illustrate the finite-sample performance of our proposed method. We then apply our method to Chinese labor productivity, identifying the spatial influences and the time-varying spillover effects among 185 Chinese cities, with a comparison to results for the East China subregion.
    Keywords: concentrated quasi-maximum likelihood estimation, local linear estimation, time–varying coefficient.
    JEL: C21 C23
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:msh:ebswps:2019-26&r=all
  7. By: Jaeheon Jung
    Abstract: This paper deals with time-varying high-dimensional covariance matrix estimation. We propose two covariance matrix estimators corresponding to a time-varying approximate factor model and a time-varying approximate characteristic-based factor model, respectively. The models allow the factor loadings, factor covariance matrix, and error covariance matrix to change smoothly over time. We study the rate of convergence of each estimator. Our simulation and empirical study indicate that time-varying covariance matrix estimators generally perform better than time-invariant covariance matrix estimators. Also, if characteristics are available that genuinely explain true loadings, the characteristics can be used to estimate loadings more precisely in finite samples; their helpfulness increases when loadings change rapidly.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.11965&r=all
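A simplified numpy sketch of the idea in paper 7: kernel-weight observations around a rescaled time t, extract K principal-component factors from the locally weighted covariance, and keep a diagonal residual block. The paper's smoothing scheme and its characteristic-based variant are richer; the bandwidth and K below are arbitrary.

```python
import numpy as np

def tv_factor_cov(R, t, h=0.1, K=3):
    """R: (T, N) returns; t in [0, 1] rescaled time; h: kernel bandwidth."""
    T, N = R.shape
    u = (np.arange(T) / T - t) / h
    w = np.exp(-0.5 * u**2)                     # Gaussian kernel weights in time
    w /= w.sum()
    Rc = R - w @ R                              # locally demeaned returns
    S = (Rc * w[:, None]).T @ Rc                # locally weighted covariance
    vals, vecs = np.linalg.eigh(S)
    vals, vecs = vals[::-1], vecs[:, ::-1]      # eigenvalues in descending order
    common = vecs[:, :K] @ np.diag(vals[:K]) @ vecs[:, :K].T
    return common + np.diag(np.diag(S - common))  # K-factor part + diagonal residual

rng = np.random.default_rng(6)
Sigma_hat = tv_factor_cov(rng.normal(size=(500, 50)), t=0.5)
```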
  8. By: Sven Husmann; Antoniya Shivarova; Rick Steinert
    Abstract: The global minimum-variance portfolio is a typical choice for investors because of its simplicity and broad applicability. Although it requires only one input, namely the covariance matrix of asset returns, estimating the optimal solution remains a challenge. In the presence of high-dimensionality in the data, the sample estimator becomes ill-conditioned, which negates the positive effect of diversification in an out-of-sample setting. To address this issue, we review recent covariance matrix estimators and extend the literature by suggesting a multi-fold cross-validation technique. In detail, conducting an extensive empirical analysis with four datasets based on the S&P 500, we evaluate how the data-driven choice of specific tuning parameters within the proposed cross-validation approach affects the out-of-sample performance of the global minimum-variance portfolio. In particular, for cases in which the efficiency of a covariance estimator is strongly influenced by the choice of a tuning parameter, we detect a clear relationship between the optimality criterion for its selection within the cross-validation and the evaluated performance measure. Finally, we show that using cross-validation can improve the performance of highly efficient estimators even when the data-driven covariance parameter deviates from its theoretically optimal value.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.13960&r=all
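A minimal sketch of the cross-validation idea in paper 8, under assumptions of my own choosing: select the linear-shrinkage intensity alpha for Sigma_hat = (1 - alpha) S + alpha (trace(S)/N) I by K-fold cross-validation, scoring each alpha with the out-of-fold variance of the implied global minimum-variance portfolio. The paper evaluates a wider menu of estimators and tuning criteria.

```python
import numpy as np

def gmv_weights(S):
    """Global minimum-variance weights: S^{-1} 1 / (1' S^{-1} 1)."""
    w = np.linalg.solve(S, np.ones(S.shape[0]))
    return w / w.sum()

def cv_shrinkage(R, alphas, n_folds=5):
    T, N = R.shape
    folds = np.array_split(np.arange(T), n_folds)
    scores = []
    for a in alphas:
        var = 0.0
        for f in folds:
            train = np.setdiff1d(np.arange(T), f)
            S = np.cov(R[train].T)
            target = np.trace(S) / N * np.eye(N)      # scaled-identity target
            var += np.var(R[f] @ gmv_weights((1 - a) * S + a * target))
        scores.append(var / n_folds)                  # out-of-fold portfolio variance
    return alphas[int(np.argmin(scores))]             # tuning parameter chosen by CV

rng = np.random.default_rng(7)
print(cv_shrinkage(rng.normal(size=(250, 40)), alphas=[0.0, 0.25, 0.5, 0.75, 1.0]))
```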
  9. By: Aman Ullah (Department of Economics, University of California Riverside); Yoonseok Lee (Syracuse University); Debasri Mukherjee (Western Michigan University)
    Abstract: This paper considers multivariate local linear least squares estimation of panel data models when fixed effects are present. One-step estimation of the local marginal effect is the main interest. A within-group type nonparametric estimator is developed, where the fixed effects are eliminated by subtracting the individual-specific locally weighted time average (i.e., using the local within transformation). It is shown that the local-within-transformation-based estimator satisfies the standard properties of the local linear estimator. In comparison, nonparametric estimators based on the conventional (i.e., global) within transformation or first differencing are biased, and the bias does not degenerate even in large samples. The new estimator is used to examine the nonlinear relationship between income and the nitrogen-oxide level (i.e., the environmental Kuznets curve) based on US state-level panel data.
    Keywords: Nonparametric estimation, panel data, fixed effects, multivariate, local linear least squares, local within transformation, environmental Kuznets curve.
    JEL: C14 C23
    Date: 2018–09
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201901&r=all
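A stylized one-regressor sketch of the local within transformation in paper 9: at an evaluation point x0, subtract each unit's kernel-weighted time average before running the locally weighted regression, so the fixed effect drops out locally. The paper's estimator is multivariate local linear; this compressed version returns only the local slope (the marginal effect).

```python
import numpy as np

def local_deriv(x, y, x0, h):
    """x, y: (N, T) panel arrays; returns the local slope estimate at x0."""
    K = np.exp(-0.5 * ((x - x0) / h) ** 2)             # kernel weights
    wbar = K / K.sum(axis=1, keepdims=True)            # unit-specific local weights
    xs = x - x0
    xt = xs - (wbar * xs).sum(axis=1, keepdims=True)   # local within transformation
    yt = y - (wbar * y).sum(axis=1, keepdims=True)
    return (K * xt * yt).sum() / (K * xt**2).sum()     # weighted local slope
```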
  10. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Bai Huang (CUFE); Aman Ullah (UCR)
    Abstract: When some of the regressors in a panel data model are correlated with the random individual effects, the random effects (RE) estimator becomes inconsistent while the fixed effects (FE) estimator remains consistent. Depending on the degree of such correlation, we can combine the RE and FE estimators to form a combined estimator which can be better than either of them. In this paper, we are interested in whether the combined estimator may be used to form a combined forecast that improves upon the RE forecast (made using the RE estimator) and the FE forecast (made using the FE estimator) in out-of-sample forecasting. Our simulation experiment shows that the combined forecast does dominate the FE forecast for all degrees of endogeneity in terms of mean squared forecast error (MSFE), demonstrating that the theoretical results on risk dominance for in-sample estimation carry over to out-of-sample forecasting. It also shows that the combined forecast can reduce MSFE relative to the RE forecast for moderate to large degrees of endogeneity and for large degrees of heterogeneity in the individual effects.
    Keywords: Endogeneity, Panel Data, Fixed Effect, Random Effect, Hausman test, Combined Estimator, Combined Forecast.
    JEL: C13 C33 C52
    Date: 2018–12
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201906&r=all
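A stylized simulation of the forecast comparison in paper 10, with hypothetical numbers throughout: the FE forecast is set against a combined forecast that here uses a fixed equal weight and pooled OLS as a crude stand-in for the RE estimator; the paper's weight is instead data-driven via the Hausman statistic (see the shrinkage sketch under paper 3).

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta, rho = 200, 6, 1.0, 0.8                    # rho: endogeneity of alpha_i
alpha = rng.normal(size=N)
x = rho * alpha[:, None] + rng.normal(size=(N, T + 1))
y = alpha[:, None] + beta * x + rng.normal(size=(N, T + 1))
xtr, ytr = x[:, :T], y[:, :T]                         # last period held out

xw = xtr - xtr.mean(axis=1, keepdims=True)            # within transformation
yw = ytr - ytr.mean(axis=1, keepdims=True)
b_fe = (xw * yw).sum() / (xw**2).sum()                # FE (within) estimator
b_po = ((xtr - xtr.mean()) * (ytr - ytr.mean())).sum() / ((xtr - xtr.mean())**2).sum()

a_i = (ytr - b_fe * xtr).mean(axis=1)                 # estimated individual effects
for b, label in [(b_fe, "FE"), (0.5 * b_fe + 0.5 * b_po, "combined (w = 0.5)")]:
    print(label, "MSFE:", np.mean((y[:, T] - (a_i + b * x[:, T])) ** 2))
```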
  11. By: Adrian Pagan (School of Economics, University of Sydney, CAMA, Australian National University); Tim Robinson (Melbourne Institute: Applied Economic & Social Research, The University of Melbourne)
    Abstract: Implications of partial information for applied macroeconomic modelling along four dimensions are shown, and analysis is provided of how they can be addressed. First, when permanent shocks are present, a Vector Error-Correction Model including latent, as well as observed, variables is required to capture macroeconomic dynamics. Second, the assumption in Dynamic Stochastic General Equilibrium models that shocks are autocorrelated provides identifying information usable in Structural Vector AutoRegressions. Third, estimating models with more shocks than observed variables must yield correlated estimated structural shocks. Fourth, including measurement error, as commonly specified, implies a lack of co-integration between variables, even when it is actually present.
    Keywords: SVAR; Partial Information; Identification; Measurement Error; DSGE.
    JEL: E37 C51 C52
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:iae:iaewps:wp2019n12&r=all
  12. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Bai Huang (Central University of Finance and Economics); Aman Ullah (University of California Riverside)
    Abstract: A combined estimator for semiparametric panel data models is proposed. The properties of estimators for semiparametric panel data models with random effects (RE) and fixed effects (FE) are examined. When the RE estimator suffers from endogeneity because the individual effects are correlated with the regressors, the semiparametric RE and FE estimators may be adaptively combined, with the combining weights depending on the degree of endogeneity. The asymptotic distributions of these three estimators (RE, FE, and combined) for semiparametric panel data models are derived using a local asymptotic framework, and the three estimators are then compared in asymptotic risk. The semiparametric combined estimator has strictly smaller asymptotic risk than the semiparametric FE estimator. A Monte Carlo study shows that the semiparametric combined estimator outperforms the semiparametric FE and RE estimators except when the degrees of endogeneity and heterogeneity of the individual effects are very small. Also presented is an empirical application in which the effect of public sector capital in the private economy production function is examined using US state-level panel data.
    Keywords: Endogeneity, Panel Data, Semiparametric FE estimator, Semiparametric RE estimator, Semiparametric Combined Estimator, Local Asymptotics, Hausman Test.
    JEL: C13 C33 C52
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201915&r=all
  13. By: Domenico Delli Gatti; Jakob Grazzini
    Abstract: We propose two novel methods to “bring ABMs to the data”. First, we put forward a new Bayesian procedure to estimate the numerical values of ABM parameters that takes into account the time structure of simulated and observed time series. Second, we propose a method to forecast aggregate time series using data obtained from the simulation of an ABM. We apply our methodological contributions to a medium-scale macro agent-based model. We show that the estimated model is capable of reproducing features of observed data and of forecasting the one-period-ahead output gap and investment with a remarkable degree of accuracy.
    Keywords: agent-based models, estimation, forecasting
    JEL: C11 C13 C53 C63
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:ces:ceswps:_7894&r=all
  14. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Jianghao Chu (UCR); Aman Ullah (UCR)
    Abstract: In this paper we consider the "Regularization of Derivative Expectation Operator" (Rodeo) of Lafferty and Wasserman (2008) and propose a modified Rodeo algorithm for semiparametric single index models in a big data environment with many regressors. The method assumes sparsity, i.e., that many of the regressors are irrelevant. It uses a greedy algorithm: to estimate the semiparametric single index model (SIM) of Ichimura (1993), all regressor coefficients start near zero, and we then iteratively test whether the derivative of the regression function estimator with respect to each coefficient is significantly different from zero. The basic idea of the modified Rodeo algorithm for SIM (called SIM-Rodeo) is to view local bandwidth selection as a variable selection scheme which amplifies the coefficients of relevant variables while keeping the coefficients of irrelevant variables relatively small or at their initial values near zero. For sparse semiparametric single index models, the SIM-Rodeo algorithm is shown to attain consistency in variable selection. In addition, the greedy steps are fast to compute. We compare SIM-Rodeo with the SIM-Lasso method of Zeng et al. (2012). Our simulation results demonstrate that the proposed SIM-Rodeo method is consistent for variable selection and has smaller integrated mean squared errors than SIM-Lasso.
    Keywords: Single index model (SIM), Variable selection, Rodeo, SIM-Rodeo, Lasso, SIM-Lasso.
    JEL: C25 C44 C53 C55
    Date: 2018–09
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201908&r=all
  15. By: Ruoyao Shi (Department of Economics, University of California Riverside); Cheng Chou (University of Leicester)
    Abstract: It has been widely acknowledged that the measurement of labor supply in the Current Population Survey (CPS) and other conventional microeconomic surveys suffers from nonclassical measurement error, which biases the estimates of crucial parameters in labor economics, such as the labor supply elasticity. Time diary studies, such as the American Time Use Survey (ATUS), have accurate measurements of hours worked only on a single day, so weekly hours worked are unobserved. Despite this missing data problem, we provide several consistent estimators of the parameters of the weekly labor supply equation using the information in the time use surveys. The consistency of our estimators requires no conditions beyond those for the usual two-stage least squares (2SLS) estimator when true weekly hours worked are observed. We also show that it is impossible to recover the weekly number of hours worked or its distribution function from time use surveys like the ATUS. In our empirical application we find considerable evidence of nonclassical measurement error in hours worked in the CPS, and we illustrate the consequences of using mismeasured weekly hours worked in empirical studies.
    Keywords: measurement error, missing data, instrumental variable, asymptotic efficiency, labor supply
    JEL: C13 C21 C26 C81 J22
    Date: 2019–01
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201912&r=all
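A stylized Monte Carlo of the measurement-error point in paper 15, with hypothetical numbers and functional forms: a noisy but mean-independent (classical) proxy for weekly hours, such as a randomly sampled diary day scaled up, leaves the just-identified 2SLS slope consistent, whereas nonclassical, mean-reverting error in recalled hours attenuates it.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
z = rng.normal(size=n)                                # instrument for the wage
wage = 0.8 * z + rng.normal(size=n)
hours = 40 + 5 * wage + rng.normal(scale=3, size=n)   # true weekly hours

def iv_slope(y, xvar, zvar):
    """Just-identified 2SLS slope: cov(z, y) / cov(z, x)."""
    return np.cov(zvar, y)[0, 1] / np.cov(zvar, xvar)[0, 1]

diary = hours + rng.normal(scale=8, size=n)           # classical (diary-style) error
recall = 40 + 0.6 * (hours - 40) + rng.normal(scale=4, size=n)  # mean-reverting recall
print("true slope        :", 5)
print("2SLS, diary proxy :", iv_slope(diary, wage, z))    # near 5
print("2SLS, recall hours:", iv_slope(recall, wage, z))   # near 3 (biased)
```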
  16. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Aman Ullah (University of California, Riverside); Ran Wang (University of California, Riverside)
    Abstract: Bootstrap Aggregating (Bagging) is an ensemble technique for improving the robustness of forecasts. Random Forest is a successful method based on Bagging and Decision Trees. In this chapter, we explore Bagging, Random Forest, and their variants in various aspects of theory and practice. We also discuss applications based on these methods in economic forecasting and inference.
    Keywords: bagging, decision trees, random forests, forecasting
    JEL: C2 C3 C4 C5
    Date: 2019–07
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201918&r=all
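A short hedged example for chapter 16 using scikit-learn: Bagging averages trees grown on bootstrap resamples, while Random Forest additionally subsamples candidate features at each split to decorrelate the trees. The dataset and settings are illustrative.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=1000, n_features=10, noise=1.0, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

models = [BaggingRegressor(n_estimators=200, random_state=0),   # bagged trees
          RandomForestRegressor(n_estimators=200, max_features=1/3,
                                random_state=0)]                # + feature subsampling
for model in models:
    model.fit(Xtr, ytr)
    print(type(model).__name__, mean_squared_error(yte, model.predict(Xte)))
```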
  17. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Jianghao Chu (UCR); Aman Ullah (UCR)
    Abstract: Freund and Schapire (1997) introduced "Discrete AdaBoost" (DAB), which has been mysteriously effective for high-dimensional binary classification and binary prediction. In an effort to understand the myth, Friedman, Hastie and Tibshirani (FHT, 2000) show that DAB can be understood as statistical learning which builds an additive logistic regression model via Newton-like updating minimization of the exponential loss. From this statistical point of view, FHT proposed three modifications of DAB, namely Real AdaBoost (RAB), LogitBoost (LB), and Gentle AdaBoost (GAB). DAB, RAB, LB, and GAB all solve for the logistic regression via different algorithmic designs and different objective functions. The RAB algorithm uses class probability estimates to construct real-valued contributions of the weak learner, LB is an adaptive Newton algorithm by stagewise optimization of the Bernoulli likelihood, and GAB is an adaptive Newton algorithm via stagewise optimization of the exponential loss. The same authors published an influential textbook, The Elements of Statistical Learning (ESL, 2001 and 2008); a companion book, An Introduction to Statistical Learning (ISL) by James et al. (2013), was published with applications in R. However, neither ESL nor ISL (e.g., Sections 4.5 and 4.6) covers these four AdaBoost algorithms, although FHT provided some simulation and empirical studies comparing the methods. Given numerous potential applications, we believe it would be useful to collect the R libraries of these AdaBoost algorithms, as well as more recently developed extensions to AdaBoost for probability prediction, with examples and illustrations. The goal of this chapter is therefore (i) to provide a user guide to these alternative AdaBoost algorithms with step-by-step tutorials in R (in a way similar to ISL, e.g., Section 4.6), (ii) to compare AdaBoost with alternative machine learning classification tools such as the deep neural network (DNN), logistic regression with LASSO, and SIM-RODEO, and (iii) to demonstrate empirical applications in economics, such as prediction of business cycle turning points and directional prediction of stock price indexes. We revisit Ng (2014), who used DAB for prediction of business cycle turning points, by comparing the results from RAB, LB, GAB, DNN, logistic regression, and SIM-RODEO.
    Keywords: AdaBoost, R, Binary classification, Logistic regression, DAB, RAB, LB, GAB, DNN
    Date: 2018–07
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201907&r=all
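A compact from-scratch Discrete AdaBoost in the FHT notation referenced by chapter 17: labels in {-1, +1}, decision stumps as weak learners, classifier weight alpha_m = log((1 - err_m)/err_m), and final prediction sign(sum_m alpha_m G_m(x)). The chapter's own tutorials are in R; this Python sketch is only illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def discrete_adaboost(X, y, M=100):
    """y must be coded in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1 / n)
    stumps, alphas = [], []
    for _ in range(M):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        miss = stump.predict(X) != y
        err = np.clip(w[miss].sum() / w.sum(), 1e-10, 1 - 1e-10)
        alpha = np.log((1 - err) / err)            # FHT (2000) classifier weight
        w = w * np.exp(alpha * miss)               # up-weight misclassified points
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))

X, y = make_classification(n_samples=500, random_state=0)
y = 2 * y - 1                                      # recode labels to {-1, +1}
stumps, alphas = discrete_adaboost(X, y)
print("training accuracy:", (adaboost_predict(stumps, alphas, X) == y).mean())
```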
  18. By: Halbleib, Roxana; Dimitriadis, Timo
    JEL: C1 C4 C5
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:zbw:vfsc19:203669&r=all
  19. By: Krikamol Muandet; Arash Mehrjou; Si Kai Lee; Anant Raj
    Abstract: We present a novel single-stage procedure for instrumental variable (IV) regression called DualIV which simplifies traditional two-stage regression via a dual formulation. We show that the common two-stage procedure can alternatively be solved via generalized least squares. Our formulation circumvents the first-stage regression, which can be a bottleneck in modern two-stage procedures for IV regression. We also show that our framework is closely related to the generalized method of moments (GMM) under specific assumptions. This highlights the fundamental connection between GMM and two-stage procedures in the IV literature. Using the proposed framework, we develop a simple kernel-based algorithm with consistency guarantees. Lastly, we give empirical results illustrating the advantages of our method over existing two-stage algorithms.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.12358&r=all
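A numerical check of the GMM connection highlighted in paper 19's abstract (a textbook identity, not the DualIV method itself): 2SLS coincides with linear GMM on the moments E[z(y - x'b)] = 0 when the weighting matrix is (Z'Z)^{-1}. The data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])     # instruments
X = np.column_stack([np.ones(n), Z[:, 1:] @ [0.7, 0.3] + rng.normal(size=n)])
y = X @ [1.0, 2.0] + rng.normal(size=n)

P = Z @ np.linalg.solve(Z.T @ Z, Z.T)                          # projection onto Z
b_2sls = np.linalg.solve(X.T @ P @ X, X.T @ P @ y)

W = np.linalg.inv(Z.T @ Z)                                     # GMM weighting matrix
b_gmm = np.linalg.solve(X.T @ Z @ W @ Z.T @ X, X.T @ Z @ W @ Z.T @ y)
print(np.allclose(b_2sls, b_gmm))                              # True: 2SLS = GMM
```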
  20. By: Sergei Seleznev (Bank of Russia, Russian Federation)
    Abstract: We construct priors for the tempered hierarchical Dirichlet process vector autoregression model (tHDP-VAR) that in practice do not lead to explosive forecasting dynamics. Additionally, we show that tHDP-VAR and its variational Bayesian approximation with heuristics demonstrate competitive or even better forecasting performance on US and Russian datasets.
    Keywords: Bayesian nonparametrics, forecasting, hierarchical Dirichlet process, infinite hidden Markov model.
    JEL: C11 C32 C53 E37
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:bkr:wpaper:wps47&r=all
  21. By: Tae-Hwy Lee (Department of Economics, University of California Riverside); Jianghao Chu (University of California, Riverside); Aman Ullah (University of California, Riverside); Ran Wang (University of California, Riverside)
    Abstract: In the era of Big Data, selecting relevant variables from a potentially large pool of candidates has become a newly emerged concern in macroeconomic research, especially when the available data are high-dimensional, i.e., the number of explanatory variables (p) exceeds the sample size (n). Common approaches include factor models, principal component analysis, and regularized regressions. However, these methods require additional assumptions that are hard to verify and/or introduce biases or aggregated factors which complicate the interpretation of the estimated output. This chapter reviews an alternative solution, namely Boosting, which is able to estimate the variables of interest consistently under fairly general conditions given a large set of explanatory variables. Boosting is fast and easy to implement, which makes it one of the most popular machine learning algorithms in academia and industry.
    Keywords: Boosting, AdaBoost, Gradient Boosting, Functional Gradient Descent, Decision Tree, Shrinkage
    JEL: C2 C3 C4 C5
    Date: 2019–05
    URL: http://d.repec.org/n?u=RePEc:ucr:wpaper:201917&r=all
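A minimal componentwise L2 boosting sketch, the kind of variable-selection boosting chapter 21 reviews for p > n settings: each step refits the current residuals on the single best predictor, scaled by a small learning rate, so many coefficients remain exactly zero. The simulated design is hypothetical.

```python
import numpy as np

def l2_boost(X, y, steps=200, nu=0.1):
    n, p = X.shape
    Xs = (X - X.mean(0)) / X.std(0)            # standardize predictors
    beta = np.zeros(p)
    r = y - y.mean()                           # start from the mean model
    for _ in range(steps):
        b = Xs.T @ r / n                       # componentwise OLS coefficients
        j = np.argmax(np.abs(b))               # best-fitting single predictor
        beta[j] += nu * b[j]                   # small step on that coordinate
        r -= nu * b[j] * Xs[:, j]              # update residuals
    return beta

rng = np.random.default_rng(5)
n, p = 100, 500                                # p > n
X = rng.normal(size=(n, p))
y = X[:, :3] @ [3.0, -2.0, 1.5] + rng.normal(size=n)
print(np.nonzero(l2_boost(X, y))[0][:10])      # indices of selected variables
```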
  22. By: Timo Dimitriadis; Andrew J. Patton; Patrick Schmidt
    Abstract: Rational respondents to economic surveys may report as a point forecast any measure of the central tendency of their (possibly latent) predictive distribution, for example the mean, median, mode, or any convex combination thereof. We propose tests of forecast rationality when the measure of central tendency used by the respondent is unknown. These tests require us to overcome an identification problem when the measures of central tendency are equal or in a local neighborhood of each other, as is the case for (exactly or nearly) symmetric and unimodal distributions. As a building block, we also present novel tests for the rationality of mode forecasts. We apply our tests to survey forecasts of individual income, Greenbook forecasts of U.S. GDP, and random walk forecasts for exchange rates. We find that the Greenbook and random walk forecasts are best rationalized as mean, or near-mean forecasts, while the income survey forecasts are best rationalized as mode forecasts.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.12545&r=all
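For context on paper 22, a classical building block rather than the paper's identification-robust procedure: the Mincer-Zarnowitz regression y_t = a + b f_t + e_t tests rationality of a mean forecast via the joint hypothesis a = 0, b = 1. The sketch uses a simple homoskedastic Wald statistic; serially correlated multi-step forecast errors would call for HAC standard errors.

```python
import numpy as np
from scipy.stats import chi2

def mz_test(y, f):
    """Mincer-Zarnowitz Wald test of (intercept, slope) = (0, 1)."""
    n = len(y)
    X = np.column_stack([np.ones(n), f])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    V = np.linalg.inv(X.T @ X) * (e @ e) / (n - 2)   # homoskedastic covariance
    d = beta - np.array([0.0, 1.0])
    wald = float(d @ np.linalg.solve(V, d))
    return wald, chi2.sf(wald, df=2)

rng = np.random.default_rng(8)
f = rng.normal(size=300)
y = f + rng.normal(scale=0.5, size=300)              # a rational mean forecast
print(mz_test(y, f))                                 # large p-value expected
```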
  23. By: Sven Husmann; Antoniya Shivarova; Rick Steinert
    Abstract: The popularity of modern portfolio theory has decreased among practitioners because of its unfavorable out-of-sample performance. Estimation errors tend to affect the optimal weight calculation noticeably, especially when a large number of assets is considered. To overcome these issues, many methods have been proposed in recent years, although most only address a small set of practically relevant questions related to portfolio allocation. This study therefore sheds light on different covariance estimation techniques, combines them with sparse model approaches, and includes a turnover constraint that induces stability. We use two datasets - comprising 319 and 100 companies of the S&P 500, respectively - to create a realistic and reproducible data foundation for our empirical study. To the best of our knowledge, this study is the first to show that it is possible to maintain the low-risk profile of efficient estimation methods while automatically selecting only a subset of assets and further inducing low portfolio turnover. Moreover, we provide evidence that using the LASSO as the sparsity-generating model is insufficient to lower turnover when the involved tuning parameter can change over time.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.11840&r=all
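Closed-form background for the portfolio exercises in papers 8 and 23: the global minimum-variance weights w = S^{-1} 1 / (1' S^{-1} 1) and the standard turnover measure between consecutive weight vectors. The sparsity (LASSO) and turnover devices studied in the paper enter as additional penalties or constraints on this baseline problem.

```python
import numpy as np

def gmv_weights(S):
    """Global minimum-variance weights from a covariance matrix S."""
    w = np.linalg.solve(S, np.ones(S.shape[0]))
    return w / w.sum()

def turnover(w_new, w_old):
    """Sum of absolute weight changes between rebalancing dates."""
    return np.abs(w_new - w_old).sum()

rng = np.random.default_rng(9)
R = rng.normal(size=(500, 20))                       # toy return panel
w = gmv_weights(np.cov(R.T))
print(w.sum(), turnover(w, np.full(20, 1 / 20)))     # 1.0, and distance from 1/N
```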

This nep-ecm issue is ©2019 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.