nep-ecm New Economics Papers
on Econometrics
Issue of 2021‒12‒13
fifteen papers chosen by
Sune Karlsson
Örebro universitet

  1. Bayesian Approaches to Shrinkage and Sparse Estimation By Dimitris Korobilis; Kenichi Shimizu
  2. Maximum Likelihood Estimation of Differentiated Products Demand Systems By Greg Lewis; Bora Ozaltun; Georgios Zervas
  3. Testing a Constant Mean Function Using Functional Regression By JIN SEO CHO; MENG HUANG; HALBERT WHITE
  4. Non-standard errors By Albert J. Menkveld; Anna Dreber; Felix Holzmeister; Juergen Huber; Magnus Johannesson; Michael Kirchler; Sebastian Neussüs; Michael Razen; Utz Weitzel; Christian T. Brownlees; Javier Gil-Bazo; et al.
  5. Online Estimation and Optimization of Utility-Based Shortfall Risk By Arvind S. Menon; Prashanth L. A.; Krishna Jagannathan
  6. Interactive Effects Panel Data Models with General Factors and Regressors By Bin Peng; Liangjun Su; Joakim Westerlund; Yanrong Yang
  7. Machine Learning, Behavioral Targeting and Regression Discontinuity Designs By Narayanan, Sridhar; Kalyanam, Kirthi
  8. The Performance of Recent Methods for Estimating Skill Prices in Panel Data By Michael J. Böhm; Hans-Martin von Gaudecker
  9. On Recoding Ordered Treatments as Binary Indicators By Evan K. Rose; Yotam Shem-Tov
  10. Is the empirical out-of-sample variance an informative risk measure for the high-dimensional portfolios? By Taras Bodnar; Nestor Parolya; Erik Thorsén
  11. A Universal End-to-End Approach to Portfolio Optimization via Deep Learning By Chao Zhang; Zihao Zhang; Mihai Cucuringu; Stefan Zohren
  12. Deep Structural Estimation: With an Application to Option Pricing By Hui Chen; Antoine Didisheim; Simon Scheidegger
  13. Non-asymptotic estimation of risk measures using stochastic gradient Langevin dynamics By Jiarui Chu; Ludovic Tangpi
  14. Semi-nonparametric Estimation of Operational Risk Capital with Extreme Loss Events By Heng Z. Chen; Stephen R. Cosslett
  15. Optimal Model Selection in Contextual Bandits with Many Classes via Offline Oracles By Krishnamurthy, Sanath Kumar; Athey, Susan

  1. By: Dimitris Korobilis; Kenichi Shimizu
    Abstract: In all areas of human knowledge, datasets are increasing in both size and complexity, creating the need for richer statistical models. This trend is also true for economic data, where high-dimensional and nonlinear/nonparametric inference is the norm in several fields of applied econometric work. The purpose of this paper is to introduce the reader to the world of Bayesian model determination, by surveying modern shrinkage and variable selection algorithms and methodologies. Bayesian inference is a natural probabilistic framework for quantifying uncertainty and learning about model parameters, and this feature is particularly important for inference in modern models of high dimensions and increased complexity. We begin with a linear regression setting in order to introduce various classes of priors that lead to shrinkage/sparse estimators of comparable value to popular penalized likelihood estimators (e.g. ridge, lasso). We explore various methods of exact and approximate inference, and discuss their pros and cons. Finally, we explore how priors developed for the simple regression setting can be extended in a straightforward way to various classes of interesting econometric models. In particular, the following case studies are considered, demonstrating the application of Bayesian shrinkage and variable selection strategies in popular econometric contexts: i) vector autoregressive models; ii) factor models; iii) time-varying parameter regressions; iv) confounder selection in treatment effects models; and v) quantile regression models. A MATLAB package and an accompanying technical manual allow the reader to replicate many of the algorithms described in this review.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:gla:glaewp:2021_19&r=
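    A minimal sketch of the ridge-type shrinkage that the survey uses as its starting point (an illustration only, not code from the accompanying MATLAB package): under a Gaussian linear regression with an independent N(0, tau2) prior on each coefficient, the posterior mean coincides with the ridge estimator.

      import numpy as np

      def bayes_ridge_posterior_mean(X, y, sigma2=1.0, tau2=1.0):
          """Posterior mean of beta under y ~ N(X beta, sigma2*I), beta ~ N(0, tau2*I).

          Equivalent to ridge regression with penalty sigma2 / tau2.
          """
          k = X.shape[1]
          A = X.T @ X / sigma2 + np.eye(k) / tau2   # posterior precision matrix
          b = X.T @ y / sigma2                      # precision-weighted data term
          return np.linalg.solve(A, b)

      # Toy usage: shrinkage pulls noisy coefficient estimates towards zero.
      rng = np.random.default_rng(0)
      X = rng.standard_normal((50, 10))
      beta = np.r_[np.ones(3), np.zeros(7)]         # sparse truth
      y = X @ beta + rng.standard_normal(50)
      print(bayes_ridge_posterior_mean(X, y, tau2=0.5))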
  2. By: Greg Lewis; Bora Ozaltun; Georgios Zervas
    Abstract: We discuss estimation of the differentiated products demand system of Berry et al. (1995) (BLP) by maximum likelihood estimation (MLE). We derive the maximum likelihood estimator in the case where prices are endogenously generated by firms that set prices in Bertrand-Nash equilibrium. In Monte Carlo simulations the ML estimator outperforms the best-practice GMM estimator on both bias and mean squared error when the model is correctly specified. This remains true under some forms of misspecification. In our simulations, the coverage of the ML estimator is close to its nominal level, whereas the GMM estimator tends to under-cover. We conclude the paper by estimating BLP on the car data used in the original Berry et al. (1995) paper, obtaining similar estimates with considerably tighter standard errors.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12397&r=
  3. By: JIN SEO CHO (Yonsei Univ); MENG HUANG (PNC); HALBERT WHITE (University of California)
    Abstract: In this paper, we study the functional ordinary least squares estimator and its properties in testing the hypothesis of a constant zero mean function or an unknown constant non-zero mean function. We exploit the recent work by Cho, Phillips, and Seo (2021) and show that the associated Wald test statistics have standard chi-square limiting null distributions, standard non-central chi-square distributions for local alternatives converging to zero at a √n rate, and are consistent against global alternatives. These properties permit computationally convenient tests of hypotheses involving nuisance parameters. In particular, we develop new alternatives to tests for regression misspecification that involve nuisance parameters identified only under the alternative. In Monte Carlo studies, we find that our tests have well-behaved levels. We also find that functional ordinary least squares tests can have better power than existing methods that do not exploit this covariance structure, such as the specification testing procedures of Bierens (1982, 1990) or Stinchcombe and White (1998). Finally, we apply our methodology to the probit models of voter turnout estimated by Wolfinger and Rosenstone (1980) and Nagler (1991) and test whether these models are correctly specified.
    Keywords: Davies Test; Functional Data; Misspecification; Nuisance Parameters; Wald Test; Voting Turnout.
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:yon:wpaper:2021rwp-190&r=
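    For orientation, the limiting behaviour described in the abstract is that of a standard Wald statistic; in generic notation (not the paper's exact functional setting), a hypothesis $R\theta_0 = r$ with $q$ restrictions is tested with

      $$ W_n = n\,(R\hat\theta_n - r)^\top \big[ R \hat V_n R^\top \big]^{-1} (R\hat\theta_n - r) \;\xrightarrow{d}\; \chi^2_q \quad \text{under } H_0, $$

    while under local alternatives $\theta_n = \theta_0 + \delta/\sqrt{n}$ the limit is a non-central $\chi^2_q$ with non-centrality parameter $(R\delta)^\top [R V R^\top]^{-1} (R\delta)$.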
  4. By: Albert J. Menkveld; Anna Dreber; Felix Holzmeister; Juergen Huber; Magnus Johannesson; Michael Kirchler; Sebastian Neussüs; Michael Razen; Utz Weitzel; Christian T. Brownlees; Javier Gil-Bazo; et al.
    Abstract: In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in sample estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors. To study them, we let 164 teams test six hypotheses on the same sample. We find that non-standard errors are sizeable, on par with standard errors. Their size (i) co-varies only weakly with team merits, reproducibility, or peer rating, (ii) declines significantly after peer feedback, and (iii) is underestimated by participants.
    Keywords: non-standard errors, multi-analyst approach, liquidity
    JEL: C12 C18 G1 G14
    Date: 2021–12
    URL: http://d.repec.org/n?u=RePEc:upf:upfgen:1807&r=
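    A stylized reading of the headline quantity (an illustrative sketch, not the authors' code): if each team reports a point estimate and a standard error for the same hypothesis on the same sample, the non-standard error can be summarized as the dispersion of the point estimates across teams, to be compared with a typical reported standard error.

      import numpy as np

      def non_standard_error(estimates, standard_errors):
          """Across-team dispersion of point estimates vs. a typical standard error.

          Both inputs are arrays with one entry per research team.
          """
          nse = np.std(estimates, ddof=1)          # dispersion induced by the evidence-generating process
          typical_se = np.median(standard_errors)  # within-team sampling uncertainty
          return nse, typical_se

      # Toy example in which the two sources of uncertainty are of similar size.
      rng = np.random.default_rng(1)
      b = 0.2 + rng.normal(0.0, 0.05, size=164)    # hypothetical team estimates
      se = np.full(164, 0.05)                      # hypothetical reported standard errors
      print(non_standard_error(b, se))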
  5. By: Arvind S. Menon; Prashanth L. A.; Krishna Jagannathan
    Abstract: Utility-Based Shortfall Risk (UBSR) is a risk metric that is increasingly popular in financial applications, owing to certain desirable properties that it enjoys. We consider the problem of estimating UBSR in a recursive setting, where samples from the underlying loss distribution are available one at a time. We cast the UBSR estimation problem as a root-finding problem, and propose stochastic approximation-based estimation schemes. We derive non-asymptotic bounds on the estimation error in terms of the number of samples. We also consider the problem of UBSR optimization within a parameterized class of random variables. We propose a stochastic gradient descent-based algorithm for UBSR optimization, and derive non-asymptotic bounds on its convergence.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.08805&r=
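    A minimal sketch of the kind of recursive, one-sample-at-a-time scheme described in the abstract (an illustration under one common UBSR convention, SR_lambda(X) = inf{t : E[loss(-X - t)] <= lambda}, not the authors' exact algorithm or step sizes):

      import numpy as np

      def ubsr_online(sample_stream, loss=np.exp, lam=1.0, t0=0.0):
          """Recursive UBSR estimate: find t with E[loss(-X - t)] = lam (root finding).

          Each incoming sample x updates t with a Robbins-Monro step a_k = 1/k.
          """
          t = t0
          for k, x in enumerate(sample_stream, start=1):
              a_k = 1.0 / k
              t += a_k * (loss(-x - t) - lam)   # raise t while the expected loss exceeds lam
          return t

      # Toy usage on i.i.d. samples; for N(0,1) and the exponential loss with lam = 1 the root is t = 0.5.
      rng = np.random.default_rng(2)
      print(ubsr_online(rng.normal(0.0, 1.0, size=100_000)))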
  6. By: Bin Peng; Liangjun Su; Joakim Westerlund; Yanrong Yang
    Abstract: This paper considers a model with general regressors and unobservable factors. An estimator based on iterated principal components is proposed, which is shown to be not only asymptotically normal and oracle efficient, but under certain conditions also free of the otherwise common asymptotic incidental parameter bias. Interestingly, the conditions required to achieve unbiasedness become weaker the stronger the trends in the factors are, and if the trending is strong enough, unbiasedness comes at no cost at all. In particular, the approach does not require any knowledge of how many factors there are, or whether they are deterministic or stochastic. The order of integration of the factors is also treated as unknown, as is the order of integration of the regressors, which means that there is no need to pre-test for unit roots, or to decide on which deterministic terms to include in the model.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.11506&r=
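    As background for the estimator's main ingredient (an illustrative sketch of a plain Bai (2009)-type interactive fixed effects iteration, far simpler than the general setting with unknown numbers of factors, trends and unit roots treated in the paper):

      import numpy as np

      def iterated_principal_components(Y, X, r, n_iter=100):
          """Estimate Y[i,t] = X[i,t,:] @ beta + lam_i' f_t + e[i,t] with r factors.

          Alternates between (i) least squares for beta given the factor structure and
          (ii) principal components of Y - X @ beta for the factors and loadings.
          """
          N, T, k = X.shape
          Xmat = X.reshape(N * T, k)
          beta = np.linalg.lstsq(Xmat, Y.reshape(-1), rcond=None)[0]   # pooled OLS start
          for _ in range(n_iter):
              U, s, Vt = np.linalg.svd(Y - X @ beta, full_matrices=False)
              loadings = U[:, :r] * s[:r]                              # N x r
              factors = Vt[:r, :].T                                    # T x r
              defactored = (Y - loadings @ factors.T).reshape(-1)
              beta = np.linalg.lstsq(Xmat, defactored, rcond=None)[0]
          return beta, factors, loadings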
  7. By: Narayanan, Sridhar (Stanford University); Kalyanam, Kirthi (Santa Clara University)
    Abstract: The availability of behavioral data on customers and advances in machine learning methods have enabled scoring and targeting of customers in a variety of domains, including pricing, advertising, recommendation and personal selling. Typically, such targeting involves first training a machine learning algorithm on a training dataset, using that algorithm to score current or potential customers, and, when the score crosses a threshold, assigning a treatment such as an offer, an advertisement or a recommendation. In this paper, we highlight regression discontinuity designs (RDD) as a low-cost alternative for obtaining causal estimates in settings where machine learning is used for behavioral targeting. Our investigation leads to several new insights. Under appropriate conditions, RDD recovers the local average treatment effect (LATE). Further, we show that RDD recovers the average treatment effect (ATE) when (1) the score is orthogonal to the slope of the treatment and (2) the selection threshold is equal to the mean value of the score. We also show that RDD can estimate bounds on the ATE even when we are unable to obtain point estimates of the ATE. That RDD can estimate the ATE or bounds on the ATE is a novel perspective that has been understudied in the literature. We also distinguish between two types of scoring, intercept-based versus slope-based, and highlight the practical value of RDD in each context. Finally, we apply RDD in an empirical context where a machine learning based score was used to select consumers for retargeted display advertising. We obtain LATE estimates of the impact of the retargeted advertising program on both online and offline purchases, and also estimate bounds on the ATE. Our LATE estimates and ATE bounds add to the understanding of the effectiveness of retargeting programs, in particular on offline purchases, which have received less attention.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:ecl:stabus:3992&r=
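    A minimal sketch of the sharp RDD estimate at a machine-learning score threshold (an illustration of the standard local-linear estimator, not the authors' full procedure, which also derives conditions and bounds for the ATE):

      import numpy as np

      def sharp_rdd_effect(score, outcome, cutoff, bandwidth):
          """Jump in E[outcome | score] at the cutoff, i.e. the sharp-RDD treatment effect.

          Fits separate linear regressions within `bandwidth` on each side of the
          cutoff and returns the difference of the fitted values at the cutoff.
          """
          x, y = np.asarray(score) - cutoff, np.asarray(outcome)
          def fit_at_cutoff(mask):
              A = np.column_stack([np.ones(mask.sum()), x[mask]])
              coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
              return coef[0]                       # intercept = fitted value at the cutoff
          left = (x < 0) & (x >= -bandwidth)
          right = (x >= 0) & (x <= bandwidth)
          return fit_at_cutoff(right) - fit_at_cutoff(left)

      # Toy usage: customers with a score above 0.7 are selected for retargeting.
      rng = np.random.default_rng(3)
      s = rng.uniform(0.0, 1.0, 5000)
      y = 1.0 + 0.5 * s + 0.3 * (s >= 0.7) + rng.normal(0.0, 0.2, 5000)
      print(sharp_rdd_effect(s, y, cutoff=0.7, bandwidth=0.1))    # close to the true jump of 0.3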
  8. By: Michael J. Böhm; Hans-Martin von Gaudecker
    Abstract: This paper explores different methods to estimate prices paid per efficiency unit of labor in panel data. We study the sensitivity of skill price estimates to different assumptions regarding workers' choice problem, identification strategies, the number of occupations considered, skill accumulation processes, and estimation strategies. In order to do so, we conduct careful Monte Carlo experiments designed to replicate key features of German panel data. We find that once skill accumulation is appropriately modelled, skill price estimates are generally robust to modelling choices when the number of occupations is small, i.e., when switches between occupations are rare. When switching is important, subtle issues emerge and the performance of different methods varies more strongly.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12459&r=
  9. By: Evan K. Rose; Yotam Shem-Tov
    Abstract: Researchers using instrumental variables to investigate the effects of ordered treatments (e.g., years of education, months of healthcare coverage) often recode treatment into a binary indicator for any exposure (e.g., any college, any healthcare coverage). The resulting estimand is difficult to interpret unless the instruments only shift compliers from no treatment to some positive quantity and not from some treatment to more -- i.e., there are extensive margin compliers only (EMCO). When EMCO holds, recoded endogenous variables capture a weighted average of treatment effects across complier groups that can be partially unbundled into each group's treated and untreated means. Invoking EMCO along with the standard Local Average Treatment Effect assumptions is equivalent to assuming choices are determined by a simple two-factor selection model in which agents first decide whether to participate in treatment at all and then decide how much. The instruments must only impact relative utility in the first step. Although EMCO constrains unobserved counterfactual choices, it places testable restrictions on the joint distribution of outcomes, treatments, and instruments.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12258&r=
  10. By: Taras Bodnar; Nestor Parolya; Erik Thorsén
    Abstract: The main contribution of this paper is the derivation of the asymptotic behaviour of the out-of-sample variance, the out-of-sample relative loss, and of their empirical counterparts in the high-dimensional setting, i.e., when both ratios $p/n$ and $p/m$ tend to some positive constants as $m\to\infty$ and $n\to\infty$, where $p$ is the portfolio dimension, while $n$ and $m$ are the sample sizes from the in-sample and out-of-sample periods, respectively. The results are obtained for the traditional estimator of the global minimum variance (GMV) portfolio, for the two shrinkage estimators introduced by Frahm and Memmel (2010) and Bodnar et al. (2018), and for the equally-weighted portfolio, which is used as a target portfolio in the specification of the two considered shrinkage estimators. We show that the behaviour of the empirical out-of-sample variance may be misleading in many practical situations. On the other hand, this will never happen with the empirical out-of-sample relative loss, which seems to provide a natural normalization of the out-of-sample variance in the high-dimensional setup. As a result, an important question arises as to whether this risk measure can safely be used in practice for portfolios constructed from a large asset universe.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12532&r=
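    The objects under study can be illustrated in their simplest form (a sketch of the traditional GMV estimator and the empirical out-of-sample variance only; the paper's shrinkage estimators and asymptotic results are not reproduced here):

      import numpy as np

      def gmv_weights(returns_in):
          """Traditional GMV weights w = S^{-1} 1 / (1' S^{-1} 1) from in-sample returns."""
          S = np.cov(returns_in, rowvar=False)
          w = np.linalg.solve(S, np.ones(S.shape[0]))
          return w / w.sum()

      def empirical_out_of_sample_variance(w, returns_out):
          """Empirical out-of-sample variance w' S_out w over the out-of-sample period."""
          S_out = np.cov(returns_out, rowvar=False)
          return float(w @ S_out @ w)

      # Toy usage with p/n and p/m far from zero, the regime studied in the paper.
      rng = np.random.default_rng(4)
      p, n, m = 100, 250, 250
      r_in, r_out = rng.standard_normal((n, p)), rng.standard_normal((m, p))
      w = gmv_weights(r_in)
      print(empirical_out_of_sample_variance(w, r_out))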
  11. By: Chao Zhang; Zihao Zhang; Mihai Cucuringu; Stefan Zohren
    Abstract: We propose a universal end-to-end framework for portfolio optimization in which asset distributions are obtained directly. The designed framework circumvents the traditional forecasting step and avoids the estimation of the covariance matrix, lifting the bottleneck for generalizing to a large number of instruments. Our framework has the flexibility of optimizing various objective functions, including the Sharpe ratio and the mean-variance trade-off. Further, we allow for short selling and study several constraints attached to the objective functions; in particular, we consider cardinality, maximum position per instrument, and leverage. These constraints are incorporated into the objective functions via dedicated neural layers, so that gradient ascent can be used for optimization. To ensure the robustness of our framework, we test our methods on two datasets. First, we look at a synthetic dataset where we demonstrate that the weights obtained from our end-to-end approach outperform those from classical predictive methods. Second, we apply our framework to a real-life dataset with historical observations of hundreds of instruments and a testing period of more than 20 years.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.09170&r=
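    A stripped-down sketch of the end-to-end idea (an illustration only: long-only softmax weights and a Sharpe-ratio objective, without the short selling, cardinality and leverage constraints handled in the paper; assumes PyTorch is available):

      import torch

      class EndToEndAllocator(torch.nn.Module):
          """Maps asset features directly to long-only portfolio weights via softmax."""
          def __init__(self, n_features, n_assets):
              super().__init__()
              self.linear = torch.nn.Linear(n_features, n_assets)

          def forward(self, features):
              return torch.softmax(self.linear(features), dim=-1)

      def train_sharpe(model, features, next_returns, epochs=200, lr=1e-2):
          """Gradient ascent on the in-sample Sharpe ratio of the resulting portfolio."""
          opt = torch.optim.Adam(model.parameters(), lr=lr)
          for _ in range(epochs):
              w = model(features)                        # (T, n_assets) portfolio weights
              port = (w * next_returns).sum(dim=-1)      # (T,) realized portfolio returns
              loss = -port.mean() / (port.std() + 1e-8)  # maximize Sharpe = minimize its negative
              opt.zero_grad()
              loss.backward()
              opt.step()
          return model

      # Toy usage with random features and returns.
      T, n_assets, n_features = 500, 10, 5
      model = train_sharpe(EndToEndAllocator(n_features, n_assets),
                           torch.randn(T, n_features),
                           0.001 + 0.01 * torch.randn(T, n_assets))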
  12. By: Hui Chen; Antoine Didisheim; Simon Scheidegger
    Abstract: We propose a novel structural estimation framework in which we train a surrogate of an economic model with deep neural networks. Our methodology alleviates the curse of dimensionality and speeds up the evaluation and parameter estimation by orders of magnitude, which significantly enhances one's ability to conduct analyses that require frequent parameter re-estimation. As an empirical application, we compare two popular option pricing models (the Heston model and the Bates model with double-exponential jumps) against a non-parametric random forest model. We document that: a) the Bates model produces better out-of-sample pricing on average, but both structural models fail to outperform the random forest for large areas of the volatility surface; b) the random forest is more competitive at short horizons (e.g., 1-day), for short-dated options (with less than 7 days to maturity), and on days with poor liquidity; c) both structural models outperform the random forest in out-of-sample delta hedging; d) the Heston model's relative performance has deteriorated significantly after the 2008 financial crisis.
    Keywords: Deep Learning, Structural Estimation, Option Pricing, Parameter Stability
    JEL: C45 C52 C58 C61 G17
    Date: 2021–02
    URL: http://d.repec.org/n?u=RePEc:lau:crdeep:21.14&r=
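    The surrogate idea can be sketched with a toy pricing function standing in for the structural model (an illustration with hypothetical names and a stand-in model, not the Heston or Bates application): simulate (parameters, inputs) -> price pairs from the model, fit a neural network surrogate, and then re-estimate the parameters quickly by minimizing pricing errors through the surrogate.

      import numpy as np
      from scipy.optimize import minimize
      from sklearn.neural_network import MLPRegressor

      def toy_model(theta, strike):
          """Stand-in structural pricing model: price as a function of (theta, strike)."""
          level, decay = theta
          return level * np.exp(-decay * strike)

      # 1) Train a surrogate on simulated (parameters, input) -> price pairs.
      rng = np.random.default_rng(5)
      thetas = rng.uniform([0.5, 0.1], [2.0, 1.0], size=(20_000, 2))
      strikes = rng.uniform(0.5, 1.5, size=20_000)
      prices = toy_model(thetas.T, strikes)
      surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0)
      surrogate.fit(np.column_stack([thetas, strikes]), prices)

      # 2) Re-estimate parameters by minimizing pricing errors through the fast surrogate.
      theta_true = np.array([1.3, 0.6])
      obs_strikes = np.linspace(0.6, 1.4, 50)
      obs_prices = toy_model(theta_true, obs_strikes)

      def objective(theta):
          X = np.column_stack([np.tile(theta, (len(obs_strikes), 1)), obs_strikes])
          return np.mean((surrogate.predict(X) - obs_prices) ** 2)

      print(minimize(objective, x0=np.array([1.0, 0.5]), method="Nelder-Mead").x)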
  13. By: Jiarui Chu; Ludovic Tangpi
    Abstract: In this paper we study the approximation of arbitrary law-invariant risk measures. As a starting point, we approximate the average value at risk using stochastic gradient Langevin dynamics, which can be seen as a variant of the stochastic gradient descent algorithm. Further, Kusuoka's spectral representation allows us to bootstrap the estimation of the average value at risk to extend the algorithm to general law-invariant risk measures. We present both theoretical non-asymptotic convergence rates for the approximation algorithm and numerical simulations.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.12248&r=
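    A point of orientation for the starting case in the abstract (an illustrative sketch, not the authors' algorithm or setting): average value at risk admits the convex Rockafellar-Uryasev representation AVaR_alpha(X) = min_t { t + E[(X - t)_+] / (1 - alpha) }, which can be minimized with an SGLD-type recursion, i.e. a stochastic gradient step plus injected Gaussian noise.

      import numpy as np

      def avar_sgld(losses, alpha=0.95, step=0.01, beta=1e4, t0=0.0, seed=0):
          """Average value at risk via stochastic gradient Langevin dynamics on t."""
          rng = np.random.default_rng(seed)
          t, t_avg = t0, 0.0
          for k, x in enumerate(losses, start=1):
              grad = 1.0 - float(x > t) / (1.0 - alpha)          # stochastic subgradient in t
              noise = np.sqrt(2.0 * step / beta) * rng.standard_normal()
              t = t - step * grad + noise                        # SGD step plus Langevin noise
              t_avg += (t - t_avg) / k                           # Polyak-Ruppert average
          return t_avg + np.mean(np.maximum(losses - t_avg, 0.0)) / (1.0 - alpha)

      # Toy check on standard normal losses; the true AVaR at alpha = 0.95 is about 2.06.
      rng = np.random.default_rng(6)
      print(avar_sgld(rng.standard_normal(200_000)))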
  14. By: Heng Z. Chen; Stephen R. Cosslett
    Abstract: Operational risk modeling using parametric models can lead to counter-intuitive estimates of the 99.9% value at risk used as economic capital, owing to extreme events. To address this issue, a flexible semi-nonparametric (SNP) model is introduced using the change-of-variables technique to enrich the family of distributions that can be used for modeling extreme events. The SNP models are proved to have the same maximum domain of attraction (MDA) as the parametric kernels, and it follows that the SNP models are consistent with the extreme value theory peaks-over-threshold method, but with different shape and scale parameters. By using simulated datasets generated from a mixture of distributions with varying body-tail thresholds, the SNP models in the Fréchet and Gumbel MDAs are shown to fit the datasets satisfactorily through increasing the number of model parameters, resulting in similar quantile estimates at the 99.9% level. When applied to an actual operational risk loss dataset from a major international bank, the SNP models yield a sensible capital estimate that is around 2 to 2.5 times as large as the single largest loss event.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2111.11459&r=
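    The extreme value theory peaks-over-threshold benchmark mentioned in the abstract can be sketched as follows (an illustration on simulated heavy-tailed losses, not the bank's data and not the SNP model itself):

      import numpy as np
      from scipy.stats import genpareto, lognorm

      def pot_quantile(losses, threshold, q=0.999):
          """Peaks-over-threshold estimate of the q-th loss quantile.

          Fits a generalized Pareto distribution to exceedances over `threshold` and
          maps the fitted tail back to the quantile of the full loss distribution.
          """
          losses = np.asarray(losses)
          exceedances = losses[losses > threshold] - threshold
          p_u = exceedances.size / losses.size                 # tail probability at the threshold
          c, _, scale = genpareto.fit(exceedances, floc=0.0)   # GPD shape and scale
          return threshold + genpareto.ppf(1.0 - (1.0 - q) / p_u, c, scale=scale)

      # Toy usage on simulated heavy-tailed operational losses with a 95% body-tail threshold.
      sim_losses = lognorm.rvs(s=2.0, size=50_000, random_state=7)
      print(pot_quantile(sim_losses, threshold=np.quantile(sim_losses, 0.95)))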
  15. By: Krishnamurthy, Sanath Kumar (Stanford University); Athey, Susan (Stanford University)
    Abstract: We study the problem of model selection for contextual bandits, in which the algorithm must balance the bias-variance trade-off for model estimation while also balancing the exploration-exploitation trade-off. In this paper, we propose the first reduction of model selection in contextual bandits to offline model selection oracles, allowing for flexible general purpose algorithms with computational requirements no worse than those for model selection for regression. Our main result is a new model selection guarantee for stochastic contextual bandits. When one of the classes in our set is realizable, up to a logarithmic dependency on the number of classes, our algorithm attains optimal realizability-based regret bounds for that class under one of two conditions: if the time-horizon is large enough, or if an assumption that helps with detecting misspecification holds. Hence our algorithm adapts to the complexity of this unknown class. Even when this realizable class is known, we prove improved regret guarantees in early rounds by relying on simpler model classes for those rounds and hence further establish the importance of model selection in contextual bandits.
    Date: 2021–06
    URL: http://d.repec.org/n?u=RePEc:ecl:stabus:3971&r=
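    For readers new to the oracle-based viewpoint, a much simpler baseline conveys the flavour of reducing bandit learning to offline regression (an illustrative epsilon-greedy scheme with one regression oracle per arm; it performs no model selection and is not the authors' algorithm):

      import numpy as np
      from sklearn.linear_model import Ridge

      def epsilon_greedy_bandit(contexts, reward_fn, n_arms, epsilon=0.1, refit_every=100):
          """Epsilon-greedy contextual bandit built on offline regression oracles.

          One Ridge regression per arm predicts the reward from the context; the
          oracles are refit offline every `refit_every` rounds on the logged data.
          """
          rng = np.random.default_rng(0)
          models = [None] * n_arms
          logs = [[] for _ in range(n_arms)]                 # logged (context, reward) per arm
          total_reward = 0.0
          for t, x in enumerate(contexts):
              if any(m is None for m in models) or rng.random() < epsilon:
                  a = int(rng.integers(n_arms))              # explore
              else:
                  a = int(np.argmax([m.predict(x[None, :])[0] for m in models]))
              r = reward_fn(x, a)
              logs[a].append((x, r))
              total_reward += r
              if (t + 1) % refit_every == 0:                 # offline oracle calls
                  for arm in range(n_arms):
                      if logs[arm]:
                          X = np.vstack([c for c, _ in logs[arm]])
                          y = np.array([rew for _, rew in logs[arm]])
                          models[arm] = Ridge().fit(X, y)
          return total_reward

      # Toy usage: two arms whose rewards are linear in a 3-dimensional context.
      rng = np.random.default_rng(8)
      ctx = rng.standard_normal((2_000, 3))
      reward = lambda x, a: float(x[a]) + float(rng.normal(0.0, 0.1))
      print(epsilon_greedy_bandit(ctx, reward, n_arms=2))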

This nep-ecm issue is ©2021 by Sune Karlsson. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.