Econometrics
http://lists.repec.org/mailman/listinfo/nep-ecm
Econometrics
2023-11-20
Nonparametric Regression with Dyadic Data
http://d.repec.org/n?u=RePEc:arx:papers:2310.12825&r=ecm
This paper studies the identification and estimation of a nonparametric, nonseparable dyadic model in which both the structural function and the distribution of the unobservable random terms are unknown. I assume that the structural function is continuous and strictly increasing in the unobservable heterogeneity, and I propose a suitable normalization for identification that allows the structural function to have desirable properties such as homogeneity of degree one in the unobservable random term and some of its observables. I also establish identification and estimation of the distribution of the unobservable random term, derive the consistency and asymptotic distribution of the estimators, and assess their finite-sample properties in a Monte Carlo simulation.
Brice Romuald Gueyap Kounga
2023-10
Bootstrap Hausdorff Confidence Regions for Average Treatment Effect Identified Sets
http://d.repec.org/n?u=RePEc:msh:ebswps:2023-9&r=ecm
This paper introduces a new bootstrap approach to the construction of confidence regions for Average Treatment Effect (ATE) identified sets. Minimum Hausdorff distance bootstrap confidence regions are developed and shown to be valid under suitable regularity conditions. A novel measure of the discrepancy between a confidence region and the target identified set is advanced that contains two components analogous to the Type I and Type II errors of a conventional hypothesis test. Monte Carlo experimentation is employed to compare the behaviour of the new confidence regions with an existing state-of-the-art approach, and the impact of different features on the properties of the alternative techniques is investigated. Properties arising from the application of quasi-maximum likelihood estimation as a tool for conducting inference on ATEs are also examined.
Donald S. Poskitt
Xueyan Zhao
binary models, bounds, coverage, partial identification
2023
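For interval-valued identified sets, which are typical of ATE bounds, the Hausdorff distance underlying the confidence regions above reduces to a comparison of endpoints. A minimal sketch of that distance (an illustration only, not the paper's bootstrap procedure):

```python
def hausdorff_interval(a, b):
    # Hausdorff distance between closed intervals [a0, a1] and [b0, b1]:
    # for intervals it is simply the larger of the two endpoint discrepancies.
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))
```

The bootstrap procedure in the paper would compare such distances between an estimated identified set and its bootstrap replicates to calibrate the region's size.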
Inference for Rank-Rank Regressions
http://d.repec.org/n?u=RePEc:arx:papers:2310.15512&r=ecm
Slope coefficients in rank-rank regressions are popular measures of intergenerational mobility, for instance in regressions of a child's income rank on their parent's income rank. In this paper, we first point out that commonly used variance estimators such as the homoskedastic or robust variance estimators do not consistently estimate the asymptotic variance of the OLS estimator in a rank-rank regression. We show that the probability limits of these estimators may be too large or too small depending on the shape of the copula of child and parent incomes. Second, we derive a general asymptotic theory for rank-rank regressions and provide a consistent estimator of the OLS estimator's asymptotic variance. We then extend the asymptotic theory to other regressions involving ranks that have been used in empirical work. Finally, we apply our new inference methods to three empirical studies. We find that the confidence intervals based on estimators of the correct variance may sometimes be substantially shorter and sometimes substantially longer than those based on commonly used variance estimators. The differences in confidence intervals concern economically meaningful values of mobility and thus lead to different conclusions when comparing mobility in U.S. commuting zones with mobility in other countries.
Denis Chetverikov
Daniel Wilhelm
2023-10
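The object of study above is the slope from regressing one set of ranks on another. A minimal sketch of that point estimate (normalized ranks and a simple covariance ratio; this is my own simplified setup, not the authors' estimator or their corrected variance):

```python
import numpy as np
from scipy.stats import rankdata

def rank_rank_slope(child_income, parent_income):
    # Map incomes to normalized ranks in (0, 1], then compute the OLS slope
    # of child rank on parent rank: cov(Rc, Rp) / var(Rp).
    n = len(child_income)
    rc = rankdata(child_income) / n
    rp = rankdata(parent_income) / n
    return np.cov(rc, rp, ddof=1)[0, 1] / np.var(rp, ddof=1)
```

Because ranks are invariant to strictly increasing transformations, a perfectly monotone parent-child relationship yields a slope of exactly one; the paper's point is that the usual standard errors attached to this slope are invalid because the ranks are themselves estimated.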
A Semiparametric Instrumented Difference-in-Differences Approach to Policy Learning
http://d.repec.org/n?u=RePEc:arx:papers:2310.09545&r=ecm
Recently, there has been a surge in methodological development for the difference-in-differences (DiD) approach to evaluate causal effects. Standard methods in the literature rely on the parallel trends assumption to identify the average treatment effect on the treated. However, the parallel trends assumption may be violated in the presence of unmeasured confounding, and the average treatment effect on the treated may not be useful in learning a treatment assignment policy for the entire population. In this article, we propose a general instrumented DiD approach for learning the optimal treatment policy. Specifically, we establish identification results using a binary instrumental variable (IV) when the parallel trends assumption fails to hold. Additionally, we construct a Wald estimator, novel inverse probability weighting (IPW) estimators, and a class of semiparametric efficient and multiply robust estimators, with theoretical guarantees on consistency and asymptotic normality, even when relying on flexible machine learning algorithms for nuisance parameters estimation. Furthermore, we extend the instrumented DiD to the panel data setting. We evaluate our methods in extensive simulations and a real data application.
Pan Zhao
Yifan Cui
2023-10
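The Wald estimator mentioned above combines the DiD contrast in outcomes with the DiD contrast in treatment take-up across instrument groups. A textbook-style sketch of that ratio (a simplified two-period, binary-IV version, not necessarily the paper's exact estimand):

```python
import numpy as np

def wald_did(y_pre, y_post, d_pre, d_post, z):
    # Instrumented-DiD Wald ratio with binary instrument z:
    # (DiD of outcomes across IV groups) / (DiD of take-up across IV groups).
    num = ((y_post[z == 1].mean() - y_pre[z == 1].mean())
           - (y_post[z == 0].mean() - y_pre[z == 0].mean()))
    den = ((d_post[z == 1].mean() - d_pre[z == 1].mean())
           - (d_post[z == 0].mean() - d_pre[z == 0].mean()))
    return num / den
```

The IPW and multiply robust estimators in the paper generalize this ratio by reweighting with estimated instrument propensities and nuisance functions.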
Survey calibration for causal inference: a simple method to balance covariate distributions
http://d.repec.org/n?u=RePEc:arx:papers:2310.11969&r=ecm
This paper proposes a simple method for balancing distributions of covariates for causal inference based on observational studies. The method makes it possible to balance an arbitrary number of quantiles (e.g., medians, quartiles, or deciles) together with means if necessary. The proposed approach is based on the theory of calibration estimators (Deville and Särndal 1992), in particular, calibration estimators for quantiles, proposed by Harms and Duchesne (2006). By modifying the entropy balancing method and the covariate balancing propensity score method, it is possible to balance the distributions of the treatment and control groups. The method does not require numerical integration, kernel density estimation or assumptions about the distributions; valid estimates can be obtained by drawing on existing asymptotic theory. Results of a simulation study indicate that the method efficiently estimates average treatment effects on the treated (ATT), the average treatment effect (ATE), the quantile treatment effect on the treated (QTT) and the quantile treatment effect (QTE), especially in the presence of non-linearity and mis-specification of the models. The proposed methods are implemented in an open source R package jointCalib.
Maciej Beręsewicz
2023-10
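The mechanics of calibration can be illustrated with the simplest linear (GREG-type) variant: choose weights close to one whose weighted covariate totals hit prescribed targets exactly. A quantile is balanced by adding an indicator column, in the spirit of Harms and Duchesne. This is a deliberately simplified sketch; the paper instead modifies entropy balancing and the covariate balancing propensity score:

```python
import numpy as np

def calibrate(X, totals):
    # Linear calibration: find weights w = 1 + X @ lam such that the
    # weighted column totals X' w equal `totals` exactly.
    lam = np.linalg.solve(X.T @ X, totals - X.sum(axis=0))
    return 1.0 + X @ lam

# To balance a quantile q of covariate x, append the indicator column
# 1{x <= q} and set its target total to the desired proportion of the sample.
```

Linear calibration can produce negative weights; the entropy-balancing variant used in the paper avoids this by construction.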
Data-Driven Fixed-Point Tuning for Truncated Realized Variations
http://d.repec.org/n?u=RePEc:arx:papers:2311.00905&r=ecm
Many methods for estimating integrated volatility and related functionals of semimartingales in the presence of jumps require specification of tuning parameters for their use. In much of the available theory, tuning parameters are assumed to be deterministic, and their values are specified only up to asymptotic constraints. However, in empirical work and in simulation studies, they are typically chosen to be random and data-dependent, with explicit choices in practice relying on heuristics alone. In this paper, we consider novel data-driven tuning procedures for the truncated realized variations of a semimartingale with jumps, which are based on a type of stochastic fixed-point iteration. Being effectively automated, our approach alleviates the need for delicate decision-making regarding tuning parameters, and can be implemented using information regarding sampling frequency alone. We show our methods can lead to asymptotically efficient estimation of integrated volatility and exhibit superior finite-sample performance compared to popular alternatives in the literature.
B. Cooper Boniece
José E. Figueroa-López
Yuchen Han
2023-11
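The fixed-point idea above can be sketched in a few lines: the truncation threshold is repeatedly re-set to a multiple of the volatility estimate it itself produces, until the two are consistent. The constants `c` and `varpi` below are illustrative choices on my part, not the tuning rule derived in the paper:

```python
import numpy as np

def truncated_rv(dx, u):
    # Truncated realized variance: sum of squared increments below threshold u.
    return np.sum(dx[np.abs(dx) <= u] ** 2)

def fixed_point_threshold(dx, delta, c=3.0, varpi=0.49, iters=50):
    # Stochastic fixed-point iteration for the truncation level:
    # u <- c * sigma_hat(u) * delta**varpi, where sigma_hat(u)^2 is the
    # truncated RV per unit time at the current threshold.
    T = len(dx) * delta
    u = np.inf  # start from no truncation
    for _ in range(iters):
        u_new = c * np.sqrt(truncated_rv(dx, u) / T) * delta ** varpi
        if np.isclose(u_new, u):
            break
        u = u_new
    return u
```

On jump-free data the iteration settles at a threshold of a few return standard deviations, so almost all diffusive increments are retained while large jumps would be discarded.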
On Gaussian Process Priors in Conditional Moment Restriction Models
http://d.repec.org/n?u=RePEc:arx:papers:2311.00662&r=ecm
This paper studies quasi-Bayesian estimation and uncertainty quantification for an unknown function that is identified by a nonparametric conditional moment restriction model. We derive contraction rates for a class of Gaussian process priors and provide conditions under which a Bernstein-von Mises theorem holds for the quasi-posterior distribution. As a consequence, we show that optimally-weighted quasi-Bayes credible sets have exact asymptotic frequentist coverage. This extends classical results on the frequentist validity of optimally weighted quasi-Bayes credible sets for parametric generalized method of moments (GMM) models.
Sid Kankanala
2023-11
Variational Inference for GARCH-family Models
http://d.repec.org/n?u=RePEc:arx:papers:2310.03435&r=ecm
The Bayesian estimation of GARCH-family models has been typically addressed through Monte Carlo sampling. Variational Inference is gaining popularity and attention as a robust approach for Bayesian inference in complex machine learning models; however, its adoption in econometrics and finance is limited. This paper discusses the extent to which Variational Inference constitutes a reliable and feasible alternative to Monte Carlo sampling for Bayesian inference in GARCH-like models. Through a large-scale experiment involving the constituents of the S&P 500 index, several Variational Inference optimizers, a variety of volatility models, and a case study, we show that Variational Inference is an attractive, remarkably well-calibrated, and competitive method for Bayesian learning.
Martin Magris
Alexandros Iosifidis
2023-10
Bounds on Treatment Effects under Stochastic Monotonicity Assumption in Sample Selection Models
http://d.repec.org/n?u=RePEc:arx:papers:2311.00439&r=ecm
This paper discusses the partial identification of treatment effects in sample selection models when the exclusion restriction fails and the monotonicity assumption in the selection effect does not hold exactly, both of which are key challenges in applying the existing methodologies. Our approach builds on the procedure of Lee (2009), who considers partial identification under the monotonicity assumption, but we assume only a stochastic (and weaker) version of monotonicity, which depends on a prespecified parameter $\vartheta$ that represents researchers' belief in the plausibility of the monotonicity. Under this assumption, we show that we can still obtain useful bounds even when the monotonic behavioral model does not strictly hold. Our procedure is useful when empirical researchers anticipate that a small fraction of the population will not behave monotonically in selection; it can also be an effective tool for performing sensitivity analysis or examining the identification power of the monotonicity assumption. Our procedure is easily extendable to other related settings; we also provide the identification result of the marginal treatment effects setting as an important application. Moreover, we show that the bounds can still be obtained even in the absence of the knowledge of $\vartheta$ under the semiparametric models that nest the classical probit and logit selection models.
Yuta Okamoto
2023-11
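The starting point above, Lee's (2009) trimming bounds, can be sketched directly: trim the treated outcome distribution by the excess selection share, from the top for the lower bound and from the bottom for the upper bound. The sketch assumes equal numbers randomized to each arm so that the ratio of selected counts estimates the selection rates; the paper's stochastic-monotonicity version widens these bounds as a function of the belief parameter $\vartheta$:

```python
import numpy as np

def lee_bounds(y1, y0):
    # Standard Lee (2009) bounds on the treatment effect for always-selected
    # units, given outcomes observed among selected treated (y1) and selected
    # control (y0) units, assuming weakly more selection under treatment.
    p = 1.0 - len(y0) / len(y1)        # excess selection share in treated arm
    k = int(np.floor(p * len(y1)))     # number of treated outcomes to trim
    y1s = np.sort(y1)
    lower = y1s[: len(y1) - k].mean() - y0.mean()  # trim the top
    upper = y1s[k:].mean() - y0.mean()             # trim the bottom
    return lower, upper
```

When selection rates are equal, no trimming occurs and the bounds collapse to the simple difference in means.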
Causal Interpretation of Structural IV Estimands
http://d.repec.org/n?u=RePEc:nbr:nberwo:31799&r=ecm
We study the causal interpretation of instrumental variables (IV) estimands of nonlinear, multivariate structural models with respect to rich forms of model misspecification. We focus on guaranteeing that the researcher's estimator is sharp zero consistent, meaning that the researcher concludes that the endogenous variable has no causal effect on the outcome whenever this is actually the case. Sharp zero consistency generally requires the researcher's estimator to satisfy a condition that we call strong exclusion. When a researcher has access to excluded, exogenous variables, strong exclusion can often be achieved by appropriate choice of estimator and instruments. Failure of strong exclusion can lead to large bias in estimates of causal effects in realistic situations. Our results cover many settings of interest including models of differentiated goods demand with endogenous prices and models of production with endogenous inputs.
Isaiah Andrews
Nano Barahona
Matthew Gentzkow
Ashesh Rambachan
Jesse M. Shapiro
2023-10
Functional gradient descent boosting for additive non-linear spatial autoregressive model (Gaussian and probit)
http://d.repec.org/n?u=RePEc:hal:journl:hal-04229868&r=ecm
In this working paper, I aim to establish a connection between the traditional models of spatial econometrics and machine learning algorithms. The objective is to determine, within the context of big data, which variables should be incorporated into autoregressive nonlinear models and in what forms: linear, nonlinear, spatially varying, or with interactions with other variables. To address these questions, I propose an extension of boosting algorithms (Friedman, 2001; Bühlmann et al., 2007) to semi-parametric autoregressive models (SAR, SDM, SEM, and SARAR), formulated as additive models with smoothing spline functions. This adaptation primarily relies on estimating the spatial parameter using the Quasi-Maximum Likelihood (QML) method, following the examples set by Basile and Gress (2004) and Su and Jin (2010). To simplify the calculation of the spatial multiplier, I propose two extensions. The first is based on the direct application of the Closed Form Estimator (CFE), recently proposed by Smirnov (2020). Additionally, I suggest a flexible instrumental variable/control function approach (Marra and Radice, 2010; Basile et al., 2014) for SAR models, which dynamically constructs the instruments based on the functioning of the functional gradient descent boosting algorithm. The proposed estimators can be easily extended to incorporate decision trees instead of smoothing splines, allowing for the identification of more complex variable interactions. For discrete choice models with spatial dependence, I extend the SAR probit model approximation method proposed by Martinetti and Geniaux (2018) to the nonlinear case using the boosting algorithm and smoothing splines. Using synthetic data, I study the finite sample properties of the proposed estimators for both Gaussian and probit cases.
Finally, inspired by the work of Debarsy and LeSage (2018, 2022), I extend the Gaussian case of the nonlinear SAR model to a more complex spatial autoregressive multiplier involving multiple spatial weight matrices. This extension helps determine the most geographically relevant spatial weight matrix. To illustrate the efficacy of functional gradient descent boosting for additive nonlinear spatial autoregressive models, I employ real data from a large dataset on house prices in France, assessing the out-of-sample accuracy.
Ghislain Geniaux
Spatial autoregressive model, gradient boosting
2023-05-25
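The functional gradient descent boosting machinery referenced above is easiest to see in its simplest form, componentwise L2-boosting: at each step, fit the current residual by least squares on the single best covariate and take a small step towards it. This linear base learner is a stand-in for the smoothing splines (or trees) used in the paper:

```python
import numpy as np

def l2_boost(X, y, steps=300, nu=0.1):
    # Componentwise L2-boosting (Buhlmann-style): greedy coordinate updates
    # with shrinkage nu on the residual, using centred covariates.
    Xc = X - X.mean(axis=0)
    beta = np.zeros(X.shape[1])
    resid = y - y.mean()
    for _ in range(steps):
        num = Xc.T @ resid                     # covariate-residual inner products
        den = (Xc ** 2).sum(axis=0)
        j = int(np.argmax(num ** 2 / den))     # best-fitting single covariate
        b = num[j] / den[j]                    # its simple OLS coefficient
        beta[j] += nu * b
        resid -= nu * b * Xc[:, j]
    fitted = y.mean() + Xc @ beta
    return beta, fitted
```

The spatial extension in the paper interleaves such boosting updates with QML (or CFE) estimation of the spatial autoregressive parameter.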
Testing for equivalence of pre-trends in Difference-in-Differences estimation
http://d.repec.org/n?u=RePEc:arx:papers:2310.15796&r=ecm
The plausibility of the "parallel trends assumption" in Difference-in-Differences estimation is usually assessed by a test of the null hypothesis that the difference between the average outcomes of both groups is constant over time before the treatment. However, failure to reject the null hypothesis does not imply the absence of differences in time trends between both groups. We provide equivalence tests that allow researchers to find evidence in favor of the parallel trends assumption and thus increase the credibility of their treatment effect estimates. While we motivate our tests in the standard two-way fixed effects model, we discuss simple extensions to settings in which treatment adoption is staggered over time.
Holger Dette
Martin Schumann
2023-10
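The logic of an equivalence test can be sketched with the classical two one-sided tests (TOST) construction: a pre-trend coefficient is declared negligible only if both one-sided hypotheses of a material trend are rejected. The equivalence margin and the normal approximation below are illustrative choices; the paper's test statistics differ in detail:

```python
from scipy.stats import norm

def pretrend_equivalent(beta_hat, se, margin, alpha=0.05):
    # TOST: conclude |beta| < margin if both H0: beta <= -margin and
    # H0: beta >= margin are rejected at level alpha.  Equivalent to the
    # (1 - 2*alpha) confidence interval lying inside (-margin, margin).
    p_low = 1.0 - norm.cdf((beta_hat + margin) / se)   # H0: beta <= -margin
    p_high = norm.cdf((beta_hat - margin) / se)        # H0: beta >= margin
    return max(p_low, p_high) < alpha
```

Note the reversal of burden of proof relative to the conventional pre-trend test: imprecise estimates now fail to certify parallel trends, rather than passing by default.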
Dynamic Realized Minimum Variance Portfolio Models
http://d.repec.org/n?u=RePEc:arx:papers:2310.13511&r=ecm
This paper introduces a dynamic minimum variance portfolio (MVP) model using nonlinear volatility dynamic models, based on high-frequency financial data. Specifically, we impose an autoregressive dynamic structure on MVP processes, which helps capture the MVP dynamics directly. To evaluate the dynamic MVP model, we estimate the inverse volatility matrix using the constrained $\ell_1$-minimization for inverse matrix estimation (CLIME) and calculate daily realized non-normalized MVP weights. Based on the realized non-normalized MVP weight estimator, we propose the dynamic MVP model, which we call the dynamic realized minimum variance portfolio (DR-MVP) model. To estimate a large number of parameters, we employ the least absolute shrinkage and selection operator (LASSO), predict the future MVP, and establish its asymptotic properties. Using high-frequency trading data, we apply the proposed method to MVP prediction.
Donggyu Kim
Minseog Oh
2023-10
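The realized MVP weights on which the model above is built come from the standard closed form $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}'\Sigma^{-1}\mathbf{1})$. A minimal sketch (with a plain matrix solve standing in for the CLIME precision-matrix estimator used in the paper):

```python
import numpy as np

def mvp_weights(cov):
    # Global minimum variance portfolio: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1).
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)   # Sigma^{-1} 1 without forming the inverse
    return x / x.sum()
```

The paper models the dynamics of the non-normalized vector $\Sigma^{-1}\mathbf{1}$ (the quantity `x` above) directly, which is why normalization is deferred.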
Threshold Endogeneity in Threshold VARs: An Application to Monetary State Dependence
http://d.repec.org/n?u=RePEc:fip:fedkrw:96762&r=ecm
We contribute a new method for dealing with the problem of endogeneity of the threshold variable in threshold vector autoregression (TVAR) models, which are widely used to study the effects of monetary policy. Drawing on copula theory enables us to capture the dependence structure between the threshold variable and the vector of TVAR innovations, independently of the marginal distribution of the threshold variable. A Monte Carlo study demonstrates that our method works well, and that ignoring threshold endogeneity leads to biased estimates of the threshold parameter and the variance-covariance error structure, thus invalidating dynamic analysis. As an application, we assess the effects of interest rate shocks on output and inflation: when “expected” inflation exceeds 3.6 percent, the effects of monetary policy are faster and stronger than otherwise.
Dimitris Christopoulos
Peter McAdam
Elias Tzavalis
VAR models; threshold models; monetary policy
2023-07-28
Machine Learning for Staggered Difference-in-Differences and Dynamic Treatment Effect Heterogeneity
http://d.repec.org/n?u=RePEc:arx:papers:2310.11962&r=ecm
We combine two recently proposed nonparametric difference-in-differences methods, extending them to enable the examination of treatment effect heterogeneity in the staggered adoption setting using machine learning. The proposed method, machine learning difference-in-differences (MLDID), allows for estimation of time-varying conditional average treatment effects on the treated, which can be used to conduct detailed inference on drivers of treatment effect heterogeneity. We perform simulations to evaluate the performance of MLDID and find that it accurately identifies the true predictors of treatment effect heterogeneity. We then use MLDID to evaluate the heterogeneous impacts of Brazil's Family Health Program on infant mortality, and find that those in poverty and in urban locations experienced the impact of the policy more quickly than other subgroups.
Julia Hatamyar
Noemi Kreif
Rudi Rocha
Martin Huber
2023-10
Co-Training Realized Volatility Prediction Model with Neural Distributional Transformation
http://d.repec.org/n?u=RePEc:arx:papers:2310.14536&r=ecm
This paper presents a novel machine learning model for realized volatility (RV) prediction using a normalizing flow, an invertible neural network. Since RV is known to be skewed and fat-tailed, previous methods transform RV into values that follow a latent distribution with an explicit shape and then apply a prediction model. However, that shape is non-trivial to specify, and the choice of transformation influences the prediction model. This paper proposes to jointly train the transformation and the prediction model. The training process follows a maximum-likelihood objective function derived from the assumption that the prediction residuals on the transformed RV time series are homogeneously Gaussian. The objective function is further approximated using an expectation-maximization (EM) algorithm. On a dataset of 100 stocks, our method significantly outperforms other methods using analytical or naive neural-network transformations.
Xin Du
Kai Moriyama
Kumiko Tanaka-Ishii
2023-10
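The objective being optimized above is a Gaussian likelihood on the transformed series plus the change-of-variables (log-Jacobian) term of the transformation. A deliberately static sketch of that quantity, with a fixed transform and a constant mean standing in for the paper's learned flow and predictor:

```python
import numpy as np

def gaussian_nll_transformed(rv, transform, dtransform):
    # Negative log-likelihood (up to constants) under the model
    # "transform(RV) is i.i.d. Gaussian", including the log-Jacobian term
    # that joint training of a normalizing flow would also optimize.
    z = transform(rv)
    mu, sig = z.mean(), z.std()
    nll = 0.5 * np.sum(((z - mu) / sig) ** 2) + len(z) * np.log(sig)
    nll -= np.sum(np.log(dtransform(rv)))   # change-of-variables correction
    return nll
```

Comparing transforms by this criterion shows why the choice matters: on skewed, fat-tailed data a log transform attains a much lower NLL than the identity, and the paper's contribution is to learn the transform rather than fix it.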
Causal clustering: design of cluster experiments under network interference
http://d.repec.org/n?u=RePEc:arx:papers:2310.14983&r=ecm
This paper studies the design of cluster experiments to estimate the global treatment effect in the presence of spillovers on a single network. We provide an econometric framework to choose the clustering that minimizes the worst-case mean-squared error of the estimated global treatment effect. We show that the optimal clustering can be approximated as the solution of a novel penalized min-cut optimization problem computed via off-the-shelf semi-definite programming algorithms. Our analysis also characterizes easy-to-check conditions to choose between a cluster or individual-level randomization. We illustrate the method's properties using unique network data from the universe of Facebook's users and existing network data from a field experiment.
Davide Viviano
Lihua Lei
Guido Imbens
Brian Karrer
Okke Schrijvers
Liang Shi
2023-10
Household portfolio choices under (non-)linear income risk: an empirical framework
http://d.repec.org/n?u=RePEc:bde:wpaper:2327&r=ecm
This paper develops a flexible, semi-structural framework to empirically quantify the non-linear transmission of income shocks to household portfolio choice decisions both at the extensive and intensive margins. I model stock market participation and portfolio allocation rules as age-dependent functions of persistent and transitory earnings components, wealth and unobserved taste shifters. I establish non-parametric identification and propose a tractable, simulation-based estimation algorithm, building on recent developments in the sample selection literature. Using recent waves of PSID data, I find heterogeneous income and wealth effects on both extensive and intensive margins, over the wealth and life-cycle dimensions. These results suggest that preferences are heterogeneous across the wealth distribution and over the life cycle. Moreover, in impulse response exercises, I find sizeable extensive margin responses to persistent income shocks. Finally, I find heterogeneity in participation costs across households in the wealth distribution.
Julio Gálvez
stock market participation, non-linear income persistence, sample selection, quantile selection models, latent variables
2023-09
Sparse quantile regression via ℓ0-penalty
http://d.repec.org/n?u=RePEc:hit:econdp:2023-03&r=ecm
HONDA, Toshio
本田, 敏雄
selection consistency, high-dimensional information criteria, B-spline basis, additive models, varying coefficient models
2023-11-01
Conditional Normalization in Time Series Analysis
http://d.repec.org/n?u=RePEc:msh:ebswps:2023-10&r=ecm
Collections of time series that are formed via aggregation are prevalent in many fields. These are commonly referred to as hierarchical time series and may be constructed cross-sectionally across different variables, temporally by aggregating a single series at different frequencies, or may even be generalised beyond aggregation as time series that respect linear constraints. When forecasting such time series, a desirable condition is for forecasts to be coherent, that is to respect the constraints. The past decades have seen substantial growth in this field with the development of reconciliation methods that not only ensure coherent forecasts but can also improve forecast accuracy. This paper serves as both an encyclopaedic review of forecast reconciliation and an entry point for researchers and practitioners dealing with hierarchical time series. The scope of the article includes perspectives on forecast reconciliation from machine learning, Bayesian statistics and probabilistic forecasting as well as applications in economics, energy, tourism, retail demand and demography.
Puwasala Gamakumara
Edgar Santos-Fernandez
Priyanga Dilini Talagala
Rob J Hyndman
Kerrie Mengersen
Catherine Leigh
aggregation, coherence, cross-temporal, hierarchical time series, grouped time series, temporal aggregation
2023
Dynamic Factor Models: a Genealogy
http://d.repec.org/n?u=RePEc:eca:wpaper:2013/364359&r=ecm
Dynamic factor models have been developed out of the need to analyze and forecast time series in increasingly high dimensions. While mathematical statisticians facing inference problems in high-dimensional observation spaces were focusing on the so-called spiked-model asymptotics, econometricians adopted an entirely different and considerably more effective asymptotic approach, rooted in the factor models originally considered in psychometrics. In two decades, the so-called dynamic factor model methods have grown into a wide and successful body of techniques that are widely used in central banks, financial institutions, and economic and statistical institutes. The objective of this chapter is not an extensive survey of the topic but a sketch of its historical growth, with emphasis on the various assumptions and interpretations, and a family tree of its main variants.
Matteo Barigozzi
Marc Hallin
High-dimensional time series, factor models, panel data, forecasting
2023-10
Production Function Estimation with Multi-Destination Firms
http://d.repec.org/n?u=RePEc:ces:ceswps:_10716&r=ecm
We develop a procedure to estimate production functions, elasticities of demand, and productivity when firms endogenously select into multiple destination markets where they compete imperfectly, and when researchers observe output denominated only in value. We show that ignoring the multi-destination dimension (i.e., exporting) yields biased and inconsistent inference. Our estimator extends the two-stage procedure of Gandhi et al. (2020) to this setting, which allows for cross-market complementarities. In Monte Carlo simulations, we show that our estimator is consistent and performs well in finite samples. Using French manufacturing data, we find average total returns to scale greater than 1, average returns to variable inputs less than 1, price elasticities of demand between -21.5 and -3.4, and learning-by-exporting effects between 0 and 4% per year. Alternative estimation procedures yield unrealistic estimates of returns to scale, demand elasticities, or both.
Geoffrey Barrows
Hélène Ollivier
Ariell Reshef
production function, learning by exporting, trade, productivity
2023
Analyzing Bounded Count Data
http://d.repec.org/n?u=RePEc:nbr:nberwo:31814&r=ecm
This paper presents and assesses analytical strategies that respect the bounded count structure of outcomes encountered often in health and other applications. The paper's main motivation is that the applied econometrics literature lacks a comprehensive discussion and critique of strategies for analyzing and understanding such data. The paper's goal is to provide a treatment of prominent issues arising in such analyses, with particular focus on evaluations in which bounded count outcomes are of interest, and on econometric modeling of their probability and moment structures. The hope is that the paper will provide a toolkit for researchers so that they may better appreciate the range of questions that might be asked of such data and the merits and limitations of the analytical methods they might contemplate to study them. It will be seen that the choice of analytical method is often consequential: questions of interest may be unanswerable when some familiar analytical methods are deployed in some circumstances.
John Mullahy
2023-10
Transparency challenges in policy evaluation with causal machine learning -- improving usability and accountability
http://d.repec.org/n?u=RePEc:arx:papers:2310.13240&r=ecm
Causal machine learning tools are beginning to see use in real-world policy evaluation tasks to flexibly estimate treatment effects. One issue with these methods is that the machine learning models used are generally black boxes, i.e., there is no globally interpretable way to understand how a model makes estimates. This is a clear problem in policy evaluation applications, particularly in government, because it is difficult to understand whether such models are functioning in ways that are fair, based on the correct interpretation of evidence, and transparent enough to allow for accountability if things go wrong. However, there has been little discussion of transparency problems in the causal machine learning literature and how these might be overcome. This paper explores why transparency issues are a problem for causal machine learning in public policy evaluation applications and considers ways these problems might be addressed through explainable AI tools and by simplifying models in line with interpretable AI principles. It then applies these ideas to a case study using a causal forest model to estimate conditional average treatment effects for a hypothetical change in the school leaving age in Australia. It shows that existing tools for understanding black-box predictive models are poorly suited to causal machine learning and that simplifying the model to make it interpretable leads to an unacceptable increase in error (in this application). It concludes that new tools are needed to properly understand causal machine learning models and the algorithms that fit them.
Patrick Rehill
Nicholas Biddle
2023-10
BVARs and Stochastic Volatility
http://d.repec.org/n?u=RePEc:arx:papers:2310.14438&r=ecm
Bayesian vector autoregressions (BVARs) are the workhorse in macroeconomic forecasting. Research in the last decade has established the importance of allowing time-varying volatility to capture both secular and cyclical variations in macroeconomic uncertainty. This recognition, together with the growing availability of large datasets, has propelled a surge in recent research in building stochastic volatility models suitable for large BVARs. Some of these new models are also equipped with additional features that are especially desirable for large systems, such as order invariance -- i.e., estimates are not dependent on how the variables are ordered in the BVAR -- and robustness against COVID-19 outliers. Estimation of these large, flexible models is made possible by the recently developed equation-by-equation approach that drastically reduces the computational cost of estimating large systems. Despite these recent advances, there remains much ongoing work, such as the development of parsimonious approaches for time-varying coefficients and other types of nonlinearities in large BVARs.
Joshua Chan
2023-10
Estimation of VaR with jump process: application in corn and soybean markets
http://d.repec.org/n?u=RePEc:arx:papers:2311.00832&r=ecm
Value at Risk (VaR) is a quantitative measure used to evaluate the risk linked to the potential loss of investment or capital. Estimating the VaR entails quantifying prospective losses in a portfolio of investments, at a given confidence level, under normal market conditions within a specific time period. The objective of this paper is to construct a model and estimate the VaR for a diversified portfolio consisting of multiple cash commodity positions driven by standard Brownian motions and jump processes. Subsequently, a thorough analytical estimation of the VaR is conducted for the proposed model. The results are then applied to two distinct commodities -- corn and soybean -- enabling a comprehensive comparison of the VaR values in the presence and absence of jumps.
Minglian Lin
Indranil SenGupta
William Wilson
2023-11
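The effect of jumps on VaR can be illustrated by Monte Carlo under a Merton-style jump diffusion (the paper derives analytical VaR estimates instead; all parameter values below are hypothetical, not calibrated to the corn or soybean data):

```python
import numpy as np

def mc_var(mu, sigma, lam, jump_mu, jump_sig, horizon, alpha=0.05,
           n_sims=100_000, seed=0):
    # Monte Carlo VaR of a jump-diffusion return over `horizon` (in years):
    # r = mu*h + sigma*sqrt(h)*Z + sum of N ~ Poisson(lam*h) normal jumps.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_sims)
    n_jumps = rng.poisson(lam * horizon, n_sims)
    jumps = jump_mu * n_jumps + jump_sig * np.sqrt(n_jumps) * rng.standard_normal(n_sims)
    r = mu * horizon + sigma * np.sqrt(horizon) * z + jumps
    return -np.quantile(r, alpha)   # loss exceeded with probability alpha
```

With the jump intensity set to zero this reproduces the familiar Gaussian VaR, and adding negative-mean jumps raises the VaR, which is the comparison the paper conducts analytically.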
Towards Enhanced Local Explainability of Random Forests: a Proximity-Based Approach
http://d.repec.org/n?u=RePEc:arx:papers:2310.12428&r=ecm
We initiate a novel approach to explaining the out-of-sample performance of random forest (RF) models by exploiting the fact that any RF can be formulated as an adaptive weighted k-nearest-neighbors model. Specifically, we use the proximity between points in the feature space learned by the RF to rewrite random forest predictions exactly as a weighted average of the target labels of training data points. This linearity facilitates a local notion of explainability of RF predictions that generates attributions for any model prediction across observations in the training set, thereby complementing established methods like SHAP, which instead generates attributions for a model prediction across dimensions of the feature space. We demonstrate this approach in the context of a bond pricing model trained on US corporate bond trades, and compare our approach to various existing approaches to model explainability.
Joshua Rosaler
Dhruv Desai
Bhaskarjit Sarmah
Dimitrios Vamvourellis
Deran Onay
Dhagash Mehta
Stefano Pasquali
2023-10
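The identity underlying the approach above, that an RF prediction is a proximity-weighted average of training labels, can be verified directly with scikit-learn: each tree spreads weight uniformly over the training points sharing the query's leaf. This sketch (my own illustration, not the authors' code) is exact when trees are grown without bootstrap resampling; with bootstrapping, the weights would instead use per-tree bootstrap counts:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def proximity_prediction(rf, X_train, y_train, X_new):
    # Rewrite RF predictions as weighted averages of training labels: each
    # tree contributes weight 1/n_trees, split uniformly among the training
    # points in the same leaf as the query point.
    leaves_tr = rf.apply(X_train)     # (n_train, n_trees) leaf indices
    leaves_new = rf.apply(X_new)      # (n_new, n_trees)
    preds = np.empty(len(X_new))
    for i in range(len(X_new)):
        w = np.zeros(len(X_train))
        for t in range(rf.n_estimators):
            mask = leaves_tr[:, t] == leaves_new[i, t]
            w[mask] += 1.0 / (rf.n_estimators * mask.sum())
        preds[i] = w @ y_train        # weights w are the per-point attributions
    return preds
```

The weight vector `w` is exactly the per-training-point attribution the paper proposes: it shows which training observations drive a given prediction.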