nep-cmp New Economics Papers
on Computational Economics
Issue of 2023‒05‒08
eleven papers chosen by

  1. OFTER: An Online Pipeline for Time Series Forecasting By Nikolas Michael; Mihai Cucuringu; Sam Howison
  2. Greenhouse gases emissions: estimating corporate non-reported emissions using interpretable machine learning By Jeremi Assael; Thibaut Heurtebize; Laurent Carlier; François Soupé
  3. Mastering Pair Trading with Risk-Aware Recurrent Reinforcement Learning By Weiguang Han; Jimin Huang; Qianqian Xie; Boyi Zhang; Yanzhao Lai; Min Peng
  4. Dissecting the explanatory power of ESG features on equity returns by sector, capitalization, and year with interpretable machine learning By Jérémi Assael; Laurent Carlier; Damien Challet
  5. Extremum Monte Carlo Filters: Real-Time Signal Extraction via Simulation and Regression By Francisco Blasques; Siem Jan Koopman; Karim Moussa
  6. Finding Anomalies in China By Hou, Kewei; Qiao, Fang; Zhang, Xiaoyan
  7. Reinforcement learning for optimization of energy trading strategy By Łukasz Lepak; Paweł Wawrzyński
  8. On the Connection between Temperature and Volatility in Ideal Agent Systems By Christoph J. Börner; Ingo Hoffmann; John H. Stiebel
  9. The Cost of Influence: How Gifts to Physicians Shape Prescriptions and Drug Costs By Melissa Newham; Marica Valente
  10. The Economic Effect of Gaining a New Qualification Later in Life By Finn Lattimore; Daniel M. Steinberg; Anna Zhu
  11. Torch-Choice: A PyTorch Package for Large-Scale Choice Modelling with Python By Tianyu Du; Ayush Kanodia; Susan Athey

  1. By: Nikolas Michael; Mihai Cucuringu; Sam Howison
    Abstract: We introduce OFTER, a time series forecasting pipeline tailored for mid-sized multivariate time series. OFTER utilizes the non-parametric models of k-nearest neighbors and Generalized Regression Neural Networks, integrated with a dimensionality reduction component. To circumvent the curse of dimensionality, we employ a weighted norm based on a modified version of the maximal correlation coefficient. The pipeline we introduce is specifically designed for online tasks, has an interpretable output, and is able to outperform several state-of-the-art baselines. The computational efficacy of the algorithm, its online nature, and its ability to operate in low signal-to-noise regimes render OFTER an ideal approach for financial multivariate time series problems, such as daily equity forecasting. Our work demonstrates that while deep learning models hold significant promise for time series forecasting, traditional methods carefully integrating mainstream tools remain very competitive alternatives with the added benefits of scalability and interpretability.
    Date: 2023–04
  2. By: Jeremi Assael (BNPP CIB GM Lab - BNP Paribas CIB Global Markets Data & AI Lab, MICS - Mathématiques et Informatique pour la Complexité et les Systèmes - CentraleSupélec - Université Paris-Saclay); Thibaut Heurtebize (BNP Paribas Asset Management, Quantitative Research Group, Research Lab); Laurent Carlier (BNPP CIB GM Lab - BNP Paribas CIB Global Markets Data & AI Lab); François Soupé (BNP Paribas Asset Management, Quantitative Research Group, Research Lab)
    Abstract: As of 2022, greenhouse gas (GHG) emissions reporting and auditing are not yet compulsory for all companies, and methodologies of measurement and estimation are not unified. We propose a machine learning-based model to estimate scope 1 and scope 2 GHG emissions of companies not reporting them yet. Our model, designed to be transparent and completely adapted to this use case, is able to estimate emissions for a large universe of companies. It shows good out-of-sample global performance as well as good out-of-sample granular performance when evaluated by sector, country, or revenue bucket. We also compare the model results to those of other providers and find our estimates to be more accurate. Explainability tools based on Shapley values allow the constructed model to be fully interpretable, so that the user can understand how each factor contributes to the estimated GHG emissions of each particular company.
    Keywords: sustainability, disclosure, greenhouse gas emissions, machine learning, interpretability, carbon emissions, scope 1, scope 2, interpretable machine learning
    Date: 2023–02–13
  3. By: Weiguang Han; Jimin Huang; Qianqian Xie; Boyi Zhang; Yanzhao Lai; Min Peng
    Abstract: Although pair trading is the simplest hedging strategy for an investor to eliminate market risk, it is still a great challenge for reinforcement learning (RL) methods to perform pair trading at the level of a human expert. It requires RL methods to make thousands of correct actions that nevertheless have no obvious relation to the overall trading profit, and to reason over infinite states of the time-varying market, most of which have never appeared in history. However, existing RL methods ignore the temporal connections between asset price movements and the risk of the trades performed. This leads to frequent trading with high transaction costs and potential losses, which barely reaches the human expert level of trading. Therefore, we introduce CREDIT, a risk-aware agent capable of learning to exploit long-term trading opportunities in pair trading similar to a human expert. CREDIT is the first to apply a bidirectional GRU along with a temporal attention mechanism to fully consider the temporal correlations embedded in the states, which allows CREDIT to capture long-term patterns of the price movements of two assets to earn higher profit. We also design a risk-aware reward, inspired by economic theory, that models both the profit and risk of the trades during the trading period. It helps our agent to master pair trading with a robust trading preference that avoids risky trades with possible high returns and losses. Experiments show that it outperforms existing reinforcement learning methods in pair trading and achieves a significant profit over five years of U.S. stock data.
    Date: 2023–04
  4. By: Jérémi Assael (BNPP CIB GM Lab - BNP Paribas CIB Global Markets Data & AI Lab, MICS - Mathématiques et Informatique pour la Complexité et les Systèmes - CentraleSupélec - Université Paris-Saclay); Laurent Carlier (BNPP CIB GM Lab - BNP Paribas CIB Global Markets Data & AI Lab); Damien Challet (MICS - Mathématiques et Informatique pour la Complexité et les Systèmes - CentraleSupélec - Université Paris-Saclay)
    Abstract: We systematically investigate the links between price returns and Environment, Social and Governance (ESG) features in the European market. We propose a cross-validation scheme with random company-wise validation to mitigate the relative initial lack of quantity and quality of ESG data, which allows us to use most of the latest and best data to both train and validate our models. Boosted trees successfully explain a part of annual price returns not accounted for by the market factor. We check with benchmark features that ESG features do contain significantly more information than basic fundamental features alone. The most relevant sub-ESG feature encodes controversies. Finally, we find opposite effects of better ESG scores on the price returns of small and large capitalization companies: better ESG scores are generally associated with larger price returns for the latter, and with smaller price returns for the former.
    Keywords: ESG features, sustainable investing, interpretable machine learning, model selection, asset management, equity returns, ESG data
    Date: 2023–03
  5. By: Francisco Blasques (Vrije Universiteit Amsterdam); Siem Jan Koopman (Vrije Universiteit Amsterdam); Karim Moussa (Vrije Universiteit Amsterdam)
    Abstract: This paper introduces a novel simulation-based filtering method for general state space models. It allows for the computation of time-varying conditional means, quantiles, and modes, but also for the prediction of latent variables in general. The method relies on generating artificial samples of data from the joint distribution implied by the model and on estimating the conditional quantities of interest via extremum estimation. We call this procedure Extremum Monte Carlo and define a corresponding class of filters for signal extraction. The method can be applied to any model from which data can be simulated and is not liable to the curse of dimensionality. Furthermore, the use of extremum estimation allows for a wide range of conditioning sets, including data with missing entries and unequal spacing. The filtering method also places the computational burden predominantly in the off-line phase, which makes it particularly suitable for real-time applications. We present illustrations for some challenging problems characterized by nonlinearity, high-dimensionality, and intractable density functions.
    Keywords: Nonlinear non-Gaussian state space models, Least squares Monte Carlo, Real-time filtering, Intractable densities, Curse of dimensionality
    Date: 2023–03–24
  6. By: Hou, Kewei (Ohio State U); Qiao, Fang (U of International Business and Economics, Beijing); Zhang, Xiaoyan (Tsinghua U)
    Abstract: To study the cross-section of returns in the Chinese stock market, we follow the anomaly literature and construct 454 strategies between 2000 and 2020, based on 208 firm-level trading and accounting signals. With the conventional single-testing t-statistic cutoff of 1.96, 101 strategies have significant value-weighted raw returns, and 20 remain significant after risk adjustments. To avoid false discoveries, we recalibrate the t-statistic cutoff to 2.85 to accommodate multiple testing. 36 strategies survive the higher hurdle rate in value-weighted raw returns, while none remains significant after risk adjustments. When we use machine learning techniques to combine information from multiple signals, the resulting composite strategies mostly have significant returns after risk adjustments, even with the higher t-statistic cutoff. We relate Chinese anomaly returns to aggregate economic conditions and find that they comove with financial market development, accounting quality, market liquidity, and government regulations.
    JEL: G1 G12
    Date: 2023–01
  7. By: Łukasz Lepak; Paweł Wawrzyński
    Abstract: An increasing part of energy is produced from renewable sources by a large number of small producers. The efficiency of these sources is volatile and, to some extent, random, exacerbating the energy market balance problem. In many countries, that balancing is performed on day-ahead (DA) energy markets. In this paper, we consider automated trading on a DA energy market by a medium-sized prosumer. We model this activity as a Markov Decision Process and formalize a framework in which a ready-to-use strategy can be optimized with real-life data. We synthesize parametric trading strategies and optimize them with an evolutionary algorithm. We also use state-of-the-art reinforcement learning algorithms to optimize a black-box trading strategy fed with available information from the environment that can impact future prices.
    Date: 2023–03
  8. By: Christoph J. Börner; Ingo Hoffmann; John H. Stiebel
    Abstract: Models for spin systems known from statistical physics are applied by analogy in econometrics in the form of agent-based models. Researchers suggest that the state variable temperature $T$ corresponds to volatility $\sigma$ in capital market theory problems. To the best of our knowledge, this has not yet been theoretically derived, for example, for an ideal agent system. In the present paper, we derive the exact algebraic relation between $T$ and $\sigma$ for an ideal agent system and discuss implications and limitations.
    Date: 2023–03
  9. By: Melissa Newham; Marica Valente
    Abstract: This paper studies how gifts – monetary or in-kind payments – from drug firms to physicians in the US affect prescriptions and drug costs. We estimate heterogeneous treatment effects by combining physician-level data on antidiabetic prescriptions and payments with causal inference and machine learning methods. We find that payments cause physicians to prescribe more brand drugs, resulting in a cost increase of $30 per dollar received. Responses differ widely across physicians, and are primarily explained by variation in patients’ out-of-pocket costs. A gift ban is estimated to decrease drug costs by 3-4%. Taken together, these novel findings reveal how payments shape prescription choices and drive up costs.
    Keywords: public health, payments to physicians, gift ban, heterogeneous treatment effects, causal machine learning
    JEL: I11 I18 M31
    Date: 2023–03
  10. By: Finn Lattimore; Daniel M. Steinberg; Anna Zhu
    Abstract: Pursuing educational qualifications later in life is an increasingly common phenomenon within OECD countries, since technological change and automation continue to drive the evolution of skills needed in many professions. We focus on the causal impacts on economic returns of degrees completed later in life, where motivations and capabilities to acquire additional education may be distinct from education in early years. We find that completing an additional degree leads to a gain of more than $3000 (AUD, 2019) per year relative to those who do not complete additional study. For outcomes, treatment and controls we use the extremely rich and nationally representative longitudinal data from the Household Income and Labour Dynamics Australia survey. To take full advantage of the complexity and richness of this data we use a Machine Learning (ML) based methodology to estimate the causal effect. We are also able to use ML to discover sources of heterogeneity in the effects of gaining additional qualifications; for example, those younger than 45 years of age when obtaining additional qualifications tend to reap more benefits (as much as $50 per week more) than others.
    Date: 2023–04
  11. By: Tianyu Du; Ayush Kanodia; Susan Athey
    Abstract: torch-choice is an open-source library for flexible, fast choice modeling with Python and PyTorch. torch-choice provides a ChoiceDataset data structure to manage databases flexibly and memory-efficiently. The paper demonstrates constructing a ChoiceDataset from databases of various formats and the functionalities of ChoiceDataset. The package implements two widely used models, namely the multinomial logit and nested logit models, and supports regularization during model estimation. The package incorporates the option to take advantage of GPUs for estimation, allowing it to scale to massive datasets while being computationally efficient. Models can be initialized using either R-style formula strings or Python dictionaries. We conclude with a comparison of the computational efficiencies of torch-choice and mlogit in R as (1) the number of observations increases, (2) the number of covariates increases, and (3) the item set expands. Finally, we demonstrate the scalability of torch-choice on large-scale datasets.
    Date: 2023–04
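
As an illustration of the multinomial logit model named in entry 11, the following is a minimal pure-Python sketch of its choice probabilities and log-likelihood. It does not use torch-choice and does not reproduce its API. The function names choice_probabilities and log_likelihood, the linear utility specification, and the toy data are assumptions made for this example only.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j)."""
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities)
    exps = [math.exp(u - u_max) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(beta, sessions):
    """Sum of log choice probabilities over sessions.

    beta     -- list of coefficients (hypothetical linear utility u = beta . x)
    sessions -- list of (item_features, chosen_index) pairs, where
                item_features holds one feature list per available item
    """
    ll = 0.0
    for item_features, chosen in sessions:
        utilities = [
            sum(b * x for b, x in zip(beta, features))
            for features in item_features
        ]
        ll += math.log(choice_probabilities(utilities)[chosen])
    return ll


# Toy data (made up for illustration): two sessions, three items, two features.
toy_sessions = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], 0),
    ([[2.0, 1.0], [0.5, 0.5], [0.0, 0.0]], 2),
]
print(log_likelihood([0.3, -0.1], toy_sessions))
```

Estimating the model amounts to choosing beta to maximize this log-likelihood. A package of the kind described in the abstract would typically do this with batched tensor operations and a gradient-based optimizer rather than Python loops.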

General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.