nep-cmp New Economics Papers
on Computational Economics
Issue of 2021‒11‒22
thirteen papers chosen by



  1. Power of machine learning algorithms for predicting dropouts from a German telemonitoring program using standardized claims data By Hofer, Florian; Birkner, Benjamin; Spindler, Martin
  2. Empirical asset pricing and ensemble machine learning By Zhang, Hongwei
  3. A Scalable Inference Method For Large Dynamic Economic Systems By Pratha Khandelwal; Philip Nadler; Rossella Arcucci; William Knottenbelt; Yi-Ke Guo
  4. Algorithmic and human collusion By Werner, Tobias
  5. Double generative adversarial networks for conditional independence testing By Shi, Chengchun; Xu, Tianlin; Bergsma, Wicher; Li, Lexin
  6. ABIDES-Gym: Gym Environments for Multi-Agent Discrete Event Simulation and Application to Financial Markets By Selim Amrouni; Aymeric Moulin; Jared Vann; Svitlana Vyetrenko; Tucker Balch; Manuela Veloso
  7. Fraud detection in the era of Machine Learning: a household insurance case By Denisa BANULESCU-RADU; Meryem YANKOL-SCHALCK
  8. Deep Calibration of Interest Rates Model By Mohamed Ben Alaya; Ahmed Kebaier; Djibril Sarr
  9. Towards Realistic Market Simulations: a Generative Adversarial Networks Approach By Andrea Coletta; Matteo Prata; Michele Conti; Emanuele Mercanti; Novella Bartolini; Aymeric Moulin; Svitlana Vyetrenko; Tucker Balch
  10. Forecasting with VAR-teXt and DFM-teXt Models: exploring the predictive power of central bank communication By Leonardo Nogueira Ferreira
  11. Trading via Selective Classification By Nestoras Chalkidis; Rahul Savani
  12. Tax evasion, behavioral microsimulation models and flat-rate tax reforms. Analysis for Italy By Andrea Albarea; Michele Bernasconi; Anna Marenzi; Dino Rizzi
  13. Microscopic Simulation of Decentralized Dispatching Strategies in Railways By van Lieshout, R.N.; van den Akker, J.M.; R. Mendes Borges; T. Druijf; Quaglietta, E.

  1. By: Hofer, Florian; Birkner, Benjamin; Spindler, Martin
    Abstract: Background: Statutory health insurers in Germany offer a variety of disease management, prevention and health promotion programs to their insurees. Identifying patients with a high probability of leaving these programs prematurely helps insurers to offer better support to those at the highest risk of dropping out, potentially reducing costs and improving health outcomes for the most vulnerable. Objective: To evaluate whether machine learning methods outperform linear regression in predicting dropouts from a telemonitoring program. Methods: Linear regression and machine learning were used to predict dropouts from a telemonitoring program for patients with COPD, using information derived from claims data only. Different feature sets are used to compare model performance between and within different methods. Repeated 10-fold cross-validation with downsampling followed by grid searches was applied to tune relevant hyperparameters. Results: Random forest performed best with the highest AUC of 0.60. Applying logistic regression resulted in higher predictive power with regard to the correct classification of dropouts than neural networks, with a sensitivity of 56%. All machine learning algorithms outperformed linear regression with respect to specificity. Overall predictive performance of all methods was modest at best. Conclusion: Using features derived from claims data only, machine learning methods performed similarly to linear regression in predicting dropouts from a telemonitoring program. However, as our data set contained information from only 1,302 individuals, our results may not be generalizable to the broader population.
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:zbw:hcherp:202124&r=
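    For readers who want to reproduce the general set-up, the following minimal scikit-learn sketch mirrors the evaluation described above (repeated 10-fold cross-validation, a hyperparameter grid search, AUC scoring). It is illustrative only: the feature matrix and labels are random placeholders, and class weighting stands in for the authors' downsampling.
      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold, cross_val_score

      # Placeholder data: replace with features and dropout labels built from claims data.
      rng = np.random.default_rng(0)
      X = rng.normal(size=(1302, 20))
      y = rng.integers(0, 2, size=1302)

      cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=2, random_state=0)

      # Random forest tuned on AUC over a small hyperparameter grid.
      rf_search = GridSearchCV(
          RandomForestClassifier(class_weight="balanced", random_state=0),
          param_grid={"n_estimators": [100, 300], "max_depth": [3, 5, None]},
          scoring="roc_auc",
          cv=cv,
      )
      rf_search.fit(X, y)
      print("random forest CV AUC:", round(rf_search.best_score_, 2))

      # Logistic regression as the simpler benchmark.
      logit = LogisticRegression(class_weight="balanced", max_iter=1000)
      print("logistic regression CV AUC:",
            round(cross_val_score(logit, X, y, scoring="roc_auc", cv=cv).mean(), 2))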
  2. By: Zhang, Hongwei (Tilburg University, School of Economics and Management)
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:tiu:tiutis:15134355-ab64-47b0-b581-518bc381fb87&r=
  3. By: Pratha Khandelwal; Philip Nadler; Rossella Arcucci; William Knottenbelt; Yi-Ke Guo
    Abstract: The nature of available economic data has changed fundamentally in the last decade due to the economy's digitisation. With the prevalence of often black-box data-driven machine learning methods, there is a necessity to develop interpretable machine learning methods that can conduct econometric inference, helping policymakers leverage the new nature of economic data. We therefore present a novel Variational Bayesian Inference approach to a time-varying parameter auto-regressive model which is scalable to big data. Our model is applied to a large blockchain dataset containing prices and transactions of individual actors, analyzing transactional flows and price movements at a very granular level. The model is extendable to any dataset which can be modelled as a dynamical system. We further improve the simple state-space modelling by introducing non-linearities in the forward model with the help of machine learning architectures.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.14346&r=
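    The paper's scalable variational Bayesian scheme is not reproduced here; as a toy point of reference for the state-space structure it targets, the sketch below filters a single time-varying AR(1) coefficient with a standard Kalman recursion instead. All parameter values and the simulated series are illustrative.
      import numpy as np

      def tvp_ar1_kalman(y, q=1e-3, sigma2=1.0):
          """Filtered path of a time-varying AR(1) coefficient:
          y_t = beta_t * y_{t-1} + eps_t,   beta_t = beta_{t-1} + eta_t."""
          beta, P = 0.0, 1.0              # prior mean and variance of beta_0
          path = []
          for t in range(1, len(y)):
              H = y[t - 1]                # regressor for this observation
              P_pred = P + q              # random-walk state prediction
              S = H * P_pred * H + sigma2
              K = P_pred * H / S          # Kalman gain
              beta = beta + K * (y[t] - H * beta)
              P = (1.0 - K * H) * P_pred
              path.append(beta)
          return np.array(path)

      # Toy usage on a simulated series with a slowly drifting coefficient.
      rng = np.random.default_rng(1)
      T, beta_true, y = 500, 0.3, np.zeros(500)
      for t in range(1, T):
          beta_true += 0.002 * rng.normal()
          y[t] = beta_true * y[t - 1] + rng.normal()
      print(tvp_ar1_kalman(y)[-5:])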
  4. By: Werner, Tobias
    Abstract: As self-learning pricing algorithms become popular, there are growing concerns among academics and regulators that algorithms could learn to collude tacitly on non-competitive prices and thereby harm competition. I study popular reinforcement learning algorithms and show that they develop collusive behavior in a simulated market environment. To derive a counterfactual that resembles traditional tacit collusion, I conduct market experiments with human participants in the same environment. Across different treatments, I vary the market size and the number of firms that use a self-learned pricing algorithm. I provide evidence that oligopoly markets can become more collusive if algorithms make pricing decisions instead of humans. In two-firm markets, market prices are weakly increasing in the number of algorithms in the market. In three-firm markets, algorithms weaken competition if most firms use an algorithm and human sellers are inexperienced.
    Keywords: Artificial Intelligence,Collusion,Experiment,Human-Machine Interaction
    JEL: C90 D83 L13 L41
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:zbw:dicedp:372&r=
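    A stripped-down sketch of the kind of simulated environment studied above: two epsilon-greedy Q-learning agents repeatedly set prices in a differentiated Bertrand duopoly with logit demand. The demand system, learning parameters and price grid are all invented for illustration and are not the paper's design.
      import numpy as np

      rng = np.random.default_rng(0)
      prices = np.linspace(1.0, 2.0, 11)             # discrete price grid
      n, cost, mu = len(prices), 1.0, 0.25           # grid size, marginal cost, logit scale

      def profit(p_own, p_rival):
          share = np.exp(-p_own / mu) / (np.exp(-p_own / mu) + np.exp(-p_rival / mu) + 1.0)
          return (p_own - cost) * share

      alpha, gamma, eps_decay = 0.1, 0.95, 2e-5
      Q = [np.zeros((n * n, n)) for _ in range(2)]   # state = last period's price pair
      state = 0
      for t in range(100_000):
          eps = np.exp(-eps_decay * t)               # exploration decays over time
          acts = [rng.integers(n) if rng.random() < eps else int(np.argmax(Q[i][state]))
                  for i in range(2)]
          rewards = [profit(prices[acts[0]], prices[acts[1]]),
                     profit(prices[acts[1]], prices[acts[0]])]
          new_state = acts[0] * n + acts[1]
          for i in range(2):                         # standard Q-learning update
              target = rewards[i] + gamma * Q[i][new_state].max()
              Q[i][state, acts[i]] += alpha * (target - Q[i][state, acts[i]])
          state = new_state
      print("prices in the last period:", prices[acts[0]], prices[acts[1]])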
  5. By: Shi, Chengchun; Xu, Tianlin; Bergsma, Wicher; Li, Lexin
    Abstract: In this article, we study the problem of high-dimensional conditional independence testing, a key building block in statistics and machine learning. We propose an inferential procedure based on double generative adversarial networks (GANs). Specifically, we first introduce a double GANs framework to learn two generators of the conditional distributions. We then integrate the two generators to construct a test statistic, which takes the form of the maximum of generalized covariance measures of multiple transformation functions. We also employ data-splitting and cross-fitting to minimize the conditions on the generators needed to achieve the desired asymptotic properties, and employ the multiplier bootstrap to obtain the corresponding p-value. We show that the constructed test statistic is doubly robust, and the resulting test both controls type-I error and has power approaching one asymptotically. Notably, we establish these theoretical guarantees under much weaker and practically more feasible conditions than the existing tests, and our proposal gives a concrete example of how to utilize state-of-the-art deep learning tools, such as GANs, to help address a classical but challenging statistical problem. We demonstrate the efficacy of our test through both simulations and an application to an anti-cancer drug dataset.
    Keywords: conditional independence; double-robustness; generalized covariance measure; generative adversarial networks; multiplier bootstrap
    JEL: C1
    Date: 2021–11–02
    URL: http://d.repec.org/n?u=RePEc:ehl:lserod:112550&r=
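    The generalized covariance measure and the multiplier bootstrap mentioned above can be illustrated without GANs: the sketch below uses random-forest regressions in place of the paper's generators of the conditional distributions, so it is only a simplified stand-in for the proposed test.
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      def gcm_pvalue(x, y, z, n_boot=1000, seed=0):
          """Generalized-covariance-measure test of X independent of Y given Z,
          with a multiplier-bootstrap p-value (regressions replace the GANs)."""
          rng = np.random.default_rng(seed)
          rx = x - RandomForestRegressor(random_state=0).fit(z, x).predict(z)
          ry = y - RandomForestRegressor(random_state=0).fit(z, y).predict(z)
          r = rx * ry                                        # residual products
          stat = np.sqrt(len(r)) * r.mean() / r.std()
          e = rng.normal(size=(n_boot, len(r)))              # Gaussian multipliers
          boot = np.sqrt(len(r)) * (e * (r - r.mean())).mean(axis=1) / r.std()
          return 2 * min((boot >= stat).mean(), (boot <= stat).mean())

      # Toy usage: X and Y are both driven by Z, hence conditionally independent.
      rng = np.random.default_rng(1)
      z = rng.normal(size=(500, 3))
      x = z[:, 0] + 0.5 * rng.normal(size=500)
      y = z[:, 0] - z[:, 1] + 0.5 * rng.normal(size=500)
      print("p-value:", gcm_pvalue(x, y, z))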
  6. By: Selim Amrouni; Aymeric Moulin; Jared Vann; Svitlana Vyetrenko; Tucker Balch; Manuela Veloso
    Abstract: Model-free Reinforcement Learning (RL) requires the ability to sample trajectories by taking actions in the original problem environment or a simulated version of it. Breakthroughs in the field of RL have been largely facilitated by the development of dedicated open-source simulators with easy-to-use frameworks such as OpenAI Gym and its Atari environments. In this paper we propose to use the OpenAI Gym framework on discrete-event-time-based Discrete Event Multi-Agent Simulation (DEMAS). We introduce a general technique to wrap a DEMAS simulator into the Gym framework. We expose the technique in detail and implement it using the simulator ABIDES as a base. We apply this work by specifically using the markets extension of ABIDES, ABIDES-Markets, and develop two benchmark financial-market OpenAI Gym environments for training daily investor and execution agents. As a result, these two environments describe classic financial problems with a complex, interactive market response to the experimental agent's actions.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.14771&r=
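    The wrapping pattern described above can be sketched in a few lines, assuming the classic gym package interface. Everything below is a toy stand-in rather than the ABIDES or ABIDES-Gym API: a dummy event-driven simulator is advanced until the experimental agent wakes up, and the usual reset/step interface is exposed around it.
      import gym
      import numpy as np
      from gym import spaces

      class DummySimulator:
          """Toy discrete-event kernel (not ABIDES): a random-walk price."""
          def __init__(self, horizon=100):
              self.horizon = horizon
          def reset(self):
              self.t, self.price, self.position, self.last_move = 0, 100.0, 0, 0.0
          def apply_agent_action(self, action):
              self.position += int(action) - 1       # map {0, 1, 2} to {-1, 0, +1}
          def run_until_agent_wakeup(self):
              self.t += 1                            # advance the (trivial) event queue
              self.last_move = float(np.random.normal())
              self.price += self.last_move
              return np.array([self.price, self.position, self.t, self.last_move],
                              dtype=np.float32)
          def mark_to_market_change(self):
              return self.position * self.last_move
          def is_finished(self):
              return self.t >= self.horizon

      class DemasTradingEnv(gym.Env):
          """Gym wrapper: run the simulator until control returns to the agent."""
          def __init__(self, simulator):
              self.sim = simulator
              self.action_space = spaces.Discrete(3)             # sell / hold / buy
              self.observation_space = spaces.Box(-np.inf, np.inf, shape=(4,))
          def reset(self):
              self.sim.reset()
              return self.sim.run_until_agent_wakeup()
          def step(self, action):
              self.sim.apply_agent_action(action)
              obs = self.sim.run_until_agent_wakeup()
              return obs, self.sim.mark_to_market_change(), self.sim.is_finished(), {}

      env, done, total = DemasTradingEnv(DummySimulator()), False, 0.0
      obs = env.reset()
      while not done:
          obs, reward, done, info = env.step(env.action_space.sample())
          total += reward
      print("episode reward:", round(total, 2))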
  7. By: Denisa BANULESCU-RADU; Meryem YANKOL-SCHALCK
    Keywords: Fraud detection, Household insurance, Machine learning, Logistic LASSO, XGBoost, Imbalanced data, SHAP
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:leo:wpaper:2904&r=
  8. By: Mohamed Ben Alaya; Ahmed Kebaier; Djibril Sarr
    Abstract: For any financial institution it is a necessity to be able to apprehend the behavior of interest rates. Although the use of Deep Learning is growing very fast, for many reasons (expertise, ease of use, ...) classic rate models such as CIR or the Gaussian family are still widely used. We propose to calibrate the five parameters of the G2++ model using Neural Networks. To achieve that, we construct synthetic data sets of parameters drawn uniformly from a reference set of parameters calibrated from the market. From those parameters, we compute Zero-Coupon and Forward rates and their covariances and correlations. Our first model is a Fully Connected Neural Network and uses only covariances and correlations. We show that covariances are more suited to the problem than correlations. The second model is a Convolutional Neural Network using only Zero-Coupon rates with no transformation. The methods we propose perform very quickly (less than 0.3 seconds for 2,000 calibrations) and achieve low errors and a good fit.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.15133&r=
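    The first model described above (a fully connected network regressing the five G2++ parameters on covariance features) can be sketched as follows. The training set here is a random placeholder; in the paper it is built by simulating the model over parameters drawn around a market-calibrated reference set.
      import torch
      from torch import nn

      n_features, n_params = 55, 5            # e.g. upper triangle of a 10x10 covariance matrix
      net = nn.Sequential(                    # fully connected regression network
          nn.Linear(n_features, 128), nn.ReLU(),
          nn.Linear(128, 128), nn.ReLU(),
          nn.Linear(128, n_params),           # outputs the five parameters (a, b, sigma, eta, rho)
      )
      opt = torch.optim.Adam(net.parameters(), lr=1e-3)
      loss_fn = nn.MSELoss()

      X = torch.randn(4096, n_features)       # placeholder covariance features
      theta = torch.rand(4096, n_params)      # placeholder parameter draws

      for epoch in range(50):
          opt.zero_grad()
          loss = loss_fn(net(X), theta)
          loss.backward()
          opt.step()
      print("final training loss:", float(loss))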
  9. By: Andrea Coletta; Matteo Prata; Michele Conti; Emanuele Mercanti; Novella Bartolini; Aymeric Moulin; Svitlana Vyetrenko; Tucker Balch
    Abstract: Simulated environments are increasingly used by trading firms and investment banks to evaluate trading strategies before approaching real markets. Backtesting, a widely used approach, consists of simulating experimental strategies while replaying historical market scenarios. Unfortunately, this approach does not capture the market response to the experimental agents' actions. In contrast, multi-agent simulation presents a natural bottom-up approach to emulating agent interaction in financial markets. It allows setting up pools of traders with diverse strategies to mimic the financial market trader population and testing the performance of new experimental strategies. Since individual agent-level historical data is typically proprietary and not available for public use, it is difficult to calibrate multiple market agents to obtain the realism required for testing trading strategies. To address this challenge we propose a synthetic market generator based on Conditional Generative Adversarial Networks (CGANs) trained on real aggregate-level historical data. A CGAN-based "world" agent can generate meaningful orders in response to an experimental agent. We integrate our synthetic market generator into ABIDES, an open source simulator of financial markets. By means of extensive simulations we show that our proposal outperforms previous work in terms of stylized facts reflecting market responsiveness and realism.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.13287&r=
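    A compact conditional-GAN skeleton conveys the "world" agent idea: a generator produces an order feature vector given a market-state vector, and a discriminator judges (state, order) pairs. Dimensions, data and the training loop below are placeholders, not the paper's setup or the ABIDES integration.
      import torch
      from torch import nn

      state_dim, noise_dim, order_dim = 16, 8, 4
      G = nn.Sequential(nn.Linear(state_dim + noise_dim, 64), nn.ReLU(),
                        nn.Linear(64, order_dim))
      D = nn.Sequential(nn.Linear(state_dim + order_dim, 64), nn.ReLU(),
                        nn.Linear(64, 1))
      opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
      opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
      bce = nn.BCEWithLogitsLoss()

      for step in range(200):
          state = torch.randn(128, state_dim)          # conditioning market state
          real_order = torch.randn(128, order_dim)     # placeholder "historical" orders
          noise = torch.randn(128, noise_dim)
          fake_order = G(torch.cat([state, noise], dim=1))

          # Discriminator step: real (state, order) pairs vs generated ones.
          d_loss = (bce(D(torch.cat([state, real_order], 1)), torch.ones(128, 1))
                    + bce(D(torch.cat([state, fake_order.detach()], 1)), torch.zeros(128, 1)))
          opt_d.zero_grad(); d_loss.backward(); opt_d.step()

          # Generator step: produce orders the discriminator accepts as real.
          g_loss = bce(D(torch.cat([state, fake_order], 1)), torch.ones(128, 1))
          opt_g.zero_grad(); g_loss.backward(); opt_g.step()
      print("final losses:", float(d_loss), float(g_loss))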
  10. By: Leonardo Nogueira Ferreira
    Abstract: This paper explores the complementarity between traditional econometrics and machine learning and applies the resulting model – the VAR-teXt – to central bank communication. The VAR-teXt is a vector autoregressive (VAR) model augmented with information retrieved from text, turned into quantitative data via a Latent Dirichlet Allocation (LDA) model, whereby the number of topics (or textual factors) is chosen based on their predictive performance. A Markov chain Monte Carlo (MCMC) sampling algorithm for the estimation of the VAR-teXt that takes into account the fact that the textual factors are estimates is also provided. The approach is then extended to dynamic factor models (DFM) generating the DFM-teXt. Results show that textual factors based on Federal Open Market Committee (FOMC) statements are indeed useful for forecasting.
    Date: 2021–11
    URL: http://d.repec.org/n?u=RePEc:bcb:wpaper:559&r=
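    A two-step approximation of the VAR-teXt idea, for illustration only: topic shares from an LDA fitted to statement texts are appended to macro series and a VAR is estimated on the augmented data. The toy texts and macro data below are placeholders, and the paper's joint MCMC estimation (which accounts for the topics being estimated) is not reproduced.
      import numpy as np
      import pandas as pd
      from sklearn.decomposition import LatentDirichletAllocation
      from sklearn.feature_extraction.text import CountVectorizer
      from statsmodels.tsa.api import VAR

      # Step 1: texts -> topic shares (keep one of the two shares only, to avoid
      # collinearity with the VAR constant, since shares sum to one).
      texts = ["inflation pressures remain elevated",
               "labor market conditions continued to improve",
               "the committee decided to raise the target range",
               "economic activity expanded at a moderate pace"] * 25
      counts = CountVectorizer().fit_transform(texts)
      topic1 = (LatentDirichletAllocation(n_components=2, random_state=0)
                .fit_transform(counts)[:, :1])

      # Step 2: append the textual factor to (placeholder) macro data and fit a VAR.
      rng = np.random.default_rng(0)
      macro = pd.DataFrame({"gdp_growth": rng.normal(size=100),
                            "inflation": rng.normal(size=100)})
      data = pd.concat([macro, pd.DataFrame(topic1, columns=["topic1"])], axis=1)
      res = VAR(data).fit(maxlags=2)
      print(res.forecast(data.values[-2:], steps=4))   # 4-step-ahead forecast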
  11. By: Nestoras Chalkidis; Rahul Savani
    Abstract: A binary classifier that tries to predict if the price of an asset will increase or decrease naturally gives rise to a trading strategy that follows the prediction and thus always has a position in the market. Selective classification extends a binary or many-class classifier to allow it to abstain from making a prediction for certain inputs, thereby allowing a trade-off between the accuracy of the resulting selective classifier against coverage of the input feature space. Selective classifiers give rise to trading strategies that do not take a trading position when the classifier abstains. We investigate the application of binary and ternary selective classification to trading strategy design. For ternary classification, in addition to classes for the price going up or down, we include a third class that corresponds to relatively small price moves in either direction, and gives the classifier another way to avoid making a directional prediction. We use a walk-forward train-validate-test approach to evaluate and compare binary and ternary, selective and non-selective classifiers across several different feature sets based on four classification approaches: logistic regression, random forests, feed-forward, and recurrent neural networks. We then turn these classifiers into trading strategies for which we perform backtests on commodity futures markets. Our empirical results demonstrate the potential of selective classification for trading.
    Date: 2021–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:2110.14914&r=
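    The core mechanism of a selective trading classifier is easy to sketch: abstain (stay flat) whenever the predicted probability of an up-move is not confident enough in either direction. The threshold, probabilities and returns below are illustrative placeholders.
      import numpy as np

      def selective_positions(prob_up, threshold=0.6):
          """Map P(price up) to positions in {-1, 0, +1}; abstain in between."""
          pos = np.zeros_like(prob_up)
          pos[prob_up >= threshold] = 1.0          # confident up-move: go long
          pos[prob_up <= 1.0 - threshold] = -1.0   # confident down-move: go short
          return pos

      # Toy usage: coverage is the share of periods in which the strategy trades.
      rng = np.random.default_rng(0)
      prob_up = rng.uniform(size=250)
      returns = rng.normal(scale=0.01, size=250)
      pos = selective_positions(prob_up)
      print("coverage:", (pos != 0).mean(),
            "cumulative strategy return:", round(float((pos * returns).sum()), 4))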
  12. By: Andrea Albarea (Department of Economics, University of Venice Ca' Foscari); Michele Bernasconi (Department of Economics, University of Venice Ca' Foscari); Anna Marenzi (Department of Economics, University of Venice Ca' Foscari); Dino Rizzi (Department of Economics, University of Venice Ca' Foscari)
    Abstract: It is sometimes argued that a flat-rate tax reform can reduce tax noncompliance. The argument is, however, inconsistent with the so-called Yitzhaki’s puzzle of the classical expected utility (EU) model. The latter predicts an increase, rather than a reduction, in tax evasion following a cut in the tax rates resulting from a flat-rate reform. We study the impact of a flat-rate tax in a microsimulation tax-benefit model of Italy which allows us to analyse various hypotheses of tax evasion behaviour. In addition to the EU model, we analyse expected utility with rank-dependent probabilities (EURDP) and the model of reference-dependent (RD) preferences, the most favourable to overturning Yitzhaki’s puzzle. Our simulations show that a flat-rate tax would barely reduce overall evasion in Italy in all models considered. Redistributive effects are in all cases large.
    Keywords: Fiscal reforms, tax evasion, reference dependent preferences
    JEL: H20 H26 H30
    Date: 2021
    URL: http://d.repec.org/n?u=RePEc:ven:wpaper:2021:26&r=
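    A small numerical illustration of the Yitzhaki puzzle referred to above, under the classical EU model with the fine proportional to the evaded tax and a CRRA (hence DARA) utility. All parameter values are invented for illustration, and a grid search stands in for the analytical solution.
      import numpy as np

      W, p, theta, rho = 100.0, 0.05, 2.0, 2.0     # income, audit prob., fine rate, CRRA coef.

      def crra(c):
          return c ** (1 - rho) / (1 - rho)

      def optimal_undeclared_income(t, grid=np.linspace(0.0, 100.0, 2001)):
          not_caught = W * (1 - t) + t * grid      # keeps the evaded tax
          caught = W * (1 - t) - theta * t * grid  # pays a fine on the evaded tax
          eu = np.where(caught > 0,
                        (1 - p) * crra(not_caught) + p * crra(np.maximum(caught, 1e-9)),
                        -np.inf)                   # rule out infeasible evasion levels
          return grid[np.argmax(eu)]

      # Under DARA utility, the EU model predicts more evasion at the lower rate.
      for t in (0.23, 0.43):
          print(f"tax rate {t:.0%}: optimal undeclared income = "
                f"{optimal_undeclared_income(t):.1f}")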
  13. By: van Lieshout, R.N.; van den Akker, J.M.; R. Mendes Borges; T. Druijf; Quaglietta, E.
    Abstract: This paper analyzes the effectiveness of decentralized strategies for dispatching rolling stock and train drivers in a railway system. Such strategies give operators a robust alternative when centralized control fails, for instance due to a large number of infrastructure or rolling stock disruptions or to information system malfunctions. We test the performance of four rolling stock and two driver dispatching strategies in a microscopic simulation. Our test case is a part of the Dutch railway network, containing eleven stations linked by four train lines. We find that with the decentralized dispatching strategies, target frequencies of the lines are approximately met and train services are highly regular without large delays. In particular, strategies that allow rolling stock to switch between lines result in high performance.
    Keywords: decentralized control, local dispatching, microscopic simulation, rescheduling
    Date: 2021–11–01
    URL: http://d.repec.org/n?u=RePEc:ems:eureir:136996&r=

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.