nep-cmp New Economics Papers
on Computational Economics
Issue of 2022‒06‒27
twenty papers chosen by

  1. Deep Learning vs. Gradient Boosting: Benchmarking state-of-the-art machine learning algorithms for credit scoring By Marc Schmitt
  2. Researcher reasoning meets computational capacity: Machine learning for social science By Lundberg, Ian; Brand, Jennie E.; Jeon, Nanum
  3. What constitutes a machine-learning-driven business model? A taxonomy of B2B start-ups with machine learning at their core By Vetter, Oliver A.; Hoffmann, Felix Sebastian; Pumplun, Luisa; Buxmann, Peter
  4. Mack-Net model: Blending Mack's model with Recurrent Neural Networks By Eduardo Ramos-Pérez; Pablo J. Alonso-González; José Javier Núñez-Velázquez
  5. Benchmarking Econometric and Machine Learning Methodologies in Nowcasting By Daniel Hopp
  6. Neural Optimal Stopping Boundary By A. Max Reppen; H. Mete Soner; Valentin Tissot-Daguette
  7. HARNet: A Convolutional Neural Network for Realized Volatility Forecasting By Rafael Reisenhofer; Xandro Bayer; Nikolaus Hautsch
  8. How Communication Makes the Difference between a Cartel and Tacit Collusion: A Machine Learning Approach By Maximilian Andres; Lisa Bruttel; Jana Friedrichsen
  9. On the Evolution of Product Portfolio of Cooperatives versus IOFs: An Agent-Based Analysis of the Single Origin Constraint By Deng, W.; Hendrikse, G.W.J.
  10. Numerical Methods for Macroeconomists By Jeremy Greenwood; Ricardo Marto
  11. Machine learning in international trade research - evaluating the impact of trade agreements By Holger Breinlich; Valentina Corradi; Nadia Rocha; Michele Ruta; J.M.C. Santos Silva; Tom Zylkin
  12. RLOP: RL Methods in Option Pricing from a Mathematical Perspective By Ziheng Chen
  13. Polytope Fraud Theory By Dongshuai Zhao; Zhongli Wang; Florian Schweizer-Gamborino; Didier Sornette
  14. Prescriptive maintenance with causal machine learning By Toon Vanderschueren; Robert Boute; Tim Verdonck; Bart Baesens; Wouter Verbeke
  15. Research on the correlation between text emotion mining and stock market based on deep learning By Chenrui Zhang
  16. Efficient Score Computation and Expectation-Maximization Algorithm in Regime-Switching Models By Chaojun Li; Shi Qiu
  17. Does the evaluation stand up to evaluation?: A first-principle approach to the evaluation of classifiers By Dyrland, Kjetil; Lundervold, Alexander Selvikvåg; Porta Mana, PierGianLuca
  18. Property Tax Reform: Implications for Housing Prices and Economic Productivity By Jason Nassios; James Giesecke
  19. A Branch-and-Cut Algorithm for Chance-Constrained Multi-Area Reserve Sizing By Cho, Jehum; Papavasiliou, Anthony
  20. Probabilistic forecasting of German electricity imbalance prices By Michał Narajewski

  1. By: Marc Schmitt
    Abstract: Artificial intelligence (AI) and machine learning (ML) have become vital to remain competitive for financial services companies around the globe. The two models currently competing for the pole position in credit risk management are deep learning (DL) and gradient boosting machines (GBM). This paper benchmarked those two algorithms in the context of credit scoring using three distinct datasets with different features to account for the reality that model choice/power is often dependent on the underlying characteristics of the dataset. The experiment has shown that GBM tends to be more powerful than DL and has also the advantage of speed due to lower computational requirements. This makes GBM the winner and choice for credit scoring. However, it was also shown that the outperformance of GBM is not always guaranteed and ultimately the concrete problem scenario or dataset will determine the final model choice. Overall, based on this study both algorithms can be considered state-of-the-art for binary classification tasks on structured datasets, while GBM should be the go-to solution for most problem scenarios due to easier use, significantly faster training time, and superior accuracy.
    Date: 2022–05
  2. By: Lundberg, Ian; Brand, Jennie E. (UCLA); Jeon, Nanum
    Abstract: Computational power and digital data have created new opportunities to explore and understand the social world. A special synergy is possible when social scientists combine human attention to certain aspects of the problem with the power of algorithms to automate other aspects of the problem. We review selected exemplary applications where machine learning amplifies researcher coding, summarizes complex data, relaxes statistical assumptions, and targets researcher attention. We then seek to reduce perceived barriers to machine learning by summarizing several fundamental building blocks and their grounding in classical statistics. We present a few guiding principles and promising approaches where we see particular potential for machine learning to transform social science inquiry. We conclude that machine learning tools are accessible, worthy of attention, and ready to yield new discoveries.
    Date: 2022–05–23
  3. By: Vetter, Oliver A.; Hoffmann, Felix Sebastian; Pumplun, Luisa; Buxmann, Peter
    Date: 2022–06–18
  4. By: Eduardo Ramos-Pérez; Pablo J. Alonso-González; José Javier Núñez-Velázquez
    Abstract: In general insurance companies, a correct estimation of liabilities plays a key role due to its impact on management and investing decisions. Since the Financial Crisis of 2007-2008 and the strengthening of regulation, the focus is not only on the total reserve but also on its variability, which is an indicator of the risk assumed by the company. Thus, measures that relate profitability with risk are crucial in order to understand the financial position of insurance firms. Taking advantage of the increasing computational power, this paper introduces a stochastic reserving model whose aim is to improve the performance of the traditional Mack's reserving model by applying an ensemble of Recurrent Neural Networks. The results demonstrate that blending traditional reserving models with deep and machine learning techniques leads to a more accurate assessment of general insurance liabilities.
    Date: 2022–05
  5. By: Daniel Hopp
    Abstract: Nowcasting can play a key role in giving policymakers timelier insight into data published with a significant time lag, such as final GDP figures. Currently, there is a plethora of methodologies and approaches for practitioners to choose from. However, a comprehensive comparison of these disparate approaches in terms of predictive performance and characteristics is lacking. This paper addresses that deficiency by examining the performance of 12 different methodologies in nowcasting US quarterly GDP growth, including all the methods most commonly employed in nowcasting, as well as some of the most popular traditional machine learning approaches. Performance was assessed on three different tumultuous periods in US economic history: the early 1980s recession, the 2008 financial crisis, and the COVID crisis. The two best performing methodologies in the analysis were long short-term memory artificial neural networks (LSTM) and Bayesian vector autoregression (BVAR). To facilitate further application and testing of each of the examined methodologies, an open-source repository containing boilerplate code that can be applied to different datasets is published alongside the paper, available at:
    Date: 2022–05
  6. By: A. Max Reppen; H. Mete Soner; Valentin Tissot-Daguette
    Abstract: A method based on deep artificial neural networks and empirical risk minimization is developed to calculate the boundary separating the stopping and continuation regions in optimal stopping. The algorithm parameterizes the stopping boundary as the graph of a function and introduces relaxed stopping rules based on fuzzy boundaries to facilitate efficient optimization. Several financial instruments, some in high dimensions, are analyzed through this method, demonstrating its effectiveness. The existence of the stopping boundary is also proved under natural structural assumptions.
    Date: 2022–05
  7. By: Rafael Reisenhofer; Xandro Bayer; Nikolaus Hautsch
    Abstract: Despite the impressive success of deep neural networks in many application areas, neural network models have so far not been widely adopted in the context of volatility forecasting. In this work, we aim to bridge the conceptual gap between established time series approaches, such as the Heterogeneous Autoregressive (HAR) model, and state-of-the-art deep neural network models. The newly introduced HARNet is based on a hierarchy of dilated convolutional layers, which facilitates an exponential growth of the receptive field of the model in the number of model parameters. HARNets allow for an explicit initialization scheme such that before optimization, a HARNet yields the same predictions as the respective baseline HAR model. Particularly when considering the QLIKE error as a loss function, we find that this approach significantly stabilizes the optimization of HARNets. We evaluate the performance of HARNets with respect to three different stock market indexes. Based on this evaluation, we formulate clear guidelines for the optimization of HARNets and show that HARNets can substantially improve upon the forecasting accuracy of their respective HAR baseline models. In a qualitative analysis of the filter weights learnt by a HARNet, we report clear patterns regarding the predictive power of past information. Among information from the previous week, yesterday and the day before, yesterday's volatility contributes by far the most to today's realized volatility forecast. Moreover, within the previous month, the importance of single weeks diminishes almost linearly when moving further into the past.
    Date: 2022–05
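    Editorial illustration (not taken from the paper): the HAR baseline mentioned in the abstract is commonly written as a linear regression of next-day realized volatility on the latest daily value and on its 5-day and 22-day averages. The Python sketch below builds those three regressors and combines them with given coefficients. The function names, the window lengths of 5 and 22 trading days, and the coefficient values are assumptions chosen for illustration; the paper's exact specification may differ.

```python
def har_features(rv, t):
    """Return the (daily, weekly, monthly) HAR regressors at index t.

    rv is a list of realized-volatility values; the weekly and monthly
    terms are plain averages over the last 5 and 22 observations,
    which is the conventional choice and an assumption here.
    """
    if t < 21:
        raise ValueError("need at least 22 observations up to index t")
    daily = rv[t]
    weekly = sum(rv[t - 4 : t + 1]) / 5
    monthly = sum(rv[t - 21 : t + 1]) / 22
    return daily, weekly, monthly


def har_forecast(rv, t, beta):
    """One-step-ahead HAR forecast with coefficients beta = (b0, bd, bw, bm)."""
    b0, bd, bw, bm = beta
    daily, weekly, monthly = har_features(rv, t)
    return b0 + bd * daily + bw * weekly + bm * monthly


if __name__ == "__main__":
    # Made-up data: 22 days of realized volatility, hypothetical coefficients.
    rv = [float(i) for i in range(1, 23)]
    print(har_features(rv, 21))
    print(har_forecast(rv, 21, (0.1, 0.4, 0.3, 0.2)))
```

    In practice the coefficients would be estimated, for example by least squares, rather than fixed by hand as in this sketch.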
  8. By: Maximilian Andres; Lisa Bruttel; Jana Friedrichsen
    Abstract: This paper sheds new light on the role of communication for cartel formation. Using machine learning to evaluate free-form chat communication among firms in a laboratory experiment, we identify typical communication patterns for both explicit cartel formation and indirect attempts to collude tacitly. We document that firms are less likely to communicate explicitly about price fixing and more likely to use indirect messages when sanctioning institutions are present. This effect of sanctions on communication reinforces the direct cartel-deterring effect of sanctions as collusion is more difficult to reach and sustain without an explicit agreement. Indirect messages have no, or even a negative, effect on prices.
    Keywords: cartel, collusion, communication, machine learning, experiment
    JEL: C92 D43 L41
    Date: 2022
  9. By: Deng, W.; Hendrikse, G.W.J.
    Abstract: An agent-based model is developed to address the relationship between the ownership structure of an enterprise and the evolution of its product portfolio. The coherence and evolution of a product portfolio are operationalized by transition rules regarding the Moore environment. The distinguishing feature of a cooperative is the single origin constraint according to Cook (1997), which is modelled as a cooperative assigning an infinite lifetime to the first product in its product portfolio, while all other products have a finite lifetime. All products of an investor-owned firm (IOF) are assumed to have a finite lifetime. Our simulation results show that the single origin constraint pulls the activities of the cooperative into one cluster centered around the first activity, while the IOF’s product portfolio develops in a centrifugal way. The cooperative and the IOF are more diversified in a mixed duopoly.
    Keywords: Diversification, agent-based model, coherence, single origin constraint, cooperatives
    Date: 2022–05–30
  10. By: Jeremy Greenwood (University of Pennsylvania); Ricardo Marto (University of Pennsylvania)
    Abstract: This primer will cover some of the numerical methods that are used in modern macroeconomics. You will learn how to: (1) solve nonlinear equations via bisection and Newton's method; (2) compute maximization problems via golden section search, discretization, and the particle swarm algorithm; (3) simulate difference equations using the extended path and multiple shooting algorithms; (4) differentiate and integrate functions numerically; (5) conduct Monte Carlo simulations by drawing random variables; (6) construct Markov chains; (7) interpolate functions and smooth data; (8) compute dynamic programming problems; (9) solve for policy functions using the Coleman, endogenous grid, and parameterized expectation algorithms; (10) study the Aiyagari heterogeneous agent model. This will be done while studying economic problems, such as the determination of labor supply, economic growth, and business cycle analysis. Calculus is an integral part of the primer and some elementary probability theory will be drawn upon. The MATLAB programming language will be used. It is time to move into the modern age and learn these techniques. Besides, using computers to solve economic models is fun. The primer is self contained so little prior knowledge is required.
    Keywords: Aiyagari model, calibration, Coleman algorithm, difference equations, dynamic programming, endogenous grid method, interpolating functions, linearization, Markov chains, maximization problems, Monte Carlo simulation, nonlinear equations, numerical differentiation and integration, parameterized expectations, random number generation
    JEL: E10 E17
    Date: 2022–06
  11. By: Holger Breinlich; Valentina Corradi; Nadia Rocha; Michele Ruta; J.M.C. Santos Silva; Tom Zylkin
    Abstract: Modern trade agreements contain a large number of provisions in addition to tariff reductions, in areas as diverse as services trade, competition policy, trade-related investment measures, or public procurement. Existing research has struggled with overfitting and severe multicollinearity problems when trying to estimate the effects of these provisions on trade flows. Building on recent developments in the machine learning and variable selection literature, this paper proposes data-driven methods for selecting the most important provisions and quantifying their impact on trade flows, without the need of making ad hoc assumptions on how to aggregate individual provisions. The analysis finds that provisions related to antidumping, competition policy, technical barriers to trade, and trade facilitation are associated with enhancing the trade-increasing effect of trade agreements.
    Keywords: lasso, machine learning, preferential trade agreements, deep trade agreements
    Date: 2021–06–16
  12. By: Ziheng Chen
    Abstract: In this work, we build two environments, namely the modified QLBS and RLOP models, from a mathematics perspective which enables RL methods in option pricing through replicating by portfolio. We implement the environment specifications (the source code can be found at, the learning algorithm, and agent parametrization by a neural network. The learned optimal hedging strategy is compared against the BS prediction. The effect of various factors is considered and studied based on how they affect the optimal price and position.
    Date: 2022–05
  13. By: Dongshuai Zhao (ETH Zürich - Department of Management, Technology, and Economics (D-MTEC)); Zhongli Wang (Bielefeld University); Florian Schweizer-Gamborino (Price Waterhouse Coopers (PwC)); Didier Sornette (ETH Zürich - Department of Management, Technology, and Economics (D-MTEC); Swiss Finance Institute; Southern University of Science and Technology; Tokyo Institute of Technology)
    Abstract: Polytope Fraud Theory (PFT) extends the existing triangle and diamond theories of accounting fraud with ten abnormal financial practice alarms that a fraudulent firm might trigger. These warning signals are identified through evaluation of the shorting behavior of sophisticated activist short sellers, which are used to train several supervised machine-learning methods in detecting financial statement fraud using published accounting data. Our contributions include a systematic manual collection and labeling of companies that are shorted by professional activist short sellers. We also combine well-known asset pricing factors with accounting red flags in financial features selections. Using 80 percent of the data for training and the remaining 20 percent for out-of-sample test and performance assessment, we find that the best method is XGBoost, with a Recall of 79 percent and F1-score of 85 percent. Other methods have only slightly lower performance, demonstrating the robustness of our results. This shows that the sophisticated activist short sellers, from whom the algorithms are learning, have excellent accounting insights, tremendous forensic analytical knowledge, and sharp business acumen. Our feature importance analysis indicates that potential short-selling targets share many similar financial characteristics, such as bankruptcy or financial distress risk, clustering in some industries, inconsistency of profitability, high accrual, and unreasonable business operations. Our results imply the possible automation of advanced financial statement analysis, which can both improve auditing processes and effectively enhance investment performance. Finally, we propose the Unified Investor Protection Framework, summarizing and categorizing investor-protection related theories from the macro-level to the micro-level.
    Keywords: fraud risk assessment, financial fraud, fraud detection, machine learning
    JEL: C45 C53 M40 M41
    Date: 2022–05
  14. By: Toon Vanderschueren; Robert Boute; Tim Verdonck; Bart Baesens; Wouter Verbeke
    Abstract: Machine maintenance is a challenging operational problem, where the goal is to plan sufficient preventive maintenance to avoid machine failures and overhauls. Maintenance is often imperfect in reality and does not make the asset as good as new. Although a variety of imperfect maintenance policies have been proposed in the literature, these rely on strong assumptions regarding the effect of maintenance on the machine's condition, assuming the effect is (1) deterministic or governed by a known probability distribution, and (2) machine-independent. This work proposes to relax both assumptions by learning the effect of maintenance conditional on a machine's characteristics from observational data on similar machines using existing methodologies for causal inference. By predicting the maintenance effect, we can estimate the number of overhauls and failures for different levels of maintenance and, consequently, optimize the preventive maintenance frequency to minimize the total estimated cost. We validate our proposed approach using real-life data on more than 4,000 maintenance contracts from an industrial partner. Empirical results show that our novel, causal approach accurately predicts the maintenance effect and results in individualized maintenance schedules that are more accurate and cost-effective than supervised or non-individualized approaches.
    Date: 2022–06
  15. By: Chenrui Zhang
    Abstract: This paper discusses how to crawl data from financial forums such as stock bar and conduct sentiment analysis with a deep learning model. It uses the BERT model to train on the financial corpus and predict the Shenzhen stock index. A comparative study of the maximal information coefficient (MIC) finds that the sentiment features obtained by applying the BERT model to the financial corpus are reflected in the fluctuation of the stock market, which helps improve prediction accuracy. At the same time, this paper combines deep learning with financial texts to further explore the mechanism by which investor sentiment affects the stock market, which will help national regulatory authorities and policy departments formulate more reasonable policies and guidelines for maintaining the stability of the stock market.
    Date: 2022–05
  16. By: Chaojun Li; Shi Qiu
    Abstract: This study proposes an efficient algorithm for score computation in regime-switching models and, derived from it, an efficient expectation-maximization (EM) algorithm. Unlike existing algorithms, this algorithm does not rely on forward-backward filtering for smoothed regime probabilities and involves only forward computation. Moreover, the algorithm for computing the score is readily extended to compute the Hessian matrix.
    Date: 2022–05
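    Editorial illustration (not the authors' algorithm): "forward computation" in a regime-switching model can be pictured with the standard forward filter, which updates regime probabilities one observation at a time and accumulates the log-likelihood along the way. The Python sketch below assumes Gaussian observations whose mean and standard deviation depend on the regime; the function names and the model itself are assumptions for this example, and the paper's score and EM computations are not reproduced here.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def forward_loglik(ys, P, mus, sigmas, init):
    """Forward filter for a Markov regime-switching Gaussian model.

    P[i][j] is Pr(regime j at t | regime i at t-1); init holds the regime
    probabilities before the first observation. Returns the log-likelihood
    and the filtered regime probabilities after the last observation.
    """
    k = len(init)
    pred = list(init)   # predicted regime probabilities for the next y
    filt = list(init)   # filtered probabilities (returned as-is if ys is empty)
    loglik = 0.0
    for y in ys:
        joint = [pred[j] * normal_pdf(y, mus[j], sigmas[j]) for j in range(k)]
        lik = sum(joint)                 # density of y given the past
        loglik += math.log(lik)
        filt = [v / lik for v in joint]  # Bayes update
        pred = [sum(filt[i] * P[i][j] for i in range(k)) for j in range(k)]
    return loglik, filt


if __name__ == "__main__":
    # Made-up data and parameters: a calm regime and a volatile regime.
    ys = [0.1, -0.3, 0.5, 2.4, -1.9]
    P = [[0.95, 0.05], [0.10, 0.90]]
    print(forward_loglik(ys, P, mus=[0.0, 0.0], sigmas=[0.5, 2.0], init=[0.5, 0.5]))
```

    Only a single pass over the data is needed, with no backward smoothing step.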
  17. By: Dyrland, Kjetil; Lundervold, Alexander Selvikvåg (Western Norway University of Applied Sciences); Porta Mana, PierGianLuca (HVL Western Norway University of Applied Sciences)
    Abstract: How can one meaningfully make a measurement, if the meter does not conform to any standard and its scale expands or shrinks depending on what is measured? In the present work it is argued that current evaluation practices for machine-learning classifiers are affected by this kind of problem, leading to negative consequences that appear when classifiers are put to real use and that could have been avoided. It is proposed that evaluation be grounded on Decision Theory, and the consequences of such foundation are explored. The main result is that every evaluation metric must be a linear combination of confusion-matrix elements, with coefficients – ‘utilities’ – that depend on the specific classification problem. For binary classification, the space of such possible metrics is effectively two-dimensional. It is shown that popular metrics such as precision, balanced accuracy, Matthews Correlation Coefficient, Fowlkes-Mallows index, F1-measure, and Area Under the Curve are never optimal: they always give rise to an avoidable fraction of incorrect evaluations. This fraction is larger than would be caused by the use of a decision-theoretic metric with moderately wrong coefficients.
    Date: 2022–05–27
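    Editorial illustration (not the authors' code): the main result quoted in the abstract, that an evaluation metric should be a linear combination of confusion-matrix elements weighted by problem-specific utilities, can be sketched in Python as follows. The function names, the example labels and the utility values are assumptions for illustration only.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives/negatives for binary labels (1 = positive)."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            cm["tp"] += 1
        elif truth == 0 and pred == 1:
            cm["fp"] += 1
        elif truth == 1 and pred == 0:
            cm["fn"] += 1
        else:
            cm["tn"] += 1
    return cm


def average_utility(cm, utilities):
    """Linear combination of confusion-matrix counts, averaged per case."""
    n = sum(cm.values())
    return sum(utilities[cell] * count for cell, count in cm.items()) / n


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    cm = confusion_matrix(y_true, y_pred)

    # Utility 1 for every correct call and 0 otherwise reproduces accuracy.
    accuracy_like = {"tp": 1, "tn": 1, "fp": 0, "fn": 0}
    # Hypothetical problem in which a missed positive is five times as
    # costly as a false alarm.
    costly_misses = {"tp": 1, "tn": 1, "fp": -1, "fn": -5}

    print(cm)
    print(average_utility(cm, accuracy_like))
    print(average_utility(cm, costly_misses))
```

    Changing the utilities changes the ranking of classifiers, which is why the coefficients have to come from the specific classification problem rather than from a generic score.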
  18. By: Jason Nassios; James Giesecke
    Abstract: Australia has high housing prices by world standards. Australian state and local governments also have a high reliance on a variety of property taxes. This has generated calls for state tax reform. However, with property prices high, a concern of policy makers is that property tax reform might push house prices higher still. We investigate the effects of seventeen property tax reform options, with a particular focus on potential trade-offs between efficiency benefits and house price impacts.
    Keywords: CGE modelling, Immovable property tax, Recurrent property tax, Housing prices, Excess burden
    JEL: C68 E62 H2 H71 R38
    Date: 2022–06
  19. By: Cho, Jehum (Université catholique de Louvain, LIDAM/CORE, Belgium); Papavasiliou, Anthony (Université catholique de Louvain, LIDAM/CORE, Belgium)
    Abstract: We implement an exact mixed-integer programming algorithm for the chance-constrained multi-area reserve sizing problem in the presence of transmission network constraints. The problem can be cast as a two-stage stochastic mixed integer linear program using sample approximation. Due to the complicated structure of the problem, existing methods attempt to find a feasible solution based on heuristics. However, a recent development in integer programming techniques allows us to reformulate the problem into a form where we can solve it to optimality. In this paper, we apply this integer programming approach to solve our problem to optimality and compare the results with those of the existing heuristics.
    Keywords: Multi-area reserve sizing ; chance constraints ; probabilistic constraints ; mixed-integer programming
    Date: 2022–04–28
  20. By: Michał Narajewski
    Abstract: The exponential growth of renewable energy capacity has brought much uncertainty to electricity prices and to electricity generation. To address this challenge, the energy exchanges have been developing further trading possibilities, especially the intraday and balancing markets. For an energy trader participating in both markets, the forecasting of imbalance prices is of particular interest. Therefore, in this manuscript we conduct a very short-term probabilistic forecasting of imbalance prices, contributing to the scarce literature in this novel subject. The forecasting is performed 30 minutes before the delivery, so that the trader might still choose the trading place. The distribution of the imbalance prices is modelled and forecasted using methods well-known in the electricity price forecasting literature: lasso with bootstrap, gamlss, and probabilistic neural networks. The methods are compared with a naive benchmark in a meaningful rolling window study. The results provide evidence of the efficiency between the intraday and balancing markets as the sophisticated methods do not substantially overperform the intraday continuous price index. On the other hand, they significantly improve the empirical coverage. The analysis was conducted on the German market, however it could be easily applied to any other market of similar structure.
    Date: 2022–05

General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.