
on Computational Economics 
By:  Achim Ahrens; Christian B. Hansen; Mark E. Schaffer 
Abstract:  pystacked implements stacked generalization (Wolpert, 1992) for regression and binary classification via Python's scikit-learn. Stacking combines multiple supervised machine learners (the "base" or "level-0" learners) into a single learner. The currently supported base learners include regularized regression, random forest, gradient boosted trees, support vector machines, and feed-forward neural nets (multi-layer perceptron). pystacked can also be used as a `regular' machine learning program to fit a single base learner and, thus, provides an easy-to-use API for scikit-learn's machine learning algorithms. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10896&r= 
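A minimal sketch of the stacked generalization idea the abstract describes, not pystacked's actual API: base learners' out-of-fold predictions become the features of a final "level-1" learner. The two ridge base learners and all data below are illustrative stand-ins.

```python
import numpy as np

# Synthetic regression data for the sketch.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5, 0.0, 0.0])
y = X @ beta + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, lam):
    # Closed-form ridge regression coefficients.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

lams = [0.1, 100.0]                 # two base learners: light and heavy shrinkage
folds = np.array_split(np.arange(n), 5)

# Level-0: out-of-fold predictions of each base learner (cross-fitting).
Z = np.zeros((n, len(lams)))
for fold in folds:
    train = np.setdiff1d(np.arange(n), fold)
    for j, lam in enumerate(lams):
        b = ridge_fit(X[train], y[train], lam)
        Z[fold, j] = X[fold] @ b

# Level-1: combine the base learners (plain least squares keeps the sketch short;
# constrained weights would be more typical).
w, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Refit base learners on all data; the stacked prediction is their weighted mix.
preds = np.column_stack([X @ ridge_fit(X, y, lam) for lam in lams])
stacked = preds @ w
print(round(float(np.mean((stacked - y) ** 2)), 3))
```

The stacked combination should do at least as well in-sample as the weaker (heavily shrunk) base learner alone.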
By:  Yang, Bill Huajian 
Abstract:  Rating transition models are widely used for credit risk evaluation. It is not uncommon for a time-homogeneous Markov rating migration model to deteriorate quickly after projecting repeatedly for a few periods. This is because the time-homogeneous Markov condition is generally not satisfied. For a credit portfolio, rating transition is usually path dependent. In this paper, we propose a recurrent neural network (RNN) model for modeling path-dependent rating migration. An RNN is a type of artificial neural network in which connections between nodes form a directed graph along a temporal sequence. There are neurons for input and output at each time period. The model is informed by the past behaviours of a loan along the path. Information learned from previous periods propagates to future periods. Experiments show this RNN model is robust. 
Keywords:  Path-dependent, rating transition, recurrent neural network, deep learning, Markov property, time-homogeneity 
JEL:  C13 C18 C45 C51 C58 G12 G17 G32 G33 M3 
Date:  2022–08–18 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:114188&r= 
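The path dependence described above can be illustrated with a minimal recurrent cell: the hidden state carries the loan's rating history, so predicted migration probabilities depend on the whole path, not just the current rating as in a time-homogeneous Markov model. The dimensions and random weights are toy values, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ratings, hidden = 4, 8

Wx = rng.normal(scale=0.3, size=(hidden, n_ratings))  # input -> hidden
Wh = rng.normal(scale=0.3, size=(hidden, hidden))     # hidden -> hidden
Wo = rng.normal(scale=0.3, size=(n_ratings, hidden))  # hidden -> output

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def migration_probs(path):
    """Next-period rating probabilities given the full rating path."""
    h = np.zeros(hidden)
    for r in path:
        x = np.eye(n_ratings)[r]          # one-hot current rating
        h = np.tanh(Wx @ x + Wh @ h)      # recurrent update carries history
    return softmax(Wo @ h)

# Two paths ending in the same rating give different transition probabilities:
p1 = migration_probs([0, 1, 2])
p2 = migration_probs([2, 2, 2])
print(p1.round(3), p2.round(3))
```

A Markov model would assign both paths identical transition probabilities; here they differ because the hidden state encodes the path.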
By:  Matthew Dicks; Tim Gebbie 
Abstract:  We consider the learning dynamics of a single reinforcement learning optimal execution trading agent when it interacts with an event-driven agent-based financial market model. Trading takes place asynchronously through a matching engine in event time. The optimal execution agent is considered at different levels of initial order sizes and differently sized state spaces. The resulting impact on the agent-based model and market is considered using a calibration approach that explores changes in the empirical stylised facts and price impact curves. Convergence, volume trajectory and action trace plots are used to visualise the learning dynamics. This demonstrates how an optimal execution agent learns optimal trading decisions inside a simulated reactive market framework, and how this in turn generates a back-reaction that changes the simulated market through the introduction of strategic order-splitting. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10434&r= 
By:  Daniel Stempel (University of Duesseldorf); Johannes Zahner (University of Marburg) 
Abstract:  In the euro area, monetary policy is conducted by a single central bank for 19 member countries. However, countries are heterogeneous in their economic development, including their inflation rates. This paper combines a New Keynesian model and a neural network to assess whether the European Central Bank (ECB) conducted monetary policy between 2002 and 2022 according to the weighted average of the inflation rates within the European Monetary Union (EMU) or reacted more strongly to the inflation rate developments of certain EMU countries. The New Keynesian model first generates data, which are used to train and evaluate several machine learning algorithms. We find that a neural network performs best out-of-sample. Thus, we use this algorithm to classify historical EMU data. Our findings suggest disproportional emphasis on the inflation rates experienced by southern EMU members for the vast majority of the time frame considered (80%). We argue that this result stems from a tendency of the ECB to react more strongly to countries whose inflation rates exhibit greater deviations from their long-term trend. 
Keywords:  New Keynesian Models, Monetary Policy, European Monetary Union, Neural Networks, Transfer Learning 
JEL:  C45 C53 E58 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:mar:magkse:202232&r= 
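The pipeline described above can be sketched in miniature: a structural model generates labelled data under two candidate policy rules, a classifier is trained on it, and the fitted classifier then labels data. Everything below is synthetic and illustrative; a hand-coded logistic regression with simple engineered features stands in for the paper's New Keynesian model and neural network.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
pi_avg = rng.normal(2.0, 1.0, size=n)                 # EMU-average inflation (toy)
pi_south = pi_avg + rng.normal(0.0, 1.0, size=n)      # southern-member inflation (toy)
label = rng.integers(0, 2, size=n)                    # 1 = rate reacts to pi_south
rate = np.where(label == 0, 1.5 * pi_avg, 1.5 * pi_south) + rng.normal(0, 0.3, n)

# Features: absolute deviation of the rate from each candidate policy rule
# (hand-engineered here; the paper's network learns such structure itself).
f0 = np.abs(rate - 1.5 * pi_avg)
f1 = np.abs(rate - 1.5 * pi_south)
X = np.column_stack([np.ones(n), f0, f1])

# Logistic regression by plain gradient descent on the cross-entropy loss.
w = np.zeros(3)
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.05 * X.T @ (p - label) / n

acc = float(np.mean(((X @ w) > 0) == label))
print(round(acc, 3))
```

In-sample accuracy is well above chance because the two policy rules leave distinguishable residual patterns, which is the signal the paper's classifier exploits.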
By:  Angelopoulos, Anastasios N. (?); Bates, Stephen (?); Candes, Emmanuel J. (?); Jordan, Michael I. (?); Lei, Lihua (Stanford U) 
Abstract:  We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithm works with any underlying model and (unknown) data-generating distribution and does not require model refitting. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersection-over-union control in instance segmentation, and the simultaneous control of the type-1 error of outlier detection and confidence set coverage in classification or regression. Our main insight is to reframe the risk-control problem as multiple hypothesis testing, enabling techniques and mathematical arguments different from those in the previous literature. We use our framework to provide new calibration methods for several core machine learning tasks with detailed worked examples in computer vision and tabular medical data. 
Date:  2022–04 
URL:  http://d.repec.org/n?u=RePEc:ecl:stabus:4030&r= 
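A minimal split-conformal sketch in the spirit of the framework: calibrate a threshold on held-out nonconformity scores so that prediction sets attain a finite-sample coverage guarantee, with no refitting of the underlying model. The Gaussian "model residuals" are synthetic; the paper's algorithms handle far more general risks (FDR, IoU, etc.) via multiple hypothesis testing.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.1                                        # target miscoverage level
n_cal, n_test = 500, 2000

# Nonconformity scores: absolute residuals of some fixed model on held-out data.
cal_scores = np.abs(rng.normal(size=n_cal))
# Finite-sample-valid order statistic: the ceil((n+1)(1-alpha))-th smallest score.
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
qhat = np.sort(cal_scores)[min(k, n_cal) - 1]

# Coverage on fresh exchangeable test points.
test_scores = np.abs(rng.normal(size=n_test))
coverage = float(np.mean(test_scores <= qhat))
print(round(coverage, 3))
```

Empirical coverage lands near the nominal 90% level, by the exchangeability argument behind split conformal prediction.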
By:  Po-Yi Liu; Chi-Hua Wang; Heng-Hsui Tsai 
Abstract:  This paper presents a novel non-stationary dynamic pricing algorithm design, in which pricing agents face incomplete demand information and market environment shifts. The agents run price experiments to learn each product's demand curve and the profit-maximizing price, while remaining aware of market environment shifts to avoid high opportunity costs from offering suboptimal prices. The proposed ACIDP extends information-directed sampling (IDS) algorithms from statistical machine learning to include microeconomic choice theory, with a novel pricing strategy auditing procedure to escape suboptimal pricing after a market environment shift. The proposed ACIDP outperforms competing bandit algorithms, including Upper Confidence Bound (UCB) and Thompson sampling (TS), in a series of market environment shifts. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09372&r= 
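A sketch of Thompson sampling over a discrete price grid, one of the bandit benchmarks the paper compares against: each price has an unknown purchase probability, and expected profit is price times demand. The demand curve below is synthetic, and the sketch omits what ACIDP adds, namely information-directed sampling and the auditing procedure for environment shifts.

```python
import numpy as np

rng = np.random.default_rng(3)
prices = np.array([1.0, 2.0, 3.0, 4.0])
true_demand = np.array([0.9, 0.6, 0.35, 0.1])   # purchase probability at each price
best_profit = float(np.max(prices * true_demand))

successes = np.ones(len(prices))                # Beta(1, 1) priors per price
failures = np.ones(len(prices))

profit = 0.0
T = 5000
for _ in range(T):
    theta = rng.beta(successes, failures)       # posterior draw of each demand
    j = int(np.argmax(prices * theta))          # price with best sampled profit
    sale = rng.random() < true_demand[j]        # run the price experiment
    successes[j] += sale
    failures[j] += 1 - sale
    profit += prices[j] * sale

avg = profit / T
print(round(avg, 3), round(best_profit, 3))
```

Average realised profit approaches the best fixed-price profit as the posterior concentrates; under an environment shift, this vanilla sampler is exactly what gets stuck, motivating the paper's auditing step.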
By:  Boyi Jin 
Abstract:  This paper proposes a novel portfolio optimization model using an improved deep reinforcement learning algorithm. The objective function of the optimization model is the weighted sum of the expectation and value at risk (VaR) of the portfolio cumulative return. The proposed algorithm is based on an actor-critic architecture, in which the main task of the critic network is to learn the distribution of the portfolio cumulative return using quantile regression, and the actor network outputs the optimal portfolio weights by maximizing the objective function mentioned above. Meanwhile, we exploit a linear transformation function to realize asset short selling. Finally, a multi-process method, called Ape-X, is used to accelerate the speed of deep reinforcement learning training. To validate our proposed approach, we conduct backtesting for two representative portfolios and observe that the proposed model is superior to the benchmark strategies. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10707&r= 
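The distributional idea behind the critic can be shown in isolation: quantile regression minimises the pinball loss, so a set of fitted quantiles approximates the return distribution, from which a VaR can be read off. Stochastic gradient updates on synthetic Gaussian returns stand in for the paper's critic network.

```python
import numpy as np

rng = np.random.default_rng(4)
returns = rng.normal(loc=0.05, scale=0.2, size=20000)   # synthetic cumulative returns

taus = np.array([0.05, 0.25, 0.5, 0.75, 0.95])          # quantile levels to track
q = np.zeros(len(taus))                                  # quantile estimates
lr = 0.01
for r in returns:
    # Stochastic gradient step on the pinball (quantile) loss for each tau:
    # the estimate rises when the sample is above it, falls when below,
    # with asymmetry tau vs (1 - tau).
    q += lr * (taus - (r < q))

var_5pct = -q[0]    # 5% VaR of the return distribution
print(q.round(3), round(var_5pct, 3))
```

The fitted median sits near the true mean 0.05 and the 5% quantile near 0.05 - 1.645 * 0.2, which is how the paper's objective combines expectation and VaR from the learned distribution.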
By:  Yuchao Dong 
Abstract:  In this paper, we study the optimal stopping problem in the so-called exploratory framework, in which the agent takes actions randomly conditional on the current state, and an entropy-regularized term is added to the reward functional. Such a transformation reduces the optimal stopping problem to a standard optimal control problem. We derive the related HJB equation and prove its solvability. Furthermore, we give a convergence rate for policy iteration and a comparison to the classical optimal stopping problem. Based on the theoretical analysis, a reinforcement learning algorithm is designed and numerical results are demonstrated for several models. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02409&r= 
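A discrete-time toy version of the exploratory stopping idea: instead of a hard stop/continue choice, the randomised value satisfies a softened Bellman equation in which the max over {stop, continue} is replaced by the entropy-regularised log-sum-exp with temperature lam, recovering the classical value as lam goes to 0. The random walk and payoff are assumptions chosen only to make the sketch concrete.

```python
import numpy as np

def exploratory_stopping_value(lam, T=20):
    """Backward induction for stopping a symmetric walk with payoff max(x, 0)."""
    xs = np.arange(-T, T + 1)
    V = np.maximum(xs, 0.0).astype(float)            # terminal: must stop
    for _ in range(T):
        g = np.maximum(xs, 0.0)                      # stopping reward
        C = 0.5 * (np.roll(V, -1) + np.roll(V, 1))   # continuation value
        C[0], C[-1] = V[0], V[-1]                    # crude boundary handling
        if lam == 0:
            V = np.maximum(g, C)                     # classical Bellman update
        else:
            # Entropy-regularised value: max over randomised stop probability p of
            # p*g + (1-p)*C + lam*H(p) = lam * log(exp(g/lam) + exp(C/lam)).
            m = np.maximum(g, C)
            V = m + lam * np.log(np.exp((g - m) / lam) + np.exp((C - m) / lam))
    return float(V[T])                               # value at x = 0

v_soft = exploratory_stopping_value(0.5)
v_hard = exploratory_stopping_value(0.0)
print(round(v_hard, 3), round(v_soft, 3))
```

The exploratory value exceeds the classical one by the entropy bonus, consistent with the comparison result the abstract mentions.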
By:  Ivonne Schwartz; Mark Kirstein 
Abstract:  One challenge in the estimation of financial market agent-based models (FABMs) is to infer reliable insights using numerical simulations validated by only a single observed time series. Ergodicity (besides stationarity) is a strong precondition for any estimation; however, it has not been systematically explored and is often simply presumed. For finite sample lengths and limited computational resources, empirical estimation always takes place in pre-asymptopia. Thus, broken ergodicity must be considered the rule, but it remains largely unclear how to deal with the remaining uncertainty in non-ergodic observables. Here we show how an understanding of the ergodic properties of moment functions can help to improve the estimation of (F)ABMs. We run Monte Carlo experiments and study the convergence behaviour of moment functions of two prototype models. We find infeasibly long convergence times for most of them. Choosing an efficient mix of ensemble size and simulated time length guided our estimation and might help in general. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08169&r= 
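A textbook illustration of the kind of broken ergodicity the paper warns about, not taken from the paper itself: for a multiplicative growth process, the ensemble-average growth factor differs from the time-average growth of a single path, so moments estimated from one finite trajectory can be badly misleading. All parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
up, down = 1.5, 0.6           # multiply wealth by 1.5 or 0.6, each with prob. 1/2
T, n_paths = 2000, 2000

ensemble_growth = 0.5 * up + 0.5 * down   # E[factor] = 1.05 > 1 (ensemble grows)
time_growth = np.sqrt(up * down)          # per-step time-average factor ~ 0.949 < 1

# Simulate many paths and estimate the time-average growth from them.
factors = rng.choice([up, down], size=(n_paths, T))
log_wealth = np.cumsum(np.log(factors), axis=1)
est_time = float(np.exp(log_wealth[:, -1].mean() / T))
print(round(ensemble_growth, 3), round(time_growth, 3), round(est_time, 3))
```

The ensemble mean grows while almost every individual path decays, which is exactly why a single observed time series can fail to identify ensemble moments.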
By:  Daphne Cornelisse; Thomas Rood; Mateusz Malinowski; Yoram Bachrach; Tal Kachman 
Abstract:  In many multi-agent settings, participants can form teams to achieve collective outcomes that may far surpass their individual capabilities. Measuring the relative contributions of agents and allocating them shares of the reward that promote long-lasting cooperation are difficult tasks. Cooperative game theory offers solution concepts identifying distribution schemes that fairly reflect the contribution of individuals to the performance of the team, such as the Shapley value, or that reduce the incentive of agents to abandon their team, such as the Core. Applications of such methods include identifying influential features and sharing the costs of joint ventures or team formation. Unfortunately, using these solutions requires tackling a computational barrier, as they are hard to compute even in restricted settings. In this work, we show how cooperative game-theoretic solutions can be distilled into a learned model by training neural networks to propose fair and stable payoff allocations. We show that our approach creates models that can generalize to games far from the training distribution and can predict solutions for more players than observed during training. An important application of our framework is explainable AI: our approach can be used to speed up Shapley value computations on many instances. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08798&r= 
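Exact Shapley values for a small cooperative game, the kind of target the proposed networks are trained to predict. Enumerating all orderings costs n! evaluations of the characteristic function, which is precisely the computational barrier a learned model amortises. The glove game below is a standard textbook example, not from the paper.

```python
from itertools import permutations

def shapley(players, v):
    """Average marginal contribution of each player over all join orders."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)   # marginal contribution
            coalition.add(p)
    return {p: phi[p] / len(perms) for p in players}

# Glove game: player 0 owns a left glove, players 1 and 2 right gloves;
# any coalition holding a matched pair is worth 1.
def v(S):
    return float(0 in S and (1 in S or 2 in S))

phi = shapley([0, 1, 2], v)
print({p: round(x, 3) for p, x in phi.items()})
```

The scarce left-glove holder gets 2/3 and each right-glove holder 1/6, and the shares sum to the grand coalition's value, the efficiency property a learned payoff-allocation model must also respect.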
By:  Denizalp Goktas; Amy Greenwald 
Abstract:  Min-max optimization problems (i.e., min-max games) have attracted a great deal of attention recently, as their applicability to a wide range of machine learning problems has become evident. In this paper, we study min-max games with dependent strategy sets, where the strategy of the first player constrains the behavior of the second. Such games are best understood as sequential, i.e., Stackelberg, games, for which the relevant solution concept is Stackelberg equilibrium, a generalization of Nash. One of the most popular algorithms for solving min-max games is gradient descent ascent (GDA). We present a straightforward generalization of GDA to min-max Stackelberg games with dependent strategy sets, but show that it may not converge to a Stackelberg equilibrium. We then introduce two variants of GDA, which assume access to a solution oracle for the optimal Karush-Kuhn-Tucker (KKT) multipliers of the games' constraints. We show that such an oracle exists for a large class of convex-concave min-max Stackelberg games, and provide proof that our GDA variants with such an oracle converge in $O(\frac{1}{\varepsilon^2})$ iterations to an $\varepsilon$-Stackelberg equilibrium, improving on the most efficient algorithms currently known, which converge in $O(\frac{1}{\varepsilon^3})$ iterations. We then show that solving Fisher markets, a canonical example of a min-max Stackelberg game, using our novel algorithm corresponds to buyers and sellers using myopic best-response dynamics in a repeated market, allowing us to prove the convergence of these dynamics in $O(\frac{1}{\varepsilon^2})$ iterations in Fisher markets. We close by describing experiments on Fisher markets which suggest potential ways to extend our theoretical results, by demonstrating how different properties of the objective function can affect the convergence and convergence rate of our algorithms. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09690&r= 
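Plain simultaneous GDA on an unconstrained strongly-convex-strongly-concave objective, to fix ideas before the paper's harder setting: f(x, y) = x^2/2 + x*y - y^2/2 has its unique saddle point at (0, 0). This sketch deliberately omits the dependent strategy sets and the KKT-multiplier oracle that the paper's variants add.

```python
def gda(x0, y0, lr=0.1, steps=500):
    """Simultaneous gradient descent in x and ascent in y."""
    x, y = x0, y0
    for _ in range(steps):
        gx = x + y        # df/dx of f(x, y) = x^2/2 + x*y - y^2/2
        gy = x - y        # df/dy
        x, y = x - lr * gx, y + lr * gy
    return x, y

x, y = gda(3.0, -2.0)
print(round(x, 6), round(y, 6))
```

The iterates spiral into the saddle point; on a purely bilinear objective (no curvature terms) the same scheme would cycle or diverge, a standard caveat echoed by the paper's non-convergence example.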
By:  Paz, Hellen; Maia, Mateus; Moraes, Fernando; Lustosa, Ricardo; Costa, Lilia; Macêdo, Samuel; Barreto, Marcos E.; Ara, Anderson 
Abstract:  The analysis of massive databases is a key issue for most applications today, and the use of parallel computing techniques is one of the suitable approaches for that. Apache Spark is a widely employed tool within this context, aiming at processing large amounts of data in a distributed way. For the Statistics community, R is one of the preferred tools. Despite its growth in recent years, it still has limitations for processing large volumes of data on single local machines. In general, the data analysis community has difficulty handling a massive amount of data on local machines, often requiring high-performance computing servers. One way to perform statistical analyses over massive databases is to combine both tools (Spark and R) via the sparklyr package, which allows an R application to use Spark. This paper presents an analysis of Brazilian public data from the Bolsa Família Programme (BFP, a conditional cash transfer), comprising a large data set with 1.26 billion observations. Our goal was to understand how this social program acts in different cities, as well as to identify potentially important variables reflecting its utilization rate. Statistical modeling was performed using a random forest to predict the utilization rate of the BFP. Variable selection was performed through a recent method based on the importance and interpretation of variables in the random forest model. Among the 89 variables initially considered, the final model presented a high predictive capacity with 17 selected variables, and indicated the high importance for the observed utilization rate of variables related to income, education, job informality, and inactive youth, namely: family income, education, occupation, and density of people in the homes. In this work, using a local machine, we highlighted the potential of combining Spark and R for the analysis of a large database of 111.6 GB. This can serve as a proof of concept or reference for other similar works within the Statistics community, and our case study can provide important evidence for further analysis of this important social support programme. 
Keywords:  big data; massive databases; impact evaluation; sparklyr; Bolsa Familia 
JEL:  C1 
Date:  2020–12–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:115770&r= 
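The variable-selection idea can be illustrated with permutation importance: shuffle one predictor and measure how much the fitted model's accuracy degrades. The paper does this with a random forest via sparklyr/R on 1.26 billion records; a plain least-squares model on small synthetic data keeps this Python sketch self-contained, and none of the variables below correspond to the BFP data.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 1000, 6
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=n)  # only 2 predictors matter

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
base_mse = np.mean((X @ beta - y) ** 2)

# Permutation importance: MSE increase when predictor j is shuffled.
importance = np.zeros(p)
for j in range(p):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])        # break the j-th predictor
    importance[j] = np.mean((Xp @ beta - y) ** 2) - base_mse

selected = np.argsort(importance)[::-1][:2]     # keep the top-ranked variables
print(sorted(selected.tolist()))
```

Ranking by importance recovers exactly the two informative predictors, the same logic that cuts 89 candidate variables down to 17 in the paper.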
By:  Xia Han; Ruodu Wang; Xun Yu Zhou 
Abstract:  We propose \emph{Choquet regularizers} to measure and manage the level of exploration for reinforcement learning (RL), and reformulate the continuous-time entropy-regularized RL problem of Wang et al. (2020, JMLR, 21(198)), in which we replace the differential entropy used for regularization with a Choquet regularizer. We derive the Hamilton-Jacobi-Bellman equation of the problem, and solve it explicitly in the linear-quadratic (LQ) case via maximizing statically a mean-variance constrained Choquet regularizer. Under the LQ setting, we derive explicit optimal distributions for several specific Choquet regularizers, and conversely identify the Choquet regularizers that generate a number of broadly used exploratory samplers such as $\epsilon$-greedy, exponential, uniform and Gaussian. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08497&r= 
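The role of an exploration regularizer, in its simplest discrete analogue (an illustration, not the paper's Choquet construction): maximizing E_p[r] + lam * H(p) over distributions p yields the Gibbs/softmax distribution p_i proportional to exp(r_i / lam). The differential-entropy case of Wang et al. gives a Gaussian in the LQ setting; Choquet regularizers generalize the entropy term and generate other samplers such as epsilon-greedy and uniform.

```python
import numpy as np

def gibbs(r, lam):
    """Maximizer of E_p[r] + lam * H(p): the softmax at temperature lam."""
    z = np.exp((r - r.max()) / lam)   # shift by max for numerical stability
    return z / z.sum()

r = np.array([1.0, 0.5, 0.0])        # illustrative action rewards
p_hot = gibbs(r, 0.1)                # low temperature: near-greedy
p_cold = gibbs(r, 10.0)              # high temperature: near-uniform
print(p_hot.round(3), p_cold.round(3))
```

The temperature lam tunes the exploration level exactly as the regularization weight does in the entropy-regularized RL objective.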
By:  Zhongze Cai; Hanzhao Wang; Kalyan Talluri; Xiaocheng Li 
Abstract:  Choice modeling has been a central topic in the study of individual preference or utility across many fields including economics, marketing, operations research, and psychology. While the vast majority of the literature on choice models has been devoted to the analytical properties that lead to managerial and policy-making insights, the existing methods to learn a choice model from empirical data are often either computationally intractable or sample inefficient. In this paper, we develop deep learning-based choice models under two settings of choice modeling: (i) feature-free and (ii) feature-based. Our model captures both the intrinsic utility of each candidate choice and the effect that the assortment has on the choice probability. Synthetic and real data experiments demonstrate the performance of the proposed models in terms of the recovery of existing choice models, sample complexity, assortment effect, architecture design, and model interpretation. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09325&r= 
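The baseline these deep models generalize is the classical multinomial logit (MNL): each alternative in the offered assortment is chosen with probability proportional to exp(utility), against an outside (no-purchase) option of utility 0. The utilities below are illustrative values, not estimates from the paper.

```python
import numpy as np

def mnl_choice_probs(utilities, assortment):
    """P(choose i | assortment) under MNL, with an outside option of utility 0."""
    w = np.exp(np.array([utilities[i] for i in assortment]))
    denom = 1.0 + w.sum()                       # 1.0 = exp(0), the outside option
    return {i: float(wi / denom) for i, wi in zip(assortment, w)}

utilities = {0: 1.0, 1: 0.5, 2: -0.2}
p_full = mnl_choice_probs(utilities, [0, 1, 2])
p_small = mnl_choice_probs(utilities, [0, 1])   # drop item 2 from the assortment
print(round(p_full[0], 3), round(p_small[0], 3))
```

Removing item 2 raises the choice probability of item 0, a simple instance of the assortment effect that the paper's models are designed to capture beyond the rigid MNL substitution pattern.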
By:  Ali Saeb 
Abstract:  Janardan (1980) introduces a class of offspring distributions that sandwich between the Bernoulli and the Poisson. This paper extends the Janardan-Galton-Watson (JGW) branching process as a model of stock prices. In this model, the return over time t depends on the initial close price, and the number of offspring plays a role in the expected return and the probability of extinction by time t. The event that the number of offspring in the t-th generation is zero (i.e., extinction of the model at time t) is equivalent to negative return values over the interval [0, t]. We also introduce an algorithm for detecting the trend of stock markets. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08496&r= 
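A sketch of extinction in a standard Galton-Watson branching process, the object the JGW model builds on: with Poisson(m) offspring, the extinction probability q solves the fixed-point equation q = exp(m*(q - 1)), and Monte Carlo simulation recovers it. Janardan's offspring family would replace the Poisson draw below; Poisson is used here only to keep the sketch standard.

```python
import numpy as np

rng = np.random.default_rng(7)
m = 1.5                         # mean offspring number (supercritical: m > 1)
T, n_sims = 50, 4000

# Analytic extinction probability: fixed point of the offspring PGF.
q = 0.5
for _ in range(200):
    q = np.exp(m * (q - 1.0))

# Monte Carlo: simulate generation sizes and count lines that die out.
extinct = 0
for _ in range(n_sims):
    z = 1
    for _ in range(T):
        if z == 0:
            break
        z = rng.poisson(m * z)  # sum of z i.i.d. Poisson(m) offspring counts
    extinct += int(z == 0)
print(round(float(q), 4), round(extinct / n_sims, 4))
```

Since m > 1 the process is supercritical, so extinction (the analogue of a negative return over [0, t] in the paper's interpretation) occurs with probability q strictly between 0 and 1.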