
on Computational Economics 
By:  Achim Ahrens; Christian B. Hansen; Mark E. Schaffer 
Abstract:  pystacked implements stacked generalization (Wolpert, 1992) for regression and binary classification via Python's scikit-learn. Stacking combines multiple supervised machine learners (the "base" or "level-0" learners) into a single learner. The currently supported base learners include regularized regression, random forest, gradient boosted trees, support vector machines, and feed-forward neural nets (multi-layer perceptron). pystacked can also be used as a `regular' machine learning program to fit a single base learner and, thus, provides an easy-to-use API for scikit-learn's machine learning algorithms. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10896&r= 
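A minimal sketch of the stacked generalization idea the abstract describes, not pystacked's actual API: base learners' out-of-fold predictions become the features of a final "level-1" learner. The two ridge base learners and all data below are illustrative stand-ins.

```python
import numpy as np

# Synthetic regression data for the sketch.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5, 0.0, 0.0])
y = X @ beta + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, lam):
    # Closed-form ridge regression coefficients.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

lams = [0.1, 100.0]                 # two base learners: light and heavy shrinkage
folds = np.array_split(np.arange(n), 5)

# Level-0: out-of-fold predictions of each base learner (cross-fitting).
Z = np.zeros((n, len(lams)))
for fold in folds:
    train = np.setdiff1d(np.arange(n), fold)
    for j, lam in enumerate(lams):
        b = ridge_fit(X[train], y[train], lam)
        Z[fold, j] = X[fold] @ b

# Level-1: combine the base learners (plain least squares keeps the sketch short;
# constrained weights would be more typical).
w, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Refit base learners on all data; the stacked prediction is their weighted mix.
preds = np.column_stack([X @ ridge_fit(X, y, lam) for lam in lams])
stacked = preds @ w
print(round(float(np.mean((stacked - y) ** 2)), 3))
```

The stacked combination should do at least as well in-sample as the weaker (heavily shrunk) base learner alone.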
By:  Yang, Bill Huajian 
Abstract:  Rating transition models are widely used for credit risk evaluation. It is not uncommon for a time-homogeneous Markov rating migration model to deteriorate quickly after projecting repeatedly for a few periods. This is because the time-homogeneous Markov condition is generally not satisfied. For a credit portfolio, rating transition is usually path dependent. In this paper, we propose a recurrent neural network (RNN) model for modeling path-dependent rating migration. An RNN is a type of artificial neural network in which connections between nodes form a directed graph along a temporal sequence. There are neurons for input and output at each time period. The model is informed by the past behaviours of a loan along the path. Information learned from previous periods propagates to future periods. Experiments show this RNN model is robust. 
Keywords:  Path-dependent, rating transition, recurrent neural network, deep learning, Markov property, time-homogeneity 
JEL:  C13 C18 C45 C51 C58 G12 G17 G32 G33 M3 
Date:  2022–08–18 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:114188&r= 
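The path dependence described above can be illustrated with a minimal recurrent cell: the hidden state carries the loan's rating history, so predicted migration probabilities depend on the whole path, not just the current rating as in a time-homogeneous Markov model. The dimensions and random weights are toy values, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ratings, hidden = 4, 8

Wx = rng.normal(scale=0.3, size=(hidden, n_ratings))  # input -> hidden
Wh = rng.normal(scale=0.3, size=(hidden, hidden))     # hidden -> hidden
Wo = rng.normal(scale=0.3, size=(n_ratings, hidden))  # hidden -> output

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def migration_probs(path):
    """Next-period rating probabilities given the full rating path."""
    h = np.zeros(hidden)
    for r in path:
        x = np.eye(n_ratings)[r]          # one-hot current rating
        h = np.tanh(Wx @ x + Wh @ h)      # recurrent update carries history
    return softmax(Wo @ h)

# Two paths ending in the same rating give different transition probabilities:
p1 = migration_probs([0, 1, 2])
p2 = migration_probs([2, 2, 2])
print(p1.round(3), p2.round(3))
```

A Markov model would assign both paths identical transition probabilities; here they differ because the hidden state encodes the path.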
By:  Matthew Dicks; Tim Gebbie 
Abstract:  We consider the learning dynamics of a single reinforcement learning optimal execution trading agent when it interacts with an event-driven agent-based financial market model. Trading takes place asynchronously through a matching engine in event time. The optimal execution agent is considered at different levels of initial order sizes and differently sized state spaces. The resulting impact on the agent-based model and market is considered using a calibration approach that explores changes in the empirical stylised facts and price impact curves. Convergence, volume trajectory and action trace plots are used to visualise the learning dynamics. This demonstrates how an optimal execution agent learns optimal trading decisions inside a simulated reactive market framework, and how this in turn generates a back-reaction that changes the simulated market through the introduction of strategic order-splitting. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10434&r= 
By:  Daniel Stempel (University of Duesseldorf); Johannes Zahner (University of Marburg) 
Abstract:  In the euro area, monetary policy is conducted by a single central bank for 19 member countries. However, countries are heterogeneous in their economic development, including their inflation rates. This paper combines a New Keynesian model and a neural network to assess whether the European Central Bank (ECB) conducted monetary policy between 2002 and 2022 according to the weighted average of the inflation rates within the European Monetary Union (EMU) or reacted more strongly to the inflation rate developments of certain EMU countries. The New Keynesian model first generates data, which are used to train and evaluate several machine learning algorithms. We find that a neural network performs best out-of-sample. Thus, we use this algorithm to classify historical EMU data. Our findings suggest disproportional emphasis on the inflation rates experienced by southern EMU members for the vast majority of the time frame considered (80%). We argue that this result stems from a tendency of the ECB to react more strongly to countries whose inflation rates exhibit greater deviations from their long-term trend. 
Keywords:  New Keynesian Models, Monetary Policy, European Monetary Union, Neural Networks, Transfer Learning 
JEL:  C45 C53 E58 
Date:  2022 
URL:  http://d.repec.org/n?u=RePEc:mar:magkse:202232&r= 
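The pipeline described above can be sketched in miniature: a structural model generates labelled data under two candidate policy rules, a classifier is trained on it, and the fitted classifier then labels data. Everything below is synthetic and illustrative; a hand-coded logistic regression with simple engineered features stands in for the paper's New Keynesian model and neural network.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
pi_avg = rng.normal(2.0, 1.0, size=n)                 # EMU-average inflation (toy)
pi_south = pi_avg + rng.normal(0.0, 1.0, size=n)      # southern-member inflation (toy)
label = rng.integers(0, 2, size=n)                    # 1 = rate reacts to pi_south
rate = np.where(label == 0, 1.5 * pi_avg, 1.5 * pi_south) + rng.normal(0, 0.3, n)

# Features: absolute deviation of the rate from each candidate policy rule
# (hand-engineered here; the paper's network learns such structure itself).
f0 = np.abs(rate - 1.5 * pi_avg)
f1 = np.abs(rate - 1.5 * pi_south)
X = np.column_stack([np.ones(n), f0, f1])

# Logistic regression by plain gradient descent on the cross-entropy loss.
w = np.zeros(3)
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.05 * X.T @ (p - label) / n

acc = float(np.mean(((X @ w) > 0) == label))
print(round(acc, 3))
```

In-sample accuracy is well above chance because the two policy rules leave distinguishable residual patterns, which is the signal the paper's classifier exploits.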
By:  Angelopoulos, Anastasios N. (?); Bates, Stephen (?); Candes, Emmanuel J. (?); Jordan, Michael I. (?); Lei, Lihua (Stanford U) 
Abstract:  We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithm works with any underlying model and (unknown) data-generating distribution and does not require model refitting. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersection-over-union control in instance segmentation, and the simultaneous control of the type-1 error of outlier detection and confidence set coverage in classification or regression. Our main insight is to reframe the risk-control problem as multiple hypothesis testing, enabling techniques and mathematical arguments different from those in the previous literature. We use our framework to provide new calibration methods for several core machine learning tasks with detailed worked examples in computer vision and tabular medical data. 
Date:  2022–04 
URL:  http://d.repec.org/n?u=RePEc:ecl:stabus:4030&r= 
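A minimal split-conformal sketch in the spirit of the framework: calibrate a threshold on held-out nonconformity scores so that prediction sets attain a finite-sample coverage guarantee, with no refitting of the underlying model. The Gaussian "model residuals" are synthetic; the paper's algorithms handle far more general risks (FDR, IoU, etc.) via multiple hypothesis testing.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.1                                        # target miscoverage level
n_cal, n_test = 500, 2000

# Nonconformity scores: absolute residuals of some fixed model on held-out data.
cal_scores = np.abs(rng.normal(size=n_cal))
# Finite-sample-valid order statistic: the ceil((n+1)(1-alpha))-th smallest score.
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
qhat = np.sort(cal_scores)[min(k, n_cal) - 1]

# Coverage on fresh exchangeable test points.
test_scores = np.abs(rng.normal(size=n_test))
coverage = float(np.mean(test_scores <= qhat))
print(round(coverage, 3))
```

Empirical coverage lands near the nominal 90% level, by the exchangeability argument behind split conformal prediction.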
By:  Po-Yi Liu; Chi-Hua Wang; Heng-Hsui Tsai 
Abstract:  This paper presents a novel non-stationary dynamic pricing algorithm design, in which pricing agents face incomplete demand information and market environment shifts. The agents run price experiments to learn each product's demand curve and the profit-maximizing price, while remaining aware of market environment shifts to avoid high opportunity costs from offering suboptimal prices. The proposed ACIDP extends information-directed sampling (IDS) algorithms from statistical machine learning to include microeconomic choice theory, with a novel pricing strategy auditing procedure to escape suboptimal pricing after a market environment shift. The proposed ACIDP outperforms competing bandit algorithms, including Upper Confidence Bound (UCB) and Thompson sampling (TS), in a series of market environment shifts. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09372&r= 
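A sketch of Thompson sampling over a discrete price grid, one of the bandit benchmarks the paper compares against: each price has an unknown purchase probability, and expected profit is price times demand. The demand curve below is synthetic, and the sketch omits what ACIDP adds, namely information-directed sampling and the auditing procedure for environment shifts.

```python
import numpy as np

rng = np.random.default_rng(3)
prices = np.array([1.0, 2.0, 3.0, 4.0])
true_demand = np.array([0.9, 0.6, 0.35, 0.1])   # purchase probability at each price
best_profit = float(np.max(prices * true_demand))

successes = np.ones(len(prices))                # Beta(1, 1) priors per price
failures = np.ones(len(prices))

profit = 0.0
T = 5000
for _ in range(T):
    theta = rng.beta(successes, failures)       # posterior draw of each demand
    j = int(np.argmax(prices * theta))          # price with best sampled profit
    sale = rng.random() < true_demand[j]        # run the price experiment
    successes[j] += sale
    failures[j] += 1 - sale
    profit += prices[j] * sale

avg = profit / T
print(round(avg, 3), round(best_profit, 3))
```

Average realised profit approaches the best fixed-price profit as the posterior concentrates; under an environment shift, this vanilla sampler is exactly what gets stuck, motivating the paper's auditing step.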
By:  Boyi Jin 
Abstract:  This paper proposes a novel portfolio optimization model using an improved deep reinforcement learning algorithm. The objective function of the optimization model is the weighted sum of the expectation and value at risk (VaR) of the portfolio cumulative return. The proposed algorithm is based on an actor-critic architecture, in which the main task of the critic network is to learn the distribution of the portfolio cumulative return using quantile regression, and the actor network outputs the optimal portfolio weights by maximizing the objective function mentioned above. Meanwhile, we exploit a linear transformation function to realize asset short selling. Finally, a multi-process method, called Ape-X, is used to accelerate the speed of deep reinforcement learning training. To validate our proposed approach, we conduct backtesting for two representative portfolios and observe that the proposed model is superior to the benchmark strategies. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10707&r= 
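The distributional idea behind the critic can be shown in isolation: quantile regression minimises the pinball loss, so a set of fitted quantiles approximates the return distribution, from which a VaR can be read off. Stochastic gradient updates on synthetic Gaussian returns stand in for the paper's critic network.

```python
import numpy as np

rng = np.random.default_rng(4)
returns = rng.normal(loc=0.05, scale=0.2, size=20000)   # synthetic cumulative returns

taus = np.array([0.05, 0.25, 0.5, 0.75, 0.95])          # quantile levels to track
q = np.zeros(len(taus))                                  # quantile estimates
lr = 0.01
for r in returns:
    # Stochastic gradient step on the pinball (quantile) loss for each tau:
    # the estimate rises when the sample is above it, falls when below,
    # with asymmetry tau vs (1 - tau).
    q += lr * (taus - (r < q))

var_5pct = -q[0]    # 5% VaR of the return distribution
print(q.round(3), round(var_5pct, 3))
```

The fitted median sits near the true mean 0.05 and the 5% quantile near 0.05 - 1.645 * 0.2, which is how the paper's objective combines expectation and VaR from the learned distribution.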
By:  Yuchao Dong 
Abstract:  In this paper, we study the optimal stopping problem in the so-called exploratory framework, in which the agent takes actions randomly conditional on the current state, and an entropy-regularized term is added to the reward functional. Such a transformation reduces the optimal stopping problem to a standard optimal control problem. We derive the related HJB equation and prove its solvability. Furthermore, we give a convergence rate for policy iteration and a comparison to the classical optimal stopping problem. Based on the theoretical analysis, a reinforcement learning algorithm is designed and numerical results are demonstrated for several models. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.02409&r= 
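A discrete-time toy version of the exploratory stopping idea: instead of a hard stop/continue choice, the randomised value satisfies a softened Bellman equation in which the max over {stop, continue} is replaced by the entropy-regularised log-sum-exp with temperature lam, recovering the classical value as lam goes to 0. The random walk and payoff are assumptions chosen only to make the sketch concrete.

```python
import numpy as np

def exploratory_stopping_value(lam, T=20):
    """Backward induction for stopping a symmetric walk with payoff max(x, 0)."""
    xs = np.arange(-T, T + 1)
    V = np.maximum(xs, 0.0).astype(float)            # terminal: must stop
    for _ in range(T):
        g = np.maximum(xs, 0.0)                      # stopping reward
        C = 0.5 * (np.roll(V, -1) + np.roll(V, 1))   # continuation value
        C[0], C[-1] = V[0], V[-1]                    # crude boundary handling
        if lam == 0:
            V = np.maximum(g, C)                     # classical Bellman update
        else:
            # Entropy-regularised value: max over randomised stop probability p of
            # p*g + (1-p)*C + lam*H(p) = lam * log(exp(g/lam) + exp(C/lam)).
            m = np.maximum(g, C)
            V = m + lam * np.log(np.exp((g - m) / lam) + np.exp((C - m) / lam))
    return float(V[T])                               # value at x = 0

v_soft = exploratory_stopping_value(0.5)
v_hard = exploratory_stopping_value(0.0)
print(round(v_hard, 3), round(v_soft, 3))
```

The exploratory value exceeds the classical one by the entropy bonus, consistent with the comparison result the abstract mentions.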
By:  Ivonne Schwartz; Mark Kirstein 
Abstract:  One challenge in the estimation of financial market agent-based models (FABMs) is to infer reliable insights using numerical simulations validated by only a single observed time series. Ergodicity (besides stationarity) is a strong precondition for any estimation; however, it has not been systematically explored and is often simply presumed. For finite sample lengths and limited computational resources, empirical estimation always takes place in pre-asymptopia. Thus, broken ergodicity must be considered the rule, but it remains largely unclear how to deal with the remaining uncertainty in non-ergodic observables. Here we show how an understanding of the ergodic properties of moment functions can help to improve the estimation of (F)ABMs. We run Monte Carlo experiments and study the convergence behaviour of moment functions of two prototype models. We find infeasibly long convergence times for most of them. Choosing an efficient mix of ensemble size and simulated time length guided our estimation and might help in general. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08169&r= 
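A textbook illustration of the kind of broken ergodicity the paper warns about, not taken from the paper itself: for a multiplicative growth process, the ensemble-average growth factor differs from the time-average growth of a single path, so moments estimated from one finite trajectory can be badly misleading. All parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
up, down = 1.5, 0.6           # multiply wealth by 1.5 or 0.6, each with prob. 1/2
T, n_paths = 2000, 2000

ensemble_growth = 0.5 * up + 0.5 * down   # E[factor] = 1.05 > 1 (ensemble grows)
time_growth = np.sqrt(up * down)          # per-step time-average factor ~ 0.949 < 1

# Simulate many paths and estimate the time-average growth from them.
factors = rng.choice([up, down], size=(n_paths, T))
log_wealth = np.cumsum(np.log(factors), axis=1)
est_time = float(np.exp(log_wealth[:, -1].mean() / T))
print(round(ensemble_growth, 3), round(time_growth, 3), round(est_time, 3))
```

The ensemble mean grows while almost every individual path decays, which is exactly why a single observed time series can fail to identify ensemble moments.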
By:  Daphne Cornelisse; Thomas Rood; Mateusz Malinowski; Yoram Bachrach; Tal Kachman 
Abstract:  In many multi-agent settings, participants can form teams to achieve collective outcomes that may far surpass their individual capabilities. Measuring the relative contributions of agents and allocating them shares of the reward that promote long-lasting cooperation are difficult tasks. Cooperative game theory offers solution concepts identifying distribution schemes that fairly reflect the contribution of individuals to the performance of the team, such as the Shapley value, or that reduce the incentive of agents to abandon their team, such as the Core. Applications of such methods include identifying influential features and sharing the costs of joint ventures or team formation. Unfortunately, using these solutions requires tackling a computational barrier, as they are hard to compute even in restricted settings. In this work, we show how cooperative game-theoretic solutions can be distilled into a learned model by training neural networks to propose fair and stable payoff allocations. We show that our approach creates models that can generalize to games far from the training distribution and can predict solutions for more players than observed during training. An important application of our framework is explainable AI: our approach can be used to speed up Shapley value computations on many instances. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08798&r= 
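Exact Shapley values for a small cooperative game, the kind of target the proposed networks are trained to predict. Enumerating all orderings costs n! evaluations of the characteristic function, which is precisely the computational barrier a learned model amortises. The glove game below is a standard textbook example, not from the paper.

```python
from itertools import permutations

def shapley(players, v):
    """Average marginal contribution of each player over all join orders."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)   # marginal contribution
            coalition.add(p)
    return {p: phi[p] / len(perms) for p in players}

# Glove game: player 0 owns a left glove, players 1 and 2 right gloves;
# any coalition holding a matched pair is worth 1.
def v(S):
    return float(0 in S and (1 in S or 2 in S))

phi = shapley([0, 1, 2], v)
print({p: round(x, 3) for p, x in phi.items()})
```

The scarce left-glove holder gets 2/3 and each right-glove holder 1/6, and the shares sum to the grand coalition's value, the efficiency property a learned payoff-allocation model must also respect.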
By:  Denizalp Goktas; Amy Greenwald 
Abstract:  Min-max optimization problems (i.e., min-max games) have attracted a great deal of attention recently, as their applicability to a wide range of machine learning problems has become evident. In this paper, we study min-max games with dependent strategy sets, where the strategy of the first player constrains the behavior of the second. Such games are best understood as sequential, i.e., Stackelberg, games, for which the relevant solution concept is Stackelberg equilibrium, a generalization of Nash. One of the most popular algorithms for solving min-max games is gradient descent ascent (GDA). We present a straightforward generalization of GDA to min-max Stackelberg games with dependent strategy sets, but show that it may not converge to a Stackelberg equilibrium. We then introduce two variants of GDA, which assume access to a solution oracle for the optimal Karush-Kuhn-Tucker (KKT) multipliers of the games' constraints. We show that such an oracle exists for a large class of convex-concave min-max Stackelberg games, and provide proof that our GDA variants with such an oracle converge in $O(\frac{1}{\varepsilon^2})$ iterations to an $\varepsilon$-Stackelberg equilibrium, improving on the most efficient algorithms currently known, which converge in $O(\frac{1}{\varepsilon^3})$ iterations. We then show that solving Fisher markets, a canonical example of a min-max Stackelberg game, using our novel algorithm corresponds to buyers and sellers using myopic best-response dynamics in a repeated market, allowing us to prove the convergence of these dynamics in $O(\frac{1}{\varepsilon^2})$ iterations in Fisher markets. We close by describing experiments on Fisher markets which suggest potential ways to extend our theoretical results, by demonstrating how different properties of the objective function can affect the convergence and convergence rate of our algorithms. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09690&r= 
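Plain simultaneous GDA on an unconstrained strongly-convex-strongly-concave objective, to fix ideas before the paper's harder setting: f(x, y) = x^2/2 + x*y - y^2/2 has its unique saddle point at (0, 0). This sketch deliberately omits the dependent strategy sets and the KKT-multiplier oracle that the paper's variants add.

```python
def gda(x0, y0, lr=0.1, steps=500):
    """Simultaneous gradient descent in x and ascent in y."""
    x, y = x0, y0
    for _ in range(steps):
        gx = x + y        # df/dx of f(x, y) = x^2/2 + x*y - y^2/2
        gy = x - y        # df/dy
        x, y = x - lr * gx, y + lr * gy
    return x, y

x, y = gda(3.0, -2.0)
print(round(x, 6), round(y, 6))
```

The iterates spiral into the saddle point; on a purely bilinear objective (no curvature terms) the same scheme would cycle or diverge, a standard caveat echoed by the paper's non-convergence example.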
By:  Paz, Hellen; Maia, Mateus; Moraes, Fernando; Lustosa, Ricardo; Costa, Lilia; Macêdo, Samuel; Barreto, Marcos E.; Ara, Anderson 
Abstract:  The analysis of massive databases is a key issue for most applications today, and the use of parallel computing techniques is one of the suitable approaches for that. Apache Spark is a widely employed tool within this context, aiming at processing large amounts of data in a distributed way. For the Statistics community, R is one of the preferred tools. Despite its growth in recent years, it still has limitations for processing large volumes of data on single local machines. In general, the data analysis community has difficulty handling a massive amount of data on local machines, often requiring high-performance computing servers. One way to perform statistical analyses over massive databases is to combine both tools (Spark and R) via the sparklyr package, which allows an R application to use Spark. This paper presents an analysis of Brazilian public data from the Bolsa Família Programme (BFP, a conditional cash transfer), comprising a large data set with 1.26 billion observations. Our goal was to understand how this social program acts in different cities, as well as to identify potentially important variables reflecting its utilization rate. Statistical modeling was performed using a random forest to predict the utilization rate of the BFP. Variable selection was performed through a recent method based on the importance and interpretation of variables in the random forest model. Among the 89 variables initially considered, the final model presented a high predictive capacity with 17 selected variables, and indicated the high importance for the observed utilization rate of variables related to income, education, job informality, and inactive youth, namely: family income, education, occupation, and density of people in the homes. In this work, using a local machine, we highlighted the potential of combining Spark and R for the analysis of a large database of 111.6 GB. This can serve as a proof of concept or reference for other similar works within the Statistics community, and our case study can provide important evidence for further analysis of this important social support programme. 
Keywords:  big data; massive databases; impact evaluation; sparklyr; Bolsa Familia 
JEL:  C1 
Date:  2020–12–01 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:115770&r= 
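The variable-selection idea can be illustrated with permutation importance: shuffle one predictor and measure how much the fitted model's accuracy degrades. The paper does this with a random forest via sparklyr/R on 1.26 billion records; a plain least-squares model on small synthetic data keeps this Python sketch self-contained, and none of the variables below correspond to the BFP data.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 1000, 6
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=n)  # only 2 predictors matter

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
base_mse = np.mean((X @ beta - y) ** 2)

# Permutation importance: MSE increase when predictor j is shuffled.
importance = np.zeros(p)
for j in range(p):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])        # break the j-th predictor
    importance[j] = np.mean((Xp @ beta - y) ** 2) - base_mse

selected = np.argsort(importance)[::-1][:2]     # keep the top-ranked variables
print(sorted(selected.tolist()))
```

Ranking by importance recovers exactly the two informative predictors, the same logic that cuts 89 candidate variables down to 17 in the paper.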
By:  Xia Han; Ruodu Wang; Xun Yu Zhou 
Abstract:  We propose \emph{Choquet regularizers} to measure and manage the level of exploration for reinforcement learning (RL), and reformulate the continuous-time entropy-regularized RL problem of Wang et al. (2020, JMLR, 21(198)), in which we replace the differential entropy used for regularization with a Choquet regularizer. We derive the Hamilton-Jacobi-Bellman equation of the problem, and solve it explicitly in the linear-quadratic (LQ) case via maximizing statically a mean-variance constrained Choquet regularizer. Under the LQ setting, we derive explicit optimal distributions for several specific Choquet regularizers, and conversely identify the Choquet regularizers that generate a number of broadly used exploratory samplers such as $\epsilon$-greedy, exponential, uniform and Gaussian. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08497&r= 
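The role of an exploration regularizer, in its simplest discrete analogue (an illustration, not the paper's Choquet construction): maximizing E_p[r] + lam * H(p) over distributions p yields the Gibbs/softmax distribution p_i proportional to exp(r_i / lam). The differential-entropy case of Wang et al. gives a Gaussian in the LQ setting; Choquet regularizers generalize the entropy term and generate other samplers such as epsilon-greedy and uniform.

```python
import numpy as np

def gibbs(r, lam):
    """Maximizer of E_p[r] + lam * H(p): the softmax at temperature lam."""
    z = np.exp((r - r.max()) / lam)   # shift by max for numerical stability
    return z / z.sum()

r = np.array([1.0, 0.5, 0.0])        # illustrative action rewards
p_hot = gibbs(r, 0.1)                # low temperature: near-greedy
p_cold = gibbs(r, 10.0)              # high temperature: near-uniform
print(p_hot.round(3), p_cold.round(3))
```

The temperature lam tunes the exploration level exactly as the regularization weight does in the entropy-regularized RL objective.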
By:  Zhongze Cai; Hanzhao Wang; Kalyan Talluri; Xiaocheng Li 
Abstract:  Choice modeling has been a central topic in the study of individual preference or utility across many fields including economics, marketing, operations research, and psychology. While the vast majority of the literature on choice models has been devoted to the analytical properties that lead to managerial and policy-making insights, the existing methods to learn a choice model from empirical data are often either computationally intractable or sample inefficient. In this paper, we develop deep learning-based choice models under two settings of choice modeling: (i) feature-free and (ii) feature-based. Our model captures both the intrinsic utility of each candidate choice and the effect that the assortment has on the choice probability. Synthetic and real data experiments demonstrate the performance of the proposed models in terms of the recovery of existing choice models, sample complexity, assortment effect, architecture design, and model interpretation. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09325&r= 
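The baseline these deep models generalize is the classical multinomial logit (MNL): each alternative in the offered assortment is chosen with probability proportional to exp(utility), against an outside (no-purchase) option of utility 0. The utilities below are illustrative values, not estimates from the paper.

```python
import numpy as np

def mnl_choice_probs(utilities, assortment):
    """P(choose i | assortment) under MNL, with an outside option of utility 0."""
    w = np.exp(np.array([utilities[i] for i in assortment]))
    denom = 1.0 + w.sum()                       # 1.0 = exp(0), the outside option
    return {i: float(wi / denom) for i, wi in zip(assortment, w)}

utilities = {0: 1.0, 1: 0.5, 2: -0.2}
p_full = mnl_choice_probs(utilities, [0, 1, 2])
p_small = mnl_choice_probs(utilities, [0, 1])   # drop item 2 from the assortment
print(round(p_full[0], 3), round(p_small[0], 3))
```

Removing item 2 raises the choice probability of item 0, a simple instance of the assortment effect that the paper's models are designed to capture beyond the rigid MNL substitution pattern.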
By:  Ali Saeb 
Abstract:  Janardan (1980) introduces a class of offspring distributions that sandwich between the Bernoulli and the Poisson. This paper extends the Janardan-Galton-Watson (JGW) branching process as a model of stock prices. In this model, the return over time t depends on the initial close price, and the number of offspring plays a role in the expected return and the probability of extinction by time t. The event that the number of offspring in the t-th generation is zero (i.e., extinction of the model at time t) is equivalent to negative return values over the interval [0, t]. We also introduce an algorithm for detecting the trend of stock markets. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08496&r= 
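A sketch of extinction in a standard Galton-Watson branching process, the object the JGW model builds on: with Poisson(m) offspring, the extinction probability q solves the fixed-point equation q = exp(m*(q - 1)), and Monte Carlo simulation recovers it. Janardan's offspring family would replace the Poisson draw below; Poisson is used here only to keep the sketch standard.

```python
import numpy as np

rng = np.random.default_rng(7)
m = 1.5                         # mean offspring number (supercritical: m > 1)
T, n_sims = 50, 4000

# Analytic extinction probability: fixed point of the offspring PGF.
q = 0.5
for _ in range(200):
    q = np.exp(m * (q - 1.0))

# Monte Carlo: simulate generation sizes and count lines that die out.
extinct = 0
for _ in range(n_sims):
    z = 1
    for _ in range(T):
        if z == 0:
            break
        z = rng.poisson(m * z)  # sum of z i.i.d. Poisson(m) offspring counts
    extinct += int(z == 0)
print(round(float(q), 4), round(extinct / n_sims, 4))
```

Since m > 1 the process is supercritical, so extinction (the analogue of a negative return over [0, t] in the paper's interpretation) occurs with probability q strictly between 0 and 1.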