NEP: New Economics Papers on Computational Economics
Issue of 2019‒07‒22
35 papers chosen by
By: | Jacobo Roa-Vicens; Cyrine Chtourou; Angelos Filos; Francisco Rullan; Yarin Gal; Ricardo Silva |
Abstract: | Multi-agent learning is a promising method to simulate aggregate competitive behaviour in finance. Learning expert agents' reward functions through their external demonstrations is hence particularly relevant for subsequent design of realistic agent-based simulations. Inverse Reinforcement Learning (IRL) aims at acquiring such reward functions through inference, allowing the resulting policy to generalize to states not observed in the past. This paper investigates whether IRL can infer such rewards from agents within real financial stochastic environments: limit order books (LOB). We introduce a simple one-level LOB, where the interactions of a number of stochastic agents and an expert trading agent are modelled as a Markov decision process. We consider two cases for the expert's reward: either a simple linear function of state features; or a complex, more realistic non-linear function. Given the expert agent's demonstrations, we attempt to discover their strategy by modelling their latent reward function using linear and Gaussian process (GP) regressors from previous literature, and our own approach through Bayesian neural networks (BNN). While all three methods can learn the linear case, only the GP-based and our proposed BNN methods are able to discover the non-linear reward case. Our BNN IRL algorithm outperforms the other two approaches as the number of samples increases. These results illustrate that complex behaviours, induced by non-linear reward functions amid agent-based stochastic scenarios, can be deduced through inference, encouraging the use of inverse reinforcement learning for opponent-modelling in multi-agent systems. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.04813&r=all |
By: | Francois Belletti; Davis King; Kun Yang; Roland Nelet; Yusef Shafi; Yi-Fan Chen; John Anderson |
Abstract: | Monte Carlo methods are core to many routines in quantitative finance such as derivatives pricing, hedging and risk metrics. Unfortunately, they are computationally expensive when it comes to running simulations in high-dimensional state spaces, where they nevertheless remain a method of choice in the financial industry. Recently, Tensor Processing Units (TPUs) have provided considerable speedups and decreased the cost of running Stochastic Gradient Descent (SGD) in Deep Learning. After highlighting computational similarities between training neural networks with SGD and simulating stochastic processes, we ask in the present paper whether TPUs are accurate, fast and simple enough to use for financial Monte Carlo. Through a theoretical reminder of the key properties of such methods and thorough empirical experiments, we examine the fitness of TPUs for option pricing, hedging and risk metrics computation. We show that TPUs in the cloud help accelerate Monte Carlo routines compared to Graphics Processing Units (GPUs), which in turn decreases the cost associated with running such simulations while leveraging the flexibility of the cloud. In particular, we demonstrate that, in spite of the use of mixed precision, TPUs still provide accurate estimators which are fast to compute. We also show that the TensorFlow programming model for TPUs is elegant, expressive and simplifies automated differentiation. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.02818&r=all |
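The entry above asks whether TPUs are accurate and fast enough for financial Monte Carlo written with TensorFlow. As a point of reference only, here is a minimal sketch of a vectorised Monte Carlo pricer for a European call under geometric Brownian motion using standard TensorFlow ops, which run unchanged on CPU, GPU or TPU; the model, parameters and float32 precision are illustrative assumptions, not the authors' benchmark.

```python
import tensorflow as tf

def price_european_call_mc(s0=100.0, k=105.0, r=0.02, sigma=0.2, t=1.0,
                           n_paths=1_000_000, dtype=tf.float32, seed=42):
    """Monte Carlo price of a European call under geometric Brownian motion.

    The terminal price has a closed form, so no time stepping is needed;
    everything is expressed as vectorised tensor ops.
    """
    z = tf.random.stateless_normal([n_paths], seed=[seed, 0], dtype=dtype)
    s_t = s0 * tf.exp((r - 0.5 * sigma ** 2) * t + sigma * (t ** 0.5) * z)
    payoff = tf.maximum(s_t - k, 0.0)              # call payoff at maturity
    discounted = tf.exp(-r * t) * payoff
    price = tf.reduce_mean(discounted)
    stderr = tf.math.reduce_std(discounted) / tf.sqrt(tf.cast(n_paths, dtype))
    return price, stderr

if __name__ == "__main__":
    price, stderr = price_european_call_mc()
    print(f"MC price: {price.numpy():.4f} +/- {1.96 * stderr.numpy():.4f}")
```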
By: | A Itkin |
Abstract: | Recent progress in artificial intelligence, machine learning and the computer industry has resulted in an ongoing boom in applying these techniques to complex tasks in both science and industry. The same is, of course, true for the financial industry and mathematical finance. In this paper we consider a classical problem of mathematical finance, the calibration of option pricing models to market data, which has recently drawn attention from the financial community in the context of deep learning and artificial neural networks. We highlight some pitfalls in the existing approaches and propose resolutions that improve both the performance and the accuracy of calibration. We also address the problem of no-arbitrage pricing when using a trained neural net, which is currently ignored in the literature. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03507&r=all |
By: | Donovan Platt |
Abstract: | Recent advances in computing power and the potential to make more realistic assumptions due to increased flexibility have led to the increased prevalence of simulation models in economics. While models of this class, and particularly agent-based models, are able to replicate a number of empirically-observed stylised facts not easily recovered by more traditional alternatives, such models remain notoriously difficult to estimate due to their lack of tractable likelihood functions. While the estimation literature continues to grow, existing attempts have approached the problem primarily from a frequentist perspective, with the Bayesian estimation literature remaining comparatively less developed. For this reason, we introduce a Bayesian estimation protocol that makes use of deep neural networks to construct an approximation to the likelihood, which we then benchmark against a prominent alternative from the existing literature. Overall, we find that our proposed methodology consistently results in more accurate estimates in a variety of settings, including the estimation of financial heterogeneous agent models and the identification of changes in dynamics occurring in models incorporating structural breaks. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.04522&r=all |
By: | Hyungjun Park; Min Kyu Sim; Dong Gu Choi |
Abstract: | A goal of financial portfolio trading is maximizing the trader's utility by allocating capital to assets in a portfolio over the investment horizon. Our study suggests an approach for deriving an intelligent portfolio trading strategy using deep Q-learning. In this approach, we introduce a Markov decision process model to enable an agent to learn about the financial environment and develop a deep neural network structure to approximate a Q-function. In addition, we devise three techniques to derive a trading strategy that chooses reasonable actions and is applicable to the real world. First, the action space of the learning agent is modeled as an intuitive set of trading directions that can be carried out for individual assets in the portfolio. Second, we introduce a mapping function that can replace an infeasible agent action in each state with a similar and valuable action to derive a reasonable trading strategy. Last, we introduce a method by which an agent simulates all feasible actions and learns from these experiences to use the training data efficiently. To validate our approach, we conduct backtests for two representative portfolios, and we find that the intelligent strategy derived using our approach is superior to the benchmark strategies. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.03665&r=all |
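The second technique in the entry above replaces an infeasible action with a similar, feasible one. The snippet below is a hypothetical illustration of such a mapping for discrete per-asset trading directions; the feasibility rules (no short selling, sufficient cash) and the function name are assumptions for illustration, not the authors' specification.

```python
import numpy as np

# Per-asset trading directions: -1 = sell one unit, 0 = hold, +1 = buy one unit.
def map_to_feasible(action, holdings, cash, prices):
    """Replace infeasible components of a discrete trading action with the
    closest feasible alternative (hold), under simple illustrative constraints:
    no short selling and no buying beyond available cash."""
    feasible = np.array(action, dtype=int).copy()
    for i, a in enumerate(feasible):
        if a < 0 and holdings[i] <= 0:       # cannot sell an asset that is not held
            feasible[i] = 0
        elif a > 0 and cash < prices[i]:     # cannot afford the purchase
            feasible[i] = 0
        if feasible[i] > 0:                  # commit cash as buys are accepted
            cash -= prices[i]
    return feasible

# Example: try to sell asset 0 (not held) and buy assets 1 and 2 with limited cash.
print(map_to_feasible([-1, 1, 1], holdings=[0, 5, 2], cash=120.0, prices=[50.0, 100.0, 90.0]))
# -> [0 1 0]: the sell becomes a hold, and only the first affordable buy goes through.
```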
By: | Jeremy D. Turiel; Tomaso Aste |
Abstract: | Logistic Regression and Support Vector Machine algorithms, together with Linear and Non-Linear Deep Neural Networks, are applied to lending data in order to replicate lender acceptance of loans and predict the likelihood of default of issued loans. A two-phase model is proposed; the first phase predicts loan rejection, while the second predicts default risk for approved loans. Logistic Regression was found to be the best performer for the first phase, with a test set recall macro score of $77.4 \%$. Deep Neural Networks were applied to the second phase only, where they achieved the best performance, with a validation set recall score of $72 \%$ for defaults. This shows that AI can improve current credit risk models, reducing the default risk of issued loans by as much as $70 \%$. The models were also applied to loans taken out by small businesses alone. The first phase of the model performs significantly better when trained on the whole dataset. By contrast, the second phase performs significantly better when trained on the small business subset. This suggests a potential discrepancy between how these loans are screened and how they should be analysed in terms of default prediction. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.01800&r=all |
By: | Michael A. Kouritzin; Anne MacKay |
Abstract: | The use of sequential Monte Carlo within simulation for path-dependent option pricing is proposed and evaluated. Recently, it was shown that explicit solutions and importance sampling are valuable for efficient simulation of spot price and volatility, especially for purposes of path-dependent option pricing. The resulting simulation algorithm is an analog to the weighted particle filtering algorithm that might be improved by resampling or branching. Indeed, some branching algorithms are shown herein to improve pricing performance substantially while some resampling algorithms are shown to be less suitable in certain cases. A historical property is given and explained as the distinguishing feature between the sequential Monte Carlo algorithms that work on path-dependent option pricing and those that do not. In particular, it is recommended to use the so-called effective particle branching algorithm within importance-sampling Monte Carlo methods for path-dependent option pricing. All recommendations are based upon numeric comparison of option pricing problems in the Heston model. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.00219&r=all |
By: | Yuxuan Huang; Luiz Fernando Capretz; Danny Ho |
Abstract: | Application of neural network architectures for financial prediction has been actively studied in recent years. This paper presents a comparative study that investigates and compares feed-forward neural network (FNN) and adaptive neural fuzzy inference system (ANFIS) on stock prediction using fundamental financial ratios. The study is designed to evaluate the performance of each architecture based on the relative return of the selected portfolios with respect to the benchmark stock index. The results show that both architectures possess the ability to separate winners and losers from a sample universe of stocks, and the selected portfolios outperform the benchmark. Our study argues that FNN shows superior performance over ANFIS. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.05327&r=all |
By: | Laliotis, Dimitrios; Buesa, Alejandro; Leber, Miha; Población García, Francisco Javier |
Abstract: | We assess the effects of regulatory caps in the loan-to-value (LTV) ratio using agent-based models (ABMs). Our approach builds upon a straightforward ABM where we model the interactions of sellers, buyers and banks within a computational framework that enables the application of LTV caps. The results are first presented using simulated data and then we calibrate the probability distributions based on actual European data from the HFCS survey. The results suggest that this approach can be viewed as a useful alternative to the existing analytical frameworks for assessing the impact of macroprudential measures, mainly due to the very few assumptions the method relies upon and the ability to easily incorporate additional and more complex features related to the behavioral response of borrowers to such measures. JEL Classification: D14, D31, E50, R21 |
Keywords: | borrower-based measures, HFCS survey, house prices, macroprudential policy |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:ecb:ecbwps:20192294&r=all |
By: | Soybilgen, Baris |
Abstract: | We use dynamic factors and neural network models to identify current and past states (rather than future ones) of the US business cycle. In the first step, we reduce noise in the data by using a moving average filter. Then, dynamic factors are extracted from a large-scale data set consisting of more than 100 variables. In the last step, these dynamic factors are fed into the neural network model for predicting business cycle regimes. We show that our proposed method follows US business cycle regimes quite accurately in sample and out of sample without taking historical data availability into account. Our results also indicate that noise reduction is an important step for business cycle prediction. Furthermore, using pseudo real-time and vintage data, we show that our neural network model identifies turning points quite accurately and very quickly in real time. |
Keywords: | Dynamic Factor Model; Neural Network; Recession; Business Cycle |
JEL: | C38 E32 E37 |
Date: | 2018–07–05 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:94715&r=all |
By: | Maarten Buis (Universität Konstanz) |
Abstract: | An Agent Based Model (ABM) is a simulation in which agents that each follow simple rules interact with one another and thus produce an often surprising outcome at the macro level. The purpose of an ABM is to explore mechanisms through which actions of the individual agents add up to a macro outcome, by varying the rules that agents have to follow or varying with whom each agent can interact (for example, varying the network). A simple example of an ABM is Schelling's segregation model, in which he showed that one does not need racists to produce segregated neighborhoods. The model starts with 25 red and 25 blue agents, each of which lives in a cell of a chessboard. They can have up to 8 neighbors. In order for an agent to be happy, they need a minimum share, e.g. 30%, of the agents in their neighborhood to be of the same color. If an agent is unhappy, they will move to another empty cell that makes them happy. If we repeat this until everybody is happy or nobody can move, we will often end up with segregated neighborhoods. Implementing a new ABM will always require programming, but many of the tasks are similar across ABMs. For example, in many ABMs the agents live on a square grid (like a chessboard) and can only interact with their neighbors. I have created a set of Mata functions that perform those tasks, and users can also plug in their own ABM-specific code. In this presentation, I will illustrate how to build an ABM in Mata with these functions. |
Date: | 2019–07–10 |
URL: | http://d.repec.org/n?u=RePEc:boc:dsug19:04&r=all |
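The abstract above walks through Schelling's segregation model; the short Python sketch below reproduces those dynamics as described (25 red and 25 blue agents on an 8x8 board, a 30% same-colour threshold, unhappy agents moving to an empty cell that makes them happy). It is an illustration of the model, not the presenter's Mata implementation.

```python
import random

SIZE, THRESHOLD = 8, 0.30

def neighbours(grid, r, c):
    """Colours of the up-to-8 occupied neighbouring cells (no wrap-around)."""
    out = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < SIZE and 0 <= cc < SIZE and grid[rr][cc] is not None:
                out.append(grid[rr][cc])
    return out

def happy(grid, r, c):
    nb = neighbours(grid, r, c)
    return (not nb) or sum(x == grid[r][c] for x in nb) / len(nb) >= THRESHOLD

def step(grid):
    """Move one unhappy agent to a random empty cell that makes it happy; return True if a move happened."""
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    random.shuffle(cells)
    for r, c in cells:
        if grid[r][c] is None or happy(grid, r, c):
            continue
        empties = [(er, ec) for er, ec in cells if grid[er][ec] is None]
        random.shuffle(empties)
        for er, ec in empties:
            grid[er][ec], grid[r][c] = grid[r][c], None     # tentatively move
            if happy(grid, er, ec):
                return True
            grid[r][c], grid[er][ec] = grid[er][ec], None   # undo and try the next empty cell
    return False

# 25 red ('R'), 25 blue ('B') and 14 empty cells on an 8x8 board.
random.seed(0)
agents = ['R'] * 25 + ['B'] * 25 + [None] * (SIZE * SIZE - 50)
random.shuffle(agents)
grid = [agents[i * SIZE:(i + 1) * SIZE] for i in range(SIZE)]
for _ in range(10_000):          # cap iterations as a safety net
    if not step(grid):           # stop when everybody is happy or nobody can move
        break
print('\n'.join(' '.join(cell or '.' for cell in row) for row in grid))
```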
By: | Lotfi Boudabsa; Damir Filipovic |
Abstract: | We introduce a computational framework for dynamic portfolio valuation and risk management building on machine learning with kernels. We learn the replicating martingale of a portfolio from a finite sample of its terminal cumulative cash flow. The learned replicating martingale is given in closed form thanks to a suitable choice of the kernel. We develop an asymptotic theory and prove convergence and a central limit theorem. We also derive finite sample error bounds and concentration inequalities. Numerical examples show good results for a relatively small training sample size. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03726&r=all |
By: | Damien Ackerer; Natasa Tagasovska; Thibault Vatter |
Abstract: | We present an artificial neural network (ANN) approach to value financial derivatives. Atypically for standard ANN applications, practitioners use option pricing models both to validate market prices and to infer unobserved prices. Importantly, models need to generate realistic arbitrage-free prices, meaning that no option portfolio can lead to risk-free profits. The absence of arbitrage opportunities is guaranteed by penalizing the loss using soft constraints on an extended grid of input values. ANNs can be pre-trained by first calibrating a standard option pricing model, and then training the ANN on a larger synthetic dataset generated from the calibrated model. The parameter transfer as well as the no-arbitrage constraints appear to be particularly useful when only sparse or erroneous data are available. We also explore how deeper ANNs improve over shallower ones, as well as other properties of the network architecture. We benchmark our method against standard option pricing models, such as Heston with and without jumps. We validate our method on both training and testing sets, highlighting its capacity to reproduce observed prices and to predict new ones. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.05065&r=all |
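The entry above enforces absence of arbitrage by penalising the loss with soft constraints on an extended grid of inputs. One plausible way to write such a penalty is sketched below: finite-difference checks that predicted call prices are non-increasing and convex in strike on an auxiliary grid, added to the pricing loss with a weight. The network, grid and weight are illustrative assumptions, not the authors' architecture.

```python
import tensorflow as tf

def no_arbitrage_penalty(model, strikes, other_inputs):
    """Soft penalty on an auxiliary strike grid: call prices should be
    non-increasing and convex in strike (finite-difference approximation)."""
    k = tf.reshape(strikes, (-1, 1))
    x = tf.concat([k, tf.repeat(other_inputs, tf.shape(k)[0], axis=0)], axis=1)
    c = tf.reshape(model(x), (-1,))                 # predicted call prices on the grid
    slope = c[1:] - c[:-1]                          # should be <= 0 (monotonicity)
    curvature = c[2:] - 2.0 * c[1:-1] + c[:-2]      # should be >= 0 (convexity)
    return tf.reduce_mean(tf.nn.relu(slope)) + tf.reduce_mean(tf.nn.relu(-curvature))

# Toy pricing network: inputs are (strike, maturity), output is a call price.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="elu", input_shape=(2,)),
    tf.keras.layers.Dense(32, activation="elu"),
    tf.keras.layers.Dense(1, activation="softplus"),
])

strike_grid = tf.linspace(50.0, 150.0, 101)         # extended grid of strikes
maturity = tf.constant([[1.0]])                      # fixed maturity used for the grid
penalty_weight = 10.0

def total_loss(y_true, y_pred):
    """Pricing error plus the soft no-arbitrage penalty evaluated on the grid."""
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    return mse + penalty_weight * no_arbitrage_penalty(model, strike_grid, maturity)

model.compile(optimizer="adam", loss=total_loss)
```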
By: | Michael Allan Ribers; Hannes Ullrich |
Abstract: | Antibiotic resistance constitutes a major health threat. Predicting bacterial causes of infections is key to reducing antibiotic misuse, a leading driver of antibiotic resistance. We train a machine learning algorithm on administrative and microbiological laboratory data from Denmark to predict diagnostic test outcomes for urinary tract infections. Based on predictions, we develop policies to improve prescribing in primary care, highlighting the relevance of physician expertise and policy implementation when patient distributions vary over time. The proposed policies delay antibiotic prescriptions for some patients until test results are known and give them instantly to others. We find that machine learning can reduce antibiotic use by 7.42 percent without reducing the number of treated bacterial infections. As Denmark is one of the most conservative countries in terms of antibiotic use, this result is likely to be a lower bound of what can be achieved elsewhere. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03044&r=all |
By: | Hung Ba |
Abstract: | In this study, we employ Generative Adversarial Networks (GANs) as an oversampling method to generate artificial data to assist with the classification of credit card fraudulent transactions. A GAN is a generative model based on ideas from game theory, in which a generator G and a discriminator D try to outsmart each other. The objective of the generator is to confuse the discriminator; the objective of the discriminator is to distinguish the instances coming from the generator from the instances coming from the original dataset. By training GANs on a set of credit card fraudulent transactions, we are able to improve the discriminatory power of classifiers. The experimental results show that the Wasserstein GAN is more stable in training and produces more realistic fraudulent transactions than the other GANs. On the other hand, the conditional version of GANs, in which labels are set by k-means clustering, does not necessarily improve on the unconditional versions. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.03355&r=all |
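The entry above oversamples fraudulent transactions with GANs and finds the Wasserstein GAN most stable. The sketch below shows the core of a weight-clipping WGAN training step in TensorFlow for tabular transaction data; the network sizes, feature count and hyperparameters are assumptions for illustration, not the study's configuration.

```python
import tensorflow as tf

LATENT_DIM, N_FEATURES = 16, 30   # feature count is an assumption, not the paper's dataset

generator = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(LATENT_DIM,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_FEATURES),
])
critic = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(N_FEATURES,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),     # Wasserstein critic: unbounded score, no sigmoid
])
g_opt = tf.keras.optimizers.RMSprop(learning_rate=5e-5)
c_opt = tf.keras.optimizers.RMSprop(learning_rate=5e-5)

def train_step(real_fraud, n_critic=5, clip=0.01):
    """One WGAN step (weight-clipping variant) on a batch of real fraudulent transactions."""
    batch = tf.shape(real_fraud)[0]
    for _ in range(n_critic):
        z = tf.random.normal([batch, LATENT_DIM])
        with tf.GradientTape() as tape:
            fake = generator(z, training=True)
            c_loss = tf.reduce_mean(critic(fake)) - tf.reduce_mean(critic(real_fraud))
        grads = tape.gradient(c_loss, critic.trainable_variables)
        c_opt.apply_gradients(zip(grads, critic.trainable_variables))
        for w in critic.trainable_variables:          # crude Lipschitz constraint
            w.assign(tf.clip_by_value(w, -clip, clip))
    z = tf.random.normal([batch, LATENT_DIM])
    with tf.GradientTape() as tape:
        g_loss = -tf.reduce_mean(critic(generator(z, training=True)))
    grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(grads, generator.trainable_variables))
    return c_loss, g_loss

# After training on the minority class, generator(tf.random.normal([k, LATENT_DIM]))
# yields k synthetic fraudulent transactions to append to the training set.
```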
By: | Brandon Da Silva; Sylvie Shang Shi |
Abstract: | Training deep learning models that generalize well to live deployment is a challenging problem in the financial markets. The challenge arises because of high dimensionality, limited observations, changing data distributions, and a low signal-to-noise ratio. High dimensionality can be dealt with using robust feature selection or dimensionality reduction, but limited observations often result in a model that overfits due to the large parameter space of most deep neural networks. We propose a generative model for financial time series, which allows us to train deep learning models on millions of simulated paths. We show that our generative model is able to create realistic paths that embed the underlying structure of the markets in a way stochastic processes cannot. |
Date: | 2019–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03232&r=all |
By: | Bertin Martens (European Commission – JRC - IPTS); Songül Tolan (European Commission – JRC) |
Abstract: | There is a long-standing economic research literature on the impact of technological innovation and automation in general on employment and economic growth. Traditional economic models trade off a negative displacement or substitution effect against a positive complementarity effect on employment. Economic history since the industrial revolution has strongly supported the view that the net effect on employment and incomes is positive, though recent evidence points to a declining labour share in total income. There are concerns that with artificial intelligence (AI) "this time may be different". The state-of-the-art task-based model creates an environment where humans and machines compete for the completion of tasks. It emphasizes the labour substitution effects of automation. This has been tested on robot data, with mixed results. However, the economic characteristics of rival robots are not comparable with non-rival and scalable AI algorithms, which may constitute a general purpose technology and may accelerate the pace of innovation itself. These characteristics give a hint that this time might indeed be different. However, there is as yet very little empirical evidence that relates AI or Machine Learning (ML) to employment and incomes. General growth models can only present a wide range of highly diverging and hypothetical scenarios, from growth implosion to an optimistic future with growth acceleration. Even extreme scenarios of displacement of men by machines offer hope for an overall wealthier economic future. The literature is clearer on the negative implications that automation may have for income equality. Redistributive policies to counteract this trend will have to incorporate behavioural responses to such policies. We conclude that there are some elements suggesting that the nature of AI/ML is different from previous technological change, but there is no empirical evidence yet to underpin this view. |
Keywords: | labour markets, employment, technological change, task-based model, artificial intelligence, income distribution |
JEL: | J62 O33 |
Date: | 2018–08 |
URL: | http://d.repec.org/n?u=RePEc:ipt:decwpa:2018-08&r=all |
By: | Xinyi Li; Yinchuan Li; Yuancheng Zhan; Xiao-Yang Liu |
Abstract: | Portfolio allocation is crucial for investment companies. However, getting the best strategy in a complex and dynamic stock market is challenging. In this paper, we propose a novel Adaptive Deep Deterministic Reinforcement Learning scheme (Adaptive DDPG) for the portfolio allocation task, which incorporates optimistic or pessimistic deep reinforcement learning that is reflected in the influence from prediction errors. Dow Jones 30 component stocks are selected as our trading stocks and their daily prices are used as the training and testing data. We train the Adaptive DDPG agent and obtain a trading strategy. The Adaptive DDPG's performance is compared with the vanilla DDPG, Dow Jones Industrial Average index and the traditional min-variance and mean-variance portfolio allocation strategies. Adaptive DDPG outperforms the baselines in terms of the investment return and the Sharpe ratio. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.01503&r=all |
By: | Emir Hrnjic; Nikodem Tomczak |
Abstract: | Behavioral economics changed the way we think about market participants and revolutionized policy-making by introducing the concept of choice architecture. However, even though effective on the level of a population, interventions from behavioral economics, nudges, are often characterized by weak generalisation as they struggle on the level of individuals. Recent developments in data science, artificial intelligence (AI) and machine learning (ML) have shown ability to alleviate some of the problems of weak generalisation by providing tools and methods that result in models with stronger predictive power. This paper aims to describe how ML and AI can work with behavioral economics to support and augment decision-making and inform policy decisions by designing personalized interventions, assuming that enough personalized traits and psychological variables can be sampled. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.02100&r=all |
By: | Berk, Istemi (Dokuz Eylul University); Çam , Eren (Energiewirtschaftliches Institut an der Universitaet zu Koeln (EWI)) |
Abstract: | The global crude oil market has gone through two important phases over recent years. The first was the price collapse that started in the third quarter of 2014 and continued until mid-2016. The second occurred in late 2016, after major producers within and outside OPEC agreed to cut production in order to counter the ongoing fall in oil prices, in what is now known as the OPEC+ agreement. This paper analyzes the effects of these recent developments on the market structure and on the behavior of major producers in the market. To this end, we develop a partial equilibrium model with a spatial structure for the global crude oil market and simulate the market for the period between 2013 and 2017 under oligopolistic, cartel and perfectly competitive market structure setups. The simulation results reveal that, although the oligopolistic market structures fit the realized market outcomes well overall, they are not successful at explaining the low prices during 2015 and 2016, which are instead closer to estimated competitive levels. Moreover, we suggest that from 2014 onward, the market power potential of major suppliers has shrunk considerably, supporting the view that the market has become more competitive. We also analyze the Saudi Arabia- and Russia-led OPEC+ agreement, and find that the planned production cuts in 2017, particularly those of Saudi Arabia (486 thousand barrels/day) and Russia (300 thousand barrels/day), were below the levels implied by the estimated non-competitive market structure setups. This explains why oil prices did not recover to pre-2014 levels although a temporary adjustment was observed in 2017. |
Keywords: | Crude Oil Market Structure; 2014 Oil Price Decline; OPEC+ Agreement; Market Simulation Model; DROPS |
JEL: | C63 D43 Q31 Q41 |
Date: | 2019–07–15 |
URL: | http://d.repec.org/n?u=RePEc:ris:ewikln:2019_005&r=all |
By: | Rukmal Weerawarana; Yiyi Zhu; Yuzhen He |
Abstract: | Market sectors play a key role in the efficient flow of capital through the modern global economy. We analyze existing sectorization heuristics and observe that the most popular, the GICS (which informs the S&P 500) and the NAICS (published by the U.S. Government), are not entirely quantitatively driven, but rather appear to be highly subjective and rooted in dogma. Building on inferences from analysis of the capital structure irrelevance principle and the Modigliani-Miller theoretic universe conditions, we postulate that corporation fundamentals, particularly those components specific to the Modigliani-Miller universe conditions, would be optimal descriptors of the true economic domain of operation of a company. We generate a set of candidate learned sector universes by varying the linkage method of a hierarchical clustering algorithm and the number of resulting sectors derived from the model (ranging from 5 to 19), resulting in a total of 60 candidate learned sector universes. We then introduce reIndexer, a backtest-driven sector universe evaluation research tool, to rank the candidate sector universes produced by our learned sector classification heuristic. This ranking was used to identify the risk-adjusted-return-optimal learned sector universe as the universe generated under CLINK (i.e. complete linkage) with 17 sectors. The optimal learned sector universe was tested against the benchmark GICS classification universe with reIndexer, outperforming on both absolute portfolio value and risk-adjusted return over the backtest period. We conclude that our fundamentals-driven Learned Sector classification heuristic provides a superior risk-diversification profile compared to the status quo classification heuristic. |
Date: | 2019–05 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03935&r=all |
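The entry above derives learned sectors by hierarchical clustering of company fundamentals, with complete linkage (CLINK) and 17 sectors identified as optimal. The sketch below illustrates just that clustering step with scipy; the fundamental features and the data are placeholders for illustration, not the authors' dataset.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.preprocessing import StandardScaler

# Placeholder fundamentals per company (the actual feature set in the paper differs).
rng = np.random.default_rng(0)
fundamentals = pd.DataFrame(
    rng.normal(size=(500, 4)),
    columns=["debt_to_equity", "ebit_margin", "asset_turnover", "effective_tax_rate"],
)

X = StandardScaler().fit_transform(fundamentals)           # put features on a common scale
Z = linkage(X, method="complete")                           # CLINK: complete-linkage hierarchy
learned_sector = fcluster(Z, t=17, criterion="maxclust")    # cut the dendrogram into 17 sectors

fundamentals["sector"] = learned_sector
print(fundamentals["sector"].value_counts().sort_index())
```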
By: | Crowley, Patrick M.; Hudgins, David |
Abstract: | It is widely recognized that the policy objectives of fiscal and monetary policymakers usually have different time horizons, and this feature may not be captured by traditional econometric techniques. In this paper, we first decompose U.S. macroeconomic data using a time-frequency domain technique, namely discrete wavelet analysis. We then model the behavior of the U.S. economy over each wavelet frequency range and use our estimated parameters to construct a tracking model. To illustrate the usefulness of this approach, we simulate jointly optimal fiscal and monetary policy with different short-term targets: an inflation target, a money growth target, an interest rate target, and a real exchange rate target. The results determine the reaction in fiscal and monetary policy that is required to achieve an inflation target in a low inflation environment, and when both fiscal and monetary policy are concerned with meeting certain economic growth objectives. The combination of wavelet decomposition with an optimal control framework can also provide a new approach to macroeconomic forecasting. |
JEL: | C61 C63 C88 E52 E61 F47 |
Date: | 2019–07–10 |
URL: | http://d.repec.org/n?u=RePEc:bof:bofrdp:2019_011&r=all |
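The entry above decomposes U.S. macroeconomic data into frequency ranges with discrete wavelet analysis before modelling each range separately. A minimal sketch of such a multi-resolution decomposition with PyWavelets follows; the series, wavelet family and number of levels are assumptions for illustration.

```python
import numpy as np
import pywt

# Placeholder quarterly series (e.g. output growth); replace with actual macro data.
rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(scale=0.5, size=256)) + 0.05 * np.arange(256)

# Multi-resolution analysis: each level captures fluctuations over a different horizon.
coeffs = pywt.wavedec(y, wavelet="db4", level=4)            # [cA4, cD4, cD3, cD2, cD1]

# Reconstruct the component associated with each frequency band separately.
components = []
for i in range(len(coeffs)):
    keep = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
    components.append(pywt.waverec(keep, wavelet="db4")[: len(y)])

# By linearity, the band-limited components sum back (approximately) to the original series.
print(np.allclose(np.sum(components, axis=0), y, atol=1e-8))
```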
By: | Rémy Le Boennec (Institut VEDECOM); Fouad Hadj Selem (Institut VEDECOM); Ghazaleh Khodabandelou (Institut VEDECOM) |
Keywords: | Artificial intelligence, Mobility flow inference, Home-to-work commuting, Modal shift |
Date: | 2019–06–11 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:hal-02160862&r=all |
By: | Karen Turner; Gioele Figus; Kim Swales; L. (Lisa B.) Ryan; et al. |
Abstract: | Technological change is necessary for economies to grow and develop. This paper investigates how this technological change could be directed in order to simultaneously reduce carbon-intensive energy use and deliver a range of economic benefits. Using both partial and general equilibrium modelling, we consider improvements in the efficiency of electricity delivery, electricity being an increasingly low-carbon option in the UK. We demonstrate how linking this to policy action that assists and encourages households to substitute away from more carbon-intensive gas-powered heating systems towards electricity-powered ones may change the composition of energy use, and its implied emissions intensity, but not the level of the resulting economic expansion. |
Keywords: | Technological change; CGE models; Multiple benefits; Rebound |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:ucn:oapubs:10197/10840&r=all |
By: | Xin Qian; Yudong Chen; Andreea Minca |
Abstract: | For the degree corrected stochastic block model in the presence of arbitrary or even adversarial outliers, we develop a convex-optimization-based clustering algorithm that includes a penalization term depending on the positive deviation of a node from the expected number of edges to other inliers. We prove that under mild conditions, this method achieves exact recovery of the underlying clusters. Our synthetic experiments show that our algorithm performs well on heterogeneous networks, and in particular those with Pareto degree distributions, for which outliers have a broad range of possible degrees that may enhance their adversarial power. We also demonstrate that our method allows for recovery with significantly lower error rates compared to existing algorithms. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03305&r=all |
By: | Clement Gastaud; Theophile Carniel; Jean-Michel Dalle |
Abstract: | We address the issue of the factors driving startup success in raising funds. Using the popular and public startup database Crunchbase, we explicitly take into account two extrinsic characteristics of startups: the competition that the companies face, using similarity measures derived from the Word2Vec algorithm, as well as the position of investors in the investment network, pioneering the use of Graph Neural Networks (GNN), a recent deep learning technique that enables the handling of graphs as such and as a whole. We show that the different stages of fundraising, early- and growth-stage, are associated with different success factors. Our results suggest a marked relevance of startup competition for the early stage, while growth-stage fundraising is influenced by network features. Both of these factors tend to average out in global models, which could lead to the false impression that a startup's success in fundraising is mostly, if not only, influenced by its intrinsic characteristics, notably those of its founders. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.03210&r=all |
By: | Tianyao Chen; Xue Cheng; Jingping Yang |
Abstract: | In this paper, we develop a theory of common decomposition for two correlated Brownian motions, in which, by using a change-of-time method, the correlated Brownian motions are represented by a triple of processes, $(X,Y,T)$, where $X$ and $Y$ are independent Brownian motions. We show equivalent conditions for the triple being independent. We discuss the connection and difference between the common decomposition and the local correlation model. Guided by this discussion, we propose a new method for constructing correlated Brownian motions which performs very well in simulation. For applications, we use these very general results for the pricing of two-factor financial derivatives whose payoffs depend heavily on the correlation of the underlyings. In addition, with the help of numerical methods, we also discuss the pricing deviation when a constant correlation model is substituted for a general one. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.03295&r=all |
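The entry above generalises the usual way of building correlated Brownian motions. For context, the standard constant-correlation construction, which the common decomposition $(X,Y,T)$ extends, combines two independent Brownian motions $W$ and $B$ as

$$ W^1_t = W_t, \qquad W^2_t = \rho\, W_t + \sqrt{1-\rho^{2}}\, B_t, \qquad \operatorname{Cov}\!\left(W^1_t, W^2_t\right) = \rho\, t . $$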
By: | Chahboun, Imad (Federal Reserve Bank of Boston); Hoover, Nathaniel (Federal Reserve Bank of Boston) |
Abstract: | This paper presents a quantitative model designed to understand the sensitivity of variable annuity (VA) contracts to market and actuarial assumptions and how these sensitivities make them a potentially important source of risk to insurance companies during times of stress. VA contracts often include long dated guarantees of market performance that expose the insurer to multiple nondiversifiable risks. Our modeling framework employs a Monte Carlo simulation of asset returns and policyholder behavior to derive fair prices for variable annuities in a risk neutral framework and to estimate sensitivities of reserve requirements under a real‐world probability measure. Simulated economic scenarios are applied to four hypothetical insurance company VA portfolios to assess the sensitivity of portfolio pricing and reserve levels to portfolio characteristics, modelling choices, and underlying economic assumptions. Additionally, a deterministic stress scenario, modeled on Japan beginning in the mid‐90s, is used to estimate the potential impact of a severe, but plausible, economic environment on the four hypothetical portfolios. The main findings of this exercise are: (1) interactions between market risk modeling assumptions and policyholder behavior modeling assumptions can significantly impact the estimated costs of providing guarantees, (2) estimated VA prices and reserve requirements are sensitive to market price discontinuities and multiple shocks to asset prices, (3) VA prices are very sensitive to assumptions related to interest rates, asset returns, and policyholder behavior, and (4) a drawn‐out period of low interest rates and asset underperformance, even if not accompanied by dramatic equity losses, is likely to result in significant losses in VA portfolios. |
Keywords: | insurance risk; market risk; variable annuities; derivative pricing; policyholder behavior |
JEL: | C15 G12 G17 G22 G23 |
Date: | 2019–04–09 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedbqu:rpa19-1&r=all |
By: | Jose Luis Montiel Olea; Pietro Ortoleva; Mallesh M Pai; Andrea Prat |
Abstract: | Different agents compete to predict a variable of interest related to a set of covariates via an unknown data generating process. All agents are Bayesian, but may consider different subsets of covariates to make their prediction. After observing a common dataset, who has the highest confidence in her predictive ability? We characterize it and show that it crucially depends on the size of the dataset. With small data, typically it is an agent using a model that is `small-dimensional,' in the sense of considering fewer covariates than the true data generating process. With big data, it is instead typically `large-dimensional,' possibly using more variables than the true model. These features are reminiscent of model selection techniques used in statistics and machine learning. However, here model selection does not emerge normatively, but positively as the outcome of competition between standard Bayesian decision makers. The theory is applied to auctions of assets where bidders observe the same information but hold different priors. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.03809&r=all |
By: | Catherine D'Hondt; Rudy De Winne; Eric Ghysels; Steve Raymond |
Abstract: | Artificial intelligence, or AI, enhancements are increasingly shaping our daily lives. Financial decision-making is no exception to this. We introduce the notion of AI Alter Egos, which are shadow robo-investors, and use a unique data set covering brokerage accounts for a large cross-section of investors over a sample from January 2003 to March 2012, which includes the 2008 financial crisis, to assess the benefits of robo-investing. We have detailed investor characteristics and records of all trades. Our data set consists of investors typically targeted for robo-advising. We explore robo-investing strategies commonly used in the industry, including some involving advanced machine learning methods. The man versus machine comparison allows us to shed light on potential benefits the emerging robo-advising industry may provide to certain segments of the population, such as low income and/or high risk averse investors. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.03370&r=all |
By: | Zihao Zhang; Stefan Zohren; Stephen Roberts |
Abstract: | We showcase how Quantile Regression (QR) can be applied to forecast financial returns using Limit Order Books (LOBs), the canonical data source of high-frequency financial time-series. We develop a deep learning architecture that simultaneously models the return quantiles for both buy and sell positions. We test our model over millions of LOB updates across multiple different instruments on the London Stock Exchange. Our results suggest that the proposed network not only delivers excellent performance but also provides improved prediction robustness by combining quantile estimates. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.04404&r=all |
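The entry above models return quantiles from limit order book data. The central ingredient of any such model is the quantile ("pinball") loss; below is a minimal Keras-style sketch of that loss together with a toy head predicting three quantiles. The quantile levels, input dimension and layer sizes are assumptions for illustration, not the authors' architecture.

```python
import tensorflow as tf

QUANTILES = [0.1, 0.5, 0.9]   # illustrative quantile levels

def pinball_loss(y_true, y_pred):
    """Quantile (pinball) loss, averaged over the predicted quantile levels.
    y_pred has one column per quantile in QUANTILES."""
    y_true = tf.reshape(y_true, (-1, 1))                      # broadcast against all columns
    q = tf.constant(QUANTILES, dtype=y_pred.dtype)            # shape (3,)
    error = y_true - y_pred
    return tf.reduce_mean(tf.maximum(q * error, (q - 1.0) * error))

# Tiny regression head mapping LOB features to three return quantiles
# (input size 40 is an assumption, e.g. 10 price levels x 4 fields).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(40,)),
    tf.keras.layers.Dense(len(QUANTILES)),
])
model.compile(optimizer="adam", loss=pinball_loss)
```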
By: | Michael Lechner; Gabriel Okasa |
Abstract: | In econometrics, so-called ordered choice models are popular when interest lies in estimating the probabilities of particular values of categorical outcome variables with an inherent ordering, conditional on covariates. In this paper we develop a new machine learning estimator based on the random forest algorithm for such models, without imposing any distributional assumptions. The proposed Ordered Forest estimator provides a flexible estimation method for the conditional choice probabilities that can naturally deal with nonlinearities in the data, while taking the ordering information explicitly into account. Unlike common machine learning estimators, it also enables the estimation of marginal effects as well as inference on them, thus providing the same output as classical econometric estimators based on ordered logit or probit models. An extensive simulation study examines the finite sample properties of the Ordered Forest and reveals its good predictive performance, particularly in settings with multicollinearity among the predictors and nonlinear functional forms. An empirical application further illustrates the estimation of the marginal effects and their standard errors and demonstrates the advantages of the flexible estimation compared to a parametric benchmark model. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.02436&r=all |
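The entry above introduces the Ordered Forest for ordered choice probabilities. One common way to obtain such probabilities with off-the-shelf forests, which appears close in spirit to the cumulative approach described, is sketched below: fit one regression forest per cumulative indicator 1{Y <= k} and difference the predictions. This is an illustration under that reading, not the authors' estimator (which additionally delivers marginal effects and inference).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def ordered_choice_probs(X_train, y_train, X_test, n_estimators=500, seed=0):
    """Estimate P(Y = k | X) for an ordered outcome by fitting one regression
    forest per cumulative indicator 1{Y <= k} and differencing the predictions.
    (Illustrative sketch of the cumulative-indicator idea only.)"""
    categories = np.sort(np.unique(y_train))
    cumulative = []
    for k in categories[:-1]:                      # last cumulative probability is 1 by definition
        forest = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        forest.fit(X_train, (y_train <= k).astype(float))
        cumulative.append(forest.predict(X_test))
    cumulative.append(np.ones(len(X_test)))
    cumulative = np.clip(np.column_stack(cumulative), 0.0, 1.0)
    cumulative = np.maximum.accumulate(cumulative, axis=1)   # enforce monotonicity across categories
    probs = np.diff(cumulative, axis=1, prepend=0.0)
    return categories, probs                       # probs[:, j] estimates P(Y = categories[j] | X)
```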
By: | Bertin Martens (European Commission – JRC - IPTS) |
Abstract: | Digitization triggered a steep drop in the cost of information. The resulting data glut created a bottleneck because human cognitive capacity is unable to cope with large amounts of information. Artificial intelligence and machine learning (AI/ML) triggered a similar drop in the cost of machine-based decision-making and helps in overcoming this bottleneck. Substantial change in the relative price of resources puts pressure on ownership and access rights to these resources. This explains pressure on access rights to data. ML thrives on access to big and varied datasets. We discuss the implications of access regimes for the development of AI in its current form of ML. The economic characteristics of data (non-rivalry, economies of scale and scope) favour data aggregation in big datasets. Non-rivalry implies the need for exclusive rights in order to incentivise data production when it is costly. The balance between access and exclusion is at the centre of the debate on data regimes. We explore the economic implications of several modalities for access to data, ranging from exclusive monopolistic control to monopolistic competition and free access. Regulatory intervention may push the market beyond voluntary exchanges, either towards more openness or reduced access. This may generate private costs for firms and individuals. Society can choose to do so if the social benefits of this intervention outweigh the private costs. We briefly discuss the main EU legal instruments that are relevant for data access and ownership, including the General Data Protection Regulation (GDPR) that defines the rights of data subjects with respect to their personal data and the Database Directive (DBD) that grants ownership rights to database producers. These two instruments leave a wide legal no-man's land where data access is ruled by bilateral contracts and Technical Protection Measures that give exclusive control to de facto data holders, and by market forces that drive access, trade and pricing of data. The absence of exclusive rights might facilitate data sharing and access or it may result in a segmented data landscape where data aggregation for ML purposes is hard to achieve. It is unclear if incompletely specified ownership and access rights maximize the welfare of society and facilitate the development of AI/ML. |
Keywords: | digital data, ownership and access rights, trade in data, machine learning, artificial intelligence |
JEL: | L00 |
Date: | 2018–09 |
URL: | http://d.repec.org/n?u=RePEc:ipt:decwpa:2018-09&r=all |
By: | Fabrice Daniel |
Abstract: | This article studies financial time series data processing for machine learning. It introduces the most frequent scaling methods, then compares the resulting stationarity and preservation of useful information for trend forecasting. It proposes an empirical test based on the capability to learn a simple data relationship with simple models. It also discusses the data split method specific to time series, which avoids unwanted overfitting, and proposes various labelling schemes for classification and regression. |
Date: | 2019–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1907.03010&r=all |
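The entry above compares scaling methods and time-series-specific data splits that avoid look-ahead. The sketch below illustrates two common preprocessing choices of that kind, returns plus a rolling z-score fitted only on past data, and a purely chronological split; the window length and split fraction are assumptions, not the article's recommendations.

```python
import numpy as np
import pandas as pd

def to_returns(prices: pd.Series) -> pd.Series:
    """Simple returns: removes the trending level, which helps stationarity."""
    return prices.pct_change().dropna()

def rolling_zscore(x: pd.Series, window: int = 252) -> pd.Series:
    """Z-score each point using only the trailing window (no look-ahead)."""
    mean = x.rolling(window).mean().shift(1)   # shift so today's value is not in its own statistics
    std = x.rolling(window).std().shift(1)
    return ((x - mean) / std).dropna()

def chronological_split(x: pd.Series, train_frac: float = 0.8):
    """Split by time, never by shuffling, so the test set lies strictly in the future."""
    cut = int(len(x) * train_frac)
    return x.iloc[:cut], x.iloc[cut:]

# Example on synthetic prices.
idx = pd.date_range("2015-01-01", periods=1500, freq="B")
prices = pd.Series(100 * np.exp(np.cumsum(np.random.default_rng(2).normal(0, 0.01, 1500))), index=idx)
features = rolling_zscore(to_returns(prices))
train, test = chronological_split(features)
print(len(train), len(test))
```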
By: | Matias Barenstein |
Abstract: | In this paper I re-examine the COMPAS recidivism score and criminal history data collected by ProPublica in 2016, which has fueled intense debate and research in the nascent field of `algorithmic fairness' or `fair machine learning' over the past three years. ProPublica's COMPAS data is used in an ever-increasing number of studies to test various definitions and methodologies of algorithmic fairness. This paper takes a closer look at the actual datasets put together by ProPublica. In particular, I examine the distribution of defendants across COMPAS screening dates and find that ProPublica made an important data processing mistake when it created some of the key datasets most often used by other researchers, specifically the datasets built to study the likelihood of recidivism within two years of the original COMPAS screening date. As I show in this paper, ProPublica made a mistake implementing the two-year sample cutoff rule for recidivists in such datasets (whereas it implemented an appropriate two-year sample cutoff rule for non-recidivists). As a result, ProPublica incorrectly kept a disproportionate share of recidivists. This data processing mistake leads to biased two-year recidivism datasets, with artificially high recidivism rates. It also affects the positive and negative predictive values. On the other hand, the mistake does not impact some of the key statistical measures highlighted by ProPublica and other researchers, such as the false positive and false negative rates, or the overall accuracy. |
Date: | 2019–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:1906.04711&r=all |
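The entry above argues that ProPublica applied the two-year sample cutoff only to non-recidivists. A hedged pandas sketch of a symmetric cutoff is shown below: it keeps only defendants whose screening date allows a full two-year observation window, applied identically to recidivists and non-recidivists. The column names and the data-collection end date are assumptions for illustration, not ProPublica's actual schema.

```python
import pandas as pd

DATA_END = pd.Timestamp("2016-04-01")      # assumed end of the observation window
TWO_YEARS = pd.DateOffset(years=2)

def two_year_sample(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only defendants screened at least two years before the end of data
    collection, applied identically to recidivists and non-recidivists, and
    define recidivism within two years of the screening date."""
    df = df.copy()
    df["screening_date"] = pd.to_datetime(df["screening_date"])
    df["recid_date"] = pd.to_datetime(df["recid_date"])          # NaT if no recidivism event
    observable = df["screening_date"] + TWO_YEARS <= DATA_END    # same rule for everyone
    df = df[observable]
    df["two_year_recid"] = (
        df["recid_date"].notna()
        & (df["recid_date"] <= df["screening_date"] + TWO_YEARS)
    ).astype(int)
    return df
```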