New Economics Papers on Computational Economics
Issue of 2020‒09‒07
thirty-one papers chosen by
By: | Grzegorz Marcjasz; Jesus Lago; Rafał Weron |
Abstract: | Recent advancements in the fields of artificial intelligence and machine learning have resulted in a significant increase in the popularity of these methods in the literature, including electricity price forecasting. These methods cover a very broad spectrum, from decision trees and random forests to various artificial neural network models and hybrid approaches. In electricity price forecasting, neural networks are the most popular machine learning method, as they provide a non-linear counterpart to well-tested linear regression models. Their application, however, is not straightforward, with multiple implementation factors to consider. One such factor is the network's structure. This paper provides a comprehensive comparison of the two most common deep neural network structures: one that models each hour of the day separately, and one that reflects the daily auction structure and models the vector of prices (see the sketch after this entry). The results show a significant accuracy advantage of using the latter, confirmed on data from five distinct power exchanges. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.08006&r=all |
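A minimal sketch, using synthetic data and scikit-learn's MLPRegressor (both assumptions; the paper's datasets and network configuration are not reproduced here), of the two structures compared above: 24 separate single-output networks versus one network that outputs the full 24-hour price vector.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_days, n_features = 1000, 48        # synthetic stand-in: e.g. lagged prices and load forecasts
X = rng.normal(size=(n_days, n_features))
Y = X @ rng.normal(size=(n_features, 24)) + rng.normal(scale=0.5, size=(n_days, 24))
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, shuffle=False)

# Structure 1: one single-output network per delivery hour.
mae_hourly = []
for h in range(24):
    net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
    net.fit(X_tr, Y_tr[:, h])
    mae_hourly.append(np.mean(np.abs(net.predict(X_te) - Y_te[:, h])))

# Structure 2: one network that outputs the whole 24-dimensional daily price vector.
joint = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
joint.fit(X_tr, Y_tr)
mae_joint = np.mean(np.abs(joint.predict(X_te) - Y_te))

print(f"hour-by-hour MAE: {np.mean(mae_hourly):.3f}   vector-output MAE: {mae_joint:.3f}")
```

The vector-output network shares hidden representations across the 24 delivery hours; that shared structure is the design difference the paper evaluates.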
By: | Patryk Gierjatowicz; Marc Sabate-Vidales; David Šiška; Lukasz Szpruch; Žan Žurič |
Abstract: | Mathematical modelling is ubiquitous in the financial industry and drives key decision processes. Any given model provides only a crude approximation to reality, and the risk of using an inadequate model is hard to detect and quantify. By contrast, modern data science techniques are opening the door to more robust and data-driven model selection mechanisms. However, most machine learning models are "black boxes", as individual parameters do not have a meaningful interpretation. The aim of this paper is to combine the above approaches, achieving the best of both worlds. Combining neural networks with risk models based on classical stochastic differential equations (SDEs), we find robust bounds for prices of derivatives and the corresponding hedging strategies while incorporating relevant market data. The resulting model, called a neural SDE, is an instantiation of generative models and is closely linked with the theory of causal optimal transport. Neural SDEs allow consistent calibration under both the risk-neutral and the real-world measures. Thus the model can be used to simulate market scenarios needed for assessing risk profiles and hedging strategies. We develop and analyse novel algorithms needed for efficient use of neural SDEs. We validate our approach with numerical experiments using both local and stochastic volatility models (see the sketch after this entry). |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.04154&r=all |
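A minimal sketch, assuming PyTorch and illustrative network sizes, of the neural-SDE idea described above: drift and diffusion are parametrised by small neural networks and paths are simulated with an Euler-Maruyama scheme. The call price computed at the end is only a placeholder objective, not the paper's robust-bounds procedure.

```python
import torch
import torch.nn as nn

class NeuralSDE(nn.Module):
    """dX_t = mu(t, X_t) dt + sigma(t, X_t) dW_t with neural drift and diffusion."""
    def __init__(self, hidden=32):
        super().__init__()
        self.drift = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.diffusion = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(),
                                       nn.Linear(hidden, 1), nn.Softplus())

    def simulate(self, x0, n_steps=100, dt=1.0 / 100, n_paths=1000):
        x = torch.full((n_paths, 1), x0)
        for i in range(n_steps):                      # Euler-Maruyama scheme
            t = torch.full((n_paths, 1), i * dt)
            inp = torch.cat([t, x], dim=1)
            dw = torch.randn(n_paths, 1) * dt ** 0.5
            x = x + self.drift(inp) * dt + self.diffusion(inp) * dw
        return x

model = NeuralSDE()
terminal = model.simulate(x0=1.0)                     # x0 and the grid are illustrative placeholders
# A calibration loop would compare model-implied prices such as this call price
# against market quotes and update the network parameters accordingly.
call_price = torch.clamp(terminal - 1.0, min=0.0).mean()
print(float(call_price))
```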
By: | Eduardo Ramos-Pérez; Pablo J. Alonso-González; José Javier Núñez-Velázquez |
Abstract: | Currently, legal requirements demand that insurance companies increase their emphasis on monitoring the risks linked to underwriting and asset management activities. Regarding underwriting risks, the main uncertainties that insurers must manage relate to the sufficiency of premiums to cover future claims and the adequacy of current reserves to pay outstanding claims. Both risks are calibrated using stochastic models due to their nature. This paper introduces a reserving model based on a set of machine learning techniques such as Gradient Boosting, Random Forest and Artificial Neural Networks. These algorithms and other widely used reserving models are stacked to predict the shape of the runoff (see the sketch after this entry). To compute the deviation around this prediction, a log-normal approach is combined with the suggested model. The empirical results demonstrate that the proposed methodology can be used to improve the performance of traditional reserving techniques based on Bayesian statistics and the Chain Ladder, leading to a more accurate assessment of the reserving risk. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07564&r=all |
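A minimal sketch of the stacking idea described above, assuming scikit-learn and synthetic data: Gradient Boosting, Random Forest and a small neural network are combined through a meta-learner. The paper's runoff-triangle features and the log-normal deviation step are not reproduced.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))        # synthetic stand-in: e.g. accident year, development lag, ...
y = np.exp(X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=500))   # incremental claims proxy

stack = StackingRegressor(
    estimators=[
        ("gb", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("nn", MLPRegressor(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)),
    ],
    final_estimator=LinearRegression(),   # meta-learner combining the base predictions
)
stack.fit(X, y)
print(stack.predict(X[:3]))
```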
By: | Peter Belcak; Jan-Peter Calliess; Stefan Zohren |
Abstract: | We introduce a new software toolbox, called Multi-Agent eXchange Environment (MAXE), for agent-based simulation of limit order books. Offering both efficient C++ implementations and Python APIs, it allows the user to simulate large-scale agent-based market models while remaining user-friendly for rapid prototyping. Furthermore, it benefits from a versatile message-driven architecture that offers the flexibility to simulate a range of different (easily customisable) market rules and to study the effect of auxiliary factors, such as delays, on the market dynamics. Showcasing its utility for research, we employ our simulator to investigate the influence that the choice of matching algorithm has on the behaviour of artificial trader agents in a zero-intelligence model (a toy illustration follows this entry). In addition, we investigate the role of the order processing delay in normal trading on an exchange and in the scenario of a significant price change. Our results include the findings that (i) the variance of the bid-ask spread exhibits behaviour similar to the resonance of a damped harmonic oscillator with respect to the processing delay and that (ii) the delay markedly affects the impact a large trade has on the limit order book. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07871&r=all |
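A toy zero-intelligence limit-order-book simulation in plain Python, illustrating the kind of experiment described above (a simple matching rule and the resulting bid-ask spread). This is not part of the MAXE toolbox and ignores order sizes, delays and cancellations.

```python
import heapq
import random

random.seed(0)
bids, asks = [], []      # bids stored as negated prices so heapq behaves as a max-heap
spreads = []

for t in range(10_000):
    side = random.choice(["buy", "sell"])        # zero-intelligence order flow
    price = round(random.gauss(100, 2), 1)
    if side == "buy":
        if asks and price >= asks[0]:            # crosses the best ask: trade
            heapq.heappop(asks)
        else:                                    # otherwise rest in the book
            heapq.heappush(bids, -price)
    else:
        if bids and price <= -bids[0]:           # crosses the best bid: trade
            heapq.heappop(bids)
        else:
            heapq.heappush(asks, price)
    if bids and asks:
        spreads.append(asks[0] - (-bids[0]))

print("mean bid-ask spread:", sum(spreads) / len(spreads))
```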
By: | Loris Cannelli; Giuseppe Nuti; Marzio Sala; Oleg Szehr |
Abstract: | The construction of replication strategies for contingent claims in the presence of risk and market friction is a key problem of financial engineering. In real markets, continuous replication, such as in the model of Black, Scholes and Merton, is not only unrealistic but also undesirable due to high transaction costs. Over the last decades, stochastic optimal-control methods have been developed to balance effective replication against losses. More recently, with the rise of artificial intelligence, temporal-difference Reinforcement Learning, in particular variations of Q-learning in conjunction with Deep Neural Networks, has attracted significant interest. From a practical point of view, however, such methods are often relatively sample-inefficient, hard to train and lack performance guarantees. This motivates the investigation of a stable benchmark algorithm for hedging. In this article, the hedging problem is viewed as an instance of a risk-averse contextual k-armed bandit problem, for which a large body of theoretical results and well-studied algorithms are available (see the sketch after this entry). We find that the k-armed bandit model naturally fits the P&L formulation of hedging, providing a more accurate and sample-efficient approach than Q-learning and reducing to the Black-Scholes model in the absence of transaction costs and risks. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.01623&r=all |
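A minimal sketch of the k-armed-bandit view of hedging described above: candidate hedge ratios are the arms, the reward is a risk-adjusted one-period P&L of a short call, and an epsilon-greedy rule selects the arm. The market dynamics, cost level and risk penalty are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
arms = np.linspace(0.0, 1.0, 11)          # candidate hedge ratios (the bandit's arms)
counts = np.zeros(len(arms))
values = np.zeros(len(arms))              # running estimates of risk-adjusted reward
s0, strike, sigma, cost, lam, eps = 100.0, 100.0, 0.2, 0.002, 0.5, 0.1   # illustrative parameters

for episode in range(5000):
    a = rng.integers(len(arms)) if rng.random() < eps else int(np.argmax(values))
    h = arms[a]
    s1 = s0 * np.exp(sigma * np.sqrt(1 / 252) * rng.standard_normal())   # one-day price move
    payoff = max(s1 - strike, 0.0)                 # short one call option
    pnl = h * (s1 - s0) - payoff - cost * h * s0   # hedge P&L net of transaction costs
    reward = pnl - lam * pnl ** 2                  # mean-variance style risk penalty
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]  # incremental mean update

print("selected hedge ratio:", arms[int(np.argmax(values))])
```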
By: | Jie Fang; Jianwu Lin; Shutao Xia; Yong Jiang; Zhikang Xia; Xiang Liu |
Abstract: | Instead of conducting manual factor construction based on traditional and behavioural finance analysis, academic researchers and quantitative investment managers have in recent years leveraged Genetic Programming (GP) as an automatic feature construction tool, which builds reverse Polish mathematical expressions from trading data into new factors. However, with the development of deep learning, more powerful feature extraction tools are available. This paper proposes Neural Network-based Automatic Factor Construction (NNAFC), a tailored neural network framework that can automatically construct diversified financial factors based on financial domain knowledge and a variety of neural network structures. The experimental results show that NNAFC can construct more informative and diversified factors than GP, effectively enriching the current factor pool. For the current market, both fully connected and recurrent neural network structures are better at extracting information from financial time series than convolutional neural network structures. Moreover, new factors constructed by NNAFC consistently improve the return, Sharpe ratio, and maximum drawdown of a multi-factor quantitative investment strategy, because they introduce more information and diversification into the existing factor pool. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.06225&r=all |
By: | Dirk Roeder; Georgi Dimitroff |
Abstract: | In a recent paper, "Deep Learning Volatility", a fast two-step deep calibration algorithm for rough volatility models was proposed: in the first step, the time-consuming mapping from model parameters to implied volatilities is learned by a neural network, and in the second step, standard solver techniques are used to find the best model parameters. In our paper we compare these results with an alternative direct approach, in which the mapping from market implied volatilities to model parameters is approximated by the neural network, without the need for an extra solver step (see the sketch after this entry). Using a whitening procedure and a projection of the target parameters to [0,1], in order to be able to use a sigmoid-type output function, we find that the direct approach outperforms the two-step one for the data sets and methods published in "Deep Learning Volatility". For our implementation we use the open-source TensorFlow 2 library. The paper should be understood as a technical comparison of neural network techniques and not as a methodologically new Ansatz. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.03494&r=all |
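A minimal sketch, assuming PyTorch and a synthetic stand-in for the pricer, of the direct approach described above: inputs are whitened (here, simply standardised), the target parameters live in [0, 1], and a sigmoid output layer maps the implied-volatility vector straight to model parameters.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, n_vols, n_params = 2000, 80, 3
theta = torch.rand(n, n_params)                      # model parameters already projected to [0, 1]
vols = theta @ torch.randn(n_params, n_vols) + 0.05 * torch.randn(n, n_vols)  # stand-in for the pricer

mean, std = vols.mean(0), vols.std(0)
x = (vols - mean) / std                              # simple whitening step (per-feature standardisation)

net = nn.Sequential(nn.Linear(n_vols, 64), nn.ReLU(),
                    nn.Linear(64, n_params), nn.Sigmoid())   # sigmoid output matches the [0, 1] targets
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), theta)
    loss.backward()
    opt.step()

print("training MSE:", float(loss))
```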
By: | Dominique Guegan (UP1 - Université Panthéon-Sorbonne, CES - Centre d'économie de la Sorbonne - UP1 - Université Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, University of Ca’ Foscari [Venice, Italy]) |
Abstract: | We are interested in the analysis of the concept of interpretability associated with an ML algorithm. We distinguish between the "How", i.e., how a black box or a very complex algorithm works, and the "Why", i.e., why an algorithm produces a given result. These questions concern many actors: users, professionals, and regulators, among others. Using a formal standardized framework, we indicate the solutions that exist, specifying which elements of the supply chain are affected when we provide answers to the previous questions. This presentation, by standardizing the notation, allows us to compare the different approaches and to highlight the specificities of each of them, both in their objective and in their process. The study is not exhaustive and the subject is far from being closed. |
Keywords: | Interpretability, Counterfactual approach, Artificial Intelligence, Agnostic models, LIME method, Machine learning |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:hal:journl:halshs-02900929&r=all |
By: | Maximilian Joseph Blömer; Andreas Peichl |
Abstract: | This paper describes the ifo Tax and Transfer Behavioral Microsimulation Model (ifo-MSM-TTL), a policy microsimulation model for Germany. The model uses household microdata from the German Socio-Economic Panel and firm data from the German Linked Employer-Employee Dataset. The microsimulation model consists of three components: First, a static module simulates the effects of a tax-benefit reform on the budget of the individual household. This includes taxes on income and consumption, social security contributions, and public transfers. Second, behavioral labor supply responses are estimated. Third, a demand module takes into account possible restrictions on labor demand and identifies the partial equilibrium of the labor market after the supply reactions. The demand module distinguishes our model from most other microsimulation tools. |
Keywords: | Tax and benefit systems, labor supply, labor demand, Germany, policy simulation |
JEL: | D58 H20 J22 J23 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:ces:ifowps:_335&r=all |
By: | Kaukin, Andrei (Каукин, Андрей) (The Russian Presidential Academy of National Economy and Public Administration); Kosarev, Vladimir (Косарев, Владимир) (The Russian Presidential Academy of National Economy and Public Administration) |
Abstract: | The paper presents a method for conditional forecasting of the economic cycle that takes industry dynamics into account. The predictive model includes a neural network autoencoder and an adapted deep convolutional network with the WaveNet architecture. The first block reduces the dimensionality of the data; the second block predicts the phase of the economic cycle of the industry under study (see the sketch after this entry). The neural network uses the principal components of the explanatory factors as input. The proposed model can be used both as a stand-alone method and as a complement to dynamic factor models for estimating the growth rate of the industrial production index. |
Date: | 2020–05 |
URL: | http://d.repec.org/n?u=RePEc:rnp:wpaper:052019&r=all |
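A minimal sketch, assuming PyTorch, of the two blocks described above: an autoencoder compresses a panel of indicators, and a stack of dilated causal convolutions (WaveNet-style) processes the compressed series. Shapes and data are illustrative placeholders, not the paper's dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_series, n_time, latent = 120, 200, 8
panel = torch.randn(n_time, n_series)                  # monthly indicators (synthetic stand-in)

# Block 1: autoencoder for dimension reduction.
encoder = nn.Sequential(nn.Linear(n_series, 32), nn.ReLU(), nn.Linear(32, latent))
decoder = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, n_series))
ae_opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(300):
    ae_opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(panel)), panel)
    loss.backward()
    ae_opt.step()

z = encoder(panel).detach().T.unsqueeze(0)              # shape (1, latent, time)

# Block 2: dilated convolutions over the compressed series.
forecaster = nn.Sequential(
    nn.Conv1d(latent, 16, kernel_size=2, dilation=1), nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=2, dilation=2), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=1),
)
print(forecaster(z).shape)                               # one output signal per time step
```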
By: | Antosiewicz, Marek (Institute for Structural Research (IBS)); Fuentes, J. Rodrigo (University of Chile); Lewandowski, Piotr (Institute for Structural Research (IBS)); Witajewski-Baltvilks, Jan (Institute for Structural Research (IBS)) |
Abstract: | In this paper, we assess the distributional impact of introducing a carbon tax in Poland. We apply a two-step simulation procedure. First, we evaluate the economy-wide effects with a dynamic general equilibrium model. Second, we use a microsimulation model based on household budget survey data to assess the effects on various income groups and on inequality. We introduce a new adjustment channel related to employment changes, which is qualitatively different from price and behavioural effects, and is quantitatively important. We find that the overall distributional effect of a carbon tax is largely driven by how the revenue is spent: distributing the revenues from a carbon tax as lump-sum transfers to households reduces income inequality, while spending the revenues on a reduction of labour taxation increases inequality. These results could be relevant for other coal-producing countries, such as South Africa, Germany, or Australia. |
Keywords: | climate policy, carbon tax, distributional effect, microsimulation, general equilibrium, employment |
JEL: | H23 P18 O15 |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp13481&r=all |
By: | Anke Mönnig (GWS - Institute of Economic Structures Research); Frank Hohmann (GWS - Institute of Economic Structures Research) |
Abstract: | Forecasting has become an important part of policy planning. Economic forecasts offer guidance under given conditions. They are used and/or produced by politicians, researchers, companies, associations or unions. They enter decision-making processes and have an impact on state budgets, consumption decisions and personnel strategies, to name just a few. Ex-ante (policy) impact assessment (IA) is a forward-looking concept that has to deal with many unknowns (e.g. natural disasters). The predicted impact is only valid within a certain framework or set of assumptions. However, it allows judgements to be passed on the effectiveness and efficiency of planned measures. In many countries, impact assessment “has become an important tool for assisting policy makers in their decision-making process” (Großmann et al. 2016: 13). Given the importance of (ex-ante) impact assessment, the question regularly arises whether the forecasted results are robust. Or, put differently: how good is the forecast? One method to answer this is to apply counterfactual forecasts. Such counterfactual or ex-post scenarios are, however, challenging to model. There can be two reasons for going “back to the future” and facing this challenge: first, to test the accuracy of a forecasting model, or, second, to study the efficiency of an already implemented policy. While the first reason produces a “first order” ex-post simulation, the second is a “second order” ex-post simulation in which first-order results are used as a baseline. Only first-order simulations can be compared to the already known real world. With some diagnostic checks, such as mean, relative or squared error tests, the model's forecasting performance can be evaluated (see the sketch after this entry). However, the question is not whether there are error terms to be observed but how big they are. Second-order scenarios can only be compared to first-order scenarios, not to actual data. In this paper, we show how to perform ex-post forecasts with a macroeconometric input-output model, taking the example of COFORCE, which has been developed to forecast the Chilean economy until 2035 (Mönnig & Bieritz 2019). The remainder of the paper is structured as follows: first, an overview of the challenges involved in ex-post simulations is given; then the methodological approach is described; next, an ex-post simulation is performed with the COFORCE model; the paper concludes with the main findings. |
Keywords: | counterfactual scenarios, macroeconometric model building, input-output |
JEL: | C6 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:gws:dpaper:20-1&r=all |
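A minimal sketch of the diagnostic checks mentioned above (mean, relative and squared error) for comparing a first-order ex-post simulation with observed data; the two series are illustrative placeholders, not COFORCE output.

```python
import numpy as np

actual = np.array([100.0, 102.3, 104.1, 103.8, 106.0])    # observed values (placeholder)
expost = np.array([100.0, 101.8, 104.9, 104.6, 105.1])    # model's ex-post forecast (placeholder)

error = expost - actual
mean_error = error.mean()                                  # bias
mape = np.mean(np.abs(error) / np.abs(actual)) * 100       # mean absolute percentage error
rmse = np.sqrt(np.mean(error ** 2))                        # root mean squared error

print(f"ME={mean_error:.3f}  MAPE={mape:.2f}%  RMSE={rmse:.3f}")
```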
By: | Hannes Mueller (Institut d’Analisi Economica (CSIC), Barcelona GSE); Christopher Rauh (Université de Montréal, CIREQ) |
Abstract: | There is a rising interest in conflict prevention, and this interest provides a strong motivation for better conflict forecasting. A key problem of conflict forecasting for prevention is that predicting the start of conflict in previously peaceful countries is extremely hard. To make progress on this hard problem, this project exploits both supervised and unsupervised machine learning. Specifically, the latent Dirichlet allocation (LDA) model is used for feature extraction from 3.8 million newspaper articles, and these features are then used in a random forest model to predict conflict (see the sketch after this entry). We find that several features are negatively associated with the outbreak of conflict and that these gain importance when predicting hard onsets. This is because the decision tree uses the text features in lower nodes, where they are evaluated conditionally on conflict history, which allows the random forest to adapt to the hard problem and provides useful forecasts for prevention. |
Date: | 2019–04 |
URL: | http://d.repec.org/n?u=RePEc:mtl:montec:02-2019&r=all |
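A minimal sketch, assuming scikit-learn and a tiny placeholder corpus, of the two-step pipeline described above: LDA topics are extracted from text and then combined with conflict history as inputs to a random forest.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier

articles = [
    "protest erupts in the capital over fuel prices",
    "government and opposition sign peace agreement",
    "armed group attacks village near the border",
    "economy grows as exports of copper increase",
] * 25                                              # stand-in for newspaper articles
conflict_next_year = np.tile([1, 0, 1, 0], 25)       # stand-in labels
conflict_history = np.tile([1, 0, 0, 0], 25)         # e.g. conflict in the preceding year

counts = CountVectorizer(stop_words="english").fit_transform(articles)
topics = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(counts)

X = np.column_stack([topics, conflict_history])      # text features plus conflict history
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, conflict_next_year)
print(clf.feature_importances_)
```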
By: | Dr. Anett Großmann (GWS - Institute of Economic Structures Research); Svenja Schwarz (GWS - Institute of Economic Structures Research); Frank Hohmann (GWS - Institute of Economic Structures Research); Anke Mönnig (GWS - Institute of Economic Structures Research) |
Abstract: | The research project “Development of sustainable strategies in the Chilean mining sector through a regionalized national model” – funded by the German Federal Ministry of Education and Research – analyses the socio-economic impacts of copper on the Chilean economy. For this, the model COFORCE (COpper FORecasting in ChilE, www.coforce.cl) was developed from scratch. First, a macro-econometric input-output (IO) model for Chile (COFORCE) was built in line with the INFORUM (Interindustry FORcasting at the University of Maryland) modelling approach to forecast and simulate the impact of the copper industry on the overall economy. Second, due to the importance of Chilean copper exports, the COFORCE model is linked to the bilateral trade model TINFORGE, which captures, among other things, world trade in copper between 153 countries. The national COFORCE model receives export demand and import prices from the world model according to its global market shares. Third, the COFORCE model was regionalized by using an interregional input-output table developed by partners in Brazil (Haddad et al. 2018). The national model and 15 regional models for Chile are linked via final demand components and industries by applying a top-down approach. Regional economic growth is therefore mainly driven by the industry structure and by inter- and intraregional trade. This set of three projection and simulation models considers the main aspects of copper: (1) it is the main export product, (2) it has a huge impact on economic development, and (3) the copper industry is concentrated differently across regions. The modelling tools are applied to the evaluation of alternative economic scenarios, e.g. copper export scenarios at the national and subnational level. The main focus of this paper is to introduce the methodology used to regionalize the national model COFORCE, to explain the main transmission channels and to present regional modelling results. The national model COFORCE and the underlying model philosophy and characteristics are explained in detail in Mönnig/Bieritz 2019. Section 4 shows examples of applications and how to implement scenarios by using the graphical user interface solver(c) (see section 4.1), which includes the underlying data set (historical and forecasted) and supports the user in scenario design. |
Keywords: | Model Building, Input-Output, Sustainable Mining, Copper |
JEL: | C67 R15 R11 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:gws:dpaper:20-3&r=all |
By: | Rangan Gupta (Department of Economics, University of Pretoria, Pretoria, 0002, South Africa); Hardik A. Marfatia (Department of Economics, Northeastern Illinois University, 5500 N St Louis Ave, BBH 344G, Chicago, IL 60625, USA); Christian Pierdzioch (Department of Economics, Helmut Schmidt University, Holstenhofweg 85, P.O.B. 700822, 22008 Hamburg, Germany); Afees A. Salisu (Centre for Econometric & Allied Research, University of Ibadan, Ibadan, Nigeria) |
Abstract: | We analyze the role of macroeconomic uncertainty in predicting synchronization in housing price movements across all United States (US) states plus the District of Columbia (DC). We first use a Bayesian dynamic factor model to decompose the house price movements into a national factor, four regional factors (Northeast, South, Midwest, and West), and state-specific factors. We then study the ability of macroeconomic uncertainty to forecast the comovements in housing prices, controlling for a wide array of predictors, such as factors derived from a large macroeconomic dataset, oil shocks, and financial market-related uncertainties. To accommodate multiple predictors and nonlinearities, we take a machine learning approach based on random forests. Our results provide strong evidence of forecastability of the national house price factor based on the information content of macroeconomic uncertainties, over and above the other predictors. This result also carries over, albeit to a varying degree, to the factors associated with the four census regions and to the overall house price growth of the US economy. Moreover, macroeconomic uncertainty is found to have predictive content for the (stochastic) volatility of the national factor and of aggregate US house prices. Our results have important implications for policymakers and investors. |
Keywords: | Machine learning, Random forests, Bayesian dynamic factor model, Forecasting, Housing markets synchronization, United States |
JEL: | C22 C32 E32 Q02 R30 |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:pre:wpaper:202077&r=all |
By: | Lakshay Chauhan; John Alberg; Zachary C. Lipton |
Abstract: | On a periodic basis, publicly traded companies report fundamentals: financial data including revenue, earnings and debt, among others. Quantitative finance research has identified several factors, functions of the reported data that historically correlate with stock market performance. In this paper, we first show through simulation that if we could select stocks via factors calculated on future fundamentals (via an oracle), our portfolios would far outperform standard factor models. Motivated by this insight, we train deep nets to forecast future fundamentals from a trailing five-year history. We propose lookahead factor models, which plug these predicted future fundamentals into traditional factors. Finally, we incorporate uncertainty estimates from both neural heteroscedastic regression and a dropout-based heuristic, improving performance by adjusting our portfolios to avert risk (see the sketch after this entry). In a retrospective analysis, we leverage an industry-grade portfolio simulator (backtester) to show simultaneous improvement in annualized return and Sharpe ratio. Specifically, the simulated annualized return for the uncertainty-aware model is 17.7% (vs 14.0% for a standard factor model) and the Sharpe ratio is 0.84 (vs 0.52). |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.04082&r=all |
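A minimal sketch, assuming PyTorch and synthetic data, of two ingredients described above: a network that forecasts a future fundamental from a trailing history, and a dropout-based heuristic that keeps dropout active at prediction time to obtain an uncertainty estimate for a risk-averse ranking signal.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_firms, history = 500, 20
X = torch.randn(n_firms, history)                        # trailing fundamentals (synthetic stand-in)
y = X[:, -5:].mean(dim=1, keepdim=True) + 0.1 * torch.randn(n_firms, 1)   # future fundamental proxy

net = nn.Sequential(nn.Linear(history, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    nn.functional.mse_loss(net(X), y).backward()
    opt.step()

net.train()                                              # keep dropout active at prediction time
with torch.no_grad():
    samples = torch.stack([net(X) for _ in range(100)])  # Monte Carlo dropout samples
pred_mean, pred_std = samples.mean(0), samples.std(0)
score = pred_mean - 0.5 * pred_std                       # uncertainty-penalised ranking signal
print(score[:5].squeeze())
```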
By: | Burn, Ian (University of Liverpool); Button, Patrick (Tulane University); Munguia Corella, Luis (University of California, Irvine); Neumark, David (University of California, Irvine) |
Abstract: | We study the relationships between ageist stereotypes – as reflected in the language used in job ads – and age discrimination in hiring, exploiting the text of job ads and differences in callbacks to older and younger job applicants from a resume (correspondence study) field experiment (Neumark, Burn, and Button, 2019). Our analysis uses methods from computational linguistics and machine learning to directly identify, in a field-experiment setting, ageist stereotypes that underlie age discrimination in hiring. The methods we develop provide a framework for applied researchers analyzing textual data, highlighting the usefulness of various computer science techniques for empirical economics research. We find evidence that language related to stereotypes of older workers sometimes predicts discrimination against older workers. For men, our evidence points to age stereotypes about all three categories we consider – health, personality, and skill – predicting age discrimination, and for women, age stereotypes about personality. In general, the evidence is much stronger for men, and our results for men are quite consistent with the industrial psychology literature on age stereotypes. |
Keywords: | ageist stereotypes, age discrimination, job ads, machine learning |
JEL: | J14 J7 |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp13506&r=all |
By: | Nora Lustig (Tulane University); Valentina Martinez Pabon (Tulane University); Federico Sanz (Tulane University); Stephen D. Younger (CEQ Institute) |
Abstract: | We use microsimulation to estimate the distributional consequences of COVID-19-induced lockdown policies in Argentina, Brazil, Colombia and Mexico. Our estimates of the poverty consequences are worse than many other projections because we do not assume that income losses are proportionally equal across the income distribution. We also simulate the effects of most of the expanded social assistance governments have introduced in response to the crisis. This has a large offsetting effect in Brazil and Argentina, and much less of one in Colombia. In Mexico, there has been no such expansion. Contrary to prior expectations, we find that the worst effects are not on the poorest, but on those (roughly) in the middle of the ex ante income distribution. In Brazil we find that poverty among the Afro-descendant and indigenous populations increases by more than for whites, but the offsetting effects of expanded social assistance are also larger for the former. In Mexico, the crisis induces significantly less poverty among the indigenous population than among the non-indigenous one. In all countries the increase in poverty induced by the lockdown is similar for male- and female-headed households, but the offsetting effect of expanded social assistance is greater for female-headed households. |
Keywords: | Covid-19, inequality, poverty, mobility, microsimulations, Latin America |
JEL: | C63 D31 I32 I38 |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:inq:inqwps:ecineq2020-558&r=all |
By: | Loreto Bieritz (GWS - Institute of Economic Structures Research); Anke Mönnig (GWS - Institute of Economic Structures Research) |
Abstract: | As one of the world's leading copper producers, Chile's economy is strongly focused on copper: 14% of its GDP is based on the mining sector, 30% of the country's total investments (including FDI) and around 45% of Chilean exports originate from copper. Hence, Chile's dependency on this metal is high. The fact that governmental spending is directly linked to the copper price projected by the copper reference price committee emphasizes the importance of the price for Chile's economic development. For this purpose the ministry of finance convenes a panel of ten national copper experts and asks them for a copper price projection in constant prices for the next decade. By taking the average of the given projections, corrected for the highest and lowest projection, the committee obtains the projected copper price. Due to the current international trade conflict between China and the USA, copper demand has slowed down, especially because China is one of the world's biggest copper importers. In the two years before the trade conflict began, an optimistic mood prevailed in the world copper market, as worldwide demand and prices rose constantly. From a mid-term perspective, copper demand can be expected to remain at a high and stable level, since climate change with its growing environmental demands nourishes copper demand through an increasing market for electromobility and renewable energy. In comparison to combustion engines or conventional energy, these new technologies require a far higher amount of copper (Uken (2011), Warren Centre (2016)). For Chile, these tendencies are promising. But China has started to enlarge its copper purchases from Peru (Fajardo (2017), Ortiz (2017)). The imported Peruvian copper lacks the purity of the Chilean one, but China's refining capacities give the country the possibility of importing the cheaper copper and refining it itself (Campodónico (2016), El Comercio (2017)). The present scenario analysis is based on the question of whether Chile's northern neighbour could become a serious competitor. To answer this question, Chilean exports are reduced and the macroeconomic effects for Chile are analysed. For this projection the forecasting and simulation model COFORCE is applied. |
Keywords: | Chilean mining sector, Sustainable Mining, Copper, Chilean economy |
JEL: | C5 O2 Q3 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:gws:dpaper:20-2&r=all |
By: | Philippe Goulet Coulombe |
Abstract: | It is notoriously hard to build a bad Random Forest (RF). Concurrently, RF is perhaps the only standard ML algorithm that blatantly overfits in-sample without any consequence out-of-sample (a small simulation in this spirit follows this entry). Standard arguments cannot rationalize this paradox. I propose a new explanation: bootstrap aggregation and model perturbation as implemented by RF automatically prune a (latent) true underlying tree. More generally, there is no need to tune the stopping point of a properly randomized ensemble of greedily optimized base learners; thus, Boosting and MARS are also eligible. I empirically demonstrate the property with simulations and real data, reporting that these new ensembles yield performance equivalent to their tuned counterparts. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07063&r=all |
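A small simulation, using scikit-learn and an illustrative data-generating process, of the paradox described above: a fully grown random forest fits the training sample almost perfectly yet still generalises, so no stopping point is tuned.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=2000)  # illustrative DGP
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

rf = RandomForestRegressor(n_estimators=500, max_depth=None,   # trees grown without a stopping point
                           random_state=0).fit(X_tr, y_tr)
print("in-sample R2:", rf.score(X_tr, y_tr))        # close to 1: blatant in-sample overfit
print("out-of-sample R2:", rf.score(X_te, y_te))    # yet still reasonable out-of-sample
```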
By: | Deininger, Klaus W.; Ali, Daniel Ayalew; Kussul, Nataliia; Lavreniuk, Mykola; Nivievskyi, Oleg |
Abstract: | To overcome the constraints for policy and practice posed by limited availability of data on crop rotation, this paper applies machine learning to freely available satellite imagery to identify the rotational practices of more than 7,000 villages in Ukraine. Rotation effects estimated based on combining these data with survey-based yield information point toward statistically significant and economically meaningful effects that differ from what has been reported in the literature, highlighting the value of this approach. Independently derived indices of vegetative development and soil water content produce similar results, not only supporting the robustness of the results, but also suggesting that the opportunities for spatial and temporal disaggregation inherent in such data offer tremendous unexploited opportunities for policy-relevant analysis. |
Date: | 2020–06–29 |
URL: | http://d.repec.org/n?u=RePEc:wbk:wbrwps:9306&r=all |
By: | Edmond Lezmi; Jules Roche; Thierry Roncalli; Jiali Xu |
Abstract: | This article explores the use of machine learning models to build a market generator. The underlying idea is to simulate artificial multi-dimensional financial time series whose statistical properties are the same as those observed in the financial markets. In particular, these synthetic data must preserve the probability distribution of asset returns, the stochastic dependence between the different assets and the autocorrelation across time. The article then proposes a new approach for estimating the probability distribution of backtest statistics. The final objective is to develop a framework for improving the risk management of quantitative investment strategies, in particular in the space of smart beta, factor investing and alternative risk premia. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.04838&r=all |
By: | Kazuya Kaneko; Koichi Miyamoto; Naoyuki Takeda; Kazuyoshi Yoshino |
Abstract: | Applications of the quantum algorithm for Monte Carlo simulation to the pricing of financial derivatives have been discussed in previous papers. However, up to now, the pricing model discussed in such papers has been the Black-Scholes model, which is important but simple. It is therefore worth considering how to implement the more complex models used in practice in financial institutions. In this paper, we consider the local volatility (LV) model, in which the volatility of the underlying asset price depends on the price and time. We present two types of implementation. One is the register-per-RN way, adopted in most previous papers, in which each of the random numbers (RNs) required to generate a path of the asset price is generated on a separate register, so the required qubit number increases in proportion to the number of RNs. The other is the PRN-on-a-register way, proposed in the author's previous work, in which a sequence of pseudo-random numbers (PRNs) generated on a single register is used to generate paths of the asset price, so the required qubit number is reduced with a trade-off against circuit depth. We present circuit diagrams for these two implementations in detail and estimate the required resources: qubit number and T-count. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.01467&r=all |
By: | Christopher Rauh (Université de Montréal, CIREQ) |
Abstract: | In this paper I present a methodology to provide uncertainty measures at the regional level in real time using the full bandwidth of news. In order to do so I download vast amounts of newspaper articles, summarize these into topics using unsupervised machine learning, and then show that the resulting topics foreshadow fluctuations in economic indicators. Given large regional disparities in economic performance and trends within countries, it is particularly important to have regional measures for a policymaker to tailor policy responses. I use a vector-autoregression model for the case of Canada, a large and diverse country, to show that the generated topics are significantly related to movements in economic performance indicators, inflation, and the unemployment rate at the national and provincial level. Evidence is provided that a composite index of the generated diverse topics can serve as a measure of uncertainty. Moreover, I show that some topics are general enough to have homogeneous associations across provinces, while others are specific to fluctuations in certain regions. |
Keywords: | machine learning, latent Dirichlet allocation, newspaper text, economic uncertainty, topic model, Canada |
Date: | 2019–09 |
URL: | http://d.repec.org/n?u=RePEc:mtl:montec:09-2019&r=all |
By: | Aleksandr Alekseev (Chapman University) |
Abstract: | I study the welfare effect of automation on workers in a setting where technology is complementary but imperfect. Using a modified task-based framework, I argue that imperfect complementary automation can impose non-pecuniary costs on workers via a behavioral channel. The theoretical model suggests that a critical factor determining the welfare effect of imperfect complementary automation is the automatability of the production process. I confirm the model's predictions in an experiment that elicits subjects' revealed preference for automation. Increasing automatability leads to a significant increase in the demand for automation. I explore additional drivers of the demand for automation using machine learning analysis and textual analysis of choice reasons. The analysis reveals that task enjoyment, performance, and cognitive flexibility are the most important predictors of subjects' choices. There is significant heterogeneity in how subjects evaluate imperfect complementary automation. I discuss the implications of my results for workers' welfare, technology adoption, and inequality. |
Keywords: | automation, worker welfare, imperfect technology, task-switching, personnel economics, experiment |
JEL: | C91 D63 D91 M52 J24 O33 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:chu:wpaper:20-29&r=all |
By: | Daisuke Miyakawa (Associate Professor, Hitotsubashi University Business School (E-mail: dmiyakawa@hub.hit-u.ac.jp)); Kohei Shintani (Director and Senior Economist, Institute for Monetary and Economic Studies, Bank of Japan (E-mail: kouhei.shintani@boj.or.jp)) |
Abstract: | We document how professional analysts' predictions of firm exits disagree with machine-based predictions. First, on average, human predictions underperform machine predictions. Second, however, the relative performance of human to machine predictions improves for firms with specific characteristics, such as less observable information, possibly due to the unstructured information used only in human predictions. Third, for firms with less information, reallocating prediction tasks from machine to analysts reduces type I error while simultaneously increasing type II error. Under certain conditions, human predictions can outperform machine predictions. |
Keywords: | Machine Learning, Human Prediction, Disagreement |
JEL: | C10 C55 G33 |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:ime:imedps:20-e-11&r=all |
By: | Andrés Ramírez-Hassan; Raquel Vargas-Correa; Gustavo García; Daniel Londoño |
Abstract: | We propose a simple approach to optimally select the number of control units in the k nearest neighbours (kNN) algorithm, focusing on minimizing the mean squared error of the average treatment effects. Our approach is non-parametric, with confidence intervals for the treatment effects calculated using asymptotic results with bias correction. Simulation exercises show that our approach achieves relatively small mean squared errors and a balance between confidence interval length and type I error. We analyze the average treatment effect on the treated (ATET) of participation in 401(k) plans on accumulated net financial assets, confirming significant effects on both the amount and the probability of positive net assets. Our optimal k selection produces significantly narrower ATET confidence intervals compared with the common practice of using k=1 (see the sketch after this entry). |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.06564&r=all |
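A minimal sketch of k-nearest-neighbour matching for the ATET evaluated over a grid of k, on simulated data. The paper's MSE-minimising selection rule and bias correction are not reproduced; the sketch only shows the estimator whose tuning parameter the paper selects.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 3))
treated = rng.random(n) < 1 / (1 + np.exp(-x[:, 0]))            # selection on covariates
y = x[:, 0] + 0.5 * x[:, 1] + 1.0 * treated + rng.normal(size=n)  # true ATET = 1 (simulated)

x_t, y_t = x[treated], y[treated]
x_c, y_c = x[~treated], y[~treated]

for k in [1, 5, 10, 25, 50]:
    nn_c = NearestNeighbors(n_neighbors=k).fit(x_c)
    _, idx = nn_c.kneighbors(x_t)                 # k control matches per treated unit
    atet = np.mean(y_t - y_c[idx].mean(axis=1))   # treated outcome minus matched control mean
    print(f"k={k:3d}  ATET estimate={atet:.3f}")
```

Larger k lowers variance but raises bias; the data-driven rule for trading these off is the paper's contribution, which the loop above only motivates.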
By: | Boyuan Zhang |
Abstract: | In this paper, we estimate and leverage a latent constant group structure to generate point, set, and density forecasts for short dynamic panel data. We implement a nonparametric Bayesian approach to simultaneously identify coefficients and group membership in the random effects, which are heterogeneous across groups but fixed within a group. This method allows us to incorporate subjective prior knowledge on the group structure that potentially improves predictive accuracy. In Monte Carlo experiments, we demonstrate that our Bayesian grouped random effects (BGRE) estimators produce accurate estimates and score predictive gains over standard panel data estimators. With a data-driven group structure, the BGRE estimators exhibit clustering accuracy comparable to the unsupervised machine learning algorithm K-means and outperform K-means in a two-step procedure. In the empirical analysis, we apply our method to forecast the investment rate across a broad range of firms and illustrate that the estimated latent group structure facilitates forecasts relative to standard panel data estimators. |
Date: | 2020–07 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2007.02435&r=all |
By: | Daniel Goller |
Abstract: | We analyse a sequential contest with two players in darts, where one of the contestants enjoys a technical advantage. Using methods from the causal machine learning literature, we analyse the built-in advantage, namely that the first-mover has potentially more, but never fewer, moves. Our empirical findings suggest that the technical advantage gives the first-mover an 8.6 percentage point higher probability of winning the match. Contestants with low performance measures and little experience have the highest built-in advantage. With regard to the fairness principle that contestants with equal abilities should have equal winning probabilities, this contest is ex-ante fair in the case of equal built-in advantages for both competitors and a randomized starting right. Nevertheless, the contest design produces unequal probabilities of winning for equally skilled contestants because of asymmetries in the built-in advantage associated with social pressure for contestants competing at home and away. |
Date: | 2020–08 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2008.07165&r=all |
By: | Jacobs, B.J.D.; Fok, D.; Donkers, A.C.D. |
Abstract: | In modern retail contexts, retailers sell products from vast product assortments to a large and heterogeneous customer base. Understanding purchase behavior in such a context is very important. Standard models cannot be used due to the high dimensionality of the data. We propose a new model that creates an efficient dimension reduction through the idea of purchase motivations. We only require customer-level purchase history data, which is ubiquitous in modern retailing. The model handles large-scale data and even works in settings with shopping trips consisting of few purchases. As scalability of the model is essential for practical applicability, we develop a fast, custom-made inference algorithm based on variational inference. Essential features of our model are that it accounts for the product, customer and time dimensions present in purchase history data; relates the relevance of motivations to customer- and shopping-trip characteristics; captures interdependencies between motivations; and achieves superior predictive performance. Estimation results from this comprehensive model provide deep insights into purchase behavior. Such insights can be used by managers to create more intuitive, better informed, and more effective marketing actions. We illustrate the model using purchase history data from a Fortune 500 retailer involving more than 4,000 unique products. |
Keywords: | dynamic purchase behavior, large-scale assortment, purchase history data, topic model, machine learning, variational inference |
Date: | 2020–08–01 |
URL: | http://d.repec.org/n?u=RePEc:ems:eureri:129674&r=all |
By: | Keeley, Alexander Ryota; Matsumoto, Ken'ichi; Tanaka, Kenta; Sugiawan, Yogi; Managi, Shunsuke |
Abstract: | This study combines regression analysis with machine learning to study the merit order effect of renewable energy, focusing on the German market, the largest market in Europe with high renewable energy penetration (see the sketch after this entry). The results show that electricity from wind and solar sources reduced the spot market price by 9.64 €/MWh on average during the period from 2010 to 2017. Wind had a relatively stable impact across the day, ranging from 5.88 €/MWh to 8.04 €/MWh, while the impact of solar energy varied greatly across different hours, ranging from 0.24 €/MWh to 11.78 €/MWh and exceeding that of wind during peak hours. The results also show characteristics of the interactions between renewable energy and spot market prices, including the slightly diminishing merit order effect of renewable energy at high generation volumes. Finally, a scenario-based analysis illustrates how different proportions of wind and solar energy affect the spot market price. |
Keywords: | Renewable energy sources, Electricity spot price, Intermittency, Merit order effect, Boosting |
JEL: | Q41 Q42 Q47 Q56 |
Date: | 2020 |
URL: | http://d.repec.org/n?u=RePEc:pra:mprapa:102314&r=all |
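A minimal sketch, using statsmodels and simulated placeholder data (not German market data), of a merit-order-effect regression in the spirit of the entry above: hourly spot prices regressed on wind and solar feed-in plus load, with negative renewable coefficients indicating the merit order effect.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 8760                                                 # one year of hourly observations
wind = rng.gamma(2.0, 5.0, n)                            # wind feed-in, GW (simulated)
solar = np.clip(rng.normal(3.0, 2.0, n), 0, None)        # solar feed-in, GW (simulated)
load = 50 + 10 * np.sin(np.arange(n) * 2 * np.pi / 24) + rng.normal(0, 2, n)
price = 40 + 1.2 * load - 0.8 * wind - 0.6 * solar + rng.normal(0, 5, n)   # EUR/MWh (simulated)

X = sm.add_constant(np.column_stack([wind, solar, load]))
ols = sm.OLS(price, X).fit()
print(ols.params)    # negative wind and solar coefficients reflect the merit order effect
```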