nep-cmp 2020-04-06 papers

on Computational Economics

Issue of 2020‒04‒06
twenty-six papers chosen by

Is the Juice Worth the Squeeze? Machine Learning (ML) In and For Agent-Based Modelling (ABM) By Johannes Dahlke; Kristina Bogner; Matthias Mueller; Thomas Berger; Andreas Pyka; Bernd Ebersberger
Automated Vehicles are Expected to Increase Driving and Emissions Without Policy Intervention By Rodier, Caroline; Jaller, Miguel; Pourrahmani, Elham; Pahwa, Anmol; Bischoff, Joschka; Freedman, Joel
EB-dynaRE: Real-Time Adjustor for Brownian Movement with Examples of Predicting Stock Trends Based on a Novel Event-Based Supervised Learning Algorithm By Yang Chen; Emerson Li
A spatial agent based model for simulating and optimizing networked eco-industrial systems By J. Raimbault; J. Broere; M. Somveille; J. M. Serna; E. Strombom; C. Moore; B. Zhu; L. Sugar
Sentiment, emotions and stock market predictability in developed and emerging markets By Steyn, Dimitri H. W.; Greyling, Talita; Rossouw, Stephanie; Mwamba, John M.
"Mechanism Design with Blockchain Enforcement" By Kohei Maehashi; Mototsugu Shintani
Institutional sector classifier, a machine learning approach By Paolo Massaro; Ilaria Vannini; Oliver Giudice
Causal Simulation Experiments: Lessons from Bias Amplification By Tyrel Stokes; Russell Steele; Ian Shrier
Irpef: (Un)Fairness and (in)efficiency. A structural analysis based on the BIMic microsimulation model By Nicola Curci; Pietro Rizza; Marzia Romanelli; Marco Savegnago
Algorithmic trading in a microstructural limit order book model By Frédéric Abergel; Côme Huré; Huyên Pham
Zero-Intelligence vs. Human Agents: An Experimental Analysis of the Efficiency of Double Auctions and Over-the-Counter Markets of Varying Sizes By Giuseppe Attanasi; Samuele Centorrino; Elena Manzoni
RIOTs in Germany - constructing an interregional input-output table for Germany By Krebs, Oliver
A nested computational social science approach for deep-narrative analysis in energy policy research By Debnath, Ramit; Darby, Sarah; Bardhan, Ronita; Mohaddes, Kamiar; Sunikka-Blank, Minna
A Super-Learning Machine for Predicting Economic Outcomes By Cerulli, Giovanni
Research Notes: Data Structures for Social Media Machine Learning — The Tweet Term Matrix (TTM) and Tweet Bio-Term Matrix (TBTM) By Flor, Nick V.
Economic Effects of the USA - China Trade War: CGE Analysis with the GTAP 9.0a Data Base By Enkhbayar Shagdar; Tomoyoshi Nakajima
Quality checks on granular banking data: an experimental approach based on machine learning? By Fabio Zambuto; Maria Rosaria Buzzi; Giuseppe Costanzo; Marco Di Lucido; Barbara La Ganga; Pasquale Maddaloni; Fabio Papale; Emiliano Svezia
Poverty-reducing or Poverty-inducing? A CGE-based Analysis of Foreign Capital Inflows in Pakistan By Siddiqui, Rizwana; Kemal, A.R.
A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency By Xinyi Guo; Jinfeng Li
Simple Rules for a Complex World with Artificial Intelligence By Jesus Fernandez-Villaverde
Machine Learning or Econometrics for Credit Scoring: Let's Get the Best of Both Worlds By Elena Dumitrescu; Sullivan Hué; Christophe Hurlin; Sessi Tokpavi
NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay By Jianbin Lin; Zhiqiang Zhang; Jun Zhou; Xiaolong Li; Jingli Fang; Yanming Fang; Quan Yu; Yuan Qi
Blockwise Euclidean likelihood for spatio-temporal covariance models By Víctor Morales-Oñate; Federico Crudu; Moreno Bevilacqua
Gamma Related Ornstein-Uhlenbeck Processes and their Simulation By Nicola Cufaro Petroni; Piergiacomo Sabino
Estimating intergenerational income mobility on sub-optimal data: a machine learning approach By Francesco Bloise; Paolo Brunori; Patrizio Piraino
Revenu de base – Simulations en vue d’une expérimentation By Mahdi Ben Jelloul; Antoine Bozio; Sophie Cottet; Brice Fabre; Claire Leroy

Is the Juice Worth the Squeeze? Machine Learning (ML) In and For Agent-Based Modelling (ABM)

By:	Johannes Dahlke; Kristina Bogner; Matthias Mueller; Thomas Berger; Andreas Pyka; Bernd Ebersberger
Abstract:	In recent years, many scholars praised the seemingly endless possibilities of using machine learning (ML) techniques in and for agent-based simulation models (ABM). To get a more comprehensive understanding of these possibilities, we conduct a systematic literature review (SLR) and classify the literature on the application of ML in and for ABM according to a theoretically derived classification scheme. We do so to investigate how exactly machine learning has been utilized in and for agent-based models so far and to critically discuss the combination of these two promising methods. We find that, indeed, there is a broad range of possible applications of ML to support and complement ABMs in many different ways, already applied in many different disciplines. We see that, so far, ML is mainly used in ABM for two broad cases: First, the modelling of adaptive agents equipped with experience learning and, second, the analysis of outcomes produced by a given ABM. While these are the most frequent, there also exist a variety of many more interesting applications. This being the case, researchers should dive deeper into the analysis of when and how which kinds of ML techniques can support ABM, e.g. by conducting a more in-depth analysis and comparison of different use cases. Nonetheless, as the application of ML in and for ABM comes at certain costs, researchers should not use ML for ABMs just for the sake of doing it.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2003.11985&r=all

Automated Vehicles are Expected to Increase Driving and Emissions Without Policy Intervention

By:	Rodier, Caroline; Jaller, Miguel; Pourrahmani, Elham; Pahwa, Anmol; Bischoff, Joschka; Freedman, Joel
Abstract:	Researchers at UC Davis explored what an automated vehicle future in the San Francisco Bay Area might look like by simulating: 1) A 100% personal automated vehicle future and its effects on travel and greenhouse emissions. 2) The introduction of an automated taxi service with plausible per-mile fares and its effects on conventional personal vehicle and transit travel. The researchers used the Metropolitan Transportation Commission’s activity-based travel demand model (MTC-ABM) and MATSim, an agent-based transportation model, to carry out the simulations. This policy brief summarizes the results, which provide insight into the relative benefits of each service and automated vehicle technology and the potential market for these services.
Keywords:	Engineering, Social and Behavioral Sciences, Intelligent vehicles, Multi-agent systems, Multimodal transportation, Public transit, Ridesharing, Simulation, Traffic simulation, Travel behavior, Travel demand, Value of time
Date:	2020–03–01
URL:	http://d.repec.org/n?u=RePEc:cdl:itsdav:qt4sf2n6rs&r=all

EB-dynaRE: Real-Time Adjustor for Brownian Movement with Examples of Predicting Stock Trends Based on a Novel Event-Based Supervised Learning Algorithm

By:	Yang Chen; Emerson Li
Abstract:	Stock prices are influenced over time by underlying macroeconomic factors. Jumping out of the box of conventional assumptions about the unpredictability of the market noise, we modeled the changes of stock prices over time through the Markov Decision Process, a discrete stochastic control process that aids decision making in a situation that is partly random. We then did a "Region of Interest" (RoI) Pooling of the stock time-series graphs in order to predict future prices with existing ones. Generative Adversarial Network (GAN) is then used based on a competing pair of supervised learning algorithms, to regenerate future stock price projections on a real-time basis. The supervised learning algorithm used in this research, moreover, is original to this study and will have wider uses. With the ensemble of these algorithms, we are able to identify, to what extent, each specific macroeconomic factor influences the change of the Brownian/random market movement. In addition, our model will have a wider influence on the predictions of other Brownian movements.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2003.11473&r=all

A spatial agent based model for simulating and optimizing networked eco-industrial systems

By:	J. Raimbault; J. Broere; M. Somveille; J. M. Serna; E. Strombom; C. Moore; B. Zhu; L. Sugar
Abstract:	Industrial symbiosis involves creating integrated cycles of by-products and waste between networks of industrial actors in order to maximize economic value, while at the same time minimizing environmental strain. In such a network, the global environmental strain is no longer equal to the sum of the environmental strain of the individual actors, but it is dependent on how well the network performs as a whole. The development of methods to understand, manage or optimize such networks remains an open issue. In this paper we put forward a simulation model of by-product flow between industrial actors. The goal is to introduce a method for modelling symbiotic exchanges from a macro perspective. The model takes into account the effect of two main mechanisms on a multi-objective optimization of symbiotic processes. First it allows us to study the effect of geographical properties of the economic system, said differently, where actors are divided in space. Second, it allows us to study the effect of clustering complementary actors together as a function of distance, by means of a spatial correlation between the actors' by-products. Our simulations unveil patterns that are relevant for macro-level policy. First, our results show that the geographical properties are an important factor for the macro performance of symbiotic processes. Second, spatial correlations, which can be interpreted as planned clusters such as Eco-industrial parks, can lead to a very effective macro performance, but only if these are strictly implemented. Finally, we provide a proof of concept by comparing the model to real world data from the European Pollutant Release and Transfer Register database using georeferencing of the companies in the dataset. This work opens up research opportunities in interactive data-driven models and platforms to support real-world implementation of industrial symbiosis.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2003.14133&r=all

Sentiment, emotions and stock market predictability in developed and emerging markets

By:	Steyn, Dimitri H. W.; Greyling, Talita; Rossouw, Stephanie; Mwamba, John M.
Abstract:	This paper investigates the predictability of stock market movements using text data extracted from the social media platform, Twitter. We analyse text data to determine the sentiment and the emotion embedded in the Tweets and use them as explanatory variables to predict stock market movements. The study contributes to the literature by analysing high-frequency data and comparing the results obtained from analysing emerging and developed markets, respectively. To this end, the study uses three different Machine Learning Classification Algorithms, the Naïve Bayes, K-Nearest Neighbours and the Support Vector Machine algorithm. Furthermore, we use several evaluation metrics such as the Precision, Recall, Specificity and the F-1 score to test and compare the performance of these algorithms. Lastly, we use the K-Fold Cross-Validation technique to validate the results of our machine learning models and the Variable Importance Analysis to show which variables play an important role in the prediction of our models. The predictability of the market movements is estimated by first including sentiment only and then sentiment with emotions. Our results indicate that investor sentiment and emotions derived from stock market-related Tweets are significant predictors of stock market movements, not only in developed markets but also in emerging markets.
Keywords:	Sentiment Analysis,Classification,Stock Prediction,Machine Learning
JEL:	C6 C8 G0
Date:	2020
URL:	http://d.repec.org/n?u=RePEc:zbw:glodps:502&r=all
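
The evaluation metrics named in this abstract (Precision, Recall, Specificity and the F-1 score) can be illustrated with a minimal sketch. It is not the authors' code: the function name `classification_metrics` and the toy up/down labels are assumptions made purely for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, specificity and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix cells, counted relative to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)      # share of predicted "up" that were up
    recall = ratio(tp, tp + fn)         # share of actual "up" that were found
    specificity = ratio(tn, tn + fp)    # share of actual "down" that were found
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical labels: 1 = market moved up, 0 = market moved down.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

Reporting specificity alongside recall matters when the two market directions are unevenly represented, since a classifier that always predicts "up" would score perfect recall and zero specificity.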

"Mechanism Design with Blockchain Enforcement"

By:	Kohei Maehashi (School of Engineering, The University of Tokyo); Mototsugu Shintani (Faculty of Economics, The University of Tokyo)
Abstract:	We perform a thorough comparative analysis of factor models and machine learning to forecast Japanese macroeconomic time series. Our main results can be summarized as follows. First, factor models and machine learning perform better than the conventional AR model in many cases. Second, predictions made by machine learning methods perform particularly well for medium to long forecast horizons. Third, the success of machine learning mainly comes from the nonlinearity and interaction of variables, suggesting the importance of nonlinear structure in predicting the Japanese macroeconomic series. Fourth, while neural networks are helpful in forecasting, simply adding many hidden layers does not necessarily enhance its forecast accuracy. Fifth, the composite forecast of factor models and machine learning performs better than factor models or machine learning alone, and machine learning methods applied to principal components are found to be useful in the composite forecast.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:tky:fseres:2020cf1146&r=all

Institutional sector classifier, a machine learning approach

By:	Paolo Massaro (Bank of Italy); Ilaria Vannini (Bank of Italy); Oliver Giudice (Bank of Italy)
Abstract:	We implement machine learning techniques to obtain an automatic classification by sector of economic activity of the Italian companies recorded in the Bank of Italy Entities Register. To this end, first we extract a sample of correctly classified corporations from the universe of Italian companies. Second, we select a set of features that are related to the sector of economic activity code and use these to implement supervised approaches to infer output predictions. We choose a multi-step approach based on the hierarchical structure of the sector classification. Because of the imbalance in the target classes, at each step, we first apply two resampling procedures – random oversampling and the Synthetic Minority Over-sampling Technique – to get a more balanced training set. Then, we fit Gradient Boosting and Support Vector Machine models. Overall, the performance of our multi-step classifier yields very reliable predictions of the sector code. This approach can be employed to make the whole classification process more efficient by reducing the area of manual intervention.
Keywords:	machine learning, entities register, classification by institutional sector
JEL:	C18 C81 G21
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:bdi:opques:qef_548_20&r=all
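
Of the two resampling procedures mentioned in this abstract, random oversampling is the simpler, and a minimal sketch may help fix the idea. It is not the authors' implementation, and SMOTE is not shown. The function name `random_oversample`, the one-feature rows and the sector labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)  # seeded for reproducibility
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the class's own rows.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical training set: four firms in one sector, one firm in another.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["sector_A", "sector_A", "sector_A", "sector_A", "sector_B"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
counts = Counter(balanced_labels)
print(counts)
```

Resampling of this kind is normally applied to the training set only, so that held-out evaluation data keep the original class proportions.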

Causal Simulation Experiments: Lessons from Bias Amplification

By:	Tyrel Stokes; Russell Steele; Ian Shrier
Abstract:	Recent theoretical work in causal inference has explored an important class of variables which, when conditioned on, may further amplify existing unmeasured confounding bias (bias amplification). Despite this theoretical work, existing simulations of bias amplification in clinical settings have suggested bias amplification may not be as important in many practical cases as suggested in the theoretical literature. We resolve this tension by using tools from the semi-parametric regression literature leading to a general characterization in terms of the geometry of OLS estimators which allows us to extend current results to a larger class of DAGs, functional forms, and distributional assumptions. We further use these results to understand the limitations of current simulation approaches and to propose a new framework for performing causal simulation experiments to compare estimators. We then evaluate the challenges and benefits of extending this simulation approach to the context of a real clinical data set with a binary treatment, laying the groundwork for a principled approach to sensitivity analysis for bias amplification in the presence of unmeasured confounding.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2003.08449&r=all

Irpef: (Un)Fairness and (in)efficiency. A structural analysis based on the BIMic microsimulation model

By:	Nicola Curci (Banca d'Italia); Pietro Rizza (Banca d'Italia); Marzia Romanelli (Banca d’Italia); Marco Savegnago (Banca d’Italia)
Abstract:	We discuss the structural features of the Italian personal income tax (Irpef) and its effects on income redistribution and labour supply incentives. The analysis is carried out using Banca d’Italia’s static microsimulation model BIMic. We find that Irpef plays a decisive role in making the Italian tax system progressive; almost all of the redistribution that comes from Irpef is due in equal measure to the structure of income brackets and to deductions, while the definition of the tax base has an ambiguous effect overall on Irpef’s redistributive effects. As regards the efficiency and the degree of distortion of economic agents’ choices, we note the presence of high values of effective marginal tax rates (EMTRs) also at relatively low levels of income; moreover, the variability of EMTRs is also large within the same income class. Finally, we discuss two counterfactual hypotheses that modify the ‘Irpef bonus’ with respect to the 2019 legislation. The difference between the two hypotheses concerns the extent of the increase and, above all, the speed at which the bonus phases out as income increases. Both alternatives result in a reduction of both inequality and EMTRs compared with the baseline scenario. Moreover, the traditional equity-efficiency trade-off is confirmed, since the reduction in EMTRs is more pronounced in the scenario where that of inequality is milder, and vice versa.
Keywords:	Irpef, redistribution, efficiency, microsimulation
JEL:	H22 H23 H24 H31 C15 C63
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:bdi:opques:qef_546_20&r=all

Algorithmic trading in a microstructural limit order book model

By:	Frédéric Abergel (MICS - Mathématiques et Informatique pour la Complexité et les Systèmes - CentraleSupélec); Côme Huré (LPSM (UMR_8001) - Laboratoire de Probabilités, Statistique et Modélisation - UPD7 - Université Paris Diderot - Paris 7 - SU - Sorbonne Université - CNRS - Centre National de la Recherche Scientifique); Huyên Pham (LPSM (UMR_8001) - Laboratoire de Probabilités, Statistique et Modélisation - UPD7 - Université Paris Diderot - Paris 7 - SU - Sorbonne Université - CNRS - Centre National de la Recherche Scientifique)
Abstract:	We propose a microstructural modeling framework for studying optimal market making policies in a FIFO (first in first out) limit order book (LOB). In this context, the limit orders, market orders, and cancel orders arrivals in the LOB are modeled as Cox point processes with intensities that only depend on the state of the LOB. These are high-dimensional models which are realistic from a micro-structure point of view and have been recently developed in the literature. In this context, we consider a market maker who stands ready to buy and sell stock on a regular and continuous basis at a publicly quoted price, and identifies the strategies that maximize her P&L penalized by her inventory. We apply the theory of Markov Decision Processes and dynamic programming method to characterize analytically the solutions to our optimal market making problem. The second part of the paper deals with the numerical aspect of the high-dimensional trading problem. We use a control randomization method combined with quantization method to compute the optimal strategies. Several computational tests are performed on simulated data to illustrate the efficiency of the computed optimal strategy. In particular, we simulated an order book with constant/symmetric/asymmetrical/state-dependent intensities, and compared the computed optimal strategy with naive strategies. Some codes are available on https://github.com/comeh.
Keywords:	high-dimensional stochastic control,quantization,local regression,Hawkes Process,pure-jump controlled process,Limit order book,Markov Decision Process,high-frequency trading
Date:	2020
URL:	http://d.repec.org/n?u=RePEc:hal:journl:hal-01514987&r=all

Zero-Intelligence vs. Human Agents: An Experimental Analysis of the Efficiency of Double Auctions and Over-the-Counter Markets of Varying Sizes

By:	Giuseppe Attanasi (Université Côte d'Azur; CNRS, GREDEG, France); Samuele Centorrino (Stony Brook University); Elena Manzoni (University of Verona)
Abstract:	We study two well-known electronic markets: an over-the-counter (OTC) market, in which each agent looks for the best counterpart through bilateral negotiations, and a double auction (DA) market, in which traders post their quotes publicly. We focus on the DA-OTC efficiency gap and show how it varies with different market sizes (10, 20, 40, and 80 traders). We compare experimental results from a sample of 6,400 undergraduate students in Economics with zero-intelligent (ZI) agent-based simulations. Simulations with ZI traders show that the traded quantity (with respect to the efficient one) increases with market size under both DA and OTC. Experimental results with human traders confirm the same tendency under DA, while the share of periods in which the traded quantity is higher (lower) than the efficient one decreases (increases) with market size under OTC, ultimately leading to a DA-OTC efficiency gap increasing with market size. We rationalize these results by putting forward a novel game-theoretical model of OTC market as a repeated bargaining procedure under incomplete information on buyers' valuations and sellers' costs, showing how efficiency decreases slightly with size due to two counteracting effects: acceptance rates in earlier periods decrease with size, and earlier offers increase, but not always enough to compensate the decrease in acceptance rates.
Keywords:	Market Design, Classroom Experiment, Agent-based Modelling, Game-theoretic Modelling
JEL:	C70 C91 C92 D41 D47
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:gre:wpaper:2020-10&r=all

RIOTs in Germany - constructing an interregional input-output table for Germany

By:	Krebs, Oliver
Abstract:	This paper shows how to adapt recent methodological advances to derive a shipment based interregional input output table for 402 German counties and 26 foreign partners for 17 sectors that is, for national aggregates, cell-by-cell compatible with the WIOD tables. It far outperforms the standard approach of applying unit values to interregional shipments in replicating observed regional statistics and can be used for improved impact analysis and CGE model calibration. It thereby mitigates the surprising but problematic lack of regional German trade data in the analysis of both, regional effects of aggregate shocks such as trade agreements as well as network effects of regional policies. Moreover, the paper takes an in-depth look at the derived German production structure and trade network at the county level finding a surprisingly vast heterogeneity with respect to specialization, agglomeration and trade partners.
Keywords:	Germany,regional trade,input-output tables,unit values,proportionality
JEL:	R15 R12 F17
Date:	2020
URL:	http://d.repec.org/n?u=RePEc:zbw:tuewef:132&r=all

A nested computational social science approach for deep-narrative analysis in energy policy research

By:	Debnath, Ramit; Darby, Sarah; Bardhan, Ronita; Mohaddes, Kamiar; Sunikka-Blank, Minna
Abstract:	Text-based data sources like narratives and stories have become increasingly popular as a critical insight generator in energy research and social science. However, their implications in policy application usually remain superficial and fail to fully exploit the state-of-the-art resources which the digital era holds for text analysis. This paper illustrates the potential of deep-narrative analysis in energy policy research using text analysis tools from the cutting-edge domain of computational social sciences, notably topic modelling. We argue that a nested application of topic modelling and grounded theory in narrative analysis promises advances in areas where manual-coding driven narrative analysis has traditionally struggled with directionality biases, scaling, systematisation and repeatability. The nested application of the topic model and the grounded theory goes beyond the frequentist approach of narrative analysis and introduces insight generation capabilities based on the probability distribution of words and topics in a text corpus. In this manner, our proposed methodology deconstructs the corpus and enables the analyst to answer research questions based on the foundational element of the text data structure. We verify the theoretical and epistemological fit of the proposed nested methodology through a meta-analysis of a state-of-the-art bibliographic database on energy policy and computational social science. We find that the nested application contributes to the literature gap on the need for multidisciplinary polyvalence methodologies that can systematically include qualitative evidence into policymaking.
Date:	2020–03–27
URL:	http://d.repec.org/n?u=RePEc:osf:socarx:hvcb5&r=all

A Super-Learning Machine for Predicting Economic Outcomes

By:	Cerulli, Giovanni
Abstract:	We present a Super-Learning Machine (SLM) to predict economic outcomes which improves prediction (i) by cross-validated optimal tuning, (ii) by comparing/combining results from different learners. Our application to a labor economics dataset shows that different learners may behave differently. However, combining learners into one singleton super-learner proves to preserve good predictive accuracy lowering the variance more than stand-alone approaches.
Keywords:	Machine learning; Ensemble methods; Optimal prediction
JEL:	C53 C61 C63
Date:	2020–03–10
URL:	http://d.repec.org/n?u=RePEc:pra:mprapa:99111&r=all

Research Notes: Data Structures for Social Media Machine Learning — The Tweet Term Matrix (TTM) and Tweet Bio-Term Matrix (TBTM)

By:	Flor, Nick V. (University of New Mexico)
Abstract:	The document term matrix (“DTM”) is a representation of a collection of documents, and is a key input to many machine learning algorithms. It can be applied to a collection of tweets as well. I give the set-predicate formalism for the tweet term matrix (“TTM”), and the tweet bio-term matrix (“TBTM”).
Date:	2020–03–09
URL:	http://d.repec.org/n?u=RePEc:osf:socarx:tp5mu&r=all
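
A document term matrix applied to tweets can be sketched as a table with one row per tweet and one column per term, each cell holding a count. The sketch below shows only that general idea and does not reproduce the paper's set-predicate formalism. The function name `tweet_term_matrix`, the tokenization rule and the sample tweets are assumptions.

```python
import re


def tweet_term_matrix(tweets):
    """Build a count matrix: one row per tweet, one column per distinct term."""
    # Assumed tokenizer: lowercase runs of letters, digits, '#', '@' and apostrophes.
    tokenized = [re.findall(r"[a-z0-9#@']+", tweet.lower()) for tweet in tweets]

    # The sorted vocabulary fixes the column order.
    vocab = sorted({token for tokens in tokenized for token in tokens})
    column = {term: j for j, term in enumerate(vocab)}

    matrix = [[0] * len(vocab) for _ in tweets]
    for i, tokens in enumerate(tokenized):
        for token in tokens:
            matrix[i][column[token]] += 1
    return vocab, matrix


# Hypothetical tweets.
tweets = ["Markets rally today", "markets fall, markets recover"]
vocab, matrix = tweet_term_matrix(tweets)
print(vocab)
for row in matrix:
    print(row)
```

A tweet bio-term matrix could presumably be built the same way from the author biography text attached to each tweet, though that reading is an assumption based on the name alone.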

Economic Effects of the USA - China Trade War: CGE Analysis with the GTAP 9.0a Data Base

By:	Enkhbayar Shagdar (Economic Research Institute for Northeast Asia (ERINA)); Tomoyoshi Nakajima (Economic Research Institute for Northeast Asia (ERINA))
Abstract:	An analysis of the economic effects of the ongoing USA-China trade war using the standard CGE Model and GTAP Data Base 9.0a revealed that both parties will be worse-off from this trade friction, having welfare losses and real GDP contractions regardless of international capital mobility status—i.e. whether the capital is internationally mobile or not. Moreover, the results indicated that the negative economic and trade impacts on China would be larger compared to those of the USA. Although, other countries and regions would be better-off having positive changes in their welfare and real GDP, their magnitudes were much lower than losses of the USA and China. Therefore, as a whole, the global economy will be worse-off as a result of this trade war between the world’s two largest economies, the USA and China.
Keywords:	Trade policy, CGE models
JEL:	F13 C68
Date:	2018–12
URL:	http://d.repec.org/n?u=RePEc:eri:dpaper:1806e&r=all

Quality checks on granular banking data: an experimental approach based on machine learning?

By:	Fabio Zambuto (Bank of Italy); Maria Rosaria Buzzi (Bank of Italy); Giuseppe Costanzo (Bank of Italy); Marco Di Lucido (Bank of Italy); Barbara La Ganga (Bank of Italy); Pasquale Maddaloni (Bank of Italy); Fabio Papale (Bank of Italy); Emiliano Svezia (Bank of Italy)
Abstract:	We propose a new methodology, based on machine learning algorithms, for the automatic detection of outliers in the data that banks report to the Bank of Italy. Our analysis focuses on granular data gathered within the statistical data collection on payment services, in which the lack of strong ex ante deterministic relationships among the collected variables makes standard diagnostic approaches less powerful. Quantile regression forests are used to derive a region of acceptance for the targeted information. For a given level of probability, plausibility thresholds are obtained on the basis of individual bank characteristics and are automatically updated as new data are reported. The approach was applied to validate semi-annual data on debit card issuance received from reporting agents between December 2016 and June 2018. The algorithm was trained with data reported in previous periods and tested by cross-checking the identified outliers with the reporting agents. The method made it possible to detect, with a high level of precision in terms of false positives, new outliers that had not been detected using the standard procedures.
Keywords:	banking data, data quality management, outlier detection, machine learning, quantile regression, random forests
JEL:	C18 C81 G21
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:bdi:opques:qef_547_20&r=all

Poverty-reducing or Poverty-inducing? A CGE-based Analysis of Foreign Capital Inflows in Pakistan

By:	Siddiqui, Rizwana; Kemal, A.R.
Abstract:	Foreign capital inflows (FKI) help an economy by financing the imbalance between income and expenditure. However, their impact on poverty in the recipient economy is a controversial issue. In this study, a static computable general equilibrium (CGE) model for Pakistan has been used to assess the impact of foreign capital on poverty. Several interesting results emerged from the study. FKI increase demand for goods for investment purposes that lead to the expansion of import-competing sector machinery to fulfil domestic demand. However, the contraction of the majority of trading sectors combined with expansion of non-trading sectors of the economy have generated ‘Dutch disease effect’. The results show that FKIs have a positive impact on poverty in Pakistan. Trade liberalization of import of machinery reduces the negative effect of the decline in FKI. Rise in poverty in Pakistan may be attributed to the decline in foreign capital.
Keywords:	FKI, Poverty, CGE model
JEL:	F13 F2 I32
Date:	2019–01–01
URL:	http://d.repec.org/n?u=RePEc:pra:mprapa:99013&r=all

A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency

By:	Xinyi Guo; Jinfeng Li
Abstract:	A novel social networks sentiment analysis model is proposed based on Twitter sentiment score (TSS) for real-time prediction of the future stock market price FTSE 100, as compared with conventional econometric models of investor sentiment based on closed-end fund discount (CEFD). The proposed TSS model features a new baseline correlation approach, which not only exhibits a decent prediction accuracy, but also reduces the computation burden and enables a fast decision making without the knowledge of historical data. Polynomial regression, classification modelling and lexicon-based sentiment analysis are performed using R. The obtained TSS predicts the future stock market trend in advance by 15 time samples (30 working hours) with an accuracy of 67.22% using the proposed baseline criterion without referring to historical TSS or market data. Specifically, TSS's prediction performance of an upward market is found far better than that of a downward market. Under the logistic regression and linear discriminant analysis, the accuracy of TSS in predicting the upward trend of the future market achieves 97.87%.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2003.08137&r=all

Simple Rules for a Complex World with Artificial Intelligence

By:	Jesus Fernandez-Villaverde (University of Pennsylvania)
Abstract:	Can artificial intelligence, in particular, machine learning algorithms, replace the idea of simple rules, such as first possession and voluntary exchange in free markets, as a foundation for public policy? This paper argues that the preponderance of the evidence sides with the interpretation that while artificial intelligence will help public policy along important aspects, simple rules will remain the fundamental guideline for the design of institutions and legal environments where markets operate. “Digital socialism” might be a hipster thing to talk about in Williamsburg or Shoreditch, but it is as much of a chimera as “analog socialism.”
Keywords:	Artificial intelligence, machine learning, economics, law, rule of law
JEL:	D85 H10 H30
Date:	2020–03–20
URL:	http://d.repec.org/n?u=RePEc:pen:papers:20-010&r=all

Machine Learning or Econometrics for Credit Scoring: Let's Get the Best of Both Worlds

By:	Elena Dumitrescu; Sullivan Hué; Christophe Hurlin (University of Orleans - LEO); Sessi Tokpavi (LEO - Laboratoire d'économie d'Orleans - UO - Université d'Orléans - CNRS - Centre National de la Recherche Scientifique)
Abstract:	Decision trees and related ensemble methods like random forest are state-of-the-art tools in the field of machine learning for credit scoring. Although they are shown to outperform logistic regression, they lack interpretability, and this drastically reduces their use in the credit risk management industry, where decision-makers and regulators need transparent score functions. This paper proposes to get the best of both worlds, introducing a new, simple and interpretable credit scoring method which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with pairs of predictive variables are used as predictors in a penalized (regularized) logistic regression. By modeling such univariate and bivariate threshold effects, we achieve a significant improvement in the performance of the logistic regression while preserving its simple interpretation. Applications using simulated data and four real credit default datasets show that our new method outperforms traditional logistic regressions. Moreover, it compares competitively to random forest, while providing an interpretable scoring function.
Keywords:	Credit scoring, Machine Learning, Risk management, Interpretability, Econometrics
JEL:	G10 C25 C53
Date:	2020–03–13
URL:	http://d.repec.org/n?u=RePEc:hal:wpaper:hal-02507499&r=all
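
The idea in this abstract, feeding tree-derived threshold rules into a penalized logistic regression, can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a single depth-1 tree on one variable instead of short trees on pairs of variables, a ridge penalty fitted by plain gradient descent, and synthetic data. All function names and parameter values are assumptions made for illustration.

```python
import math
import random


def best_threshold(x, y):
    """Depth-1 tree on one variable: the cut that minimises weighted Gini."""
    pairs = sorted(zip(x, y))
    n, total_pos = len(pairs), sum(y)
    best_gini, best_cut, left_pos = float("inf"), None, 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical values
        p_l, p_r = left_pos / i, (total_pos - left_pos) / (n - i)
        gini = (i * 2 * p_l * (1 - p_l) + (n - i) * 2 * p_r * (1 - p_r)) / n
        if gini < best_gini:
            best_gini, best_cut = gini, (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut


def predict(w, row):
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(rows, y, l2=1e-3, lr=2.0, epochs=1500):
    """Ridge-penalised logistic regression fitted by batch gradient descent."""
    w = [0.0] * (len(rows[0]) + 1)  # w[0] is the unpenalised intercept
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, t in zip(rows, y):
            err = predict(w, row) - t
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        for j in range(len(w)):
            w[j] -= lr * (grad[j] / n + (l2 * w[j] if j else 0.0))
    return w


def log_loss(w, rows, y):
    total = 0.0
    for row, t in zip(rows, y):
        p = min(max(predict(w, row), 1e-12), 1 - 1e-12)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y)


def demo(seed=0, n=300):
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    # Synthetic defaults with a jump in risk at x = 0.6 (an assumption).
    y = [1 if rng.random() < (0.7 if xi > 0.6 else 0.05) else 0 for xi in x]
    cut = best_threshold(x, y)
    linear = [[xi] for xi in x]
    with_rule = [[xi, 1.0 if xi > cut else 0.0] for xi in x]  # add the rule
    return (cut,
            log_loss(fit_logit(linear, y), linear, y),
            log_loss(fit_logit(with_rule, y), with_rule, y))
```

Calling `demo()` returns the estimated cut-off and the in-sample log-loss of the plain and rule-augmented regressions. The augmented model stays a logistic regression, so each rule keeps a single readable coefficient.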

NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay

By:	Jianbin Lin; Zhiqiang Zhang; Jun Zhou; Xiaolong Li; Jingli Fang; Yanming Fang; Quan Yu; Yuan Qi
Abstract:	Ant Credit Pay is a consumer credit service of Ant Financial Service Group. As with credit cards, loan default is one of the major risks of this credit product. Hence, an effective algorithm for default prediction is key to reducing losses and increasing profits for the company. However, the challenges we face in our scenario differ from those of a conventional credit card service. The first is scalability. The huge volume of users and their behaviors in Ant Financial requires the ability to process industrial-scale data and to train models efficiently. The second challenge is the cold-start problem. Unlike the manual review of credit card applications in conventional banks, the credit limit of Ant Credit Pay is offered to users automatically, based on knowledge learned from big data. However, default prediction for new users suffers from a lack of credit behavior data, so the proposed method needs to leverage other data sources to alleviate the cold-start problem. Considering these challenges and the special scenario in Ant Financial, we incorporate network information into default prediction to alleviate the cold-start problem. In this paper, we propose an industrial-scale distributed network representation framework, termed NetDP, for default prediction in Ant Credit Pay. The proposal explores network information generated by various interactions between users, and blends unsupervised and supervised network representation in a unified framework for the default prediction problem. Moreover, we present a parameter-server-based distributed implementation of our proposal to handle the scalability challenge. Experimental results demonstrate the effectiveness of our proposal, especially on the cold-start problem, as well as its efficiency on industrial-scale datasets.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2004.00201&r=all

Blockwise Euclidean likelihood for spatio-temporal covariance models

By:	Víctor Morales-Oñate; Federico Crudu; Moreno Bevilacqua
Abstract:	In this paper we propose a spatio-temporal blockwise Euclidean likelihood method for the estimation of covariance models when dealing with large spatio-temporal Gaussian data. The method uses moment conditions coming from the score of the pairwise composite likelihood. The blockwise approach guarantees considerable computational improvements over the standard pairwise composite likelihood method. In order to further speed up computation we consider a general purpose graphics processing unit implementation using OpenCL. We derive the asymptotic properties of the proposed estimator and we illustrate the finite sample properties of our methodology by means of a simulation study highlighting the computational gains of the OpenCL graphics processing unit implementation. Finally, we apply our estimation method to a wind component data set.
Keywords:	Composite likelihood; Euclidean likelihood; Gaussian random fields; Parallel computing; OpenCL
JEL:	C14 C21 C23
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:usi:wpaper:822&r=all

By:	Nicola Cufaro Petroni; Piergiacomo Sabino
Abstract:	We investigate the distributional properties of two generalized Ornstein-Uhlenbeck (OU) processes whose stationary distributions are the gamma law and the bilateral gamma law, respectively. These distributions turn out to be related to the self-decomposable gamma and bilateral gamma laws, and their densities and characteristic functions are given here in closed form. Algorithms for the exact generation of such processes are derived accordingly; they are significantly faster than those available in the literature and are therefore suitable for real-time simulations.
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:arx:papers:2003.08810&r=all
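
For orientation, exact generation of a gamma-OU process can be sketched with the classical compound Poisson construction. This is the textbook approach, not the faster algorithm the abstract refers to. The parametrisation is an assumption: mean-reversion rate `k`, stationary law Gamma with shape `alpha` and rate `beta`, and a driving process with jump intensity `alpha * k` and Exp(`beta`) jump sizes.

```python
import math
import random


def gamma_ou_step(x, dt, k, alpha, beta, rng):
    """Advance the process by dt without discretisation error.

    X(t + dt) = exp(-k dt) X(t) + sum_i exp(-k (dt - s_i)) J_i,
    where s_i are Poisson jump times in (0, dt] and J_i ~ Exp(beta).
    """
    value = x * math.exp(-k * dt)
    s = rng.expovariate(alpha * k)           # time of the first jump
    while s <= dt:
        jump = rng.expovariate(beta)         # exponential jump size
        value += jump * math.exp(-k * (dt - s))  # decay from jump time to dt
        s += rng.expovariate(alpha * k)      # next inter-arrival time
    return value


def simulate_gamma_ou(n_steps, dt, k, alpha, beta, seed=0):
    """Return a path of n_steps + 1 values on a regular time grid."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / beta.
    x = rng.gammavariate(alpha, 1.0 / beta)  # start in the stationary law
    path = [x]
    for _ in range(n_steps):
        x = gamma_ou_step(x, dt, k, alpha, beta, rng)
        path.append(x)
    return path
```

For example, `simulate_gamma_ou(20000, 0.1, 1.0, 2.0, 1.0)` produces a strictly positive path whose long-run average should be close to the stationary mean `alpha / beta`. The cost per step grows with the expected number of jumps, `alpha * k * dt`.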

Estimating intergenerational income mobility on sub-optimal data: a machine learning approach

By:	Francesco Bloise (AFFILIATION); Paolo Brunori (University of Florence); Patrizio Piraino (University of Notre Dame)
Abstract:	Much of the global evidence on intergenerational income mobility is based on sub-optimal data. In particular, two-stage techniques are widely used to impute parental incomes for analyses of developing countries and for estimating long-run trends across multiple generations and historical periods. We propose a machine learning method that may improve the reliability and comparability of such estimates. Our approach minimizes the out-of-sample prediction error in the parental income imputation, which provides an objective criterion for choosing across different specifications of the first-stage equation. We apply the method to data from the United States and South Africa to show that under common conditions it can limit the bias generally associated with mobility estimates based on imputed parental income.
Keywords:	Intergenerational elasticity; income; mobility; elastic net; regularization; PSID; South Africa
JEL:	J62 D63 C18
Date:	2020–03
URL:	http://d.repec.org/n?u=RePEc:inq:inqwps:ecineq2020-526&r=all
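
The two-stage logic described in this abstract can be sketched on synthetic data: fit an elastic net for parental income in an auxiliary sample, pick the penalty with the lowest hold-out error, impute parental income in the main sample, and regress child income on the imputation. This is an illustrative sketch, not the authors' code. The data-generating process, the penalty grid, the single hold-out split and all names are assumptions.

```python
import random


def soft_threshold(v, t):
    return v - t if v > t else v + t if v < -t else 0.0


def elastic_net(X, y, lam, alpha=0.5, sweeps=100):
    """Coordinate descent for squared loss/(2n) + lam*(alpha*L1 + (1-alpha)/2*L2^2)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [t - y_mean for t in y]
    scale = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    b = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + scale[j] * b[j]
            new = soft_threshold(rho, lam * alpha) / (scale[j] + lam * (1 - alpha))
            step = new - b[j]
            if step:
                for i in range(n):
                    resid[i] -= Xc[i][j] * step  # keep residuals in sync
                b[j] = new
    return y_mean - sum(bj * m for bj, m in zip(b, x_mean)), b


def predict(model, row):
    intercept, b = model
    return intercept + sum(bj * v for bj, v in zip(b, row))


def pick_lambda(X, y, grid, n_train):
    """Choose the penalty with the lowest hold-out mean squared error."""
    def holdout_mse(lam):
        model = elastic_net(X[:n_train], y[:n_train], lam)
        errs = [(t - predict(model, r)) ** 2 for r, t in zip(X[n_train:], y[n_train:])]
        return sum(errs) / len(errs)
    return min(grid, key=holdout_mse)


def demo(seed=0, n=300, true_ige=0.4):
    rng = random.Random(seed)

    def traits():  # schooling, an occupation dummy, four irrelevant variables
        return [rng.gauss(0, 1), float(rng.random() < 0.5)] + [rng.gauss(0, 1) for _ in range(4)]

    def parent_income(z):  # hypothetical log-income equation
        return 1.0 * z[0] + 0.5 * z[1] + rng.gauss(0, 0.5)

    # Auxiliary sample: older generation with observed traits and log income.
    Z_aux = [traits() for _ in range(n)]
    y_aux = [parent_income(z) for z in Z_aux]
    # Main sample: children report parental traits; parental income is unobserved.
    Z_main = [traits() for _ in range(n)]
    y_child = [true_ige * parent_income(z) + rng.gauss(0, 0.3) for z in Z_main]

    lam = pick_lambda(Z_aux, y_aux, [0.001, 0.01, 0.1, 1.0], n_train=2 * n // 3)
    first_stage = elastic_net(Z_aux, y_aux, lam)
    imputed = [predict(first_stage, z) for z in Z_main]

    # Second stage: OLS slope of child log income on imputed parental log income.
    m_x, m_y = sum(imputed) / n, sum(y_child) / n
    slope = (sum((x - m_x) * (y - m_y) for x, y in zip(imputed, y_child))
             / sum((x - m_x) ** 2 for x in imputed))
    return lam, slope
```

`demo()` returns the selected penalty and the estimated intergenerational elasticity, which can be compared with the value of `true_ige` used to generate the data. A k-fold split would be a natural replacement for the single hold-out split used here.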

Basic Income: Simulations in Preparation for an Experiment (Revenu de base – Simulations en vue d’une expérimentation)

By:	Mahdi Ben Jelloul (IPP - Institut des politiques publiques); Antoine Bozio (IPP - Institut des politiques publiques, PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Panthéon-Sorbonne - ENS Paris - École normale supérieure - Paris - INRA - Institut National de la Recherche Agronomique - EHESS - École des hautes études en sciences sociales - ENPC - École des Ponts ParisTech - CNRS - Centre National de la Recherche Scientifique, PSE - Paris School of Economics); Sophie Cottet (IPP - Institut des politiques publiques, PSE - Paris School of Economics, PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Panthéon-Sorbonne - ENS Paris - École normale supérieure - Paris - INRA - Institut National de la Recherche Agronomique - EHESS - École des hautes études en sciences sociales - ENPC - École des Ponts ParisTech - CNRS - Centre National de la Recherche Scientifique); Brice Fabre (PSE - Paris School of Economics, IPP - Institut des politiques publiques, PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Panthéon-Sorbonne - ENS Paris - École normale supérieure - Paris - INRA - Institut National de la Recherche Agronomique - EHESS - École des hautes études en sciences sociales - ENPC - École des Ponts ParisTech - CNRS - Centre National de la Recherche Scientifique); Claire Leroy (IPP - Institut des politiques publiques)
Abstract:	The current system of social benefits raises debate on many fronts: non-take-up of minimum income benefits, the layering of multiple schemes, restrictive eligibility conditions for young people, and so on. In response, 13 departmental councils (Ardèche, Ariège, Aude, Dordogne, Gers, Gironde, Haute-Garonne, Ille-et-Vilaine, Landes, Lot-et-Garonne, Meurthe-et-Moselle, Nièvre and Seine-Saint-Denis) launched a project to trial a basic income that simplifies the existing system and is open, subject to a means test, to any individual above a certain age. A prerequisite for implementing this project is to define the reform scenario or scenarios to be trialled. This report contributes to that goal by evaluating ex ante the budgetary and redistributive effects of several reform scenarios defined by the departmental councils involved. Using the TAXIPP 1.0 microsimulation model, which draws on both administrative tax data and survey data, the report proposes two schemes for simplifying the existing system: on the one hand, replacing the Revenu de Solidarité Active (RSA) and the prime d'activité with a simplified scheme; on the other, integrating housing benefits into the new unified scheme. In particular, it evaluates the effects of opening these schemes to individuals aged 18 to 24, who are currently the group most affected by poverty.
Date:	2018–06
URL:	http://d.repec.org/n?u=RePEc:hal:psewpa:halshs-02514725&r=all

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.

NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.