nep-big New Economics Papers
on Big Data
Issue of 2019‒10‒07
twenty-six papers chosen by
Tom Coupé
University of Canterbury

  1. Too Much Data: Prices and Inefficiencies in Data Markets By Daron Acemoglu; Ali Makhdoumi; Azarakhsh Malekian; Asuman Ozdaglar
  2. Text-Based Rental Rate Predictions of Airbnb Listings By Norbert Pfeifer
  3. Can a machine understand real estate pricing? – Evaluating machine learning approaches with big data By Marcelo Cajias
  4. A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy By Kei Nakagawa; Masaya Abe; Junpei Komiyama
  5. Using Machine Learning to Predict Realized Variance By Peter Carr; Liuren Wu; Zhibai Zhang
  6. Artificial Intelligence BlockCloud (AIBC) Technical Whitepaper By Qi Deng
  7. Machine Learning Optimization Algorithms & Portfolio Allocation By Sarah Perrin; Thierry Roncalli
  8. Exploring Graph Neural Networks for Stock Market Predictions with Rolling Window Analysis By Daiki Matsunaga; Toyotaro Suzumura; Toshihiro Takahashi
  9. Towards Federated Graph Learning for Collaborative Financial Crimes Detection By Toyotaro Suzumura; Yi Zhou; Nathalie Baracaldo; Guangnan Ye; Keith Houck; Ryo Kawahara; Ali Anwar; Lucia Larise Stavarache; Daniel Klyashtorny; Heiko Ludwig; Kumar Bhaskaran
  10. New Technology and Data in Real Estate By Marcelo Cajias
  11. An introduction to flexible methods for policy evaluation By Martin Huber
  12. The option pricing model based on time values: an application of the universal approximation theory on unbounded domains By Yang Qu; Ming-Xi Wang
  13. Heterogeneous Households and Market Segmentation in a Hedonic Framework By Martijn Droes; Martin Hoesli; Steven C. Bourassa
  14. I know where you will invest in the next year – Forecasting real estate investments with machine learning methods By Marcelo Cajias; Jonas Willwersch; Felix Lorenz
  15. PAGAN: Portfolio Analysis with Generative Adversarial Networks By Giovanni Mariani; Yada Zhu; Jianbo Li; Florian Scheidegger; Roxana Istrate; Costas Bekas; A. Cristiano I. Malossi
  16. Big data analytics business value and firm performance: Linking with environmental context By Claudio Vitari; Elisabetta Raguseo
  17. Big Data, GAFA et Assurance By Arthur Charpentier
  18. A Framework for the optimal Development and Application of Automated Valuation Models (AVMs) By Andreas Kindt
  19. How Polarized are Citizens? Measuring Ideology from the Ground-Up By Draca, Mirko; Schwarz, Carlo
  20. World Corporate Top R&D investors: Shaping the Future of Technologies and of AI By Helene Dernis; Petros Gkotsis; Nicola Grassano; Shohei Nakazato; Mariagrazia Squicciarini; Brigitte van Beuzekom; Antonio Vezzani
  21. Intérêt des adhérents d'une mutuelle pour des services utilisant leurs données personnelles dans le cadre de la médecine personnalisée By Bénédicte H. Apouey
  22. Debiased/Double Machine Learning for Instrumental Variable Quantile Regressions By Jau-er Chen; Jia-Jyun Tien
  23. WATTNet: Learning to Trade FX via Hierarchical Spatio-Temporal Representation of Highly Multivariate Time Series By Michael Poli; Jinkyoo Park; Ilija Ilievski
  24. Deep Neural Network Framework Based on Backward Stochastic Differential Equations for Pricing and Hedging American Options in High Dimensions By Yangang Chen; Justin W. L. Wan
  25. Artificial intelligence: Why a digital base is critical By Jacques Bughin; Nicolas van Zeebroeck
  26. The Economics and Implications of Data: An Integrated Perspective By Yan Carriere-Swallow; Vikram Haksar

  1. By: Daron Acemoglu; Ali Makhdoumi; Azarakhsh Malekian; Asuman Ozdaglar
    Abstract: When a user shares her data with an online platform, she typically reveals relevant information about other users. We model a data market in the presence of this type of externality in a setup where one or multiple platforms estimate a user’s type with data they acquire from all users and (some) users value their privacy. We demonstrate that the data externalities depress the price of data because once a user’s information is leaked by others, she has less reason to protect her data and privacy. These depressed prices lead to excessive data sharing. We characterize conditions under which shutting down data markets improves (utilitarian) welfare. Competition between platforms does not redress the problem of excessively low prices for data and too much data sharing, and may further reduce welfare. We propose a scheme based on mediated data-sharing that improves efficiency.
    JEL: D62 D83 L86
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:nbr:nberwo:26296&r=all
  2. By: Norbert Pfeifer
    Abstract: The valuation of houses remains a critical task for scientific research as well as for practitioners. This paper investigates the challenge by integrating the textual information contained in real estate descriptions. More specifically, we show different approaches for integrating verbal descriptions from real estate advertisements into an automated valuation model. Using Airbnb listing data, we benchmark the proposed methods against a traditional hedonic approach and show that a neural network-based prediction model featuring only information from verbal descriptions is able to outperform a traditional hedonic model estimated with physical attributes, such as the number of bathrooms and bedrooms. We also draw attention to techniques that allow for interrelations between physical, locational, and qualitative text-based attributes. The results strongly support the integration of textual information, specifically in a two-stage architecture in which the first model (a recurrent long short-term memory network) outputs a probability distribution over price classes, which is then used along with quantitative measurements in a stacked feed-forward neural network.
    Keywords: AVM; housing; Neural Network; NLP
    JEL: R3
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2019_329&r=all
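    Sketch: a minimal Python illustration of the two-stage idea above, with the paper's recurrent LSTM swapped for a simple TF-IDF text classifier; all data, names and settings here are invented for illustration, not taken from the paper.
      import numpy as np
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.neural_network import MLPRegressor

      # Illustrative data: listing descriptions, physical attributes, nightly rates.
      texts = ["cozy studio near the river", "spacious loft with two bedrooms",
               "small room in shared flat", "luxury penthouse with terrace"]
      quant = np.array([[1, 1], [2, 1], [1, 0], [3, 2]])   # e.g. bedrooms, bathrooms
      rates = np.array([60.0, 120.0, 35.0, 300.0])

      # Stage 1: classify listings into coarse price bands and keep the
      # predicted class probabilities (the paper uses an LSTM here instead).
      bands = np.digitize(rates, bins=[50, 150])           # three price bands
      vec = TfidfVectorizer()
      X_text = vec.fit_transform(texts)
      clf = LogisticRegression(max_iter=1000).fit(X_text, bands)
      probs = clf.predict_proba(X_text)

      # Stage 2: stack class probabilities with quantitative attributes and
      # feed them to a feed-forward network that predicts the rate itself.
      X_stacked = np.hstack([probs, quant])
      reg = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                         random_state=0).fit(X_stacked, rates)
      print(reg.predict(X_stacked))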
  3. By: Marcelo Cajias
    Abstract: In the era of the internet and digitalization, real estate prices of dwellings are predominantly collected live by multiple listing services and merged with supporting data such as spatio-temporal geo-information. Despite the computational requirements of analyzing such large datasets, the methods for analyzing big data have evolved substantially and go far beyond traditional regression. Even so, the use of machine learning technologies for analyzing prices in the real estate industry is not yet commonplace. This paper applies machine learning algorithms to a data set of more than 3 million observations in the German residential market to explore the predictive accuracy of methods such as random forest regression, XGBoost and stacked regression, among others. The results show a significant reduction in the forecasting variance and confirm that artificial intelligence captures real estate prices in much greater depth.
    Keywords: Big Data in real estate; German housing; Machine learning Algorithms; Random forest; XGBoost
    JEL: R3
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2019_232&r=all
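    Sketch: an illustrative Python comparison in the spirit of the paper, using scikit-learn's gradient boosting as a stand-in for XGBoost and synthetic data in place of the German listing data.
      import numpy as np
      from sklearn.datasets import make_regression
      from sklearn.ensemble import (RandomForestRegressor,
                                    GradientBoostingRegressor, StackingRegressor)
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import mean_squared_error

      # Synthetic stand-in for hedonic data (size, age, location scores, ...).
      X, y = make_regression(n_samples=2000, n_features=8, noise=10.0,
                             random_state=0)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

      models = {
          "ols": LinearRegression(),
          "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
          "boosting": GradientBoostingRegressor(random_state=0),  # XGBoost stand-in
      }
      models["stacked"] = StackingRegressor(
          estimators=[(k, m) for k, m in models.items()],
          final_estimator=LinearRegression())

      for name, model in models.items():
          model.fit(X_tr, y_tr)
          print(name, mean_squared_error(y_te, model.predict(X_te)))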
  4. By: Kei Nakagawa; Masaya Abe; Junpei Komiyama
    Abstract: Stock return predictability is an important research theme as it reflects our economic and social organization, and significant efforts are made to explain the dynamism therein. Statistics with strong explanatory power, called "factors", have been proposed to summarize the essence of predictive stock returns. Although machine learning methods are increasingly popular in stock return prediction, inference of stock returns remains highly elusive, and most investors still rely, at least in part, on their intuition in decision making. The challenge here is to devise an investment strategy that is consistent over a reasonably long period, with minimal human decisions across the entire process. To this end, we propose a new stock return prediction framework that we call the Ranked Information Coefficient Neural Network (RIC-NN). RIC-NN is a deep learning approach that includes the following three novel ideas: (1) a nonlinear multi-factor approach, (2) a stopping criterion based on the ranked information coefficient (rank IC), and (3) deep transfer learning among multiple regions. Experimental comparison with the stocks in the Morgan Stanley Capital International (MSCI) indices shows that RIC-NN outperforms not only off-the-shelf machine learning methods but also the average return of major equity investment funds over the last fourteen years.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.01491&r=all
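    Sketch: the rank information coefficient is the Spearman correlation between the predicted and realized cross-sectional rankings; the toy Python loop below mimics an early-stopping rule based on it. This is a simplified reading of the paper's criterion, run on mock data rather than a real model.
      import numpy as np
      from scipy.stats import spearmanr

      def rank_ic(predicted, realized):
          """Rank information coefficient: Spearman correlation between the
          cross-sectional ranking of predictions and of realized returns."""
          return spearmanr(predicted, realized).correlation

      # Toy training loop: mock predictions get gradually closer to the
      # realized cross-section, and training stops once the validation
      # rank IC no longer improves for a few epochs.
      rng = np.random.default_rng(0)
      realized = rng.normal(size=500)            # validation-period returns
      best_ic, patience, stale = -np.inf, 5, 0
      for epoch in range(100):
          noise = 1.0 / (1 + epoch)
          preds = realized + noise * rng.normal(size=500)  # mock model output
          ic = rank_ic(preds, realized)
          if ic > best_ic:
              best_ic, stale = ic, 0
          else:
              stale += 1
          if stale >= patience:
              print(f"stop at epoch {epoch}, best rank IC {best_ic:.3f}")
              break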
  5. By: Peter Carr; Liuren Wu; Zhibai Zhang
    Abstract: In this paper we formulate a regression problem to predict realized volatility using option price data, and to enhance the predictability and liquidity of VIX-style volatility indices. We test algorithms including regularized regression and machine learning methods such as feedforward neural networks (FNN) on the S&P 500 Index and its option data. By conducting a time-series validation we find that both ridge regression and FNN can improve volatility indexing, with higher prediction performance and fewer options required. The best approach found is to predict the difference between the realized volatility and the VIX-styled index's prediction rather than to predict the realized volatility directly, representing a successful combination of human learning and machine learning. We also discuss the suitability of different regression algorithms for volatility indexing and applications of our findings.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.10035&r=all
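    Sketch: an illustrative Python version of the abstract's best-performing setup on simulated data, fitting a ridge regression to the gap between realized volatility and a VIX-style index rather than to realized volatility itself.
      import numpy as np
      from sklearn.linear_model import Ridge
      from sklearn.model_selection import TimeSeriesSplit
      from sklearn.metrics import mean_squared_error

      # Synthetic stand-in: a VIX-style index level and option-based features,
      # with realized volatility loosely tied to both.
      rng = np.random.default_rng(1)
      n = 1000
      vix = 15 + 5 * rng.random(n)
      features = rng.normal(size=(n, 10))          # e.g. option-implied signals
      realized_vol = vix + features @ rng.normal(size=10) + rng.normal(size=n)

      # Predict the gap between realized volatility and the index, then add
      # the index back to form the final forecast.
      target = realized_vol - vix
      model = Ridge(alpha=1.0)
      for train, test in TimeSeriesSplit(n_splits=5).split(features):
          model.fit(features[train], target[train])
          forecast = vix[test] + model.predict(features[test])
          print(mean_squared_error(realized_vol[test], forecast))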
  6. By: Qi Deng
    Abstract: The AIBC is a large-scale decentralized ecosystem, based on artificial intelligence and blockchain technology, that allows system-wide, low-cost sharing of computing and storage resources. The AIBC consists of four layers: a fundamental layer, a resource layer, an application layer, and an ecosystem layer. The AIBC implements a two-consensus scheme to enforce upper-layer economic policies and achieve fundamental-layer performance and robustness: the DPoEV incentive consensus on the application and resource layers, and the DABFT distributed consensus on the fundamental layer. The DABFT uses deep learning techniques to predict and select the most suitable BFT algorithm in order to achieve the best balance of performance, robustness, and security. The DPoEV uses a knowledge-map algorithm to accurately assess the economic value of digital assets.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.12063&r=all
  7. By: Sarah Perrin; Thierry Roncalli
    Abstract: Portfolio optimization emerged with the seminal paper of Markowitz (1952). The original mean-variance framework is appealing because it is very efficient from a computational point of view. However, it also has a well-established failing: it can lead to portfolios that are not optimal from a financial point of view. Nevertheless, very few models have succeeded in providing a real alternative to the Markowitz model. The main reason lies in the fact that most academic portfolio optimization models are intractable in real life, although they present solid theoretical properties. By intractable we mean that they can be implemented for an investment universe with a small number of assets, using a lot of computational resources and skills, but they are unable to manage a universe with dozens or hundreds of assets. However, the emergence and rapid development of robo-advisors mean that we need to rethink portfolio optimization and go beyond the traditional mean-variance optimization approach. Another industry has faced similar issues concerning large-scale optimization problems. Machine learning was long associated with linear and logistic regression models; again, the reason was the inability of optimization algorithms to solve high-dimensional industrial problems. Nevertheless, the end of the 1990s marked an important turning point with the development and rediscovery of several methods that have since produced impressive results. The goal of this paper is to show how portfolio allocation can benefit from the development of these large-scale optimization algorithms. Not all of these algorithms are useful in our case, but four of them are essential when solving complex portfolio optimization problems: coordinate descent, the alternating direction method of multipliers, the proximal gradient method and Dykstra's algorithm.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.10233&r=all
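    Sketch: projected gradient descent for the long-only minimum-variance problem, one of the simplest members of the algorithm family the paper surveys; the covariance matrix below is synthetic and the simplex projection follows the standard sort-based routine of Duchi et al.
      import numpy as np

      def project_simplex(v):
          """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
          u = np.sort(v)[::-1]
          css = np.cumsum(u)
          rho = np.nonzero(u - (css - 1) / np.arange(1, len(v) + 1) > 0)[0][-1]
          theta = (css[rho] - 1) / (rho + 1)
          return np.maximum(v - theta, 0)

      # Projected-gradient descent for
      #   min_w  0.5 * w' Sigma w   s.t.  w >= 0, sum(w) = 1.
      rng = np.random.default_rng(0)
      A = rng.normal(size=(50, 10))
      sigma = A.T @ A / 50                       # toy covariance matrix
      w = np.full(10, 0.1)                       # start from equal weights
      step = 1.0 / np.linalg.norm(sigma, 2)      # 1 / Lipschitz constant
      for _ in range(500):
          w = project_simplex(w - step * sigma @ w)
      print(np.round(w, 3), w.sum())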
  8. By: Daiki Matsunaga; Toyotaro Suzumura; Toshihiro Takahashi
    Abstract: Recently, there has been a surge of interest in the use of machine learning to aid in accurate predictions of financial markets. Despite the exciting advances at this intersection of finance and AI, many current approaches are limited to using technical analysis to capture the historical trends of each stock price, and are thus limited to certain experimental setups to obtain good prediction results. On the other hand, professional investors additionally use their rich knowledge of inter-market and inter-company relations to map the connectivity of companies and events, and use this map to make better market predictions. For instance, they would predict the movement of a certain company's stock price based not only on its former stock price trends but also on the performance of its suppliers or customers, the overall industry, macroeconomic factors and trade policies. This paper investigates the effectiveness of work at the intersection of market predictions and graph neural networks, which hold the potential to mimic the ways in which investors make decisions by incorporating company knowledge graphs directly into the predictive model. The main goal of this work is to test the validity of this approach across different markets and longer time horizons, backtesting with rolling window analysis. In this work, we concentrate on the prediction of individual stock prices in the Japanese Nikkei 225 market over a period of roughly 20 years. For the knowledge graph, we use the Nikkei Value Search data, a rich dataset showing mainly supplier relations among Japanese and foreign companies. Our preliminary results show a 29.5% increase and a 2.2-fold increase in the return ratio and Sharpe ratio, respectively, when compared to the market benchmark, as well as a 6.32% increase and a 1.3-fold increase, respectively, compared to the baseline LSTM model.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.10660&r=all
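    Sketch: the generic rolling-window backtest skeleton the paper relies on, with a plain linear regression standing in for the graph neural network and simulated data in place of the Nikkei 225.
      import numpy as np
      from sklearn.linear_model import LinearRegression

      # Rolling-window evaluation: retrain on a fixed-length window, predict
      # the step just after it, then slide the window forward one period.
      rng = np.random.default_rng(2)
      T, n_features = 300, 5
      X = rng.normal(size=(T, n_features))        # per-period predictor matrix
      y = X @ np.array([0.5, -0.2, 0.1, 0.0, 0.3]) + 0.1 * rng.normal(size=T)

      window, predictions = 100, []
      for start in range(T - window):
          train = slice(start, start + window)
          model = LinearRegression().fit(X[train], y[train])
          predictions.append(model.predict(X[start + window:start + window + 1])[0])
      print(np.corrcoef(predictions, y[window:])[0, 1])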
  9. By: Toyotaro Suzumura; Yi Zhou; Nathalie Baracaldo; Guangnan Ye; Keith Houck; Ryo Kawahara; Ali Anwar; Lucia Larise Stavarache; Daniel Klyashtorny; Heiko Ludwig; Kumar Bhaskaran
    Abstract: Financial crime is a large and growing problem that in some way touches almost every financial institution. Financial institutions are the front line in the war against financial crime, and accordingly must devote substantial human and technology resources to this effort. Current processes to detect financial misconduct are limited in their ability to effectively differentiate between malicious behavior and ordinary financial activity. These limitations tend to result in gross over-reporting of suspicious activity that necessitates time-intensive and costly manual review. Advances in the technology used in this domain, including machine learning based approaches, can improve upon the effectiveness of financial institutions' existing processes. However, a key challenge that most financial institutions continue to face is that they address financial crimes in isolation, without any insight from other firms. When financial institutions address financial crimes only through the lens of their own firm, perpetrators may devise sophisticated strategies that span institutions and geographies. Financial institutions continue to work relentlessly to advance their capabilities, forming partnerships across institutions to share insights, patterns and capabilities. These public-private partnerships are subject to stringent regulatory and data privacy requirements, making it difficult to rely on traditional technology solutions. In this paper, we propose a methodology to share key information across institutions by using a federated graph learning platform that enables us to build more accurate machine learning models by leveraging both federated learning and graph learning approaches. We demonstrate that our federated model outperforms the local model by 20% on the UK FCA TechSprint data set. This new platform opens the door to efficiently detecting global money laundering activity.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.12946&r=all
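    Sketch: a minimal federated-averaging illustration on simulated data, with a linear model in place of the paper's graph-learning models; each "institution" shares only fitted parameters, never raw records.
      import numpy as np

      rng = np.random.default_rng(3)
      true_w = np.array([1.0, -2.0, 0.5])

      def local_fit(n):
          """One institution's private least-squares fit (data never leaves)."""
          X = rng.normal(size=(n, 3))
          y = X @ true_w + 0.1 * rng.normal(size=n)
          return np.linalg.lstsq(X, y, rcond=None)[0]

      # The coordinator only sees parameter vectors and sample sizes, and
      # combines them with a size-weighted average.
      sizes = np.array([200, 500, 300])
      local_weights = [local_fit(n) for n in sizes]
      global_w = np.average(local_weights, axis=0, weights=sizes)
      print(global_w)   # close to true_w without pooling any raw data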
  10. By: Marcelo Cajias
    Abstract: Initial yields are used by institutional investors and investment managers to assess the pricing conditions of real estate markets. In contrast to commercial real estate, initial yields in the residential sector are hard to quantify, especially due to the lack of comparables. In the era of digitalisation and big data, residential assets are mostly brought to the market via digital multiple listing systems. The paper develops semiparametric hedonic models for extracting the implicit information needed to calculate residential net initial yields for both buy-to-hold and rental investment strategies, based on more than 3 million observations. The results are robust and confirm that the pricing conditions of residential markets are captured by the hedonic approach, enhancing transparency in real estate markets.
    Keywords: Big data; buy or rent; German residential; Net initial yields; semiparametric regression
    JEL: R3
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2019_155&r=all
  11. By: Martin Huber
    Abstract: This chapter covers different approaches to policy evaluation for assessing the causal effect of a treatment or intervention on an outcome of interest. As an introduction to causal inference, the discussion starts with the experimental evaluation of a randomized treatment. It then reviews evaluation methods based on selection on observables (assuming a quasi-random treatment given observed covariates), instrumental variables (inducing a quasi-random shift in the treatment), difference-in-differences and changes-in-changes (exploiting changes in outcomes over time), as well as regression discontinuities and kinks (using changes in the treatment assignment at some threshold of a running variable). The chapter discusses methods particularly suited to data with many observations, for flexible (i.e. semi- or nonparametric) modeling of treatment effects, and/or with many (i.e. high-dimensional) observed covariates, applying machine learning to select and control for covariates in a data-driven way. This is useful not only for tackling confounding, by controlling, for instance, for factors that jointly affect the treatment and the outcome, but also for learning effect heterogeneities across subgroups defined by observable covariates and optimally targeting those groups for which the treatment is most effective.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.00641&r=all
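    Sketch: a compact Python example of one method the chapter covers, selection on observables with machine-learned nuisance functions combined in the doubly robust (AIPW) score with cross-fitting; the data are simulated so the true average treatment effect is known.
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
      from sklearn.model_selection import KFold

      rng = np.random.default_rng(4)
      n = 2000
      X = rng.normal(size=(n, 5))
      p = 1 / (1 + np.exp(-X[:, 0]))               # true propensity score
      D = rng.binomial(1, p)                       # treatment
      Y = 1.0 * D + X[:, 0] + rng.normal(size=n)   # true ATE = 1

      # Cross-fitting: nuisance models are trained on one fold split and
      # evaluated on the held-out observations.
      scores = np.zeros(n)
      for train, test in KFold(5, shuffle=True, random_state=0).split(X):
          ps = RandomForestClassifier(random_state=0).fit(X[train], D[train])
          m1 = RandomForestRegressor(random_state=0).fit(
              X[train][D[train] == 1], Y[train][D[train] == 1])
          m0 = RandomForestRegressor(random_state=0).fit(
              X[train][D[train] == 0], Y[train][D[train] == 0])
          e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
          mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
          scores[test] = (mu1 - mu0
                          + D[test] * (Y[test] - mu1) / e
                          - (1 - D[test]) * (Y[test] - mu0) / (1 - e))
      print("ATE estimate:", scores.mean())        # should be near 1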
  12. By: Yang Qu; Ming-Xi Wang
    Abstract: Hutchinson, Lo and Poggio raised the question of whether learning networks can learn the Black-Scholes formula, and they proposed a network mapping the ratio of underlying price to strike $S_t/K$ and the time to maturity $\tau$ directly into the ratio of option price to strike $C_t/K$. In this paper we propose a novel decision function and study the network mapping $S_t/K$ and $\tau$ into the ratio of time value to strike $V_t/K$. The appearance of time values in the artificial intelligence fits traders' natural intelligence. Empirical experiments demonstrate that this significantly improves Hutchinson-Lo-Poggio's original model through faster learning and better generalization performance. To take a conceptual viewpoint and to prove that $V_t/K$, but not $C_t/K$, can be approximated by superpositions of logistic functions on its domain of definition, we develop the theory of universal approximation on unbounded domains. We prove general results which imply that an artificial neural network with a single hidden layer and sigmoid activation represents no function in $L^{p}(\mathbb{R}^2 \times [0, 1]^{n})$ unless it is constant zero, and that an artificial neural network with a single hidden layer and logistic activation is a universal approximator of $L^{2}(\mathbb{R} \times [0, 1]^{n})$. Our work partially generalizes Cybenko's fundamental universal approximation theorem on the unit hypercube $[0, 1]^{n}$.
    Date: 2019–10
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1910.01490&r=all
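    Sketch: the paper's setup reproduced on simulated Black-Scholes data, training a single-hidden-layer logistic network to map moneyness $S_t/K$ and maturity $\tau$ into the time value $V_t/K$; time value is taken here as price minus intrinsic value, and all parameter choices are illustrative.
      import numpy as np
      from scipy.stats import norm
      from sklearn.neural_network import MLPRegressor

      # Generate Black-Scholes call prices over a grid of moneyness and
      # maturity, then strip out the intrinsic value.
      r, sigma = 0.02, 0.2
      rng = np.random.default_rng(5)
      m = rng.uniform(0.6, 1.4, 5000)            # moneyness S/K
      tau = rng.uniform(0.05, 1.0, 5000)         # time to maturity in years

      d1 = (np.log(m) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
      d2 = d1 - sigma * np.sqrt(tau)
      call_over_k = m * norm.cdf(d1) - np.exp(-r * tau) * norm.cdf(d2)  # C/K
      time_value = call_over_k - np.maximum(m - 1.0, 0.0)               # V/K

      # One hidden layer with logistic activation, as in the theory part.
      net = MLPRegressor(hidden_layer_sizes=(32,), activation="logistic",
                         max_iter=3000, random_state=0)
      net.fit(np.column_stack([m, tau]), time_value)
      print("fit R^2:", net.score(np.column_stack([m, tau]), time_value))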
  13. By: Martijn Droes; Martin Hoesli; Steven C. Bourassa
    Abstract: This paper explores Rosen’s (1974) suggestion that within the hedonic framework there are natural tendencies toward market segmentation. We show that market segmentation can be estimated on the basis of an augmented hedonic model in which marginal prices are separated by household characteristics into different classes. The classes can either be exogenously defined or endogenously determined based on an unsupervised machine learning algorithm or a latent class formulation. We illustrate the usefulness of these methods using American Housing Survey data for Louisville and show that there are distinct housing market segments within the Louisville metropolitan area based on income and family structure.
    Keywords: Hedonic Model; heterogeneous households; latent class; Machine Learning; Market Segmentation
    JEL: R3
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2019_218&r=all
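    Sketch: an illustrative Python version of the endogenously determined classes, clustering households with an unsupervised algorithm and then estimating a separate hedonic equation per segment; the data and functional form are invented.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(6)
      n = 1500
      income = rng.lognormal(10.5, 0.5, n)
      hh_size = rng.integers(1, 6, n)
      sqft = rng.uniform(600, 3500, n)
      # True marginal price of space differs by (unobserved) segment:
      seg_true = (income > np.median(income)).astype(int)
      price = 50_000 + (60 + 40 * seg_true) * sqft + rng.normal(0, 20_000, n)

      # Cluster on household characteristics, then fit a hedonic equation
      # within each segment to recover segment-specific marginal prices.
      hh = np.column_stack([np.log(income), hh_size])
      labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(hh)
      for k in range(2):
          fit = LinearRegression().fit(sqft[labels == k].reshape(-1, 1),
                                       price[labels == k])
          print(f"segment {k}: implied price per sq ft = {fit.coef_[0]:.1f}")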
  14. By: Marcelo Cajias; Jonas Willwersch; Felix Lorenz
    Abstract: Real estate transactions can be seen as a spatial point pattern over space and time. That means that transactions occur in places where, at a certain point in time, conditions hold that lead to an investment decision. While the decision-making process of investors is impossible to capture directly, this paper applies new methods for capturing the conditions under which real estate transactions are made over space and time. In other words, we explain and forecast real estate transactions with machine learning methods, combining real estate transactions, geographical information and, most importantly, microeconomic data.
    Keywords: Machine Learning; Point pattern analysis; Real estate transactions; Spatial-temporal analysis; Surveillance analysis
    JEL: R3
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2019_171&r=all
  15. By: Giovanni Mariani; Yada Zhu; Jianbo Li; Florian Scheidegger; Roxana Istrate; Costas Bekas; A. Cristiano I. Malossi
    Abstract: For decades, the data science community has tried to propose prediction models for financial time series. Yet, driven by the rapid development of information technology and machine intelligence, the velocity of today's information leads to high market efficiency. Sound financial theories demonstrate that in an efficient marketplace all information available today, including expectations about future events, is represented in today's prices, whereas future price trends are driven by uncertainty. This jeopardizes the efforts put into designing prediction models. To deal with the unpredictability of financial systems, today's portfolio management is largely based on the Markowitz framework, which puts more emphasis on the analysis of market uncertainty and less on price prediction. The limitation of the Markowitz framework lies in its very strong idealized assumptions about the probability distribution of future returns. To address this situation we propose PAGAN, a pioneering methodology based on deep generative models. The goal is to model the market uncertainty that is ultimately the main factor driving future trends. The generative model learns the joint probability distribution of price trends for a set of financial assets to match the probability distribution of the real market. Once the model is trained, a portfolio is optimized by choosing the diversification that minimizes the risk and maximizes the expected returns observed over the execution of several simulations. Applying the model to analyze possible futures is as simple as running a Monte Carlo simulation, a technique very familiar to finance experts. Experimental results on different portfolios, representing different geopolitical areas and industrial segments and constructed using real-world public data sets, are promising.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.10578&r=all
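    Sketch: once a generative model is trained, portfolio selection reduces to a Monte Carlo exercise over its samples. Below, random draws stand in for GAN output and a naive random search stands in for the paper's optimizer; everything is illustrative.
      import numpy as np

      rng = np.random.default_rng(7)
      n_scenarios, n_assets, horizon = 2000, 4, 60
      # Stand-in for generator output: simulated daily-return paths.
      scenarios = rng.normal(0.0004, 0.01, size=(n_scenarios, horizon, n_assets))
      path_returns = scenarios.sum(axis=1)          # per-scenario asset returns

      # Score candidate long-only weights by a mean-risk trade-off across
      # simulations and keep the best one.
      best, best_score = None, -np.inf
      for _ in range(5000):
          w = rng.dirichlet(np.ones(n_assets))      # long-only weights
          port = path_returns @ w                   # per-scenario portfolio P&L
          score = port.mean() - 2.0 * port.std()
          if score > best_score:
              best, best_score = w, score
      print("chosen weights:", np.round(best, 3))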
  16. By: Claudio Vitari (AMU - Aix Marseille Université, CERGAM - Centre d'Études et de Recherche en Gestion d'Aix-Marseille - AMU - Aix Marseille Université - UTLN - Université de Toulon); Elisabetta Raguseo (Polito - Politecnico di Torino [Torino])
    Abstract: Previous studies, grounded in the resource-based view, have explored the business value that Big Data Analytics (BDA) can bring to firm performance. However, the role played by the environmental characteristics in which companies operate has not been investigated in the literature. We extend theory in that direction by integrating contingency theory with the resource-based view of the firm. This original, integrative model examines the moderating influence of environmental features on the relationship between BDA business value and firm performance. The combination of survey data and secondary financial data on a representative sample of medium and large companies makes possible the statistical validation of our research model. The results offer evidence that BDA business value leads to higher firm performance, namely financial performance, market performance and customer satisfaction. More original is the demonstration that this relationship is stronger in munificent environments, while the dynamism of the environment does not have any moderating effect on the performance of BDA solutions. This means that managers working for firms in markets with growing demand are in the best position to profit from BDA.
    Keywords: Resource-based view; contingency theory; Big Data Analytics; customer satisfaction; financial performance; market performance; munificence; dynamism
    Date: 2019–09–09
    URL: http://d.repec.org/n?u=RePEc:hal:journl:hal-02293765&r=all
  17. By: Arthur Charpentier (CREM - Centre de recherche en économie et management - UNICAEN - Université de Caen Normandie - NU - Normandie Université - UR1 - Université de Rennes 1 - UNIV-RENNES - Université de Rennes - CNRS - Centre National de la Recherche Scientifique, UQAM - Université du Québec à Montréal)
    Abstract: Technology companies and the insurance world would seem to be complete opposites: agility, speed and an obsession with the future on one side; conservatism, reflexivity and a fascination with past data on the other. And yet the two are watching each other and beginning to form partnerships, having understood that data is their core business.
    Date: 2019–09–23
    URL: http://d.repec.org/n?u=RePEc:hal:wpaper:hal-02294899&r=all
  18. By: Andreas Kindt
    Abstract: With increasing digitization and big data, automated valuation models (AVMs) are becoming increasingly important within the real estate industry, internationally. The further potential for the use of AVMs seems enormous. However, mainstream AVM research has hitherto been largely one-dimensional and requires a wider focus. Stakeholders with situationally recurring valuation tasks (e.g., mortgage lending, transactions) or regular ones (e.g., risk management, performance analysis) in particular need individual AVM solutions. Efficient access to the subject remains difficult because of its high complexity, so these stakeholders have a strong interest in systematic and integrated decision support. Addressing this point, the dissertation aims to provide guidance for the optimal development and application of AVMs.
    Keywords: Automated Valuation Models; AVM; Big data; Digitalization; Property Valuation
    JEL: R3
    Date: 2019–01–01
    URL: http://d.repec.org/n?u=RePEc:arz:wpaper:eres2019_240&r=all
  19. By: Draca, Mirko (University of Warwick); Schwarz, Carlo (University of Warwick)
    Abstract: Strong evidence has been emerging that major democracies have become more politically polarized, at least according to measures based on the ideological positions of political elites. We ask: have the general public (‘citizens’) followed the same pattern? Our approach is based on unsupervised machine learning models applied to issue-position survey data. This approach first indicates that coherent, latent ideologies are strongly apparent in the data, with a number of major, stable types that we label Liberal Centrist, Conservative Centrist, Left Anarchist and Right Anarchist. Using this framework, and a resulting measure of ‘citizen slant’, we are then able to decompose the shift in ideological positions across the population over time. Specifically, we find evidence of a ‘disappearing center’ in a range of countries, with citizens shifting away from centrist ideologies and into anti-establishment ‘anarchist’ ideologies over time. This trend is especially pronounced for the US.
    Keywords: Polarization; Ideology; Unsupervised Learning
    JEL: D72 C81
    Date: 2019
    URL: http://d.repec.org/n?u=RePEc:cge:wacage:432&r=all
  20. By: Helene Dernis (OECD); Petros Gkotsis (European Commission - JRC); Nicola Grassano (European Commission - JRC); Shohei Nakazato (OECD); Mariagrazia Squicciarini (OECD); Brigitte van Beuzekom (OECD); Antonio Vezzani
    Abstract: This report brings together data on patents, scientific publications, trademarks and designs of the world’s top corporate R&D investors to shed light on the role they play in shaping the future of technologies and of AI. As with the two previous editions, the present report is the product of a collaborative effort between the JRC of the European Commission and the OECD, two organisations committed to providing high-quality open data and up-to-date indicators and analysis. The audience this report aims to reach is quite diverse: from the scientific community to industry representatives, from practitioners to policy makers. Its aim is to be a useful source of analysis and data for all those interested in understanding the scientific and technological activities of key industrial players, particularly in the field of AI. The data underlying the analysis are publicly available for all those who want to use them for further analysis.
    Keywords: R&D investment; Artificial Intelligence; Intellectual Property; Patents; Trademarks; Scientific publications
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:ipt:iptwpa:jrc117068&r=all
  21. By: Bénédicte H. Apouey (PSE - Paris School of Economics, PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Panthéon-Sorbonne - ENS Paris - École normale supérieure - Paris - INRA - Institut National de la Recherche Agronomique - EHESS - École des hautes études en sciences sociales - ENPC - École des Ponts ParisTech - CNRS - Centre National de la Recherche Scientifique)
    Abstract: In a quantitative survey conducted in 2016 among 1,700 members of a mutual insurer, we measured interest in various services that the insurer could offer using members' personal data within a personalized-medicine approach. Respondents are both concerned about the confidentiality of their data and interested in its use for monitoring, prediction and prevention. Interest is stronger among those in poor health and those worried about their old age. We observe weaker interest among individuals of higher social position, perhaps because of their material and cultural resources and their concern about the risks involved in the use of the data.
    Keywords: personal health data; connected devices; quantified self; big data; insurers; France
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:hal:psewpa:halshs-02295392&r=all
  22. By: Jau-er Chen; Jia-Jyun Tien
    Abstract: The aim of this paper is to investigate estimation and inference on a low-dimensional causal parameter in the presence of high-dimensional controls in an instrumental variable quantile regression. The estimation and inference are based on Neyman-type orthogonal moment conditions, which are relatively insensitive to the estimation of the nuisance parameters. Monte Carlo experiments show that the econometric procedure performs well. We also apply the procedure to reinvestigate two empirical studies: the quantile treatment effect of 401(k) participation on accumulated wealth, and the distributional effect of job-training program participation on trainee earnings.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.12592&r=all
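    Sketch: the mean-regression cousin of the paper's procedure on simulated data, partialling high-dimensional controls out of outcome, treatment and instrument with cross-fitted ML before a just-identified IV step (the paper does this for quantiles; all data below are simulated).
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.model_selection import KFold

      # Simulated data with an unobserved confounder U, so a regression of Y
      # on D is biased but the instrument Z recovers the true effect of 0.5.
      rng = np.random.default_rng(8)
      n = 2000
      X = rng.normal(size=(n, 20))                 # high-dimensional controls
      U = rng.normal(size=n)                       # unobserved confounder
      Z = X[:, 0] + rng.normal(size=n)             # instrument
      D = Z + X[:, 1] + U                          # endogenous treatment
      Y = 0.5 * D + X[:, 1] + 2.0 * U + rng.normal(size=n)

      # Cross-fitted partialling-out: residualize Y, D and Z on X with ML,
      # then run a just-identified IV step on the residuals.
      rY, rD, rZ = np.zeros(n), np.zeros(n), np.zeros(n)
      for tr, te in KFold(5, shuffle=True, random_state=0).split(X):
          for target, resid in ((Y, rY), (D, rD), (Z, rZ)):
              f = RandomForestRegressor(random_state=0).fit(X[tr], target[tr])
              resid[te] = target[te] - f.predict(X[te])
      print("IV estimate:", (rZ @ rY) / (rZ @ rD))  # close to 0.5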
  23. By: Michael Poli; Jinkyoo Park; Ilija Ilievski
    Abstract: Finance is a particularly challenging application area for deep learning models due to a low signal-to-noise ratio, non-stationarity, and partial observability. Non-deliverable forwards (NDFs), a type of derivatives contract used in foreign exchange (FX) trading, present an additional difficulty in the form of the long-term planning required for an effective selection of the start and end dates of the contract. In this work, we focus on tackling the problem of NDF tenor selection by leveraging high-dimensional sequential data consisting of spot rates, technical indicators and expert tenor patterns. To this end, we construct a dataset from Depository Trust & Clearing Corporation (DTCC) NDF data that includes a comprehensive list of NDF volumes and daily spot rates for 64 FX pairs. We introduce WaveATTentionNet (WATTNet), a novel temporal convolution (TCN) model for spatio-temporal modeling of highly multivariate time series, and validate it across NDF markets with varying degrees of dissimilarity between the training and test periods in terms of volatility and general market regimes. The proposed method achieves a significant positive return on investment (ROI) in all NDF markets under analysis, outperforming recurrent and classical baselines by a wide margin. Finally, we propose two orthogonal interpretability approaches to verify noise stability and detect the driving factors of the learned tenor selection strategy.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.10801&r=all
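    Sketch: the dilated causal convolution block underlying temporal convolution (TCN) models such as WATTNet; this is a generic PyTorch building block with invented sizes, not the paper's exact architecture.
      import torch
      import torch.nn as nn

      class CausalConvBlock(nn.Module):
          """Dilated causal 1-D convolution with a residual connection."""
          def __init__(self, channels, dilation):
              super().__init__()
              self.pad = (3 - 1) * dilation          # left-pad: no future leaks
              self.conv = nn.Conv1d(channels, channels, kernel_size=3,
                                    dilation=dilation)
              self.act = nn.ReLU()

          def forward(self, x):                      # x: (batch, channels, time)
              out = self.conv(nn.functional.pad(x, (self.pad, 0)))
              return self.act(out) + x               # residual connection

      # Stacking blocks with doubling dilation grows the receptive field
      # exponentially while preserving the time length.
      net = nn.Sequential(*[CausalConvBlock(16, 2 ** i) for i in range(4)])
      x = torch.randn(8, 16, 128)                    # e.g. 16 FX features, 128 days
      print(net(x).shape)                            # time length preserved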
  24. By: Yangang Chen; Justin W. L. Wan
    Abstract: We propose a deep neural network framework for computing prices and deltas of American options in high dimensions. The architecture of the framework is a sequence of neural networks, where each network learns the difference of the price functions between adjacent timesteps. We introduce the least-squares residual of the associated backward stochastic differential equation as the loss function. Our proposed framework yields prices and deltas over the entire space-time domain, not only at a given point. The computational cost of the proposed approach is quadratic in the dimension, which addresses the curse of dimensionality that state-of-the-art approaches suffer from. Our numerical simulations demonstrate these contributions, and show that the proposed neural network framework outperforms state-of-the-art approaches in high dimensions.
    Date: 2019–09
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1909.11532&r=all
  25. By: Jacques Bughin; Nicolas van Zeebroeck
    Date: 2018
    URL: http://d.repec.org/n?u=RePEc:ulb:ulbeco:2013/283916&r=all
  26. By: Yan Carriere-Swallow; Vikram Haksar
    Abstract: This SPR Departmental Paper will provide policymakers with a framework for studying changes to national data policy frameworks.
    Keywords: Unemployment; Economic integration; Economic conditions; Economic growth; Statistics; Data; growth; inequality; privacy; consumer protection; competition; financial stability; cybersecurity; DPPP; personal data; economic characteristic; increase return; individual data
    Date: 2019–09–23
    URL: http://d.repec.org/n?u=RePEc:imf:imfdep:19/16&r=all

This nep-big issue is ©2019 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at http://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.