nep-big New Economics Papers
on Big Data
Issue of 2023‒03‒27
25 papers chosen by
Tom Coupé
University of Canterbury

  1. Remote Work across Jobs, Companies, and Space By Stephen Hansen; Peter John Lambert; Nicholas Bloom; Steven J. Davis; Raffaella Sadun; Bledi Taska
  2. Does Machine Learning Amplify Pricing Errors in the Housing Market? -- The Economics of Machine Learning Feedback Loops By Nikhil Malik; Emaad Manzoor
  3. Real Estate Property Valuation using Self-Supervised Vision Transformers By Mahdieh Yazdani; Maziar Raissi
  4. The Effects of Artificial Intelligence on the World as a Whole from an Economic Perspective By Sharma, Rahul
  5. Inflation in Pakistan: High-Frequency Estimation and Forecasting By Sonan Memon
  6. The performance of time series forecasting based on classical and machine learning methods for S&P 500 index By Maudud Hassan Uzzal; Robert Ślepaczuk
  7. Attitudes and Latent Class Choice Models using Machine learning By Lorena Torres Lahoz; Francisco Camara Pereira; Georges Sfeir; Ioanna Arkoudi; Mayara Moraes Monteiro; Carlos Lima Azevedo
  8. Unsupervised Machine Learning for Explainable Health Care Fraud Detection By Shubhranshu Shekhar; Jetson Leder-Luis; Leman Akoglu
  9. Remote Work across Jobs, Companies, and Space By Hansen, Stephen; Lambert, Peter John; Bloom, Nicholas; Davis, Steven J.; Sadun, Raffaella; Taska, Bledi
  10. Addressing Data Gaps with Innovative Data Sources By Albert, Jose Ramon G.; Vizmanos, Jana Flor V.; Muñoz, Mika S.; Brucal, Arlan; Halili, Riza Teresita; Lumba, Angelo Jose; Patanñe, Gaile Anne
  11. The global economic impact of AI technologies in the fight against financial crime By James Bell
  12. The Macroeconomy as a Random Forest By Philippe Goulet Coulombe
  13. Solar Panel Adoption in SMEs in Emerging Countries By Pedro I. Hancevic; Hector H. Sandoval
  14. A Neural Phillips Curve and a Deep Output Gap By Philippe Goulet Coulombe
  15. Forecasting Macroeconomic Tail Risk in Real Time: Do Textual Data Add Value? By Philipp Adämmer; Jan Prüser; Rainer Schüssler
  16. Artificial intelligence adoption in the public sector- a case study By Laura Nurski
  17. Exploring the Advantages of Transformers for High-Frequency Trading By Fazl Barez; Paul Bilokon; Arthur Gervais; Nikita Lisitsyn
  18. Nowcasting GDP using tone-adjusted time varying news topics: Evidence from the financial press By Dorinth van Dijk; Jasper de Winter
  19. Parametric Differential Machine Learning for Pricing and Calibration By Arun Kumar Polala; Bernhard Hientzsch
  20. Reevaluating the Taylor Rule with Machine Learning By Alper Deniz Karakas
  21. The demand and supply of information about inflation By Massimiliano Marcellino; Dalibor Stevanovic
  22. On the Validity of Using Webpage Texts to Identify the Target Population of a Survey: An Application to Detect Online Platforms By Daas, Piet; Hassink, Wolter; Klijs, Bart
  23. Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies By Wee Ling Tan; Stephen Roberts; Stefan Zohren
  24. Post-Episodic Reinforcement Learning Inference By Vasilis Syrgkanis; Ruohan Zhan
  25. Simultaneous upper and lower bounds of American option prices with hedging via neural networks By Ivan Guo; Nicolas Langrené; Jiahao Wu

  1. By: Stephen Hansen; Peter John Lambert; Nicholas Bloom; Steven J. Davis; Raffaella Sadun; Bledi Taska
    Abstract: The pandemic catalyzed an enduring shift to remote work. To measure and characterize this shift, we examine more than 250 million job vacancy postings across five English-speaking countries. Our measurements rely on a state-of-the-art language-processing framework that we fit, test, and refine using 30,000 human classifications. We achieve 99% accuracy in flagging job postings that advertise hybrid or fully remote work, greatly outperforming dictionary methods and also outperforming other machine-learning methods. From 2019 to early 2023, the share of postings that say new employees can work remotely one or more days per week rose more than three-fold in the U.S. and by a factor of five or more in Australia, Canada, New Zealand and the U.K. These developments are highly non-uniform across and within cities, industries, occupations, and companies. Even when zooming in on employers in the same industry competing for talent in the same occupations, we find large differences in the share of job postings that explicitly offer remote work.
    JEL: C55 E24 M54 O33 R3
    Date: 2023–03
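    The gap between dictionary methods and trained classifiers that the authors report can be illustrated with a toy sketch (the keyword list, postings, and labels below are invented for illustration; the paper itself fits a large language-processing model to 30,000 human classifications):

```python
# A naive dictionary flagger for remote-work postings. It misfires on
# negation, which is one reason keyword matching underperforms trained
# text classifiers on this task.
REMOTE_TERMS = ("remote", "work from home", "hybrid")

def dictionary_flag(posting: str) -> bool:
    """Flag a posting as remote-friendly if any keyword appears."""
    text = posting.lower()
    return any(term in text for term in REMOTE_TERMS)

postings = [
    "Hybrid schedule: 2 days per week in our Austin office.",  # remote-friendly
    "This role is NOT eligible for remote work.",              # negated
    "Fully on-site position in downtown Toronto.",             # on-site
]

flags = [dictionary_flag(p) for p in postings]
print(flags)  # the second posting is wrongly flagged: [True, True, False]
```

The second posting shows the failure mode: the keyword "remote" appears only inside a negation, so a dictionary method produces a false positive that a context-aware classifier can avoid.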
  2. By: Nikhil Malik; Emaad Manzoor
    Abstract: Machine learning algorithms are increasingly employed to price or value homes for sale, properties for rent, rides for hire, and various other goods and services. Machine learning-based prices are typically generated by complex algorithms trained on historical sales data. However, displaying these prices to consumers anchors the realized sales prices, which will in turn become training samples for future iterations of the algorithms. The economic implications of this machine learning "feedback loop" - an indirect human-algorithm interaction - remain relatively unexplored. In this work, we develop an analytical model of machine learning feedback loops in the context of the housing market. We show that feedback loops lead machine learning algorithms to become overconfident in their own accuracy (by underestimating their errors) and lead home sellers to over-rely on possibly erroneous algorithmic prices. As a consequence, at the feedback loop equilibrium, sale prices can become entirely erratic (relative to true consumer preferences in the absence of ML price interference). We then identify conditions (choice of ML models, seller characteristics, and market characteristics) under which the economic payoff for home sellers at the feedback loop equilibrium is worse than under no machine learning. We also empirically validate primitive building blocks of our analytical model using housing market data from Zillow. We conclude by prescribing algorithmic corrective strategies to mitigate the effects of machine learning feedback loops, discussing the incentives for platforms to adopt these strategies, and discussing the role of policymakers in regulating them.
    Date: 2023–02
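    A minimal simulation of the anchoring mechanism described above, with invented parameters (this is not the authors' analytical model): each "generation" of the algorithm is refit to a sale price that was itself partly anchored to the previous algorithmic price.

```python
import numpy as np

def simulate(anchor_weight, generations=50, seed=0):
    """One run of the feedback loop: the realized sale price mixes the ML
    price with the buyers' true valuation, and the next model generation
    is fit to that anchored sale."""
    rng = np.random.default_rng(seed)
    ml_price, true_value = 120.0, 100.0   # model starts with a biased price
    for _ in range(generations):
        sale = (anchor_weight * ml_price
                + (1 - anchor_weight) * true_value
                + rng.normal(0.0, 2.0))
        ml_price = sale                   # retrain on the anchored sale
    return ml_price

# With partial anchoring the price is still pulled toward true preferences;
# with full anchoring it becomes self-referential and drifts erratically.
partial = [simulate(0.5, seed=s) for s in range(20)]
full = [simulate(1.0, seed=s) for s in range(20)]
print(np.mean(partial), np.std(full) / np.std(partial))
```

With `anchor_weight < 1` the loop is a stable recursion around the true valuation; at `anchor_weight = 1` it degenerates into a random walk disconnected from consumer preferences, a stylized version of the "entirely erratic" equilibrium in the abstract.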
  3. By: Mahdieh Yazdani; Maziar Raissi
    Abstract: The use of Artificial Intelligence (AI) in the real estate market has been growing in recent years. In this paper, we propose a new method for property valuation that utilizes self-supervised vision transformers, a recent breakthrough in computer vision and deep learning. Our proposed algorithm uses a combination of machine learning, computer vision and hedonic pricing models trained on real estate data to estimate the value of a given property. We collected and pre-processed a data set of real estate properties in the city of Boulder, Colorado and used it to train, validate and test our algorithm. Our data set consisted of qualitative images (including house interiors, exteriors, and street views) as well as quantitative features such as the number of bedrooms, bathrooms, square footage, lot square footage, property age, crime rates, and proximity to amenities. We evaluated the performance of our model using metrics such as Root Mean Squared Error (RMSE). Our findings indicate that these techniques are able to accurately predict the value of properties, with a low RMSE. The proposed algorithm outperforms traditional appraisal methods that do not leverage property images and has the potential to be used in real-world applications.
    Date: 2023–01
  4. By: Sharma, Rahul
    Abstract: Artificial intelligence (AI) has made tremendous advances in recent years, and there is no doubt that this technology will have a significant impact on the overall economy in terms of productivity, growth, markets, and innovation. A growing number of perspectives on the impact of AI are flooding the business press, but finding one that deals with the economic impact of AI in a unique and original way is becoming increasingly difficult. Within the economics profession, the adoption of AI and machine learning (ML) methods has been quite uneven. Microeconomics is one of the most prominent fields in which AI and ML are being used: with the explosion of data collection, especially at the consumer level (by companies such as Google), these methods have become increasingly practical and feasible. Because these models require enormous amounts of information to be useful, their application has been heavily concentrated in microeconomics, where such data are abundant.
    Keywords: artificial intelligence, machine learning, macroeconomics, internet of things, inventory management, technology
    JEL: O1 O32 Q5 Q55
    Date: 2021–04–15
  5. By: Sonan Memon (Pakistan Institute of Development Economics)
    Abstract: I begin by motivating the utility of high-frequency inflation estimation and reviewing recent work done at the State Bank of Pakistan for inflation forecasting and now-casting GDP using machine learning (ML) tools. I also present stylised facts about the structure of historical and especially recent inflation trends in Pakistan.
    Keywords: Forecast Accuracy, Forecasts of Inflation in Pakistan, High Frequency, Hyperinflation, Inflation Estimation and Forecasting, Machine Learning, Synthetic Data, VAR Models, Web Scraping and Scanner Data
    JEL: C53 E30 E31 E32 E37 E47 E52 E58
    Date: 2022
  6. By: Maudud Hassan Uzzal (University of Warsaw, Faculty of Economic Sciences, Quantitative Finance Research Group); Robert Ślepaczuk (University of Warsaw, Faculty of Economic Sciences, Department of Quantitative Finance, Quantitative Finance Research Group)
    Abstract: Based on one-step-ahead forecasts, this study compares the forecasting abilities of a traditional technique (ARIMA) with a recurrent neural network (LSTM). In order to check the possible use of these forecasts in different asset management methods, the forecasts are afterwards incorporated into trading signals of investment strategies. As a benchmark, the Random Walk model producing naive forecasts has been utilized. This research examines daily data from the S&P 500 index over 20 years, from 2000 to 2020, a period that includes several significant episodes of market turbulence. The methods were tested for robustness to changes in parameters and hyperparameters and evaluated based on various error metrics (MAE, MAPE, RMSE, MSE). The results show that ARIMA outperforms LSTM in terms of one-step-ahead forecasts. Finally, the LSTM model was tested with a variety of hyperparameters - including the number of epochs, the loss function, the optimizer, the activation functions, the number of units, the batch size, and the learning rate - in order to check its robustness.
    Keywords: deep learning, recurrent neural networks, ARIMA, algorithmic investment strategies, trading systems, LSTM, walk-forward process, optimization
    JEL: C4 C14 C45 C53 C58 G13
    Date: 2023
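    The walk-forward, one-step-ahead evaluation described above can be sketched on synthetic data (this is only an illustration of the protocol: a simple AR(1) fit stands in for ARIMA, no LSTM is fitted, and the naive random-walk forecast plays the benchmark role it has in the paper):

```python
import numpy as np

# Simulate an AR(1) return series, then walk forward: at each step the
# model is re-estimated on history only and produces a one-step forecast.
rng = np.random.default_rng(1)
n, phi_true = 600, 0.5
r = np.zeros(n)
for t in range(1, n):
    r[t] = phi_true * r[t - 1] + rng.standard_normal()

naive_err, ar_err = [], []
for t in range(300, n - 1):
    hist = r[: t + 1]
    # AR(1) OLS slope: regress r[1:] on r[:-1]
    phi_hat = (hist[:-1] @ hist[1:]) / (hist[:-1] @ hist[:-1])
    ar_err.append(r[t + 1] - phi_hat * r[t])
    naive_err.append(r[t + 1] - r[t])   # random-walk benchmark forecast

rmse = lambda e: float(np.sqrt(np.mean(np.square(e))))
print(rmse(ar_err), rmse(naive_err))
```

Because the model is always refit on past data only, the comparison never leaks future information, which is the point of the walk-forward process the authors use.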
  7. By: Lorena Torres Lahoz (DTU Management, Technical University of Denmark); Francisco Camara Pereira (DTU Management, Technical University of Denmark); Georges Sfeir (DTU Management, Technical University of Denmark); Ioanna Arkoudi (DTU Management, Technical University of Denmark); Mayara Moraes Monteiro (DTU Management, Technical University of Denmark); Carlos Lima Azevedo (DTU Management, Technical University of Denmark)
    Abstract: Latent Class Choice Models (LCCM) are extensions of discrete choice models (DCMs) that capture unobserved heterogeneity in the choice process by segmenting the population based on the assumption of preference similarities. We present a method for efficiently incorporating attitudinal indicators in the specification of LCCM by introducing Artificial Neural Networks (ANN) to formulate latent variable constructs. This formulation surpasses structural equation approaches in its capacity to explore the relationship between the attitudinal indicators and the decision choice, given the flexibility and power of Machine Learning (ML) in capturing unobserved and complex behavioural features such as attitudes and beliefs, while still maintaining the consistency of the theoretical assumptions presented in the Generalized Random Utility model and the interpretability of the estimated parameters. We test our proposed framework by estimating a Car-Sharing (CS) service subscription choice with stated preference data from Copenhagen, Denmark. The results show that our proposed approach provides a complete and realistic segmentation, which helps design better policies.
    Date: 2023–02
  8. By: Shubhranshu Shekhar; Jetson Leder-Luis; Leman Akoglu
    Abstract: The US spends more than 4 trillion dollars per year on health care, largely conducted by private providers and reimbursed by insurers. A major concern in this system is overbilling, waste and fraud by providers, who face incentives to misreport on their claims in order to receive higher payments. In this work, we develop novel machine learning tools to identify providers that overbill insurers. Using large-scale claims data from Medicare, the US federal health insurance program for elderly adults and the disabled, we identify patterns consistent with fraud or overbilling among inpatient hospitalizations. Our proposed approach for fraud detection is fully unsupervised, not relying on any labeled training data, and is explainable to end users, providing reasoning and interpretable insights into the potentially suspicious behavior of the flagged providers. Data from the Department of Justice on providers facing anti-fraud lawsuits and case studies of suspicious providers validate our approach and findings. We also perform a post-analysis of hospital characteristics that, while not used for detection, are associated with a high suspiciousness score. Our method provides an 8-fold lift over random targeting and can be used to guide investigations and auditing of suspicious providers for both public and private health insurance systems.
    JEL: C19 D73 I13 K42 M42
    Date: 2023–02
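    The "lift over random targeting" metric used above has a simple form: the fraud rate among the top-scored providers divided by the overall fraud rate. A toy computation on simulated provider data (the scoring rule and all numbers are invented; the paper's detector is a far richer unsupervised, explainable model):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
fraud = np.zeros(n, dtype=bool)
fraud[:50] = True                          # 5% of providers overbill
avg_bill = rng.normal(100.0, 10.0, n)      # average billed per claim
avg_bill[fraud] += rng.normal(40.0, 10.0, 50)  # overbillers charge more

# A crude suspiciousness score: deviation from the median billing level.
score = (avg_bill - np.median(avg_bill)) / avg_bill.std()
top = np.argsort(score)[::-1][:50]         # audit the 50 highest-scored
lift = fraud[top].mean() / fraud.mean()
print(round(float(lift), 1))               # lift well above 1 vs. random audits
```

Random auditing would catch fraud at the 5% base rate; ranking by the score concentrates the auditors' budget on the providers most likely to be overbilling, which is what the 8-fold lift in the abstract quantifies.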
  9. By: Hansen, Stephen (University College London); Lambert, Peter John (London School of Economics); Bloom, Nicholas (Stanford University); Davis, Steven J. (University of Chicago); Sadun, Raffaella (Harvard University); Taska, Bledi (Lightcast)
    Abstract: The pandemic catalyzed an enduring shift to remote work. To measure and characterize this shift, we examine more than 250 million job vacancy postings across five English-speaking countries. Our measurements rely on a state-of-the-art language-processing framework that we fit, test, and refine using 30,000 human classifications. We achieve 99% accuracy in flagging job postings that advertise hybrid or fully remote work, greatly outperforming dictionary methods and also outperforming other machine-learning methods. From 2019 to early 2023, the share of postings that say new employees can work remotely one or more days per week rose more than three-fold in the U.S. and by a factor of five or more in Australia, Canada, New Zealand and the U.K. These developments are highly non-uniform across and within cities, industries, occupations, and companies. Even when zooming in on employers in the same industry competing for talent in the same occupations, we find large differences in the share of job postings that explicitly offer remote work.
    Keywords: remote work, hybrid work, work from home, job vacancies, text classifiers, BERT, pandemic impact, labour markets, COVID-19
    JEL: E24 O33 R3 M54 C55
    Date: 2023–02
  10. By: Albert, Jose Ramon G.; Vizmanos, Jana Flor V.; Muñoz, Mika S.; Brucal, Arlan; Halili, Riza Teresita; Lumba, Angelo Jose; Patanñe, Gaile Anne
    Abstract: With the advent of digital transformation, information and communications technology innovations have also led to a “data revolution” wherein more data is being captured, produced, stored, accessed, analyzed, archived, and reanalyzed at an exponential pace. An examination of new data sources, including big data and crowd-sourced data, can complement traditional sources of statistics and unlock insights that can ultimately lead to interventions for better outcomes by informing policies and actions toward attaining robust, sustainable, and inclusive development. This study will examine PIDS website download data and Twitter data to illustrate stories obtained from new data sources and explore how access, analysis, and use of new data sources can be promoted. Several quantitative tools are used on these new data sources, including (a) market basket analysis for website download data, (b) text mining (and sentiment analysis) for web scraped Twitter data, and (c) other big data analytics tools. Policy issues are also discussed, including risk management for using these new data sources.
    Keywords: data revolution;big data;new data sources;social media data;market-basket analysis;web scraping;text mining;sentiment analysis
    Date: 2022
  11. By: James Bell
    Abstract: Is the rapid adoption of Artificial Intelligence a sign that creative destruction (a capitalist innovation process first theorised in 1942) is occurring? Although the theory suggests that it is only visible over time in aggregate, this paper devises three hypotheses to test its presence at the macro level, along with research methods to produce the required data. The paper tests the theory using news archives, questionnaires, and interviews with industry professionals. It considers the risks of adopting Artificial Intelligence, its current performance in the market, and its general applicability to the role. The results suggest that creative destruction is occurring in the anti-money laundering (AML) industry despite the activities of the regulators acting as natural blockers to innovation. This is a pressurised situation in which current-generation Artificial Intelligence may offer more harm than benefit. For managers, this paper's results suggest that safely pursuing AI in AML requires realistic expectations of Artificial Intelligence's benefits combined with the use of a framework for AI Ethics.
    Date: 2023–02
  12. By: Philippe Goulet Coulombe (University of Pennsylvania)
    Abstract: I develop Macroeconomic Random Forest (MRF), an algorithm adapting the canonical Machine Learning (ML) tool to flexibly model evolving parameters in a linear macro equation. Its main output, Generalized Time-Varying Parameters (GTVPs), is a versatile device nesting many popular nonlinearities (threshold/switching, smooth transition, structural breaks/change) and allowing for sophisticated new ones. The approach delivers clear forecasting gains over numerous alternatives, predicts the 2008 drastic rise in unemployment, and performs well for inflation. Unlike most ML-based methods, MRF is directly interpretable — via its GTVPs. For instance, the successful unemployment forecast is due to the influence of forward-looking variables (e.g., term spreads, housing starts) nearly doubling before every recession. Interestingly, the Phillips curve has indeed flattened, and its might is highly cyclical.
    Date: 2021–06
  13. By: Pedro I. Hancevic (CIDE/Universidad Panamericana); Hector H. Sandoval (Bureau of Economic and Business Research)
    Abstract: We analyze the determinants of adoption of distributed solar photovoltaic systems, focusing on small and medium-sized commercial and service firms. We make use of monthly billing data that is perfectly matched with data from the ENCENRE-2019, a novel survey that gathers data on electricity consumption, the stock of electric equipment, and a rich set of firm characteristics in the Metropolitan Area of Aguascalientes, Mexico. Using an econometric model, we find evidence that a set of explanatory variables such as business characteristics, the economic sector, ownership status, stock and usage of equipment and appliances, presence of other solar technologies, and views about the use of renewable energy are important determinants of the probability of adoption of solar panel systems. Furthermore, using machine learning methods to identify the best predictors of solar adoption, we indirectly validate the theory-driven empirical model by assessing a large set of explanatory variables and selecting a subset of these variables. In addition, we investigate relevant cases where a priori solar panel adoption seems to be cost-effective but structural adoption barriers and adoption gaps might coexist for certain groups of electricity users. We also calculate the social cost savings and the avoided CO2 emissions. Finally, based on our results, we provide several policy implications and recommendations.
    Keywords: small and medium-sized enterprises (SMEs), distributed photovoltaic generation, electricity consumption, technology adoption, Mexico
    JEL: D22 O14 Q40 Q53
    Date: 2023–03
  14. By: Philippe Goulet Coulombe (University of Quebec in Montreal)
    Abstract: Many problems plague the estimation of Phillips curves. Among them is the hurdle that the two key components, inflation expectations and the output gap, are both unobserved. Traditional remedies include creating reasonable proxies for the notable absentees or extracting them via some form of assumptions-heavy filtering procedure. I propose an alternative route: a Hemisphere Neural Network (HNN) whose peculiar architecture yields a final layer where components can be interpreted as latent states within a Neural Phillips Curve. There are benefits. First, HNN conducts the supervised estimation of nonlinearities that arise when translating a high-dimensional set of observed regressors into latent states. Second, computations are fast. Third, forecasts are economically interpretable. Fourth, inflation volatility can also be predicted by merely adding a hemisphere to the model. Among other findings, the contribution of real activity to inflation appears severely underestimated in traditional econometric specifications. Also, HNN captures out-of-sample the 2021 upswing in inflation and attributes it first to an abrupt and sizable disanchoring of the expectations component, followed by a wildly positive gap starting from late 2020. The unique path of HNN's gap comes from dispensing with unemployment and GDP in favor of an amalgam of nonlinearly processed alternative tightness indicators – some of which are skyrocketing as of early 2022.
    Date: 2022–01
  15. By: Philipp Adämmer; Jan Prüser; Rainer Schüssler
    Abstract: We examine the incremental value of news-based data relative to the FRED-MD economic indicators for quantile predictions (nowcasts and forecasts) of employment, output, inflation and consumer sentiment. Our results suggest that news data contain valuable information not captured by economic indicators, particularly for left-tail forecasts. Methods that capture quantile-specific non-linearities produce superior forecasts relative to methods that feature linear predictive relationships. However, adding news-based data substantially increases the performance of quantile-specific linear models, especially in the left tail. Variable importance analyses reveal that left-tail predictions are determined by both economic and textual indicators, with the latter having the most pronounced impact on consumer sentiment.
    Date: 2023–02
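    The quantile (now)forecasts above are fitted by minimizing the pinball loss, which is minimized at the target quantile. A self-contained check of this property on simulated data (illustrative only; the paper's models condition these quantiles on economic and news-based predictors):

```python
import numpy as np

def pinball(y, q_hat, tau):
    """Average pinball (quantile) loss of a constant prediction q_hat."""
    u = y - q_hat
    return float(np.mean(np.where(u >= 0, tau * u, (tau - 1) * u)))

rng = np.random.default_rng(3)
y = rng.standard_normal(5000)
tau = 0.10                                  # a left-tail quantile

# Grid-search the constant prediction that minimizes the loss; it should
# land on the empirical 10th percentile of the sample.
grid = np.linspace(-3.0, 3.0, 601)
losses = [pinball(y, q, tau) for q in grid]
best = float(grid[int(np.argmin(losses))])
print(round(best, 2))   # near the standard-normal 10th percentile (~ -1.28)
```

Replacing the constant with a linear function of predictors gives quantile regression, the building block behind the left-tail forecasts discussed in the abstract.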
  16. By: Laura Nurski
    Abstract: The goal is to identify pitfalls in the process of technology adoption and to provide some lessons for both policy and business
    Date: 2023–03
  17. By: Fazl Barez; Paul Bilokon; Arthur Gervais; Nikita Lisitsyn
    Abstract: This paper explores novel deep learning Transformer architectures for high-frequency Bitcoin-USDT log-return forecasting and compares them to traditional Long Short-Term Memory (LSTM) models. A hybrid Transformer model, called HFformer, is then introduced for time series forecasting; it incorporates a Transformer encoder, a linear decoder, spiking activations, and a quantile loss function, and does not use position encoding. Furthermore, possible high-frequency trading strategies for use with the HFformer model are discussed, including trade sizing, trading signal aggregation, and a minimal trading threshold. Ultimately, the performance of the HFformer and LSTM models is assessed, and the results indicate that the HFformer achieves a higher cumulative PnL than the LSTM when trading with multiple signals during backtesting.
    Date: 2023–02
  18. By: Dorinth van Dijk; Jasper de Winter
    Abstract: We extract tone-adjusted, time-varying and hierarchically ordered topics from a large corpus of Dutch financial news and investigate whether these topics are useful for monitoring the business cycle and nowcasting GDP growth in the Netherlands. The financial newspaper articles span the period January 1985 up until January 2021. Our newspaper sentiment indicator has a high concordance with the business cycle. Further, we find newspaper sentiment increases the accuracy of our nowcast for GDP growth using a dynamic factor model, especially in periods of crisis. We conclude that our tone-adjusted newspaper topics contain valuable information not embodied in monthly indicators from statistical offices.
    Keywords: Factor models, topic modeling, nowcasting
    JEL: C8 C38 C55 E3
    Date: 2023–03
  19. By: Arun Kumar Polala; Bernhard Hientzsch
    Abstract: Differential machine learning (DML) is a recently proposed technique that uses samplewise state derivatives to regularize least square fits to learn conditional expectations of functionals of stochastic processes as functions of state variables. Exploiting the derivative information leads to fewer samples than a vanilla ML approach for the same level of precision. This paper extends the methodology to parametric problems where the processes and functionals also depend on model and contract parameters, respectively. In addition, we propose adaptive parameter sampling to improve relative accuracy when the functionals have different magnitudes for different parameter sets. For calibration, we construct pricing surrogates for calibration instruments and optimize over them globally. We discuss strategies for robust calibration. We demonstrate the usefulness of our methodology on one-factor Cheyette models with benchmark rate volatility specification with an extra stochastic volatility factor on (two-curve) caplet prices at different strikes and maturities, first for parametric pricing, and then by calibrating to a given caplet volatility surface. To allow convenient and efficient simulation of processes and functionals and in particular the corresponding computation of samplewise derivatives, we propose to specify the processes and functionals in a low-code way close to mathematical notation which is then used to generate efficient computation of the functionals and derivatives in TensorFlow.
    Date: 2023–02
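    The core idea of differential machine learning, augmenting a least-squares fit with samplewise derivative labels, can be shown in a few lines (a deliberately simplified sketch with an invented toy target, not the paper's Cheyette/TensorFlow setup): we learn f(x) = x^2 from noisy values and noisy pathwise derivatives 2x over a polynomial basis, and the derivative term enters the normal equations as a regularizer.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
x = rng.uniform(-1.0, 1.0, n)
y = x**2 + rng.normal(0.0, 0.1, n)     # noisy function values
dy = 2 * x + rng.normal(0.0, 0.1, n)   # noisy samplewise derivatives

# Polynomial basis [1, x, x^2, x^3] and its exact derivative basis.
X = np.vstack([np.ones(n), x, x**2, x**3]).T
dX = np.vstack([np.zeros(n), np.ones(n), 2 * x, 3 * x**2]).T
lam = 1.0                              # weight on the derivative residuals

# Closed-form solution of min_w ||Xw - y||^2 + lam * ||dXw - dy||^2
w = np.linalg.solve(X.T @ X + lam * dX.T @ dX, X.T @ y + lam * dX.T @ dy)
print(np.round(w, 2))                  # coefficient on x^2 should be near 1
```

The derivative information sharpens the fit without extra simulation paths, which is the sample-efficiency gain the abstract refers to; the parametric extension additionally includes model and contract parameters among the regressors.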
  20. By: Alper Deniz Karakas
    Abstract: This paper aims to reevaluate the Taylor Rule, through a linear and a nonlinear method, such that its estimated federal funds rates match those previously implemented by the Federal Reserve Bank. In the linear method, this paper uses an OLS regression model to find more accurate coefficients within the same Taylor Rule equation, in which the dependent variable is the federal funds rate and the independent variables are the inflation rate, the inflation gap, and the output gap. The intercept in the OLS regression model captures the constant equilibrium target real interest rate, set at 2. The linear OLS method suggests that the Taylor Rule overestimates the coefficients on the output gap and the standalone inflation rate; the coefficients this paper suggests are shown in equation (2). In the nonlinear method, this paper uses a machine learning system in which the two inputs are the inflation rate and the output gap and the output is the federal funds rate. This system uses gradient descent error minimization to create a model that minimizes the error between the estimated and the actual previously implemented federal funds rates. Since the machine learning system allows the model to capture the more realistic nonlinear relationship between the variables, it significantly increases estimation accuracy. The actual and estimated federal funds rates are almost identical apart from three recessions caused by bubble bursts, which the paper addresses in the concluding remarks. Overall, the first method provides theoretical insight while the second suggests a model with improved applicability.
    Date: 2023–02
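    The linear step above is a standard OLS exercise. A sketch on simulated data (not the Fed series the paper uses): since the inflation gap is inflation minus a constant target, it is collinear with the intercept and the inflation level, so this illustration estimates the reduced form i = c + b1*pi + b2*ygap, whose textbook Taylor-rule values are c = 1, b1 = 1.5, b2 = 0.5 when both the inflation target and the equilibrium real rate are 2%.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
inflation = rng.normal(2.5, 1.0, n)
output_gap = rng.normal(0.0, 1.5, n)

# Generate rates from the textbook reduced-form Taylor rule plus noise:
# i = 2 + pi + 0.5*(pi - 2) + 0.5*ygap  =  1 + 1.5*pi + 0.5*ygap
ffr = 1.0 + 1.5 * inflation + 0.5 * output_gap + rng.normal(0.0, 0.25, n)

X = np.column_stack([np.ones(n), inflation, output_gap])
coef, *_ = np.linalg.lstsq(X, ffr, rcond=None)
print(np.round(coef, 2))   # approximately [1.0, 1.5, 0.5]
```

On real data the estimated coefficients need not match the textbook values, which is precisely the reevaluation the paper undertakes before turning to the nonlinear, gradient-descent-fitted model.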
  21. By: Massimiliano Marcellino (Bocconi University, IGIER, Baffi-Carefin, BIDSA and CEPR); Dalibor Stevanovic (University of Quebec in Montreal and CIRANO)
    Abstract: In this article we study how the demand and supply of information about inflation affect inflation developments. As a proxy for the demand for information, we extract Google Trends (GT) data for keywords such as "inflation", "inflation rate", or "price increase". The rationale is that when agents are more interested in inflation, they should search for information about it, and Google is by now a natural source. As a proxy for the supply of information about inflation, we instead use an indicator based on a (standardized) count of the Wall Street Journal (WSJ) articles containing the word "inflat" in their title. We find that measures of the demand (GT) and supply (WSJ) of inflation information play a relevant role in understanding and predicting actual inflation developments, with the more granular information improving expectation formation, especially during periods when inflation is very high or low. In particular, the full information rational expectation hypothesis is rejected, suggesting that some informational rigidities exist and are waiting to be exploited. Contrary to the existing evidence, we conclude that media communication and agents' attention do play an important role for aggregate inflation expectations, and this remains valid when controlling for Fed communications.
    Keywords: Inflation, Expectations, Google trends, Text analysis
    JEL: C53 C83 D83 D84 E31 E37
    Date: 2022–08
  22. By: Daas, Piet (Eindhoven University of Technology); Hassink, Wolter (Utrecht University); Klijs, Bart (Statistics Netherlands)
    Abstract: A statistical classification model was developed to identify online platform organizations based on the texts on their websites. The model was subsequently used to identify all (potential) platform organizations with a website included in the Dutch Business Register. The empirical outcomes of the statistical model were plausible in terms of the words and the bimodal distribution of fitted probabilities, but the results indicated an overestimation of the number of platform organizations. Next, the external validity of the outcomes was investigated through a survey of the organizations that the statistical classification model identified as platform organizations. The organizations' responses to the survey confirmed a substantial number of type-I errors. Furthermore, they revealed a positive association between the fitted probability of the text-based classification model and the organization's response to the survey question on being an online platform organization. The survey results indicated that the text-based classification model can be used to obtain a subpopulation of potential platform organizations from the entire population of businesses with a website.
    Keywords: online platform organizations, external validation, type-I error, machine learning, web pages
    JEL: C81 C83 D20 D83 L20
    Date: 2023–02
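    A minimal sketch of this kind of text-based classifier, using TF-IDF features and logistic regression as stand-ins for the paper's model (the paper does not specify these choices). The example website texts and labels below are invented; the paper's model is trained on websites from the Dutch Business Register.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented stand-ins for scraped website texts and platform labels.
    texts = [
        "book a ride and connect with drivers on our platform",
        "order food online from local restaurants through our app",
        "family bakery offering fresh bread and pastries daily",
        "plumbing services for homes and offices since 1985",
        "marketplace connecting freelancers with clients worldwide",
        "law firm providing corporate legal advice",
    ]
    is_platform = [1, 1, 0, 0, 1, 0]

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, is_platform)

    # The paper inspects the distribution of fitted probabilities; a bimodal
    # shape is reassuring, while mid-range values flag candidate type-I
    # errors worth validating externally (here, via a survey).
    probs = model.predict_proba(texts)[:, 1]
    ```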
  23. By: Wee Ling Tan; Stephen Roberts; Stefan Zohren
    Abstract: We introduce Spatio-Temporal Momentum strategies, a class of models that unify both time-series and cross-sectional momentum strategies by trading assets based on their cross-sectional momentum features over time. While both time-series and cross-sectional momentum strategies are designed to systematically capture momentum risk premia, these strategies are regarded as distinct implementations and do not consider the concurrent relationship and predictability between temporal and cross-sectional momentum features of different assets. We model spatio-temporal momentum with neural networks of varying complexities and demonstrate that a simple neural network with only a single fully connected layer learns to simultaneously generate trading signals for all assets in a portfolio by incorporating both their time-series and cross-sectional momentum features. Backtesting on portfolios of 46 actively traded US equities and 12 equity index futures contracts, we demonstrate that the model is able to retain its performance over benchmarks in the presence of high transaction costs of up to 5-10 basis points. In particular, we find that the model, when coupled with least absolute shrinkage and turnover regularization, results in the best performance over various transaction cost scenarios.
    Date: 2023–02
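    The single-fully-connected-layer model described above can be sketched as follows. The weights here are random placeholders rather than parameters trained on a performance objective as in the paper, and the feature layout (four momentum features per asset, flattened into one cross-sectional vector) is an assumption for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    n_assets, n_feats = 10, 4   # e.g. 1m/3m/6m/12m momentum features per asset

    # Momentum features of the whole cross-section at one time step, flattened
    # so the layer sees every asset's features jointly.
    x = rng.standard_normal(n_assets * n_feats)

    # One fully connected layer maps the joint feature vector to a position
    # for every asset simultaneously (placeholder weights; in the paper they
    # are trained end-to-end).
    W = 0.1 * rng.standard_normal((n_assets, n_assets * n_feats))
    b = np.zeros(n_assets)
    signals = np.tanh(W @ x + b)   # one bounded trading signal per asset
    ```

    Because every asset's signal depends on all assets' features, the single layer captures cross-sectional structure that separate per-asset time-series models cannot.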
  24. By: Vasilis Syrgkanis; Ruohan Zhan
    Abstract: We consider estimation and inference with data collected from episodic reinforcement learning (RL) algorithms, i.e., adaptive experimentation algorithms that, in each period (or episode), interact multiple times in a sequential manner with a single treated unit. Our goal is to be able to evaluate counterfactual adaptive policies after data collection and to estimate structural parameters such as dynamic treatment effects, which can be used for credit assignment (e.g., what was the effect of the first-period action on the final outcome). Such parameters of interest can be framed as solutions to moment equations, but not minimizers of a population loss function, leading to Z-estimation approaches in the case of static data. However, such estimators fail to be asymptotically normal in the case of adaptive data collection. We propose a re-weighted Z-estimation approach with carefully designed adaptive weights to stabilize the episode-varying estimation variance, which results from the nonstationary policy that typical episodic RL algorithms invoke. We identify proper weighting schemes to restore the consistency and asymptotic normality of the re-weighted Z-estimators for target parameters, which allows for hypothesis testing and constructing reliable confidence regions for target parameters of interest. Primary applications include dynamic treatment effect estimation and dynamic off-policy evaluation.
    Date: 2023–02
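    The variance-stabilizing idea behind the re-weighted Z-estimator can be illustrated with a toy example: i.i.d. draws with episode-varying noise rather than actual RL data, and simple inverse-standard-deviation weights rather than the paper's carefully designed adaptive weights.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    theta_true = 2.0

    # Simulated episodes: the nonstationary policy makes the noise scale of
    # the moment m(x, theta) = x - theta vary across episodes.
    sigmas = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
    episodes = [theta_true + s * rng.standard_normal(200) for s in sigmas]

    # Unweighted Z-estimator: solve sum_i (x_i - theta) = 0, i.e. the plain
    # mean, which is dominated by the high-variance episodes.
    theta_unweighted = np.mean(np.concatenate(episodes))

    # Re-weighted Z-estimator: weight each episode's moment by the inverse of
    # its estimated standard deviation to stabilize the episode-varying
    # variance, solving sum_e w_e * (mean_e - theta) = 0.
    w = np.array([1.0 / ep.std() for ep in episodes])
    means = np.array([ep.mean() for ep in episodes])
    theta_weighted = np.sum(w * means) / np.sum(w)
    ```

    The weighted solution down-weights noisy late episodes, which is what restores the estimator's stable (asymptotically normal) behavior in the paper's adaptive setting.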
  25. By: Ivan Guo; Nicolas Langrené; Jiahao Wu
    Abstract: In this paper, we introduce two methods to solve the American-style option pricing problem and its dual form at the same time using neural networks. Without applying nested Monte Carlo, the first method uses a series of neural networks to simultaneously compute both the lower and upper bounds of the option price, and the second one accomplishes the same goal with one global network. The avoidance of extra simulations and the use of neural networks significantly reduce the computational complexity and allow us to price Bermudan options with frequent exercise opportunities in high dimensions, as illustrated by the provided numerical experiments. As a by-product, these methods also derive a hedging strategy for the option, which can also be used as a control variate for variance reduction.
    Date: 2023–02
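    For context, a classical baseline that the neural lower-bound methods build on is Longstaff-Schwartz regression: estimate the continuation value by least squares and exercise when the immediate payoff exceeds it. The sketch below prices a Bermudan put with a quadratic polynomial in place of the paper's neural networks; all parameter values are invented, and the paper's dual (upper-bound) computation is omitted.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
    n_steps, n_paths = 12, 20000          # monthly exercise dates
    dt = T / n_steps
    disc = np.exp(-r * dt)

    # Simulate geometric Brownian motion paths under the risk-neutral measure.
    z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=1))

    # Backward induction: regress continuation values, exercise if better.
    payoff = np.maximum(K - S[:, -1], 0.0)        # value at maturity
    for t in range(n_steps - 2, -1, -1):
        payoff *= disc                            # discount one step back
        itm = (K - S[:, t]) > 0                   # regress only in-the-money
        if itm.sum() > 10:
            X = np.vander(S[itm, t], 3)           # quadratic basis
            coef, *_ = np.linalg.lstsq(X, payoff[itm], rcond=None)
            cont = X @ coef                       # estimated continuation value
            exercise = (K - S[itm, t]) > cont
            payoff[itm] = np.where(exercise, K - S[itm, t], payoff[itm])

    price = disc * payoff.mean()                  # lower-bound estimate
    ```

    Replacing the regression with a neural network, as the paper does, avoids choosing a basis and scales to high dimensions; the dual formulation then turns the same machinery into an upper bound.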

This nep-big issue is ©2023 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at <>. For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.