on Big Data |
By: | Hanemaaijer, Kyra (Erasmus University Rotterdam); Marie, Olivier (Erasmus University Rotterdam); Musumeci, Marco (Erasmus University Rotterdam) |
Abstract: | What are the consequences of religious obligations conflicting with civic duties? We investigate this question by evaluating changes in the performance of practicing Muslim students when end-of-secondary-school exams and Ramadan overlapped in the Netherlands. Using administrative data on exam takers and a machine learning model to individually predict fasting probability, we estimate that the grades and pass rate of compliers dropped significantly. This negative impact was especially strong for low achievers and those from religiously segregated schools. Investigating mechanisms, we find evidence that not being able to sleep in the morning before an afternoon exam was particularly detrimental to performance. |
Keywords: | religion, productivity, Ramadan, education, The Netherlands |
JEL: | I2 I24 Z12 J15 |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:iza:izadps:dp16249&r=big |
By: | Wassim Le Lann (UO - Université d'Orléans, LEO - Laboratoire d'Économie d'Orleans [2022-...] - UO - Université d'Orléans - UT - Université de Tours - UCA - Université Clermont Auvergne); Gauthier Delozière (CMB - Centre Marc Bloch - MEAE - Ministère de l'Europe et des Affaires étrangères - Bundesministerium für Bildung und Forschung - M.E.N.E.S.R. - Ministère de l'Education nationale, de l’Enseignement supérieur et de la Recherche - CNRS - Centre National de la Recherche Scientifique, EdD - École de Droit de Sciences Po (Sciences Po) - Sciences Po - Sciences Po); Yann Le Lann (Université de Lille, CERIES) |
Abstract: | In times of global ecological crisis, the responsibility of large corporations in environmental degradation is increasingly pointed out. As a result, there has been a surge in private organizations' pledges to reduce their environmental impact in recent years. In this paper, we demonstrate that companies with poor environmental responsibility have incentives to take such pledges to maintain their ability to attract high-skilled human capital. Through a case study on a French climate movement which was initiated by elite students who threatened to boycott job offers from polluting employers, we find that environmental pledges can significantly attenuate this selection effect. Using a unique and large survey database on the climate movement participants (n=2307) and machine learning classifiers, we find that individuals who initially intended to refuse a job offer from a polluting employer were, on average, three times less likely to hold such intentions after being exposed to a corporate environmental pledge. This result can be explained by the fact that intentions to refuse to work for polluting companies, and reactions to environmental pledges are driven by different factors. Furthermore, we find substantial heterogeneity in the response to environmental pledges, which is primarily explained by career perspectives, beliefs about the ecological crisis and support for radical political action in the name of ecology. |
Keywords: | Climate movement, Greenwashing, Human capital, Organizational behavior, Labor market, Machine Learning |
Date: | 2023–06–24 |
URL: | http://d.repec.org/n?u=RePEc:hal:spmain:hal-04140191&r=big |
By: | Wang, Zuyi; Tejeda, Hernan A.; Kim, Man-Keun |
Keywords: | Agribusiness, Marketing, Research Methods/Statistical Methods |
Date: | 2023 |
URL: | http://d.repec.org/n?u=RePEc:ags:aaea22:335521&r=big |
By: | Evelina Gavrilova; Audun Langørgen; Floris T. Zoutman |
Abstract: | This paper develops a machine-learning method that allows researchers to estimate heterogeneous treatment effects with panel data in a setting with many covariates. Our method, which we name the dynamic causal forest (DCF) method, extends the causal-forest method of Wager and Athey (2018) by allowing for the estimation of dynamic treatment effects in a difference-in-difference setting. Regular causal forests require conditional independence to consistently estimate heterogeneous treatment effects. In contrast, DCFs provide a consistent estimate for heterogeneous treatment effects under the weaker assumption of parallel trends. DCFs can be used to create event-study plots which aid in the inspection of pre-trends and treatment effect dynamics. We provide an empirical application, where DCFs are applied to estimate the incidence of payroll tax on wages paid to employees. We consider treatment effect heterogeneity associated with personal- and firm-level variables. We find that on average the incidence of the tax is shifted onto workers through incidental payments, rather than contracted wages. Heterogeneity is mainly explained by firm- and workforce-level variables. Firms with a large and heterogeneous workforce are most effective in passing on the incidence of the tax to workers. |
Keywords: | causal forest, treatment effect heterogeneity, payroll tax incidence, administrative data |
JEL: | C18 H22 J31 M54 |
Date: | 2023 |
URL: | http://d.repec.org/n?u=RePEc:ces:ceswps:_10532&r=big |
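As a reading aid for the parallel-trends assumption mentioned in the abstract above, the following minimal sketch computes the textbook two-group, two-period difference-in-differences estimate. It is not the dynamic causal forest method itself. The function name `did_estimate` and the wage figures are hypothetical and chosen only for illustration.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate.

    Under parallel trends, the change in the control group stands in for
    the change the treated group would have seen without treatment.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcomes (for example, wages) before and after a reform.
effect = did_estimate(
    treated_pre=[10, 12],
    treated_post=[15, 17],
    control_pre=[8, 10],
    control_post=[10, 12],
)
print(effect)
```

An event-study plot repeats this comparison period by period, so that estimates for pre-treatment periods can be inspected for pre-trends.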
By: | Thibault Collin (Université Paris Dauphine-PSL - PSL - Université Paris sciences et lettres) |
Abstract: | This thesis studies the application of artificial neural networks to hedging rainbow options. Because of their inherently complex features, such as the correlated paths taken by the prices of their underlying assets or their absence from traded markets, finding an optimal hedging strategy for rainbow options is difficult, and traders usually have to resort to models and methods they know are inaccurate. An alternative approach based on deep learning has recently surfaced for hedging vanilla options [6], and researchers have started to see potential in neural networks for options with exotic features [5], [12], [22]. The key to a near-perfect hedge for contingent claims might lie in the training of neural network algorithms [6]. This research investigates how those hedging techniques can be extended to rainbow options [22], using recent research [21], and compares the results with those of the models and techniques currently used by traders, such as Monte-Carlo path simulations. To that end, we try to develop an algorithm that designs an optimal hedging strategy for rainbow options, using intuition developed for hedging vanilla options [21] and pricing exotics [5]. Although past literature has shown the approach to be potentially efficient and cost-effective, the opaque nature of an artificial neural network makes it difficult to fully trust the deep learning algorithm as the sole hedging method; it is more likely to serve as an additional technique alongside other, more reliable models. |
Keywords: | Quantitative finance, deep hedging, deep learning, machine learning, rainbow options, call options, call worst-of options, black scholes, geometric brownian motion |
Date: | 2023–06–04 |
URL: | http://d.repec.org/n?u=RePEc:hal:wpaper:hal-04060013&r=big |
By: | Ilias Chronopoulos; Katerina Chrysikou; George Kapetanios; James Mitchell; Aristeidis Raftapostolos |
Abstract: | In this paper we study neural networks and their approximating power in panel data models. We provide asymptotic guarantees on deep feed-forward neural network estimation of the conditional mean, building on the work of Farrell et al. (2021), and explore latent patterns in the cross-section. We use the proposed estimators to forecast the progression of new COVID-19 cases across the G7 countries during the pandemic. We find significant forecasting gains over both linear panel and nonlinear time-series models. Containment or lockdown policies, as instigated at the national level by governments, are found to have out-of-sample predictive power for new COVID-19 cases. We illustrate how the use of partial derivatives can help open the “black box” of neural networks and facilitate semi-structural analysis: school and workplace closures are found to have been effective policies at restricting the progression of the pandemic across the G7 countries. But our methods illustrate significant heterogeneity and time variation in the effectiveness of specific containment policies. |
Keywords: | Machine Learning; Neural Networks; Panel Data; Nonlinearity; Forecasting; COVID-19; Policy Interventions |
JEL: | C33 C45 |
Date: | 2023–07–05 |
URL: | http://d.repec.org/n?u=RePEc:fip:fedcwq:96408&r=big |
By: | Xinli Yu; Zheng Chen; Yuan Ling; Shujing Dong; Zongyi Liu; Yanbin Lu |
Abstract: | This paper presents a novel study on harnessing Large Language Models' (LLMs) outstanding knowledge and reasoning abilities for explainable financial time series forecasting. The application of machine learning models to financial time series comes with several challenges, including the difficulty in cross-sequence reasoning and inference, the hurdle of incorporating multi-modal signals from historical news, financial knowledge graphs, etc., and the issue of interpreting and explaining the model results. In this paper, we focus on NASDAQ-100 stocks, making use of publicly accessible historical stock price data, company metadata, and historical economic/financial news. We conduct experiments to illustrate the potential of LLMs in offering a unified solution to the aforementioned challenges. Our experiments include trying zero-shot/few-shot inference with GPT-4 and instruction-based fine-tuning with a public LLM model Open LLaMA. We demonstrate our approach outperforms a few baselines, including the widely applied classic ARMA-GARCH model and a gradient-boosting tree model. Through the performance comparison results and a few examples, we find LLMs can make a well-thought decision by reasoning over information from both textual news and price time series and extracting insights, leveraging cross-sequence information, and utilizing the inherent knowledge embedded within the LLM. Additionally, we show that a publicly available LLM such as Open-LLaMA, after fine-tuning, can comprehend the instruction to generate explainable forecasts and achieve reasonable performance, albeit relatively inferior in comparison to GPT-4. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.11025&r=big |
By: | Denis Koshelev (Bank of Russia, Russian Federation); Alexey Ponomarenko (Bank of Russia, Russian Federation); Sergei Seleznev (Bank of Russia, Russian Federation) |
Abstract: | In this paper, we propose a new procedure for unconditional and conditional forecasting in agent-based models. The proposed algorithm is based on the application of amortized neural networks and consists of two steps. The first step simulates artificial datasets from the model. In the second step, a neural network is trained to predict the future values of the variables using the history of observations. The main advantage of the proposed algorithm is its speed. This is due to the fact that, after the training procedure, it can be used to yield predictions for almost any data without additional simulations or the re-estimation of the neural network. |
Keywords: | agent-based models, amortized simulation-based inference, Bayesian models, forecasting, neural networks |
JEL: | C11 C15 C32 C45 C53 C63 |
Date: | 2023–07 |
URL: | http://d.repec.org/n?u=RePEc:bkr:wpaper:wps115&r=big |
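The two-step idea in the abstract above (simulate artificial datasets, then train a predictor once and reuse it) can be sketched as follows. This is only an illustration under strong assumptions: a toy AR(1) process with a randomly drawn persistence parameter stands in for the agent-based model, and a one-coefficient least-squares predictor stands in for the neural network. All function names and parameter ranges are hypothetical.

```python
import random


def simulate_series(rng, length=30):
    """Step 1: draw a parameter and simulate one artificial dataset.

    Returns the observed history and the next value to be predicted.
    """
    phi = rng.uniform(0.2, 0.8)  # assumed prior over the toy parameter
    x = 0.0
    series = []
    for _ in range(length + 1):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series[:-1], series[-1]


def train_predictor(n_sims=5000, seed=0):
    """Step 2: fit a predictor on simulated (history, future) pairs.

    A least-squares slope through the origin on the last observation
    replaces the neural network to keep the sketch dependency-free.
    """
    rng = random.Random(seed)
    sxx = 0.0
    sxy = 0.0
    for _ in range(n_sims):
        history, target = simulate_series(rng)
        sxx += history[-1] ** 2
        sxy += history[-1] * target
    return sxy / sxx


def forecast(slope, history):
    """Predict the next value for any history without new simulations."""
    return slope * history[-1]


slope = train_predictor()
prediction = forecast(slope, [0.3, -0.1, 1.2])
print(slope, prediction)
```

Once `train_predictor` has run, `forecast` costs a single multiplication per dataset, which mirrors the speed advantage the abstract attributes to amortization.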
By: | Kai Feng; Han Hong; Ke Tang; Jingyuan Wang |
Abstract: | This paper proposes a statistical framework with which artificial intelligence can improve human decision making. The performance of each human decision maker is first benchmarked against machine predictions; we then replace the decisions made by a subset of the decision makers with the recommendation from the proposed artificial intelligence algorithm. Using a large nationwide dataset of pregnancy outcomes and doctor diagnoses from prepregnancy checkups of reproductive age couples, we experimented with both a heuristic frequentist approach and a Bayesian posterior loss function approach with an application to abnormal birth detection. We find that our algorithm on a test dataset results in a higher overall true positive rate and a lower false positive rate than the diagnoses made by doctors only. We also find that the diagnoses of doctors from rural areas are more frequently replaceable, suggesting that artificial intelligence assisted decision making tends to improve precision more in less developed regions. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.11689&r=big |
By: | Gavrilova, Evelina (Dept. of Business and Management Science, Norwegian School of Economics); Langørgen, Audun (Statistics Norway); Zoutman, Floris T. (Dept. of Business and Management Science, Norwegian School of Economics) |
Abstract: | This paper develops a machine-learning method that allows researchers to estimate heterogeneous treatment effects with panel data in a setting with many covariates. Our method, which we name the dynamic causal forest (DCF) method, extends the causal-forest method of Wager and Athey (2018) by allowing for the estimation of dynamic treatment effects in a difference-in-difference setting. Regular causal forests require conditional independence to consistently estimate heterogeneous treatment effects. In contrast, DCFs provide a consistent estimate for heterogeneous treatment effects under the weaker assumption of parallel trends. DCFs can be used to create event-study plots which aid in the inspection of pre-trends and treatment effect dynamics. We provide an empirical application, where DCFs are applied to estimate the incidence of payroll tax on wages paid to employees. We consider treatment effect heterogeneity associated with personal- and firm-level variables. We find that on average the incidence of the tax is shifted onto workers through incidental payments, rather than contracted wages. Heterogeneity is mainly explained by firm- and workforce-level variables. Firms with a large and heterogeneous workforce are most effective in passing on the incidence of the tax to workers. |
Keywords: | Causal Forest; Treatment Effect Heterogeneity; Payroll Tax Incidence; Administrative Data |
JEL: | C18 H22 J31 M54 |
Date: | 2023–06–29 |
URL: | http://d.repec.org/n?u=RePEc:hhs:nhhfms:2023_009&r=big |
By: | Cassidy K. Buhler; Hande Y. Benson |
Abstract: | The Markowitz mean-variance portfolio optimization model aims to balance expected return and risk when investing. However, there is a significant limitation when solving large portfolio optimization problems efficiently: the large and dense covariance matrix. Since portfolio performance can be potentially improved by considering a wider range of investments, it is imperative to be able to solve large portfolio optimization problems efficiently, typically in microseconds. We propose dimension reduction and increased sparsity as remedies for the covariance matrix. The size reduction is based on predictions from machine learning techniques and the solution to a linear programming problem. We find that using the efficient frontier from the linear formulation is much better at predicting the assets on the Markowitz efficient frontier, compared to the predictions from neural networks. Reducing the covariance matrix based on these predictions decreases both runtime and total iterations. We also present a technique to sparsify the covariance matrix such that it preserves positive semi-definiteness, which improves runtime per iteration. The methods we discuss all achieved similar portfolio expected risk and return as we would obtain from a full dense covariance matrix but with improved optimizer performance. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.12639&r=big |
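To make the role of the covariance matrix in the abstract above concrete, here is a minimal sketch of the mean-variance quantities involved. It is not the authors' dimension-reduction or sparsification procedure. The function names and the two-asset numbers are hypothetical, and short selling is allowed because no sign constraint is placed on the weights.

```python
def portfolio_stats(weights, mu, cov):
    """Expected return w'mu and variance w'Sigma w of a portfolio.

    The double loop touches every entry of the covariance matrix, which is
    why a large dense matrix dominates the cost of each evaluation.
    """
    n = len(weights)
    expected_return = sum(w * m for w, m in zip(weights, mu))
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return expected_return, variance


def min_variance_two_assets(cov):
    """Closed-form minimum-variance weights for two assets summing to one."""
    s11, s22, s12 = cov[0][0], cov[1][1], cov[0][1]
    w1 = (s22 - s12) / (s11 + s22 - 2.0 * s12)
    return [w1, 1.0 - w1]


mu = [0.08, 0.05]                  # hypothetical expected returns
cov = [[0.04, 0.0], [0.0, 0.01]]   # hypothetical covariance matrix
weights = min_variance_two_assets(cov)
expected_return, variance = portfolio_stats(weights, mu, cov)
print(weights, expected_return, variance)
```

With zero entries in the covariance matrix, the corresponding terms of the double sum vanish, which gives some intuition for why a sparser matrix can reduce the work per optimizer iteration.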
By: | Marc-André Gosselin; Temel Taskin |
Abstract: | We construct new indicators of the imbalance between demand and supply for the Canadian economy by using natural language processing techniques to analyze earnings calls of publicly listed firms. The results show that the text-based indicators are highly correlated with official inflation data and estimates of the output gap and improve the accuracy of inflation forecasts. This suggests that these indicators could help central banks foresee inflationary pressures in the economy. Our examination of other topics in earnings calls, such as supply chain disruptions and capacity constraints, points to the potential benefits of using textual data to quickly draw insights on a range of relevant topics. We conclude that text-based measures of economic slack should be included in central banks’ monitoring and forecasting toolkits. |
Keywords: | Central bank research; Domestic demand and components; Econometric and statistical methods; Inflation and prices; Potential output |
JEL: | C1 C3 E3 E5 |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:bca:bocadp:23-13&r=big |
By: | Carlo Drago (University of Niccolò Cusano); Loris Di Nallo (University of Cassino e del Lazio Meridionale); Maria Lucetta Russotto (University of Firenze) |
Abstract: | Promoting social information reporting and disclosure can foster sustainable banking. The paper measures banking social sustainability by constructing a new interval-based composite indicator using the Thomson Reuters database. We propose an approach to constructing interval-based composite indicators that substantially improves the construction of the composite indicator by allowing us to measure the uncertainty due to the choices made in its design. The methodological approach is based on a Monte-Carlo simulation and increases the information the composite indicators can convey. We thus measure the value of the social indicator and its subcomponents, together with the uncertainty in that value due to the different possible weights. The results show that the best international ESG practices among European banks are found in French and United Kingdom banks, more so than in Italian banks. Finally, we analyze innovative perspectives and propose policy recommendations, considering the growing attention to the issue of ESG disclosure and its adherence to reality, to support sustainable banking ecosystems. |
Keywords: | Social Index, Sustainable Banking, ESG, Monte-Carlo Simulation, Machine Learning, Interval-based Composite Indicators |
JEL: | G21 Q5 C02 C15 C43 C63 |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:fem:femwpa:2023.13&r=big |
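The notion of an interval-based composite indicator obtained by Monte-Carlo simulation over weights, as described in the abstract above, can be illustrated with a minimal sketch. It assumes weights drawn uniformly on the simplex and a 5%-95% interval; neither choice is taken from the paper. The bank names, sub-indicator scores, and function name are hypothetical.

```python
import random


def interval_composite(scores, n_draws=2000, seed=0):
    """Return (low, centre, high) of a weighted composite indicator.

    Each draw samples non-negative weights that sum to one (normalised
    exponential draws, i.e. a flat Dirichlet) and records the weighted
    sum of the sub-indicator scores.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_draws):
        raw = [rng.expovariate(1.0) for _ in scores]
        total = sum(raw)
        weights = [r / total for r in raw]
        values.append(sum(w * s for w, s in zip(weights, scores)))
    values.sort()
    low = values[int(0.05 * (n_draws - 1))]
    high = values[int(0.95 * (n_draws - 1))]
    centre = sum(values) / n_draws
    return low, centre, high


# Hypothetical normalised sub-indicator scores for two banks.
bank_scores = {"bank_a": [0.9, 0.4, 0.7], "bank_b": [0.6, 0.6, 0.6]}
intervals = {name: interval_composite(s) for name, s in bank_scores.items()}
for name, (low, centre, high) in intervals.items():
    print(name, round(low, 3), round(centre, 3), round(high, 3))
```

A bank whose sub-indicators are all equal gets a degenerate interval, because no choice of weights changes its score; a wide interval signals that the ranking depends heavily on the weighting scheme.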
By: | Zhi Su; Danni Wu; Zhenkun Zhou; Junran Wu; Libo Yin |
Abstract: | This paper investigates the significance of consumer opinions in relation to value in China's A-share market. By analyzing a large dataset comprising over 18 million product reviews by customers on JD.com, we demonstrate that sentiments expressed in consumer reviews can influence stock returns, indicating that consumer opinions contain valuable information that can impact the stock market. Our findings show that Customer Negative Sentiment Tendency (CNST) and One-Star Tendency (OST) have a negative effect on expected stock returns, even after controlling for firm characteristics such as market risk, illiquidity, idiosyncratic volatility, and asset growth. Further analysis reveals that the predictive power of CNST is stronger in firms with high sentiment conditions, growth companies, and firms with lower accounting transparency. We also find that CNST negatively predicts revenue surprises, earnings surprises, and cash flow shocks. These results suggest that online satisfaction derived from big data analysis of customer reviews contains novel information about firms' fundamentals. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.12119&r=big |
By: | Fabio Baschetti (Scuola Normale Superiore); Giacomo Bormetti (University of Bologna); Pietro Rossi (University of Bologna; Prometeia S.p.A) |
Abstract: | We propose a neural network-based approach to calibrating stochastic volatility models, which combines the pioneering grid approach by Horvath et al. (2021) with the pointwise two-stage calibration of Bayer and Stemper (2018). Our methodology inherits robustness from the former while not suffering from the need for interpolation/extrapolation techniques, a clear advantage ensured by the pointwise approach. The crucial point to the entire procedure is the generation of implied volatility surfaces on random grids, which one dispenses to the network in the training phase. We support the validity of our calibration technique with several empirical and Monte Carlo experiments for the rough Bergomi and Heston models under a simple but effective parametrization of the forward variance curve. The approach paves the way for valuable applications in financial engineering - for instance, pricing under local stochastic volatility models - and extensions to the fast-growing field of path-dependent volatility models. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.11061&r=big |
By: | Drago, Carlo; Di Nallo, Loris; Russotto, Maria Lucetta |
Abstract: | Promoting social information reporting and disclosure can foster sustainable banking. The paper measures banking social sustainability by constructing a new interval-based composite indicator using the Thomson Reuters database. We propose an approach to constructing interval-based composite indicators that substantially improves the construction of the composite indicator by allowing us to measure the uncertainty due to the choices made in its design. The methodological approach is based on a Monte-Carlo simulation and increases the information the composite indicators can convey. We thus measure the value of the social indicator and its subcomponents, together with the uncertainty in that value due to the different possible weights. The results show that the best international ESG practices among European banks are found in French and United Kingdom banks, more so than in Italian banks. Finally, we analyze innovative perspectives and propose policy recommendations, considering the growing attention to the issue of ESG disclosure and its adherence to reality, to support sustainable banking ecosystems. |
Keywords: | Financial Economics, Research Methods/Statistical Methods |
Date: | 2023–06–28 |
URL: | http://d.repec.org/n?u=RePEc:ags:feemwp:336986&r=big |
By: | Alex Kim; Maximilian Muhn; Valeri Nikolaev |
Abstract: | Generative AI tools such as ChatGPT can fundamentally change the way investors process information. We probe the economic usefulness of these tools in summarizing complex corporate disclosures using the stock market as a laboratory. The unconstrained summaries are dramatically shorter, often by more than 70% compared to the originals, whereas their information content is amplified. When a document has a positive (negative) sentiment, its summary becomes more positive (negative). More importantly, the summaries are more effective at explaining stock market reactions to the disclosed information. Motivated by these findings, we propose a measure of information "bloat." We show that bloated disclosure is associated with adverse capital markets consequences, such as lower price efficiency and higher information asymmetry. Finally, we show that the model is effective at constructing targeted summaries that identify firms' (non-)financial performance and risks. Collectively, our results indicate that generative language modeling adds considerable value for investors with information processing constraints. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.10224&r=big |
By: | Wassim Le Lann (UO - Université d'Orléans, LEO - Laboratoire d'Économie d'Orleans [2022-...] - UO - Université d'Orléans - UT - Université de Tours - UCA - Université Clermont Auvergne); Gauthier Delozière (CMB - Centre Marc Bloch - MEAE - Ministère de l'Europe et des Affaires étrangères - Bundesministerium für Bildung und Forschung - M.E.N.E.S.R. - Ministère de l'Education nationale, de l’Enseignement supérieur et de la Recherche - CNRS - Centre National de la Recherche Scientifique, EdD - École de Droit de Sciences Po (Sciences Po) - Sciences Po - Sciences Po); Yann Le Lann (Université de Lille, CERIES) |
Abstract: | In times of global ecological crisis, the responsibility of large corporations in environmental degradation is increasingly pointed out. As a result, there has been a surge in private organizations' pledges to reduce their environmental impact in recent years. In this paper, we demonstrate that companies with poor environmental responsibility have incentives to take such pledges to maintain their ability to attract high-skilled human capital. Through a case study on a French climate movement which was initiated by elite students who threatened to boycott job offers from polluting employers, we find that environmental pledges can significantly attenuate this selection effect. Using a unique and large survey database on the climate movement participants (n=2307) and machine learning classifiers, we find that individuals who initially intended to refuse a job offer from a polluting employer were, on average, three times less likely to hold such intentions after being exposed to a corporate environmental pledge. This result can be explained by the fact that intentions to refuse to work for polluting companies, and reactions to environmental pledges are driven by different factors. Furthermore, we find substantial heterogeneity in the response to environmental pledges, which is primarily explained by career perspectives, beliefs about the ecological crisis and support for radical political action in the name of ecology. |
Keywords: | Climate movement, Greenwashing, Human capital, Organizational behavior, Labor market, Machine Learning |
Date: | 2023–06–24 |
URL: | http://d.repec.org/n?u=RePEc:hal:wpaper:hal-04140191&r=big |
By: | Simon Montfort |
Abstract: | Public support and political mobilization are two crucial factors for the adoption of ambitious climate policies in line with the international greenhouse gas reduction targets of the Paris Agreement. Despite their compound importance, they are mainly studied separately. Using a random forest machine-learning model, this article investigates the relative predictive power of key established explanations for public support and mobilization for climate policies. Predictive models may shape future research priorities and contribute to theoretical advancement by showing which predictors are the most and least important. The analysis is based on a pre-election conjoint survey experiment on the Swiss CO2 Act in 2021. Results indicate that beliefs (such as the perceived effectiveness of policies) and policy design preferences (such as for subsidies or tax-related policies) are the most important predictors while other established explanations, such as socio-demographics, issue salience (the relative importance of issues) or political variables (such as the party affiliation) have relatively weak predictive power. Thus, beliefs are an essential factor to consider in addition to explanations that emphasize issue salience and preferences driven by voters' cost-benefit considerations. |
Date: | 2023–06 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2306.10144&r=big |