on Big Data |
By: | Ashesh Rambachan; Rahul Singh; Davide Viviano |
Abstract: | While traditional program evaluations typically rely on surveys to measure outcomes, certain economic outcomes such as living standards or environmental quality may be infeasible or costly to collect. As a result, recent empirical work estimates treatment effects using remotely sensed variables (RSVs), such as mobile phone activity or satellite images, instead of ground-truth outcome measurements. Common practice predicts the economic outcome from the RSV, using an auxiliary sample of labeled RSVs, and then uses such predictions as the outcome in the experiment. We prove that this approach leads to biased estimates of treatment effects when the RSV is a post-outcome variable. We nonparametrically identify the treatment effect, using an assumption that reflects the logic of recent empirical research: the conditional distribution of the RSV remains stable across both samples, given the outcome and treatment. Our results do not require researchers to know or consistently estimate the relationship between the RSV, outcome, and treatment, which is typically mis-specified with unstructured data. We form a representation of the RSV for downstream causal inference by predicting the outcome and predicting the treatment, with better predictions leading to more precise causal estimates. We re-evaluate the efficacy of a large-scale public program in India, showing that the program's measured effects on local consumption and poverty can be replicated using satellite imagery. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.10959 |
By: | Mihnea Constantinescu (National Bank of Ukraine; University of Amsterdam); Kalle Kappner (Ludwig-Maximilians-Universität München); Nikodem Szumilo (University College London) |
Abstract: | We introduce the Warcast Index, an approach for estimating regional economic activity during periods of extreme uncertainty using publicly available data. We show that combining widely used correlates of economic activity – nightlight intensity, Google Trends, and Twitter activity – can improve the tracking of economic performance and even allow the approximation of monthly economic activity after extreme structural breaks, like war or occupation. We apply this approach to Ukraine during the 2022 war. Our findings show that combining multiple data sources not only improves tracking accuracy compared to single-correlate models, but also provides timely, transparent and flexible data for policy-making in situations where conventional economic data is unavailable or unreliable. We also contribute to the literature on wartime economics by providing a novel analysis of the economic effects of armed conflict with high frequency (monthly) and spatially granular (regional) data. |
Keywords: | estimating GDP, nowcasting GDP, wartime economics, nightlights, Google trends, Twitter data |
JEL: | B41 C82 E01 O11 |
Date: | 2024–09 |
URL: | https://d.repec.org/n?u=RePEc:ukb:wpaper:03/2024 |
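The index construction described above, combining standardized indicator series into a single activity measure, can be sketched in a few lines. This is a minimal illustration assuming equal (or user-supplied) weights over z-scored monthly series; the paper's actual weighting scheme and calibration against pre-war GDP are not reproduced here.

```python
from statistics import mean, pstdev

def zscore(series):
    """Standardize a series to mean 0 and (population) sd 1."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

def composite_index(indicators, weights=None):
    """Combine several equal-length monthly indicator series
    (e.g. nightlight intensity, search volume, tweet counts)
    into one activity index; weights default to equal."""
    k = len(indicators)
    n = len(indicators[0])
    weights = weights or [1.0 / k] * k
    std = [zscore(s) for s in indicators]
    return [sum(w * s[t] for w, s in zip(weights, std)) for t in range(n)]
```

Because each component is standardized first, a linearly rescaled copy of a series (e.g. the same signal in different units) contributes identically, which is the point of z-scoring before aggregation.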
By: | Etienne Briand (University of Quebec in Montreal); Massimiliano Marcellino (Bocconi University); Dalibor Stevanovic (University of Quebec in Montreal) |
Abstract: | We investigate the role of attention in shaping inflation dynamics. To measure the general public's attention, we utilize Google Trends (GT) data for keywords such as "inflation". For professional attention, we construct an indicator based on the standardized count of Wall Street Journal (WSJ) articles with "inflation" in their titles. Through empirical analysis, we show that attention significantly impacts inflation dynamics, even when accounting for traditional inflation-related factors. Macroeconomic theory suggests that expectations formation is a natural mechanism to explain these findings. We find support for this hypothesis by measuring a decrease in professional forecasters' information rigidity during periods of high attention. In contrast to prior research, our findings highlight the critical roles of media communication and public attention in shaping aggregate inflation expectations. We then develop a theoretical model that captures our stylized facts, showing that both inflation dynamics and forecaster expectations are regime-dependent. Finally, we examine the implications of this framework for the effectiveness of monetary policy. |
Keywords: | Inflation, Expectations, Monetary policy, Google trends, Text analysis |
JEL: | C53 C83 D83 D84 E31 E37 |
Date: | 2024–12 |
URL: | https://d.repec.org/n?u=RePEc:bbh:wpaper:24-05 |
By: | Karel Janda (Institute of Economic Studies, Faculty of Social Sciences, Charles University, Prague, Czech Republic & Department of Banking and Insurance, Faculty of Finance and Accounting, Prague University of Economics and Business, Czech Republic); Mathieu Petit (Institute of Economic Studies, Faculty of Social Sciences, Charles University, Prague, Czech Republic) |
Abstract: | This study addresses the economic rationale behind algorithmic trading in the Electric Vehicle (EV) sector, enhancing the interpretability of Q-learning agents. By integrating EV-specific data, such as Tesla's stock fundamentals and key supply chain players such as Albemarle and Panasonic Holdings Corporation, this paper uses a Q-Reinforcement Learning (Q-RL) framework to generate a profitable trading agent. The agent's decisions are analyzed and interpreted using a decision tree to reveal the influence of supply chain dynamics. Tested on a holdout period, the agent achieves monthly profitability above a 2% threshold. The agent shows sensitivity to supply chain instability and identifies potential disruptions impacting Tesla by treating supplier stock movements as proxies for broader economic and market conditions. Indirectly, this approach improves understanding and trust in Q-RL-based algorithmic trading within the EV market. |
Keywords: | Electric Vehicle Supply Chain, Algorithmic Trading, Machine Learning, Q-Reinforcement Learning, Interpretability |
JEL: | G17 Q42 C45 Q55 |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:fau:wpaper:wp2024_40 |
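A tabular Q-learning loop of the kind underlying such trading agents can be sketched on a toy price series. This is a minimal, hypothetical illustration, a two-value state (direction of the last price move) and a flat/long action, not the paper's feature set of Tesla fundamentals and supplier stocks.

```python
import random

def train_q_agent(prices, episodes=200, alpha=0.2, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a toy price series.

    State:   direction of the last price move ("up"/"down").
    Actions: 0 = stay flat, 1 = hold one share for the next step.
    Reward:  next price change if holding, else 0."""
    random.seed(0)  # deterministic exploration for reproducibility
    q = {(s, a): 0.0 for s in ("up", "down") for a in (0, 1)}
    for _ in range(episodes):
        for t in range(1, len(prices) - 1):
            state = "up" if prices[t] > prices[t - 1] else "down"
            # epsilon-greedy action selection
            if random.random() < eps:
                action = random.choice((0, 1))
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])
            reward = (prices[t + 1] - prices[t]) if action == 1 else 0.0
            nxt = "up" if prices[t + 1] > prices[t] else "down"
            best_next = max(q[(nxt, 0)], q[(nxt, 1)])
            # standard Q-learning update
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q
```

On a steadily rising series the agent learns to prefer holding after an up move, which is the kind of learned rule the paper then reads off with a decision tree.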
By: | Szymon Lis |
Abstract: | This study conducted a comprehensive review of 71 papers published between 2000 and 2021 that employed various measures of investor sentiment to model returns. The analysis indicates that higher complexity of sentiment measures and models improves the coefficient of determination. However, there was insufficient evidence to support that models incorporating more complex sentiment measures have better predictive power than those employing simpler proxies. Additionally, the significance of sentiment varies based on the asset and time period being analyzed, suggesting that the consensus relying on the BW index as a sentiment measure may be subject to change. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.13180 |
By: | Claudia Biancotti; Carolina Camassa; Andrea Coletta; Oliver Giudice; Aldo Glielmo |
Abstract: | Advancements in large language models (LLMs) have renewed concerns about AI alignment - the consistency between human and AI goals and values. As various jurisdictions enact legislation on AI safety, the concept of alignment must be defined and measured across different domains. This paper proposes an experimental framework to assess whether LLMs adhere to ethical and legal standards in the relatively unexplored context of finance. We prompt nine LLMs to impersonate the CEO of a financial institution and test their willingness to misuse customer assets to repay outstanding corporate debt. Beginning with a baseline configuration, we adjust preferences, incentives and constraints, analyzing the impact of each adjustment with logistic regression. Our findings reveal significant heterogeneity in the baseline propensity for unethical behavior of LLMs. Factors such as risk aversion, profit expectations, and regulatory environment consistently influence misalignment in ways predicted by economic theory, although the magnitude of these effects varies across LLMs. This paper highlights both the benefits and limitations of simulation-based, ex post safety testing. While it can inform financial authorities and institutions aiming to ensure LLM safety, there is a clear trade-off between generality and cost. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.11853 |
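The per-adjustment analysis with logistic regression can be illustrated with a minimal sketch: fitting P(misuse = 1 | x) = sigmoid(a + b*x) by gradient descent, where x stands in for one adjusted factor (say, a profit incentive). The data encoding here is purely hypothetical, not the paper's experimental design.

```python
import math

def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Fit P(y=1|x) = sigmoid(a + b*x) by batch gradient descent
    on the average log-loss; returns (intercept, slope)."""
    a = b = 0.0
    for _ in range(steps):
        ga = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += p - y          # gradient w.r.t. intercept
            gb += (p - y) * x    # gradient w.r.t. slope
        a -= lr * ga / len(xs)
        b -= lr * gb / len(xs)
    return a, b
```

A positive fitted slope on an incentive variable would indicate that the adjustment raises the modeled probability of misaligned behavior, which is how each factor's effect is read off in such a regression.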
By: | Sorouralsadat Fatemi; Yuheng Hu |
Abstract: | Financial trading has been a challenging task, as it requires the integration of vast amounts of data from various modalities. Traditional deep learning and reinforcement learning methods require large training data and often involve encoding various data types into numerical formats for model input, which limits the explainability of model behavior. Recently, LLM-based agents have demonstrated remarkable advancements in handling multi-modal data, enabling them to execute complex, multi-step decision-making tasks while providing insights into their thought processes. This research introduces a multi-modal multi-agent system designed specifically for financial trading tasks. Our framework employs a team of specialized LLM-based agents, each adept at processing and interpreting various forms of financial data, such as textual news reports, candlestick charts, and trading signal charts. A key feature of our approach is the integration of a reflection module, which conducts analyses of historical trading signals and their outcomes. This reflective process is instrumental in enhancing the decision-making capabilities of the system for future trading scenarios. Furthermore, the ablation studies indicate that the visual reflection module plays a crucial role in enhancing the decision-making capabilities of our framework. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.08899 |
By: | Ruicheng Ao; Hongyu Chen; David Simchi-Levi |
Abstract: | In this work, we introduce a new framework for active experimentation, the Prediction-Guided Active Experiment (PGAE), which leverages predictions from an existing machine learning model to guide sampling and experimentation. Specifically, at each time step, an experimental unit is sampled according to a designated sampling distribution, and the actual outcome is observed based on an experimental probability. Otherwise, only a prediction for the outcome is available. We begin by analyzing the non-adaptive case, where full information on the joint distribution of the predictor and the actual outcome is assumed. For this scenario, we derive an optimal experimentation strategy by minimizing the semi-parametric efficiency bound for the class of regular estimators. We then introduce an estimator that meets this efficiency bound, achieving asymptotic optimality. Next, we move to the adaptive case, where the predictor is continuously updated with newly sampled data. We show that the adaptive version of the estimator remains efficient and attains the same semi-parametric bound under certain regularity assumptions. Finally, we validate PGAE's performance through simulations and a semi-synthetic experiment using data from the US Census Bureau. The results underscore the PGAE framework's effectiveness and superiority compared to other existing methods. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.12036 |
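The core idea, using model predictions for every unit and debiasing with residuals from the subsample whose outcome was actually observed, can be sketched for a simple mean. This is a deliberate simplification (a prediction-corrected mean under uniform sampling), not the paper's full semi-parametrically efficient estimator.

```python
def prediction_corrected_mean(preds, observed):
    """Estimate a population mean from predictions on all units,
    debiased by residuals on the experimentally measured subsample.

    preds:    predictions f(x_i) for every unit i
    observed: dict {i: y_i} for units whose true outcome was observed"""
    n = len(preds)
    pred_mean = sum(preds) / n
    resid = [observed[i] - preds[i] for i in observed]
    correction = sum(resid) / len(resid) if resid else 0.0
    return pred_mean + correction
```

If the predictor is systematically biased, the residual correction removes the bias; if it is accurate, the correction vanishes and the estimate inherits the low variance of the full-sample prediction mean.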
By: | Jian Guo; Saizhuo Wang; Yiyan Qi |
Abstract: | Multi-stage decision-making is crucial in various real-world artificial intelligence applications, including recommendation systems, autonomous driving, and quantitative investment systems. In quantitative investment, for example, the process typically involves several sequential stages such as factor mining, alpha prediction, portfolio optimization, and sometimes order execution. While state-of-the-art end-to-end modeling aims to unify these stages into a single global framework, it faces significant challenges: (1) training such a unified neural network consisting of multiple stages between initial inputs and final outputs often leads to suboptimal solutions, or even collapse, and (2) many decision-making scenarios are not easily reducible to standard prediction problems. To overcome these challenges, we propose Guided Learning, a novel methodological framework designed to enhance end-to-end learning in multi-stage decision-making. We introduce the concept of a ``guide'', a function that induces the training of intermediate neural network layers towards some phased goals, directing gradients away from suboptimal collapse. For decision scenarios lacking explicit supervisory labels, we incorporate a utility function that quantifies the ``reward'' of the overall decision. Additionally, we explore the connections between Guided Learning and classic machine learning paradigms such as supervised, unsupervised, semi-supervised, multi-task, and reinforcement learning. Experiments on quantitative investment strategy building demonstrate that guided learning significantly outperforms both traditional stage-wise approaches and existing end-to-end methods. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.10496 |
By: | Diletta Abbonato |
Abstract: | This article explores public perceptions on the Fourth Industrial Revolution (4IR) through an analysis of social media discourse across six European countries. Using sentiment analysis and machine learning techniques on a dataset of tweets and media articles, we assess how the public reacts to the integration of technologies such as artificial intelligence, robotics, and blockchain into society. The results highlight a significant polarization of opinions, with a shift from neutral to more definitive stances either embracing or resisting technological impacts. Positive sentiments are often associated with technological enhancements in quality of life and economic opportunities, whereas concerns focus on issues of privacy, data security, and ethical implications. This polarization underscores the need for policymakers to engage proactively with the public to address fears and harness the benefits of 4IR technologies. The findings also advocate for digital literacy and public awareness programs to mitigate misinformation and foster an informed public discourse on future technological integration. This study contributes to the ongoing debate on aligning technological advances with societal values and needs, emphasizing the role of informed public opinion in shaping effective policy. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.14230 |
By: | Tsogt-Ochir Enkhbayar |
Abstract: | This paper advances the computational efficiency of Deep Hedging frameworks through the novel integration of Kronecker-Factored Approximate Curvature (K-FAC) optimization. While recent literature has established Deep Hedging as a data-driven alternative to traditional risk management strategies, the computational burden of training neural networks with first-order methods remains a significant impediment to practical implementation. The proposed architecture couples Long Short-Term Memory (LSTM) networks with K-FAC second-order optimization, specifically addressing the challenges of sequential financial data and curvature estimation in recurrent networks. Empirical validation using simulated paths from a calibrated Heston stochastic volatility model demonstrates that the K-FAC implementation achieves marked improvements in convergence dynamics and hedging efficacy. The methodology yields a 78.3% reduction in transaction costs ($t = 56.88$, $p |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.15002 |
By: | Xin Zhang; Zhen Xu; Yue Liu; Mengfang Sun; Tong Zhou; Wenying Sun |
Abstract: | In the current context of accelerated globalization and digitalization, the complexity and uncertainty of financial markets are increasing, and the identification and prevention of economic risks have become a key link in maintaining the stability of the financial system. Traditional risk identification methods often have limitations because they are difficult to cope with the multi-level and dynamically changing complex relationships in financial networks. With the rapid development of financial technology, graph neural network (GNN) technology, as an emerging deep learning method, has gradually shown great potential in the field of financial risk management. GNN can map transaction behaviors, financial institutions, individuals, and their interactive relationships in financial networks into graph structures, and effectively capture potential patterns and abnormal signals in financial data through embedded representation learning. Using this technology, financial institutions can extract valuable information from complex transaction networks, identify hidden dangers or abnormal behaviors that may cause systemic risks in a timely manner, optimize decision-making processes, and improve the accuracy of risk warnings. This paper explores the economic risk identification algorithm based on the GNN algorithm, aiming to provide financial institutions and regulators with more intelligent technical tools to help maintain the security and stability of the financial market. Improving the efficiency of economic risk identification through innovative technical means is expected to further enhance the risk resistance of the financial system and lay the foundation for building a robust global financial system. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.11848 |
By: | Bartosz Bieganowski; Robert Ślepaczuk |
Abstract: | This paper investigates the enhancement of financial time series forecasting with the use of neural networks through supervised autoencoders (SAE), to improve investment strategy performance. Using the Sharpe and Information Ratios, it specifically examines the impact of noise augmentation and triple barrier labeling on risk-adjusted returns. The study focuses on Bitcoin, Litecoin, and Ethereum as the traded assets from January 1, 2016, to April 30, 2022. Findings indicate that supervised autoencoders, with balanced noise augmentation and bottleneck size, significantly boost strategy effectiveness. However, excessive noise and large bottleneck sizes can impair performance. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.12753 |
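Triple barrier labeling, mentioned above, assigns each observation +1, -1, or 0 depending on which of an upper, lower, or time (vertical) barrier the price path hits first. A minimal sketch, with illustrative 2% barriers and a 5-step horizon rather than the paper's settings:

```python
def triple_barrier_label(prices, t, upper=0.02, lower=0.02, horizon=5):
    """Label observation t: +1 if the price first rises by `upper`,
    -1 if it first falls by `lower`, 0 if neither horizontal barrier
    is hit within `horizon` steps (the vertical barrier)."""
    p0 = prices[t]
    for p in prices[t + 1 : t + 1 + horizon]:
        ret = p / p0 - 1.0
        if ret >= upper:
            return 1
        if ret <= -lower:
            return -1
    return 0
```

These labels then serve as the supervised targets for the autoencoder-based classifier described in the abstract.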
By: | Jue Xiao; Tingting Deng; Shuochen Bi |
Abstract: | In recent fast-paced financial markets, investors constantly seek ways to gain an edge and make informed decisions. Although achieving perfect accuracy in stock price predictions remains elusive, artificial intelligence (AI) advancements have significantly enhanced our ability to analyze historical data and identify potential trends. This paper takes AI-driven stock price trend prediction as its core research, builds a model training dataset from Tesla stock data from 2015 to 2024, and compares LSTM, GRU, and Transformer models. The experimental results show that the LSTM model, whose predictions are most consistent with the observed stock trends, achieves an accuracy of 94%. These methods ultimately allow investors to make more informed decisions and gain a clearer insight into market behaviors. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.05790 |
By: | Mabsur Fatin Bin Hossain; Lubna Zahan Lamia; Md Mahmudur Rahman; Md Mosaddek Khan |
Abstract: | Time series forecasting is a key tool in financial markets, helping to predict asset prices and guide investment decisions. In highly volatile markets, such as cryptocurrencies like Bitcoin (BTC) and Ethereum (ETH), forecasting becomes more difficult due to extreme price fluctuations driven by market sentiment, technological changes, and regulatory shifts. Traditionally, forecasting relied on statistical methods, but as markets became more complex, deep learning models like LSTM, Bi-LSTM, and the newer FinBERT-LSTM emerged to capture intricate patterns. Building upon recent advancements and addressing the volatility inherent in cryptocurrency markets, we propose a hybrid model that combines Bidirectional Long Short-Term Memory (Bi-LSTM) networks with FinBERT to enhance forecasting accuracy for these assets. This approach fills a key gap in forecasting volatile financial markets by blending advanced time series models with sentiment analysis, offering valuable insights for investors and analysts navigating unpredictable markets. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.12748 |
By: | Yahui Bai; Yuhe Gao; Runzhe Wan; Sheng Zhang; Rui Song |
Abstract: | In recent years, there has been a growing trend of applying Reinforcement Learning (RL) in financial applications. This approach has shown great potential to solve decision-making tasks in finance. In this survey, we present a comprehensive study of the applications of RL in finance and conduct a series of meta-analyses to investigate the common themes in the literature, such as the factors that most significantly affect RL's performance compared to traditional methods. Moreover, we identify challenges including explainability, Markov Decision Process (MDP) modeling, and robustness that hinder the broader utilization of RL in the financial industry and discuss recent advancements in overcoming these challenges. Finally, we propose future research directions, such as benchmarking, contextual RL, multi-agent RL, and model-based RL to address these challenges and to further enhance the implementation of RL in finance. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.12746 |
By: | Junjie Guo |
Abstract: | This paper provides an empirical study that explores the application of four deep learning algorithms, Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Transformer, in constructing long-short stock portfolios. Two datasets comprising randomly selected stocks from the S&P500 and NASDAQ indices, each spanning a decade of daily data, are utilized. The models predict daily stock returns based on historical features such as past returns, Relative Strength Index (RSI), trading volume, and volatility. Portfolios are dynamically adjusted by longing stocks with positive predicted returns and shorting those with negative predictions, with equal asset weights. Performance is evaluated over a two-year testing period, focusing on return, Sharpe ratio, and maximum drawdown metrics. The results demonstrate the efficacy of deep learning models in enhancing long-short stock portfolio performance. |
Date: | 2024–10 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.13555 |
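The portfolio rule described above, long assets with positive predicted returns and short those with negative ones, can be sketched directly. Here each leg splits its capital equally among its positions, one reading of the paper's "equal asset weights"; treat that normalization as an assumption.

```python
def long_short_weights(predicted_returns):
    """Equal-weight long-short portfolio: long every asset with a
    positive predicted return, short every asset with a negative one.
    Positions within each leg share that leg's capital equally."""
    longs = [i for i, r in enumerate(predicted_returns) if r > 0]
    shorts = [i for i, r in enumerate(predicted_returns) if r < 0]
    w = [0.0] * len(predicted_returns)
    for i in longs:
        w[i] = 1.0 / len(longs)
    for i in shorts:
        w[i] = -1.0 / len(shorts)
    return w
```

With both legs normalized to unit capital the portfolio is dollar-neutral (weights sum to zero), so realized performance reflects the cross-sectional ranking of the predictions rather than market direction.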
By: | Masahiro Kato |
Abstract: | This study introduces a debiasing method for regression estimators, including high-dimensional and nonparametric regression estimators. For example, nonparametric regression methods allow for the estimation of regression functions in a data-driven manner with minimal assumptions; however, these methods typically fail to achieve $\sqrt{n}$-consistency in their convergence rates, and many, including those in machine learning, lack guarantees that their estimators asymptotically follow a normal distribution. To address these challenges, we propose a debiasing technique for nonparametric estimators by adding a bias-correction term to the original estimators, extending the conventional one-step estimator used in semiparametric analysis. Specifically, for each data point, we estimate the conditional expected residual of the original nonparametric estimator, which can, for instance, be computed using kernel (Nadaraya-Watson) regression, and incorporate it as a bias-reduction term. Our theoretical analysis demonstrates that the proposed estimator achieves $\sqrt{n}$-consistency and asymptotic normality under a mild convergence rate condition for both the original nonparametric estimator and the conditional expected residual estimator. Notably, this approach remains model-free as long as the original estimator and the conditional expected residual estimator satisfy the convergence rate condition. The proposed method offers several advantages, including improved estimation accuracy and simplified construction of confidence intervals. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.11748 |
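The bias-correction step described above can be sketched in one dimension: estimate the conditional expected residual of the original fit by Nadaraya-Watson kernel regression and add it back at the evaluation point. The Gaussian kernel and bandwidth below are illustrative choices, not prescribed by the paper.

```python
import math

def nw_residual_correction(x_train, y_train, f, x0, bandwidth=0.5):
    """One debiasing step: Nadaraya-Watson estimate of the expected
    residual y - f(x) near x0, added back to the original fit f(x0)."""
    def k(u):  # Gaussian kernel
        return math.exp(-0.5 * (u / bandwidth) ** 2)
    weights = [k(x - x0) for x in x_train]
    resid = [y - f(x) for x, y in zip(x_train, y_train)]
    correction = sum(w * r for w, r in zip(weights, resid)) / sum(weights)
    return f(x0) + correction
```

If the original estimator carries a constant local bias, the estimated residual cancels it exactly; in general the correction shrinks the bias enough (under the rate conditions in the abstract) for root-n inference.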
By: | Linying Lv |
Abstract: | I examine the value of information from sell-side analysts by analyzing a large corpus of their written reports. Using embeddings from state-of-the-art large language models, I show that textual information in analyst reports explains 10.19% of contemporaneous stock returns out-of-sample, a value that is economically more significant than quantitative forecasts. I then perform a Shapley value decomposition to assess how much each topic within the reports contributes to explaining stock returns. The results show that analysts' income statement analyses account for more than half of the reports' explanatory power. Expressing these findings in economic terms, I estimate that early acquisition of analysts' reports can yield significant profits. Analysts' information value peaks in the first week following earnings announcements, highlighting their vital role in interpreting new financial data. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.13813 |
By: | Qingyuan Wu (School of Economics and Management, Hanshan Normal University, Chaozhou, China.); William A. Barnett (Department of Economics, University of Kansas, Lawrence, KS 66045, USA and Center for Financial Stability, New York City, NY, USA); Xue Wang (Department of Economics, Emory University, Atlanta, GA, U.S. and Institute of Chinese Financial Studies, Southwestern University of Finance and Economics, Chengdu, China); Junru Zhao (School of Business Administration, Guizhou University of Finance and Economics, Guiyang, Guizhou 550025, China.) |
Abstract: | How does Management's Discussion and Analysis (MD&A) sentiment tone affect corporate innovation? The influence mechanism is straightforward and very robust. To some extent, the positive sentiment tone, representing a psychological state, serves as an intermediary between macro or micro conditions and corporate innovation. This paper uses a machine learning method to measure the MD&A sentiment tone and to construct innovation indicators in a broader sense. The fixed effects model shows that when the value of a firm's positive sentiment tone increases by 1 unit, its innovation increases by approximately 1.02 units. After eliminating industry and market effects, the sentiment tone information exclusive to the firm still strongly influences innovation. |
Keywords: | Discussion and analysis sentiment tone, Corporate innovation, Text analysis, Listed companies |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:kan:wpaper:202417 |
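A dictionary-based tone score illustrates the flavor of such a measure, though the paper uses a machine learning method rather than fixed word lists; the tiny lists below are purely hypothetical placeholders.

```python
# Illustrative word lists only; a real application would use a full
# financial sentiment lexicon or a trained classifier.
POSITIVE = {"growth", "improve", "strong", "success", "gain"}
NEGATIVE = {"loss", "decline", "risk", "weak", "adverse"}

def sentiment_tone(text):
    """Tone = (positive hits - negative hits) / total hits, in [-1, 1];
    0.0 when no sentiment words are matched."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0
```

A firm-year tone computed this way from MD&A text would then enter the fixed effects regression as the key explanatory variable.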
By: | Roman, Shahrear Hadi; Wuepper, David |
Abstract: | Mechanization is one of the key ingredients for achieving high agricultural productivity. Despite its importance, there is currently no globally comprehensive information about countries' agricultural mechanization. Here, we propose and demonstrate a machine learning approach, relying on a large, novel training dataset, to produce not only an up-to-date and comprehensive dataset of countries' average agricultural mechanization, but also a global gridded map at ~5 km resolution. Comparing our results to previously available data, we find major improvements in accuracy, completeness, and timeliness, and we notice that several countries are by now much more mechanized than reported so far. When investigating the association between mechanization and crop yield gaps we find a strong and robust link: for each 10 percentage point increase in mechanization, the associated crop yield gap decreases by 4–5 percentage points. |
Keywords: | Crop Production/Industries, Labor and Human Capital, Production Economics, Productivity Analysis, Research and Development/Tech Change/Emerging Technologies |
Date: | 2024–12–04 |
URL: | https://d.repec.org/n?u=RePEc:ags:ubfred:348369 |
By: | Marco Hening-Tallarico; Pablo Olivares |
Abstract: | The objective of the paper is to price weather derivative contracts based on temperature and precipitation as underlying climate variables. We use a neural network approach combined with time series forecasting to value the Pacific Rim index in Toronto and Chicago. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.12013 |
By: | Ananya Unnikrishnan |
Abstract: | Reinforcement learning (RL) has emerged as a transformative approach for financial trading, enabling dynamic strategy optimization in complex markets. This study explores the integration of sentiment analysis, derived from large language models (LLMs), into RL frameworks to enhance trading performance. Experiments were conducted on single-stock trading with Apple Inc. (AAPL) and portfolio trading with the ING Corporate Leaders Trust Series B (LEXCX). The sentiment-enhanced RL models demonstrated superior net worth and cumulative profit compared to RL models without sentiment and, in the portfolio experiment, outperformed the actual LEXCX portfolio's buy-and-hold strategy. These results highlight the potential of incorporating qualitative market signals to improve decision-making, bridging the gap between quantitative and qualitative approaches in financial trading. |
Date: | 2024–11 |
URL: | https://d.repec.org/n?u=RePEc:arx:papers:2411.11059 |