on Big Data
By: | Ardyn Nordstrom; Morgan Nordstrom; Matthew D. Webb |
Abstract: | This paper details an innovative methodology to integrate image data into traditional econometric models. Motivated by forecasting sales prices for residential real estate, we harness the power of deep learning to add "information" contained in images as covariates. Specifically, images of homes were categorized and encoded using an ensemble of image classifiers (ResNet-50, VGG16, MobileNet, and Inception V3). Unique features present within each image were further encoded through panoptic segmentation. Forecasts from a neural network trained on the encoded data result in improved out-of-sample predictive power. We also combine these image-based forecasts with standard hedonic real estate property and location characteristics, resulting in a unified dataset. We show that image-based forecasts increase the accuracy of hedonic forecasts when encoded features are regarded as additional covariates. We also attempt to "explain" which covariates the image-based forecasts are most highly correlated with. The study exemplifies the benefits of interdisciplinary methodologies, merging machine learning and econometrics to harness untapped data sources for more accurate forecasting. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.19915&r=big |
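A minimal Python sketch of the image-encoding step described above: pretrained classifiers (here ResNet-50 and MobileNet V2, standing in for two members of the ensemble) are used as feature extractors, and the resulting embeddings are concatenated with standard hedonic covariates. The file name, the hedonic values, and the choice of MobileNet variant are illustrative assumptions, not the paper's implementation.

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
import numpy as np

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Two members of the ensemble; the paper also uses VGG16 and Inception V3.
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
resnet.fc = torch.nn.Identity()              # drop the classification head -> 2048-d embedding
mobilenet = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
mobilenet.classifier = torch.nn.Identity()   # -> 1280-d embedding

def encode(image_path: str) -> np.ndarray:
    """Return a concatenated embedding of one house photo."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        feats = [resnet(x).squeeze(0), mobilenet(x).squeeze(0)]
    return torch.cat(feats).numpy()

# Concatenate with standard hedonic covariates (square footage, bedrooms, bathrooms).
hedonic = np.array([1850.0, 3.0, 2.0])                       # illustrative values
features = np.concatenate([hedonic, encode("house_front.jpg")])   # hypothetical file name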
By: | Qishuo Cheng |
Abstract: | Financial quantification has emerged and matured rapidly in recent decades. Funds and other investment institutions are increasingly dissatisfied with passively constructed portfolios that earn average market returns and are paying more and more attention to actively managed quantitative strategy portfolios, which requires models for active stock fund management. In China's stock fund market, many active quantitative investment strategies are in use, built on widely varying algorithms such as support vector machines (SVM), random forests, and recurrent neural networks (RNN). Following this trend, this article uses the emerging LSTM-GRU gated long short-term memory network to build an active stock investment strategy and compares it with SVM, which has been widely used in quantitative stock investment, and with RNN-type models. In principle, compared with an SVM that simply relies on kernel functions for high-order mapping and classification of the data, recurrent neural network algorithms such as RNN and LSTM-GRU are better suited to processing financial stock data. Through repeated comparisons, the LSTM-GRU gated long short-term memory network was found to achieve better accuracy. A trading strategy based on the constituent stocks of the CSI 300 (Shanghai and Shenzhen 300) Index was then constructed with the LSTM-GRU algorithm; after tuning the parameters and adjusting the layer connections, it significantly outperformed the CSI 300 benchmark index over the long term. The conclusion is that these results can provide a quantitative strategy reference for financial institutions constructing active stock investment portfolios. |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2404.01624&r=big |
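A minimal Keras sketch of a stacked LSTM-GRU movement classifier of the kind described above. The window length, feature count, layer sizes, and the placeholder training data are illustrative assumptions rather than the paper's architecture.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

WINDOW, N_FEATURES = 30, 5          # 30 past days of OHLCV-style features (illustrative)

model = keras.Sequential([
    layers.Input(shape=(WINDOW, N_FEATURES)),
    layers.LSTM(64, return_sequences=True),
    layers.GRU(32),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # P(price rises tomorrow)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# X: (samples, WINDOW, N_FEATURES) rolling windows; y: 1 if the next-day return > 0.
X = np.random.rand(256, WINDOW, N_FEATURES).astype("float32")   # placeholder data
y = (np.random.rand(256) > 0.5).astype("float32")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)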
By: | Leogrande, Angelo |
Abstract: | In the following article I consider the role of knowledge workers in the Italian regions. The analysed data refer to the ISTAT-BES database. The metric analysis consists of an in-depth analysis of regional and macro-regional trends, followed by clustering with the k-Means algorithm, the application of machine learning algorithms for prediction, and the presentation of an econometric model estimated with panel data methods. The results are also critically discussed in light of the North-South divide and the economic policy implications. |
Date: | 2024–03–29 |
URL: | http://d.repec.org/n?u=RePEc:osf:socarx:4bv6a&r=big |
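An illustrative sketch of the clustering step: standardize two BES-style regional indicators and run k-means. The indicator values below are made up; only the workflow mirrors the paper.

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

df = pd.DataFrame({
    "region": ["Lombardia", "Lazio", "Campania", "Puglia", "Veneto", "Sicilia"],
    "knowledge_workers": [18.2, 17.5, 12.1, 11.8, 15.9, 11.2],   # made-up values
    "gdp_per_capita": [39.7, 33.3, 18.6, 19.2, 33.5, 17.7],      # made-up values
})

X = StandardScaler().fit_transform(df[["knowledge_workers", "gdp_per_capita"]])
df["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(df[["region", "cluster"]])   # with real data, clusters broadly track the North-South divide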
By: | Enzo Brox; Michael Lechner |
Abstract: | This article shows how coworker performance affects individual performance evaluation in a teamwork setting at the workplace. We use high-quality data on football matches to measure an important component of individual performance, shooting performance, isolated from collaborative effects. Employing causal machine learning methods, we address the assortative matching of workers and estimate both average and heterogeneous effects. There is substantial evidence for spillover effects in performance evaluations. Coworker shooting performance meaningfully impacts both manager decisions and third-party expert evaluations of individual performance. Our results underscore the significant role coworkers play in shaping career advancement and highlight a channel, complementary to productivity gains and learning effects, through which coworkers affect career advancement. We characterize the groups of workers that are most and least affected by spillover effects and show that spillover effects are reference-point dependent. While positive deviations from a reference point create positive spillover effects, negative deviations are not harmful to coworkers. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15200&r=big |
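A generic double/debiased machine-learning sketch of how a spillover effect like the one above could be estimated while adjusting flexibly for covariates. This is a stand-in for the causal machine learning estimator actually used in the paper, and all variables are simulated.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                    # player/match controls (simulated)
D = X[:, 0] + rng.normal(size=n)               # coworker shooting performance
Y = 0.5 * D + X[:, 1] + rng.normal(size=n)     # individual performance rating

# Cross-fitted residual-on-residual (partially linear) estimate of the spillover effect.
res_D, res_Y = np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    res_D[test] = D[test] - RandomForestRegressor(random_state=0).fit(X[train], D[train]).predict(X[test])
    res_Y[test] = Y[test] - RandomForestRegressor(random_state=0).fit(X[train], Y[train]).predict(X[test])

theta = (res_D @ res_Y) / (res_D @ res_D)
print(f"estimated spillover effect: {theta:.3f}")   # close to the true 0.5 in this simulation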
By: | Holtemöller, Oliver; Kozyrev, Boris |
Abstract: | In this study, we analyze the forecasting and nowcasting performance of a generalized regression neural network (GRNN). We provide evidence from Monte Carlo simulations on the relative forecast performance of the GRNN depending on the data-generating process and show that the GRNN outperforms an autoregressive benchmark model in many practically relevant cases. We then apply the GRNN to forecast quarterly German GDP growth, extending the univariate GRNN to multivariate and mixed-frequency settings. We distinguish between "normal" times and situations where the time-series behavior differs sharply from "normal" times, such as during the COVID-19 recession and recovery. When applied appropriately, the GRNN was superior in terms of root mean squared forecast errors to an autoregressive model and to more sophisticated approaches such as dynamic factor models. |
Keywords: | forecasting, neural network, nowcasting, time series models |
JEL: | C22 C45 C53 |
Date: | 2024 |
URL: | http://d.repec.org/n?u=RePEc:zbw:iwhdps:287749&r=big |
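A GRNN is essentially Gaussian-kernel (Nadaraya-Watson) regression; the sketch below implements a univariate, two-lag version for a one-step-ahead forecast. The bandwidth, lag order, and toy data are illustrative assumptions, not the paper's specification.

import numpy as np

def grnn_predict(X_train, y_train, x_new, sigma=0.5):
    """GRNN forecast: kernel-weighted average of training targets."""
    d2 = np.sum((X_train - x_new) ** 2, axis=1)      # squared distances to stored patterns
    w = np.exp(-d2 / (2 * sigma ** 2))               # Gaussian kernel weights
    return np.sum(w * y_train) / np.sum(w)

# Toy quarterly growth series: predict y_t from (y_{t-1}, y_{t-2}).
rng = np.random.default_rng(1)
y = rng.normal(0.4, 0.6, size=120)
X = np.column_stack([y[1:-1], y[:-2]])
target = y[2:]
forecast = grnn_predict(X[:-1], target[:-1], X[-1], sigma=0.5)
print(f"one-step-ahead GRNN forecast: {forecast:.3f}")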
By: | Diakonova, M.; Molina, L.; Mueller, H.; Pérez, J. J.; Rauh, C. |
Abstract: | It is widely accepted that episodes of social unrest, conflict, political tensions and policy uncertainty affect the economy. Nevertheless, the real-time dimension of such relationships is less studied, and it remains unclear how to incorporate them in a forecasting framework. This can be partly explained by a certain divide between the economic and political science contributions in this area, as well as the traditional lack of availability of timely high-frequency indicators measuring such phenomena. The latter constraint, though, is becoming less of a limiting factor through the production of text-based indicators. In this paper we assemble a dataset of such monthly measures of what we call “institutional instability”, for three representative emerging market economies: Brazil, Colombia and Mexico. We then forecast quarterly GDP by adding these new variables to a standard macro-forecasting model using different methods. Our results strongly suggest that capturing institutional instability above a broad set of standard high-frequency indicators is useful when forecasting quarterly GDP. We also analyse relative strengths and weaknesses of the approach. |
Keywords: | Forecasting, Social Unrest, Social Conflict, Policy Uncertainty, Forecasting GDP, Natural Language Processing, Geopolitical Risk |
JEL: | E37 D74 N16 |
Date: | 2024–04–05 |
URL: | http://d.repec.org/n?u=RePEc:cam:camjip:2413&r=big |
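An illustrative sketch of the forecasting exercise: a simple AR(1) model for quarterly GDP growth augmented with a quarterly aggregate of a monthly text-based instability index. All series below are simulated placeholders.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
quarters = pd.period_range("2005Q1", periods=60, freq="Q")
instability = pd.Series(rng.gamma(2.0, 1.0, size=60), index=quarters)        # quarterly average of the monthly index
gdp_growth = 0.8 - 0.15 * instability + rng.normal(0, 0.3, 60)               # simulated GDP growth

df = pd.DataFrame({"gdp": gdp_growth, "instability": instability})
df["gdp_lag"] = df["gdp"].shift(1)
df = df.dropna()

X = sm.add_constant(df[["gdp_lag", "instability"]])
model = sm.OLS(df["gdp"], X).fit()
print(model.params)   # a negative instability coefficient mirrors the intuition of the paper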
By: | Diakonova, M.; Molina, L.; Mueller, H.; Pérez, J. J.; Rauh, C. |
Abstract: | It is widely accepted that episodes of social unrest, conflict, political tensions and policy uncertainty affect the economy. Nevertheless, the real-time dimension of such relationships is less studied, and it remains unclear how to incorporate them in a forecasting framework. This can be partly explained by a certain divide between the economic and political science contributions in this area, as well as the traditional lack of availability of timely high-frequency indicators measuring such phenomena. The latter constraint, though, is becoming less of a limiting factor through the production of text-based indicators. In this paper we assemble a dataset of such monthly measures of what we call “institutional instability”, for three representative emerging market economies: Brazil, Colombia and Mexico. We then forecast quarterly GDP by adding these new variables to a standard macro-forecasting model using different methods. Our results strongly suggest that capturing institutional instability above a broad set of standard high-frequency indicators is useful when forecasting quarterly GDP. We also analyse relative strengths and weaknesses of the approach. |
Keywords: | Forecasting, Social Unrest, Social Conflict, Policy Uncertainty, Forecasting GDP, Natural Language Processing, Geopolitical Risk |
JEL: | E37 D74 N16 |
Date: | 2024–04–05 |
URL: | http://d.repec.org/n?u=RePEc:cam:camdae:2418&r=big |
By: | Jiafu An; Difang Huang; Chen Lin; Mingzhu Tai |
Abstract: | In traditional decision-making processes, the social biases of human decision makers can lead to unequal economic outcomes for underrepresented social groups, such as women and racial or ethnic minorities. Recently, the increasing popularity of Large Language Model (LLM)-based artificial intelligence suggests a potential transition from human to AI-based decision making. How would this affect distributional outcomes across social groups? Here we investigate the gender and racial biases of OpenAI's GPT, a widely used LLM, in a high-stakes decision-making setting, specifically the assessment of entry-level job candidates from diverse social groups. Instructing GPT to score approximately 361,000 resumes with randomized social identities, we find that the LLM awards higher assessment scores to female candidates with similar work experience, education, and skills, and lower scores to Black male candidates with comparable qualifications. These biases may translate into a 1 to 2 percentage point difference in hiring probabilities for otherwise similar candidates at a given threshold and are consistent across various job positions and subsamples. Meanwhile, we also find stronger pro-female and weaker anti-Black-male patterns in Democratic states. Our results demonstrate that this LLM-based AI system has the potential to mitigate gender bias, but it may not necessarily cure racial bias. Further research is needed to understand the root causes of these outcomes and to develop strategies that minimize the remaining biases in AI systems. As AI-based decision-making tools are increasingly employed across diverse domains, our findings underscore the necessity of understanding and addressing potential disparities to ensure equitable outcomes across social groups. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15281&r=big |
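A hedged sketch of the audit design: identical resume bodies are paired with names that signal different identities, scored by an LLM, and the scores are compared across groups. The prompt wording, the model name, the name list, and the assumption that the model replies with a bare number are all illustrative, not the paper's protocol.

from openai import OpenAI

client = OpenAI()   # assumes OPENAI_API_KEY is set in the environment

NAMES = {"white_male": "Greg Baker", "black_male": "Jamal Washington",
         "white_female": "Emily Walsh", "black_female": "Lakisha Robinson"}
RESUME_BODY = "Entry-level analyst. B.A. in Economics, 3.7 GPA. Skills: Excel, SQL, Python. One internship."

def score_resume(name: str) -> float:
    prompt = (f"Rate the following resume from 1 (weak) to 10 (strong). "
              f"Reply with a number only.\nName: {name}\n{RESUME_BODY}")
    reply = client.chat.completions.create(
        model="gpt-4o-mini",                          # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return float(reply.choices[0].message.content.strip())   # assumes a bare numeric reply

scores = {group: score_resume(name) for group, name in NAMES.items()}
print(scores)   # score gaps across groups with identical resume bodies indicate bias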
By: | Divyanshu Daiya; Monika Yadav; Harshit Singh Rao |
Abstract: | In this work, we propose an approach that generalizes denoising diffusion probabilistic models to stock market prediction and portfolio management. Prior work has demonstrated the efficacy of modeling inter-stock relations for market time-series forecasting and has used graph-based learning models for value prediction and portfolio management. Though convincing, these deterministic approaches still fall short of handling uncertainty: because of the low signal-to-noise ratio of financial data, it is quite challenging to learn effective deterministic models, whereas probabilistic methods have been shown to capture such uncertainty effectively in time-series prediction. To this end, we showcase an effective use of Denoising Diffusion Probabilistic Models (DDPM) to develop an architecture that provides better market predictions conditioned on historical financial indicators and inter-stock relations. Additionally, we provide a novel deterministic architecture, MaTCHS, which uses a Masked Relational Transformer (MRT) to exploit inter-stock relations along with historical stock features. We demonstrate that our model achieves state-of-the-art performance for movement prediction and portfolio management. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.14063&r=big |
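A heavily simplified PyTorch sketch of one DDPM training step for the conditional setting described above: an MLP denoiser learns to predict the noise added to future returns, conditioned on a window of past returns. Horizon, history length, and network sizes are illustrative assumptions; the paper's architecture and relational conditioning are not reproduced.

import torch
import torch.nn as nn

T_STEPS, HIST, HORIZON = 100, 30, 5
betas = torch.linspace(1e-4, 0.02, T_STEPS)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

denoiser = nn.Sequential(nn.Linear(HORIZON + HIST + 1, 128), nn.ReLU(),
                         nn.Linear(128, HORIZON))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

history = torch.randn(64, HIST)          # past returns (placeholder data)
future = torch.randn(64, HORIZON)        # returns to be predicted (x_0)

t = torch.randint(0, T_STEPS, (64,))
noise = torch.randn_like(future)
a_bar = alphas_bar[t].unsqueeze(1)
x_t = a_bar.sqrt() * future + (1 - a_bar).sqrt() * noise   # forward diffusion q(x_t | x_0)

pred = denoiser(torch.cat([x_t, history, t.float().unsqueeze(1) / T_STEPS], dim=1))
loss = ((pred - noise) ** 2).mean()      # standard epsilon-prediction objective
opt.zero_grad(); loss.backward(); opt.step()
print(f"denoising loss: {loss.item():.4f}")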
By: | Soheila Khajoui; Saeid Dehyadegari; Sayyed Abdolmajid Jalaee |
Abstract: | This study aims to predict the impact of e-commerce indicators on the international trade of selected OECD countries and Iran, using an artificial intelligence approach and panel vector autoregression (P-VAR). Given the nature of the export, import, GDP, and ICT functions and their nonlinear characteristics, the analysis is performed with an MLP neural network. The export, import, GDP, and ICT series were fitted with 99 percent accuracy. Using the P-VAR model in the EViews software, the initial database and the predicted data were applied to estimate the impact of e-commerce on international trade. The findings show a bidirectional relationship, meaning that ICT and international trade affect each other, and the goodness of fit of the studied model is confirmed. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.20310&r=big |
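An illustrative sketch of the neural-network step: an MLP regressor mapping ICT/e-commerce indicators to exports, evaluated out of sample. The data are simulated placeholders, not the OECD/Iran series, and the P-VAR stage is not shown.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
ict = rng.uniform(0, 100, size=(300, 3))               # ICT / e-commerce indicators (simulated)
exports = 5 + ict @ np.array([0.4, 0.1, 0.2]) + rng.normal(0, 5, 300)

X_tr, X_te, y_tr, y_te = train_test_split(ict, exports, random_state=0)
mlp = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0).fit(X_tr, y_tr)
print(f"out-of-sample R^2: {mlp.score(X_te, y_te):.2f}")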
By: | Florian Krach; Josef Teichmann; Hanna Wutte |
Abstract: | Robust utility optimization enables an investor to deal with market uncertainty in a structured way, with the goal of maximizing the worst-case outcome. In this work, we propose a generative adversarial network (GAN) approach to (approximately) solve robust utility optimization problems in general and realistic settings. In particular, we model both the investor and the market by neural networks (NN) and train them in a mini-max zero-sum game. This approach is applicable to any continuous utility function and to realistic market settings with trading costs, where only observable information about the market can be used. A large empirical study shows the versatile usability of our method. Whenever an optimal reference strategy is available, our method performs on par with it, and in the (many) settings without a known optimal strategy, our method outperforms all other reference strategies. Moreover, we conclude from our study that the trained path-dependent strategies do not outperform Markovian ones. Lastly, we show that our generative approach for learning optimal, (non-)robust investments under trading costs yields universally applicable alternatives to well-known asymptotic strategies derived in idealized settings. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.15243&r=big |
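A stylized sketch of the adversarial training idea: an investor network chooses a portfolio weight from observed information while a market network chooses a bounded adversarial drift, and the two are trained in a zero-sum game on a one-period toy market with log utility. Network sizes, bounds, and the market model are illustrative assumptions, not the paper's setup.

import torch
import torch.nn as nn

investor = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Tanh())
market = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Tanh())
opt_inv = torch.optim.Adam(investor.parameters(), lr=1e-3)
opt_mkt = torch.optim.Adam(market.parameters(), lr=1e-3)

def expected_utility(freeze_investor: bool) -> torch.Tensor:
    signal = torch.randn(256, 1)                           # observable market information
    weight = investor(signal).detach() if freeze_investor else investor(signal)
    drift = 0.05 * market(signal)                          # adversarial drift, bounded to +/- 5%
    drift = drift if freeze_investor else drift.detach()   # only one player gets gradients per step
    ret = drift + 0.2 * torch.randn(256, 1)                # risky return with 20% vol noise
    wealth = (1.0 + weight * ret).clamp_min(1e-3)          # one-period wealth, kept positive
    return torch.log(wealth).mean()                        # expected log utility

for step in range(500):
    u = expected_utility(freeze_investor=False)            # investor maximizes utility
    opt_inv.zero_grad(); (-u).backward(); opt_inv.step()
    u = expected_utility(freeze_investor=True)             # market minimizes it (worst case)
    opt_mkt.zero_grad(); u.backward(); opt_mkt.step()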
By: | Tetiana Yukhymenko (National Bank of Ukraine); Oleh Sorochan (National Bank of Ukraine) |
Abstract: | This study explores the impact of central bank communications on a range of macrofinancial indicators. Specifically, we examine whether information posted on the National Bank of Ukraine (NBU) website influences foreign exchange (FX) markets and the inflation expectations of experts. Our main results suggest that the NBU's statements and press releases on monetary policy issues do indeed matter. For instance, we find that exchange rate movements and volatility are negatively correlated with the volume of NBU publications on its official website, and this effect is noticeably larger for volatility than for exchange rate changes. The impact of communication on FX developments is strongest a week after a news release and persists thereafter. Furthermore, the inflation expectations of financial experts, though insensitive to NBU updates overall, turn out to respond to monetary policy announcements, which reduce the level of expectations and interest rate movements. |
Keywords: | central bank communications, monetary policy, FX market, text analysis |
JEL: | E58 E71 C55 |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:ukb:wpaper:01/2024&r=big |
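An illustrative sketch of the headline correlation: weekly counts of monetary policy publications against weekly exchange rate volatility. The series are simulated placeholders, so the negative correlation reported in the paper will not appear in this toy output.

import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
days = pd.date_range("2022-01-03", periods=500, freq="B")
news_count = pd.Series(rng.poisson(2, size=500), index=days)          # daily publication counts
fx = pd.Series(36 + np.cumsum(rng.normal(0, 0.2, 500)), index=days)   # UAH/USD level (simulated)

weekly = pd.DataFrame({
    "publications": news_count.resample("W").sum(),
    "volatility": fx.pct_change().resample("W").std(),
})
print(weekly["publications"].corr(weekly["volatility"]))   # negative with the real NBU data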
By: | Dai, Yongsheng; Wang, Hui; Rafferty, Karen; Spence, Ivor; Quinn, Barry |
Abstract: | Time series anomaly detection plays a critical role in various applications, from finance to industrial monitoring. Effective models need to capture both the inherent characteristics of time series data and the unique patterns associated with anomalies. While traditional forecasting-based and reconstruction-based approaches have been successful, they tend to struggle with complex and evolving anomalies. For instance, stock market data exhibits complex and ever-changing fluctuation patterns that defy straightforward modelling. In this paper, we propose a novel approach called TDSRL (Time Series Dual Self-Supervised Representation Learning) for robust anomaly detection. TDSRL leverages synthetic anomaly segments which are artificially generated to simulate real-world anomalies. The key innovation lies in dual self-supervised pretext tasks: one task characterises anomalies in relation to the entire time series, while the other focuses on local anomaly boundaries. Additionally, we introduce a data degradation method that operates in both the time and frequency domains, creating a more natural simulation of real-world anomalies compared to purely synthetic data. Consequently, TDSRL is expected to achieve more accurate predictions of the location and extent of anomalous segments. Our experiments demonstrate that TDSRL outperforms state-of-the-art methods, making it a promising avenue for time series anomaly detection. |
Keywords: | Time series anomaly detection, self-supervised representation learning, contrastive learning, synthetic anomaly |
Date: | 2024 |
URL: | http://d.repec.org/n?u=RePEc:zbw:qmsrps:202403&r=big |
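A hedged sketch of the synthetic-anomaly idea: inject an anomalous segment into a clean window either in the time domain (a level shift) or in the frequency domain (amplifying one spectral component). The parameters are illustrative and do not reproduce the paper's degradation scheme.

import numpy as np

def inject_anomaly(x: np.ndarray, start: int, length: int, mode: str = "time") -> np.ndarray:
    x = x.copy()
    if mode == "time":
        x[start:start + length] += 3 * x.std()           # level-shift anomaly in a segment
    else:
        spec = np.fft.rfft(x)
        spec[np.argmax(np.abs(spec[1:])) + 1] *= 5       # boost the dominant non-DC frequency
        x = np.fft.irfft(spec, n=len(x))
    return x

rng = np.random.default_rng(5)
clean = np.sin(np.linspace(0, 20 * np.pi, 512)) + 0.1 * rng.normal(size=512)
time_anomaly = inject_anomaly(clean, start=200, length=30, mode="time")
freq_anomaly = inject_anomaly(clean, start=0, length=0, mode="freq")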
By: | S. Di Luozzo; A. Fronzetti Colladon; M. M. Schiraldi |
Abstract: | The current study proposes an innovative methodology for the profiling of psychological traits of Operations Management (OM) and Supply Chain Management (SCM) professionals. We use innovative methods and tools of text mining and social network analysis to map the demand for relevant skills from a set of job descriptions, with a focus on psychological characteristics. The proposed approach aims to evaluate the market demand for specific traits by combining relevant psychological constructs, text mining techniques, and an innovative measure, namely, the Semantic Brand Score. We apply the proposed methodology to a dataset of job descriptions for OM and SCM professionals, with the objective of providing a mapping of their relevant required skills, including psychological characteristics. In addition, the analysis is then detailed by considering the region of the organization that issues the job description, its organizational size, and the seniority level of the open position in order to understand their nuances. Finally, topic modeling is used to examine key components and their relative significance in job descriptions. By employing a novel methodology and considering contextual factors, we provide an innovative understanding of the attitudinal traits that differentiate professionals. This research contributes to talent management, recruitment practices, and professional development initiatives, since it provides new figures and perspectives to improve the effectiveness and success of Operations Management and Supply Chain Management professionals. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.17546&r=big |
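A toy sketch of the text-network idea: build a word co-occurrence network from job descriptions and measure a target trait term's prevalence and centrality. This is only a rough proxy for the Semantic Brand Score, and the job ads are invented.

import itertools
import networkx as nx

job_ads = [
    "resilient supply chain analyst with strong communication and negotiation skills",
    "operations manager resilient under pressure with analytical problem solving skills",
    "supply chain planner with communication skills and attention to detail",
]

G = nx.Graph()
for ad in job_ads:
    tokens = ad.split()
    for w1, w2 in itertools.combinations(set(tokens), 2):   # co-occurrence within an ad
        G.add_edge(w1, w2)

target = "resilient"
prevalence = sum(ad.split().count(target) for ad in job_ads)
connectivity = nx.betweenness_centrality(G).get(target, 0.0)
print(f"prevalence={prevalence}, betweenness={connectivity:.3f}")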
By: | Boming Ning; Kiseop Lee |
Abstract: | Statistical arbitrage is a prevalent trading strategy that takes advantage of the mean-reverting property of the spread between paired stocks. Studies of this strategy often rely heavily on model assumptions. In this study, we introduce an innovative model-free, reinforcement learning-based framework for statistical arbitrage. To construct mean-reverting spreads, we establish an empirical reversion time metric and optimize asset coefficients by minimizing this empirical mean reversion time. In the trading phase, we employ a reinforcement learning framework to identify the optimal mean reversion strategy. Diverging from traditional mean reversion strategies that focus primarily on price deviations from a long-term mean, our methodology constructs the state space to encapsulate recent trends in price movements. Additionally, the reward function is carefully tailored to reflect the unique characteristics of mean reversion trading. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.12180&r=big |
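A hedged sketch of an empirical mean-reversion-time metric for a spread: the average number of steps an excursion away from the mean takes to cross back. The paper's exact metric and its reinforcement learning layer are not reproduced here.

import numpy as np

def empirical_reversion_time(spread: np.ndarray) -> float:
    centered = spread - spread.mean()
    times, start = [], None
    for t in range(1, len(centered)):
        if start is None and np.sign(centered[t]) != np.sign(centered[t - 1]):
            start = t                      # just crossed the mean: start a new excursion
        elif start is not None and np.sign(centered[t]) != np.sign(centered[start]):
            times.append(t - start)        # crossed back: record the excursion length
            start = t
    return float(np.mean(times)) if times else np.inf

rng = np.random.default_rng(6)
spread = np.zeros(1000)
for t in range(1, 1000):                   # Ornstein-Uhlenbeck-style toy spread
    spread[t] = 0.9 * spread[t - 1] + rng.normal(0, 0.1)
print(f"empirical reversion time: {empirical_reversion_time(spread):.1f} steps")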
By: | Kaushalya Kularatnam; Tania Stathaki |
Abstract: | As algorithmic trading and electronic markets continue to transform the landscape of financial markets, detecting and deterring rogue agents to maintain a fair and efficient marketplace is crucial. The explosion of large datasets and the continually changing tricks of the trade make it difficult to adapt to new market conditions and detect bad actors. To that end, we propose a framework that can be adapted easily to various problems in the space of detecting market manipulation. Our approach initially employs a labelling algorithm to create a training set for a weakly supervised model that identifies potentially suspicious sequences of order book states. The main goal here is to learn a representation of the order book that can be used to easily compare future events. Subsequently, we incorporate expert assessment to scrutinize specific flagged order book states. In the event of an expert's unavailability, a more complex algorithm is applied to the identified suspicious order book states. We then conduct a similarity search between any new representation of the order book and the expert-labelled representations to rank the results of the weak learner. We present some preliminary results that are promising and merit further exploration in this direction. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.13429&r=big |
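A simplified sketch of the pipeline: a heuristic labelling function produces weak labels for order-book snapshots, a weak model scores new snapshots, and the flagged ones are re-ranked by similarity to a handful of expert-confirmed cases. Features, thresholds, and the stand-in expert labels are illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(7)
# columns: [cancellation_rate, order_to_trade_ratio, quote_imbalance]
states = rng.uniform(0, 1, size=(1000, 3))

weak_labels = ((states[:, 0] > 0.8) & (states[:, 1] > 0.7)).astype(int)   # heuristic labelling function
weak_model = LogisticRegression().fit(states, weak_labels)

scores = weak_model.predict_proba(states)[:, 1]
suspicious = states[np.argsort(-scores)[:50]]        # top-50 states flagged by the weak learner

expert_confirmed = suspicious[:3]                    # stand-ins for expert-labelled cases
nn = NearestNeighbors(n_neighbors=1).fit(expert_confirmed)
dist, _ = nn.kneighbors(suspicious)
ranked = suspicious[np.argsort(dist.ravel())]        # re-rank by similarity to expert cases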
By: | Joachim Wagner (Leuphana Universität Lüneburg, Institut für Volkswirtschaftslehre and Kiel Centre for Globalization) |
Abstract: | The use of digital technologies like artificial intelligence, robotics, or smart devices can be expected to go hand in hand with higher productivity and lower trade costs and, therefore, to be positively related to export activities. This paper uses firm-level data for manufacturing enterprises from the 27 member countries of the European Union to shed further light on this issue by investigating the link between the digitalization intensity of a firm and the extensive margins of exports. Applying a new machine-learning estimator, Kernel-Regularized Least Squares (KRLS), which does not impose any restrictive assumptions on the functional form of the relation between the margins of exports, digitalization intensity, and the control variables, we find that firms which use more digital technologies export more often, more often export to various destinations all over the world, and export to a larger number of different destinations. |
Keywords: | Digital technologies, exports, firm level data, Flash Eurobarometer 486, kernel-regularized least squares (KRLS) |
JEL: | D22 F14 |
Date: | 2024–04 |
URL: | http://d.repec.org/n?u=RePEc:lue:wpaper:428&r=big |
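Kernel-regularized least squares is closely related to kernel ridge regression with a Gaussian kernel; the sketch below uses scikit-learn's KernelRidge as a stand-in to trace how predicted export propensity varies with digitalization intensity. The data are simulated, not the Flash Eurobarometer 486 sample, and the hyperparameters are arbitrary.

import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(8)
n = 500
digital = rng.integers(0, 6, size=n)               # number of digital technologies used
size = rng.integers(1, 4, size=n)                  # firm-size class (control)
exporter = (0.1 * digital + 0.2 * size + rng.normal(0, 0.5, n) > 0.9).astype(float)

X = np.column_stack([digital, size])
krls = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5).fit(X, exporter)

# Pointwise "marginal effect" of one more technology, holding firm size at class 2.
grid = np.column_stack([np.arange(0, 6), np.full(6, 2)])
print(np.diff(krls.predict(grid)))                 # increments in predicted export propensity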
By: | Rustam Jamilov (University of Oxford); Alexandre Kohlhas (University of Oxford); Oleksandr Talavera (University of Birmingham); Mao Zhang (University of St Andrews) |
Abstract: | We propose an empirically-motivated theory of business cycles, driven by fluctuations in sentiment towards a small number of firms. We measure firm-level sentiment with computational linguistics and analyst forecast errors. We find that 50 firms can account for over 70% of the unconditional variation in U.S. sentiment and output over the period 2006-2021. The “Granular Sentiment Index”, measuring sentiment towards the 50 firms, is dominated by firms that are closer to the final consumer, i.e. are downstream. To rationalize our findings, we embed endogenous information choice into a general equilibrium model with heterogeneous upstream and downstream firms. We show that attention centers on downstream firms because they act as natural “information agglomerators”. When calibrated to match select moments of U.S. data, the model shows that orthogonal shocks to sentiment of the 20% most downstream firms explain more than 90% of sentiment-driven (and 20% of total) aggregate fluctuations. |
Date: | 2024–02 |
URL: | http://d.repec.org/n?u=RePEc:cfm:wpaper:2414&r=big |
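An illustrative construction of a granular sentiment index: firm-level sentiment for the largest firms aggregated with size weights. Firm labels, weights, and scores are placeholders.

import pandas as pd

firms = pd.DataFrame({
    "firm": ["A", "B", "C", "D"],
    "sales_share": [0.40, 0.30, 0.20, 0.10],    # size shares among the top firms (placeholders)
    "sentiment": [0.2, -0.5, 0.1, 0.4],         # e.g. text-based score or analyst forecast error
})
granular_index = (firms["sales_share"] * firms["sentiment"]).sum()
print(f"granular sentiment index: {granular_index:.3f}")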
By: | Alexander Berry; Elizabeth M. Maloney; David Neumark |
Abstract: | Stronger enforcement of discrimination laws can help to reduce disparities in economic outcomes with respect to race, ethnicity, and gender in the United States. However, the data necessary to detect possible discrimination and to act to counter it are not publicly available, in particular data on racial, ethnic, and gender disparities within specific companies. In this paper, we explore and develop methods that use information extracted from publicly available LinkedIn data to measure the racial, ethnic, and gender composition of company workforces. We use predictive tools based on both names and pictures to identify race, ethnicity, and gender. We show that one can use LinkedIn data to obtain reasonably reliable measures of workforce demographic composition by race, ethnicity, and gender, based on validation exercises comparing estimates from scraped LinkedIn data to two sources: ACS data and company diversity or EEO-1 reports. Next, we apply our methods to study the racial, ethnic, and gender composition of workers who were hired and those who experienced mass layoffs at two large companies. Finally, we explore using LinkedIn data to measure racial, ethnic, and gender differences in promotion. |
JEL: | J15 J16 J7 |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:nbr:nberwo:32294&r=big |
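A hedged sketch of the name-based step: merge first names with a reference table of name-gender frequencies and surnames with Census-style surname-race shares, then average the probabilities to estimate workforce composition. The tiny reference tables are placeholders for the real SSA/Census files, and the picture-based classifier is not shown.

import pandas as pd

profiles = pd.DataFrame({"first": ["maria", "james", "wei"],
                         "last": ["garcia", "smith", "chen"]})

first_ref = pd.DataFrame({"first": ["maria", "james", "wei"],
                          "p_female": [0.96, 0.01, 0.45]})         # placeholder frequencies
last_ref = pd.DataFrame({"last": ["garcia", "smith", "chen"],
                         "p_hispanic": [0.92, 0.02, 0.00],
                         "p_white": [0.05, 0.73, 0.02],
                         "p_asian": [0.01, 0.01, 0.97]})           # placeholder surname shares

merged = profiles.merge(first_ref, on="first").merge(last_ref, on="last")
# Workforce composition estimates are then averages of these probabilities.
print(merged[["p_female", "p_hispanic", "p_white", "p_asian"]].mean())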
By: | Taejin Park |
Abstract: | This paper introduces a Large Language Model (LLM)-based multi-agent framework designed to enhance anomaly detection within financial market data, tackling the longstanding challenge of manually verifying system-generated anomaly alerts. The framework harnesses a collaborative network of AI agents, each specialised in a distinct function, including data conversion, expert analysis via web research, institutional knowledge utilization, cross-checking, and report consolidation and management. By coordinating these agents towards a common objective, the framework provides a comprehensive, automated approach to validating and interpreting financial data anomalies. I analyse the S&P 500 index to demonstrate the framework's ability to improve the efficiency and accuracy of financial market monitoring while reducing human intervention. The integration of AI's autonomous functionalities with established analytical methods not only underscores the framework's effectiveness in anomaly detection but also signals its broader applicability in supporting financial market monitoring. |
Date: | 2024–03 |
URL: | http://d.repec.org/n?u=RePEc:arx:papers:2403.19735&r=big |
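A structural sketch of the multi-agent idea: a coordinator routes an anomaly alert through role-specialised agents (data conversion, expert analysis, cross-checking, report writing), each backed by an LLM call. The roles, prompts, model name, and example alert are illustrative assumptions, not the paper's implementation.

from openai import OpenAI

client = OpenAI()   # assumes OPENAI_API_KEY is set

ROLES = {
    "converter": "Summarise the raw market data points relevant to this alert.",
    "analyst": "Assess whether the move is explained by known news or fundamentals.",
    "checker": "Cross-check the analyst's reasoning for gaps or contradictions.",
    "reporter": "Write a short validation report with a verdict: genuine anomaly or false alert.",
}

def run_agent(role: str, context: str) -> str:
    reply = client.chat.completions.create(
        model="gpt-4o-mini",                       # placeholder model choice
        messages=[{"role": "system", "content": ROLES[role]},
                  {"role": "user", "content": context}],
    )
    return reply.choices[0].message.content

alert = "S&P 500 dropped 3.1% intraday with volume 2.4x its average."   # illustrative alert
context = alert
for role in ["converter", "analyst", "checker", "reporter"]:
    context = run_agent(role, context)             # each agent consumes the previous agent's output
print(context)                                     # consolidated validation report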