|
on Computational Economics |
| By: | Maximilian Göbel (Brain); Philippe Goulet Coulombe (Université du Québec à Montréal); Karin Klieber (Oesterreichische Nationalbank) |
| Abstract: | Machine learning predictions are typically interpreted as the sum of contributions of predictors. Yet, each out-of-sample prediction can also be expressed as a linear combination of in-sample values of the predicted variable, with weights corresponding to pairwise proximity scores between current and past economic events. While this dual route leads nowhere in some contexts (e.g., large cross-sectional datasets), it provides sparser interpretations in settings with many regressors and little training data—like macroeconomic forecasting. In this case, the sequence of contributions can be visualized as a time series, allowing analysts to explain predictions as quantifiable combinations of historical analogies. Moreover, the weights can be viewed as those of a data portfolio, inspiring new diagnostic measures such as forecast concentration, short position, and turnover. We show how weights can be retrieved seamlessly for (kernel) ridge regression, random forest, boosted trees, and neural networks. Then, we apply these tools to analyze postpandemic forecasts of inflation, GDP growth, and recession probabilities. In all cases, the approach opens the black box from a new angle and demonstrates how machine learning models leverage history partly repeating itself. |
| Date: | 2025–03–27 |
| URL: | https://d.repec.org/n?u=RePEc:onb:oenbwp:265 |
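The dual representation described in the abstract above can be illustrated for plain ridge regression. The sketch below is not the authors' code: the data are random toy values, there is no intercept, and the two diagnostics at the end (`concentration`, `short_position`) are hypothetical, portfolio-style definitions chosen for illustration only. The paper's own definitions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 40, 60, 5.0            # few observations, many regressors (toy setup)
X = rng.normal(size=(n, p))        # in-sample predictors (toy data)
y = rng.normal(size=n)             # in-sample values of the predicted variable
x_new = rng.normal(size=p)         # predictors for the out-of-sample date

# Primal route: the prediction is a sum of predictor contributions.
beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
pred_primal = x_new @ beta

# Dual route: the same prediction as a weighted sum of in-sample targets.
K = X @ X.T                        # linear kernel between training observations
k_new = X @ x_new                  # proximity of the new point to each observation
w = np.linalg.solve(K + lam * np.eye(n), k_new)   # one weight per observation
pred_dual = w @ y

# Hypothetical diagnostics on the weights (assumed definitions, see lead-in).
abs_share = np.abs(w) / np.abs(w).sum()
concentration = float(np.sum(abs_share ** 2))     # Herfindahl-style index
short_position = float(w[w < 0].sum())            # total negative weight

print(pred_primal, pred_dual, concentration, short_position)
```

Both routes give the same number because of the identity (X'X + λI)⁻¹X' = X'(XX' + λI)⁻¹. With `n` smaller than `p`, the vector `w` is the shorter of the two descriptions, which is the setting the abstract highlights.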
| By: | Jennifer Peña; Katherine Jara; Fernando Sierra |
| Abstract: | This paper investigates whether artificial intelligence techniques, encompassing both machine learning and deep learning models, can enhance the accuracy of nowcasts for Chile’s monthly economic activity index (IMACEC). The analysis relies on a large and diverse real-time dataset that includes both traditional macroeconomic variables and high-frequency monthly administrative data (from electronic tax records). Three main findings emerge. First, nonlinear models, particularly XGBoost, achieve the lowest root mean squared errors, whereas linear regularized approaches such as SVR and LASSO also show competitive performance. This highlights the value of flexible nonlinear methods and regularized linear approaches when dealing with heterogeneous data. Second, features derived from electronic tax records, such as trade credit volumes and sectoral sales by region, consistently rank among the most important predictors across models. Third, the strongest-performing models (XGBoost, SVR, and LASSO) achieve lower errors than traditional econometric benchmarks, which rely solely on standard macroeconomic aggregates and exclude non-traditional datasets. Overall, the findings show that timely administrative data, combined with AI approaches, can significantly improve economic surveillance and decision-making. |
| Date: | 2025–12 |
| URL: | https://d.repec.org/n?u=RePEc:chb:bcchwp:1058 |
| By: | Solomon Polachek; Kenneth Romano; Ozlem Tonguc |
| Abstract: | This study examines how large language models (LLMs) respond to varying stake sizes in the Dictator and Ultimatum games using the high-stakes design introduced by Andersen et al. (2011). We test ten leading LLMs chosen for their accessibility, prominence, and differences in reasoning capabilities. Results reveal substantial variation across models: Only 5 of 10 models exhibit strategic behavior by offering more in the Ultimatum Game (UG) than in the Dictator Game (DG). Relative to humans, 4 models are consistently more generous, 2 consistently less, and 4 vary with stake size. Only 1 model shows a monotonic decline in UG offers as stakes increase; the remaining 9 are non-monotonic or stable. Unlike humans, most models reduce UG offers when endowed with wealth. Prompting for "human-like" decisions generally increases generosity in the UG. These findings are important for evaluating whether LLMs can serve as realistic proxies for human subjects in behavioral experiments and highlight key limitations and future directions for model development. |
| Keywords: | Ultimatum Game, Dictator Game, fairness, payoff stakes, artificial intelligence |
| JEL: | D01 C72 C90 |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:crm:wpaper:26110 |
| By: | Martin Biewen; Stefan Glaisner; Simon Zeller |
| Abstract: | This paper explores distributional random forests as a flexible machine learning method for analysing income distributions. Distributional random forests avoid parametric assumptions, capture complex interactions among covariates, and, once trained, provide full estimates of conditional income distributions. From these, any type of distributional index such as measures of location, inequality and poverty risk can be readily computed. They can also efficiently process grouped income data and be used as inputs for distributional decomposition methods. We consider four types of applications: (i) estimating income distributions for granular population subgroups, (ii) analysing distributional change over time, (iii) small-area estimation of income distributions, and (iv) purging spatial income distributions of differences in spatial characteristics. Our application based on the German Microcensus provides new results on the socio-economic and spatial structure of the German income distribution. |
| Keywords: | inequality, poverty, small-area estimation, grouped income data |
| JEL: | D31 I32 |
| Date: | 2026–02 |
| URL: | https://d.repec.org/n?u=RePEc:crm:wpaper:26051 |
| By: | Pérez-Lechuga, Gilberto; Venegas-Martínez, Francisco |
| Abstract: | Background: The vehicle routing problem (VRP) is of great importance in the Industry 4.0 era because enabling technologies such as the internet of things (IoT), artificial intelligence (AI), big data, and geographic information systems (GISs) allow for real-time solutions to versions of the problem, adapting to changing conditions such as traffic or fluctuating demand. Methods: In this paper, we model and optimize a classic multi-link distribution network topology, including randomness in travel times, vehicle availability times, and product demands, using a hybrid approach of nested linear stochastic programming and Monte Carlo simulation under a time-window scheme. The proposed solution is compared with cutting-edge metaheuristics such as Ant Colony Optimization (ACO), Tabu Search (TS), and Simulated Annealing (SA). Results: The results suggest that the proposed method is computationally efficient and scalable to large models, although convergence and accuracy are strongly influenced by the probability distributions used. Conclusions: The developed proposal constitutes a viable alternative for solving real-world, large-scale modeling cases for transportation management in the supply chain. |
| Keywords: | vehicle routing problem; stochastic modeling; Monte Carlo simulation; supply chain management; metaheuristics; logistics optimization |
| JEL: | L60 |
| Date: | 2026–01–01 |
| URL: | https://d.repec.org/n?u=RePEc:pra:mprapa:128859 |
| By: | Ristolainen, Kim |
| Abstract: | We develop a novel sentiment measure from survey forecasts that captures the component of beliefs arising from the systematic misaggregation of public information relative to a machine benchmark based on the same information set. We extend this sentiment measure historically for a panel of 78 countries using machine learning models trained on BERT embeddings of historical news articles (1903-2020). The backcasted sentiment shows that shocks in median sentiment predict credit booms in the non-tradable corporate sector, which prior research has linked to financial crises. We further find that this sentiment component is shaped by memory-related dynamics, as the time elapsed since major crises and the share of young-to-old people in the population predict surges in optimism even when recent economic developments are controlled for. Taken together, the findings provide new historical evidence consistent with the Minsky-Kindleberger view on financial crises. |
| Keywords: | Survey data, Sentiment, Memory, Machine Learning, Text Data, Credit growth, Financial Crisis |
| JEL: | E44 E51 G01 D84 G41 E32 |
| Date: | 2026 |
| URL: | https://d.repec.org/n?u=RePEc:zbw:bofrdp:340165 |
| By: | Eduardo Levy Yeyati; César M. Ciappa; Milagros Onofri |
| Abstract: | Recent work measures ideological positioning and drift in large language models (LLMs), but typically assumes that those measurements are invariant to the language of evaluation. This paper tests that assumption using the full Political Compass questionnaire in English and Spanish across three generations of OpenAI models, together with a benchmark comparison against recent Qwen and Mistral releases. Using matched item-level responses, we estimate within-model Spanish–English displacement and assess how language choice affects cross-model comparisons. We find that measured ideological coordinates remain in the same broad region across languages, but are not language-invariant. Spanish–English shifts differ in sign and magnitude across models and axes, and in several cases amount to a substantial share of the inter-model dispersion typically interpreted as ideological drift in English-only audits. The implication is methodological: ideological drift should not be treated as a language-invariant property of a model, but as a measurement outcome conditional on language choice and instrument design. Multilingual audits should therefore report language-specific placements and within-model cross-language displacement rather than extrapolating from English-only measurements. |
| Keywords: | large language models, ideological drift, multilingual evaluation, Political Compass, language dependence |
| JEL: | C83 C90 C18 D72 |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:udt:wpgobi:wp_gob_2026_07 |
| By: | Tiwari, Sapan (RMIT University); Jafari, Afshin; Pemberton, Steve; Ziemke, Dominik |
| Abstract: | Cycling network evaluation and agent-based transport simulations commonly rely on shortest-path routing, implicitly assuming that cyclists minimise travel distance. However, empirical evidence shows that cycling route choice reflects trade-offs between safety, comfort, infrastructure quality, and topography. This study develops a behaviourally informed cycling routing framework that integrates public participation GIS (PPGIS) based route-choice modelling with agent-based simulation. Marginal utilities estimated using a Path Size Logit (PSL) model are transformed into link-level impedance factors and embedded within the agent-based transport simulation model MATSim, enabling cyclists’ behavioural preferences to directly influence network-wide route assignment while holding travel demand constant. The framework is evaluated against both shortest-path routing and observed cycling routes using the same origin-destination pairs. Results show that impedance-based routing more closely reproduces observed route characteristics, particularly in terms of exposure to low-stress links, speed environments, and cycling infrastructure use. At the network level, behaviourally informed routing increases low-stress exposure by 31.4% and reduces high-stress exposure by 41.5%, while the use of off-road and protected cycling facilities increases by 118.5%. Average exposure to higher-speed traffic environments decreases by 21.3%, accompanied by a modest 3.9% increase in trip length. Embedding behavioural impedance within the agent-based model also substantially alters the emergent exposure of cyclists to motorised traffic, reducing total network-wide exposure by up to 43.5% relative to shortest-path assignment and redistributing cycling flows away from high-speed arterial corridors toward lower-stress alternatives. 
These findings demonstrate that conventional shortest-path routing in agent-based models can systematically misrepresent cyclist exposure and infrastructure utilisation. The proposed framework provides a practical method for integrating behavioural evidence into agent-based cycling routing, enabling more realistic evaluation of cycling networks, safety outcomes, and infrastructure investments. |
| Date: | 2026–04–18 |
| URL: | https://d.repec.org/n?u=RePEc:osf:socarx:kt94s_v1 |
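The contrast between shortest-path and impedance-based routing in the abstract above can be sketched on a toy network. This is not MATSim code and not the authors' transformation of Path Size Logit utilities: the four-link network, the `impedance` factors, and the helper `shortest_path` are all hypothetical. The only point illustrated is that scaling link length by a behavioural factor can change the assigned route.

```python
import heapq

def shortest_path(graph, source, target, cost):
    """Dijkstra's algorithm; graph[u] is a list of (v, link) pairs and
    cost(link) returns a non-negative number. Assumes target is reachable."""
    best = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, link in graph.get(node, []):
            cand = dist + cost(link)
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(queue, (cand, nxt))
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy network: a short arterial route via B and a longer cycleway route via C.
# Lengths are in metres; impedance factors are made-up illustrative values.
graph = {
    "A": [("B", {"length": 1000, "impedance": 1.8}),
          ("C", {"length": 1200, "impedance": 0.9})],
    "B": [("D", {"length": 1000, "impedance": 1.8})],
    "C": [("D", {"length": 1100, "impedance": 0.9})],
}

by_distance = shortest_path(graph, "A", "D", lambda l: l["length"])
by_impedance = shortest_path(graph, "A", "D",
                             lambda l: l["length"] * l["impedance"])
print(by_distance, by_impedance)
```

In this toy case the distance-minimising route uses the arterial (2000 m), while the impedance-weighted route accepts a 2300 m trip along the cycleway, mirroring the trade-off between a modest increase in trip length and lower-stress exposure that the abstract reports.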
| By: | Michelle Yin; Hoa Vu; Claudia Persico |
| Abstract: | A rapidly growing literature estimates AI's labor-market effects using large language models (LLMs) to self-assess occupational exposure. We demonstrate that these measures are highly fragile. Replicating the dominant rubric with three frontier models on identical tasks, we find a 3.6-fold divergence in mean exposure with agreement as low as 57%. This measurement instability alters downstream empirical conclusions: in a difference-in-differences framework, individual-level coefficient magnitudes vary 2.4-fold across annotators, and county-level estimates flip from a significant negative to an insignificant positive depending on the annotator. We formalize this non-classical measurement error, highlighting the risks of treating evolving LLMs as static instruments. |
| JEL: | C81 J23 J24 O33 |
| Date: | 2026–04 |
| URL: | https://d.repec.org/n?u=RePEc:nbr:nberwo:35110 |
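The two fragility statistics quoted in the abstract above (fold divergence in mean exposure and pairwise agreement) can be computed as in the following sketch. Everything here is an assumption made for illustration: the annotator names, the binary coding of exposure, and the eight toy task labels. None of it reproduces the paper's rubric or data.

```python
from itertools import combinations

# Hypothetical binary exposure labels for the same eight tasks,
# one list per LLM annotator (toy values, not the paper's data).
labels = {
    "model_a": [1, 1, 0, 1, 0, 1, 1, 0],
    "model_b": [0, 1, 0, 0, 0, 1, 0, 0],
    "model_c": [1, 1, 1, 1, 0, 1, 1, 1],
}

# Mean exposure per annotator and the ratio of the largest to the smallest.
mean_exposure = {name: sum(v) / len(v) for name, v in labels.items()}
fold_divergence = max(mean_exposure.values()) / min(mean_exposure.values())

# Share of tasks on which each pair of annotators assigns the same label.
agreement = {
    (a, b): sum(x == y for x, y in zip(labels[a], labels[b])) / len(labels[a])
    for a, b in combinations(labels, 2)
}
lowest_agreement = min(agreement.values())

print(mean_exposure, fold_divergence, agreement, lowest_agreement)
```

Reporting both numbers is useful because they capture different failures: annotators can disagree on level (a high fold divergence) while still ranking tasks similarly, or agree on level while disagreeing task by task.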