nep-mst New Economics Papers
on Market Microstructure
Issue of 2025–08–18
five papers chosen by
Thanos Verousis, Vlerick Business School


  1. Order Book Filtration and Directional Signal Extraction at High Frequency By Aditya Nittur Anantha; Shashi Jain; Prithwish Maiti
  2. Low-Rank Structured Nonparametric Prediction of Instantaneous Volatility By Sung Hoon Choi; Donggyu Kim
  3. ByteGen: A Tokenizer-Free Generative Model for Orderbook Events in Byte Space By Yang Li; Zhi Chen
  4. Retail Investors’ Contrarian Behavior Around News, Attention, and the Momentum Effect By Patrick Luo; Enrichetta Ravina; Marco C. Sammon; Luis M. Viceira
  5. AI-Powered Trading, Algorithmic Collusion, and Price Efficiency By Winston Wei Dou; Itay Goldstein; Yan Ji

  1. By: Aditya Nittur Anantha; Shashi Jain; Prithwish Maiti
    Abstract: With the advent of electronic capital markets and algorithmic trading agents, the number of events in tick-by-tick market data has exploded. A large fraction of these orders is transient. Their ephemeral character degrades the informativeness of directional alphas derived from the limit order book (LOB) state. We investigate whether directional signals such as order book imbalance (OBI) can be improved by structurally filtering high-frequency LOB data. We consider three real-time, observable filtration schemes, based on order lifetime, update count, and inter-update delay, and use them to recompute OBI on structurally filtered event streams. To assess the effect of filtration, we implement a three-layer diagnostic framework: contemporaneous correlation with returns, explanatory power under discretized regime counts, and causal coherence via Hawkes excitation norms. Empirical results show that structural filtration improves directional signal clarity in correlation and regime-based metrics, but leads to only limited gains in causal excitation strength. In contrast, OBI computed using trade events exhibits stronger causal alignment with future price movements. These findings highlight the importance of differentiating between associative and causal diagnostics when designing high-frequency directional signals.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.22712
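
As a rough illustration of the idea in paper 1, the sketch below recomputes order book imbalance after dropping short-lived orders. It assumes the common definition OBI = (bid volume - ask volume) / (bid volume + ask volume); the `RestingOrder` type, the `min_lifetime` threshold, and the sample book are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RestingOrder:
    side: str         # "bid" or "ask"
    size: float       # resting quantity
    placed_at: float  # time the order entered the book, in seconds


def order_book_imbalance(orders, now, min_lifetime=0.0):
    """Return (bid - ask) / (bid + ask) over orders aged at least min_lifetime.

    min_lifetime is an assumed filter parameter; 0.0 means no filtering.
    """
    bid = sum(o.size for o in orders
              if o.side == "bid" and now - o.placed_at >= min_lifetime)
    ask = sum(o.size for o in orders
              if o.side == "ask" and now - o.placed_at >= min_lifetime)
    total = bid + ask
    # An empty (or fully filtered) book is treated as balanced.
    return 0.0 if total == 0 else (bid - ask) / total


# Hypothetical book snapshot observed at t = 10 s.
book = [
    RestingOrder("bid", 100, 0.0),
    RestingOrder("bid", 300, 9.9),  # placed 0.1 s ago, likely transient
    RestingOrder("ask", 200, 1.0),
]

raw_obi = order_book_imbalance(book, now=10.0)
filtered_obi = order_book_imbalance(book, now=10.0, min_lifetime=1.0)
```

In this toy snapshot the unfiltered signal points to buying pressure, while the lifetime-filtered signal flips sign once the fleeting bid is excluded. The update-count and inter-update-delay filters mentioned in the abstract would slot into the same predicate.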
  2. By: Sung Hoon Choi; Donggyu Kim
    Abstract: Based on Itô semimartingale models, several studies have proposed methods for forecasting intraday volatility using high-frequency financial data. These approaches typically rely on restrictive parametric assumptions and are often vulnerable to model misspecification. To address this issue, we introduce a novel nonparametric prediction method for the future intraday instantaneous volatility process during trading hours, which leverages both previous days' data and the current day's observed intraday data. Our approach imposes an interday-by-intraday matrix representation of the instantaneous volatility, which is decomposed into a low-rank conditional expectation component and a noise matrix. To predict the future conditional expected volatility vector, we exploit this low-rank structure and propose the Structural Intraday-volatility Prediction (SIP) procedure. We establish the asymptotic properties of the SIP estimator and demonstrate its effectiveness through an out-of-sample prediction study using real high-frequency trading data.
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2507.22173
  3. By: Yang Li; Zhi Chen
    Abstract: Generative modeling of high-frequency limit order book (LOB) dynamics is a critical yet unsolved challenge in quantitative finance, essential for robust market simulation and strategy backtesting. Existing approaches are often constrained by simplifying stochastic assumptions or, in the case of modern deep learning models like Transformers, rely on tokenization schemes that affect the high-precision, numerical nature of financial data through discretization and binning. To address these limitations, we introduce ByteGen, a novel generative model that operates directly on the raw byte streams of LOB events. Our approach treats the problem as an autoregressive next-byte prediction task, for which we design a compact and efficient 32-byte packed binary format to represent market messages without information loss. The core novelty of our work is the complete elimination of feature engineering and tokenization, enabling the model to learn market dynamics from its most fundamental representation. We achieve this by adapting the H-Net architecture, a hybrid Mamba-Transformer model that uses a dynamic chunking mechanism to discover the inherent structure of market messages without predefined rules. Our primary contributions are: 1) the first end-to-end, byte-level framework for LOB modeling; 2) an efficient packed data representation; and 3) a comprehensive evaluation on high-frequency data. Trained on over 34 million events from CME Bitcoin futures, ByteGen successfully reproduces key stylized facts of financial markets, generating realistic price distributions, heavy-tailed returns, and bursty event timing. Our findings demonstrate that learning directly from byte space is a promising and highly flexible paradigm for modeling complex financial systems, achieving competitive performance on standard market quality metrics without the biases of tokenization.
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:arx:papers:2508.02247
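
To make the "32-byte packed binary format" in paper 3 concrete, here is a minimal sketch of a fixed-width message encoding. The abstract does not give the field layout, so every field below (timestamp, order id, price in integer ticks, size, side, event type, flags) is an assumption chosen only so that the widths sum to 32 bytes.

```python
import struct

# Hypothetical layout, little-endian with no padding:
#   Q  timestamp in nanoseconds   (8 bytes)
#   Q  order id                   (8 bytes)
#   q  price in integer ticks     (8 bytes)
#   I  size                       (4 bytes)
#   B  side (0 = bid, 1 = ask)    (1 byte)
#   B  event type                 (1 byte)
#   H  flags                      (2 bytes)
MESSAGE = struct.Struct("<QQqIBBH")


def pack_event(ts_ns, order_id, price_ticks, size, side, event_type, flags=0):
    """Encode one order book event as a fixed-width byte string."""
    return MESSAGE.pack(ts_ns, order_id, price_ticks, size, side,
                        event_type, flags)


def unpack_event(raw):
    """Decode a byte string produced by pack_event back into a tuple."""
    return MESSAGE.unpack(raw)


event = (1_722_000_000_123_456_789, 42, 11_850_000, 3, 0, 1, 0)
raw = pack_event(*event)

# A byte-level model would see the message as a sequence of integers 0..255.
byte_sequence = list(raw)
```

Storing prices as integer ticks rather than floats keeps the round trip exact, which is one way to read the abstract's "without information loss". An autoregressive next-byte model would then be trained on `byte_sequence` values concatenated across events.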
  4. By: Patrick Luo; Enrichetta Ravina; Marco C. Sammon; Luis M. Viceira
    Abstract: Using a large and representative panel of U.S. brokerage accounts, we show that retail investors trade as contrarians after large earnings surprises, especially for loser stocks, and that such contrarian trading contributes to price momentum and post earnings announcement drift (PEAD). We show that extreme return streaks and surprises are not enough for stocks to exhibit PEAD and momentum and that the intensity of contrarian retail trading plays a key role: the PEAD of loser stocks with bad earnings surprises becomes increasingly more negative as retail buying pressure increases, and the PEAD of the stocks with the highest past returns and largest earnings surprises is the most positive for the stocks with the biggest net retail outflow. Finer sorts confirm the results, as do sorts by firm size and institutional ownership level. Younger and more attentive individuals are more likely to be contrarian, and a firm’s dividend yield, leverage, size, book to market, and analyst coverage are associated with the fraction of contrarian trades they face around earnings announcements. The disposition effect and stale limit orders, while present in our sample, do not explain our results. Our findings are consistent with investors’ conservatism, sticky beliefs, and cognitive uncertainty, as well as an incorrect belief in the Law of Small Numbers.
    JEL: G0 G11 G12 G4 G41 G5
    Date: 2025–08
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34086
  5. By: Winston Wei Dou; Itay Goldstein; Yan Ji
    Abstract: The integration of algorithmic trading with reinforcement learning, termed AI-powered trading, is transforming financial markets. Alongside the benefits, it raises concerns about collusion. This study first develops a model to explore the possibility of collusion among informed speculators in a theoretical environment. We then conduct simulation experiments, replacing the speculators in the model with informed AI speculators who trade based on reinforcement-learning algorithms. We show that they autonomously sustain collusive supra-competitive profits without agreement, communication, or intent. Such collusion undermines competition and market efficiency. We demonstrate that two separate mechanisms underlie this collusion and characterize when each one arises.
    JEL: D43 G10 G14 L13
    Date: 2025–07
    URL: https://d.repec.org/n?u=RePEc:nbr:nberwo:34054

This nep-mst issue is ©2025 by Thanos Verousis. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.