ML Microstructure Signals

A machine learning system for predicting short-term mid-price moves from order book features. It includes order flow analysis, multiple ML models (LightGBM, LSTM, Transformer), and a backtesting framework that simulates transaction costs.


Introduction

Market microstructure analysis represents one of the most challenging and rewarding areas of quantitative finance. This project explores how machine learning can extract predictive signals from order book dynamics, going beyond traditional technical indicators to understand the underlying market structure.

The system uses order flow imbalance (OFI), spread dynamics, depth variations, and the microprice to predict short-term mid-price movements. It combines classical models such as Logistic Regression and LightGBM with sequence models such as LSTM and Transformers, giving a single framework for generating microstructure-based trading signals.
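To make the OFI feature concrete, here is a minimal sketch of one common best-level formulation (the one popularized by Cont, Kukanov and Stoikov). It is an illustration under stated assumptions, not necessarily the definition used in this project: the function name `order_flow_imbalance` and the four input arrays (best bid/ask prices and sizes, one entry per book update) are hypothetical.

```python
import numpy as np

def order_flow_imbalance(bid_px, bid_sz, ask_px, ask_sz):
    """Best-level OFI per book update; the first element is 0 by convention.

    Assumed inputs: 1-D arrays of best bid/ask prices and sizes, one per update.
    """
    bid_px, bid_sz, ask_px, ask_sz = (
        np.asarray(a, dtype=float) for a in (bid_px, bid_sz, ask_px, ask_sz)
    )
    # Bid side: the new size counts as added demand if the bid did not fall;
    # the previous size counts as removed demand if the bid did not rise.
    bid_flow = (
        np.where(bid_px[1:] >= bid_px[:-1], bid_sz[1:], 0.0)
        - np.where(bid_px[1:] <= bid_px[:-1], bid_sz[:-1], 0.0)
    )
    # Ask side: the mirror image, measuring added and removed supply.
    ask_flow = (
        np.where(ask_px[1:] <= ask_px[:-1], ask_sz[1:], 0.0)
        - np.where(ask_px[1:] >= ask_px[:-1], ask_sz[:-1], 0.0)
    )
    # Positive values indicate net buy pressure, negative values net sell pressure.
    return np.concatenate([[0.0], bid_flow - ask_flow])

ofi = order_flow_imbalance(
    bid_px=[100, 100, 101], bid_sz=[10, 15, 5],
    ask_px=[102, 102, 102], ask_sz=[8, 6, 6],
)
print(ofi)
```

In practice the per-update values are usually summed over a fixed time or event window before being fed to a model; the window length would be a tuning choice.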

Technical Implementation

The project is built using a modular architecture with Hydra for configuration management. Feature extraction includes order book analysis at multiple levels, capturing both immediate market conditions and longer-term patterns. The system processes high-frequency data efficiently using NumPy and Pandas, while PyTorch handles the deep learning components.
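As a rough illustration of multi-level feature extraction with NumPy, the sketch below computes the spread, the microprice, and a cumulative depth imbalance from order book snapshots. The function name `book_features`, the array layout (one row per snapshot, one column per level, best level first), and the exact feature definitions are assumptions made for this example, not the project's actual code.

```python
import numpy as np

def book_features(bid_px, bid_sz, ask_px, ask_sz):
    """Snapshot features from arrays of shape (n_snapshots, n_levels).

    Column 0 is assumed to hold the best level; sizes are assumed positive.
    """
    bid_px, bid_sz, ask_px, ask_sz = (
        np.asarray(a, dtype=float) for a in (bid_px, bid_sz, ask_px, ask_sz)
    )
    best_bid, best_ask = bid_px[:, 0], ask_px[:, 0]
    mid = (best_bid + best_ask) / 2.0
    spread = best_ask - best_bid
    # Microprice: each side's price is weighted by the opposite side's size,
    # so a heavy bid queue pulls the estimate toward the ask.
    microprice = (best_bid * ask_sz[:, 0] + best_ask * bid_sz[:, 0]) / (
        bid_sz[:, 0] + ask_sz[:, 0]
    )
    # Depth imbalance using cumulative size down to each level, in [-1, 1].
    cum_bid, cum_ask = bid_sz.cumsum(axis=1), ask_sz.cumsum(axis=1)
    depth_imbalance = (cum_bid - cum_ask) / (cum_bid + cum_ask)
    return {
        "mid": mid,
        "spread": spread,
        "microprice": microprice,
        "depth_imbalance": depth_imbalance,
    }

features = book_features(
    bid_px=[[100, 99]], bid_sz=[[30, 10]],
    ask_px=[[101, 102]], ask_sz=[[10, 30]],
)
print(features["microprice"], features["depth_imbalance"])
```

Working on whole arrays rather than looping over snapshots is what keeps this kind of computation fast on high-frequency data.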

MLflow integration tracks experiments, so different model architectures and hyperparameters can be compared systematically. The backtesting framework simulates realistic trading conditions, including transaction costs and slippage, and so gives more conservative performance estimates than a backtest that ignores these frictions.
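The effect of costs and slippage on a signal can be sketched with a small vectorized backtest. This is a simplified illustration, not the project's backtesting framework: the function name `backtest`, the cost model (a flat charge in basis points per unit of turnover), and the default values of `cost_bps` and `slippage_bps` are all assumptions.

```python
import numpy as np

def backtest(signal, mid, cost_bps=1.0, slippage_bps=0.5):
    """Gross and net return of a position signal on mid-prices.

    signal[t] in {-1, 0, 1} is the position held from t to t+1, so it only
    uses information available at time t (no look-ahead).
    """
    signal = np.asarray(signal, dtype=float)
    mid = np.asarray(mid, dtype=float)
    ret = np.diff(mid) / mid[:-1]          # simple return over t -> t+1
    pos = signal[:-1]                      # last signal has no following return
    gross = pos * ret
    # Turnover is the absolute change in position, including the initial entry.
    turnover = np.abs(np.diff(np.concatenate([[0.0], pos])))
    costs = turnover * (cost_bps + slippage_bps) * 1e-4
    net = gross - costs
    return {
        "gross": float(gross.sum()),
        "net": float(net.sum()),
        "turnover": float(turnover.sum()),
    }

result = backtest(signal=[1, 1, -1, 0], mid=[100.0, 101.0, 101.0, 100.0])
print(result)
```

Because short-horizon signals trade often, turnover-driven costs can remove most of the gross return, which is why comparing `gross` and `net` side by side is informative.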

"Understanding market microstructure isn't just about predicting prices—it's about understanding the mechanics of how markets function. Every order, every trade, tells a story about supply and demand dynamics."

— Ismail Moudden, Researcher & Developer

Key Features

The system combines the following features, models, and tooling:

  • Order Flow Imbalance (OFI): Multi-level analysis of buy vs. sell pressure
  • Spread & Depth Features: Capturing liquidity and market tightness
  • Microprice Calculation: Weighted mid-price reflecting order book depth
  • Sequence Models: LSTM and Transformer architectures for temporal patterns
  • Walk-Forward Analysis: Out-of-sample validation on successive time windows to limit overfitting and look-ahead bias
  • Real-time Dashboard: Streamlit interface for live signal visualization
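Walk-forward analysis from the list above can be sketched as a generator of rolling train/test index windows. This is a minimal example under assumptions: the name `walk_forward_splits`, the rolling (rather than expanding) training window, and the optional `gap` between train and test are illustrative choices, not confirmed details of this project.

```python
import numpy as np

def walk_forward_splits(n_samples, train_size, test_size, gap=0):
    """Yield (train_idx, test_idx) pairs that move forward in time.

    Each test window starts `gap` samples after its training window ends, so
    labels that look a few steps ahead cannot leak into the test period.
    """
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train_end = start + train_size
        test_start = train_end + gap
        yield (
            np.arange(start, train_end),
            np.arange(test_start, test_start + test_size),
        )
        start += test_size  # roll forward by one test window

for train_idx, test_idx in walk_forward_splits(10, train_size=4, test_size=2, gap=1):
    print(train_idx, test_idx)
```

A model would be refit on each training window and scored only on the test window that follows it, so every reported metric comes from data the model had not seen.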

The comprehensive backtesting framework includes transaction cost simulation, slippage modeling, and regime change detection, ensuring that signals are tested under realistic market conditions.

Research Impact

This project represents ongoing research into market microstructure and machine learning applications in finance. The open-source nature of the codebase allows for collaboration and further development, contributing to the broader quantitative finance community.

A comprehensive research paper documents the methodology, results, and insights gained from this project, providing both theoretical foundations and practical implementation details for others interested in microstructure analysis.
