Towards Generalized Trading Intelligence: A Hybrid DRL and MARL Approach for Arbitrage and High-Frequency Markets
Harshvardhan Paliwal
Paper Contents
Abstract
The merging of Deep Reinforcement Learning (DRL) and Multi-Agent Reinforcement Learning (MARL) offers a distinctive way to tackle problems in modern financial markets, including arbitrage and high-frequency trading (HFT). This paper presents a hybrid architecture that uses MARL to coordinate multiple competing trading agents and DRL for strategic decision-making under uncertainty. Drawing on lessons from recent DRL trading system implementations, multi-agent optimization algorithms, real-time arbitrage systems, and adaptive models, the framework learns dynamic trading strategies in both collaborative and competitive environments. It seeks to integrate HFT decisions made at the microstructure level with broader market-level arbitrage activities. Experimental validation is performed on historical limit order book data and multistage price flows. The hybrid model shows improved profitability, reduced drawdowns, and improved risk-adjusted returns compared with standalone DRL or MARL agents. The main innovations are the integration of communication protocols between agents, the construction of environments with multi-agent dynamics, and reward function engineering for time-critical market events. The results suggest that generalized trading intelligence systems rooted in the synergy of DRL and MARL provide a robust and scalable strategy for navigating volatility and fragmentation in high-frequency markets.
Copyright
Copyright © 2025 Harshvardhan Paliwal. This is an open access article distributed under the Creative Commons Attribution License.