From Centralized to Self-Supervised: Pursuing Realistic Multi-Agent Reinforcement Learning
Published in arXiv, 2023
Recommended citation: Xiang, V., Cross, L., Fränken, J. P., & Haber, N. (2023). From Centralized to Self-Supervised: Pursuing Realistic Multi-Agent Reinforcement Learning. arXiv preprint arXiv:2312.08662. https://arxiv.org/abs/2312.08662
Abstract
In real-world environments, autonomous agents rely on their egocentric observations. They must learn adaptive strategies to interact with others who possess mixed motivations, discernible only through visible cues. Several Multi-Agent Reinforcement Learning (MARL) methods adopt centralized approaches that involve either centralized training or reward-sharing, often violating the realistic ways in which living organisms, like animals or humans, process information and interact. MARL strategies deploying decentralized training with intrinsic motivation offer a self-supervised alternative, enabling agents to develop flexible social strategies purely through the interaction of autonomous agents. However, by contrasting the self-supervised and centralized methods, we reveal that populations trained with reward-sharing methods surpass those using self-supervised methods in a mixed-motive environment. We link this superiority to the emergence of specialized roles and each agent's expertise in its role. Interestingly, this gap shrinks in pure-motive settings, emphasizing the need for evaluations in more complex, realistic (mixed-motive) environments. Our preliminary results suggest a gap in population performance that can be closed by improving self-supervised methods, thereby pushing MARL closer to real-world readiness.
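
To make the contrast in the abstract concrete, here is a minimal, hypothetical sketch (not the paper's implementation) of how per-agent training signals differ under centralized reward-sharing versus decentralized training with intrinsic motivation; all function names, the mixing coefficient beta, and the example numbers are illustrative assumptions.

```python
import numpy as np

def reward_sharing(env_rewards: np.ndarray) -> np.ndarray:
    """Centralized reward-sharing (sketch): every agent is trained on the
    mean of the population's environment rewards, coupling their learning
    signals across agents."""
    return np.full_like(env_rewards, env_rewards.mean())

def self_supervised(env_rewards: np.ndarray,
                    intrinsic_rewards: np.ndarray,
                    beta: float = 0.5) -> np.ndarray:
    """Decentralized, self-supervised training (sketch): each agent keeps its
    own egocentric environment reward plus a self-generated intrinsic bonus;
    no rewards are pooled across the population."""
    return env_rewards + beta * intrinsic_rewards

# Hypothetical single timestep for three agents with mixed motives.
env_r = np.array([1.0, 0.0, -0.5])     # egocentric environment returns
intr_r = np.array([0.2, 0.6, 0.3])     # assumed intrinsic bonuses
print(reward_sharing(env_r))           # identical shared signal for all agents
print(self_supervised(env_r, intr_r))  # individual, self-supervised signals
```

In the shared-reward case every agent optimizes the same pooled signal, whereas in the self-supervised case each agent's update depends only on its own observations and intrinsic bonus, which is the distinction the population comparison in the abstract rests on.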