Synthesising Reward Machines for Cooperative Multi-Agent Reinforcement Learning
Giovanni Varricchione, Natasha Alechina, Mehdi Dastani, and 1 more author
In Proceedings of the 20th European Conference on Multi-Agent Systems (EUMAS 2023), 2023
Reward machines have recently been proposed as a means of encoding team tasks in cooperative multi-agent reinforcement learning. The resulting multi-agent reward machine is then decomposed into individual reward machines, one for each member of the team, allowing agents to learn in a decentralised manner while still achieving the team task. However, current work assumes the multi-agent reward machine to be given. In this paper, we show how reward machines for team tasks can be synthesised automatically from an Alternating-Time Temporal Logic specification of the desired team behaviour and a high-level abstraction of the agents’ environment. We present results suggesting that our automated approach achieves sample efficiency comparable to, if not better than, that of reward machines generated by hand for multi-agent tasks.