
Achieving collective welfare in multi-agent reinforcement learning via suggestion sharing

  • Yue Jin
  • Shuangqing Wei
  • Giovanni Montana

Research output: Contribution to Journal, Article, peer-review

Abstract

In human society, the conflict between self-interest and collective well-being often obstructs efforts to achieve shared welfare. Related concepts like the Tragedy of the Commons and Social Dilemmas frequently manifest in our daily lives. As artificial agents increasingly serve as autonomous proxies for humans, we propose a novel multi-agent reinforcement learning (MARL) method to address this issue: learning policies that maximise collective returns even when individual agents’ interests conflict with the collective one. Unlike traditional cooperative MARL solutions, which share rewards, values, or policies, or design intrinsic rewards to encourage agents to learn collectively optimal policies, our approach has agents exchange action suggestions. It reveals less private information than sharing rewards, values, or policies, while enabling effective cooperation without the need to design intrinsic rewards. The algorithm is supported by a theoretical analysis that establishes a bound on the discrepancy between collective and individual objectives, demonstrating how sharing suggestions can align agents’ behaviours with the collective objective. Experimental results demonstrate that our algorithm performs competitively with baselines that rely on value or policy sharing or intrinsic rewards.
Original language: English
Article number: 190
Number of pages: 27
Journal: Machine Learning
Volume: 114
DOIs
Publication status: Published - 15 Jul 2025

Keywords

  • Multi-agent reinforcement learning
  • Collective welfare
  • Cooperation
  • Suggestion sharing
