From single agent to multiple agents

In standard reinforcement learning, our agent was alone in its environment: it was not cooperating or competing with other agents. In multi-agent reinforcement learning (MARL), we have multiple agents that share and interact in a common environment.

For instance, you can think of a warehouse where multiple robots need to navigate to load and unload packages. Or a road with several autonomous vehicles.

In these examples, we have multiple agents interacting with the environment and with the other agents. This implies defining a multi-agent system. But first, let’s understand the different types of multi-agent environments.

Different types of multi-agent environments

Given that in a multi-agent system, agents interact with other agents, we can have different types of environments:

Cooperative environments: where your agents need to maximize the common benefits. For instance, in a warehouse, robots must collaborate to load and unload the packages as efficiently (as fast) as possible.
Competitive/Adversarial environments: in that case, your agent wants to maximize its benefits by minimizing its opponent's. For example, in a game of tennis, each agent wants to beat the other agent.
Mixed environments: both adversarial and cooperative. For example, agents cooperate with their teammates while competing against an opposing team.
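
To make the difference between these three types more concrete, here is a minimal sketch of how rewards could be assigned in each case. Everything in it is an assumption made for illustration: the function names, the agent names (robot_0, player_a, blue_0, and so on), and the reward values (+1 / -1) are hypothetical and do not come from any particular library or environment. The competitive case is also simplified to a zero-sum game.

```python
from typing import Dict


def cooperative_rewards(packages_delivered: Dict[str, int]) -> Dict[str, float]:
    """Cooperative: every agent receives the same shared team reward.

    Here the (hypothetical) team reward is the total number of packages
    delivered by all robots, so helping a teammate helps everyone.
    """
    team_reward = float(sum(packages_delivered.values()))
    return {agent: team_reward for agent in packages_delivered}


def competitive_rewards(winner: str, loser: str) -> Dict[str, float]:
    """Competitive (assumed zero-sum): one agent's gain is the other's loss."""
    return {winner: 1.0, loser: -1.0}


def mixed_rewards(teams: Dict[str, str], winning_team: str) -> Dict[str, float]:
    """Mixed: teammates share an outcome, while the teams oppose each other.

    `teams` maps each agent name to its team name.
    """
    return {
        agent: 1.0 if team == winning_team else -1.0
        for agent, team in teams.items()
    }


if __name__ == "__main__":
    # Warehouse: both robots get the same reward.
    print(cooperative_rewards({"robot_0": 2, "robot_1": 3}))
    # Tennis: the rewards sum to zero.
    print(competitive_rewards(winner="player_a", loser="player_b"))
    # Two teams of two agents each.
    print(mixed_rewards(
        {"blue_0": "blue", "blue_1": "blue", "purple_0": "purple", "purple_1": "purple"},
        winning_team="blue",
    ))
```

In the cooperative case all agents see the same number, in the competitive case the rewards cancel out, and in the mixed case rewards are shared inside a team but opposed between teams. Real environments usually use richer reward signals, but the structure stays the same.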

A standard way to train agents in MARL, especially in adversarial settings, is self-play.