Divergent depths of strategic reasoning among AI agents undermine teamwork, but agents that infer and adapt to partners’ Theory-of-Mind depth recover most coordination losses across matrix, navigation and Overcooked tasks.
Theory of Mind (ToM) refers to the ability to reason about others' mental states, and higher-order ToM involves considering that others also possess their own ToM. Equipping large language model (LLM)-driven agents with ToM has long been considered to improve their coordination in multiagent collaborative tasks. However, we find that misaligned ToM orders (mismatches in the depth of ToM reasoning between agents) can lead to insufficient or excessive reasoning about others, thereby impairing their coordination. To address this issue, we design an adaptive ToM (A-ToM) agent, which can align in ToM orders with its partner. Based on prior interactions, the agent estimates the partner's likely ToM order and leverages this estimation to predict the partner's action, thereby facilitating behavioral coordination. We conduct empirical evaluations on four multi-agent coordination tasks: a repeated matrix game, two grid navigation tasks and an Overcooked task. The results validate our findings on ToM alignment and demonstrate the effectiveness of our A-ToM agent. Furthermore, we discuss the generalizability of our A-ToM to non-LLM-based agents, as well as what would diminish the importance of ToM alignment.
Summary
Main Finding
Misalignment in Theory of Mind (ToM) order between agents—i.e., differences in how deeply agents model others’ reasoning—can harm coordination: too little or too much recursive reasoning leads to poor joint behavior. An adaptive ToM (A-ToM) agent that infers its partner’s ToM order from past interactions and uses that estimate to predict the partner’s actions restores alignment and improves coordination across a range of multiagent tasks.
Key Points
- Theory of Mind (ToM) order denotes how many levels of “I think that you think …” reasoning an agent uses. Higher-order ToM means deeper recursive modeling.
- Order mismatches (agents using different ToM depths) can produce insufficient or excessive anticipation of others’ actions, reducing coordination quality.
- A-ToM: an agent that adaptively estimates its partner’s ToM order from prior interactions and conditions its predictions and action choices on that estimate.
- Empirical evaluation across four coordination environments (a repeated matrix game, two grid navigation tasks, and an Overcooked task) shows:
  - Clear performance degradation when ToM orders are mismatched.
  - A-ToM agents recover coordination performance by aligning effective ToM depth with partners.
- The approach is argued to generalize beyond LLM-driven partners to other agent classes; the authors also identify settings where ToM alignment becomes less important (e.g., when explicit coordination protocols or strong signaling render deep inference unnecessary).
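The notion of ToM order in the points above can be made concrete with a toy level-k sketch. Everything here is an illustrative assumption rather than the paper's setup: a one-shot, two-action pure coordination game, hypothetical function names `act` and `coordinated`, and the common level-k convention that an order-k agent models its partner as order k-1.

```python
def act(order, own_default, partner_default):
    """Action of an agent with the given ToM order (toy assumption).

    Order 0 ignores the partner and plays its own default action.
    Order k > 0 assumes the partner reasons at order k - 1, simulates the
    partner under that assumption, and matches the predicted action.
    """
    if order == 0:
        return own_default
    # Simulate the partner from the partner's point of view (defaults swap).
    return act(order - 1, partner_default, own_default)


def coordinated(order_1, order_2, default_1="A", default_2="B"):
    """True if the two agents end up choosing the same action."""
    action_1 = act(order_1, default_1, default_2)
    action_2 = act(order_2, default_2, default_1)
    return action_1 == action_2


# Print the outcome for a few order pairings.
for pair in [(0, 0), (0, 1), (1, 1), (0, 2), (1, 2)]:
    print(pair, coordinated(*pair))
```

In this toy, an agent coordinates when its assumption about the partner's order is correct (orders 0 and 1, or 1 and 2). It fails when it reasons too shallowly (0 and 0) or too deeply (0 and 2). With only two actions the outcome depends on the parity of the order gap, which is an artifact of the toy and not a claim about the paper's environments.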
Data & Methods
- Agents: LLM-driven agents with configurable ToM reasoning depth; comparison includes fixed-order agents versus the proposed adaptive agent.
- A-ToM mechanism: from the interaction history, the agent estimates the partner’s likely ToM order and uses that estimate to predict the partner’s next action; those predictions then inform the agent’s own action choice. (The paper reports the implementation details and estimation procedure; this summary reflects only the conceptual design.)
- Evaluation environments:
  - Repeated matrix game (iterated strategic interactions with payoffs relying on coordination).
  - Two grid navigation tasks (spatial coordination requiring complementary movement/planning).
  - Overcooked (a standard collaborative multiagent domain with role complementarity and temporally extended coordination).
- Metrics: coordination performance measured by task-specific joint payoffs/success rates (e.g., joint reward, task completion/time), compared across matched, mismatched, and A-ToM-aligned pairings.
- Robustness checks: tests of generalizability to non-LLM agents and analysis of conditions reducing the need for ToM alignment (e.g., availability of explicit conventions or highly aligned incentives).
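The estimate-then-predict loop described under the A-ToM mechanism can be sketched as follows. This is a minimal illustration, not the paper's procedure: the summary does not specify the estimator, so a simple Bayesian update over candidate orders is swapped in as an assumption. The function names, the `noise` parameter, and the example history are all hypothetical. Each history entry pairs the action that each candidate order would have produced in that round with the action the partner actually took.

```python
def update_posterior(belief, predicted_by_order, observed, n_actions, noise=0.1):
    """One Bayesian update of the belief over the partner's ToM order.

    An order whose predicted action matches the observation gets likelihood
    1 - noise; otherwise the noise mass is spread over the other actions.
    """
    unnormalized = {}
    for order, prob in belief.items():
        if predicted_by_order[order] == observed:
            likelihood = 1.0 - noise
        else:
            likelihood = noise / (n_actions - 1)
        unnormalized[order] = prob * likelihood
    total = sum(unnormalized.values())
    return {order: weight / total for order, weight in unnormalized.items()}


def estimate_order(history, orders=(0, 1, 2), n_actions=3, noise=0.1):
    """Start from a uniform prior and fold in every observed round."""
    belief = {order: 1.0 / len(orders) for order in orders}
    for predicted_by_order, observed in history:
        belief = update_posterior(belief, predicted_by_order, observed,
                                  n_actions, noise)
    return belief


def predict_partner_action(belief, predicted_by_order):
    """Predict the partner's next action under the most probable order."""
    best_order = max(belief, key=belief.get)
    return predicted_by_order[best_order]


# Hypothetical history: (action each order would take, action observed).
history = [
    ({0: "left", 1: "up", 2: "right"}, "up"),
    ({0: "up", 1: "up", 2: "left"}, "up"),
    ({0: "right", 1: "left", 2: "left"}, "left"),
]
belief = estimate_order(history)
next_action = predict_partner_action(belief, {0: "up", 1: "right", 2: "left"})
print(belief, next_action)
```

Acting on the single most probable order keeps the sketch short. Weighting the predicted actions by the full posterior would be a more conservative alternative when the belief is still spread across several orders.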
Implications for AI Economics
- Coordination efficiency and welfare: Heterogeneous reasoning depths among AI agents can create persistent inefficiencies in markets, platforms, and collaborative production. Adaptive modeling of partners’ reasoning levels can reduce coordination losses and improve aggregate outcomes.
- Mechanism and platform design: Platforms and market designers should account for heterogeneity in agents’ strategic sophistication. Providing tools for rapid inference (or standardizing interfaces/conventions) can substitute for deep ToM and reduce frictions.
- Contracting and team formation: In settings where AI agents represent human principals or act autonomously in teams, aligning ToM orders (or using adaptive agents) can raise joint productivity and reduce the need for costly renegotiation or corrective mechanisms.
- Strategic behavior and manipulation: Agents that infer others’ reasoning depth may be vulnerable to strategic misrepresentation (agents behaving to induce a mistaken ToM estimate). Designers must consider incentives to mislead and potential robustness measures (e.g., conservatism, verification, meta-reasoning about deception).
- Regulation and safety: Ensuring transparency about agents’ decision rules or promoting standard coordination protocols could limit harmful dynamics caused by divergent internal models and reduce negative externalities (e.g., market instability from mismatched anticipatory strategies).
- When ToM alignment matters less: In many economic applications, explicit contracts, institutional rules, strong common incentives, or cheap, reliable signaling can render deep ToM inference unnecessary. Resources spent on adaptive ToM may yield diminishing returns in such environments.
- Broader relevance: The results highlight an underappreciated dimension of agent heterogeneity—reasoning depth—that complements more familiar heterogeneities (preferences, information, computation). Accounting for it can improve models of strategic interaction in automated markets, bargaining, matching platforms, and multiagent service ecosystems.
Assessment
Claims (10)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| Misalignment in Theory-of-Mind (ToM) order between agents (i.e., agents using different recursive reasoning depths) degrades coordination performance. | Team Performance | negative | medium | coordination performance (joint payoff, task success rate, task completion/time) | 0.11 |
| Both too little and too much recursive reasoning (i.e., too shallow or too deep ToM) can produce poor joint behavior: miscalibrated anticipation harms coordination. | Team Performance | negative | medium | coordination performance (joint payoff, success rate) | 0.11 |
| An adaptive ToM (A-ToM) agent that infers its partner's ToM order from prior interactions and conditions its predictions and actions on that estimate restores alignment and improves coordination. | Team Performance | positive | medium | coordination performance (joint payoff, success rate, task completion time) | 0.11 |
| A-ToM recovers coordination performance by aligning its effective ToM depth with partners across a range of multiagent tasks. | Team Performance | positive | medium | coordination performance (joint payoff, success rate) | 0.11 |
| Empirical evaluation was performed across four coordination environments: a repeated matrix game, two grid navigation tasks, and an Overcooked task. | Team Performance | positive | high | coordination performance (joint payoff, success rate) as used in experiments | 0.18 |
| The core findings (harm from ToM order mismatches and benefits from A-ToM) are robust to partners beyond LLM-driven agents. | Team Performance | positive | medium | coordination performance (joint payoff, success rate) when paired with non-LLM agents | 0.11 |
| ToM alignment matters less (i.e., misalignment has smaller effect) in settings with explicit coordination protocols, strong signaling, or standardized conventions. | Team Performance | null_result | medium | difference in coordination performance between matched and mismatched ToM orders (joint payoff, success rate) under explicit-protocol / strong-signaling conditions | 0.11 |
| Agents that attempt to infer others' reasoning depth may be vulnerable to strategic misrepresentation (partners could behave to induce incorrect ToM estimates). | Decision Quality | negative | medium | vulnerability to strategic manipulation (qualitative risk and proposed mitigations; no specific quantitative metric provided in summary) | 0.11 |
| Heterogeneity in agents' reasoning depth is an underappreciated source of coordination inefficiency in economic settings; adaptive modeling can improve aggregate outcomes (welfare, efficiency) in markets, platforms, and teams. | Organizational Efficiency | mixed | low | aggregate coordination efficiency/welfare (joint productivity, reduced renegotiation costs); claimed qualitatively rather than measured directly in economic field settings | 0.05 |
| The A-ToM mechanism operates by estimating a partner's likely ToM order from interaction history and using that estimate to predict the partner's next action, which then informs the agent's policy choices. | Team Performance | positive | high | accuracy/usefulness of inferred ToM order for partner-action prediction and subsequent coordination performance (measured via joint payoff / success rate in experiments) | 0.18 |