Divergent depths of strategic reasoning among AI agents undermine teamwork, but agents that infer and adapt to partners’ Theory-of-Mind depth recover most coordination losses across matrix, navigation and Overcooked tasks.

Adaptive Theory of Mind for LLM-based Multi-Agent Coordination

Chunjiang Mu, Ya Zeng, Qiaosheng Zhang, Kun Shao, Chen Chu, Hao Guo, Danyang Jia, Zhen Wang, Shuyue Hu · March 17, 2026

arXiv · Paper type: descriptive · Evidence strength: medium · Relevance: 7/10

Mismatch in recursive reasoning depth between agents degrades coordination, while an adaptive ToM agent that infers partners' reasoning depth restores alignment and improves joint performance across multiple coordination tasks.

Theory of Mind (ToM) refers to the ability to reason about others' mental states, and higher-order ToM involves considering that others also possess their own ToM. Equipping large language model (LLM)-driven agents with ToM has long been considered to improve their coordination in multiagent collaborative tasks. However, we find that misaligned ToM orders (mismatches in the depth of ToM reasoning between agents) can lead to insufficient or excessive reasoning about others, thereby impairing their coordination. To address this issue, we design an adaptive ToM (A-ToM) agent, which can align in ToM orders with its partner. Based on prior interactions, the agent estimates the partner's likely ToM order and leverages this estimation to predict the partner's action, thereby facilitating behavioral coordination. We conduct empirical evaluations on four multi-agent coordination tasks: a repeated matrix game, two grid navigation tasks and an Overcooked task. The results validate our findings on ToM alignment and demonstrate the effectiveness of our A-ToM agent. Furthermore, we discuss the generalizability of our A-ToM to non-LLM-based agents, as well as what would diminish the importance of ToM alignment.

Summary

Main Finding

Misalignment in Theory of Mind (ToM) order between agents—i.e., differences in how deeply agents model others’ reasoning—can harm coordination: too little or too much recursive reasoning leads to poor joint behavior. An adaptive ToM (A-ToM) agent that infers its partner’s ToM order from past interactions and uses that estimate to predict the partner’s actions restores alignment and improves coordination across a range of multiagent tasks.

Key Points

Theory of Mind (ToM) order denotes how many levels of “I think that you think …” reasoning an agent uses. Higher-order ToM means deeper recursive modeling.
Order mismatches (agents using different ToM depths) can produce insufficient or excessive anticipation of others’ actions, reducing coordination quality.
A-ToM: an agent that adaptively estimates its partner’s ToM order from prior interactions and conditions its predictions and action choices on that estimate.
Empirical evaluation across four coordination environments (a repeated matrix game, two grid navigation tasks, and an Overcooked task) shows:
- Clear performance degradation when ToM orders are mismatched.
- A-ToM agents recover coordination performance by aligning effective ToM depth with partners.
The approach is argued to generalize beyond LLM-driven partners to other agent classes; the authors also identify settings where ToM alignment becomes less important (e.g., when explicit coordination protocols or strong signaling render deep inference unnecessary).
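To make the notion of ToM order concrete, the following is a minimal sketch of recursive reasoning in a toy two-action matching game (both agents are rewarded only if they pick the same action next round). Everything here is an illustrative assumption rather than the paper's setup: the function names `act` and `coordinated`, the convention that an order-0 agent expects its partner to repeat its last action, and the convention that an order-k agent models its partner as order k-1.

```python
def act(order: int, own_last: int, other_last: int) -> int:
    """Action of a hypothetical order-`order` agent in a matching game.

    Assumed conventions (not taken from the paper):
    - order 0 expects the other agent to repeat its last action and copies it;
    - order k >= 1 models the other agent as order k-1, predicts that
      agent's action, and matches the prediction.
    """
    if order == 0:
        return other_last
    # Simulate the other agent one level shallower, from its point of view.
    predicted_other = act(order - 1, other_last, own_last)
    return predicted_other


def coordinated(order_a: int, order_b: int, last_a: int = 0, last_b: int = 1) -> bool:
    """True if agents A and B choose the same action after a mismatched round."""
    return act(order_a, last_a, last_b) == act(order_b, last_b, last_a)


# Rows: order of agent A; columns: order of agent B.
for a in range(3):
    print(a, [coordinated(a, b) for b in range(3)])
```

In this toy, two order-0 agents both switch to the other's last action and miss each other (too little anticipation), while two order-1 agents both stay put expecting the other to adapt (too much anticipation). A pair coordinates only when each partner's actual order is the one the other agent assumes. Whether this coincides with the paper's definition of aligned ToM orders is not stated in this summary.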

Data & Methods

Agents: LLM-driven agents with configurable ToM reasoning depth; comparison includes fixed-order agents versus the proposed adaptive agent.
A-ToM mechanism: based on interaction history, the agent estimates the partner’s likely ToM order and uses that estimate to predict the partner’s next action; those predictions inform the agent’s action choice. (The paper reports the implementation details and estimation procedure; this summary covers only the conceptual design.)
Evaluation environments:
- Repeated matrix game (iterated strategic interactions with payoffs relying on coordination).
- Two grid navigation tasks (spatial coordination requiring complementary movement/planning).
- Overcooked (a standard collaborative multiagent domain with role complementarity and temporally extended coordination).
Metrics: coordination performance measured by task-specific joint payoffs/success rates (e.g., joint reward, task completion/time), compared across matched, mismatched, and A-ToM-aligned pairings.
Robustness checks: tests of generalizability to non-LLM agents and analysis of conditions reducing the need for ToM alignment (e.g., availability of explicit conventions or highly aligned incentives).
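The estimate-then-predict loop described for the A-ToM mechanism can be sketched as a simple Bayesian update over candidate partner orders. This is a hedged illustration, not the paper's estimation procedure (which this summary does not specify): the candidate partner models `order0_partner` and `order1_partner`, the noise parameter `eps`, the uniform prior, and the two-action matching game are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# One entry per round: (my_action, partner_action). Actions are 0 or 1.
History = List[Tuple[int, int]]


def order0_partner(history: History) -> int:
    """Hypothetical order-0 partner: expects me to repeat my last action and copies it."""
    return history[-1][0] if history else 0


def order1_partner(history: History) -> int:
    """Hypothetical order-1 partner: expects me to copy its last action, so it repeats it."""
    return history[-1][1] if history else 0


CANDIDATES: Dict[int, Callable[[History], int]] = {0: order0_partner, 1: order1_partner}


def posterior_over_orders(history: History,
                          candidates: Dict[int, Callable[[History], int]] = CANDIDATES,
                          eps: float = 0.1) -> Dict[int, float]:
    """Posterior over partner orders, starting from a uniform prior.

    Each candidate model is assumed to play its predicted action with
    probability 1 - eps and the other action with probability eps.
    The first round is skipped because no prior history exists for it.
    """
    weights = {k: 1.0 for k in candidates}
    for t in range(1, len(history)):
        observed = history[t][1]
        for k, model in candidates.items():
            predicted = model(history[:t])
            weights[k] *= (1.0 - eps) if predicted == observed else eps
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def choose_action(history: History,
                  candidates: Dict[int, Callable[[History], int]] = CANDIDATES,
                  eps: float = 0.1) -> int:
    """Pick the most probable partner order, predict its action, and match it."""
    post = posterior_over_orders(history, candidates, eps)
    best_order = max(post, key=post.get)  # ties resolve to the first listed order
    return candidates[best_order](history)


# I alternated 0, 1, 0 while the partner kept playing 1.
history: History = [(0, 1), (1, 1), (0, 1)]
print(posterior_over_orders(history))
print(choose_action(history))
```

With this history, the partner's refusal to copy my first action counts against the order-0 model, so the posterior favors order 1 (about 0.9 with `eps = 0.1`) and the agent matches the partner's repeated action. In this toy, deeper candidate models collapse onto the same two behaviors, so only two orders are distinguishable; richer tasks would need richer candidate models.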

Implications for AI Economics

Coordination efficiency and welfare: Heterogeneous reasoning depths among AI agents can create persistent inefficiencies in markets, platforms, and collaborative production. Adaptive modeling of partners’ reasoning levels can reduce coordination losses and improve aggregate outcomes.
Mechanism and platform design: Platforms and market designers should account for heterogeneity in agents’ strategic sophistication. Providing tools for rapid inference (or standardizing interfaces/conventions) can substitute for deep ToM and reduce frictions.
Contracting and team formation: In settings where AI agents represent human principals or act autonomously in teams, aligning ToM orders (or using adaptive agents) can raise joint productivity and reduce the need for costly renegotiation or corrective mechanisms.
Strategic behavior and manipulation: Agents that infer others’ reasoning depth may be vulnerable to strategic misrepresentation (agents behaving to induce a mistaken ToM estimate). Designers must consider incentives to mislead and potential robustness measures (e.g., conservatism, verification, meta-reasoning about deception).
Regulation and safety: Ensuring transparency about agents’ decision rules or promoting standard coordination protocols could limit harmful dynamics caused by divergent internal models and reduce negative externalities (e.g., market instability from mismatched anticipatory strategies).
When ToM alignment matters less: In many economic applications, explicit contracts, institutional rules, strong common incentives, or cheap, reliable signaling can render deep ToM inference unnecessary. Resources spent on adaptive ToM may yield diminishing returns in such environments.
Broader relevance: The results highlight an underappreciated dimension of agent heterogeneity—reasoning depth—that complements more familiar heterogeneities (preferences, information, computation). Accounting for it can improve models of strategic interaction in automated markets, bargaining, matching platforms, and multiagent service ecosystems.

Assessment

Paper Type: descriptive
Evidence Strength: medium. The paper systematically demonstrates the mechanism across multiple simulated environments (matrix game, two grid tasks, Overcooked) and includes robustness checks, which supports internal validity for the claim that ToM misalignment harms coordination and that adaptation helps. However, the evidence is confined to simulated agents (primarily LLM-driven) and synthetic tasks, with no field or real-market data, limited agent heterogeneity, and potential sensitivity to model and hyperparameter choices, so external validity for real-world economic settings is limited.
Methods Rigor: medium. The design manipulates the key variable (ToM order) and compares suitably controlled conditions, reports multiple environments and metrics, and includes robustness analyses. However, the description lacks information on sample sizes, statistical inference (e.g., statistical significance, confidence intervals), sensitivity to LLM prompts and temperature, and strategic manipulation by partners, leaving some methodological details and threats to inference unaddressed.
Sample: Simulated multiagent interactions using configurable LLM-driven agents with fixed ToM depths and an adaptive A-ToM agent that infers partner ToM order from interaction history; evaluated across four environments: a repeated coordination matrix game, two grid-navigation coordination tasks, and the Overcooked collaborative domain; metrics are task-specific joint payoffs, success rates, and completion times; robustness checks include interactions with non-LLM agent classes and alternative signaling/coordination regimes.
Themes: human_ai_collab, productivity, org_design, adoption
Identification: Controlled simulation experiments that manipulate agents' Theory-of-Mind (ToM) order and compare outcomes across matched, mismatched, and adaptive (A-ToM) pairings; causal claims rely on within-environment contrasts (same tasks, varying only ToM depth and algorithmic adaptation) and robustness checks including non-LLM agents and alternative coordination regimes.
Generalizability:
- Results are from simulated environments and may not transfer directly to real-world markets, firms, or human–AI teams.
- Primary agents are LLM-driven; other deployed AI architectures or constrained agents may behave differently.
- Evaluations focus on pairwise or small-team settings; scaling to many-agent platforms or market-level interactions is untested.
- Potential strategic behavior (deliberate deception to manipulate inferred ToM) and incentive effects in economic settings are noted but not empirically resolved.
- Task domains (matrix game, grids, Overcooked) capture coordination primitives but omit richer institutional, contractual, and information structures present in real economies.

Claims (10)

Claim	Category	Direction	Confidence	Outcome	Details
Misalignment in Theory-of-Mind (ToM) order between agents (i.e., agents using different recursive reasoning depths) degrades coordination performance.	Team Performance	negative	medium	coordination performance (joint payoff, task success rate, task completion/time)	0.11
Both too little and too much recursive reasoning (i.e., too shallow or too deep ToM) can produce poor joint behavior; miscalibrated anticipation harms coordination.	Team Performance	negative	medium	coordination performance (joint payoff, success rate)	0.11
An adaptive ToM (A-ToM) agent that infers its partner's ToM order from prior interactions and conditions its predictions and actions on that estimate restores alignment and improves coordination.	Team Performance	positive	medium	coordination performance (joint payoff, success rate, task completion time)	0.11
A-ToM recovers coordination performance by aligning its effective ToM depth with partners across a range of multiagent tasks.	Team Performance	positive	medium	coordination performance (joint payoff, success rate)	0.11
Empirical evaluation was performed across four coordination environments: a repeated matrix game, two grid navigation tasks, and an Overcooked task.	Team Performance	positive	high	coordination performance (joint payoff, success rate) as used in experiments	0.18
The core findings (harm from ToM order mismatches and benefits from A-ToM) are robust to partners beyond LLM-driven agents.	Team Performance	positive	medium	coordination performance (joint payoff, success rate) when paired with non-LLM agents	0.11
ToM alignment matters less (i.e., misalignment has smaller effect) in settings with explicit coordination protocols, strong signaling, or standardized conventions.	Team Performance	null_result	medium	difference in coordination performance between matched and mismatched ToM orders (joint payoff, success rate) under explicit-protocol / strong-signaling conditions	0.11
Agents that attempt to infer others' reasoning depth may be vulnerable to strategic misrepresentation (partners could behave to induce incorrect ToM estimates).	Decision Quality	negative	medium	vulnerability to strategic manipulation (qualitative risk and proposed mitigations; no specific quantitative metric provided in summary)	0.11
Heterogeneity in agents' reasoning depth is an underappreciated source of coordination inefficiency in economic settings; adaptive modeling can improve aggregate outcomes (welfare, efficiency) in markets, platforms, and teams.	Organizational Efficiency	mixed	low	aggregate coordination efficiency/welfare (joint productivity, reduced renegotiation costs); claimed qualitatively rather than measured directly in economic field settings	0.05
The A-ToM mechanism operates by estimating a partner's likely ToM order from interaction history and using that estimate to predict the partner's next action, which then informs the agent's policy choices.	Team Performance	positive	high	accuracy/usefulness of inferred ToM order for partner-action prediction and subsequent coordination performance (measured via joint payoff / success rate in experiments)	0.18