The Commonplace

AgentSociety shows that autonomous LLM agents can be organized with liquid-democracy-style delegation and selective information disclosure so that self-interested agents produce consensus routing and earn payoffs tied to marginal contribution. The authors prove incentive compatibility and Nash equilibrium properties and demonstrate improved collaborative outcomes in simulated benchmarks using open and proprietary language models.

AgentSociety: Incentivizing Agentic Social Intelligence
Aditya Vema Reddy Kesari, Krishna Reddy Kesari · May 25, 2026
arXiv · theoretical · medium evidence · 7/10 relevance
AgentSociety is a mechanism that uses delegation and incentivized selective disclosure to align autonomous LLM agents' local incentives with collective routing and contribution outcomes; formal proofs and simulation benchmarks support incentive compatibility and payoffs that reflect marginal contributions.

The success of deployed agents relies on their ability to handle open-ended user requests using their inherent capabilities, not only in solving requests directly but also in effectively leveraging inter-agent communication channels and feedback signals over time. This requires a multi-agent environment where agents can operate autonomously, strategically communicate, behave collaboratively and be driven by economic incentives, much like humans in society. Towards this vision, we propose $\mathtt{AgentSociety}$, a mechanism that enables decentralized agentic collaboration grounded in liquid democracy and information diffusion from social choice theory. We show that $\mathtt{AgentSociety}$ provides an environment for agents to make autonomous decisions utilizing their local context to maximize their utility while achieving collective outcomes through incentivized collaboration. Specifically, we prove that delegation to more competent neighbor agents is incentive compatible and naturally generates multi-agent routing path by consensus. Additionally, our mechanism incentivizes agents to selectively disclose information to their neighbor agents when doing so aligns with their self-interest, so as to garner influence. We characterize the Nash equilibrium showing that agent payoffs are reflective of their marginal contributions. We compare and benchmark strategy profiles adopted by open and proprietary state-of-the-art language models deployed in $\mathtt{AgentSociety}$ against best response. Finally, we evaluate collaborative performance from consensus-based routing among self-interested heterogeneous agents in $\mathtt{AgentSociety}$ on real-world datasets.

Summary

Main Finding

AgentSociety is a decentralized mechanism that economically incentivizes self-interested heterogeneous agents (LLM-based agents) to form multi-agent routing paths and selectively share information so as to jointly execute open-ended user requests. Grounded in liquid democracy and diffusion-auction principles, the mechanism makes delegation to more competent neighbors incentive-compatible, yields consensus-based routing paths through transitive vote delegation, and pays agents in proportion to their marginal contribution. The paper both proves key properties (incentive compatibility, contiguous critical chains, marginal-payoff structure at equilibrium) and empirically benchmarks agent social reasoning and collaborative performance on real-world NLP evaluation suites.

Key Points

  • Objective and setting

    • Model: multi-agent system as an undirected graph G = (N, E) with community structure; agents have private intrinsic competences, local state and tools.
    • Interaction: broadcast user request Q decomposed into ordered tasks; agents participate if capable and can delegate votes or diffuse competence signals to neighbors.
    • Goal: align self-interested agents to produce socially useful, consensus-driven routing paths for completing multi-task requests.
  • Mechanism design ingredients

    • Combines diffusion auctions (economic compensation, marginal-contribution payoffs) with liquid democracy (transitive delegation of votes).
    • Agents submit per-request tuples (reported competence c′, delegation target v, declared neighbor links r) to a ledger; agents can also selectively diffuse competence signals ĉ to their neighbors to shape peers’ beliefs.
    • Ledger constructs routing paths by aggregating transitive votes: gurus (agents who keep their vote) accumulate delegations; path selection maximizes total votes across feasible multi-task chains.
  • Theoretical guarantees

    • Theorem 4.1 (Incentive Compatibility in Delegation): Given local beliefs (diffused signals), an agent strictly benefits by delegating to a neighbor that appears more competent than itself; thus rational agents delegate to more competent neighbors (as judged locally).
    • Theorem 4.2 (Critical Chain): For each task, the set of agents whose delegation choices are critical to the winning path forms a contiguous delegation chain (each agent delegates to the next).
    • Payoff structure: agents in the critical chain receive payments that reflect marginal increments in scaled reported competence (see Eq. 1). Aggregate payments tie the user's total cost to the competence of the executing guru. Penalties exist for infeasible connections and for verifiable misreporting.
  • Strategic social actions

    • Diffusion: agents selectively disclose competence signals to neighbors only when doing so is expected to increase their influence and payoffs. The paper formalizes diffusion as minimally sufficient information disclosure to maximize expected utility.
    • Nash equilibrium characterization: at equilibrium, agents’ payoffs align with their true marginal contributions given others’ strategies.
  • Empirical evaluation

    • Benchmarks: social reasoning of various open and proprietary LLMs deployed as agents inside AgentSociety is compared against best-response strategies on three dimensions — competence reporting, diffusion behavior, and delegation choices.
    • Collaborative performance: heterogeneous self-interested agents are evaluated on real-world datasets (MMLU-Pro, Open LeaderBoard v2, SWE-bench) to measure consensus-based routing outcomes and task performance gains from collaboration.
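
The transitive vote delegation described under "Mechanism design ingredients" can be illustrated with a minimal Python sketch for a single task. This is not the paper's ledger implementation: the function names, the dictionary encoding of delegations, and the choice to discard votes caught in a delegation cycle are all assumptions made for illustration. The paper's path selection across multi-task chains is not modeled here.

```python
from collections import Counter

def resolve_guru(agent, delegate_to):
    """Follow delegations from `agent` until reaching an agent that keeps its vote.

    `delegate_to` maps each agent to its delegation target, or to None when
    the agent keeps its vote (a guru). Hypothetical encoding for illustration.
    """
    seen = set()
    while delegate_to.get(agent) is not None:
        if agent in seen:
            return None  # assumption: votes caught in a cycle are discarded
        seen.add(agent)
        agent = delegate_to[agent]
    return agent

def count_votes(delegate_to):
    """Tally, for every guru, the votes that reach it through transitive delegation."""
    votes = Counter()
    for agent in delegate_to:
        guru = resolve_guru(agent, delegate_to)
        if guru is not None:
            votes[guru] += 1
    return votes

# a -> b -> c (c keeps its vote); e -> d (d keeps its vote)
delegations = {"a": "b", "b": "c", "c": None, "d": None, "e": "d"}
votes = count_votes(delegations)
```

In this toy example guru `c` accumulates its own vote plus those of `a` and `b`, while guru `d` holds its own vote and that of `e`.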

Data & Methods

  • Formal model

    • POSG formulation: agents operate in a Partially Observable Stochastic Game where each agent has a private competence vector $c_i$, observes only local diffused signals from neighbors, and maximizes its own cumulative payoff.
    • State components: intrinsic competences $\{c_i\}$, neighbor diffused competence signals $\{\hat{c}^{<\tau}_{j \to i}\}$, payoff histories $\Pi^{<\tau}_i$.
    • Actions: per-request reporting ($c'$), delegation decision ($v_i$: keep vote or delegate to an intra-community neighbor), declared links $r_i$ to enable path extension; diffusion actions $h^{\tau}_j$ control what competence signals neighbors receive.
  • Ledger and routing

    • Ledger is a transparent aggregator that (i) collects reports, (ii) computes transitive delegation pools D(g) and vote counts V(g), (iii) extends paths across tasks by selecting representative delegates with maximal downstream reported competence, and (iv) chooses the feasible path P* with maximal accumulated votes.
    • Critical agents are identified by counterfactual: an agent is critical if flipping its vote changes the winning path. Payments are computed to capture marginal contributions over the critical chain.
  • Payoff formula and penalties

    • Local payoff per task for an agent in a critical chain: $p^{(t_k)}(n_i)$ is the difference in a monotone scaling $f(\cdot)$ of adjacent reported competences, plus a base execution cost for the guru; the aggregate payoff subtracts misreporting and infeasibility penalties.
    • Misreporting is distinguished from strategic signaling: overclaim is allowed (non-penalized) if supported by neighbors’ previously diffused claims; otherwise penalized.
  • Experiments

    • Agents are instantiated using a variety of LLMs (open and proprietary) to simulate heterogeneous LLM-based multi-agent systems (LaMAS).
    • Tasks/datasets: MMLU-Pro, Open LeaderBoard v2, SWE-bench (multi-task and multi-domain benchmarks used to simulate real-world requests split into tasks).
    • Evaluations: measure alignment of LLM-agent behaviors to best-response strategies (competence reporting, diffusion, delegation), measure collaborative task success (delivered competence, user cost proportional to competence), and study dynamics of consensus routing paths across repeated requests.
    • Metrics reported: utility/payoff per agent, path vote accumulation, delivered competence on tasks, and relative gains from delegation/diffusion compared to single-agent baselines.
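
The payoff structure under "Payoff formula and penalties" can be sketched in Python under stated assumptions. The paper's exact formula (its Eq. 1) is not reproduced in this summary, so everything concrete here is hypothetical: the square root as the monotone scaling f, the zero-competence baseline for the first agent in the chain, and the function and parameter names. Penalties are omitted. The point of the sketch is only that payments defined as adjacent differences telescope, so the total paid depends on the guru's reported competence.

```python
import math

def chain_payments(reported, base_cost, f=math.sqrt):
    """Per-agent payments along one critical chain for a single task.

    `reported` lists reported competences c' from the first delegator to the
    guru (last element). Each agent receives the increment in f over its
    predecessor; the guru additionally receives the base execution cost.
    Assumptions: f is sqrt, and the first agent is compared to a baseline of 0.
    """
    if not reported:
        raise ValueError("critical chain must contain at least one agent")
    payments = []
    previous = 0.0  # assumed baseline, not taken from the paper
    for c in reported:
        payments.append(f(c) - f(previous))
        previous = c
    payments[-1] += base_cost  # guru executes the task
    return payments

chain = [0.25, 0.49, 0.81]  # hypothetical reported competences
pay = chain_payments(chain, base_cost=0.1)
total = sum(pay)
```

Because the increments telescope, `total` equals f of the guru's reported competence plus the base cost (given f(0) = 0), which matches the summary's statement that aggregate payments tie the user's total cost to the competence of the executing guru.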

Implications for AI Economics

  • Principled incentives for multi-agent coordination

    • AgentSociety shows a concrete mechanism tying individual payoffs to marginal contributions, creating natural incentives for agents to (i) delegate to genuinely more competent peers, (ii) act as intermediaries when that increases their marginal payoff, and (iii) selectively reveal information that increases their future influence.
    • This economic grounding can enable decentralized marketplaces of agents (multi-provider ecosystems) where providers compete on competence and network influence rather than just single-shot benchmark scores.
  • Market design for agentic services

    • The ledger-and-delegation architecture suggests a workable primitive for agent markets: transitive delegation implements demand routing; payoff rules implement pricing proportional to value delivered; penalties and verifiability limit blatant misreporting.
    • Agents (and their providers) would internalize the value of information diffusion and network position — giving rise to pricing of information access, brokerage roles, and strategic link formation in agent economies.
  • Benchmarking and governance

    • AgentSociety offers a social-intelligence benchmark for LLMs that complements solo performance metrics. Economically meaningful interactions (delegation, diffusion, bargaining for votes) become first-class evaluation dimensions.
    • Transparent ledger operations and verifiable counterfactuals (used to compute marginal contributions) provide an auditable trail for payments and accountability, which is important for governance in multi-provider agent markets.
  • Risks and practical considerations

    • Strategic manipulation: while the mechanism penalizes verifiable misreporting, agents may collude or coordinate diffusion to create misleading local beliefs; designing robust anti-collusion safeguards and reputation systems will be important.
    • Information leakage trade-offs: selective diffusion is incentivized, but economic incentives may still cause undesirable leakage of sensitive model capabilities or proprietary information.
    • Verification and ledger design: the mechanism relies on a ledger capable of verifying competence reports against diffused signals and historic outcomes; engineering practical, privacy-preserving, and scalable ledgers is nontrivial.
    • Welfare and distributional questions: marginal-payoff payments allocate value but may concentrate rents in well-connected agents or providers; mechanism tweaks or redistribution policies may be required to ensure fairness at scale.
  • Directions for AI-economic policy and research

    • Design and regulation of agent marketplaces (pricing rules, anti-collusion policy, transparency requirements).
    • Mechanism extensions: dynamic pricing, budget constraints, multi-principal settings (multiple users), and robustness to sybil attacks.
    • Empirical study of real provider behavior: how commercial LLM providers would opportunistically set diffusion/reporting policies, and market equilibria that emerge when agents are run by competing firms.

Summary takeaway: AgentSociety operationalizes a socio-economic mechanism that makes decentralized, self-interested LLM agents socially productive via vote delegation, selective information diffusion, and marginal-contribution payoffs. It provides both formal guarantees (incentive compatibility, critical chain structure, marginal-payoff alignment) and practical evaluation of how current LLMs behave in such economically grounded multi-agent settings, offering a concrete foundation for designing agent marketplaces, social-intelligence benchmarks, and governance tools for agentic ecosystems.

Assessment

Paper Type: theoretical
Evidence Strength: medium. The paper presents formal proofs (incentive compatibility, Nash equilibrium characterization) giving strong internal validity for the mechanism's theoretical properties, and supplements these with simulations/benchmarks using real-world datasets and multiple LLMs; however, it does not provide field or causal identification linking the mechanism to real-world economic outcomes, and empirical tests are limited to simulated agent societies and selected datasets/models.
Methods Rigor: high. Rigor is high on the theoretical side (mechanism design proofs and equilibrium analysis) and the empirical side uses benchmarks comparing open and proprietary LLMs and strategy profiles against best-response baselines; potential weaknesses are typical of simulation studies (choice of datasets, agent utility specifications, and implementation details that affect reproducibility and external validity).
Sample: Simulated multi-agent environment (AgentSociety) populated with heterogeneous autonomous agents implemented using open and proprietary language models; experiments evaluate consensus-based routing, delegation, and selective disclosure on unspecified 'real-world' datasets to benchmark collaborative performance and compare agents' strategies to best-response profiles.
Themes: human_ai_collab, org_design, governance
Generalizability:

  • Results derive from simulated agent societies, not field deployments in firms or markets.
  • Agents are modelled as rational utility-maximizers; human behavior and organizational frictions are not represented.
  • Performance depends on the chosen LLMs, prompts, and dataset tasks; different models or domains may yield different outcomes.
  • Mechanism assumes specific communication/interaction protocols that may not map to real-world platforms.
  • Scale effects (very large societies, network dynamics over long time horizons) are not empirically established.
  • Benchmark tasks and datasets are not fully specified, limiting inference about general task domains or industries.

Claims (6)

Claim | Direction | Confidence | Outcome | Details
We propose AgentSociety, a mechanism that enables decentralized agentic collaboration grounded in liquid democracy and information diffusion from social choice theory. [Organizational Efficiency] | positive | high | ability of agents to operate autonomously, strategically communicate, behave collaboratively and be driven by economic incentives | 0.12
Delegation to more competent neighbor agents is incentive compatible and naturally generates multi-agent routing path by consensus. [Task Allocation] | positive | high | delegation behavior and emergence of routing paths (multi-agent routing by consensus) | 0.2
The mechanism incentivizes agents to selectively disclose information to their neighbor agents when doing so aligns with their self-interest, in order to garner influence. [Organizational Efficiency] | positive | high | information disclosure behavior and influence acquisition among agents | 0.12
We characterize the Nash equilibrium showing that agent payoffs are reflective of their marginal contributions. [Organizational Efficiency] | positive | high | agent payoffs relative to marginal contributions | 0.2
We compare and benchmark strategy profiles adopted by open and proprietary state-of-the-art language models deployed in AgentSociety against best response. [Decision Quality] | null_result | high | strategy profiles of open and proprietary language models versus best-response | 0.12
We evaluate collaborative performance from consensus-based routing among self-interested heterogeneous agents in AgentSociety on real-world datasets. [Team Performance] | positive | high | collaborative performance from consensus-based routing | 0.12
