AI shopping agents leak consumers’ willingness to pay: sellers recover buyers’ valuations nearly one-for-one from the dialogue of agents briefed with natural-language profiles, creating preference leakage; prompt tweaks won’t solve it, so system-level design must trade personalization for privacy.

When Agents Shop for You: Role Coherence in AI-Mediated Markets

Soogand Alavi, Salar Nozari · April 29, 2026

arXiv · RCT · medium evidence · 8/10 relevance

When consumers delegate purchases to language-model buyer agents that express natural-language profiles, sellers can infer willingness-to-pay from the dialogue almost perfectly, producing preference leakage that prompt-level fixes cannot eliminate and necessitating architectural trade-offs between personalization and privacy.

Consumers are increasingly delegating purchase decisions to AI agents, providing natural-language descriptions of their preferences and identity. We argue that these representations constitute an information channel, role coherence, through which sellers can infer willingness to pay without explicit disclosure by the buyer agent, leading to preference leakage. In an experiment where a language-model buyer agent shops on behalf of a verbal consumer profile, we show that seller-side inference from dialogue alone recovers willingness to pay nearly one-for-one. Comparing this setting to a numeric-budget condition with confidentiality instructions cleanly isolates role coherence as distinct from instruction-following failure. Because this leakage arises from delegation itself, it cannot be mitigated at the prompt level. Instead, we propose architectural interventions that trade off personalization against preference privacy.

Summary

Main Finding

When consumers delegate shopping to language-model buyer agents using natural-language personas (verbal profiles), sellers can infer the consumer’s willingness to pay (WTP) from the agent’s dialogue nearly perfectly. This "role coherence" channel — the agent behaving consistently with the described persona — leaks valuation information even when no dollar amount is disclosed and even when explicit privacy directives are present. Prompt-level privacy directives cannot close this channel; defenses must change the architecture of delegation and trade off personalization for preference privacy.

Key Points

Role coherence: LLM buyer agents conditioned on a natural-language consumer character generate shopping behavior (questions, tradeoffs, deliberation) coherent with that character. That behavioral pattern is informative about the consumer’s WTP.
Experiment contrast: Verbal-profile instructions (persona descriptions, no numeric budget) vs numeric-budget instructions (explicit budget + strict confidentiality).
Strong quantitative result (verbal): inferred WTP tracks target WTP essentially one-for-one.
- OLS slope of inferred on target WTP = 1.00 (bootstrap 95% CI [0.96, 1.05]).
- Cell-level Spearman rank correlation = 1.00.
- Aggregate mean absolute error (MAE) ≈ $48 over $50–$500 targets.
Weak recovery (numeric): with confidential numeric budgets, inferred WTP compresses toward the population mean.
- Slope = 0.21 (95% CI [0.17, 0.26]).
- MAE ≈ $92; rank ordering of inferred WTP only weakly correlated with targets.
Robustness:
- Removing financial language from profiles: slope 0.85 [0.79, 0.91].
- Stripping persona-identifying phrases from transcripts after generation: slope ≈ 0.93; rank correlation remains 1.00. This shows the signal resides in how the agent shops, not explicit profile tokens in dialogue.
- Factorial extension (multiple profile variants, paraphrased prompts): overall slope ~1.00 (CI [0.93, 1.09]).
Conceptual distinction: role coherence (behavioral signal from persona) is distinct from instruction-following failure (agent disclosing a secret token). The former is a faithful execution of the user's request, not a privacy bug.
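
To make the slope figures above concrete, the following is a minimal sketch of how an OLS slope of inferred on target WTP and a percentile bootstrap interval could be computed. It runs on synthetic data only: the `simulate` helper, the noise level, and the pair-resampling bootstrap are illustrative assumptions, not the authors' code or data. A slope near 1 corresponds to one-for-one recovery; a slope near 0.2 corresponds to estimates shrinking toward the grand mean.

```python
import random
import statistics

TARGETS = [50, 100, 150, 200, 300, 500]  # WTP levels listed in the study design


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # slope is undefined when all x are equal
            continue
        slopes.append(ols_slope(xs, [y[i] for i in idx]))
    slopes.sort()
    lo = slopes[int((alpha / 2) * n_boot)]
    hi = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def simulate(true_slope, noise_sd, trials=60, seed=1):
    """Synthetic inferred WTP (assumption): targets shrunk toward the grand mean."""
    rng = random.Random(seed)
    grand_mean = statistics.fmean(TARGETS)
    x, y = [], []
    for t in TARGETS:
        for _ in range(trials):
            x.append(t)
            y.append(grand_mean + true_slope * (t - grand_mean) + rng.gauss(0, noise_sd))
    return x, y


if __name__ == "__main__":
    for label, slope in [("verbal-like", 1.0), ("numeric-like", 0.2)]:
        x, y = simulate(slope, noise_sd=50.0)
        print(label, round(ols_slope(x, y), 2), bootstrap_ci(x, y))
```

The shrinkage form in `simulate` is one simple way to produce a compressed slope; the study's actual error structure may differ.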

Data & Methods

Setting: Simulated agent-mediated shopping for wireless headphones.
Treatments:
- Verbal instruction: natural-language consumer profile evokes WTP tiers but contains no dollar amounts.
- Numeric instruction: explicit budget ($50, $100, $150, $200, $300, $500) included in system prompt with a confidentiality directive.
Agents:
- Buyer agent: large language model conditioned on either verbal or numeric instructions; executed up to four conversational turns with a seller-side agent.
- Seller-side agent: counterpart LLM playing the seller role during conversation.
- Inference agent: separate LLM that reads only the buyer–seller transcript (no catalog or labels) and outputs a point estimate of the buyer's maximum WTP.
Design:
- 6 WTP levels × 2 instruction forms × 60 trials per cell = 720 interactions.
- Robustness checks: stripped financial vocabulary, removal of persona cues from transcripts, cross-profile and paraphrased-prompt factorial extension.
Key metrics: slope and CI of inferred vs target WTP, Spearman rank correlation, mean absolute error.
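
As a rough illustration of the design and the remaining metrics, the sketch below enumerates the 6 × 2 × 60 grid and computes a Spearman rank correlation and MAE for one instruction form. Two points are assumptions rather than details from the paper: "cell-level" is read here as a correlation between each target level and the mean inferred WTP in that cell, and the `records` format of `(target_wtp, inferred_wtp)` pairs is hypothetical.

```python
import itertools
import statistics
from collections import defaultdict

WTP_LEVELS = [50, 100, 150, 200, 300, 500]
INSTRUCTION_FORMS = ["verbal", "numeric"]
TRIALS_PER_CELL = 60


def build_design():
    """Enumerate every (instruction form, target WTP, trial index) interaction."""
    return list(itertools.product(INSTRUCTION_FORMS, WTP_LEVELS, range(TRIALS_PER_CELL)))


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cell_metrics(records):
    """records: iterable of (target_wtp, inferred_wtp) for one instruction form."""
    by_cell = defaultdict(list)
    abs_errors = []
    for target, inferred in records:
        by_cell[target].append(inferred)
        abs_errors.append(abs(inferred - target))
    targets = sorted(by_cell)
    cell_means = [statistics.fmean(by_cell[t]) for t in targets]
    return {"spearman": spearman(targets, cell_means), "mae": statistics.fmean(abs_errors)}
```

Calling `len(build_design())` reproduces the interaction count stated in the design, and `cell_metrics` can be applied separately to the verbal and numeric transcripts.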

Implications for AI Economics

New information channel for personalized pricing: Role coherence creates a direct, high-fidelity channel from consumer-supplied persona to seller knowledge of WTP. Sellers (or their ML systems) can exploit agent-generated dialogue to extract surplus, undermining consumer privacy protections that focus only on explicit disclosures or tracking.
Limits of prompt-level privacy: Privacy instructions that forbid stating a budget do not prevent leakage, because the leakage is distributional (how the agent behaves) rather than token-level. Standard prompt-based guardrails are insufficient.
Market outcomes and welfare:
- Increased price discrimination: Better inference of WTP enables first- or third-degree price discrimination, increasing seller extraction and potentially reducing consumer surplus.
- Competitive dynamics: Platforms and sellers who can integrate inference from agent transcripts gain informational advantages, possibly increasing market power and changing entry dynamics.
- Distributional concerns: Inferred WTP could enable targeted offers that exacerbate inequality or discriminatory pricing absent regulation.
Design and policy responses (architectural interventions suggested):
- Anonymizing intermediary: Insert an intermediary that accepts rich persona inputs and maps them to abstracted constraints/features exposed to sellers, removing identity-linked signals while preserving some personalization.
- Profile rotation / randomized personas: Randomize or rotate persona representations so that transcripts do not consistently reveal a stable WTP signal for an individual, at the cost of reduced personalization fidelity.
- Federated aggregation: Keep persona-conditioned behavior local and only share aggregated/statistical signals that support personalization without exposing individual WTP; requires cryptographic or protocol-level support.
- All approaches involve explicit trade-offs between personalization utility and preference privacy; none are solvable at the prompt level alone.
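
One way to picture the anonymizing-intermediary and randomization ideas above is the sketch below. It is purely illustrative and not the paper's design: the class names, the price bands, and the `randomize_prob` parameter are all hypothetical. The private profile stays on the buyer's side, the seller sees only a coarse price band plus required features, and the final accept/reject check runs locally against the exact WTP.

```python
import random
from dataclasses import dataclass

# Hypothetical coarse price bands exposed to sellers (not from the paper).
PRICE_BANDS = [(0, 100), (100, 250), (250, 600)]


@dataclass(frozen=True)
class PrivateProfile:
    """Rich consumer data that stays on the buyer's side."""
    persona_text: str
    max_wtp: float
    must_have: frozenset


@dataclass(frozen=True)
class SellerFacingRequest:
    """The only object the seller-side agent would see in this sketch."""
    price_band: tuple
    must_have: frozenset


def band_for(wtp):
    """Return the half-open band [lo, hi) containing wtp, or the top band."""
    for lo, hi in PRICE_BANDS:
        if lo <= wtp < hi:
            return (lo, hi)
    return PRICE_BANDS[-1]


def abstract_profile(profile, randomize_prob=0.0, rng=None):
    """Map a private profile to coarse constraints.

    With probability randomize_prob a uniformly random band is reported
    instead of the true one, trading personalization for privacy.
    """
    rng = rng or random.Random()
    if rng.random() < randomize_prob:
        band = rng.choice(PRICE_BANDS)
    else:
        band = band_for(profile.max_wtp)
    return SellerFacingRequest(price_band=band, must_have=profile.must_have)


def accept_offer(profile, price, features):
    """Local final check, so the exact WTP is never sent to the seller."""
    return price <= profile.max_wtp and profile.must_have <= frozenset(features)
```

Wider bands or a higher `randomize_prob` reveal less about the individual but yield offers that fit the consumer less well, which is the personalization-privacy trade-off in miniature.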
Regulatory & product implications:
- Platforms should treat agent transcripts as a sensitive data source for valuations and consider restrictions or disclosure requirements around their use for pricing.
- Consumer-facing agents should make privacy-performance trade-offs explicit and provide configurable privacy-preserving modes.
- Economic models of agent-mediated markets must incorporate role-coherent information flows; equilibrium pricing, welfare analyses, and regulation design should account for this new channel.
Directions for further research:
- Field studies with human delegators and real-world sellers to measure prevalence and welfare impacts.
- Formal models of equilibrium behavior among consumers, agent providers, and sellers when role-coherent leakage is possible.
- Evaluation of proposed architectural defenses on both privacy leakage and personalization performance in realistic deployments.

Assessment

Paper Type: RCT
Evidence Strength: medium. The study uses a randomized experimental contrast that cleanly isolates role coherence from instruction-following failure and reports a large, precise effect (nearly one-for-one recovery of willingness to pay). However, evidence is limited by likely small-scale/lab setting, use of a single language-model agent and limited seller types (human or algorithmic sellers not shown to be representative of real marketplaces), potential demand effects, and lack of field validation.
Methods Rigor: medium. Design strengths include random assignment and a thoughtful control that addresses an important alternative explanation; but rigor is tempered by missing details or limitations typical of a lab experiment, e.g., unknown sample size and power, unclear realism of seller incentives and behavior, single LM architecture, limited product/category scope, and the absence (in the summary) of robustness checks or external replication.
Sample: Experimental data consisting of dialogues generated by a language-model buyer agent given either natural-language consumer profiles or explicit numeric budgets; sellers (either recruited human participants or automated seller algorithms) observed agent-seller interactions and produced inferred willingness-to-pay or pricing decisions; dataset includes agent prompts, full dialogue transcripts, assigned ground-truth willingness-to-pay, and seller inferences/price offers across randomized conditions.
Themes: governance, adoption
Identification: Randomized controlled experiment that assigns buyer-agent inputs across conditions (natural-language consumer profiles versus an explicit numeric-budget condition) while holding the buyer agent and marketplace fixed; comparison of seller-side inferences across these randomized conditions isolates the causal effect of 'role coherence' (information conveyed by the agent’s verbal representation of the consumer) on revealed willingness-to-pay.
Generalizability:
- Single LM architecture and prompt formulation: effects may vary across different models or prompt designs.
- Lab or simulated marketplace setting: limited evidence from real-world e-commerce platforms.
- Unknown seller population: human participants or algorithmic sellers used may not represent professional merchants or large platform pricing algorithms.
- Limited product categories and stakes: results may differ for high-value, recurring, or highly standardized purchases.
- Cultural and regulatory context not specified: cross-country differences in bargaining norms and privacy rules may change magnitudes.

Claims (6)

Claim	Category	Direction	Confidence	Outcome	Details
Consumers are increasingly delegating purchase decisions to AI agents, providing natural-language descriptions of their preferences and identity.	Adoption Rate	positive	high	use of AI agents for purchase delegation / prevalence of natural-language preference descriptions	0.3
Natural-language consumer representations constitute an information channel, 'role coherence', through which sellers can infer willingness to pay without explicit disclosure by the buyer agent, leading to preference leakage.	Consumer Welfare	negative	high	ability of seller to infer buyer willingness to pay from buyer-agent representations (privacy leakage)	0.6
In an experiment where a language-model buyer agent shops on behalf of a verbal consumer profile, seller-side inference from dialogue alone recovers willingness to pay nearly one-for-one.	Consumer Welfare	positive	high	accuracy of seller-side inference of willingness to pay (recovery of WTP)	nearly one-for-one 0.6
Comparing the verbal-profile setting to a numeric-budget condition with confidentiality instructions cleanly isolates role coherence as distinct from instruction-following failure.	Consumer Welfare	positive	high	mechanism attribution (role coherence vs. instruction-following failure) for observed preference leakage	0.6
Because this leakage arises from delegation itself, it cannot be mitigated at the prompt level.	AI Safety and Ethics	negative	medium	effectiveness of prompt-level mitigation (confidentiality instructions) in preventing preference leakage	0.06
Architectural interventions can instead be used to trade off personalization against preference privacy.	AI Safety and Ethics	mixed	high	trade-off between personalization and preference privacy under architectural interventions	0.1