AI shopping agents leak consumers’ willingness-to-pay: sellers recover buyers’ prices nearly one-for-one from agents’ natural-language profiles, creating pervasive preference leakage; prompt tweaks won’t solve it, so system-level design must trade personalization for privacy.
Consumers are increasingly delegating purchase decisions to AI agents, providing natural-language descriptions of their preferences and identity. We argue that these representations constitute an information channel, role coherence, through which sellers can infer willingness to pay without explicit disclosure by the buyer agent, leading to preference leakage. In an experiment where a language-model buyer agent shops on behalf of a verbal consumer profile, we show that seller-side inference from dialogue alone recovers willingness to pay nearly one-for-one. Comparing this setting to a numeric-budget condition with confidentiality instructions cleanly isolates role coherence as distinct from instruction-following failure. Because this leakage arises from delegation itself, it cannot be mitigated at the prompt level. Instead, we propose architectural interventions that trade off personalization against preference privacy.
Summary
Main Finding
When consumers delegate shopping to language-model buyer agents using natural-language personas (verbal profiles), sellers can infer the consumer’s willingness to pay (WTP) from the agent’s dialogue nearly one-for-one. This "role coherence" channel, in which the agent behaves consistently with the described persona, leaks valuation information even when no dollar amount is disclosed and even when explicit privacy directives are present. Prompt-level privacy directives cannot close this channel; defenses must change the architecture of delegation and trade off personalization for preference privacy.
Key Points
- Role coherence: LLM buyer agents conditioned on a natural-language consumer character generate shopping behavior (questions, tradeoffs, deliberation) coherent with that character. That behavioral pattern is informative about the consumer’s WTP.
- Experiment contrast: Verbal-profile instructions (persona descriptions, no numeric budget) vs numeric-budget instructions (explicit budget + strict confidentiality).
- Strong quantitative result (verbal): inferred WTP tracks target WTP essentially one-for-one.
  - OLS slope of inferred on target WTP = 1.00 (bootstrap 95% CI [0.96, 1.05]).
  - Cell-level Spearman rank correlation = 1.00.
  - Aggregate mean absolute error (MAE) ≈ $48 over $50–$500 targets.
- Weak recovery (numeric): with confidential numeric budgets, inferred WTP compresses toward the population mean.
  - Slope = 0.21 (95% CI [0.17, 0.26]).
  - MAE ≈ $92; rank ordering only weakly correlated with the targets.
- Robustness:
  - Removing financial language from profiles: slope 0.85 [0.79, 0.91].
  - Stripping persona-identifying phrases from transcripts after generation: slope ≈ 0.93; rank correlation remains 1.00. This shows that the signal resides in how the agent shops, not in explicit profile tokens in the dialogue.
  - Factorial extension (multiple profile variants, paraphrased prompts): overall slope ≈ 1.00 (CI [0.93, 1.09]).
- Conceptual distinction: role coherence (behavioral signal from persona) is distinct from instruction-following failure (agent disclosing a secret token). The former is a faithful execution of the user's request, not a privacy bug.
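The recovery metrics reported above (OLS slope with a bootstrap interval, cell-level Spearman correlation, MAE) can be computed along the following lines. This is a minimal sketch, not the authors' analysis code: the function names are hypothetical, the percentile bootstrap is an assumed choice, and the data are synthetic (Gaussian noise around the six target levels), so the printed numbers say nothing about the paper's results.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def bootstrap_slope_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the slope, resampling trials with replacement."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) > 1:  # skip degenerate resamples with one target level
            slopes.append(ols_slope(xs, ys))
    slopes.sort()
    lo = slopes[int((alpha / 2) * len(slopes))]
    hi = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lo, hi

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def cell_means(targets, inferred):
    """Mean inferred WTP per target level (one cell per level)."""
    levels = sorted(set(targets))
    means = [mean(v for t, v in zip(targets, inferred) if t == lvl) for lvl in levels]
    return levels, means

def mae(targets, inferred):
    """Mean absolute error between target and inferred WTP."""
    return mean(abs(t - v) for t, v in zip(targets, inferred))

# Synthetic illustration only (NOT the paper's data): 60 trials per level.
rng = random.Random(42)
LEVELS = [50, 100, 150, 200, 300, 500]
targets = [lvl for lvl in LEVELS for _ in range(60)]
inferred = [t + rng.gauss(0, 40) for t in targets]

slope = ols_slope(targets, inferred)
ci_low, ci_high = bootstrap_slope_ci(targets, inferred)
levels, means = cell_means(targets, inferred)
rho = spearman(levels, means)
error = mae(targets, inferred)
print(f"slope={slope:.2f} CI=[{ci_low:.2f}, {ci_high:.2f}] rho={rho:.2f} MAE=${error:.0f}")
```

Computing the rank correlation on cell means rather than individual trials is one reading of "cell-level"; a slope near 1 with a tight interval indicates one-for-one recovery, whereas a slope near 0 would indicate estimates compressed toward the mean.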
Data & Methods
- Setting: Simulated agent-mediated shopping for wireless headphones.
- Treatments:
  - Verbal instruction: a natural-language consumer profile that evokes WTP tiers but contains no dollar amounts.
  - Numeric instruction: an explicit budget ($50, $100, $150, $200, $300, $500) included in the system prompt with a confidentiality directive.
- Agents:
  - Buyer agent: large language model conditioned on either verbal or numeric instructions; runs up to four conversational turns with a seller-side agent.
  - Seller-side agent: counterpart LLM playing the seller role during the conversation.
  - Inference agent: separate LLM that reads only the buyer–seller transcript (no catalog or labels) and outputs a point estimate of the buyer's maximum WTP.
- Design:
  - 6 WTP levels × 2 instruction forms × 60 trials per cell = 720 interactions.
  - Robustness checks: stripped financial vocabulary, removal of persona cues from transcripts, cross-profile and paraphrased-prompt factorial extension.
- Key metrics: slope and CI of inferred vs target WTP, Spearman rank correlation, mean absolute error.
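The design above can be sketched as a simple loop over cells. The skeleton below is an illustration under stated assumptions, not the authors' harness: `Trial`, `run_trial`, and the three stub agents are hypothetical names, a "turn" is assumed to be one buyer message followed by one seller reply, and the stubs return placeholder values where real LLM calls would go. The one property taken directly from the description is that the inference agent receives only the transcript.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Tuple

WTP_LEVELS = [50, 100, 150, 200, 300, 500]  # target levels from the design
INSTRUCTION_FORMS = ["verbal", "numeric"]
TRIALS_PER_CELL = 60
MAX_TURNS = 4  # buyer agent runs up to four conversational turns

Transcript = List[Tuple[str, str]]  # (speaker, utterance) pairs

@dataclass
class Trial:
    target_wtp: int
    form: str
    replicate: int
    transcript: Transcript = field(default_factory=list)
    inferred_wtp: float = float("nan")

def stub_buyer(target_wtp, form, transcript):
    # Placeholder: a real buyer would be an LLM conditioned on the instruction.
    return f"[buyer message {len(transcript) // 2 + 1}]"

def stub_seller(transcript):
    # Placeholder: a real seller would be an LLM that sees the dialogue so far.
    return f"[seller reply {len(transcript) // 2 + 1}]"

def stub_infer(transcript):
    # Placeholder estimate; a real inference agent would return a WTP in dollars.
    return 0.0

def run_trial(trial, buyer, seller, infer):
    """Run one buyer-seller conversation, then estimate WTP from it."""
    for _ in range(MAX_TURNS):
        trial.transcript.append(
            ("buyer", buyer(trial.target_wtp, trial.form, trial.transcript))
        )
        trial.transcript.append(("seller", seller(trial.transcript)))
    # The inference agent is given the transcript only: no target, form, or catalog.
    trial.inferred_wtp = infer(trial.transcript)
    return trial

trials = [
    run_trial(Trial(wtp, form, rep), stub_buyer, stub_seller, stub_infer)
    for wtp, form, rep in product(WTP_LEVELS, INSTRUCTION_FORMS, range(TRIALS_PER_CELL))
]
print(len(trials))  # 6 levels x 2 forms x 60 replicates
```

Passing only `trial.transcript` to `infer` keeps the separation between what the buyer was told and what the seller side can observe explicit in the code.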
Implications for AI Economics
- New information channel for personalized pricing: Role coherence creates a direct, high-fidelity channel from consumer-supplied persona to seller knowledge of WTP. Sellers (or their ML systems) can exploit agent-generated dialogue to extract surplus, undermining consumer privacy protections that focus only on explicit disclosures or tracking.
- Limits of prompt-level privacy: Privacy instructions that forbid stating a budget do not prevent leakage, because the leakage is distributional (how the agent behaves) rather than token-level. Standard prompt-based guardrails are insufficient.
- Market outcomes and welfare:
  - Increased price discrimination: Better inference of WTP enables first- or third-degree price discrimination, increasing seller extraction and potentially reducing consumer surplus.
  - Competitive dynamics: Platforms and sellers who can integrate inference from agent transcripts gain informational advantages, possibly increasing market power and changing entry dynamics.
  - Distributional concerns: Inferred WTP could enable targeted offers that exacerbate inequality or discriminatory pricing absent regulation.
- Design and policy responses (architectural interventions suggested):
  - Anonymizing intermediary: Insert an intermediary that accepts rich persona inputs and maps them to abstracted constraints/features exposed to sellers, removing identity-linked signals while preserving some personalization.
  - Profile rotation / randomized personas: Randomize or rotate persona representations so that transcripts do not consistently reveal a stable WTP signal for an individual, at the cost of reduced personalization fidelity.
  - Federated aggregation: Keep persona-conditioned behavior local and only share aggregated/statistical signals that support personalization without exposing individual WTP; requires cryptographic or protocol-level support.
  - All approaches involve explicit trade-offs between personalization utility and preference privacy; the leakage cannot be resolved at the prompt level alone.
- Regulatory & product implications:
  - Platforms should treat agent transcripts as a sensitive data source for valuations and consider restrictions or disclosure requirements around their use for pricing.
  - Consumer-facing agents should make privacy-performance trade-offs explicit and provide configurable privacy-preserving modes.
  - Economic models of agent-mediated markets must incorporate role-coherent information flows; equilibrium pricing, welfare analyses, and regulation design should account for this new channel.
- Directions for further research:
  - Field studies with human delegators and real-world sellers to measure prevalence and welfare impacts.
  - Formal models of equilibrium behavior among consumers, agent providers, and sellers when role-coherent leakage is possible.
  - Evaluation of proposed architectural defenses on both privacy leakage and personalization performance in realistic deployments.
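To make the anonymizing-intermediary idea from the design responses above more concrete, the sketch below shows one possible shape for it. Everything here is hypothetical and is not taken from the paper: the `ConsumerProfile` and `AbstractedRequest` types, the feature whitelist, the fixed message template, and the two example personas are all invented for illustration. The assumption being illustrated is that if seller-facing messages depend only on whitelisted constraints, and never on the persona text, then two consumers with the same constraints produce identical dialogue.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Hypothetical whitelist of product features the intermediary may expose.
ALLOWED_FEATURES: FrozenSet[str] = frozenset(
    {"noise_cancelling", "battery_hours_min", "form_factor"}
)

@dataclass(frozen=True)
class ConsumerProfile:
    persona_text: str                 # rich description; never leaves the intermediary
    requirements: Dict[str, object]   # structured needs extracted from the persona

@dataclass(frozen=True)
class AbstractedRequest:
    category: str
    constraints: Dict[str, object]

def abstract_profile(profile: ConsumerProfile, category: str) -> AbstractedRequest:
    """Keep only whitelisted constraints; drop the persona and everything else."""
    constraints = {
        key: value
        for key, value in sorted(profile.requirements.items())
        if key in ALLOWED_FEATURES
    }
    return AbstractedRequest(category=category, constraints=constraints)

def render_seller_messages(request: AbstractedRequest) -> List[str]:
    """Fixed template, so wording and style cannot vary with the persona."""
    lines = [f"Looking for: {request.category}."]
    for key, value in request.constraints.items():
        lines.append(f"Requirement: {key} = {value}.")
    lines.append("Please list all matching items with prices.")
    return lines

# Two made-up personas with the same functional needs.
needs = {"noise_cancelling": True, "battery_hours_min": 20}
student = ConsumerProfile(
    "Graduate student who watches every expense",
    dict(needs),
)
executive = ConsumerProfile(
    "Executive who buys premium gear without checking prices",
    {**needs, "brand_prestige": "high"},  # not whitelisted, so it is dropped
)

msgs_student = render_seller_messages(abstract_profile(student, "wireless headphones"))
msgs_executive = render_seller_messages(abstract_profile(executive, "wireless headphones"))
print(msgs_student == msgs_executive)
```

The cost is visible in the same code: anything outside the whitelist, including preferences the consumer may genuinely care about, is lost, which is the personalization-for-privacy trade-off the summary describes.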
Assessment
Claims (6)
| Claim | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|
| Consumers are increasingly delegating purchase decisions to AI agents, providing natural-language descriptions of their preferences and identity. [Adoption Rate] | positive | high | use of AI agents for purchase delegation / prevalence of natural-language preference descriptions | 0.3 |
| Natural-language consumer representations constitute an information channel, 'role coherence', through which sellers can infer willingness to pay without explicit disclosure by the buyer agent, leading to preference leakage. [Consumer Welfare] | negative | high | ability of seller to infer buyer willingness to pay from buyer-agent representations (privacy leakage) | 0.6 |
| In an experiment where a language-model buyer agent shops on behalf of a verbal consumer profile, seller-side inference from dialogue alone recovers willingness to pay nearly one-for-one. [Consumer Welfare] | positive | high | accuracy of seller-side inference of willingness to pay (recovery of WTP) | nearly one-for-one; 0.6 |
| Comparing the verbal-profile setting to a numeric-budget condition with confidentiality instructions cleanly isolates role coherence as distinct from instruction-following failure. [Consumer Welfare] | positive | high | mechanism attribution (role coherence vs. instruction-following failure) for observed preference leakage | 0.6 |
| Because this leakage arises from delegation itself, it cannot be mitigated at the prompt level. [AI Safety and Ethics] | negative | medium | effectiveness of prompt-level mitigation (confidentiality instructions) in preventing preference leakage | 0.06 |
| Architectural interventions can instead be used to trade off personalization against preference privacy. [AI Safety and Ethics] | mixed | high | trade-off between personalization and preference privacy under architectural interventions | 0.1 |