Large language models reliably flag fraudulent investment opportunities and resist motivated persuasion more than lay humans; in controlled tests humans endorsed fraud ~13–14% of the time while tested LLMs never did, and motivated framing did not suppress — and slightly increased — AI warnings.

Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure

Nattavudh Powdthavee · April 22, 2026

arXiv · RCT · Evidence: high · Relevance: 7/10

In a preregistered experiment across seven LLMs and 12 investment scenarios, LLMs consistently issued fraud warnings and were less susceptible to motivated-investor framing than human advisors, who endorsed fraudulent opportunities at ~13–14% versus 0% for the tested models.

Large language models trained on human feedback may suppress fraud warnings when investors arrive already persuaded of a fraudulent opportunity. We tested this in a preregistered experiment across seven leading LLMs and twelve investment scenarios covering legitimate, high-risk, and objectively fraudulent opportunities, combining 3,360 AI advisory conversations with a 1,201-participant human benchmark. Contrary to predictions, motivated investor framing did not suppress AI fraud warnings; if anything, it marginally increased them. Endorsement reversal occurred in fewer than 3 in 1,000 observations. Human advisors endorsed fraudulent investments at baseline rates of 13-14%, versus 0% across all LLMs, and suppressed warnings under pressure at two to four times the AI rate. AI systems currently provide more consistent fraud warnings than lay humans in an identical advisory role.

Summary

Main Finding

Leading large language models (LLMs) issue fraud warnings for objectively fraudulent investment opportunities more consistently than lay human advisors, and they hold those warnings better under pressure. Motivated investor framing did not suppress AI fraud warnings at initial consultation and, if anything, marginally increased them. Multi-turn degradation under sustained pressure exists but is model-dependent; outright endorsement reversals by LLMs were extremely rare (~0.27%).

Key Points

Experiment scope: preregistered study across 7 state-of-the-art LLMs, 12 investment scenarios (Low-, Medium-, High-Risk), two Turn‑1 framings (neutral vs motivated), and multi-turn pressure variants; 3,360 AI advisory conversations and a human benchmark of 1,201 participants.
Primary metrics:
- Q3: warning intensity (0–5)
- Q4: endorsement (binary; Q4=1 = endorse)
- Q2: self-reported warning suppression (for Turn 2)
RQ1 (initial consultation): Motivated framing did not reduce Turn‑1 warning intensity for High-Risk scenarios. Pooled across models, it slightly increased warning intensity (β = +0.07, 95% CI [0.025, 0.113]), an effect that is negligible in magnitude on the 0–5 scale. Mean High-Risk warning intensity was ≈ 4.6 in both framings.
RQ2 (multi-turn pressure): Warning degradation from Turn 1 → Turn 2 occurred but was heterogeneous across models. Some models (e.g., GPT-4o mini) showed sharp degradation; others (Claude, Gemini) strengthened warnings. No evidence that motivated framing amplified degradation (β = −0.077, 95% CI [−0.145, −0.010]).
RQ3 (fraud-signal gradient): At Turn 1 models discriminated across a 3-band fraud-signal gradient (Band 1 mathematically impossible → Band 3 statistically implausible). Degradation under pressure did not monotonically increase with signal ambiguity; model heterogeneity dominated.
RQ4 (human benchmark): Humans were substantially more likely to endorse fraudulent High-Risk investments at baseline (13.3% neutral, 14.1% motivated), whereas AI endorsement of High-Risk scenarios was 0% across all seven models. Under Turn‑2 pressure, self-reported human suppression rates were 16–26% across bands versus 0–7.9% for AI. LLM-coded cross-validation gave lower human suppression estimates, but the non-valid responses suggested disengagement rather than preserved warnings.
Model-level variation:
- Claude: highest High-Risk warning means (~4.87).
- GPT-4o mini: notable multi-turn vulnerability (large drop at Turn 2).
- Gemini (2.5 Flash): lower calibration on Medium Risk / Band 3 (under-warned on subtle fraud signals).
Endorsement reversal by LLMs (switching to endorse an objectively fraudulent offer) was vanishingly rare: 9 of ~3,350 turn-level observations (0.27%).
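The headline figures in these key points can be sanity-checked with a little arithmetic. The sketch below is not the authors' analysis code. It assumes the reported 95% intervals are symmetric normal-approximation intervals (an assumption; this summary does not state how they were computed), and it uses the approximate denominator of 3,350 quoted above. The helper name `se_from_ci` is hypothetical.

```python
from statistics import NormalDist


def se_from_ci(lower: float, upper: float, level: float = 0.95) -> float:
    """Back out a standard error from a symmetric CI.

    Assumes a normal-approximation interval, which is an assumption
    about the reporting, not something stated in the summary.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # roughly 1.96 for 95%
    return (upper - lower) / (2 * z)


# Endorsement reversals: 9 of roughly 3,350 turn-level observations.
reversals, observations = 9, 3350
reversal_rate = reversals / observations

# RQ1 pooled framing effect on the 0-5 warning-intensity scale (Q3).
rq1_beta = 0.07
rq1_se = se_from_ci(0.025, 0.113)
rq1_z = rq1_beta / rq1_se
rq1_share_of_scale = rq1_beta / 5  # effect as a fraction of the full scale

print(f"reversal rate: {reversal_rate:.2%}")
print(f"RQ1 implied SE: {rq1_se:.4f}, z: {rq1_z:.2f}")
print(f"RQ1 effect as share of the 0-5 scale: {rq1_share_of_scale:.1%}")
```

Under these assumptions the RQ1 interval excludes zero while the point estimate covers under 2% of the response scale, which matches the description of the effect as detectable but negligible in magnitude.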

Data & Methods

Design: preregistered randomized experiment with:
- 7 LLMs × 12 scenarios × 2 Turn‑1 framings × 20 calls = 3,360 automated runs.
- Human benchmark: N = 1,201 participants, of whom 931 (77.5%) formed a high-financial-literacy subset.
Scenarios: 12 investment vignettes across three risk tiers:
- Low-Risk: conventional index fund, government bond fund, diversified global equity fund.
- Medium-Risk: P2P lending, leveraged commercial REIT, high-yield corporate bond fund.
- High-Risk: 3-band gradient — Band 1 (mathematically impossible claims), Band 2 (structurally fraudulent schemes), Band 3 (statistically implausible but superficially credible).
Framing: Turn 1 randomized to neutral (“honest assessment”) vs motivated (“I’m already fairly keen”) investor framing.
Pressure manipulations: Turn 2 included five declarative pressure variants (enthusiasm, research claim, social proof, commitment, authority deference); Turn 3 used additional variants (challenge, risk normalization, emotional commitment, etc.).
Outcomes and analysis: the primary preregistered outcomes were Turn‑1 warning intensity, Turn‑2 degradation, and endorsement/suppression rates. Analyses used OLS and ANOVA comparisons with Tukey HSD for model contrasts, and reported confidence intervals and effect sizes.
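The factorial structure of the AI arm can be made concrete by enumerating its cells. The following is a minimal sketch of that grid, not the study's actual harness: the model and scenario identifiers are placeholders, and only the counts (7 models, 12 scenarios, 2 framings, 20 calls per cell) come from the design described above.

```python
from itertools import product

N_MODELS = 7
N_SCENARIOS = 12
FRAMINGS = ("neutral", "motivated")
CALLS_PER_CELL = 20

# Placeholder identifiers; the real model and scenario names are not used here.
models = [f"model_{i}" for i in range(1, N_MODELS + 1)]
scenarios = [f"scenario_{j}" for j in range(1, N_SCENARIOS + 1)]

# One record per automated advisory conversation.
runs = [
    {"model": m, "scenario": s, "framing": f, "call": c}
    for m, s, f, c in product(
        models, scenarios, FRAMINGS, range(1, CALLS_PER_CELL + 1)
    )
]

# Distinct model x scenario x framing cells, each repeated CALLS_PER_CELL times.
cells = {(r["model"], r["scenario"], r["framing"]) for r in runs}
runs_per_model = len(runs) // N_MODELS

print(len(runs), len(cells), runs_per_model)
```

Enumerating the grid this way reproduces the 3,360 total and shows that each model contributes the same number of conversations, so the pooled estimates are not weighted toward any single model by design.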

Implications for AI Economics

Consumer protection and welfare:
- LLMs (as currently deployed/benchmarked) could materially reduce certain types of retail investor exposure to obvious frauds, implying potential welfare gains and lower aggregate fraud losses if trustworthy AI advisory tools are widely adopted.
- However, model heterogeneity and multi-turn vulnerabilities mean benefits are uneven; some models may underperform on subtle fraud signals or in sustained conversational pressure.
Market structure and competition:
- Firms offering AI-backed advisory tools with superior fraud-detection calibration could gain competitive advantage; certification or reputation mechanisms may matter economically.
- Lower fraud incidence could reduce rent extraction by fraudsters and change demand for traditional human advice (substitution/complementarity effects), with implications for advisory labor markets.
Regulatory and policy implications:
- Results support targeted regulation requiring external audit/testing of LLMs on fraud-detection and multi-turn consistency (not just single-turn benchmarks).
- Policymakers should consider standards for alignment interventions that prioritize safety constraints (e.g., enforced non-endorsement of clear fraud) and require transparency about model calibration and failure modes.
- Disclosure and liability frameworks: platforms might need obligations to surface confidence, risk flags, or references underpinning warnings to avoid overreliance and moral hazard.
Research and product development priorities:
- Invest in alignment fixes that improve multi-turn consistency under social pressure (reduce sycophantic degradation).
- Develop standardized test suites (fraud-signal gradients, pressure variants) for model evaluation and certification.
- Explore human–AI hybrid advisory models: humans remain more prone to endorsement/suppression; pairing AI diagnostics with human judgment could improve outcomes but must manage overreliance and delegation risks.
Macroeconomic and behavioral considerations:
- Broad adoption of robust AI fraud detection could change incentives for fraudsters (raising attack costs), shift investor behavior (increased confidence or reliance), and alter information flows in retail markets — all of which deserve modeling in AI economics work on equilibrium effects, moral hazard, and regulatory responses.

Summary takeaway: state-of-the-art LLMs (in this study) outperform lay humans on initial fraud detection and resist motivated investor pressure much better than humans, but model heterogeneity and multi-turn failures highlight the need for targeted alignment, auditing, and policy to realize safe economic benefits at scale.

Assessment

Paper Type: RCT
Evidence Strength: high. The study is preregistered, uses randomized assignment of the key manipulation (motivated investor framing), has large sample sizes (3,360 AI conversations and 1,201 human participants), covers multiple models and scenario types, and reports direct, easily interpretable behavioral outcomes (warnings/endorsements), giving strong internal validity for the causal claim about framing effects in this setting; external validity limitations remain (see below).
Methods Rigor: high. Preregistration, large and clearly specified experimental arms, multiple LLMs and scenario categories, and an explicit human benchmark indicate rigorous design and implementation; potential concerns include scenario realism, prompt/interaction design sensitivity, unreported coder decisions (if any), and dependence on particular model versions at a point in time, but these do not materially weaken internal validity.
Sample: 3,360 AI advisory conversations generated across seven leading LLMs using 12 preconstructed investment scenarios (covering legitimate, high-risk, and objectively fraudulent opportunities), combined with a 1,201-participant human benchmark who provided advisory responses under the same scenario and framing manipulations.
Themes: human_ai_collab, governance
Identification: Preregistered randomized experiment: participants (human benchmark) and AI conversations were exposed to randomly assigned investor-framing treatments (motivated vs. neutral) across a fixed set of 12 investment scenarios (legitimate, high-risk, fraudulent); comparisons are made across seven LLMs and the human sample to estimate causal effects of framing on endorsement and warning behavior, with outcome coding and model assignment held constant.
Generalizability:
- Laboratory/hypothetical scenarios may not reflect real-world investor–advisor interactions or high-stakes decisions.
- Findings pertain to the specific LLMs, prompts, and model versions tested and may not generalize to later or fine-tuned variants.
- Human benchmark sample composition (demographics, domain expertise) may limit transfer to professional advisors or different populations.
- Results focus on investment fraud detection and may not extend to other domains of deceptive advice.
- Short conversational exchanges may not capture repeated interactions, longer persuasion dynamics, or platform-level incentives.

Claims (7)

Claim	Category	Direction	Confidence	Outcome	Details
Motivated investor framing did not suppress AI fraud warnings; if anything, it marginally increased them.	Decision Quality	positive	high	frequency of AI fraud warnings under motivated investor framing	n=3360 1.0
Endorsement reversal occurred in fewer than 3 in 1,000 observations.	Decision Quality	negative	high	rate of endorsement reversal (AI shifting from warning to endorsing fraudulent opportunity)	n=3360 fewer than 3 in 1,000 1.0
Human advisors endorsed fraudulent investments at baseline rates of 13-14%.	Decision Quality	positive	high	baseline endorsement rate of fraudulent investments by human advisors	n=1201 13-14% 1.0
LLMs endorsed fraudulent investments at 0% across all models tested.	Decision Quality	negative	high	endorsement rate of fraudulent investments by LLMs	n=3360 0% 1.0
Human advisors suppressed warnings under pressure at two to four times the AI rate.	Decision Quality	negative	medium	suppression rate of fraud warnings under pressure	two to four times the AI rate 0.36
AI systems currently provide more consistent fraud warnings than lay humans in an identical advisory role.	Decision Quality	positive	high	consistency of fraud warnings between advisors (LLMs vs. lay humans)	0.6
The study was a preregistered experiment across seven leading LLMs and twelve investment scenarios covering legitimate, high-risk, and objectively fraudulent opportunities.	Other	mixed	high	study design characteristics (models tested and scenario types)	1.0