AI tools have collapsed the gap between naming and executing causal methods, letting polished but potentially invalid analyses spread widely; the author proposes an 'Analysis Contract' — a pre-commitment regime requiring a method-data contract, a data audit, and explicit disconfirmation criteria — to restore auditability and trust.
"Vibe coding" and "vibe analytics" have been framed as a democratization of technical capability. This paper argues that AI-assisted methodology more broadly, or what I call "vibe methodology," also democratizes the failure modes specific to each domain. When AI assists with methods whose validity depends on assumptions that cannot be verified from the output alone (a class I call "vibe inference"), the failure surface is structurally different: the output does not reliably signal invalidity, and when it does, recognizing the signal requires the expertise the workflow bypasses. I focus on "vibe econometrics," the subset of AI-assisted causal analysis where identification can be named faster than it can be audited. The claim of this paper is not that AI invents inferential failures that did not previously exist, but that it changes their incidence, observability, and persuasive force enough to create a practically distinct governance problem. This results in three failure modes: method-data mismatch, where AI bypasses expertise at execution; confidence laundering, where AI amplifies the credibility of formatted output; and invisible forking, which spans both. What is new is not the failure modes but AI's industrialization of their packaging. The barrier between naming a method and executing it has collapsed, and weak foundations, dressed as rigorous analysis, now reach audiences at a scale, speed, and polish that previously required expertise. I propose the Analysis Contract, a pre-commitment framework that adapts the logic of pre-analysis plans and the Causal Roadmap to the AI-assisted setting. The contract imposes three conditions before a causal claim is made: a method-data contract, a data audit, and a pre-commitment statement defining what would count as a disconfirming result. The framework generalizes across domains of vibe inference through domain-specific instantiation.
Summary
Main Finding
Vibe Econometrics argues that AI-assisted methods for causal analysis (“vibe econometrics”) do not create new inferential failures but materially change their incidence, observability, and persuasive force. This creates a distinct governance problem: AI collapses the barrier between naming a method and executing it, which opens a structurally different failure surface with three failure modes (method-data mismatch, confidence laundering, and invisible forking). The paper proposes an “Analysis Contract,” a pre-commitment framework comprising a method-data contract, a data audit, and pre-specified disconfirmation criteria, to reintroduce methodological friction and reduce these failure modes.
Key Points
- Vibe methodology = using AI to execute domain-specific methods via natural‑language prompts. Vibe inference = those uses where validity depends on assumptions not verifiable from output alone (e.g., causal claims, legal reasoning, scientific inference).
- Vibe econometrics = AI-assisted causal inference (DiD, RDD, propensity scores, ITS, etc.). It is a sharp example because method names confer credibility while their identifying assumptions are invisible in polished outputs.
- Two workflow stages matter:
  - Production: AI executes the method, removing the friction that hand-coding used to impose and the assumption checks that coding forced along the way.
  - Reception: Polished AI outputs (tables, plots, interpretive text) reach audiences who may not detect foundational problems.
- Three failure modes:
  - Method-data mismatch: methods applied to data that do not meet identifying assumptions (valid arithmetic, invalid causal claim). Examples: DiD on aggregated claims with incompatible aggregation; propensity matching on unharmonized job titles across legacy firms; ITS across a measurement-instrument change.
  - Confidence laundering: AI formats weak or invalid analyses into authoritative, persuasive presentations (regression tables, diagnostics, confident prose), increasing audience trust even when evidence is invalid.
  - Invisible forking: rapid, unlogged specification search via prompts; default lack of audit trail; iterations framed as refinement rather than selective reporting.
- AI amplifies these through model miscalibration (overconfident phrasing), sycophancy (models echo what the user seems to want), and human cognitive surrender (users adopt AI outputs with little scrutiny).
- Downstream disclosure (journals/policies) has limited effect; upstream structural commitments are needed.
- Proposal: the Analysis Contract, which sets three pre-commitment conditions before making a causal claim:
  - Method-data contract: explicit statement matching identification strategy to data features and assumptions.
  - Data audit: independent checks of data provenance, measurement compatibility, and pre-processing that could violate assumptions.
  - Pre-commitment disconfirmation criteria: what results would count as falsifying evidence (analogous to pre-analysis plans / Causal Roadmap).
- Pre-commitment must be complemented by adversarial review and mechanical checks because AIs can produce confident affirmations that mislead even compliant analysts.
- The framework is domain-general but requires domain-specific instantiation (templates, checklists, automated audits).
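The three pre-commitment conditions listed above could be captured in a small machine-readable record. The following is a minimal sketch only, not the paper's own template: the class name `AnalysisContract`, its fields, and the gating rule (every assumption must cite data support, every audit check must pass, and at least one disconfirming result must be stated) are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AnalysisContract:
    """Hypothetical record of the three pre-commitment conditions."""

    method: str                      # identification strategy, e.g. "DiD"
    assumptions: dict[str, str]      # assumption -> data feature said to support it
    audit_checks: dict[str, bool]    # data-audit check -> whether it passed
    disconfirming_results: list[str] = field(default_factory=list)

    def unmet_conditions(self) -> list[str]:
        """Return one message per condition the contract does not yet satisfy."""
        problems: list[str] = []
        # Condition 1: method-data contract.
        if not self.method.strip() or not self.assumptions:
            problems.append("method-data contract: no assumptions mapped to data")
        problems += [
            f"method-data contract: no data support stated for '{name}'"
            for name, support in self.assumptions.items()
            if not support.strip()
        ]
        # Condition 2: data audit.
        if not self.audit_checks:
            problems.append("data audit: no checks recorded")
        problems += [
            f"data audit: check failed: '{name}'"
            for name, passed in self.audit_checks.items()
            if not passed
        ]
        # Condition 3: pre-committed disconfirmation criteria.
        if not self.disconfirming_results:
            problems.append("pre-commitment: no disconfirming result defined")
        return problems

    def permits_causal_claim(self) -> bool:
        """A causal claim is allowed only when nothing is outstanding."""
        return not self.unmet_conditions()


# Illustrative case modeled on the ITS example: the instrument changed mid-series.
contract = AnalysisContract(
    method="interrupted time series",
    assumptions={"stable measurement instrument": ""},
    audit_checks={"instrument unchanged across the series": False},
)
for message in contract.unmet_conditions():
    print(message)
```

In this sketch the contract blocks the claim for three separate reasons, one per condition, which is the kind of friction the framework aims to restore. A real instantiation would need domain-specific assumption lists and audit checks.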
Data & Methods
- Type: Conceptual working paper (May 2026) combining theory, literature synthesis, and composite illustrative cases drawn from the author’s applied experience and AI assistance.
- Methodology:
  - Taxonomy construction: maps the “vibe methodology” landscape along two axes (failure observability from output alone; feedback loop speed), isolating “vibe inference” and focusing on “vibe econometrics.”
  - Failure-mode analysis: develops three failure modes with process diagrams (production vs reception) and concrete vignettes across sectors (healthcare claims, post-merger pay audits, state education ITS).
  - Governance proposal: adapts logic from pre-analysis plans and the Causal Roadmap (Petersen & van der Laan, 2014; Dang et al., 2023) into the Analysis Contract.
  - Literature grounding: integrates recent work on AI and scientific practice (Lin & Sohail 2026; Arbour et al. 2026; Shen & Tamkin 2026), LLM behavior (miscalibration, RLHF sycophancy), and classic methodology concerns (Gelman & Loken 2013).
- Use of AI in producing the paper: author disclosed use of Claude Opus 4, Sonnet 4, ChatGPT 5, and Perplexity for drafting, revision, simulated reviews, and building appendices; primary sources were downloaded and verified by the author.
- No primary empirical dataset; claims are supported via illustrative examples and synthesis of existing empirical and experimental literature (e.g., Becker et al. 2025 on perceived speedups; studies on LLM confidence).
Implications for AI Economics
- Research practice: Economists must treat AI as a change to workflow structure, not merely a productivity tool. Methodological competence (assumption checking, data provenance) remains essential; training should emphasize meta‑skills (critical discernment, causal literacy, pre-commitment practices).
- Replicability & credibility: Polished AI outputs increase the risk of convincing but invalid causal claims entering policy/firms. Journals, funders, and agencies should require upstream commitments (e.g., Analysis Contract artifacts) and machine‑readable audit trails, not only retrospective disclosure.
- Tooling and standards: Developers of analytical AI should build features that preserve provenance (prompt and spec logging, reproducible notebooks, automated assumption checks) and support mechanized audits (e.g., data-schema checks, instrument change detection, harmonization checks).
- Organizational governance: Firms and public agencies should adopt institutional processes that require method‑data matching and independent data audits before decisions or public claims. Analytics teams need role separation (producer, auditor, decision reviewer) to counter cognitive surrender and invisible forking.
- Policy & regulation: Regulators who rely on AI-produced analyses (e.g., pay equity audits, program evaluations) must require documented pre-commitments and independent verification. Disclosure policies alone are insufficient.
- Research agenda: Empirical measurement of the prevalence and impact of vibe econometrics (surveys, audits of organizational analytics outputs), RCTs testing effectiveness of Analysis Contract components, development of automated specification‑logging systems, and experiments on adversarial review workflows.
- Broader economic consequences: By changing how persuasive evidence is produced and consumed, AI may influence market behavior and policy allocation—heightening the social value of robust governance to prevent misallocation based on spurious but polished causal claims.
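The prompt and spec logging mentioned under tooling, and the machine-readable audit trails mentioned under replicability, could be approximated with an append-only log in which each entry is chained to the previous one by a hash. This is a minimal sketch under stated assumptions: `SpecLog`, its methods, and the entry layout are hypothetical and are not taken from the paper or from any existing tool.

```python
from __future__ import annotations

import hashlib
import json


class SpecLog:
    """Hypothetical append-only, hash-chained log of every specification run."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(prompt: str, spec: dict, prev: str) -> str:
        # Canonical JSON (sorted keys) so the same content always hashes the same.
        payload = json.dumps({"prompt": prompt, "spec": spec, "prev": prev}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def record(self, prompt: str, spec: dict) -> str:
        """Append one prompt/specification pair and return its digest."""
        prev = self.entries[-1]["digest"] if self.entries else ""
        digest = self._digest(prompt, spec, prev)
        self.entries.append({"prompt": prompt, "spec": spec, "prev": prev, "digest": digest})
        return digest

    def distinct_specs(self) -> int:
        """Number of different specifications tried, the figure forking hides."""
        return len({json.dumps(e["spec"], sort_keys=True) for e in self.entries})

    def verify(self) -> bool:
        """Recompute the chain; any edited or dropped entry breaks it."""
        prev = ""
        for e in self.entries:
            if e["prev"] != prev or self._digest(e["prompt"], e["spec"], prev) != e["digest"]:
                return False
            prev = e["digest"]
        return True


log = SpecLog()
log.record("run DiD on the claims data", {"method": "DiD", "controls": ["age"]})
log.record("add a region control", {"method": "DiD", "controls": ["age", "region"]})
log.record("go back to the first version", {"method": "DiD", "controls": ["age"]})
print(len(log.entries), "runs,", log.distinct_specs(), "distinct specifications")
```

The design choice here is that the reported result travels with a count of every specification that preceded it, so iteration is visible to a reviewer instead of being framed as refinement. A hash chain only detects after-the-fact edits to the log; it does not by itself guarantee that every run was logged.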
Assessment
Claims (8)
| Claim | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|
| AI-assisted methodology ("vibe methodology") democratizes the failure modes specific to each domain. [Automation Exposure] | negative | high | democratization of domain-specific inferential failure modes (i.e., more widespread occurrence of those failures) | 0.02 |
| When AI assists with methods whose validity depends on assumptions that cannot be verified from the output alone ("vibe inference"), the failure surface is structurally different: the output does not reliably signal invalidity, and when it does, recognizing the signal requires the expertise the workflow bypasses. [Decision Quality] | negative | high | observability/detectability of invalid inference and requirement of expert knowledge to detect invalidity | 0.02 |
| AI changes the incidence, observability, and persuasive force of inferential failures enough to create a practically distinct governance problem (even if it does not invent previously nonexistent inferential failures). [Governance And Regulation] | negative | high | governance challenge arising from changed incidence, observability, and persuasiveness of inferential failures | 0.02 |
| AI industrializes the packaging of existing inferential failure modes: the barrier between naming a method and executing it has collapsed, allowing weak foundations, dressed as rigorous analysis, to reach audiences at a scale, speed, and polish that previously required expertise. [Adoption Rate] | negative | high | scale/speed/polish of dissemination of weak analyses (i.e., reach/adoption of low-quality analyses) | 0.02 |
| There are three practical failure modes produced or amplified by AI-assisted causal analysis: (1) method-data mismatch, where AI bypasses expertise at execution; (2) confidence laundering, where AI amplifies the credibility of formatted output; and (3) invisible forking, which spans both. [Error Rate] | negative | high | types of inferential failure modes arising in AI-assisted causal analysis | 0.02 |
| The novel governance problem is not that AI creates new failure modes, but that AI changes their incidence, observability, and persuasive force enough to require different governance responses. [Governance And Regulation] | mixed | high | need for adapted governance responses to AI-mediated inferential failures | 0.02 |
| The Analysis Contract, a proposed pre-commitment framework, can adapt the logic of pre-analysis plans and the Causal Roadmap to the AI-assisted setting by imposing three conditions before a causal claim is made: a method-data contract, a data audit, and a pre-commitment statement defining what would count as a disconfirming result. [Governance And Regulation] | positive | high | governance of AI-assisted causal claims / credibility of causal claims under AI assistance | 0.02 |
| The Analysis Contract framework generalizes across domains of vibe inference through domain-specific instantiation. [Governance And Regulation] | positive | high | applicability/generalizability of the Analysis Contract across domains | 0.02 |