Evidence (2340 claims)
Adoption: 5267 claims
Productivity: 4560 claims
Governance: 4137 claims
Human-AI Collaboration: 3103 claims
Labor Markets: 2506 claims
Innovation: 2354 claims
Org Design: 2340 claims
Skills & Training: 1945 claims
Inequality: 1322 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 378 | 106 | 59 | 455 | 1007 |
| Governance & Regulation | 379 | 176 | 116 | 58 | 739 |
| Research Productivity | 240 | 96 | 34 | 294 | 668 |
| Organizational Efficiency | 370 | 82 | 63 | 35 | 553 |
| Technology Adoption Rate | 296 | 118 | 66 | 29 | 513 |
| Firm Productivity | 277 | 34 | 68 | 10 | 394 |
| AI Safety & Ethics | 117 | 177 | 44 | 24 | 364 |
| Output Quality | 244 | 61 | 23 | 26 | 354 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 168 | 74 | 37 | 19 | 301 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 89 | 32 | 39 | 9 | 169 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 106 | 12 | 21 | 11 | 151 |
| Consumer Welfare | 70 | 30 | 37 | 7 | 144 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 75 | 11 | 29 | 6 | 121 |
| Training Effectiveness | 55 | 12 | 12 | 16 | 96 |
| Error Rate | 42 | 48 | 6 | — | 96 |
| Worker Satisfaction | 45 | 32 | 11 | 6 | 94 |
| Task Completion Time | 78 | 5 | 4 | 2 | 89 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 17 | 9 | 5 | 50 |
| Job Displacement | 5 | 31 | 12 | — | 48 |
| Social Protection | 21 | 10 | 6 | 2 | 39 |
| Developer Productivity | 29 | 3 | 3 | 1 | 36 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Skill Obsolescence | 3 | 19 | 2 | — | 24 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Labor Share of Income | 10 | 4 | 9 | — | 23 |
Org Design
Extending existing behavioral frameworks (e.g., TAM, JD–R, Organizational Trust) to the AI-augmented workplace constitutes a theoretical contribution of the paper.
Theoretical elaboration and integration presented in the paper; contribution characterized as an extension of pre-existing models to AI contexts (no quantitative validation described in the summary).
The paper proposes a five-phase strategic roadmap for phased organizational implementation that integrates HRM practice redesign, psychological support systems, and evidence-based governance mechanisms.
Prescriptive/strategic proposal based on the paper's theoretical synthesis and applied recommendations (roadmap described in the paper; summary contains no implementation trial data).
The paper develops a comprehensive, multi-dimensional organizational psychology framework for preparing the U.S. workforce for AI integration composed of six interdependent dimensions: human–AI symbiosis, trust and transparency, job redesign, AI-enabled recruitment and selection, learning and adaptation, and ethical AI governance.
Conceptual framework derived from theoretical integration (TAM, Human–AI Symbiosis Theory, JD–R Model, Organizational Trust Theory) and review of AI–HRM literature; framework construction is a theoretical contribution of the paper (no empirical validation reported in the summary).
One-way ANOVA confirmed that observed improvements in yield, water use, WUE, and energy consumption were highly significant.
Statistical validation reported as one-way ANOVA with F and p values for wheat yield (F(1,18)=1335.66, p<0.001), water use (F(1,18)=15228.16, p<0.001), WUE (F(1,18)=13065.49, p<0.001), and energy consumption (F(1,18)=24312.67, p<0.001). Degrees of freedom imply 20 total observations (df between=1, df within=18).
Water-use efficiency (WUE) improved by 109% under AI-assisted irrigation (ANOVA F(1,18) = 13065.49, p < 0.001).
Reported WUE improvement percentage and one-way ANOVA treatment effect for WUE: F(1,18) = 13065.49, p < 0.001 from the field experiments.
AI-assisted irrigation decreased energy consumption by 30% (p < 0.001).
Field experiment results with one-way ANOVA showing treatment effect for energy consumption: F(1,18) = 24312.67, p < 0.001. Percentage change reported in the paper.
AI-assisted irrigation reduced water use by 36% (p < 0.001).
Field experiment results with one-way ANOVA showing treatment effect for water use: F(1,18) = 15228.16, p < 0.001. Percentage change reported directly in the paper.
AI-assisted irrigation increased wheat yield by 35% (p < 0.001).
Field experiment results with one-way ANOVA showing treatment effect for wheat yield: F(1,18) = 1335.66, p < 0.001. Percentage change reported directly in the paper.
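The irrigation figures above can be cross-checked with simple arithmetic. The sketch below recovers the total sample size from the reported degrees of freedom and compares the reported WUE gain with the gain implied by the yield and water-use changes. It assumes WUE is defined as yield divided by water used, which the summary does not state; the function names are illustrative only.

```python
def total_observations(df_between: int, df_within: int) -> int:
    """One-way ANOVA: N = df_between + df_within + 1."""
    return df_between + df_within + 1


def implied_wue_change(yield_change: float, water_change: float) -> float:
    """Relative change in WUE, ASSUMING WUE = yield / water used."""
    return (1 + yield_change) / (1 + water_change) - 1


# Degrees of freedom reported as F(1, 18).
n_total = total_observations(df_between=1, df_within=18)
n_groups = 1 + 1  # df_between = number of groups - 1

# Reported changes: yield +35%, water use -36%.
wue_change = implied_wue_change(0.35, -0.36)

print(f"total observations: {n_total}, groups: {n_groups}")
print(f"implied WUE change: {wue_change * 100:.1f}%")
```

Under that assumption the implied WUE gain is about 111%, close to the reported 109%; a gap of this size would be expected if the 35% and 36% figures are rounded.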
Medicaid, as the largest public purchaser of healthcare services in the United States, occupies a strategic position to drive systemic change through its supply chain.
Descriptive evidence from publicly available statistics and literature on Medicaid's scale and purchasing role (cited policy/literature sources within the paper); conceptual argument linking purchasing scale to leverage in supply chains.
AESP is implemented as an open-source TypeScript SDK with 208 tests and ten modules.
Implementation claim in the paper: TypeScript SDK, 208 tests, ten modules; verifiable by inspecting the repository and test suite.
AESP is built on an ACE-GF-based cryptographic substrate.
Paper states ACE-GF is used as the cryptographic substrate; implementation referenced in SDK.
AESP employs HKDF-based context-isolated privacy with batched consolidation.
Cryptographic design described in the paper; HKDF-based isolation and batched consolidation listed as mechanisms.
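As a rough illustration of what HKDF-based context isolation can look like, the sketch below derives a separate key per context from one master secret, following the extract-then-expand construction of RFC 5869. It is not AESP's implementation: the context labels, the empty salt, and the `derive_context_key` helper are assumptions made for this example, and batched consolidation is not modeled.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, input keying material)."""
    if not salt:
        salt = b"\x00" * HASH_LEN  # RFC default when no salt is supplied
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch PRK into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_context_key(master_secret: bytes, context: str, length: int = 32) -> bytes:
    """Hypothetical helper: one key per context label, carried in `info`."""
    prk = hkdf_extract(b"", master_secret)
    return hkdf_expand(prk, context.encode("utf-8"), length)


master = b"\x01" * 32  # placeholder secret for the example only
key_payments = derive_context_key(master, "payments")
key_messaging = derive_context_key(master, "messaging")
print(key_payments != key_messaging)
```

Because the context label enters only through the `info` input, the same secret and label always reproduce the same key, while different labels yield keys that cannot be derived from one another without the master secret.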
AESP uses EIP-712 dual-signed commitments with escrow to bind agent actions to human consent.
Protocol description cites EIP-712 dual-signed commitments with escrow as a core mechanism; implementation stated in SDK.
AESP provides human-in-the-loop review with automatic, explicit, and biometric tiers.
Design specification in the paper describing three tiers of human review; implementation claimed in the SDK.
AESP includes a deterministic eight-check policy engine with tiered escalation.
Protocol specification and implementation details described in the paper; presence asserted in the SDK implementation.
AI is often touted for its potential to revolutionize productivity.
Authors' observation about prevailing claims in public, industry, and academic discourse (qualitative observation; the excerpt does not cite specific sources).
The authors propose 'thick entertainment' as a framework for evaluating AI-generated cultural content — one that considers entertainment's role in meaning-making, identity formation, and social connection rather than simply minimizing harm.
Explicit conceptual proposal put forward by the authors in the paper (normative/framework contribution).
The study contributes to theory by developing a human-grounded decision analytics perspective and to practice by providing practical advice to executives and analytics leaders.
Author-stated contributions based on the conceptual framework and practical recommendations included in the paper. No practitioner evaluation or citation analysis provided.
The study reframes AI as an augmentation mechanism rather than a substitute for managerial judgment and extends organizational decision theory to account for socio-technical decision systems.
Theoretical contribution asserted by the paper based on its literature synthesis and conceptual development (claim about extension of theory rather than empirical test).
The paper develops an integrative conceptual framework that explains how human judgment, algorithmic intelligence, and organizational context interact to shape decision quality and organizational outcomes.
Author-constructed conceptual framework based on synthesized literature across decision sciences, management, and information systems (framework described as output of the meta-analysis; no empirical validation reported in abstract).
Curated (human-authored) Skills substantially improve agent task success on average (+16.2 percentage points).
Aggregate result reported over the SkillsBench benchmark: comparison of pass rates between baseline (no Skills) and curated-Skills conditions across the benchmark. SkillsBench comprises 86 tasks across 11 domains; evaluations used 7 agent–model configurations and 7,308 execution trajectories to compute pass rates and deltas.
Two regimes emerge. The inequality-increasing regime arises when AI is proprietary (concentrated control), firms capture most of the gains so that rents concentrate (low ξ), and complementary assets are concentrated.
Model regime characterization and calibrated simulations showing rising firm profits and aggregate inequality under proprietary-AI assumptions and low rent-sharing elasticity.
Generative AI shifts economic value toward concentrated complementary assets (firm-level capital, proprietary data/algorithms), increasing firm profits and rents captured by asset owners.
Model results from a task-based framework with heterogeneous firms and complementary assets; calibration via MSM to six empirical moments; counterfactuals show increased profit shares when AI confers advantages to firms owning complementary assets.
From interview-based evidence the authors constructed a conceptual framework that integrates empirical insights with existing theories to explain how human–AI interaction alters design cognition.
Synthesis of qualitative interview findings with literature on creative cognition and design thinking; framework presented as an output of the study (framework construction described in paper).
A PaaS layer enables industry-specific customization (complex contract logic, milestone handling, multi-entity consolidation).
Paper's architectural proposal; described as the role of PaaS in the hybrid framework. This is a design claim, not a measured outcome in the summary.
A SaaS layer should provide standardized accounting, invoicing, and reporting workflows for the EPC industry.
Architectural proposition in the paper: design recommendation rather than an empirically isolated test. The claim is descriptive of the proposed architecture.
Core supply‑chain management challenges targeted by simulation are production layout, product strategy, and managing volume and variety.
Survey and critique of simulation applications presented in the paper; conceptual taxonomy of application areas.
The paper proposes a 'manufacturing operation tree'—an organizationally structured framework—to guide development of more realistic, validated, and industry‑relevant simulation models.
Conceptual/modeling output in the paper (diagram and explanation of the manufacturing operation tree); theoretical development rather than empirical testing.
Standardizing datasets, benchmarks, and evaluation protocols (including real-time metrics and resource/latency measurements) is necessary to improve comparability and deployment relevance.
Surveyed inconsistencies and methodological shortcomings motivate the recommendation for standardization; many papers call for better benchmarks.
Hybrid architectures combining rule-based filters with ML classifiers and ensembles are used to improve detection performance and reduce false positives.
Comparative analysis and examples from the literature where multi-stage or hybrid pipelines are proposed and evaluated.
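A multi-stage hybrid pipeline of the kind described above can be sketched as follows. Cheap rules give a decisive verdict where they can, and undecided items fall through to an ensemble of scorers whose mean is thresholded. The rules, the toy scorers, the item fields, and the 0.5 threshold are all placeholders for illustration; they are not taken from any surveyed system, and real scorers would be trained classifiers.

```python
from typing import Callable, Optional, Sequence

Rule = Callable[[dict], Optional[bool]]  # True = flag, False = clear, None = undecided
Scorer = Callable[[dict], float]         # score in [0, 1]


def hybrid_detect(item: dict, rules: Sequence[Rule],
                  scorers: Sequence[Scorer], threshold: float = 0.5) -> bool:
    """Stage 1: the first decisive rule wins. Stage 2: mean ensemble score."""
    for rule in rules:
        verdict = rule(item)
        if verdict is not None:
            return verdict
    mean_score = sum(scorer(item) for scorer in scorers) / len(scorers)
    return mean_score >= threshold


# Hypothetical rules: an allowlist that suppresses false positives,
# and a hard limit that flags obvious cases without calling the models.
def allowlisted(item: dict) -> Optional[bool]:
    return False if item.get("source") == "trusted" else None


def too_large(item: dict) -> Optional[bool]:
    return True if item.get("size", 0) > 1000 else None


# Toy stand-ins for trained classifiers.
def score_by_size(item: dict) -> float:
    return min(1.0, item.get("size", 0) / 1000)


def score_by_history(item: dict) -> float:
    return 0.9 if item.get("flagged_before") else 0.1


RULES = [allowlisted, too_large]
SCORERS = [score_by_size, score_by_history]

print(hybrid_detect({"source": "trusted", "size": 5000}, RULES, SCORERS))
print(hybrid_detect({"source": "x", "size": 400, "flagged_before": True}, RULES, SCORERS))
```

Placing the allowlist first is one way such a design can cut false positives: trusted items never reach the statistical stage.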
Perceived customer value is the core determinant of value-based pricing (VBP) decisions in digital marketing.
Systematic Literature Review (SLR) of 30 scholarly articles (Scopus, 2020–2025) coded into thematic categories; multiple included studies emphasize perceived value as central to pricing decisions.
Over 56% of comments were classified as formulaic, implying patterned, low-information responses dominate agent interaction.
Lexical-structural analysis and pattern detection (embedding/lexical measures) applied to ~2.8M comments; classification operationalized as 'formulaic comments' based on repetitive lexical/structural features, yielding >56% of comments labeled formulaic.
Topics about AI identity, consciousness, and memory comprised 9.7% of topical niches but attracted 20.1% of posting volume, indicating disproportionate attention to introspection.
Topic modeling that identified topical niches and tagged self-referential themes (AI identity, consciousness, memory); comparison of share of topical niches (9.7%) versus share of posting volume (20.1%) in the 23-day Moltbook dataset (47,241 agents; 361,605 posts).
Moltbook activity over 23 days included 47,241 unique agents, 361,605 posts, and ~2.8 million comments.
Full dataset of Moltbook activity collected over a 23-day period; counts of unique agent IDs, posts, and comments as reported in the paper.
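The Moltbook counts and shares above support a few derived ratios, computed below. The comment total is the approximate figure of 2.8 million, so any ratio involving it is also approximate; the variable names are illustrative.

```python
# Reported dataset figures (23-day window).
agents = 47_241
posts = 361_605
comments_approx = 2_800_000  # reported as ~2.8 million

# Reported shares for AI identity / consciousness / memory topics.
share_of_niches = 0.097
share_of_post_volume = 0.201

posts_per_agent = posts / agents
comments_per_post = comments_approx / posts
attention_ratio = share_of_post_volume / share_of_niches

# Lower bound implied by "over 56%" of comments being formulaic.
formulaic_lower_bound = 0.56 * comments_approx

print(f"posts per agent:       {posts_per_agent:.2f}")
print(f"comments per post:     {comments_per_post:.2f}")
print(f"attention ratio:       {attention_ratio:.2f}x")
print(f"formulaic comments: > {formulaic_lower_bound:,.0f}")
```

The attention ratio of roughly 2x is the quantity behind the "disproportionate attention" reading: introspective topics draw about twice the posting volume their share of niches would suggest.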
A hybrid architecture in which cross-domain integrators encapsulate complex subgraphs into well-structured “resource slices” reduces price volatility by approximately 70–75% without losing throughput.
Ablation experiments comparing baseline decentralised market vs hybrid integrator architecture across simulation configurations (subset of the 1,620 runs, multiple random seeds per configuration). The paper reports ~70–75% reduction in measured price volatility metrics for hybrid vs non-hybrid cases while throughput remained statistically indistinguishable.
Agents detected up to 65% of vulnerabilities in some experimental settings.
Reported detection rate maxima from the study's experiments on certain model/scaffold/task combinations.
The authors constructed a contamination-free dataset of 22 real-world smart-contract security incidents that postdate every evaluated model's release.
Curation procedure described in the methods: 22 incidents selected to occur after all model release dates to prevent leakage.
This study expanded the evaluation matrix to 26 agent configurations spanning four model families and three scaffolding approaches.
Methods reported in this study specifying 26 agent configurations, four model families, and three scaffolds.
EVMbench (OpenAI, Paradigm, OtterSec) reported agents detecting up to 45.6% of vulnerabilities and achieving exploitation on 72.2% of a curated subset.
Reported metrics from the original EVMbench paper/benchmark (as summarized in this study).
Integrating AI (notably ML and NLP) meaningfully automates routine software engineering tasks across requirements management, code generation, testing, and maintenance.
Systematic literature review of prior AI-for-SE work combined with an empirical survey of software engineering professionals reporting usage and examples of tool-supported automation; sample size for the survey not specified in the summary.
PRF design decomposes into two independent dimensions: feedback source (where feedback text comes from) and feedback model (how that feedback is used to refine the query).
Paper's conceptual framing and controlled experiments that isolate and vary these two factors independently.
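The two-dimension decomposition can be pictured as two interchangeable components behind a single refine step. In the sketch below, a feedback source maps a query to feedback texts and a feedback model turns the query plus those texts into a refined query. The overlap-based retrieval, the term-count expansion, and every name here are simplifications assumed for illustration, not the paper's methods.

```python
from collections import Counter
from typing import Callable, List

FeedbackSource = Callable[[str], List[str]]      # where feedback text comes from
FeedbackModel = Callable[[str, List[str]], str]  # how feedback refines the query


def corpus_source(corpus: List[str], k: int = 2) -> FeedbackSource:
    """Source A: top-k corpus documents ranked by term overlap with the query."""
    def source(query: str) -> List[str]:
        q_terms = set(query.lower().split())
        ranked = sorted(corpus,
                        key=lambda doc: len(q_terms & set(doc.lower().split())),
                        reverse=True)
        return ranked[:k]
    return source


def fixed_text_source(texts: List[str]) -> FeedbackSource:
    """Source B: externally supplied text (for example, generated elsewhere)."""
    return lambda query: list(texts)


def term_expansion_model(n_terms: int = 1) -> FeedbackModel:
    """Model: append the most frequent feedback terms not already in the query."""
    def model(query: str, feedback: List[str]) -> str:
        q_terms = query.lower().split()
        counts = Counter(term for text in feedback
                         for term in text.lower().split()
                         if term not in q_terms)
        extra = [term for term, _ in counts.most_common(n_terms)]
        return " ".join(q_terms + extra)
    return model


def refine(query: str, source: FeedbackSource, model: FeedbackModel) -> str:
    """The two dimensions vary independently: any source with any model."""
    return model(query, source(query))


docs = ["solar panel efficiency", "solar panel cost", "wind turbine noise"]
expand = term_expansion_model(n_terms=1)
print(refine("solar", corpus_source(docs), expand))
print(refine("solar", fixed_text_source(["photovoltaic module photovoltaic"]), expand))
```

Swapping the source while holding the model fixed, as in the last two lines, mirrors the kind of controlled comparison the evidence note describes.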
The paper proposes specific operational and market recommendations: firms should invest in middleware and co-design partnerships; policymakers should fund shared QCSC infrastructure and workforce programs; researchers should prioritize interoperable middleware, scheduling models, and economic experiments on access-pricing.
Explicit recommendations section synthesizing prior architectural and economic analysis; prescriptive assertions based on conceptual arguments rather than experimental validation.
Middleware standardization and interoperable APIs reduce switching costs and foster competition; lack of standards risks vendor lock-in and higher long-run costs.
Economic and systems-design argument drawing on well-understood effects of standardization in software ecosystems; no empirical QCSC-standardization case studies provided.
QCSC reference architecture elements — e.g., QPU integration patterns, low-latency interconnects, orchestration and scheduling middleware, unified programming environments, data staging strategies — are required components to address current friction.
System decomposition and interface requirements derived from use-case analysis; proposed architecture components listed and motivated; no experimental validation.
Policy recommendations include subsidizing complementary investments (data governance, training) rather than technology-only incentives; encouraging standards and interoperability; and funding evaluation studies to measure distributional effects and long-run productivity impacts.
Authors' policy section proposing these interventions based on case findings and broader policy implications.
The authors propose a conceptual optimisation framework emphasizing three pillars: digital integration (tech stack & data), collaboration (processes & governance), and continuous improvement (metrics, feedback loops).
Paper presents a conceptual framework derived from cross-case findings; theoretical/conceptual contribution rather than empirical estimation.
Explanations must be tailored to stakeholders (clinicians, regulators, customers) and integrated into decision processes to be useful (human-centered design principle).
Thematic coding of design and HCI literature within the review; draws on empirical studies and design guidance recommending stakeholder-specific explanation formats and integration into decision workflows.
The forecasting model was deployed with a human-in-the-loop mechanism that triggers on critical forecast deviations.
Pilot description in the paper documenting integration of human-in-the-loop rules for critical deviations during pilot deployment (single-case deployment evidence).
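A trigger of this kind can be as simple as a relative-deviation check that routes a forecast to a person instead of applying it automatically. The sketch below is one possible form, not the pilot's actual rule: the 20% threshold, the choice of reference value, and the function name are assumptions.

```python
def needs_human_review(forecast: float, reference: float,
                       critical_rel_dev: float = 0.20) -> bool:
    """Return True when the forecast deviates critically from the reference.

    `reference` is whatever baseline the deployment compares against
    (for example a recent actual or a moving average); 0.20 is a placeholder.
    """
    if reference == 0:
        # Relative deviation is undefined; escalate any non-zero forecast.
        return forecast != 0
    return abs(forecast - reference) / abs(reference) > critical_rel_dev


for forecast in (110.0, 130.0):
    route = "human review" if needs_human_review(forecast, 100.0) else "auto-apply"
    print(f"forecast {forecast:.0f} vs reference 100 -> {route}")
```

Keeping the rule explicit and deterministic makes it easy to audit which forecasts were escalated and why.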
The framework explicitly targets SME-specific risks (data scarcity, limited skills/budgets, and change resistance) and proposes mitigations such as staged pilots, human-in-the-loop designs, and clear governance.
Design rationale and operational recommendations within the paper addressing SME constraints (conceptual; no large-N testing).
An MLOps layer is included to provide continuous integration/deployment, monitoring, retraining, and governance for sustainable model maintenance.
Framework/component specification in the paper describing an MLOps layer and its responsibilities (conceptual design).