Evidence (4175 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Org Design
Using distributed systems as a principled foundation is a useful approach for creating and evaluating LLM teams.
Primary methodological proposal of the paper; supported by conceptual argument and (per the paper) mappings between distributed-systems concepts and LLM team design (specific experimental validation not detailed in the excerpt).
Large language models (LLMs) are growing increasingly capable.
Statement in the paper's introduction/abstract summarizing the field; based on observed progress in LLM development cited by the authors (no experimental sample size provided in the excerpt).
The article discusses managerial and public-policy implications for reducing friction, accelerating responsible adoption, and guiding investment in productivity and inclusion.
Discussion section mentioned in the abstract, addressing managerial burdens and public policy; the abstract contains no empirical evaluation of policies.
The article delivers replicable instruments (the SCF-30 scale, a minimum AI governance checklist, and a 30-60-90-day matrix) for practical use.
Explicit statement in the abstract that replicable instruments are made available; the instruments are presumed to be included in the body of the article.
The authors curated a set of guidelines called the Incentive-Tuning Framework to aid researchers in designing effective incentive schemes for human–AI decision-making studies.
Authors' contribution described in the paper: development of a framework (framework content and evaluation details not provided in excerpt).
The intelligent scheduling model incorporates legal, contractual, skill-based, and preference-aware constraints to generate equitable and efficient rosters.
Methodological description of constraints encoded in the optimization model for scheduling; experimental validation of resulting rosters reported (conflict reduction and fairness metrics), but specific constraint formulations and datasets are not detailed in the excerpt.
The performance evaluation framework combines structured metrics (task completion, attendance, punctuality) with unstructured feedback (patient surveys, peer reviews) analyzed using natural language processing.
Methodological description in the paper of the performance evaluation module and use of NLP for unstructured feedback analysis; implementation details and dataset sizes not specified in the excerpt.
The proposed AI-driven HRM framework integrates forecasting, optimization, and performance evaluation to enhance workforce planning, staff scheduling, and continuous assessment.
Methodological contribution described in the paper: framework design with three core modules (demand forecasting, intelligent scheduling, performance evaluation); validated via experiments on synthetic and real hospital datasets (dataset sizes not specified in the text).
Persistent environmental state induces history sensitivity (dependence of long-run behavior on past trajectories and initial conditions) unless the overall system is globally contracting.
Formal theorem and proof showing that persistence of environmental variables creates non-autonomous/memory-dependent closed-loop behavior, and that only the special case of global contraction removes this history dependence (mathematical analysis of sensitivity to initial conditions).
Under dissipativity assumptions the induced closed-loop system admits a bounded forward-invariant region, guaranteeing viability of the dynamics without requiring global optimality.
A proven structural result (theorem) in the paper: mathematical proof using dissipativity hypotheses on components of the feedback architecture showing existence of a bounded forward-invariant set for the closed-loop dynamics. (The claim is theoretical; no empirical sample size.)
Regional peer effects of DT improve firms' resource allocation (RA), which in turn bolsters enterprise resilience (ER).
Mediation/mechanism analysis on the 2013–2022 Chinese A-share manufacturing panel showing that RA mediates the relationship between regional peer DT and ER.
Industrial peer effects of DT enhance firms' innovation capability (IC), which in turn strengthens enterprise resilience (ER).
Mediation/mechanism analysis on the same 2013–2022 Chinese A-share manufacturing panel showing that IC mediates the relationship between industrial peer DT and ER.
Digital transformation (DT) exhibits significant industrial and regional peer effects.
Empirical analysis using panel data of Chinese manufacturing enterprises listed on the Shanghai and Shenzhen A-share markets from 2013 to 2022; peer-effect regressions conducted within interlocking directorate networks (IDNs).
AI significantly enhances supplier stability in sports enterprises (SE).
Empirical estimation using a double machine learning (DML) model on panel data of 45 Chinese listed sports enterprises (2012–2023); authors report a statistically significant positive effect of AI on supplier stability.
Extending existing behavioral frameworks (e.g., TAM, JD–R, Organizational Trust) to the AI-augmented workplace constitutes a theoretical contribution of the paper.
Theoretical elaboration and integration presented in the paper; contribution characterized as an extension of pre-existing models to AI contexts (no quantitative validation described in the summary).
The paper proposes a five-phase strategic roadmap for phased organizational implementation that integrates HRM practice redesign, psychological support systems, and evidence-based governance mechanisms.
Prescriptive/strategic proposal based on the paper's theoretical synthesis and applied recommendations (roadmap described in the paper; summary contains no implementation trial data).
The paper develops a comprehensive, multi-dimensional organizational psychology framework for preparing the U.S. workforce for AI integration composed of six interdependent dimensions: human–AI symbiosis, trust and transparency, job redesign, AI-enabled recruitment and selection, learning and adaptation, and ethical AI governance.
Conceptual framework derived from theoretical integration (TAM, Human–AI Symbiosis Theory, JD–R Model, Organizational Trust Theory) and review of AI–HRM literature; framework construction is a theoretical contribution of the paper (no empirical validation reported in the summary).
One-way ANOVA confirmed that observed improvements in yield, water use, WUE, and energy consumption were highly significant.
Statistical validation reported as one-way ANOVA with F and p values for wheat yield (F(1,18)=1335.66, p<0.001), water use (F(1,18)=15228.16, p<0.001), WUE (F(1,18)=13065.49, p<0.001), and energy consumption (F(1,18)=24312.67, p<0.001). Degrees of freedom imply 20 total observations (df between=1, df within=18).
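The step from degrees of freedom to sample size can be checked with a short sketch. It relies only on the standard one-way ANOVA identities (between-groups df = k − 1, within-groups df = N − k); the function name is illustrative and does not come from the paper.

```python
def anova_design_from_df(df_between: int, df_within: int) -> tuple[int, int]:
    """Recover (number of groups, total observations) from one-way ANOVA df.

    One-way ANOVA: df_between = k - 1 and df_within = N - k.
    """
    if df_between < 1 or df_within < 1:
        raise ValueError("degrees of freedom must be positive")
    groups = df_between + 1                 # k
    total_observations = df_within + groups  # N = (N - k) + k
    return groups, total_observations


groups, total = anova_design_from_df(1, 18)
print(groups, total)  # 2 20
```

With F(1,18), this gives two treatment groups and 20 observations in total, matching the note above.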
Water-use efficiency (WUE) improved by 109% under AI-assisted irrigation (ANOVA F(1,18) = 13065.49, p < 0.001).
Reported WUE improvement percentage and one-way ANOVA treatment effect for WUE: F(1,18) = 13065.49, p < 0.001 from the field experiments.
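As a rough consistency check, the WUE figure can be compared with the yield and water-use changes listed on this page. The sketch below assumes WUE is defined as yield per unit of water applied, which the excerpt does not state, and it uses the rounded percentages shown here rather than the underlying field data.

```python
def implied_ratio_change(numerator_change: float, denominator_change: float) -> float:
    """Relative change of a ratio, given relative changes of its two parts."""
    return (1.0 + numerator_change) / (1.0 + denominator_change) - 1.0


# Assumption: WUE = yield / water used. Inputs are the rounded page values:
# yield +35%, water use -36%.
implied = implied_ratio_change(0.35, -0.36)
print(f"{implied:.1%}")  # 110.9%
```

Under that assumption the rounded inputs imply roughly a 111% improvement, close to the reported 109%. The small gap may come from rounding of the inputs, but the excerpt does not allow that to be confirmed.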
AI-assisted irrigation decreased energy consumption by 30% (p < 0.001).
Field experiment results with one-way ANOVA showing treatment effect for energy consumption: F(1,18) = 24312.67, p < 0.001. Percentage change reported in the paper.
AI-assisted irrigation reduced water use by 36% (p < 0.001).
Field experiment results with one-way ANOVA showing treatment effect for water use: F(1,18) = 15228.16, p < 0.001. Percentage change reported directly in the paper.
AI-assisted irrigation increased wheat yield by 35% (p < 0.001).
Field experiment results with one-way ANOVA showing treatment effect for wheat yield: F(1,18) = 1335.66, p < 0.001. Percentage change reported directly in the paper.
Medicaid, as the largest public purchaser of healthcare services in the United States, occupies a strategic position to drive systemic change through its supply chain.
Descriptive evidence from publicly available statistics and literature on Medicaid's scale and purchasing role (cited policy/literature sources within the paper); conceptual argument linking purchasing scale to leverage in supply chains.
AESP is implemented as an open-source TypeScript SDK with 208 tests and ten modules.
Implementation claim in the paper: TypeScript SDK, 208 tests, ten modules; verifiable by inspecting the repository and test suite.
AESP is built on an ACE-GF-based cryptographic substrate.
Paper states ACE-GF is used as the cryptographic substrate; implementation referenced in SDK.
AESP employs HKDF-based context-isolated privacy with batched consolidation.
Cryptographic design described in the paper; HKDF-based isolation and batched consolidation listed as mechanisms.
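To illustrate what HKDF-based context isolation generally means, here is a minimal sketch of HKDF (RFC 5869) using only the Python standard library. It is not the AESP SDK's code (which is TypeScript); the secret, salt, and context labels are hypothetical placeholders, and batched consolidation is not modelled.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt defaults to hash_len zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks, binding every block to the context `info`.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


master = b"hypothetical-master-secret"  # placeholder, not a real key
salt = b"hypothetical-salt"             # placeholder
key_a = hkdf_sha256(master, salt, info=b"context:merchant-a")
key_b = hkdf_sha256(master, salt, info=b"context:merchant-b")
print(key_a != key_b)  # True
```

Because the `info` label enters every expansion step, keys derived for different contexts are unrelated to an observer who lacks the master secret, which is the general property that context isolation depends on.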
AESP uses EIP-712 dual-signed commitments with escrow to bind agent actions to human consent.
Protocol description cites EIP-712 dual-signed commitments with escrow as a core mechanism; implementation stated in SDK.
AESP provides human-in-the-loop review with automatic, explicit, and biometric tiers.
Design specification in the paper describing three tiers of human review; implementation claimed in the SDK.
AESP includes a deterministic eight-check policy engine with tiered escalation.
Protocol specification and implementation details described in the paper; presence asserted in the SDK implementation.
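One way to picture a deterministic policy engine with tiered escalation is sketched below. Only the three tier names (automatic, explicit, biometric) come from the claims above. The action fields, the two example checks, and the thresholds are hypothetical; the eight checks in the actual protocol are not listed in the excerpt.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Sequence


class ReviewTier(IntEnum):
    """Review tiers named in the text, ordered from least to most strict."""
    AUTOMATIC = 0
    EXPLICIT = 1
    BIOMETRIC = 2


@dataclass(frozen=True)
class AgentAction:
    """Hypothetical action shape; the real protocol's fields are not given."""
    amount: float
    recipient_allowlisted: bool


# A check maps an action to the minimum review tier it demands.
Check = Callable[[AgentAction], ReviewTier]


def amount_check(action: AgentAction) -> ReviewTier:
    # Hypothetical thresholds, chosen only for illustration.
    if action.amount > 1000:
        return ReviewTier.BIOMETRIC
    if action.amount > 50:
        return ReviewTier.EXPLICIT
    return ReviewTier.AUTOMATIC


def recipient_check(action: AgentAction) -> ReviewTier:
    # Hypothetical rule: unknown recipients need explicit human approval.
    return ReviewTier.AUTOMATIC if action.recipient_allowlisted else ReviewTier.EXPLICIT


CHECKS: Sequence[Check] = (amount_check, recipient_check)


def required_tier(action: AgentAction, checks: Sequence[Check] = CHECKS) -> ReviewTier:
    """Run every check in a fixed order and escalate to the strictest tier."""
    return max((check(action) for check in checks), default=ReviewTier.AUTOMATIC)


print(required_tier(AgentAction(amount=20.0, recipient_allowlisted=True)).name)    # AUTOMATIC
print(required_tier(AgentAction(amount=20.0, recipient_allowlisted=False)).name)   # EXPLICIT
print(required_tier(AgentAction(amount=5000.0, recipient_allowlisted=True)).name)  # BIOMETRIC
```

The checks are pure functions evaluated in a fixed order, so the same action always yields the same tier, which is the sense of "deterministic" assumed in this sketch.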
AI is often touted for its potential to revolutionize productivity.
Authors' observation about prevailing claims in public, industry, and academic discourse (qualitative observation; the excerpt does not cite specific sources).
The authors propose 'thick entertainment' as a framework for evaluating AI-generated cultural content — one that considers entertainment's role in meaning-making, identity formation, and social connection rather than simply minimizing harm.
Explicit conceptual proposal put forward by the authors in the paper (normative/framework contribution).
The study contributes to theory by developing a human-grounded decision analytics perspective and to practice by providing practical advice to executives and analytics leaders.
Author-stated contributions based on the conceptual framework and practical recommendations included in the paper. No practitioner evaluation or citation analysis provided.
The study reframes AI as an augmentation mechanism rather than a substitute for managerial judgment and extends organizational decision theory to account for socio-technical decision systems.
Theoretical contribution asserted by the paper based on its literature synthesis and conceptual development (claim about extension of theory rather than empirical test).
The paper develops an integrative conceptual framework that explains how human judgment, algorithmic intelligence, and organizational context interact to shape decision quality and organizational outcomes.
Author-constructed conceptual framework based on synthesized literature across decision sciences, management, and information systems (framework described as output of the meta-analysis; no empirical validation reported in abstract).
Curated (human-authored) Skills substantially improve agent task success on average (+16.2 percentage points).
Aggregate result reported over the SkillsBench benchmark: comparison of pass rates between baseline (no Skills) and curated-Skills conditions across the benchmark. SkillsBench comprises 86 tasks across 11 domains; evaluations used 7 agent–model configurations and 7,308 execution trajectories to compute pass rates and deltas.
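A percentage-point delta is an absolute difference between two pass rates, not a relative change. The sketch below shows the distinction with made-up counts; they are not SkillsBench results, and the function names are illustrative.

```python
def pass_rate(passed: int, total: int) -> float:
    """Fraction of runs that passed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


def delta_percentage_points(baseline: float, treated: float) -> float:
    """Absolute difference between two rates, in percentage points."""
    return (treated - baseline) * 100.0


def relative_change_percent(baseline: float, treated: float) -> float:
    """Change relative to the baseline rate, in percent."""
    return (treated / baseline - 1.0) * 100.0


# Hypothetical counts, for illustration only.
baseline = pass_rate(30, 100)
curated = pass_rate(46, 100)
print(round(delta_percentage_points(baseline, curated), 1))  # 16.0
print(round(relative_change_percent(baseline, curated), 1))  # 53.3
```

The same 16-point gain corresponds to very different relative improvements depending on the baseline, so a figure such as +16.2 percentage points is best read alongside the baseline pass rate.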
Two regimes emerge; the inequality-increasing regime arises when AI is proprietary (concentrated control), rents concentrate because firms capture most gains (low ξ), and complementary assets are concentrated.
Model regime characterization and calibrated simulations showing rising firm profits and aggregate inequality under proprietary-AI assumptions and low rent-sharing elasticity.
Generative AI shifts economic value toward concentrated complementary assets (firm-level capital, proprietary data/algorithms), increasing firm profits and rents captured by asset owners.
Model results from a task-based framework with heterogeneous firms and complementary assets; calibration via MSM to six empirical moments; counterfactuals show increased profit shares when AI confers advantages to firms owning complementary assets.
From interview-based evidence the authors constructed a conceptual framework that integrates empirical insights with existing theories to explain how human–AI interaction alters design cognition.
Synthesis of qualitative interview findings with literature on creative cognition and design thinking; framework presented as an output of the study (framework construction described in paper).
A PaaS layer enables industry-specific customization (complex contract logic, milestone handling, multi-entity consolidation).
Paper's architectural proposal; described as the role of PaaS in the hybrid framework. This is a design claim, not a measured outcome in the summary.
A SaaS layer should provide standardized accounting, invoicing, and reporting workflows for the EPC industry.
Architectural proposition in the paper: design recommendation rather than an empirically isolated test. The claim is descriptive of the proposed architecture.
Core supply‑chain management challenges targeted by simulation are production layout, product strategy, and managing volume and variety.
Survey and critique of simulation applications presented in the paper; conceptual taxonomy of application areas.
The paper proposes a 'manufacturing operation tree'—an organizationally structured framework—to guide development of more realistic, validated, and industry‑relevant simulation models.
Conceptual/modeling output in the paper (diagram and explanation of the manufacturing operation tree); theoretical development rather than empirical testing.
Standardizing datasets, benchmarks, and evaluation protocols (including real-time metrics and resource/latency measurements) is necessary to improve comparability and deployment relevance.
Surveyed inconsistencies and methodological shortcomings motivate the recommendation for standardization; many papers call for better benchmarks.
Hybrid architectures combining rule-based filters with ML classifiers and ensembles are used to improve detection performance and reduce false positives.
Comparative analysis and examples from the literature where multi-stage or hybrid pipelines are proposed and evaluated.
Perceived customer value is the core determinant of value-based pricing (VBP) decisions in digital marketing.
Systematic Literature Review (SLR) of 30 scholarly articles (Scopus, 2020–2025) coded into thematic categories; multiple included studies emphasize perceived value as central to pricing decisions.
Over 56% of comments were classified as formulaic, implying patterned, low-information responses dominate agent interaction.
Lexical-structural analysis and pattern detection (embedding/lexical measures) applied to ~2.8M comments; classification operationalized as 'formulaic comments' based on repetitive lexical/structural features, yielding >56% of comments labeled formulaic.
Topics about AI identity, consciousness, and memory comprised 9.7% of topical niches but attracted 20.1% of posting volume, indicating disproportionate attention to introspection.
Topic modeling that identified topical niches and tagged self-referential themes (AI identity, consciousness, memory); comparison of share of topical niches (9.7%) versus share of posting volume (20.1%) in the 23-day Moltbook dataset (47,241 agents; 361,605 posts).
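The "disproportionate attention" reading can be made concrete as a ratio of the two shares. This is a small sketch using the percentages quoted above; the function name and the ratio as a summary statistic are illustrative choices, not something the paper is stated to report.

```python
def over_representation(volume_share: float, niche_share: float) -> float:
    """Ratio of posting-volume share to topical-niche share (1.0 = proportional)."""
    if niche_share <= 0:
        raise ValueError("niche_share must be positive")
    return volume_share / niche_share


# Shares quoted in the claim: 20.1% of posting volume, 9.7% of topical niches.
ratio = over_representation(20.1, 9.7)
print(round(ratio, 2))  # 2.07
```

A value near 2 means these topics drew about twice the posting volume that their share of niches would suggest under proportional attention.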
Moltbook activity over 23 days included 47,241 unique agents, 361,605 posts, and ~2.8 million comments.
Full dataset of Moltbook activity collected over a 23-day period; counts of unique agent IDs, posts, and comments as reported in the paper.
A hybrid architecture where cross-domain integrators encapsulate complex subgraphs into well-structured “resource slices” reduces price volatility (approximately 70–75%) without losing throughput.
Ablation experiments comparing baseline decentralised market vs hybrid integrator architecture across simulation configurations (subset of the 1,620 runs, multiple random seeds per configuration). The paper reports ~70–75% reduction in measured price volatility metrics for hybrid vs non-hybrid cases while throughput remained statistically indistinguishable.
Agents detected up to 65% of vulnerabilities in some experimental settings.
Reported detection rate maxima from the study's experiments on certain model/scaffold/task combinations.