Evidence (6574 claims)
Adoption: 8625 claims
Productivity: 7686 claims
Governance: 6917 claims
Human-AI Collaboration: 6574 claims
Org Design: 4189 claims
Innovation: 4131 claims
Labor Markets: 3588 claims
Skills & Training: 2985 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 761 | 200 | 101 | 904 | 2020 |
| Governance & Regulation | 829 | 400 | 191 | 122 | 1566 |
| Organizational Efficiency | 784 | 193 | 125 | 84 | 1197 |
| Technology Adoption Rate | 637 | 236 | 124 | 97 | 1103 |
| Research Productivity | 431 | 131 | 58 | 340 | 972 |
| Output Quality | 481 | 183 | 59 | 47 | 770 |
| Decision Quality | 332 | 177 | 82 | 49 | 647 |
| Firm Productivity | 439 | 57 | 88 | 20 | 610 |
| AI Safety & Ethics | 218 | 279 | 66 | 33 | 602 |
| Market Structure | 181 | 170 | 123 | 24 | 503 |
| Task Allocation | 214 | 64 | 72 | 33 | 388 |
| Skill Acquisition | 174 | 62 | 62 | 17 | 315 |
| Innovation Output | 204 | 27 | 45 | 18 | 295 |
| Employment Level | 105 | 54 | 108 | 13 | 282 |
| Fiscal & Macroeconomic | 132 | 69 | 43 | 26 | 277 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 154 | 48 | 26 | 3 | 231 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 123 | 50 | 6 | 223 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 71 | 92 | 10 | 2 | 175 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 58 | 56 | 26 | 13 | 156 |
| Training Effectiveness | 96 | 21 | 14 | 19 | 152 |
| Wages & Compensation | 77 | 37 | 25 | 6 | 145 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 81 | 21 | 1 | 115 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 32 | 20 | 8 | 3 | 64 |
| Skill Obsolescence | 5 | 47 | 6 | 1 | 59 |
| Social Protection | 28 | 16 | 8 | 2 | 54 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
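The matrix can be read as per-row shares rather than raw counts. The sketch below is a minimal illustration using two rows copied from the table; the function name `direction_shares` and the "unclassified" remainder are assumptions made for this example. In some rows the four direction columns add up to less than the Total column (Firm Productivity: 604 versus 610), and the page does not say why, so the sketch simply reports the gap instead of guessing its cause.

```python
# Two rows copied from the matrix: (positive, negative, mixed, null, total)
rows = {
    "Firm Productivity": (439, 57, 88, 20, 610),
    "Job Displacement": (12, 81, 21, 1, 115),
}


def direction_shares(positive, negative, mixed, null, total):
    """Return each direction's share of the row total, plus the count
    that is not covered by the four direction columns (assumed to be
    unclassified claims; the page does not explain the gap)."""
    counts = {
        "positive": positive,
        "negative": negative,
        "mixed": mixed,
        "null": null,
    }
    shares = {name: count / total for name, count in counts.items()}
    unclassified = total - sum(counts.values())
    return shares, unclassified


for outcome, row in rows.items():
    shares, unclassified = direction_shares(*row)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{outcome}: {summary} (unclassified: {unclassified})")
```

Dividing by the listed Total keeps the shares comparable across rows; dividing by the sum of the four columns instead would hide the remainder.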
Human-AI Collab
The promoting effect of artificial intelligence on new quality productive forces is more pronounced in Jiangsu and Zhejiang provinces.
Heterogeneity tests on the Yangtze River Delta panel data comparing regional subsamples; authors report stronger positive effects in Jiangsu and Zhejiang.
The positive effect of artificial intelligence on firms' new quality productive forces remains robust after addressing endogeneity concerns and conducting robustness checks.
Authors report endogeneity-corrected estimations and multiple robustness checks on the same panel dataset and constructed firm-level indicators; specific endogeneity correction methods and robustness checks are not detailed in the excerpt.
Artificial intelligence significantly promotes the growth of new quality productive forces in new energy vehicle firms.
Panel data analysis of new energy vehicle firms in the Yangtze River Delta from 2001 to 2023; firm-level indicators of artificial intelligence and new quality productive forces constructed; regression estimation showing a significant positive effect.
Proactive, edge-side prompt optimization can substantially reduce inference costs without sacrificing coding quality.
Aggregate experimental results on token reductions and preserved/improved task accuracy reported in the paper.
Compared with LLMLingua-2 at matched compression rates, our method consistently achieves superior OckScore performance across all evaluated backends.
Head-to-head experimental comparison reported in the paper between the proposed middleware and LLMLingua-2 (matched compression rates) measuring OckScore.
Ablation studies indicate that the gains come primarily from the structural rewriting stage rather than simple function-name extraction.
Ablation experiments reported in the paper comparing full rewrite pipeline versus variants (e.g., function-name extraction only).
Prompt compression via the middleware preserves or improves task accuracy on the evaluated benchmark.
Reported task accuracy comparisons on OMH-Polyglot before and after applying middleware across evaluated backends.
The middleware reduces total tokens (prompt + completion) by up to 18.8 percent.
Empirical measurements reported in the paper comparing total token usage (prompt + completion) with and without middleware.
Across three commercial LLM backends, the middleware reduces prompt tokens by 34–47 percent.
Empirical results reported from experiments on OMH-Polyglot across three commercial LLM backends (aggregate token counts before vs. after middleware).
We introduce a pre-flight, edge-side prompt-rewriting middleware that runs locally (using Llama 3.2 (3B)) to perform cross-lingual translation into English, structural rewriting into a compact task-oriented format, and regex-validated rewrite-with-fallback safeguards to ensure the optimized prompt is never larger than the original.
System implementation and design described in the paper (local Llama 3.2 (3B) model, translation, rewriting, and rewrite-with-fallback mechanism).
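The rewrite-with-fallback safeguard described in this claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `TASK:` validation pattern, the whitespace-based `size` measure, and the `rewrite` callable (standing in for the local model) are all hypothetical.

```python
import re
from typing import Callable

# Hypothetical validation rule: a usable rewrite must contain a
# "TASK:" line with some content after it.
TASK_LINE = re.compile(r"^TASK:\s*\S", re.MULTILINE)


def size(text: str) -> int:
    # Whitespace token count, used here as a stand-in for the
    # backend tokenizer (assumption).
    return len(text.split())


def optimize_prompt(original: str, rewrite: Callable[[str], str]) -> str:
    """Return the rewritten prompt only if it validates and is not
    larger than the original; otherwise fall back to the original."""
    try:
        candidate = rewrite(original)
    except Exception:
        return original  # rewriter failed: fall back
    if not TASK_LINE.search(candidate):
        return original  # rewrite lost the expected structure
    if size(candidate) > size(original):
        return original  # rewrite would cost more than it saves
    return candidate


def toy_rewrite(prompt: str) -> str:
    # Toy rewriter for illustration; a real one would call the local model.
    return "TASK: reverse a string\nLANG: python"


verbose = (
    "Hello, could you please help me write a Python function "
    "that reverses a string? Thanks a lot!"
)
print(optimize_prompt(verbose, toy_rewrite))    # compact rewrite is accepted
print(optimize_prompt("fix bug", toy_rewrite))  # rewrite is longer: original kept
```

Because every failure path returns the original prompt, the output can never be larger than the input under the chosen size measure, which is the property the claim describes.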
Addressing these issues entails building dynamic evaluation testbeds involving adaptive counterparties, treating institutions as design primitives, and preserving human agency as a structural feature of the systems we build.
Specific prescriptive recommendations listed by the authors as part of the proposed research paradigm; offered as proposed methods rather than empirically validated interventions in the excerpt.
The paper calls for a non-solipsistic research paradigm that treats interdependence as a core design principle rather than approaching cooperation as a task to solve.
Normative/research-agenda claim made by the authors; stated in the paper as a recommended change in research approach without empirical tests.
Closing this gap requires AI that participates in cooperation: the equilibrium-selection process through which multiple actors navigate their interdependence.
Prescriptive/theoretical recommendation by the authors; framed as necessary to address the earlier-claimed train-test-deploy gap, without empirical demonstration in the excerpt.
AI's central challenge is shifting from capability to coexistence.
Author's conceptual assertion in the paper; no empirical data, sample, or experiment reported.
Together, these measures can properly establish a behavioral‑regulation model for brain‑privacy protection.
Concluding synthesis in the paper arguing that combined measures would yield the proposed regulatory model (normative conclusion without empirical validation).
Implement a 'pre‑market regulatory sandbox + post‑market tracking' regime to manage product risks.
Prescriptive policy design proposed in the paper (conceptual recommendation; no empirical pilot data reported).
Establish a compliance filing‑review mechanism for BCI privacy policies.
Policy recommendation in the paper proposing a procedural compliance mechanism (normative proposal without empirical testing).
Apply the principles of lawfulness, legitimacy, necessity and good‑faith to all brain‑privacy processing.
Policy recommendation formulated in the paper (prescriptive legal proposal; no empirical evaluation included).
A behavioral‑regulation model better reflects the multi‑interest, non‑exclusive nature of brain privacy and balances risk control with innovation.
Normative policy argument and conceptual comparison of regulatory models presented in the paper (theoretical, not empirically tested).
Machines are becoming increasingly competent.
Authorial assertion about the trend in AI capability (no metrics or studies provided in the excerpt).
The concept of co-intelligence describes a new cognitive ecology in which human and artificial minds influence one another to arrive at ways of comprehending, creating, and choosing that neither could achieve alone.
Conceptual claim attributed to Ethan Mollick (2024) and extended by the author; described conceptually rather than demonstrated empirically in the excerpt.
No past technology has spread into so many aspects of human life so fast.
Author's comparative assertion about the speed and breadth of AI diffusion relative to prior technologies (no empirical comparison provided in the excerpt).
Artificial intelligence has become a partner in our everyday activities: it dictates our emails, diagnoses our diseases, educates our young children, controls our budgets, creates our artworks, and influences the policies made by governments and corporations.
Authorial assertion listing domains of current AI use (no empirical study or quantified data provided in the excerpt).
The internet took roughly a decade to reach one billion users; social media did it in half the time.
Comparative historical adoption claim presented by the author (no citation or empirical method given in the excerpt).
Less than a year after their debut, hundreds of millions of people on all seven continents were using large language models, in virtually every professional field and in most languages.
Authorial assertion summarizing global LLM adoption (no specific study, dataset, or methodology provided in the excerpt).
ChatGPT reached a hundred million users in two months.
Authorial assertion in the text citing a user-count milestone for ChatGPT (no study or data source provided in the excerpt).
This provocation introduces fiduciary design as a guiding principle and argues that conversational AI trust and accountability could be unified into a single design and legal paradigm.
Proposal/argument presented in the paper (conceptual design + legal framing); no empirical evaluation or implementation data provided in the excerpt.
When a client hires a personal lawyer, undergoes surgery, or receives advice from an investment manager, the expert they consult often has a fiduciary duty to act in their client's best interests; conversational agents should be held to a similar standard.
Analogy to existing professional fiduciary duties used as the core normative argument in the paper; no empirical testing of legal applicability reported in the excerpt.
Conversational AI agents, designed to feel and interact anthropomorphically with human users, must be held to a standard of care commensurate with their capabilities and access.
Normative assertion/proposal laid out in the paper (argumentative reasoning); no empirical test or legal analysis with sample size provided in the excerpt.
Conversational agents are increasingly integrated into the most private and intimate aspects of users' lives, from discussions of mental health to financial decisions.
Asserted as descriptive background in the paper (position/argumentative claim); examples provided (mental health, financial decisions); no empirical study or sample size reported in the excerpt.
The scientific results converged in both runs.
Paper statement reporting that the scientific results from both agents converged across the two experimental runs (descriptive outcome of the runs).
Category leaders are persona-resistant (~80% same-brand consistency across personas).
Measured same-brand consistency across personas in audit; reported approximate consistency level for category-leading brands.
Clustered 95% CIs exclude zero on all three measured cells (the sonnet cell's CI rests on only 4 prompt clusters and is correspondingly wider).
Reported clustered 95% confidence intervals for the three measured model/prompt cells; note about sonnet cell having only 4 prompt clusters (hence wider CI).
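The excerpt does not say how the clustered intervals were computed. One common way to respect clustering is a cluster bootstrap, which resamples whole prompt clusters rather than individual observations; the sketch below uses that swapped-in technique purely as an illustration. The data, cluster names, and function name are hypothetical.

```python
import random
from statistics import mean


def cluster_bootstrap_ci(clusters, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the overall mean, obtained by
    resampling whole clusters with replacement.

    `clusters` maps a cluster id to the list of observations in it.
    """
    rng = random.Random(seed)
    keys = list(clusters)
    estimates = []
    for _ in range(n_boot):
        # Draw as many clusters as there are, with replacement.
        sample = rng.choices(keys, k=len(keys))
        values = [v for key in sample for v in clusters[key]]
        estimates.append(mean(values))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical effect measurements grouped into 4 prompt clusters.
effects = {
    "prompt_a": [0.12, 0.15, 0.10],
    "prompt_b": [0.30, 0.28],
    "prompt_c": [0.05, 0.07, 0.06, 0.08],
    "prompt_d": [0.20, 0.22],
}
lower, upper = cluster_bootstrap_ci(effects)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

With only a handful of clusters there are few distinct resamples, so intervals of this kind tend to be wide and unstable, which is consistent with the caveat attached to the sonnet cell.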
Using the three metrics (data product adoption, time-to-find, time-to-insight) ties platform success to measurable business value rather than internal activity.
Argument in the paper about metric selection and their role in assessing platform success (methodological rationale).
A staged framework that shifts ownership from hub to spokes avoids both centralized bottlenecks and uncoordinated decentralization.
Organizational/process recommendation presented in the paper as a way to manage decentralization (design rationale).
Natural-language conversational interfaces democratize access for business users and expose historically underutilized enterprise data.
Proposed UX/interaction benefit asserted in the paper (design claim; no empirical measurement reported in the excerpt).
Large language models (LLMs) that automate governance tasks also lower the barrier for domain practitioners to develop genuine cross-functional expertise spanning business and data engineering, enabling spoke teams to take on greater end-to-end ownership without proportionally increasing their dependence on the hub.
Argument in the paper linking AI/LLM capabilities to skill enablement and reduced hub dependence (conceptual claim; no empirical results in the excerpt).
Domain spokes own business semantics, product backlogs, and local iteration cadence, progressively assuming greater responsibility as they mature (shifting operational ownership outward over time).
Architectural/organizational design element described in the paper (procedural proposal for staged ownership transfer).
A central hub (Center of Excellence) can provide shared platform services, policy automation, and AI-enabled governance that automatically standardizes data products, generates quality rules, drafts data contracts, and reviews changes for regressions.
Functional capabilities described in the proposed architecture; presented as what the hub component will provide (design/specification).
An AI-augmented hub-and-spoke model layered on a modern lakehouse architecture can relax the flexibility-versus-control trade-off inherent in enterprise data platforms.
Proposed architectural solution and theoretical argument in the paper (design proposal; no reported experimental/field results provided in the text excerpt).
Affordance actualization (i.e., the realization of GenAI affordances) can shift strategic choices between replacement and retainment of target systems.
Theoretical contribution supported by empirical illustration from two consecutive acquisitions of the same target in the authors' case study (qualitative evidence).
GenAI reconfigures perceived knowledge challenges, alters integration logics, and expands feasible paths for value capture in M&A IS integration decisions.
Synthesis claim based on the paper's two-case comparative study and theoretical framing using the knowledge-based view and technology affordance lens (qualitative, interpretive evidence).
GenAI affordances reduced prior assumptions about system intransparency, personnel dependence, and conversion costs during IS integration.
Authors' analysis of the comparative case evidence showing changed perceptions and lowered barriers in the second acquisition after GenAI affordance discovery (qualitative evidence).
LLM-supported affordances, such as learning system knowledge through chat, increased knowledge transferability, knowledge aggregation, and efficiency.
Observed and interpreted affordance actualization in the second acquisition within the paper's qualitative case study (authors report that LLM/chat features enabled these improvements).
In the second acquisition the acquirer adopted a 'retain-and-revive' approach for the same target, enabled by newly discovered GenAI affordances.
Empirical observation from the paper's comparative case study of two consecutive acquisitions of the same digital target (qualitative case evidence showing contrasting integration choices across the two acquisitions).
These findings suggest a dynamically adaptive LLM-teacher collaboration as student proficiency increases.
Interpretive/recommendation claim in the abstract: authors conclude that collaboration should adapt dynamically with student proficiency based on observed efficacy and ceiling effects.
Both LLM and teacher are critical for student skill improvement.
Abstract statement reporting that both LLM and teacher contributions were important for skill improvement; supported by empirical analysis on the reported dataset (57,954 essays).
Teachers act as pedagogical gatekeepers and bridges to guarantee feedback quality.
Stated in the abstract that within the triadic system teachers ensure feedback quality, implying a complementary role confirmed by the authors' empirical analysis or system design.
The triadic collaboration system is efficacious in improving writing quality.
Empirical claim in the abstract supported by analysis of the large dataset (57,954 essays from 10,195 students across 120 schools over two years). The paper states findings confirm the system's efficacy in improving writing quality.
We introduce a multidimensional evaluation framework grounded in Systemic Functional Linguistics and the suggestion trajectory tracing pipeline.
Methodological contribution explicitly reported in the abstract: a new evaluation framework combining SFL and a suggestion trajectory tracing pipeline.