Evidence (6574 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	761	200	101	904	2020
Governance & Regulation	829	400	191	122	1566
Organizational Efficiency	784	193	125	84	1197
Technology Adoption Rate	637	236	124	97	1103
Research Productivity	431	131	58	340	972
Output Quality	481	183	59	47	770
Decision Quality	332	177	82	49	647
Firm Productivity	439	57	88	20	610
AI Safety & Ethics	218	279	66	33	602
Market Structure	181	170	123	24	503
Task Allocation	214	64	72	33	388
Skill Acquisition	174	62	62	17	315
Innovation Output	204	27	45	18	295
Employment Level	105	54	108	13	282
Fiscal & Macroeconomic	132	69	43	26	277
Consumer Welfare	117	63	42	11	233
Firm Revenue	154	48	26	3	231
Task Completion Time	173	31	8	12	225
Inequality Measures	44	123	50	6	223
Worker Satisfaction	89	65	22	12	188
Error Rate	71	92	10	2	175
Regulatory Compliance	77	69	14	5	165
Automation Exposure	58	56	26	13	156
Training Effectiveness	96	21	14	19	152
Wages & Compensation	77	37	25	6	145
Team Performance	86	17	27	10	141
Developer Productivity	95	17	14	6	133
Job Displacement	12	81	21	1	115
Hiring & Recruitment	52	7	8	3	70
Creative Output	32	20	8	3	64
Skill Obsolescence	5	47	6	1	59
Social Protection	28	16	8	2	54
Labor Share of Income	17	19	17	—	53
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Filter: Human Ai Collab

The promoting effect of artificial intelligence on new quality productive forces is more pronounced in Jiangsu and Zhejiang provinces.

Heterogeneity tests on the Yangtze River Delta panel data comparing regional subsamples; authors report stronger positive effects in Jiangsu and Zhejiang.

high positive Mechanisms and Effects of Artificial Intelligence on New Qua... new quality productive forces (regional heterogeneity)

The positive effect of artificial intelligence on firms' new quality productive forces remains robust after addressing endogeneity concerns and conducting robustness checks.

Authors report endogeneity-corrected estimations and multiple robustness checks on the same panel dataset and constructed firm-level indicators; specific endogeneity correction methods and robustness checks are not detailed in the excerpt.

high positive Mechanisms and Effects of Artificial Intelligence on New Qua... new quality productive forces

Artificial intelligence significantly promotes the growth of new quality productive forces in new energy vehicle firms.

Panel data analysis of new energy vehicle firms in the Yangtze River Delta from 2001 to 2023; firm-level indicators of artificial intelligence and new quality productive forces constructed; regression estimation showing a significant positive effect.

high positive Mechanisms and Effects of Artificial Intelligence on New Qua... new quality productive forces

Proactive, edge-side prompt optimization can substantially reduce inference costs without sacrificing coding quality.

Aggregate experimental results on token reductions and preserved/improved task accuracy reported in the paper.

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... inference cost (via token usage) and coding quality

Compared with LLMLingua-2 at matched compression rates, our method consistently achieves superior OckScore performance across all evaluated backends.

Head-to-head experimental comparison reported in the paper between the proposed middleware and LLMLingua-2 (matched compression rates) measuring OckScore.

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... OckScore (a task-specific performance metric)

Ablation studies indicate that the gains come primarily from the structural rewriting stage rather than simple function-name extraction.

Ablation experiments reported in the paper comparing full rewrite pipeline versus variants (e.g., function-name extraction only).

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... source of performance/token-reduction gains (rewriting vs. function-name extract...

Prompt compression via the middleware preserves or improves task accuracy on the evaluated benchmark.

Reported task accuracy comparisons on OMH-Polyglot before and after applying middleware across evaluated backends.

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... task accuracy

The middleware reduces total tokens (prompt + completion) by up to 18.8 percent.

Empirical measurements reported in the paper comparing total token usage (prompt + completion) with and without middleware.

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... total token count (prompt + completion)

Across three commercial LLM backends, the middleware reduces prompt tokens by 34–47 percent.

Empirical results reported from experiments on OMH-Polyglot across three commercial LLM backends (aggregate token counts before vs. after middleware).

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... prompt token count
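The two token figures above measure different quantities: the 34–47 percent range covers prompt tokens only, while the 18.8 percent figure covers prompt plus completion. A minimal sketch of how the two can relate, assuming completion length is unchanged by the rewrite (the excerpt does not say this) and using made-up token counts:

```python
def total_reduction(prompt_tokens: int, completion_tokens: int, prompt_reduction: float) -> float:
    """Fractional drop in prompt + completion tokens when only the prompt shrinks."""
    before = prompt_tokens + completion_tokens
    after = prompt_tokens * (1 - prompt_reduction) + completion_tokens
    return (before - after) / before


# Hypothetical counts, not taken from the paper: 1000 prompt and 1500 completion tokens.
print(f"{total_reduction(1000, 1500, 0.40):.1%}")  # prints 16.0%
```

Under that assumption, the total reduction equals the prompt reduction scaled by the prompt's share of all tokens, so it is always smaller than the prompt-only figure.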

We introduce a pre-flight, edge-side prompt-rewriting middleware that runs locally (using Llama 3.2 (3B)) to perform cross-lingual translation into English, structural rewriting into a compact task-oriented format, and regex-validated rewrite-with-fallback safeguards to ensure the optimized prompt is never larger than the original.

System implementation and design described in the paper (local Llama 3.2 (3B) model, translation, rewriting, and rewrite-with-fallback mechanism).

high positive Cross-Lingual Token Arbitrage: Optimizing Code Agent Context... ability to produce an optimized prompt not larger than the original (prompt size...
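The rewrite-with-fallback safeguard described above can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the `TASK:` format, the regex, the whitespace token counter, and the `rewrite` callable (which stands in for the local model call) are all hypothetical.

```python
import re
from typing import Callable

# Hypothetical format check: a valid rewrite must start with "TASK: ".
# The paper's actual validation regex is not given in the excerpt.
TASK_FORMAT = re.compile(r"^TASK: .+", re.DOTALL)


def count_tokens(text: str) -> int:
    """Crude whitespace token count; a stand-in for the backend's tokenizer."""
    return len(text.split())


def optimize_prompt(original: str, rewrite: Callable[[str], str]) -> str:
    """Return the rewrite only if it is well-formed and no larger than the original."""
    try:
        candidate = rewrite(original)
    except Exception:
        return original  # rewriter failed: fall back to the original prompt
    if not TASK_FORMAT.match(candidate):
        return original  # malformed rewrite: fall back
    if count_tokens(candidate) > count_tokens(original):
        return original  # rewrite grew the prompt: fall back
    return candidate


verbose = "Could you please take a look at and fix the failing test in utils.py"
print(optimize_prompt(verbose, lambda p: "TASK: fix failing test in utils.py"))
```

Because every failure path returns `original`, the function's output can never have more tokens than its input under the chosen counter, which is the property the claim describes.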

Addressing these issues entails building dynamic evaluation testbeds involving adaptive counterparties, treating institutions as design primitives, and preserving human agency as a structural feature of the systems we build.

Specific prescriptive recommendations listed by the authors as part of the proposed research paradigm; offered as proposed methods rather than empirically validated interventions in the excerpt.

high positive Solipsistic Superintelligence is Unlikely to be Cooperative recommended design and evaluation practices for AI (dynamic testbeds, institutio...

The paper calls for a non-solipsistic research paradigm that treats interdependence as a core design principle rather than approaching cooperation as a task to solve.

Normative/research-agenda claim made by the authors; stated in the paper as a recommended change in research approach without empirical tests.

high positive Solipsistic Superintelligence is Unlikely to be Cooperative research paradigm orientation (non-solipsistic vs. solipsistic)

Closing this gap requires AI that participates in cooperation: the equilibrium-selection process through which multiple actors navigate their interdependence.

Prescriptive/theoretical recommendation by the authors; framed as necessary to address the earlier-claimed train-test-deploy gap, without empirical demonstration in the excerpt.

high positive Solipsistic Superintelligence is Unlikely to be Cooperative ability of AI to close the train-test-deploy gap via cooperative participation

AI's central challenge is shifting from capability to coexistence.

Author's conceptual assertion in the paper; no empirical data, sample, or experiment reported.

high positive Solipsistic Superintelligence is Unlikely to be Cooperative the primary challenge for AI development (capability vs. coexistence)

Together, these measures can properly establish a behavioral‑regulation model for brain‑privacy protection.

Concluding synthesis in the paper arguing that combined measures would yield the proposed regulatory model (normative conclusion without empirical validation).

high positive Empowerment or behavioral regulation? governing brain–comput... establishment/effectiveness of a behavioral-regulation model for brain-privacy p...

Implement a 'pre‑market regulatory sandbox + post‑market tracking' regime to manage product risks.

Prescriptive policy design proposed in the paper (conceptual recommendation; no empirical pilot data reported).

high positive Empowerment or behavioral regulation? governing brain–comput... effectiveness of combined pre-market sandbox and post-market tracking in managin...

Establish a compliance filing‑review mechanism for BCI privacy policies.

Policy recommendation in the paper proposing a procedural compliance mechanism (normative proposal without empirical testing).

high positive Empowerment or behavioral regulation? governing brain–comput... regulatory oversight mechanism for BCI privacy policies

Apply the principles of lawfulness, legitimacy, necessity and good‑faith to all brain‑privacy processing.

Policy recommendation formulated in the paper (prescriptive legal proposal; no empirical evaluation included).

high positive Empowerment or behavioral regulation? governing brain–comput... legal/principled governance of brain-data processing

A behavioral‑regulation model better reflects the multi‑interest, non‑exclusive nature of brain privacy and balances risk control with innovation.

Normative policy argument and conceptual comparison of regulatory models presented in the paper (theoretical, not empirically tested).

high positive Empowerment or behavioral regulation? governing brain–comput... suitability of behavioral-regulation model for balancing risk control and innova...

The machines are increasingly becoming competent.

Authorial assertion about the trend in AI capability (no metrics or studies provided in the excerpt).

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... AI capability/competence over time

The concept of co-intelligence describes a new cognitive ecology where the human and artificial minds mutually influence one another to come up with ways of comprehending, creating and making choices that neither of them could accomplish individually.

Conceptual claim attributed to Ethan Mollick (2024) and extended by the author — described conceptually rather than demonstrated empirically in the excerpt.

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... emergence of novel joint human-AI outputs/decisions

None of the past technologies have spread into so many aspects of human life, so fast.

Author's comparative assertion about the speed and breadth of AI diffusion relative to prior technologies (no empirical comparison provided in the excerpt).

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... relative speed and breadth of technological diffusion

Artificial intelligence has become a partner in our everyday activities: it dictates our emails, diagnoses our diseases, educates our young children, controls our budgets, creates our artworks, and influences the policies made by governments and corporations.

Authorial assertion listing domains of current AI use (no empirical study or quantified data provided in the excerpt).

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... presence/role of AI across a range of everyday activities (email composition, me...

The internet took roughly a decade to reach one billion users; social media did it in half the time.

Comparative historical adoption claim presented by the author (no citation or empirical method given in the excerpt).

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... time-to-reach one billion users for internet and social media

Less than a year after its debut, hundreds of millions of individuals on all seven continents were using large language models, in virtually every field of professional activity, and in most languages.

Authorial assertion summarizing global LLM adoption (no specific study, dataset, or methodology provided in the excerpt).

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... number and breadth of large language model users across professions and language...

ChatGPT reached a hundred million users within two months.

Authorial assertion in the text citing a user-count milestone for ChatGPT (no study or data source provided in the excerpt).

high positive Co-Intelligence: Human-AI Coexistence in the Age of Thinking... number of ChatGPT users

This provocation introduces fiduciary design as a guiding principle and argues that conversational AI trust and accountability could be unified into a single design and legal paradigm.

Proposal/argument presented in the paper (conceptual design + legal framing); no empirical evaluation or implementation data provided in the excerpt.

high positive Who Does Your AI Work For? Designing Conversational Agents a... feasibility and advisability of unifying trust and accountability via fiduciary ...

When a client hires a personal lawyer, undergoes surgery, or receives advice from an investment manager, the expert they consult often has a fiduciary duty to act in their client's best interests; conversational agents should be held to a similar standard.

Analogy to existing professional fiduciary duties used as the core normative argument in the paper; no empirical testing of legal applicability reported in the excerpt.

high positive Who Does Your AI Work For? Designing Conversational Agents a... applicability of fiduciary duty standard to conversational agents

Conversational AI agents, designed to feel and interact anthropomorphically with human users, must be held to a standard of care commensurate with their capabilities and access.

Normative assertion/proposal laid out in the paper (argumentative reasoning); no empirical test or legal analysis with sample size provided in the excerpt.

high positive Who Does Your AI Work For? Designing Conversational Agents a... requirement to hold conversational agents to a higher standard of care

Conversational agents are increasingly integrated into the most private and intimate aspects of users' lives, from discussions of mental health to financial decisions.

Asserted as descriptive background in the paper (position/argumentative claim); examples provided (mental health, financial decisions); no empirical study or sample size reported in the excerpt.

high positive Who Does Your AI Work For? Designing Conversational Agents a... degree of integration of conversational agents into private/intimate user contex...

The scientific results converged in both runs.

Paper statement reporting that the scientific results from both agents converged across the two experimental runs (descriptive outcome of the runs).

high positive First head-to-head comparison of agentic AI applied to the a... output_quality

Category leaders are persona-resistant (~80% same-brand consistency across personas).

Measured same-brand consistency across personas in audit; reported approximate consistency level for category-leading brands.

high positive Persona Conditioning of Brand Recommendations in Retrieval-A... brand consistency across personas (same-brand %)

Clustered 95% CIs exclude zero on all three measured cells (the sonnet cell's CI rests on only 4 prompt clusters and is correspondingly wider).

Reported clustered 95% confidence intervals for the three measured model/prompt cells; note about sonnet cell having only 4 prompt clusters (hence wider CI).

high positive Persona Conditioning of Brand Recommendations in Retrieval-A... statistical significance of persona effect (confidence intervals)
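The excerpt does not say how the clustered intervals were computed. One common way to obtain a clustered interval is a cluster bootstrap, which resamples whole prompt clusters rather than individual observations. A minimal sketch under that assumption, with invented effect values and hypothetical cluster names:

```python
import random
from statistics import mean


def cluster_bootstrap_ci(clusters, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the mean, resampling whole clusters with replacement."""
    rng = random.Random(seed)
    keys = list(clusters)
    estimates = []
    for _ in range(n_boot):
        # Draw as many clusters as exist, with replacement, and pool their values.
        sample = [rng.choice(keys) for _ in keys]
        values = [v for k in sample for v in clusters[k]]
        estimates.append(mean(values))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented per-prompt effect sizes for a cell with only four prompt clusters.
effects = {"p1": [0.2, 0.3], "p2": [0.25, 0.35], "p3": [0.1, 0.2], "p4": [0.3, 0.4]}
lower, upper = cluster_bootstrap_ci(effects)
print(lower > 0)  # the interval excludes zero when the lower bound is positive
```

With only four clusters there are few distinct resamples, so an interval built this way is coarse, which fits the entry's caveat about the sonnet cell.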

Using the three metrics (data product adoption, time-to-find, time-to-insight) ties platform success to measurable business value rather than internal activity.

Argument in the paper about metric selection and their role in assessing platform success (methodological rationale).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... alignment of platform success metrics with business value

A staged framework that shifts ownership from hub to spokes avoids both centralized bottlenecks and uncoordinated decentralization.

Organizational/process recommendation presented in the paper as a way to manage decentralization (design rationale).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... avoidance of centralized bottlenecks and uncoordinated decentralization (organiz...

Natural-language conversational interfaces democratize access for business users and expose historically underutilized enterprise data.

Proposed UX/interaction benefit asserted in the paper (design claim; no empirical measurement reported in the excerpt).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... data access and usage by business users (adoption of previously underutilized da...

Large language models (LLMs) that automate governance tasks also lower the barrier for domain practitioners to develop genuine cross-functional expertise spanning business and data engineering, enabling spoke teams to take on greater end-to-end ownership without proportionally increasing their dependence on the hub.

Argument in the paper linking AI/LLM capabilities to skill enablement and reduced hub dependence (conceptual claim; no empirical results in the excerpt).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... skill acquisition / reduction in dependence on central hub

Domain spokes own business semantics, product backlogs, and local iteration cadence, progressively assuming greater responsibility as they mature (shifting operational ownership outward over time).

Architectural/organizational design element described in the paper (procedural proposal for staged ownership transfer).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... task allocation and ownership over data product lifecycle

A central hub (Center of Excellence) can provide shared platform services, policy automation, and AI-enabled governance that automatically standardizes data products, generates quality rules, drafts data contracts, and reviews changes for regressions.

Functional capabilities described in the proposed architecture; presented as what the hub component will provide (design/specification).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... automation and standardization of governance tasks (e.g., quality rules, contrac...

An AI-augmented hub-and-spoke model layered on a modern lakehouse architecture can relax the flexibility-versus-control trade-off inherent in enterprise data platforms.

Proposed architectural solution and theoretical argument in the paper (design proposal; no reported experimental/field results provided in the text excerpt).

high positive Beyond the Data Mesh Illusion: Designing Modern AI-augmented... balance between flexibility (domain self-service) and centralized control (gover...

Affordance actualization (i.e., the realization of GenAI affordances) can shift strategic choices between replacement and retainment of target systems.

Theoretical contribution supported by empirical illustration from two consecutive acquisitions of the same target in the authors' case study (qualitative evidence).

high positive From Knowledge Loss To Knowledge Leverage: How Gen Ai Afford... strategic IS integration choice (replace vs retain)

GenAI reconfigures perceived knowledge challenges, alters integration logics, and expands feasible paths for value capture in M&A IS integration decisions.

Synthesis claim based on the paper's two-case comparative study and theoretical framing using the knowledge-based view and technology affordance lens (qualitative, interpretive evidence).

high positive From Knowledge Loss To Knowledge Leverage: How Gen Ai Afford... feasible paths for value capture and strategic integration logic

GenAI affordances reduced prior assumptions about system intransparency, personnel dependence, and conversion costs during IS integration.

Authors' analysis of the comparative case evidence showing changed perceptions and lowered barriers in the second acquisition after GenAI affordance discovery (qualitative evidence).

high positive From Knowledge Loss To Knowledge Leverage: How Gen Ai Afford... perceived integration barriers (intransparency, personnel dependence, conversion...

LLM-supported affordances, such as learning system knowledge through chat, increased knowledge transferability, knowledge aggregation, and efficiency.

Observed and interpreted affordance actualization in the second acquisition within the paper's qualitative case study (authors report that LLM/chat features enabled these improvements).

high positive From Knowledge Loss To Knowledge Leverage: How Gen Ai Afford... knowledge transferability, knowledge aggregation, efficiency

In the second acquisition the acquirer adopted a 'retain-and-revive' approach for the same target, enabled by newly discovered GenAI affordances.

Empirical observation from the paper's comparative case study of two consecutive acquisitions of the same digital target (qualitative case evidence showing contrasting integration choices across the two acquisitions).

high positive From Knowledge Loss To Knowledge Leverage: How Gen Ai Afford... IS integration strategy (retain-and-revive enabled by GenAI)

These findings suggest a dynamically adaptive LLM-teacher collaboration as student proficiency increases.

Interpretive/recommendation claim in the abstract: authors conclude that collaboration should adapt dynamically with student proficiency based on observed efficacy and ceiling effects.

high positive Double-Edged Sword or Sharp Tool? Designing and Evaluating T... adaptive collaboration strategy / task allocation over proficiency

Both LLM and teacher are critical for student skill improvement.

Abstract statement reporting that both LLM and teacher contributions were important for skill improvement; supported by empirical analysis on the reported dataset (57,954 essays).

high positive Double-Edged Sword or Sharp Tool? Designing and Evaluating T... skill improvement (writing skill acquisition)

Teachers act as pedagogical gatekeepers and bridges to guarantee feedback quality.

Stated in the abstract that within the triadic system teachers ensure feedback quality, implying a complementary role confirmed by the authors' empirical analysis or system design.

high positive Double-Edged Sword or Sharp Tool? Designing and Evaluating T... feedback quality

The triadic collaboration system is efficacious in improving writing quality.

Empirical claim in the abstract supported by analysis of the large dataset (57,954 essays from 10,195 students across 120 schools over two years). The paper states findings confirm the system's efficacy in improving writing quality.

high positive Double-Edged Sword or Sharp Tool? Designing and Evaluating T... writing quality

We introduce a multidimensional evaluation framework grounded in Systemic Functional Linguistics and the suggestion trajectory tracing pipeline.

Methodological contribution explicitly reported in the abstract: a new evaluation framework combining SFL and a suggestion trajectory tracing pipeline.

high positive Double-Edged Sword or Sharp Tool? Designing and Evaluating T... evaluation framework (methodology)
