Evidence (6507 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Filter: Productivity

These findings challenge the narrative of complete automation by AI and underscore the enduring importance of human expertise in data science.

Interpretation based on competition results where AI-only baselines underperformed relative to many participant teams and top solutions used human-AI collaboration.

high mixed AgentDS Technical Report: Benchmarking the Future of Human-A... implications for automation vs. human expertise

Regional analysis shows inland regions remain capital-dependent, with an estimated capital elasticity of approximately 0.43.

Regional decomposition/estimation reported in the study comparing inland regions to coastal ones using the extended production function.

high mixed Analysis of China's Economic Growth Drivers: An Empirical St... capital elasticity in inland regions (≈0.43)
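To make the elasticity figure concrete, the sketch below shows what a capital elasticity of about 0.43 implies for output when capital changes and all other inputs are held fixed. It assumes a constant-elasticity (Cobb-Douglas-style) relationship; the entry above mentions only an "extended production function", so the functional form and the function name `output_change_pct` are illustrative assumptions, not the study's specification.

```python
def output_change_pct(capital_change_pct: float, elasticity: float = 0.43) -> float:
    """Percent change in output for a given percent change in capital.

    Assumes output is proportional to capital ** elasticity with all other
    inputs held fixed (an illustrative assumption, not the study's model).
    """
    growth_factor = (1.0 + capital_change_pct / 100.0) ** elasticity
    return (growth_factor - 1.0) * 100.0


if __name__ == "__main__":
    # A 1% rise in capital yields roughly a 0.43% rise in output.
    print(round(output_change_pct(1.0), 3))
    # Larger changes scale less than proportionally because elasticity < 1.
    print(round(output_change_pct(10.0), 3))
```

Under this assumption, an elasticity below 1 means output grows less than proportionally with capital.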

The authors identify ten evaluation practices that teams use, ranging from lightweight interpretive checks to formal organizational processes (examples: qualitative user reviews, red-team testing, A/B experiments, telemetry/log analysis, structured annotation, governance/meta-evaluation).

Thematic coding of 19 interview transcripts produced a taxonomy enumerating ten practices (paper reports the taxonomy as an outcome).

high mixed Results-Actionability Gap: Understanding How Practitioners E... taxonomy/count and description of evaluation practices

Quantum-driven growth depends critically on adoption rates, infrastructure readiness, complementary investments (digital infrastructure, human capital), and enabling policy/regulatory environments.

Scenario framework that varies (a) technical timelines, (b) sectoral adoption rates (diffusion models), (c) infrastructure readiness, and (d) policy environments; policy counterfactual modeling shows sensitivity of adoption and macro outcomes to these parameters.

high mixed Modeling Macroeconomic Output Gains from Quantum-Driven Prod... realized productivity gains, adoption rates, speed of diffusion

The magnitude and timing of macroeconomic impact from quantum computing are highly uncertain.

Monte Carlo / scenario ensemble results showing wide (fat-tailed) outcome distributions driven by uncertainty in technical milestones, adoption rates, and complementarity strengths; use of expert elicitation to parameterize tail risks.

high mixed Modeling Macroeconomic Output Gains from Quantum-Driven Prod... distribution of macroeconomic outcomes (GDP growth, TFP), timing of impacts

Policymakers face trade-offs between promoting innovation and market efficiency on one hand and protecting privacy, fairness, and national security on the other; economic analysis can inform calibration.

Normative policy analysis and synthesis of literature on digital regulation and trade-offs; supported by comparative observations of regulatory priorities across jurisdictions.

high mixed Path Analysis of Digital Economy and Reconstruction of Inter... policy trade-offs (innovation vs. privacy/fairness/security) and associated welf...

Safeguards such as audit trails, explainability, and human oversight impose additional implementation costs that must be weighed against efficiency benefits.

Normative and economic reasoning based on requirements for compliance and system design; no empirical cost estimates provided.

high mixed ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... implementation costs versus efficiency gains (net cost-benefit of deploying safe...

There is a fundamental tension between AI-driven efficiency and core administrative-law principles—discretion, due process, and accountability.

Doctrinal legal analysis of administrative-law principles in Vietnam and comparative institutional analysis of AI adoption in other systems.

high mixed ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... trade-off between administrative efficiency and adherence to legal principles (d...

The net educational value of AI-generated feedback depends on alignment with pedagogical goals, quality evaluation, integration with human teaching, and governance to manage equity, privacy, and incentives.

Synthesis statement from the meeting report produced by 50 interdisciplinary scholars; conceptual judgment rather than empirical proof.

high mixed The Future of Feedback: How Can AI Help Transform Feedback t... net educational value (composite of learning outcomes, equity metrics, privacy c...

LLMs excel at extracting and generating arguments from unstructured text but are opaque and hard to evaluate or trust.

Synthesis of recent LLM literature and observed properties (generation capability vs. opacity); no empirical evaluation within this paper.

high mixed Argumentative Human-AI Decision-Making: Toward AI Agents Tha... argument extraction/generation performance and model interpretability/trustworth...

HindSight has limitations: it depends on citation and venue proxies for impact, uses a finite forward window (30 months), and may undercount delayed-impact research and be domain-specific to AI/ML.

Authors' stated limitations in the paper noting reliance on observable downstream signals (citations/venues), the finite forward window, field heterogeneity, and measurement noise.

high mixed HindSight: Evaluating LLM-Generated Research Ideas via Futur... Reliability and completeness of HindSight as an evaluation metric given proxy ch...

Practical caveats: benefits depend on accelerators supporting MXFP formats; despite up to 96% recovery, residual quality gaps may remain for some task-specific or safety-critical cases; integration and tuning cost is required to apply BATQuant.

Discussion/limitation section in the paper outlining hardware dependency, remaining quality gaps despite high recovery percentages, and engineering effort for integration and tuning; these are argumentative caveats rather than results of controlled experiments.

high mixed BATQuant: Outlier-resilient MXFP4 Quantization via Learnable... Dependency on hardware support (binary), residual accuracy gap relative to full-...

The sign of the Largest Lyapunov Exponent (LLE) gives a precise criterion: a negative LLE (contracting dynamics) permits fast convergence and real speedups for parallel Newton methods, whereas a positive LLE (expanding/chaotic dynamics) generally prevents fast convergence.

Theoretical derivation relating Lyapunov exponents to the stability of parallel-in-time linearizations and convergence of the parallel Newton iterations; supported by empirical observations reported on representative tasks.

high mixed Unifying Optimization and Dynamics to Parallelize Sequential... relation between LLE sign and achievable convergence speed / provable accelerati...
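As a rough illustration of what the sign of the LLE measures, the sketch below estimates it for the one-dimensional logistic map by averaging log|f'(x)| along an orbit. The logistic map, the parameter values, and the function name `logistic_lle` are illustrative assumptions and are not taken from the thesis. The code shows only the contracting-versus-expanding distinction, not the parallel Newton method itself.

```python
import math


def logistic_lle(r: float, x0: float = 0.2, burn_in: int = 1_000, steps: int = 10_000) -> float:
    """Estimate the largest Lyapunov exponent of the map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = r * x * (1.0 - x)

    total = 0.0
    for _ in range(steps):
        deriv = abs(r * (1.0 - 2.0 * x))       # |f'(x)| at the current point
        total += math.log(max(deriv, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / steps


if __name__ == "__main__":
    # r = 2.5: the orbit settles on a stable fixed point, so the estimate is negative.
    print("contracting:", logistic_lle(2.5))
    # r = 3.9: a commonly used chaotic setting, so the estimate is positive.
    print("expanding:  ", logistic_lle(3.9))
```

For r = 2.5 the orbit converges to x = 0.6, where |f'(x)| = 0.5, so the estimate approaches ln 0.5 (about -0.69). Nearby trajectories shrink together, which is the contracting regime in the claim above.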

Many fixed-point and iterative schemes (e.g., Picard, Jacobi) are unified as special cases within the parallel Newton framework.

Theoretical analysis and derivations in the thesis that show these classical iterative methods arise from particular choices/approximations in the parallel Newton formulation.

high mixed Unifying Optimization and Dynamics to Parallelize Sequential... theoretical inclusivity / mapping of existing algorithms to the framework

The core problem is a trade-off between computational latency/resource cost and decision correctness: invoking more LLM reasoning improves correctness but increases latency; invoking less reduces latency but can increase failures.

Paper frames the research problem explicitly as this trade-off in the Introduction/Problem framing sections and motivates the need for adaptive orchestration.

high mixed When Should a Robot Think? Resource-Aware Reasoning via Rein... trade-off between decision correctness (task success) and computational latency/...

The paper's proposed ISB+NDMS approach is tailored to the Russian institutional context (leveraging historical planning experience) and its transferability to other political-economic systems is uncertain.

Comparative/transferability claim based on institutional analysis and normative reasoning in the paper; no cross-country empirical comparisons provided.

high mixed DIGITAL TRANSFORMATION OF THE RUSSIAN FEDERATION’S SOCIOECON... transferability/applicability of ISB+NDMS across institutional contexts

Teamwork partner type moderates the effect of service empathy on collaboration proficiency (i.e., the impact of service empathy on proficiency differs by human vs AI partner).

Reported interaction/moderated-mediation analyses from the online experiment (n = 861) indicating a significant partner-type × service-empathy interaction predicting collaboration proficiency.

high mixed Adoption of AI partners in temporary tasks: exploring the ef... collaboration proficiency

Employees' emotional state significantly moderates the relationship between partner type (human vs AI) and collaboration proficiency.

Moderation analyses reported from the same online experimental dataset (n = 861), testing interaction terms between partner type and measured employee emotion on collaboration proficiency; authors report a significant moderating effect.

high mixed Adoption of AI partners in temporary tasks: exploring the ef... collaboration proficiency

Demand for labor will shift toward data scientists, ML engineers, and interdisciplinary scientists, while wet-lab expertise and translational teams remain crucial.

Workforce trend analysis and employer hiring patterns summarized in the paper; interviews/case studies indicating changes in team composition.

high mixed Has AI Reshaped Drug Discovery, or Is There Still a Long Way... demand composition for roles (data scientists, ML engineers, wet-lab scientists)...

AI excels at hypothesis generation but cannot replace scientific reasoning and experimental validation; human expertise remains essential.

Argument and case examples in the paper showing AI-generated hypotheses requiring human-led experimental design, interpretation, and validation.

high mixed Has AI Reshaped Drug Discovery, or Is There Still a Long Way... role of AI versus human scientists in hypothesis generation and experimental val...

Net gains from AI are neither automatic nor evenly distributed; benefits depend on translation rates to clinical success and on addressing non-technical enablers.

Synthesis and conditional argument informed by sector observations; not backed by empirical distributional analysis in the paper.

high mixed AI as the Catalyst for a New Paradigm in Biomedical Research distribution of gains across firms and translation to clinical success

Alignment with evolving regulatory expectations (evidence standards, auditing, liability) is necessary to translate AI capabilities into products and reduce adoption risk.

Policy-focused argument referencing regulatory uncertainty; no empirical measures of regulatory impact included.

high mixed AI as the Catalyst for a New Paradigm in Biomedical Research adoption risk and time-to-market under regulatory regimes

Realized, sustained impact ('democratized discovery') from AI depends on non-technological enablers: high-quality interoperable data, rigorous validation, transparency/auditability, workforce upskilling, ethical oversight, and regulatory alignment.

Synthesis and prescriptive argument in editorial grounded in observed constraints; no empirical testing of causal dependence provided.

high mixed AI as the Catalyst for a New Paradigm in Biomedical Research sustained impact of AI on discovery (realized democratized discovery)

Reward mechanisms reviewed include up-front token sales, milestone-triggered payouts, bounties, and royalties/licensing revenue distribution.

Synthesis of literature and case-study descriptions documenting available reward/payment mechanisms used by DAOs in decentralized science contexts.

high mixed Decentralized Autonomous Organizations in the Pharmaceutical... presence and prevalence of specific reward/payment mechanism types

Decision models in DAO governance include token-weighted voting, quadratic voting, reputation/stake-based delegation, and multisig/DAO councils for off-chain execution.

Theoretical review of governance mechanisms and survey of existing DAO practices as reported in secondary sources and project documentation.

high mixed Decentralized Autonomous Organizations in the Pharmaceutical... types of decision mechanisms implemented across DAOs

Token overhead varies from modest savings to a 451% increase while pass rates remain unchanged.

Measured token usage for agent runs with and without skills, reporting a range from modest token savings up to a 451% token increase with no corresponding change in pass rates.

high mixed SWE-Skills-Bench: Do Agent Skills Actually Help in Real-Worl... token usage/overhead (percent change) and its relation to pass rates
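A percent increase is easy to misread as a multiplier, so the sketch below spells out the arithmetic: a 451% increase means roughly 5.5 times the baseline token count, not 4.5 times. The baseline of 1,000 tokens and the helper names are hypothetical and chosen only for illustration; they are not figures from the benchmark.

```python
def percent_change(baseline: float, new: float) -> float:
    """Percent change from baseline to new (positive means an increase)."""
    return (new - baseline) / baseline * 100.0


def apply_percent_increase(baseline: float, pct: float) -> float:
    """Value obtained after increasing baseline by pct percent."""
    return baseline * (1.0 + pct / 100.0)


if __name__ == "__main__":
    baseline_tokens = 1_000.0  # hypothetical run without skills
    with_skills = apply_percent_increase(baseline_tokens, 451.0)
    print(with_skills)                                    # about 5,510 tokens
    print(with_skills / baseline_tokens)                  # about 5.51x the baseline
    print(percent_change(baseline_tokens, with_skills))   # about 451
```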

The review synthesizes cross-domain evidence on the use of AI across the continuum from target identification to regulatory integration and critically evaluates existing limitations including data bias, interpretability discrepancy, and regulatory ambiguity.

Statement about the scope and content of the review (literature synthesis and critical evaluation). This is a description of the paper's methods/content rather than an empirical finding; the excerpt indicates these topics are discussed.

high mixed THE AI REVOLUTION IN PHARMACEUTICALS: INNOVATIONS, CHALLENGE... coverage of limitations in AI application (presence and discussion of data bias,...

The study investigates the benefits and drawbacks associated with the incorporation of innovative artificial intelligence technologies into industrial policies.

Author-stated research objective reported in the text; evidence claimed to come from literature review (novel studies and existing literature), but no specific studies, sample sizes, or empirical measures are provided in the excerpt.

high mixed A Study on Work-Life Balance of Women Employees in the IT Se... benefits and drawbacks of incorporating AI into industrial policy

The paper constructs three policy-contingent labor market scenarios for 2025–2035: (1) an Augmented Services Economy with inclusive productivity gains, (2) a Dual-Speed Labor Market characterized by polarization and uneven adjustment, and (3) a Disruptive Automation Shock involving significant displacement and social strain.

Prognostic, scenario-based approach integrating the three evidence bases (task-level capability mapping, occupational exposure/complementarity analysis, and firm- and worker-level adoption evidence). The scenarios are developed and described in the paper for the 2025–2035 horizon.

high mixed Labor Futures Under Artificial Intelligence: Scenarios for t... alternative labor market trajectories for 2025–2035 (employment levels by sector...

The validity of human–AI decision-making studies hinges on participants' behaviours; effective incentives can potentially affect these behaviours.

Conclusion from the authors' thematic review and theoretical rationale linking incentive design to participant behaviour and study validity (no quantitative effect sizes provided in excerpt).

high mixed Incentive-Tuning: Understanding and Designing Incentives for... participant behaviour (engagement, effort, strategy) and resulting study validit...

The study's counterfactual analytical model links HR indicators (training intensity, absenteeism, labor productivity, turnover rates, workforce allocation) to organizational performance outcomes using regression-based simulations and predictive estimation.

Methodological claim explicitly stated: model construction from an industrial firm dataset using regression-based simulations and predictive techniques. (Specific sample size, variable operationalizations, and time frame not reported in the description.)

high mixed Artificial Intelligence and Human Resource Management: A Cou... methodological estimate of counterfactual organizational performance outcomes

Only one study reported a modest improvement in predicting endoscopic intervention needs (AUC: 0.68).

Single-study result cited in the review reporting AUC = 0.68 for prediction of need for endoscopic intervention.

high mixed How Do AI-Assisted Diagnostic Tools Impact Clinical Decision... prediction of need for endoscopic intervention (AUC)

The review synthesizes findings across five thematic areas: AI‑driven task automation and decision support; digital literacy and capacity building; gender‑sensitive employment patterns; infrastructural and policy challenges; and sustainable development outcomes.

Thematic synthesis of the 55 included articles as described in the paper; themes explicitly listed by the authors.

high mixed Role of AI in Enhancing Work Efficiency and Opportunities fo... thematic categorization of evidence across included studies

Artificial intelligence (AI) has a positive and statistically significant effect on growth at lower conditional quantiles (τ = 0.10–0.25) but is insignificant at higher quantiles.

MMQR estimation results reported in the paper showing significant positive AI coefficients at τ = 0.10–0.25 and insignificant coefficients at higher quantiles.

high mixed Towards Smart, Economic Performance and Sustainable Monetary... GDP growth (conditional quantiles of growth)

Both time constraints and LLM use significantly alter the characteristics of decision-makers' mental representations.

Results from the 2 × 2 experiment (N = 348) comparing representation-related measures across manipulated conditions; reported statistically significant differences associated with time constraints and with LLM use.

high mixed AI-Augmented Strategic Decision-Making Under Time Constraint... characteristics of mental representations (representation-related measures colle...

We develop a theoretical framework, the productivity funnel, that traces how technological potential narrows through successive stages, from access and digital infrastructure, through organizational absorption and human capital adaptation, to ultimate value capture.

Conceptual/theoretical development presented in the paper; no empirical sample needed (framework-building).

high mixed The complementarity trap: AI adoption and value capture n/a (theoretical framework describing stages leading to value capture)

Effects of curated Skills are highly heterogeneous across domains (e.g., +4.5 pp in Software Engineering vs. +51.9 pp in Healthcare).

Per-domain pass-rate deltas reported in the paper (SkillsBench per-domain analysis). The example domain deltas (+4.5 pp and +51.9 pp) are taken from the reported per-domain results.

high mixed SkillsBench: Benchmarking How Well Agent Skills Work Across ... task pass rate (per-domain average delta)

Institutional factors (education systems, active labor market policies, mobility, industrial policy, social protection) shape net employment outcomes from AI.

Theoretical and policy-focused synthesis; cross-country comparisons in literature highlight institutional mediation though no single new cross-country empirical estimate is provided.

high mixed Artificial Intelligence, Automation, and Employment Dynamics... variation in employment outcomes and distributional impacts across countries wit...

Net employment effects depend on the balance of substitution and complementarity, sectoral exposure, and institutional responses.

Conceptual labor-economics framework (task-based, skill-biased change) and comparative review of cross-country/sectoral evidence emphasizing institutional mediation.

high mixed Artificial Intelligence, Automation, and Employment Dynamics... net employment change (by sector/country) and distributional outcomes

AI will substantially restructure labor markets.

Task-based theoretical approach and cross-sectoral synthesis of empirical studies showing task substitution and complementarity effects across occupations and sectors.

high mixed Artificial Intelligence, Automation, and Employment Dynamics... occupational composition, sectoral employment shares, task mix

Kondratieff, Schumpeter, and Mandel each highlight different drivers of capitalist long waves: Kondratieff emphasizes regular technological-driven renewal, Schumpeter emphasizes entrepreneurship and innovation-led creative destruction, and Mandel emphasizes class relations and production structures.

Comparative theoretical analysis and literature synthesis across the three schools; conceptual summary of canonical positions (no original dataset; qualitative interpretation).

high mixed Economic Waves, Crises and Profitability Dynamics of Enterpr... theoretical drivers of capitalist cycles

The study's qualitative and exploratory design limits generalizability; the proposed framework requires quantitative testing and broader samples (practicing architects, firms, cross-cultural contexts).

Explicit limitations stated by authors; study is based on semi-structured interviews with architecture students (N unspecified) and inductive thematic analysis.

high mixed Human–AI Collaboration in Architectural Design Education: To... generalizability / external validity of findings and framework

XChronos reframes transhumanist technology evaluation in experiential terms, creating both market opportunities and measurement/regulatory challenges for AI economics.

Synthesis and concluding argument in the paper summarizing proposed implications; conceptual reasoning without empirical tests.

high mixed XChronos and Conscious Transhumanism: A Philosophical Framew... shift in evaluation criteria toward experiential measures and resultant market/r...

The methodological landscape of the evidence base is heterogeneous, consisting of cross-sectional surveys, case studies, quasi-experimental designs, and a limited number of longitudinal analyses.

Study design information was extracted from the 145 included studies, revealing a mix of designs and relatively few longitudinal or experimental studies.

high mixed Digital transformation and its relationship with work produc... study design types (cross-sectional, case study, quasi-experimental, longitudina...

Human factors (training, trust calibration, workflows) determine whether clinicians accept, override, or ignore GenAI suggestions.

Qualitative and quantitative human-AI interaction studies and pilot deployments discussed in the paper; specific sample sizes and effect sizes are not reported in the paper.

high mixed GenAI and clinical decision making in general practice override/acceptance rates; clinician-reported trust and cognitive load; adherenc...

Safety and net benefit of GenAI CDS hinge on deployment details: user interface, real-time feedback, uncertainty quantification, calibration, and how recommendations are presented (strong vs. suggestive).

Human factors and implementation studies referenced; early A/B tests and human-AI interaction research suggest interface and presentation affect acceptance and error rates; no large-scale standardized implementation trial data cited.

high mixed GenAI and clinical decision making in general practice acceptance/override rates; error rates; calibration metrics; clinician trust

Reimbursement models (fee-for-service vs. capitation) will influence whether cost savings from GenAI are realized or offset by increased service volume.

Economic incentive framework and prior health-economics literature cited; the paper does not provide direct empirical tests but references plausible incentive channels.

high mixed GenAI and clinical decision making in general practice total spending; per-patient cost; service volume under different payment models

RL and adaptive methods are good for real-time adaptation but can be myopic, require large amounts of interaction data, and struggle to incorporate long-term preference structure and ethical constraints.

Surveyed properties of reinforcement learning and adaptive methods in HRI/RS literature; no new empirical evaluation in this paper.

high mixed Reimagining Social Robots as Recommender Systems: Foundation... real-time adaptation effectiveness, sample efficiency (amount of interaction dat...

Key tradeoffs in contemporary financing models include speed/flexibility versus regulatory coverage and long‑term cost, and data reliance versus privacy/fairness.

Multi‑criteria comparative evaluation and conceptual analysis across financing models; synthesis draws on regulatory context and observed product features rather than primary quantitative tradeoff estimation.

high mixed Traditional vs. contemporary financing models for MSMEs and ... tradeoff between speed/flexibility and regulatory protection/cost; tradeoff betw...

Performance of structure prediction models scales with data, model size, and compute; there are tradeoffs between accuracy and inference speed/simplicity.

Paper explicitly states scaling behavior and tradeoffs in 'Compute and training' and 'Representative models' sections; no precise scaling curves or thresholds are provided in the text.

high mixed Protein structure prediction powered by artificial intellige... model predictive performance as a function of training data volume, model size, ...
