Evidence (4560 claims)
Claims by topic category:

| Category | Claims |
|---|---|
| Adoption | 5267 |
| Productivity | 4560 |
| Governance | 4137 |
| Human-AI Collaboration | 3103 |
| Labor Markets | 2506 |
| Innovation | 2354 |
| Org Design | 2340 |
| Skills & Training | 1945 |
| Inequality | 1322 |
Evidence Matrix
Claim counts by outcome category and direction of finding; "—" marks cells with no claims.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 378 | 106 | 59 | 455 | 1007 |
| Governance & Regulation | 379 | 176 | 116 | 58 | 739 |
| Research Productivity | 240 | 96 | 34 | 294 | 668 |
| Organizational Efficiency | 370 | 82 | 63 | 35 | 553 |
| Technology Adoption Rate | 296 | 118 | 66 | 29 | 513 |
| Firm Productivity | 277 | 34 | 68 | 10 | 394 |
| AI Safety & Ethics | 117 | 177 | 44 | 24 | 364 |
| Output Quality | 244 | 61 | 23 | 26 | 354 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 168 | 74 | 37 | 19 | 301 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 89 | 32 | 39 | 9 | 169 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 106 | 12 | 21 | 11 | 151 |
| Consumer Welfare | 70 | 30 | 37 | 7 | 144 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 75 | 11 | 29 | 6 | 121 |
| Training Effectiveness | 55 | 12 | 12 | 16 | 96 |
| Error Rate | 42 | 48 | 6 | — | 96 |
| Worker Satisfaction | 45 | 32 | 11 | 6 | 94 |
| Task Completion Time | 78 | 5 | 4 | 2 | 89 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 17 | 9 | 5 | 50 |
| Job Displacement | 5 | 31 | 12 | — | 48 |
| Social Protection | 21 | 10 | 6 | 2 | 39 |
| Developer Productivity | 29 | 3 | 3 | 1 | 36 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Skill Obsolescence | 3 | 19 | 2 | — | 24 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Labor Share of Income | 10 | 4 | 9 | — | 23 |
Productivity
The claims below are filtered to the Productivity category; each claim is followed by a note on its evidentiary basis.
Better data continuity across lifecycle phases reduces model training friction and increases the value of historical data for forecasting and causal analysis.
Conceptual argument supported by case evidence in the review showing fragmented data reduces reusability; authors infer benefits for AI training and forecasting.
Digital twins (DTs) generate continuous, high‑resolution operational data (IoT telemetry, usage patterns, maintenance logs) that can substantially improve AI models for predictive maintenance, scheduling, energy optimisation, and logistics.
Logical implication and examples from pilot studies in the review showing richer telemetry and operational datasets produced by DT pilots; argued benefits for AI model inputs.
DTs extend Building Information Modelling (BIM) in three core ways: (1) bidirectional automated physical↔digital data exchange; (2) integration of heterogeneous, real‑time sources (IoT, operational systems); (3) lifecycle continuity preserving data across handovers.
Conceptual synthesis across the literature reviewed (conceptual papers, case studies, pilots) identifying functional distinctions between DT and BIM.
DT technology can materially improve construction lifecycle performance beyond what BIM delivers.
Synthesis of 160 reviewed studies including conceptual papers, case studies and pilot deployments reporting performance improvements attributed to DT implementations.
Artificial neural network (ANN) analysis ranks information barriers as the most important predictor of organizational inertia.
ANN feature-importance analysis reported in the paper that ranks predictors for inertia, identifying information barriers as the top predictor; methodological specifics (sample size, ANN parameters) are not provided in the abstract.
ANN analysis ranks functional values as the most important predictor of initial trust.
ANN feature-importance analysis reported in the paper that ranks predictors for initial trust, with functional values highest; method described as ANN-based relative importance ranking (details such as network architecture, training sample size, or validation metrics not reported in the abstract).
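Because the abstract reports only an ANN-based relative-importance ranking with no methodological detail, here is a minimal sketch of one common way such a ranking is produced, assuming permutation importance as the criterion; the feature names and data are hypothetical illustrations, not the paper's.

```python
# Hypothetical sketch of ANN-based feature-importance ranking. The ranking
# method (permutation importance) and all feature names are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder data: candidate predictors of initial trust and a trust score.
X = rng.normal(size=(500, 4))
y = 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=500)
features = ["functional_value", "instrumental_value",
            "other_value_1", "other_value_2"]  # hypothetical names

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                   random_state=0).fit(X_tr, y_tr)

# Rank predictors by how much shuffling each one degrades held-out fit.
imp = permutation_importance(ann, X_te, y_te, n_repeats=30, random_state=0)
for name, score in sorted(zip(features, imp.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```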
Human interaction, information, and norm barriers increase organizational inertia (resistance to change) toward GAICS.
Qualitative phase surfaced these barriers; quantitative validation showed statistically significant positive relationships between (a) need for human interaction barriers, (b) information barriers (lack of knowledge/clarity), and (c) norm barriers (cultural/social norms) and organizational inertia.
Functional and instrumental values increase initial trust in GAICS.
Mixed-methods evidence: qualitative exploratory phase identified functional and instrumental value as drivers; quantitative phase (inferential analysis) found positive, statistically significant effects of functional value (system usefulness/quality) and instrumental value (task-related benefits) on initial trust.
AI/ML–based credit scoring and alternative‑data underwriting reduce information asymmetries, lowering search and monitoring costs and expanding effective credit supply to previously rejected MSMEs and startups.
Analytical argument supported by illustrative case examples and literature on machine‑learning underwriting; the paper notes limited causal identification and time‑sensitivity of fintech products.
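To make the underwriting mechanism concrete, a minimal sketch follows, assuming gradient-boosted classification on synthetic data; every feature name (utility payments, tax-filing regularity, platform sales) is a hypothetical example of "alternative data", not a variable from the paper.

```python
# Illustrative sketch (not the paper's model) of alternative-data
# underwriting with gradient boosting. All features are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "utility_payment_streak": rng.integers(0, 36, n),   # months paid on time
    "gst_filing_regularity": rng.uniform(0, 1, n),      # tax-filing consistency
    "platform_sales_growth": rng.normal(0.05, 0.2, n),  # marketplace telemetry
    "bank_txn_volatility": rng.uniform(0, 1, n),
})
# Synthetic default flag loosely tied to the features above.
logit = (-1.5 + 0.04 * (36 - df["utility_payment_streak"])
         + 2.0 * df["bank_txn_volatility"])
default = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(df, default, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print("held-out AUC:", round(auc, 3))
```

A score like this, built on signals that thin-file MSMEs actually generate, is the sense in which such models can reduce information asymmetries for previously rejected borrowers.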
Government action (digital ID, payments rails, credit guarantees, standards, consumer protection) is vital to enable beneficial outcomes from digital finance for MSMEs.
Policy synthesis and comparative evaluation recommending government infrastructure and regulatory measures; conclusion based on institutional analysis rather than experimental evidence.
Case studies indicate FinTech platforms have meaningfully lowered rejection rates and loan turnaround times for underbanked MSMEs, accelerating working‑capital access.
Illustrative case studies of FinTech deployments in India reporting lower rejection rates and faster approvals; paper explicitly notes these cases are illustrative and not nationally representative and do not establish causal identification.
Supply‑chain financing can meaningfully unlock working capital for MSMEs by leveraging buyer creditworthiness, yielding high impact for MSMEs embedded in modern supply chains.
Comparative evaluation and illustrative case studies highlighting supply‑chain finance deployments; evidence is demonstrative and not nationally representative or causally identified.
Optimal financing outcomes generally come from hybrid approaches that combine formal banking credibility and policy support with FinTech speed and data-driven underwriting.
Comparative evaluation and policy synthesis recommending co‑lending, credit guarantees, and partnerships (banks as liquidity providers combined with FinTech underwriting); based on qualitative tradeoff analysis rather than experimental/causal evidence.
Compared with traditional bank loans and government schemes, contemporary financing models tend to be faster, more flexible, and more scalable for smaller firms.
Comparative qualitative evaluation across five variables and illustrative case studies showing reduced loan turnaround times and improved accessibility for small firms; no nationally representative sample or causal inference provided.
Digital technologies — especially FinTech lending platforms, alternative debt/equity products, supply‑chain finance, crowdfunding, and emerging blockchain applications — are materially expanding timely access to capital for Indian MSMEs and startups.
Multi‑criteria comparative evaluation (accessibility, finance cost, flexibility, risk, scalability) plus illustrative case studies of FinTech and alternative financing deployments in India that report faster turnaround and inclusion effects. The paper notes case evidence is illustrative rather than nationally representative and lacks quantitative causal identification.
Proprietary experimental datasets and curated metagenomic sequences become valuable intellectual assets that can differentiate commercial offerings.
Paper lists 'Data as an economic asset' and highlights the value of proprietary datasets and curated metagenomes; no market valuation data are included.
Faster, cheaper access to structural hypotheses can shorten drug and enzyme discovery cycles, raising R&D productivity and lowering marginal costs of early‑stage screening.
Paper argues this as an implication under 'Productivity and R&D acceleration'; it is presented as an economic consequence rather than demonstrated with empirical cost- or time-saving data in the text.
Practical applications are already emerging, including accelerating target structure availability for small‑molecule and biologics design, guiding enzyme redesign, and interpreting disease mutations.
Paper lists these application areas as emerging uses of AI‑predicted structures; evidence is presented as examples and implications rather than empirical case studies within the text.
Template- and MSA-informed architectures (e.g., RoseTTAFold and the AlphaFold family) deliver near‑experimental accuracy for many proteins.
Paper names these architectures and links their inputs (MSAs, templates) to high accuracy against experimental structures (PDB); specific evaluation datasets, protein counts, or error metrics are not enumerated in the text.
Modern AI systems (e.g., AlphaFold variants, RoseTTAFold, single‑sequence models like ESMFold) can approach or reach near‑experimental accuracy while greatly increasing speed and scalability.
Paper cites specific models (AlphaFold family, RoseTTAFold, ESMFold) and describes benchmarking against structural ground truth (PDB / curated experimental structures) and large‑scale pretraining; exact benchmark values or sample sizes are not specified in the text.
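Since the text does not enumerate benchmark metrics, here is a hedged sketch of one standard way predicted structures are scored against experimental ground truth: Cα RMSD after optimal (Kabsch) superposition. This only illustrates the benchmarking concept; the cited evaluations may use other metrics such as lDDT or TM-score.

```python
# One common accuracy metric for predicted vs experimental structures:
# C-alpha RMSD after optimal (Kabsch) superposition. Illustrative only.
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rotation."""
    P = P - P.mean(axis=0)                   # center both structures
    Q = Q - Q.mean(axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # optimal rotation matrix
    return float(np.sqrt(np.mean(np.sum((P @ R.T - Q) ** 2, axis=1))))

# Toy check: a prediction that is a rotated copy of the target scores ~0.
rng = np.random.default_rng(2)
target = rng.normal(size=(50, 3))
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
predicted = target @ rot.T
print(round(kabsch_rmsd(predicted, target), 6))  # ~0.0
```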
Based on findings and student-reported concerns, the authors recommend integrating explicit AI-literacy instruction to support critical and reflective use of Generative AI tools in education.
Authors' recommendation in discussion sections, motivated by observed heterogeneous effects, student concerns about accuracy and overreliance, and qualitative calls for guidance; recommendation not experimentally tested in this study.
Students reported that ChatGPT provided faster access to information, helped clarify concepts, and aided organization (e.g., outlining and summarizing).
Qualitative topic-based coding of open-ended survey responses from participating students (sample = 254 across six courses); thematic analysis identified benefits including speed, clarification, and organizational support.
There is a weak but statistically significant positive relationship between iterative engagement with ChatGPT (measured by number of edits to the tool's outputs) and better academic performance.
Correlational analysis between usage behavior (number of edits) and student scores reported as weak but significant; based on same experimental sample (N = 254) and usage logs/survey data.
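A minimal sketch of this kind of correlational check, assuming Spearman's rank correlation (the summary does not name the statistic used) and hypothetical column names and data:

```python
# Sketch of the reported correlational check: association between students'
# edit counts and scores. Data and the choice of Spearman are assumptions.
import pandas as pd
from scipy.stats import spearmanr

usage = pd.DataFrame({
    "n_edits": [0, 2, 5, 1, 7, 3, 4, 0, 6, 2],   # edits to ChatGPT outputs
    "score":   [61, 70, 78, 64, 85, 72, 74, 58, 80, 69],
})
rho, p = spearmanr(usage["n_edits"], usage["score"])
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```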
The improvement from allowing ChatGPT use was statistically significant in specific courses, with computer systems administration, informatics, and childhood disorders named as examples.
Course-level analyses using GLM and non-parametric comparisons showing statistically significant treatment effects in some courses; sample drawn from the full N = 254 distributed across six courses (per-course Ns not specified in summary).
Allowing students to use ChatGPT on knowledge-based academic tasks led to generally higher scores compared with control groups restricted to non-GenAI resources.
Randomized/experimental assignment of students to treatment (allowed ChatGPT) vs control (no GenAI) across six courses at two institutions; overall sample N = 254; comparisons made using descriptive statistics, general linear model (GLM) controlling for covariates, and non-parametric tests.
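To make the analysis pattern concrete, a hedged sketch follows: an ordinary-least-squares GLM with a treatment indicator, a covariate, and course fixed effects, plus a non-parametric Mann-Whitney comparison. Only the overall N = 254 and the six-course structure come from the summary; the covariate (prior_gpa) and all data are hypothetical.

```python
# Sketch of the reported analysis pattern: GLM with treatment plus
# covariates, alongside a non-parametric comparison. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(3)
n = 254  # overall sample size reported in the study
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),              # 1 = ChatGPT allowed
    "prior_gpa": rng.normal(3.0, 0.4, n),          # hypothetical covariate
    "course": rng.integers(0, 6, n).astype(str),   # six courses
})
df["score"] = (60 + 5 * df["treated"] + 8 * (df["prior_gpa"] - 3.0)
               + rng.normal(0, 8, n))

# GLM: treatment effect controlling for covariate and course fixed effects.
fit = smf.ols("score ~ treated + prior_gpa + C(course)", data=df).fit()
print(fit.params["treated"], fit.pvalues["treated"])

# Non-parametric treatment/control comparison.
u, p = mannwhitneyu(df.loc[df.treated == 1, "score"],
                    df.loc[df.treated == 0, "score"])
print(p)
```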
Policy and platform design choices (e.g., provenance metadata, detection/disclosure of AI-generated content, monetization rule alignment) can reinforce or mitigate harms from GenAI-driven creator economies.
Policy recommendations and implications drawn from the qualitative findings across the 377-video sample and normative reasoning; not empirically tested.
Policy interventions that raise the reinstatement rate — for example, compensation/transfers to translate AI gains into broad-based purchasing power, faster/stronger fiscal support or automatic stabilizers — can prevent the explosive feedback and stabilize demand.
Model experiments and sensitivity analysis showing that increasing the reinstatement elasticity or direct transfers moves the system from explosive to convergent parameter regions in the calibrated phase-space.
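The calibrated model itself is not reproduced here; the toy linear-feedback sketch below only illustrates the qualitative mechanism, with a net feedback multiplier of 1 separating convergent from explosive demand paths. All parameter values are hypothetical.

```python
# Toy illustration (not the paper's calibrated model): automation displaces
# purchasing power each period; a reinstatement rate r restores part of it.
def simulate(reinstatement, amplification=1.5, periods=40,
             d0=100.0, shock0=-5.0):
    """Each period, a fraction (1 - reinstatement) of lost demand feeds back,
    amplified by the spending multiplier; g < 1 converges, g > 1 explodes."""
    g = amplification * (1.0 - reinstatement)
    demand, shock = d0, shock0
    for _ in range(periods):
        demand += shock
        shock *= g  # each round of lost (or restored) demand feeds back
    return demand, g

for r in (0.2, 0.5, 0.8):  # low / moderate / high reinstatement
    d, g = simulate(r)
    print(f"reinstatement={r}: feedback g={g:.2f}, "
          f"demand after 40 periods={d:,.1f}")
# Low r -> g > 1 -> explosive demand collapse; higher r -> convergence.
```

Raising the reinstatement rate (or injecting transfers that play the same role) pushes g below 1, which is the stabilizing move the claim describes.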
FutureBoosting generalizes across multiple real-world electricity markets and forecast horizons.
Empirical results reported across 'multiple real-world electricity markets' and several forecasting horizons to capture diverse volatility and regime behavior (details on exact markets/horizons are reported in the experiments section of the paper).
The approach preserves the interpretability of downstream regression models while injecting temporal context.
Use of interpretable regression models (e.g., gradient-boosted decision trees) and XAI analyses (SHAP/feature importance) reported in the paper demonstrating interpretability of feature contributions.
Freezing the time-series foundation model (TSFM), with no joint fine-tuning, makes the framework lightweight and plug-and-play, lowering computational cost relative to joint training.
Architectural design: two-stage pipeline with a frozen TSFM used only to generate forecasted features; paper asserts ability to leverage pretrained TSFMs without end-to-end retraining. (No detailed compute-cost benchmarks given in the summary.)
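A minimal sketch of this two-stage design, with a naive seasonal stub (`frozen_tsfm_forecast`) standing in for a real pretrained TSFM; the stub, data, and feature layout are assumptions, not the paper's implementation.

```python
# Minimal sketch of the two-stage pipeline: a frozen pretrained forecaster
# produces future-looking features that are appended to the inputs of an
# interpretable downstream regressor. No TSFM weights are updated.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def frozen_tsfm_forecast(history: np.ndarray, horizon: int) -> np.ndarray:
    """Stand-in for a frozen TSFM's zero-shot forecast (here: a naive
    daily-seasonal carry-forward). In practice, call a pretrained model."""
    season = 24  # hourly electricity data assumed
    return np.array([history[-season + (h % season)] for h in range(horizon)])

rng = np.random.default_rng(4)
t = np.arange(24 * 60)
prices = 50 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 3, t.size)

X, y = [], []
for i in range(24 * 30, t.size):
    lags = prices[i - 24:i]                       # conventional lag features
    fcast = frozen_tsfm_forecast(prices[:i], 1)   # stage 1: forecasted feature
    X.append(np.concatenate([lags, fcast]))
    y.append(prices[i])

# Stage 2: gradient-boosted trees trained on lags + forecasted features.
model = GradientBoostingRegressor(random_state=0).fit(np.array(X[:-200]),
                                                      y[:-200])
preds = model.predict(np.array(X[-200:]))
print("holdout MAE:", round(float(np.mean(np.abs(preds - np.array(y[-200:])))), 3))
```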
MAE reductions exceed 30% in many cases when using FutureBoosting.
Reported quantitative results in the paper showing relative MAE reductions (paper text: 'reductions in Mean Absolute Error (MAE) exceeding 30% in many cases'); based on experiments across multiple datasets/horizons.
FutureBoosting consistently outperforms state-of-the-art TSFMs and regression baselines.
Head-to-head experiments in the paper comparing the two-stage FutureBoosting pipeline to standalone TSFM models and common regression baselines (e.g., gradient-boosted trees) across multiple markets and horizons under rolling-origin evaluation.
FutureBoosting substantially improves electricity price forecasting.
Empirical evaluation reported in the paper across multiple real-world electricity market datasets and forecasting horizons; comparisons against TSFM-only and regression-only baselines using time-series-aware cross-validation; primary metric: Mean Absolute Error (MAE).
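A hedged sketch of the rolling-origin, MAE-based protocol named above; `forecaster` is a placeholder for any of the compared models, and the data are synthetic. A relative reduction such as the ">30%" figure would then be computed as (MAE_baseline − MAE_model) / MAE_baseline.

```python
# Sketch of rolling-origin (time-series-aware) evaluation scored with MAE.
import numpy as np

def rolling_origin_mae(series, forecaster, initial=500, step=50):
    """Repeatedly refit on an expanding window and score the next block."""
    errors = []
    for origin in range(initial, series.size - step, step):
        train, test = series[:origin], series[origin:origin + step]
        preds = forecaster(train, horizon=step)
        errors.extend(np.abs(preds - test))
    return float(np.mean(errors))

naive = lambda train, horizon: np.repeat(train[-1], horizon)  # last-value baseline
rng = np.random.default_rng(5)
series = np.cumsum(rng.normal(size=2000)) + 50
print("baseline MAE:", round(rolling_origin_mae(series, naive), 3))
```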
For economic and policy analysis, researchers should estimate distributions of effects, account for dynamic adaptation/nonstationarity, pre-register plans, track model versions, and combine RCTs with longitudinal/observational/structural methods.
Implications and recommendations section synthesized from practitioner interviews (n=16) and authors' applied methodological reasoning.
High-stakes deployment, governance, and safety decisions should not rely on single uplift RCTs; they require synthesis across studies, ongoing monitoring, scenario analysis, and explicit uncertainty characterization.
Authors' recommendations drawn from thematic analysis of interview data (n=16) and the mapped validity consequences; policy implications section articulates this guidance.
The paper's mechanism is strategyproof at an epoch granularity under its assumptions (quasilinear utilities, discrete slice items, decision epochs).
Theoretical mechanism-design claim presented in the paper relying on stated assumptions (quasilinear utility, discrete slices, epoch-based decisions). Empirical simulations assume truthful bidding per epoch consistent with this property but do not evaluate inter-epoch strategic deviations.
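A minimal formal sketch of what the stated assumptions imply, with notation assumed rather than taken from the paper: each bidder $i$ has quasilinear utility

$$u_i(x, p_i) = v_i(x) - p_i,$$

and per-epoch strategyproofness of the allocation rule $f^t$ and payment rule $p^t$ means that for every epoch $t$, true valuation $v_i$, misreport $v_i'$, and opponent profile $v_{-i}$,

$$v_i\!\left(f^t(v_i, v_{-i})\right) - p_i^t(v_i, v_{-i}) \;\ge\; v_i\!\left(f^t(v_i', v_{-i})\right) - p_i^t(v_i', v_{-i}).$$

Truthful bidding is thus dominant within an epoch; as the evidence note observes, deviations that span multiple epochs are left unevaluated.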
Scaffold choice creates an economic opportunity for third-party tooling and open-source scaffolding because scaffold effects materially affect performance and reproducibility.
Observed performance differences across scaffolds (up to ~5 percentage points) and sensitivity of results to scaffold selection reported in the study.
NFD increases complementarities between domain experts and AI, raising demand for hybrid roles (expert + knowledge engineer) and skills in elicitation, verification, and artifact design.
Conceptual argument in implications section, supported by practical demands observed in the case study (coordination between analysts and knowledge engineering activities).
The case study produced modular knowledge artifacts (rules, templates, tests) that supported reuse and auditability.
Empirical artifact production in the case study: creation of templates, checklists, heuristics, and test suites; reuse counts and audit traces were tracked qualitatively and with reuse metrics (exact numbers not specified).
In the same case study, iterative crystallization increased the consistency/reliability of agent outputs.
Case study measurements of agent reliability and qualitative practitioner feedback/acceptance across development spirals; precise quantitative details and sample size are not reported.
In a detailed case study building a U.S. equity financial research agent, iterative crystallization reduced per-task human effort.
Case study with iterative co-development with financial analysts; interaction transcripts logged and operational metrics (time per analysis) reported across development spirals. The paper does not report sample size or statistical tests.
Annotator affective traits shift labeling propensity (toward positivity); classifiers trained on pooled annotator labels may inherit systematic biases from annotator heterogeneity.
Observed associations between trait mood/reactivity and increased positive labeling in GEE models; extrapolated implication for classifier training when using pooled labels from heterogeneous annotators.
Trait-level mood and emotional reactivity weakly predict a higher tendency to label statements as positive (and fewer as neutral).
Statement-level repeated-measures generalized estimating equations (GEE) using the 81 participants' repeated labels of 30 statements per round; trait mood and reactivity variables were significant predictors in GEE models for positive vs neutral labeling, but with small effect sizes.
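A minimal sketch of the repeated-measures GEE described, assuming a binomial family with an exchangeable within-annotator working correlation; column names and simulated data are hypothetical, while the 81-annotator, 30-statement structure follows the summary.

```python
# Sketch of a statement-level GEE: binary outcome (labeled positive vs
# neutral) clustered by annotator, with trait predictors. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_annotators, n_statements = 81, 30  # sizes reported in the study
df = pd.DataFrame({
    "annotator": np.repeat(np.arange(n_annotators), n_statements),
    "trait_mood": np.repeat(rng.normal(size=n_annotators), n_statements),
    "reactivity": np.repeat(rng.normal(size=n_annotators), n_statements),
})
logit = -0.2 + 0.15 * df["trait_mood"] + 0.10 * df["reactivity"]
df["labeled_positive"] = (rng.uniform(size=len(df))
                          < 1 / (1 + np.exp(-logit))).astype(int)

# GEE with exchangeable working correlation within annotator.
model = smf.gee("labeled_positive ~ trait_mood + reactivity",
                groups="annotator", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary().tables[1])
```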
Replacing the binary meta-analysis assumption (fully homogeneous vs fully heterogeneous) with KL-based adaptive pooling avoids the inefficiency or bias that either extreme can introduce.
Motivating discussion and theoretical/simulation comparisons in the paper showing cases where standard approaches (fixed-effect or random-effect extremes) are inefficient or biased, and the KL method performs better.
Application to the eICU Collaborative Research Database demonstrates the practical performance of the KL-shrinkage method on a heterogeneous, multi-center clinical dataset.
Real-data empirical application described in the paper using the eICU database; reported performance comparisons (specific dataset size and metrics are provided in the paper's empirical section but are not specified in this summary).
Extensive simulation studies show the KL-shrinkage estimator is robust and versatile across varying degrees and structures of heterogeneity.
Comprehensive simulation experiments reported in the paper that vary heterogeneity magnitude and structure (simulation details reported in the empirical evaluation section; exact sample sizes/configurations given in the paper).
Using KL divergence as the penalty is a natural and tractable choice because KL measures relative information between distributions and leads to convenient geometric/algebraic properties.
Argumentation and mathematical exposition in the methods section explaining properties of KL divergence and demonstrating resulting tractability in algebraic derivations.
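As a hedged illustration of that tractability (the paper's exact objective is not reproduced here): under a Gaussian working model with known variance, the KL term collapses to a squared distance,

$$\mathrm{KL}\!\left(N(\theta, \sigma_k^2)\,\middle\|\,N(\bar\theta, \sigma_k^2)\right) = \frac{(\theta - \bar\theta)^2}{2\sigma_k^2},$$

so a KL-penalized site estimate has the closed form

$$\hat\theta_k(\lambda) = \arg\min_\theta \left\{ \frac{(\theta - \hat\theta_k)^2}{2\sigma_k^2} + \lambda \cdot \frac{(\theta - \bar\theta)^2}{2\sigma_k^2} \right\} = \frac{\hat\theta_k + \lambda \bar\theta}{1 + \lambda},$$

which shrinks linearly between the fully heterogeneous estimate ($\lambda = 0$) and the fully pooled one ($\lambda \to \infty$), interpolating exactly between the two extremes the binary assumption forces.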
Inferential procedures (e.g., confidence intervals and hypothesis tests) based on the KL-shrinkage approach are asymptotically valid without assuming parameter homogeneity across datasets.
Asymptotic theoretical results in the paper establish validity (coverage and test properties) even under heterogeneity; details are in the asymptotic analysis section.
CBCTRepD improves report structure, reduces omissions, and promotes more systematic attention to co-existing lesions across anatomical regions in cone-beam CT (CBCT) reports.
Clinical evaluation findings reported in the paper indicate improvements in structure, reduced omissions, and increased attention to multi-region co-existing lesions when using the system. (Operational definitions of 'structure', how omissions were identified, and measurement methods are not detailed in the provided text.)
Senior radiologists using CBCTRepD produce collaborative reports with reduced omission-related errors, including fewer clinically important missed lesions.
Clinician-centered assessment described in the evaluation; paper reports reductions in omission-related errors and clinically important missed lesions for seniors when using the system. (The provided summary does not list the number of senior reviewers, counts of omissions before/after, or statistical testing.)