Evidence (4560 claims)
Claims by topic category:

| Category | Claims |
|---|---|
| Adoption | 5267 |
| Productivity | 4560 |
| Governance | 4137 |
| Human-AI Collaboration | 3103 |
| Labor Markets | 2506 |
| Innovation | 2354 |
| Org Design | 2340 |
| Skills & Training | 1945 |
| Inequality | 1322 |
Evidence Matrix
Claim counts by outcome category and direction of finding; "—" marks cells with no claims.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 378 | 106 | 59 | 455 | 1007 |
| Governance & Regulation | 379 | 176 | 116 | 58 | 739 |
| Research Productivity | 240 | 96 | 34 | 294 | 668 |
| Organizational Efficiency | 370 | 82 | 63 | 35 | 553 |
| Technology Adoption Rate | 296 | 118 | 66 | 29 | 513 |
| Firm Productivity | 277 | 34 | 68 | 10 | 394 |
| AI Safety & Ethics | 117 | 177 | 44 | 24 | 364 |
| Output Quality | 244 | 61 | 23 | 26 | 354 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 168 | 74 | 37 | 19 | 301 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 89 | 32 | 39 | 9 | 169 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 106 | 12 | 21 | 11 | 151 |
| Consumer Welfare | 70 | 30 | 37 | 7 | 144 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 75 | 11 | 29 | 6 | 121 |
| Training Effectiveness | 55 | 12 | 12 | 16 | 96 |
| Error Rate | 42 | 48 | 6 | — | 96 |
| Worker Satisfaction | 45 | 32 | 11 | 6 | 94 |
| Task Completion Time | 78 | 5 | 4 | 2 | 89 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 17 | 9 | 5 | 50 |
| Job Displacement | 5 | 31 | 12 | — | 48 |
| Social Protection | 21 | 10 | 6 | 2 | 39 |
| Developer Productivity | 29 | 3 | 3 | 1 | 36 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Skill Obsolescence | 3 | 19 | 2 | — | 24 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Labor Share of Income | 10 | 4 | 9 | — | 23 |
Productivity
The claims below are filtered to the Productivity category; each claim is followed by a note on its evidentiary basis.
Better data continuity across lifecycle phases reduces model training friction and increases the value of historical data for forecasting and causal analysis.
Conceptual argument supported by case evidence in the review showing fragmented data reduces reusability; authors infer benefits for AI training and forecasting.
Digital twins (DTs) generate continuous, high‑resolution operational data (IoT telemetry, usage patterns, maintenance logs) that can substantially improve AI models for predictive maintenance, scheduling, energy optimisation, and logistics.
Logical implication and examples from pilot studies in the review showing richer telemetry and operational datasets produced by DT pilots; argued benefits for AI model inputs.
DTs extend Building Information Modelling (BIM) in three core ways: (1) bidirectional automated physical↔digital data exchange; (2) integration of heterogeneous, real‑time sources (IoT, operational systems); (3) lifecycle continuity preserving data across handovers.
Conceptual synthesis across the literature reviewed (conceptual papers, case studies, pilots) identifying functional distinctions between DT and BIM.
DT technology can materially improve construction lifecycle performance beyond what BIM delivers.
Synthesis of 160 reviewed studies including conceptual papers, case studies and pilot deployments reporting performance improvements attributed to DT implementations.
Artificial neural network (ANN) analysis ranks information barriers as the most important predictor of organizational inertia.
ANN feature-importance analysis reported in the paper that ranks predictors for inertia, identifying information barriers as the top predictor; methodological specifics (sample size, ANN parameters) are not provided in the abstract.
ANN analysis ranks functional values as the most important predictor of initial trust.
ANN feature-importance analysis reported in the paper that ranks predictors for initial trust, with functional values highest; method described as ANN-based relative importance ranking (details such as network architecture, training sample size, or validation metrics not reported in the abstract).
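Because the abstract reports only an ANN-based relative-importance ranking with no methodological detail, here is a minimal sketch of one common way such a ranking is produced, assuming permutation importance as the criterion; the feature names and data are hypothetical illustrations, not the paper's.

```python
# Hypothetical sketch of ANN-based feature-importance ranking. The ranking
# method (permutation importance) and all feature names are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder data: candidate predictors of initial trust and a trust score.
X = rng.normal(size=(500, 4))
y = 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=500)
features = ["functional_value", "instrumental_value",
            "other_value_1", "other_value_2"]  # hypothetical names

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                   random_state=0).fit(X_tr, y_tr)

# Rank predictors by how much shuffling each one degrades held-out fit.
imp = permutation_importance(ann, X_te, y_te, n_repeats=30, random_state=0)
for name, score in sorted(zip(features, imp.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```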
Human interaction, information, and norm barriers increase organizational inertia (resistance to change) toward GAICS.
Qualitative phase surfaced these barriers; quantitative validation showed statistically significant positive relationships between (a) need for human interaction barriers, (b) information barriers (lack of knowledge/clarity), and (c) norm barriers (cultural/social norms) and organizational inertia.
Functional and instrumental values increase initial trust in GAICS.
Mixed-methods evidence: qualitative exploratory phase identified functional and instrumental value as drivers; quantitative phase (inferential analysis) found positive, statistically significant effects of functional value (system usefulness/quality) and instrumental value (task-related benefits) on initial trust.
AI/ML–based credit scoring and alternative‑data underwriting reduce information asymmetries, lowering search and monitoring costs and expanding effective credit supply to previously rejected MSMEs and startups.
Analytical argument supported by illustrative case examples and literature on machine‑learning underwriting; the paper notes limited causal identification and time‑sensitivity of fintech products.
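To make the underwriting mechanism concrete, a minimal sketch follows, assuming gradient-boosted classification on synthetic data; every feature name (utility payments, tax-filing regularity, platform sales) is a hypothetical example of "alternative data", not a variable from the paper.

```python
# Illustrative sketch (not the paper's model) of alternative-data
# underwriting with gradient boosting. All features are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "utility_payment_streak": rng.integers(0, 36, n),   # months paid on time
    "gst_filing_regularity": rng.uniform(0, 1, n),      # tax-filing consistency
    "platform_sales_growth": rng.normal(0.05, 0.2, n),  # marketplace telemetry
    "bank_txn_volatility": rng.uniform(0, 1, n),
})
# Synthetic default flag loosely tied to the features above.
logit = (-1.5 + 0.04 * (36 - df["utility_payment_streak"])
         + 2.0 * df["bank_txn_volatility"])
default = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(df, default, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print("held-out AUC:", round(auc, 3))
```

A score like this, built on signals that thin-file MSMEs actually generate, is the sense in which such models can reduce information asymmetries for previously rejected borrowers.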
Government action (digital ID, payments rails, credit guarantees, standards, consumer protection) is vital to enable beneficial outcomes from digital finance for MSMEs.
Policy synthesis and comparative evaluation recommending government infrastructure and regulatory measures; conclusion based on institutional analysis rather than experimental evidence.
Case studies indicate FinTech platforms have meaningfully lowered rejection rates and loan turnaround times for underbanked MSMEs, accelerating working‑capital access.
Illustrative case studies of FinTech deployments in India reporting lower rejection rates and faster approvals; paper explicitly notes these cases are illustrative and not nationally representative and do not establish causal identification.
Supply‑chain financing can meaningfully unlock working capital for MSMEs by leveraging buyer creditworthiness, yielding high impact for MSMEs embedded in modern supply chains.
Comparative evaluation and illustrative case studies highlighting supply‑chain finance deployments; evidence is demonstrative and not nationally representative or causally identified.
Optimal financing outcomes generally come from hybrid approaches that combine formal banking credibility and policy support with FinTech speed and data-driven underwriting.
Comparative evaluation and policy synthesis recommending co‑lending, credit guarantees, and partnerships (banks as liquidity providers combined with FinTech underwriting); based on qualitative tradeoff analysis rather than experimental/causal evidence.
Compared with traditional bank loans and government schemes, contemporary financing models tend to be faster, more flexible, and more scalable for smaller firms.
Comparative qualitative evaluation across five variables and illustrative case studies showing reduced loan turnaround times and improved accessibility for small firms; no nationally representative sample or causal inference provided.
Digital technologies — especially FinTech lending platforms, alternative debt/equity products, supply‑chain finance, crowdfunding, and emerging blockchain applications — are materially expanding timely access to capital for Indian MSMEs and startups.
Multi‑criteria comparative evaluation (accessibility, finance cost, flexibility, risk, scalability) plus illustrative case studies of FinTech and alternative financing deployments in India that report faster turnaround and inclusion effects. The paper notes case evidence is illustrative rather than nationally representative and lacks quantitative causal identification.
Proprietary experimental datasets and curated metagenomic sequences become valuable intellectual assets that can differentiate commercial offerings.
Paper lists 'Data as an economic asset' and highlights the value of proprietary datasets and curated metagenomes; no market valuation data are included.
Faster, cheaper access to structural hypotheses can shorten drug and enzyme discovery cycles, raising R&D productivity and lowering marginal costs of early‑stage screening.
Paper argues this as an implication under 'Productivity and R&D acceleration'; it is presented as an economic consequence rather than demonstrated with empirical cost- or time-saving data in the text.
Practical applications are already emerging, including accelerating target structure availability for small‑molecule and biologics design, guiding enzyme redesign, and interpreting disease mutations.
Paper lists these application areas as emerging uses of AI‑predicted structures; evidence is presented as examples and implications rather than empirical case studies within the text.
Template- and MSA-informed architectures (e.g., RoseTTAFold and the AlphaFold family) deliver near‑experimental accuracy for many proteins.
Paper names these architectures and links their inputs (MSAs, templates) to high accuracy against experimental structures (PDB); specific evaluation datasets, protein counts, or error metrics are not enumerated in the text.
Modern AI systems (e.g., AlphaFold variants, RoseTTAFold, single‑sequence models like ESMFold) can approach or reach near‑experimental accuracy while greatly increasing speed and scalability.
Paper cites specific models (AlphaFold family, RoseTTAFold, ESMFold) and describes benchmarking against structural ground truth (PDB / curated experimental structures) and large‑scale pretraining; exact benchmark values or sample sizes are not specified in the text.
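Since the text does not enumerate benchmark metrics, here is a hedged sketch of one standard way predicted structures are scored against experimental ground truth: Cα RMSD after optimal (Kabsch) superposition. This only illustrates the benchmarking concept; the cited evaluations may use other metrics such as lDDT or TM-score.

```python
# One common accuracy metric for predicted vs experimental structures:
# C-alpha RMSD after optimal (Kabsch) superposition. Illustrative only.
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rotation."""
    P = P - P.mean(axis=0)                   # center both structures
    Q = Q - Q.mean(axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # optimal rotation matrix
    return float(np.sqrt(np.mean(np.sum((P @ R.T - Q) ** 2, axis=1))))

# Toy check: a prediction that is a rotated copy of the target scores ~0.
rng = np.random.default_rng(2)
target = rng.normal(size=(50, 3))
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
predicted = target @ rot.T
print(round(kabsch_rmsd(predicted, target), 6))  # ~0.0
```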
Based on findings and student-reported concerns, the authors recommend integrating explicit AI-literacy instruction to support critical and reflective use of Generative AI tools in education.
Authors' recommendation in discussion sections, motivated by observed heterogeneous effects, student concerns about accuracy and overreliance, and qualitative calls for guidance; recommendation not experimentally tested in this study.
Students reported that ChatGPT provided faster access to information, helped clarify concepts, and aided organization (e.g., outlining and summarizing).
Qualitative topic-based coding of open-ended survey responses from participating students (sample = 254 across six courses); thematic analysis identified benefits including speed, clarification, and organizational support.
There is a weak but statistically significant positive relationship between iterative engagement with ChatGPT (measured by number of edits to the tool's outputs) and better academic performance.
Correlational analysis between usage behavior (number of edits) and student scores reported as weak but significant; based on same experimental sample (N = 254) and usage logs/survey data.
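A minimal sketch of this kind of correlational check, assuming Spearman's rank correlation (the summary does not name the statistic used) and hypothetical column names and data:

```python
# Sketch of the reported correlational check: association between students'
# edit counts and scores. Data and the choice of Spearman are assumptions.
import pandas as pd
from scipy.stats import spearmanr

usage = pd.DataFrame({
    "n_edits": [0, 2, 5, 1, 7, 3, 4, 0, 6, 2],   # edits to ChatGPT outputs
    "score":   [61, 70, 78, 64, 85, 72, 74, 58, 80, 69],
})
rho, p = spearmanr(usage["n_edits"], usage["score"])
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```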
The improvement from allowing ChatGPT use was statistically significant in specific courses, with computer systems administration, informatics, and childhood disorders named as examples.
Course-level analyses using GLM and non-parametric comparisons showing statistically significant treatment effects in some courses; sample drawn from the full N = 254 distributed across six courses (per-course Ns not specified in summary).
Allowing students to use ChatGPT on knowledge-based academic tasks led to generally higher scores compared with control groups restricted to non-GenAI resources.
Randomized/experimental assignment of students to treatment (allowed ChatGPT) vs control (no GenAI) across six courses at two institutions; overall sample N = 254; comparisons made using descriptive statistics, general linear model (GLM) controlling for covariates, and non-parametric tests.
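To make the analysis pattern concrete, a hedged sketch follows: an ordinary-least-squares GLM with a treatment indicator, a covariate, and course fixed effects, plus a non-parametric Mann-Whitney comparison. Only the overall N = 254 and the six-course structure come from the summary; the covariate (prior_gpa) and all data are hypothetical.

```python
# Sketch of the reported analysis pattern: GLM with treatment plus
# covariates, alongside a non-parametric comparison. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(3)
n = 254  # overall sample size reported in the study
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),              # 1 = ChatGPT allowed
    "prior_gpa": rng.normal(3.0, 0.4, n),          # hypothetical covariate
    "course": rng.integers(0, 6, n).astype(str),   # six courses
})
df["score"] = (60 + 5 * df["treated"] + 8 * (df["prior_gpa"] - 3.0)
               + rng.normal(0, 8, n))

# GLM: treatment effect controlling for covariate and course fixed effects.
fit = smf.ols("score ~ treated + prior_gpa + C(course)", data=df).fit()
print(fit.params["treated"], fit.pvalues["treated"])

# Non-parametric treatment/control comparison.
u, p = mannwhitneyu(df.loc[df.treated == 1, "score"],
                    df.loc[df.treated == 0, "score"])
print(p)
```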
Policy and platform design choices (e.g., provenance metadata, detection/disclosure of AI-generated content, monetization rule alignment) can reinforce or mitigate harms from GenAI-driven creator economies.
Policy recommendations and implications drawn from the qualitative findings across the 377-video sample and normative reasoning; not empirically tested.
Policy interventions that raise the reinstatement rate — for example, compensation/transfers to translate AI gains into broad-based purchasing power, faster/stronger fiscal support or automatic stabilizers — can prevent the explosive feedback and stabilize demand.
Model experiments and sensitivity analysis showing that increasing the reinstatement elasticity or direct transfers moves the system from explosive to convergent parameter regions in the calibrated phase-space.
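The calibrated model itself is not reproduced here; the toy linear-feedback sketch below only illustrates the qualitative mechanism, with a net feedback multiplier of 1 separating convergent from explosive demand paths. All parameter values are hypothetical.

```python
# Toy illustration (not the paper's calibrated model): automation displaces
# purchasing power each period; a reinstatement rate r restores part of it.
def simulate(reinstatement, amplification=1.5, periods=40,
             d0=100.0, shock0=-5.0):
    """Each period, a fraction (1 - reinstatement) of lost demand feeds back,
    amplified by the spending multiplier; g < 1 converges, g > 1 explodes."""
    g = amplification * (1.0 - reinstatement)
    demand, shock = d0, shock0
    for _ in range(periods):
        demand += shock
        shock *= g  # each round of lost (or restored) demand feeds back
    return demand, g

for r in (0.2, 0.5, 0.8):  # low / moderate / high reinstatement
    d, g = simulate(r)
    print(f"reinstatement={r}: feedback g={g:.2f}, "
          f"demand after 40 periods={d:,.1f}")
# Low r -> g > 1 -> explosive demand collapse; higher r -> convergence.
```

Raising the reinstatement rate (or injecting transfers that play the same role) pushes g below 1, which is the stabilizing move the claim describes.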
FutureBoosting generalizes across multiple real-world electricity markets and forecast horizons.
Empirical results reported across 'multiple real-world electricity markets' and several forecasting horizons to capture diverse volatility and regime behavior (details on exact markets/horizons are reported in the experiments section of the paper).
The approach preserves the interpretability of downstream regression models while injecting temporal context.
Use of interpretable regression models (e.g., gradient-boosted decision trees) and XAI analyses (SHAP/feature importance) reported in the paper demonstrating interpretability of feature contributions.
Freezing the time-series foundation model (TSFM), with no joint fine-tuning, makes the framework lightweight and plug-and-play, lowering computational cost relative to joint training.
Architectural design: two-stage pipeline with a frozen TSFM used only to generate forecasted features; paper asserts ability to leverage pretrained TSFMs without end-to-end retraining. (No detailed compute-cost benchmarks given in the summary.)
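A minimal sketch of this two-stage design, with a naive seasonal stub (`frozen_tsfm_forecast`) standing in for a real pretrained TSFM; the stub, data, and feature layout are assumptions, not the paper's implementation.

```python
# Minimal sketch of the two-stage pipeline: a frozen pretrained forecaster
# produces future-looking features that are appended to the inputs of an
# interpretable downstream regressor. No TSFM weights are updated.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def frozen_tsfm_forecast(history: np.ndarray, horizon: int) -> np.ndarray:
    """Stand-in for a frozen TSFM's zero-shot forecast (here: a naive
    daily-seasonal carry-forward). In practice, call a pretrained model."""
    season = 24  # hourly electricity data assumed
    return np.array([history[-season + (h % season)] for h in range(horizon)])

rng = np.random.default_rng(4)
t = np.arange(24 * 60)
prices = 50 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 3, t.size)

X, y = [], []
for i in range(24 * 30, t.size):
    lags = prices[i - 24:i]                       # conventional lag features
    fcast = frozen_tsfm_forecast(prices[:i], 1)   # stage 1: forecasted feature
    X.append(np.concatenate([lags, fcast]))
    y.append(prices[i])

# Stage 2: gradient-boosted trees trained on lags + forecasted features.
model = GradientBoostingRegressor(random_state=0).fit(np.array(X[:-200]),
                                                      y[:-200])
preds = model.predict(np.array(X[-200:]))
print("holdout MAE:", round(float(np.mean(np.abs(preds - np.array(y[-200:])))), 3))
```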
MAE reductions exceed 30% in many cases when using FutureBoosting.
Reported quantitative results in the paper showing relative MAE reductions (paper text: 'reductions in Mean Absolute Error (MAE) exceeding 30% in many cases'); based on experiments across multiple datasets/horizons.
FutureBoosting consistently outperforms state-of-the-art TSFMs and regression baselines.
Head-to-head experiments in the paper comparing the two-stage FutureBoosting pipeline to standalone TSFM models and common regression baselines (e.g., gradient-boosted trees) across multiple markets and horizons under rolling-origin evaluation.
FutureBoosting substantially improves electricity price forecasting.
Empirical evaluation reported in the paper across multiple real-world electricity market datasets and forecasting horizons; comparisons against TSFM-only and regression-only baselines using time-series-aware cross-validation; primary metric: Mean Absolute Error (MAE).
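A hedged sketch of the rolling-origin, MAE-based protocol named above; `forecaster` is a placeholder for any of the compared models, and the data are synthetic. A relative reduction such as the ">30%" figure would then be computed as (MAE_baseline − MAE_model) / MAE_baseline.

```python
# Sketch of rolling-origin (time-series-aware) evaluation scored with MAE.
import numpy as np

def rolling_origin_mae(series, forecaster, initial=500, step=50):
    """Repeatedly refit on an expanding window and score the next block."""
    errors = []
    for origin in range(initial, series.size - step, step):
        train, test = series[:origin], series[origin:origin + step]
        preds = forecaster(train, horizon=step)
        errors.extend(np.abs(preds - test))
    return float(np.mean(errors))

naive = lambda train, horizon: np.repeat(train[-1], horizon)  # last-value baseline
rng = np.random.default_rng(5)
series = np.cumsum(rng.normal(size=2000)) + 50
print("baseline MAE:", round(rolling_origin_mae(series, naive), 3))
```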
For economic and policy analysis, researchers should estimate distributions of effects, account for dynamic adaptation/nonstationarity, pre-register plans, track model versions, and combine RCTs with longitudinal/observational/structural methods.
Implications and recommendations section synthesized from practitioner interviews (n=16) and authors' applied methodological reasoning.
High-stakes deployment, governance, and safety decisions should not rely on single uplift RCTs; they require synthesis across studies, ongoing monitoring, scenario analysis, and explicit uncertainty characterization.
Authors' recommendations drawn from thematic analysis of interview data (n=16) and the mapped validity consequences; policy implications section articulates this guidance.
The paper's mechanism is strategyproof at an epoch granularity under its assumptions (quasilinear utilities, discrete slice items, decision epochs).
Theoretical mechanism-design claim presented in the paper relying on stated assumptions (quasilinear utility, discrete slices, epoch-based decisions). Empirical simulations assume truthful bidding per epoch consistent with this property but do not evaluate inter-epoch strategic deviations.
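A minimal formal sketch of what the stated assumptions imply, with notation assumed rather than taken from the paper: each bidder $i$ has quasilinear utility

$$u_i(x, p_i) = v_i(x) - p_i,$$

and per-epoch strategyproofness of the allocation rule $f^t$ and payment rule $p^t$ means that for every epoch $t$, true valuation $v_i$, misreport $v_i'$, and opponent profile $v_{-i}$,

$$v_i\!\left(f^t(v_i, v_{-i})\right) - p_i^t(v_i, v_{-i}) \;\ge\; v_i\!\left(f^t(v_i', v_{-i})\right) - p_i^t(v_i', v_{-i}).$$

Truthful bidding is thus dominant within an epoch; as the evidence note observes, deviations that span multiple epochs are left unevaluated.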
Scaffold choice creates an economic opportunity for third-party tooling and open-source scaffolding because scaffold effects materially affect performance and reproducibility.
Observed performance differences across scaffolds (up to ~5 percentage points) and sensitivity of results to scaffold selection reported in the study.
NFD increases complementarities between domain experts and AI, raising demand for hybrid roles (expert + knowledge engineer) and skills in elicitation, verification, and artifact design.
Conceptual argument in implications section, supported by practical demands observed in the case study (coordination between analysts and knowledge engineering activities).
The case study produced modular knowledge artifacts (rules, templates, tests) that supported reuse and auditability.
Empirical artifact production in the case study: creation of templates, checklists, heuristics, and test suites; reuse counts and audit traces were tracked qualitatively and with reuse metrics (exact numbers not specified).
In the same case study, iterative crystallization increased the consistency/reliability of agent outputs.
Case study measurements of agent reliability and qualitative practitioner feedback/acceptance across development spirals; precise quantitative details and sample size are not reported.
In a detailed case study building a U.S. equity financial research agent, iterative crystallization reduced per-task human effort.
Case study with iterative co-development with financial analysts; interaction transcripts logged and operational metrics (time per analysis) reported across development spirals. The paper does not report sample size or statistical tests.
Annotator affective traits shift labeling propensity (toward positivity); classifiers trained on pooled annotator labels may inherit systematic biases from annotator heterogeneity.
Observed associations between trait mood/reactivity and increased positive labeling in GEE models; extrapolated implication for classifier training when using pooled labels from heterogeneous annotators.
Trait-level mood and emotional reactivity weakly predict a higher tendency to label statements as positive (and fewer as neutral).
Statement-level repeated-measures generalized estimating equations (GEE) using the 81 participants' repeated labels of 30 statements per round; trait mood and reactivity variables were significant predictors in GEE models for positive vs neutral labeling, but with small effect sizes.
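A minimal sketch of the repeated-measures GEE described, assuming a binomial family with an exchangeable within-annotator working correlation; column names and simulated data are hypothetical, while the 81-annotator, 30-statement structure follows the summary.

```python
# Sketch of a statement-level GEE: binary outcome (labeled positive vs
# neutral) clustered by annotator, with trait predictors. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_annotators, n_statements = 81, 30  # sizes reported in the study
df = pd.DataFrame({
    "annotator": np.repeat(np.arange(n_annotators), n_statements),
    "trait_mood": np.repeat(rng.normal(size=n_annotators), n_statements),
    "reactivity": np.repeat(rng.normal(size=n_annotators), n_statements),
})
logit = -0.2 + 0.15 * df["trait_mood"] + 0.10 * df["reactivity"]
df["labeled_positive"] = (rng.uniform(size=len(df))
                          < 1 / (1 + np.exp(-logit))).astype(int)

# GEE with exchangeable working correlation within annotator.
model = smf.gee("labeled_positive ~ trait_mood + reactivity",
                groups="annotator", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary().tables[1])
```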
Replacing the binary meta-analysis assumption (fully homogeneous vs fully heterogeneous) with KL-based adaptive pooling avoids the inefficiency or bias that either extreme can introduce.
Motivating discussion and theoretical/simulation comparisons in the paper showing cases where standard approaches (fixed-effect or random-effect extremes) are inefficient or biased, and the KL method performs better.
Application to the eICU Collaborative Research Database demonstrates the practical performance of the KL-shrinkage method on a heterogeneous, multi-center clinical dataset.
Real-data empirical application described in the paper using the eICU database; reported performance comparisons (specific dataset size and metrics are provided in the paper's empirical section but are not specified in this summary).
Extensive simulation studies show the KL-shrinkage estimator is robust and versatile across varying degrees and structures of heterogeneity.
Comprehensive simulation experiments reported in the paper that vary heterogeneity magnitude and structure (simulation details reported in the empirical evaluation section; exact sample sizes/configurations given in the paper).
Using KL divergence as the penalty is a natural and tractable choice because KL measures relative information between distributions and leads to convenient geometric/algebraic properties.
Argumentation and mathematical exposition in the methods section explaining properties of KL divergence and demonstrating resulting tractability in algebraic derivations.
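As a hedged illustration of that tractability (the paper's exact objective is not reproduced here): under a Gaussian working model with known variance, the KL term collapses to a squared distance,

$$\mathrm{KL}\!\left(N(\theta, \sigma_k^2)\,\middle\|\,N(\bar\theta, \sigma_k^2)\right) = \frac{(\theta - \bar\theta)^2}{2\sigma_k^2},$$

so a KL-penalized site estimate has the closed form

$$\hat\theta_k(\lambda) = \arg\min_\theta \left\{ \frac{(\theta - \hat\theta_k)^2}{2\sigma_k^2} + \lambda \cdot \frac{(\theta - \bar\theta)^2}{2\sigma_k^2} \right\} = \frac{\hat\theta_k + \lambda \bar\theta}{1 + \lambda},$$

which shrinks linearly between the fully heterogeneous estimate ($\lambda = 0$) and the fully pooled one ($\lambda \to \infty$), interpolating exactly between the two extremes the binary assumption forces.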
Inferential procedures (e.g., confidence intervals and hypothesis tests) based on the KL-shrinkage approach are asymptotically valid without assuming parameter homogeneity across datasets.
Asymptotic theoretical results in the paper establish validity (coverage and test properties) even under heterogeneity; details are in the asymptotic analysis section.
CBCTRepD improves report structure, reduces omissions, and promotes more systematic attention to co-existing lesions across anatomical regions in cone-beam CT (CBCT) reports.
Clinical evaluation findings reported in the paper indicate improvements in structure, reduced omissions, and increased attention to multi-region co-existing lesions when using the system. (Operational definitions of 'structure', how omissions were identified, and measurement methods are not detailed in the provided text.)
Senior radiologists using CBCTRepD produce collaborative reports with reduced omission-related errors, including fewer clinically important missed lesions.
Clinician-centered assessment described in the evaluation; paper reports reductions in omission-related errors and clinically important missed lesions for seniors when using the system. (The provided summary does not list the number of senior reviewers, counts of omissions before/after, or statistical testing.)