Evidence (3492 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Innovation Remove filter

Case studies indicate FinTech platforms have meaningfully lowered rejection rates and loan turnaround times for underbanked MSMEs, accelerating working‑capital access.

Illustrative case studies of FinTech deployments in India reporting lower rejection rates and faster approvals; paper explicitly notes these cases are illustrative and not nationally representative and do not establish causal identification.

medium positive Traditional vs. contemporary financing models for MSMEs and ... loan rejection rate, loan turnaround time, working‑capital access

Supply‑chain financing can meaningfully unlock working capital for MSMEs by leveraging buyer creditworthiness, yielding high impact for MSMEs embedded in modern supply chains.

Comparative evaluation and illustrative case studies highlighting supply‑chain finance deployments; evidence is demonstrative and not nationally representative or causally identified.

medium positive Traditional vs. contemporary financing models for MSMEs and ... working capital availability for MSMEs, impact magnitude for supply‑chain‑embedd...

Optimal financing outcomes generally come from hybrid approaches that combine formal banking credibility and policy support with FinTech speed and data-driven underwriting.

Comparative evaluation and policy synthesis recommending co‑lending, credit guarantees, and partnerships (banks as liquidity providers combined with FinTech underwriting); based on qualitative tradeoff analysis rather than experimental/causal evidence.

medium positive Traditional vs. contemporary financing models for MSMEs and ... overall financing outcomes (access, cost, risk mitigation)

Compared with traditional bank loans and government schemes, contemporary financing models tend to be faster, more flexible, and more scalable for smaller firms.

Comparative qualitative evaluation across five variables and illustrative case studies showing reduced loan turnaround times and improved accessibility for small firms; no nationally representative sample or causal inference provided.

medium positive Traditional vs. contemporary financing models for MSMEs and ... loan turnaround time, flexibility of repayment, scalability to small firms

Digital technologies — especially FinTech lending platforms, alternative debt/equity products, supply‑chain finance, crowdfunding, and emerging blockchain applications — are materially expanding timely access to capital for Indian MSMEs and startups.

Multi‑criteria comparative evaluation (accessibility, finance cost, flexibility, risk, scalability) plus illustrative case studies of FinTech and alternative financing deployments in India that report faster turnaround and inclusion effects. The paper notes case evidence is illustrative rather than nationally representative and lacks quantitative causal identification.

medium positive Traditional vs. contemporary financing models for MSMEs and ... timely access to capital (availability and speed of financing for MSMEs/startups...

Proprietary experimental datasets and curated metagenomic sequences become valuable intellectual assets that can differentiate commercial offerings.

Paper lists 'Data as an economic asset' and highlights the value of proprietary datasets and curated metagenomes; no market valuation data are included.

medium positive Protein structure prediction powered by artificial intellige... commercial value attributed to proprietary sequence/structure datasets and their...

Faster, cheaper access to structural hypotheses can shorten drug and enzyme discovery cycles, raising R&D productivity and lowering marginal costs of early‑stage screening.

Paper argues this as an implication under 'Productivity and R&D acceleration'; it is presented as an economic consequence rather than demonstrated with empirical cost‑or time‑saving data in the text.

medium positive Protein structure prediction powered by artificial intellige... duration and cost of early‑stage drug/enzyme discovery cycles and marginal cost ...

Practical applications are already emerging, including accelerating target structure availability for small‑molecule and biologics design, guiding enzyme redesign, and interpreting disease mutations.

Paper lists these application areas as emerging uses of AI‑predicted structures; evidence is presented as examples and implications rather than empirical case studies within the text.

medium positive Protein structure prediction powered by artificial intellige... availability of structural hypotheses for drug/biology design, utility in enzyme...

Template‑and‑MSA informed architectures (e.g., RoseTTAFold and AlphaFold family) deliver near‑experimental accuracy for many proteins.

Paper names these architectures and links their inputs (MSAs, templates) to high accuracy against experimental structures (PDB); specific evaluation datasets, protein counts, or error metrics are not enumerated in the text.

medium positive Protein structure prediction powered by artificial intellige... fraction of proteins for which prediction accuracy is near experimental (structu...

Modern AI systems (e.g., AlphaFold variants, RoseTTAFold, single‑sequence models like ESMFold) can approach or reach near‑experimental accuracy while greatly increasing speed and scalability.

Paper cites specific models (AlphaFold family, RoseTTAFold, ESMFold) and describes benchmarking against structural ground truth (PDB / curated experimental structures) and large‑scale pretraining; exact benchmark values or sample sizes are not specified in the text.

medium positive Protein structure prediction powered by artificial intellige... structure prediction accuracy (compared to experimental structures) and inferenc...

New economic metrics are needed for VR (value of behavioral data streams, cost per reduction in harm, ROI on security investments, welfare metrics capturing trust and adoption).

Authors' recommendations based on identified gaps in the literature and the comparative review of 31 studies; proposed as agenda items rather than empirically developed metrics.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... availability and use of new economic metrics for VR security and privacy (recomm...

VR generates high‑value behavioral and biometric datasets for AI personalization, training, and analytics; firms that extract this data can gain competitive advantages, creating incentives to centralize collection unless counteracted by policy or market forces.

Economic implications inferred by the authors from the literature synthesis and standard industrial‑organization logic; not supported by original empirical market data in the paper.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... incentives for data centralization and resulting competitive advantage (conceptu...

There is a need for regulatory standards, industry best practices, and ethics‑by‑design approaches; interoperable policy frameworks are recommended to govern VR security and privacy.

Policy and governance recommendations synthesized from multiple reviewed studies and the authors' integration; presented as prescriptive guidance rather than empirically tested interventions.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... adoption of regulatory/standards frameworks and their expected effect on privacy...

An effective defense mix for VR combines technical controls (secure boot, attestation, encrypted communications), AI tools for anomaly detection and policy enforcement, and human‑centered design (transparency, consent, usable controls).

Cross‑study synthesis showing these categories recur as recommended controls in the 31 reviewed papers; authors propose combining them in TVR‑Sec. No deployment or performance metrics provided.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... overall defense effectiveness from combined controls (theoretical/proposed)

Socio‑Behavioral Safety measures (moderation, design constraints, psycho‑social safeguards) are necessary to prevent harassment, persuasion, addictive interfaces, and other psychological harms in shared virtual spaces.

Qualitative synthesis of social‑behavioral harms and proposed mitigations reported across the literature review (31 studies); comparative evaluation of socio‑technical controls.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... incidence or severity of harassment/manipulation/psychological harms (identified...

User Privacy in VR requires managing highly sensitive behavioral and biometric traces with privacy‑preserving ML approaches (e.g., federated learning, differential privacy), consent mechanisms, and data minimization.

Repeated recommendations across the reviewed studies; authors synthesized privacy‑preserving technical approaches and governance mechanisms from the 31‑study corpus. No primary experiments demonstrating efficacy provided.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... reduction in privacy risk for behavioral/biometric data (proposed, not empirical...

System Integrity defenses should cover hardware, firmware, sensors, and networks to protect against spoofing, device tampering, malware, and supply‑chain attacks.

Aggregated technical recommendations from the literature corpus (31 studies) and the authors' mapping of integrity threats to controls (secure boot, attestation, encrypted communications). No empirical testing of these controls in the paper.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... coverage of integrity‑related threat mitigation (conceptual)

The Three‑Layer VR Security Framework (TVR‑Sec) integrates System Integrity, User Privacy, and Socio‑Behavioral Safety into an adaptive, multidimensional defense architecture for VR systems.

Conceptual synthesis developed by the authors from a comparative literature review of 31 peer‑reviewed studies (2023–2025); framework created by mapping identified vulnerabilities to technical, AI, and human‑centered controls. No empirical validation or deployment testing reported.

medium positive Securing Virtual Reality: Threat Models, Vulnerabilities, an... proposed comprehensiveness/coverage of VR security defenses (conceptual architec...

A coordinated Omnibus that clarifies interactions with the DSA and establishes consistent AI-focused enforcement capacity can reduce regulatory frictions, lower compliance costs, and better align incentives for responsible AI deployment.

Policy recommendation based on comparative mapping and scenario analysis; qualitative argumentation rather than empirical testing.

medium positive The Digital Omnibus and the Future of EU Regulation: Implica... regulatory frictions; compliance costs; incentives for responsible AI deployment

The iterative, human-in-the-loop agent workflow enables evaluation and refinement of algorithmic clusters into logically consistent, theory-ready categories.

Described iterative loop where agents evaluate clusters, align semantics, and refine outputs; qualitative assessments reported though no formal user-study metrics included in summary.

medium positive THETA: A Textual Hybrid Embedding-based Topic Analysis Frame... logical consistency and theory-readiness of resulting topic categories

DAFT via LoRA reshapes semantic vector geometry to highlight domain-relevant distinctions without full model retraining.

Methodological claim: LoRA fine-tuning applied to foundation embeddings to adjust vector space; no geometric analyses or quantitative illustrations provided in the summary.

medium positive THETA: A Textual Hybrid Embedding-based Topic Analysis Frame... changes in semantic vector geometry / enhanced separation of domain-relevant con...

Across six domains THETA outperforms LDA, ETM, and CTM on measures of coherence and domain interpretability.

Reported comparative experiments across six domains using coherence metrics and qualitative/human interpretability assessments against LDA, ETM, CTM. Summary does not provide effect sizes, statistical tests, or per-domain breakdowns.

medium positive THETA: A Textual Hybrid Embedding-based Topic Analysis Frame... topic coherence scores and human-rated domain interpretability

THETA substantially improves the interpretability and domain-specific coherence of topic/cluster outputs on very large social-text corpora.

Reported experiments comparing THETA to traditional topic models (LDA, ETM, CTM) across six domains; evaluation reportedly used topic coherence metrics and human-in-the-loop interpretability assessments/qualitative comparisons (no numeric results provided in summary).

medium positive THETA: A Textual Hybrid Embedding-based Topic Analysis Frame... topic interpretability and domain-specific topic coherence

Lowering fixed costs via shared resources can enable more entrants and niche innovators (e.g., specialized clinical apps).

Workshop economic implications and participant assertions in breakout sessions and plenary at the NSF workshop (Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... number of market entrants, emergence of niche products, diversity of suppliers

Public investment in shared data and compute as nonrival public goods will reduce duplication, lower entry barriers, and increase total R&D productivity.

Workshop implications for AI economics articulated by participants and authors as a policy recommendation; rationale stated in the summary document (NSF workshop, Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... duplication of effort, entry barriers (number of entrants), and aggregate R&D pr...

De-risk pathways from lab to clinic via reproducible benchmarks, continuous monitoring, and cross-sector collaborations (academia, industry, clinicians, regulators).

Workshop translation-focused recommendations and roadmap produced by consensus at the NSF workshop (Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... time-to-market, reproducibility metrics, and rate of successful clinical transla...

Enable safe, accountable, and resilient platforms (including virtual–physical healthcare ecosystems) to reduce translational risk.

Workshop recommendations addressing safety, resilience, and virtual–physical ecosystems from cross-disciplinary discussion at NSF workshop (Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... measures of translational risk (failure rates in translation, incidents, safety ...

Promote scalable validation ecosystems grounded in objective, continuous measures and physics-informed models.

Workshop validation and safety theme recommendations from panels and consensus-building exercises (NSF workshop, Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... presence and scalability of validation ecosystems; reliability/robustness metric...

Develop clinic workflow–aware systems and human–AI collaboration frameworks to fit real clinical practice and decision chains.

Stated systems and workflows recommendation from expert panels and clinician participants at the NSF workshop (Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... compatibility of AI-enabled systems with clinical workflows; measures of clinici...

Build shared compute infrastructures tailored to medical workloads and validation needs.

Workshop recommendation from infrastructure-themed sessions and consensus outcomes (NSF workshop, Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... existence and utilization of shared compute infrastructure for medical R&D (comp...

Sustain investment in shared, standardized data infrastructures (datasets, ontologies, benchmarks) to support medical algorithm–hardware co-design.

Workshop infrastructure call presented during breakout sessions and final recommendations at the NSF workshop (Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... availability and use of standardized medical datasets/ontologies/benchmarks

Principal recommendation: shift from isolated algorithm or hardware efforts to integrated algorithm–hardware–workflow co-design for medical contexts.

Stated workshop recommendation derived from panels and cross-disciplinary consensus at the NSF workshop (Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... alignment and integration of R&D efforts (degree of co-design adoption in projec...

Sustained public investment and new validation, governance, and translation ecosystems are needed to de-risk commercialization and accelerate safe, accountable clinical adoption.

Workshop principal recommendation based on qualitative synthesis of expert judgment from participants and breakout outcomes (NSF workshop, Sept 26–27, 2024).

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... commercialization risk level and speed/rate of clinical adoption

Enabling next-generation medical technologies requires a fundamental reorientation toward algorithm–hardware co-design that is clinic-aware, validated continuously, and backed by shared data and compute infrastructures.

Consensus recommendation from a two-day NSF workshop (Sept 26–27, 2024) in Pittsburgh convening interdisciplinary participants (academic researchers in algorithms and hardware, clinicians, industry leaders). Methods: expert panels, thematic breakout sessions, cross-disciplinary discussions, consensus-building. Documentation at https://sites.google.com/view/nsfworkshop.

medium positive Report for NSF Workshop on Algorithm-Hardware Co-design for ... successful development and clinical adoption of next-generation medical technolo...

Automation of routine SE tasks suggests measurable productivity gains at team and firm levels, but quantification requires causal, outcome-based studies (e.g., throughput, defect rates, time-to-market).

Interpretation of literature review findings and survey-reported perceived productivity gains; no causal empirical estimates provided in the paper.

medium positive Artificial Intelligence as a Catalyst for Innovation in Soft... potential productivity metrics (throughput, defect rates, time-to-market) — not ...

Empirical survey evidence shows generally positive perceptions of AI tools among software engineering professionals and growing adoption.

Cross-sectional survey of software engineering professionals asking about current tool usage and perceived benefits (productivity, quality, speed); absolute respondent count and sampling frame not provided in the summary.

medium positive Artificial Intelligence as a Catalyst for Innovation in Soft... self-reported perception of AI tools and self-reported adoption rate

ML enables predictive features in software engineering: effort estimation, defect prediction, work prioritization, and risk forecasting that support Agile planning and continuous delivery.

Literature review of ML-for-SE research and practitioner survey reporting use or expectations of predictive features; specific model performance metrics or dataset sizes not reported in the summary.

medium positive Artificial Intelligence as a Catalyst for Innovation in Soft... availability/use of predictive outputs (e.g., estimated effort, defect risk scor...

NLP techniques improve requirements management and team collaboration by extracting intent from natural-language artifacts (tickets, specs, PRs) and reducing miscommunication.

Synthesis of prior studies in the literature review and survey responses indicating perceived improvement in requirements handling and communication; survey sample size not reported.

medium positive Artificial Intelligence as a Catalyst for Innovation in Soft... perceived reduction in miscommunication / improved clarity of requirements

The method lowers the technical barrier for adopting surrogates in economics by removing dependence on specialized Bayesian neural-network techniques while preserving rigorous uncertainty quantification.

Argument in Implications section: decoupling uncertainty quantification from network architecture allows use of deterministic NNs with MCMC-sampled parameter inputs; no user-study or adoption metrics provided.

medium positive MCMC Informed Neural Emulators for Uncertainty Quantificatio... ease of adoption / reduction in technical complexity required to obtain UQ-prese...

The theoretical diagnostic (linking distribution mismatch to performance loss) gives practitioners a practical tool to detect when a surrogate trained on one parameter distribution will underperform after recalibration or policy changes.

Paper-provided theoretical result and suggested diagnostic use; empirical validation of the diagnostic is implied but not detailed in the summary.

medium positive MCMC Informed Neural Emulators for Uncertainty Quantificatio... diagnostic effectiveness in detecting performance degradation under distribution...

This approach dramatically reduces computation (training and/or evaluation wall-clock time) compared to approaches that sample network weights (Bayesian NNs) or exhaustively explore parameter grids.

Computational evaluation reported in the paper includes empirical examples demonstrating substantial reductions in wall-clock training/evaluation time relative to weight-sampling or exhaustive-parameter-grid baselines (exact datasets, runtimes, and sample sizes not detailed in the summary).

medium positive MCMC Informed Neural Emulators for Uncertainty Quantificatio... wall-clock training time and evaluation time

Training a deterministic neural surrogate conditioned on MCMC-drawn parameter samples reproduces the original (forward) model's uncertainty quantification while avoiding embedding parametric uncertainty inside the network weights.

Methodological description: surrogate is a deterministic NN whose inputs include parameter vectors drawn by MCMC from the model-parameter posterior; uncertainty is recovered by repeatedly evaluating the trained surrogate on those MCMC draws. Empirical examples are reported (details not provided here) showing reproduction of model uncertainty.

medium positive MCMC Informed Neural Emulators for Uncertainty Quantificatio... fidelity of uncertainty quantification / posterior predictive distributions prod...

The proposed pipeline (CFD -> CFM -> CFR) forms a closed loop that can assess and improve color fidelity in T2I systems.

Paper describes end-to-end workflow: CFD provides training/validation labels for CFM; CFM produces scores and attention maps for evaluation and localization; CFR consumes CFM attention during generation to refine images. The repository contains code implementing the pipeline.

medium positive Too Vivid to Be Real? Benchmarking and Calibrating Generativ... end-to-end improvement in measured color fidelity when applying CFD-trained CFM ...

Color Fidelity Refinement (CFR) is a training-free inference-time procedure that uses CFM attention maps to adaptively modulate spatial-temporal guidance scales during generation, thereby improving color authenticity of realistic-style T2I outputs without retraining the base model.

Method description in paper: CFR uses CFM's learned attention to identify low-fidelity regions and adapt guidance strength across space and denoising steps (spatial-temporal guidance). The authors evaluate CFR on existing T2I models and report improved perceived color authenticity; no retraining of base T2I models is required (implementation and code available in the repository).

medium positive Too Vivid to Be Real? Benchmarking and Calibrating Generativ... perceived color authenticity of generated images; requirement (or not) to retrai...

CFM aligns better with objective color realism judgments than existing preference-trained metrics and human ratings that favor vividness.

Empirical comparisons reported in the paper: CFM scoring shows improved alignment with CFD-based color-realism labels and with evaluation criteria that prioritize photographic fidelity, outperforming preference-trained metrics and the biased patterns in human ratings (paper reports both qualitative and quantitative gains; specific numerical improvements and test set sizes are provided in the paper/repo).

medium positive Too Vivid to Be Real? Benchmarking and Calibrating Generativ... alignment with color-realism judgments / correlation with CFD ground truth

The Color Fidelity Metric (CFM) is a multimodal encoder–based metric trained on CFD to predict human-consistent judgments of color fidelity and to produce spatial attention maps that localize color-fidelity errors.

Model architecture and training procedure described: a multimodal encoder trained using CFD's ordered realism labels to output scalar fidelity scores and spatial attention maps indicating where color fidelity issues occur. Training supervision comes from CFD's ordered labels (paper includes training/validation procedures; exact training dataset splits are in the paper/repo).

medium positive Too Vivid to Be Real? Benchmarking and Calibrating Generativ... color-fidelity scalar scores and spatial attention maps (localization of color e...

Varying sample size, injecting contaminated data, and including algorithm-reconstruction tasks during training allow networks to automatically inherit those properties (e.g., multi-n behavior, robustness, algorithmic outputs).

Empirical: training regimes described include varying dataset size n, contaminated simulations, and algorithm-reconstruction tasks; experiments reportedly show networks trained with these variations exhibit corresponding behaviors at test time. Specific experimental details (ranges of n, contamination levels) are not included in the summary.

medium positive ForwardFlow: Simulation only statistical inference using dee... presence of targeted properties in trained networks (finite-sample behavior acro...

Collapsing (aggregation) layers mimic reduction to sufficient statistics and enforce the desirable structure for set-valued (permutation-invariant) inputs.

Theoretical/design claim supported by architectural description and motivation: collapsing layers aggregate across observations to produce summaries, enforcing permutation invariance; supported indirectly by empirical success in simulations. This is primarily an architectural/representational argument rather than a purely empirical result.

medium positive ForwardFlow: Simulation only statistical inference using dee... permutation-invariance and quality of summary representations (qualitative/archi...

The network can learn to approximate the outputs of iterative estimation algorithms (demonstrated by learning an EM algorithm for a genetic-data estimation task).

Empirical: a genetic-data example where the network was trained (including an algorithm-reconstruction task) to approximate the EM algorithm outputs; evaluation shows qualitative/quantitative match to the iterative algorithm. Evidence is from reported experiments comparing network outputs to EM outputs (e.g., MSE between them).

medium positive ForwardFlow: Simulation only statistical inference using dee... similarity between network outputs and iterative algorithm outputs (e.g., MSE or...

Training the network with contaminated simulations yields estimators that are robust to contaminated observations at test time.

Empirical: experiments included injecting contaminated data into training simulations; evaluation measured robustness at test time under contamination and showed improved performance relative to networks not trained on contamination. Supported by reported robustness comparisons (metrics like MSE under contamination). Specific contamination rates and sample sizes are not provided in the summary.

medium positive ForwardFlow: Simulation only statistical inference using dee... robustness to contamination (estimation error / MSE under contaminated test data...

« Prev 1 2 3 … 60 61 62 … 69 70 Next »