Evidence (6869 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	758	199	100	900	2007
Governance & Regulation	826	400	191	122	1563
Organizational Efficiency	777	193	124	84	1189
Technology Adoption Rate	635	233	124	97	1098
Research Productivity	422	128	57	336	954
Output Quality	476	179	59	47	761
Decision Quality	328	177	81	47	640
Firm Productivity	435	57	88	20	606
AI Safety & Ethics	218	277	65	33	599
Market Structure	180	170	123	24	502
Task Allocation	213	64	72	33	387
Skill Acquisition	170	61	61	17	309
Innovation Output	203	27	43	18	292
Employment Level	105	54	107	13	281
Fiscal & Macroeconomic	131	69	43	26	276
Consumer Welfare	117	63	42	11	233
Firm Revenue	153	48	26	3	230
Task Completion Time	173	31	8	12	225
Inequality Measures	44	122	49	6	221
Worker Satisfaction	89	65	22	12	188
Error Rate	69	92	10	2	173
Regulatory Compliance	77	69	14	5	165
Automation Exposure	56	56	26	13	154
Training Effectiveness	94	21	13	19	149
Wages & Compensation	77	36	25	6	144
Team Performance	86	17	27	10	141
Developer Productivity	95	17	14	6	133
Job Displacement	12	80	20	1	113
Hiring & Recruitment	52	7	8	3	70
Creative Output	31	18	8	3	61
Skill Obsolescence	5	46	6	1	58
Social Protection	27	16	8	2	53
Labor Share of Income	17	19	17	—	53
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1
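
As a reading aid, the counts above can be turned into shares. The following is a minimal sketch, not part of the source page: it copies three rows from the matrix, treats a dash as zero (an assumption), and reports the positive share among the four listed directions together with any gap between Total and their sum. The helper names `parse_count` and `summarize` are hypothetical.

```python
# Three rows copied from the matrix above, tab-separated as in the source.
ROWS = """\
Governance & Regulation\t826\t400\t191\t122\t1563
Job Displacement\t12\t80\t20\t1\t113
Labor Share of Income\t17\t19\t17\t—\t53
"""


def parse_count(cell):
    # Assumption: a dash (or empty cell) means no claims were recorded.
    return 0 if cell.strip() in {"—", "-", ""} else int(cell)


def summarize(line):
    # Split one table row into its name and five numeric columns.
    name, *cells = line.split("\t")
    positive, negative, mixed, null, total = (parse_count(c) for c in cells)
    listed = positive + negative + mixed + null
    return {
        "outcome": name,
        # Share of positive findings among the four listed directions.
        "positive_share": positive / listed if listed else 0.0,
        # Claims counted in Total but not in any listed direction column.
        "unlisted": total - listed,
    }


summaries = [summarize(line) for line in ROWS.splitlines() if line]
for s in summaries:
    print(f"{s['outcome']}: positive share {s['positive_share']:.1%}, "
          f"unlisted {s['unlisted']}")
```

For some rows the Total exceeds the sum of the four direction columns (Governance & Regulation lists 1539 of 1563 claims). The page does not say why, so the sketch reports the difference rather than interpreting it.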

Governance

Top-down AI guidance from institutions is common, while grassroots input from educators and students is often missing, which reduces policy relevance and uptake.

Survey items and thematic coding indicating the origin and participatory nature of institutional AI guidelines; comparative prevalence reported in open and closed responses.

low negative Exploring Student and Educator Challenges in AI Competency D... degree of grassroots input or participatory design in institutional AI policy fo...

Overreliance on GenAI CDS may lead to deskilling of clinicians, eroding judgment over time and increasing systemic vulnerability.

The paper cites theoretical risk and references limited longitudinal concerns; empirical longitudinal studies demonstrating deskilling are scarce per the paper’s stated evidence gaps.

low negative GenAI and clinical decision making in general practice clinician diagnostic skill over time; reliance/override rates; error rates when ...

Commercial structural biology services for routine solved folds may be commoditized, pushing firms toward complex validation, novel targets, or high‑value contract research.

Paper suggests this in 'Disruption of service markets' as a projected industry response; it is a strategic implication rather than an empirically demonstrated trend in the text.

low negative Protein structure prediction powered by artificial intellige... change in demand/pricing for routine structural biology services and shift towar...

Organizational compliance, governance, and transaction costs shape which AI uses are feasible, producing heterogeneity in adoption across firms; trust and accountability frictions can slow adoption even when productivity gains exist.

Workshop participants (n=15) reported compliance and governance considerations; authors infer broader organizational heterogeneity and friction effects from these qualitative data.

low negative The Values of Value in AI Adoption: Rethinking Efficiency in... adoption heterogeneity across firms; adoption speed/timing affected by governanc...

Designers’ expressed concerns about skill development suggest potential long-term effects on human capital accumulation; adoption that reduces learning opportunities could lower future wages or employability.

Participants' concerns captured in qualitative workshops (n=15); claim is an extrapolation to labor-market outcomes rather than direct measurement in the study.

low negative The Values of Value in AI Adoption: Rethinking Efficiency in... human capital accumulation; future wages; employability (hypothesized)

Private governance and firm-level solutions (internal standards, bargaining with unions) may proliferate, but these can entrench firm-specific norms and increase market power asymmetries.

Conceptual argument drawing on governance and industrial organization literature; no empirical measurement of prevalence or market-power effects included.

low negative AI governance under the second Trump administration: implica... prevalence of private governance; firm-specific norms; market power asymmetries

Inadequate protections reduce public trust in mobile-AI services, which can slow diffusion and undercut the growth trajectories that policy narratives anticipate.

Inferred from stakeholder commentary and policy discourse combined with communication-rights theory; the paper does not present survey or adoption-rate data.

low negative Promising Protection, Producing Exposure: AI Ethics and Mobi... public trust in mobile‑AI; adoption/diffusion rates

Low-wage and platform workers are particularly exposed to algorithmic management and surveillance, with potential downward pressure on wages, bargaining power, and job quality.

The paper's qualitative analysis of stakeholder comments and policy omissions, combined with literature-based inference about platform labor dynamics; no primary labor-market survey or quantitative wage data provided.

low negative Promising Protection, Producing Exposure: AI Ethics and Mobi... worker exposure to algorithmic management; wages; bargaining power; job quality

Soft‑law governance and growth-first narratives risk concentrating benefits (investment, productivity gains) while externalizing costs (privacy harms, biased decisioning) onto vulnerable populations, exacerbating inequality and reducing inclusive economic development.

Analytic inference from qualitative review of governance instruments and policy narratives combined with communications-ecology and political-economy reasoning; not based on quantitative economic measurement in the paper.

low negative Promising Protection, Producing Exposure: AI Ethics and Mobi... distribution of benefits and costs; inequality; inclusiveness of economic develo...

Uncertainty about long-run agentic behavior increases option value and downside risk of investing in agentic systems, which may raise discount rates and required returns.

Economic argument applying risk/return logic to agentic uncertainty; no quantitative empirical evidence provided.

low negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... investment valuation metrics (discount rates, required returns) for agentic syst...

Economic rents and advantages may accrue to agents who control large datasets, computing resources, and organizational processes that effectively integrate AI as a co-pilot, potentially increasing market concentration among AI providers.

Economic theory on scale economies and platform effects combined with observed industry patterns; reviewed literature provides conceptual arguments and case examples rather than broad empirical market-structure measurement.

low negative ChatGPT as an Innovative Tool for Idea Generation and Proble... market concentration measures; returns to data/compute ownership (not fully meas...

Generative AI poses substitution risk for entry-level or routine cognitive work focused on generation or drafting without evaluative responsibility.

Task-based analyses and case studies indicating automation potential for routine generation tasks; empirical demonstrations of AI-produced drafts/outputs that could replace such work, but longer-run displacement evidence is limited.

low negative ChatGPT as an Innovative Tool for Idea Generation and Proble... task automatability; employment/demand for routine-generation roles (largely unm...

Upfront integration and recurring governance costs mean smaller firms may face higher relative costs — potentially increasing scale advantages for larger incumbents.

Deployment case studies and cost reports indicating significant fixed integration and governance costs; inference to market structure is speculative.

low negative The Effectiveness of ChatGPT in Customer Service and Communi... relative upfront and ongoing costs; indicators of scale advantages or market con...

There is a risk of deskilling through excessive reliance on AI, implying a need for continuous training and certification to preserve human judgment.

Qualitative interview evidence and observed concerns about overreliance; authors recommend training/governance based on identified risks; no direct longitudinal measurement of deskilling provided in summary.

low negative Human-AI Synergy in Financial Decision-Making: Exploring Tru... human skill levels (deskilling risk); need for training/certification

Recommendation algorithms and widespread automated advice can induce herding or increase common exposures across retail investor portfolios, with potential macroprudential implications.

Theoretical discussion supported by examples from retail trading episodes and algorithmic amplification literature referenced in the review (conceptual and anecdotal evidence; limited systematic empirical quantification).

low negative Women's Investment Behaviour and Technology: Exploring the I... portfolio correlation across users, asset demand concentration, market volatilit...

Insurance markets may price AI-specific fraud risk, raising premiums or creating new products (AI-fraud insurance).

Speculative economic implication suggested by the authors; no market data or insurer statements cited.

low negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... changes in insurance pricing or product offerings attributable to AI-specific fr...

Vendors offering integrated governed hyperautomation stacks may capture premium pricing and increase switching costs, potentially widening adoption gaps between large incumbents and SMEs.

Market-structure and competitive dynamics discussed theoretically in the Implications section; no market-share or pricing data provided.

low negative Governed Hyperautomation for CRM and ERP: A Reference Patter... vendor pricing premiums; switching costs; differential adoption by firm size (ma...

Higher compliance and liability costs may be passed to districts, potentially affecting the affordability of EdTech for underfunded schools unless federal guidance or subsidies offset costs — a distributional concern.

Economic distributional reasoning (theoretical), not supported by empirical pricing or budget impact data in the Article.

low negative Civil Rights and the EdTech Revolution EdTech pricing to districts and affordability/access for underfunded schools

Regulators and standard-setters who value transparency and auditability will need to account for the gap between evaluation results and actionable fixes; firms may require incentives or rules to ensure evaluation leads to remediation, not just documentation.

Authors' policy implication derived from the study's finding of a results-actionability gap and discussion of auditability concerns; speculative recommendation rather than empirical finding.

low neutral Results-Actionability Gap: Understanding How Practitioners E... policy/regulatory effectiveness regarding evaluation leading to remediation (spe...

This study represents the first attempt to conduct a comprehensive evaluation of artificial intelligence (AI) and its influence on job displacement based on the existing body of literature.

Author assertion in the paper; the excerpt provides no external verification (no citation of prior reviews/meta-analyses to justify the 'first attempt' claim).

low null result A Study on Work-Life Balance of Women Employees in the IT Se... comprehensiveness of literature-based evaluation of AI's influence on job displa...

We currently lack an understanding of how political parties perceive the potential impact AI has on employment, the role of regulations in protecting workers from AI-related job losses, and the importance of AI educational and training programs.

Statement of a literature/knowledge gap motivating the study (assertion by the authors; no empirical basis provided in the excerpt).

low null result Political Ideology, Artificial Intelligence (AI), and Labor ... existing knowledge about political party perceptions of AI's impact on employmen...

Signal legitimacy was validated through negative control experiments.

Experimentation claim: the paper asserts that negative control experiments were run to validate that signals are not due to memorized ticker associations. The excerpt does not specify the design, number, or results of these negative controls.

low positive Can Blindfolded LLMs Still Trade? An Anonymization-First Fra... legitimacy of predictive signals (i.e., whether performance persists under negat...

Pidgin should not be treated as 'broken English' but as necessary linguistic infrastructure for repaired, sustainable development; failures often reflect language-sovereignty crises requiring political solutions.

Normative claim supported by mixed-methods findings on comprehension, adoption, and legitimacy, and Critical Discourse Analysis of institutional language hierarchies.

low positive From Linguistic Hybridity to Development Sovereignty: Pidgin... normative assessment of language status and policy implication (not a quantitati...

The paper advances a new conceptual framework called 'Developmental Sociolinguistics' and formalizes Three Laws of Linguistic Justice (Epistemic Access, Discursive Parity, Sovereignty), operationalized via a proposed 'Pidgin Protocol' for decolonized development practice.

Conceptual/theoretical contribution based on synthesis of field results and literature; proposal of framework and laws as normative prescriptions rather than empirically tested policy interventions.

low positive From Linguistic Hybridity to Development Sovereignty: Pidgin... theoretical/conceptual contribution (framework and protocol)

Standards for provenance, labeling of AI-generated content, and interoperable evidence formats would lower verification costs and create beneficial network effects.

Policy recommendation derived from identified verification frictions and the study's analysis of data/model governance needs.

low positive Fact-Checking Platforms in the Middle East: A Comparative St... verification cost and interoperability/network effects

There is growing market demand for AI-assisted fact-checking tools, creating opportunities for software, monitoring services, and labeled datasets.

Analytic implication drawn from findings about increasing AI use and needs for automation/labeling; based on interviews and market inference in the study.

low positive Fact-Checking Platforms in the Middle East: A Comparative St... market demand for AI tools and labeled datasets

Regulators should consider guidelines on AI monitoring, algorithmic fairness in performance evaluations, and protections to prevent hybrid‑induced career penalties.

Policy recommendation based on conceptual assessment of risks identified in literature synthesis; not an empirical claim—no policy evaluation data provided.

low positive The Sociology of Remote Work and Organisational Culture: How... existence/applicability of regulatory guidelines; protections against career pen...

Hybrid agency implies complementarity between GenAI and managerial/knowledge‑worker skills (curation, evaluation, coordination), potentially increasing returns to those skills while automating routine cognitive tasks—consistent with skill‑biased technological change.

Synthesis of recurring themes linking GenAI capabilities with managerial skill topics in the thematic clusters; positioned as an implication for labour demand and skill composition rather than an empirically tested effect.

low positive Generative AI and the algorithmic workplace: a bibliometric ... expected changes in returns to managerial/knowledge‑worker skills and automation...

Policy prescriptions for developing countries to mitigate these vulnerabilities include: diversify supply sources, invest in local human capital and mid-stream capabilities, create legal/regulatory flexibility to navigate competing standards, and pursue regional cooperation to build bargaining leverage.

Policy analysis and recommendations grounded in the mechanisms identified via process tracing and comparative cases; intended as prescriptive synthesis rather than empirically demonstrated interventions in the paper. (Based on inferred best-practice interventions; no empirical evaluation/sample size provided.)

low positive China-US Trade War and the Challenges for Developing Countri... effectiveness of policy measures (e.g., diversification index, human-capital ind...

There is demand for tooling that bridges evaluation outputs to actionable fixes (e.g., failure-mode libraries, standardized remediation templates, evaluation-to-priority mapping), signaling economic opportunities for third-party tools and consulting services.

Authors' inference based on the documented results-actionability gap and participants' descriptions of pain points; presented as a market implication rather than direct market measurement.

low positive Results-Actionability Gap: Understanding How Practitioners E... inferred market demand for evaluation-to-action tooling/services

Firms that invest in instrumentation, cross-functional processes, and remediation levers capture more value from LLMs; organizations with better evaluation-to-action pipelines will obtain higher productivity gains and market edge.

Authors' inference from observed heterogeneity among teams in the interviews and comparison of practices in teams that reported more success converting evaluations into changes.

low positive Results-Actionability Gap: Understanding How Practitioners E... relative productivity/value capture tied to evaluation-to-action capability (inf...

Public investments in standards, verification infrastructure, and public-interest datasets can correct market failures and support trustworthy AI.

Policy recommendation informed by governance and public-good theory and examples from the literature; the claim is prescriptive and not validated by new empirical evidence within the paper.

low positive The Evolution and Societal Impact of Artificial Intelligence... trustworthiness of AI systems and correction of market failures via public inves...

Policy instruments (law and markets) should be designed to remain institutionally and procedurally responsive to ethical claims that resist full codification (e.g., through participatory governance, oversight mechanisms, equitable redress, care-centered procurement standards).

Normative policy prescriptions derived from the Levinasian diagnosis and case illustrations; proposed measures are normative and not empirically evaluated within the paper.

low positive Examining ethical challenges in human–robot interaction usin... responsiveness of policy and market instruments to non-codifiable ethical claims...

Integrating Object-Oriented Ontology (OOO) and the material turn enables attention to nonhuman actors and assemblages without collapsing them into human-centered instrumentalism.

Theoretical synthesis of OOO/material-turn literature and argument that this synthesis offers analytic resources for socio-technical assemblages; illustrated conceptually in domains.

low positive Examining ethical challenges in human–robot interaction usin... conceptual adequacy of analytic lens for nonhuman actors and assemblages (qualit...

Humans who configure and teach agents gain understanding and skills themselves — learning-by-teaching generates human capital accumulation endogenous to agent deployment (bidirectional scaffolding).

Qualitative, naturalistic observations and comparative documentation of users configuring/teaching agents during the one-month study; no randomized assignment or pre/post quantitative skill testing reported.

low positive When Openclaw Agents Learn from Each Other: Insights from Em... human skill accumulation / understanding from configuring/teaching agents

Models trained primarily on negative constraints will generalize constraint adherence more robustly under distribution shift than models trained primarily on preference rankings.

Presented as a central, experimentally falsifiable prediction derived from the paper's theoretical account; the paper does not present large-scale empirical confirmation and recommends controlled experiments to test this.

low positive Via Negativa for AI Alignment: Why Negative Constraints Are ... robustness of constraint adherence under distribution shift (e.g., adherence rat...

Negative examples function as counterfactual eliminators that rule out regions of behavior space, allowing a model to settle on robust acceptable behavior, whereas positive preference signals require continual calibration in a high-dimensional, context-sensitive space.

Informal/structural theoretical argument and analogy to falsification presented in the paper; no direct empirical test reported there demonstrating this exact mechanism.

low positive Via Negativa for AI Alignment: Why Negative Constraints Are ... conceptual measure of behavioral space reduction and subsequent robustness (oper...

Regulators may prefer systems that support contestability and audit trails and could mandate argumentation-style explainability in certain sectors.

Speculative policy prediction; no regulatory statements or empirical policy adoption evidence cited.

low positive Argumentative Human-AI Decision-Making: Toward AI Agents Tha... regulatory adoption rate of contestability/audit-trail requirements

Better contestability may reduce litigation and regulatory frictions if decisions are transparently defensible.

Speculative legal-economic claim; no case studies or empirical legal analysis provided.

low positive Argumentative Human-AI Decision-Making: Toward AI Agents Tha... frequency/cost of litigation and regulatory disputes post-adoption of contestabl...

New service layers may emerge (argumentation-as-a-service, audit firms, explanation certification, human-in-the-loop orchestration platforms).

Speculative market/industry evolution claim based on analogous tech-service creations; no empirical evidence.

low positive Argumentative Human-AI Decision-Making: Toward AI Agents Tha... emergence and market size of new service verticals around argumentative AI

New metrics are needed to value resilience (robustness to out-of-distribution events, graceful degradation) in procurement and contracting; performance-based contracts and regulated minimums for oversight mode selection can help align incentives.

Prescriptive recommendation based on gaps identified in procurement and contracting practice; conceptual proposal without empirical testing.

low positive Resilience Meets Autonomy: Governing Embodied AI in Critical... existence and use of resilience metrics in procurement/contracts and resulting a...

Demand will grow for tools and services that enable oversight (auditability, explainability, safe fallbacks), creating markets for verification, certification, safety middleware, and human-in-the-loop platforms.

Market-structure and demand-side reasoning based on the proposed governance needs; forecast-style projection without empirical market-data analysis.

low positive Resilience Meets Autonomy: Governing Embodied AI in Critical... market growth for oversight-enabling products and services (demand, number of ve...

Allocation decisions should be explicit, auditable, and adaptive — with provisions for overriding, fallbacks, and graceful degradation during unanticipated conditions.

Normative recommendation based on safety and accountability principles combined with crisis-management practices; argued via conceptual analysis and illustrative design features.

low positive Resilience Meets Autonomy: Governing Embodied AI in Critical... auditability, adaptability, and existence of override/fallback mechanisms in dep...

Investment in multimodal continual learning, scalable and reliable knowledge-editing methods, and retrieval architectures that guarantee cross-modal consistency is economically justified.

Research/prioritization recommendations based on empirical benchmark findings showing current gaps; argumentation for R&D focus areas.

low positive V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge i... recommended R&D investment priorities (qualitative)

The findings argue for policies requiring disclosure of training-data timeframes and robust monitoring for time-sensitive factual accuracy in deployed systems.

Policy recommendations in the paper drawing on benchmark results and identified failure modes; prescriptive argumentation rather than empirical policy evaluation.

low positive V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge i... policy recommendation advocating disclosure and monitoring (qualitative)

Models and platforms that offer transparent update mechanisms (frequent data updates, reliable RAG pipelines, clear training snapshot metadata) will have competitive advantages in the market.

Economic and market analysis in implications section recommending transparency and update mechanisms as differentiators; speculative/business-analytical evidence rather than experimental.

low positive V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge i... market differentiation potential (qualitative)

Embedding culturally aligned moderation and multi-layer safety orchestration can reduce regulatory frictions and increase adoption in conservative or tightly regulated markets.

Paper claims regulatory and safety economics implications from their safety/moderation architecture; this is an asserted implication rather than an empirically validated outcome in the summary.

low positive Fanar 2.0: Arabic Generative AI Stack regulatory friction and adoption (policy/economic impact, asserted)

The methods used (data quality focus, continual pre-training, model merging, modular product stacks) are potentially transferable to other underrepresented/low-resource languages, lowering barriers to regional AI competitiveness.

Paper posits this policy/transferability implication as an argument in the 'Implications for AI Economics' section; no cross-language experimental evidence provided in the summary.

low positive Fanar 2.0: Arabic Generative AI Stack transferability potential to other languages (qualitative)

Fanar 2.0 demonstrates that targeted data curation, continual pre-training, and model-merging can be a viable alternative to the raw-scale pre-training arms race for language-specific competitiveness.

Paper argues this implication based on achieving benchmark gains on Arabic and English using curated data (120B tokens), continual pre-training, model-merging, and a 256 H100 GPU training budget rather than massively larger-scale pre-training.

low positive Fanar 2.0: Arabic Generative AI Stack viability of alternative development strategy vs scale (conceptual/performance c...

Oryx provides Arabic-aware image/video understanding and culturally grounded image generation.

Paper identifies Oryx as the vision component with Arabic-aware understanding and culturally grounded generation; no benchmark metrics are provided in the summary.

low positive Fanar 2.0: Arabic Generative AI Stack vision model capability (Arabic-aware understanding and culturally grounded gene...
