Evidence (7395 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1
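
The four direction columns do not always add up to the Total column (for Governance & Regulation, 664 + 329 + 160 + 99 = 1252 against a total of 1273), and the page does not say how the remainder is classified. As a rough sketch, the Python below computes direction shares and that remainder for a few rows copied from the table. The helper name summarize is hypothetical, and treating a dash as zero is an assumption.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
# Assumption: a dash in the source table is treated as 0.
ROWS = {
    "Governance & Regulation": (664, 329, 160, 99, 1273),
    "AI Safety & Ethics": (183, 241, 59, 30, 517),
    "Job Displacement": (11, 71, 16, 1, 99),
    "Labor Share of Income": (12, 13, 12, 0, 37),
}


def summarize(counts):
    """Return direction shares and the count not covered by the four columns."""
    positive, negative, mixed, null, total = counts
    classified = positive + negative + mixed + null
    return {
        "positive_share": positive / total,
        "negative_share": negative / total,
        "unclassified": total - classified,
    }


for outcome, counts in ROWS.items():
    s = summarize(counts)
    print(
        f"{outcome}: {s['positive_share']:.1%} positive, "
        f"{s['negative_share']:.1%} negative, "
        f"{s['unclassified']} outside the four columns"
    )
```

Shares are taken against the Total column rather than the sum of the four directions, so any unexplained remainder lowers every share slightly. Dividing by the classified count instead would be an equally defensible choice.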

Filter: Adoption

Deployed AI systems can produce algorithmic bias that harms marginalized groups when models are trained on skewed or non‑representative data.

Synthesis of prior empirical findings and case studies on algorithmic bias and fairness in ML systems; paper does not present new empirical tests.

medium-high | negative | Towards Responsible Artificial Intelligence Adoption: Emergi... | fairness metrics, disparate error rates, incidence of discriminatory outcomes fo...

Human reviewers may over-trust machine-generated language and explanations (automation bias), reducing the likelihood of detecting fraudulent outputs.

Reference to automation-bias literature and conceptual examples; threat modeling and illustrative vignettes in the article.

medium-high | negative | Prompt Engineering or Prompt Fraud? Governance Challenges fo... | detection rate of fraudulent outputs by human reviewers when outputs are machine...

Existing internal audit and compliance frameworks focus on access, transaction, and system controls, not on content-generation integrity.

Literature and standards review combined with threat-control mapping demonstrating gaps in content/provenance coverage.

medium-high | negative | Prompt Engineering or Prompt Fraud? Governance Challenges fo... | coverage of content-generation integrity within existing audit/compliance framew...

AI systems and economic models are biased toward European languages because of lack of vernacular corpora; investing in high-quality corpora for African vernaculars (e.g., Cameroon Pidgin) is necessary to avoid misallocation of resources.

Policy implication extrapolated from the study's finding that vernacular mediation materially affects outcomes, combined with general knowledge about data-driven AI bias; no empirical AI-modeling tests in the paper.

speculative | negative (current state) / positive (recommended investment) | From Linguistic Hybridity to Development Sovereignty: Pidgin... | AI model performance and allocation bias (inferred, not measured)

The introduction of cognitive technologies into business processes sets new requirements for market opportunity analytics, and digital analytics makes it possible to accurately measure the impact of these technologies on business models and innovative solutions.

Conceptual statement in the paper's introduction; no empirical test or numerical evidence provided in the excerpt.

speculative | null result | Innovative Cognitive Tools for Studying Market Opportunities... | accuracy/capability of market opportunity analytics to measure impact of cogniti...

There are research opportunities to measure returns to 'teaching' (causal impact of configuring agents on human skill accumulation and earnings) and to model agent-platform ecosystems with network effects, spillovers, and endogenous quality hierarchies.

Author-stated research agenda and proposed empirical questions derived from the observed phenomena; not empirical results but recommended directions.

speculative | null result | When Openclaw Agents Learn from Each Other: Insights from Em... | need for future causal estimates of returns to teaching and formal models of eco...

Future research should quantify calibration and skill of LLMs over longer horizons, develop ensembles that pair LLMs with domain specialists, and expand temporally grounded benchmarks across different conflict types.

Authors' stated research agenda and limitations: calls for longer-horizon calibration studies and broader benchmarking derived from observed domain heterogeneity and the scope of the present snapshot.

speculative | null result | When AI Navigates the Fog of War | future research outputs (calibration metrics, ensemble methods, expanded benchma...

Recommended research priorities include hierarchical/temporal-decomposition methods, continual learning, robust adaptation to non-stationarity, and causal/structured reasoning to handle multi-factor interactions.

Paper discussion linking observed failure modes to methodological gaps and proposing research directions to address limitations; these are recommendations rather than experimentally validated claims.

speculative | null result | RetailBench: Evaluating Long-Horizon Autonomous Decision-Mak... | suggested research directions to improve robustness (proposed, not empirically v...

Regulators and payers will require clinical validation, safety guarantees, and clear liability frameworks for human–AI shared decision-making before wide-scale deployment.

Policy implication stated in the paper's discussion section based on general regulatory considerations; not an empirical result from the study.

speculative | null result | Hierarchical Reinforcement Learning Based Human-AI Online Di... | regulatory requirements / safety validation (anticipated, not measured)

Empirical economics research should use firm-level and pipeline microdata and quasi-experimental designs to estimate causal effects of AI adoption on outcomes like time-to-hit, preclinical attrition, IND filings, and NME approvals per R&D dollar.

Research recommendation offered in the paper based on identified gaps; not an evidence claim but an explicit methodological suggestion.

speculative | null result | Learning from the successes and failures of early artificial... | recommended empirical outcomes to be measured: time-to-hit, preclinical attritio...

The presence of an organizational policy does not predict individuals' intent to increase AI usage; it functions instead as a marker of maturity, formalizing successful diffusion by Enthusiasts while acting as a gateway the Cautious have yet to reach.

Analysis of a policy variable within the survey dataset (N=147) showing no predictive relationship with individual intent to increase AI use, but an association between presence of policy and indicators of organizational adoption/maturity and differential reach into archetype groups.

medium-low | null result | Developers in the Age of AI: Adoption, Policy, and Diffusion... | Individual intent to increase usage; organizational policy presence; organizatio...

Prospective studies are needed to evaluate AI's real-world clinical impact in acute GIB.

Authors' recommendation in the discussion and conclusion based on the predominance of retrospective evidence and few prospective/RCTs.

speculative | null result | How Do AI-Assisted Diagnostic Tools Impact Clinical Decision... | need for prospective evaluation of clinical impact (recommendation)

The study recommends iterative prompt refinement, integration with adaptive learning models, and further exploration of autonomous self-prompting mechanisms.

Concluding recommendations derived from the study's results and interpretation; presented as future directions rather than empirically tested interventions within this study.

speculative | null result | Prompt Engineering for Autonomous AI Agents: Enhancing Decis... | recommendations for methods and research directions (not an empirical outcome me...

Future research should explore sector-specific AI adoption challenges and long-term workforce adaptation strategies.

Author recommendation presented in the paper's discussion/future work section of the summary.

speculative | null result | Artificial intelligence and organisational transformation: t... | N/A (recommended future research topics)

Recommended future research includes scalable interoperability solutions, longitudinal lifecycle value validation, human‑centred adoption strategies, and sustainability assessment methods.

Authors' explicit recommendations at the end of the review based on identified gaps in the literature.

speculative | null result | Digital Twins Across the Asset Lifecycle: Technical, Organis... | priority research areas to address current evidence gaps

Researchers should combine qualitative studies with administrative/matched employer–employee data and experimental/quasi-experimental designs (pilot rollouts, staggered adoption) to identify causal effects of AI on tasks, productivity, and wages.

Methodological recommendation by authors based on limitations of their qualitative study (15 UX designers) and the need to quantify observed phenomena; not an empirical claim tested in the paper.

speculative | null result | The Values of Value in AI Adoption: Rethinking Efficiency in... | recommended measurement approaches for causal identification (task allocation, p...

Recommended research directions: combine neural summary networks with explicit uncertainty modules (e.g., conditional normalizing flows), benchmark against classical econometric estimators, explore transfer learning for pre-trained estimators, and study interpretability and sensitivity to misspecification.

Authors' recommendations based on limitations and implications discussed in the paper; these are forward-looking propositions rather than empirically supported claims.

speculative | null result | ForwardFlow: Simulation only statistical inference using dee... | research agenda items (qualitative recommendations)

Future research priorities include obtaining causal estimates (e.g., field experiments) of productivity gains from trust-mediated AI adoption and conducting cost–benefit analyses of trust-building interventions.

Study’s stated research agenda/recommendations; not an empirical claim but a recommended direction for follow-up research.

speculative | null result | Algorithmic Trust and Managerial Effectiveness: The Role of ... | causal productivity estimates and cost–benefit outcomes (research recommendation...

Key research priorities include improving measurement of AI usage across countries, causal identification of long-run effects, and sectoral reskilling strategy evaluation.

Identified gaps and methodological limitations in the reviewed empirical literature (measurement heterogeneity, limited long-run panels, sectoral variation) motivating suggested future research agenda.

speculative | null result | S-TCO: A Sustainable Teacher Context Ontology for Educationa... | quality and scope of future empirical evidence on AI economic effects

To measure and monitor these effects, researchers should track firm-level adoption of AI features, fulfillment automation intensity, platform-mediated market entry, and task-level labor shifts.

Author recommendations based on gaps identified in the case-based and multi-modal empirical work and the sensitivity of results to adoption measures; not an empirical finding but a methodological claim.

speculative | null result | Artificial Intelligence–Enabled E-Commerce Systems and Autom... | measurement coverage metrics (availability/quality of adoption and task-shift da...

Policy priorities should differ by national Skill Imbalance: countries with strong demand for new skills should prioritize education and reskilling, while countries with strong supply should prioritize firm absorption (innovation, financing, technology adoption).

Interpretation of cross-country Skill Imbalance Index and its implications; prescriptive recommendation based on the observed demand–supply patterns rather than causal testing of policies.

speculative | null result | Bridging Skill Gaps for the Future | Policy emphasis (education/reskilling versus firm absorption) inferred from Skil...

The results indicate the need to build digital infrastructure, develop human capital, and support open data.

Policy recommendation provided in the paper based on the empirical findings linking cognitive tools to market opportunities (specific cost–benefit or implementation analyses not provided in the excerpt).

speculative | positive | Innovative Cognitive Tools for Studying Market Opportunities... | policy actions (digital infrastructure, human capital development, open data sup...

Developing domain-specific vernacular NLP and speech models (health, agriculture, education) would help replicate pragmatic features (proverbs, registers) that enable epistemic appropriation.

Policy/research recommendation based on qualitative findings that proverbs and registers confer legitimacy and facilitate knowledge transfer; no experimental NLP work reported in study.

speculative | positive | From Linguistic Hybridity to Development Sovereignty: Pidgin... | potential improvement in vernacular AI-assisted advisory effectiveness (proposed...

Local-language (vernacular) inclusion improves economic returns to development interventions by increasing comprehension and adoption, thereby improving program cost-effectiveness.

Logical extrapolation from observed higher comprehension and adoption rates in the field sample (N = 45); no direct economic cost–benefit analysis reported in the study—claim framed as implication for AI economics.

speculative | positive | From Linguistic Hybridity to Development Sovereignty: Pidgin... | implied economic return / cost-effectiveness (inferred from uptake/comprehension...

Findings support regulatory focus on transparency, auditability, and consumer protections because low trust would slow adoption and reduce welfare gains from AI marketing.

Policy implication derived from empirical association between trust and adoption/loyalty in the study; regulatory effects were not empirically tested in the paper.

speculative | positive | Trust in AI-Driven Marketing and its Impact on Brand Loyalty... | Policy relevance (inferred impact on adoption and welfare)

Investments in trustworthy AI systems (privacy, transparency, fairness) can increase retention and customer lifetime value because trust raises loyalty directly and via adoption.

Managerial implication inferred from observed positive direct and indirect effects of Trust on Brand Loyalty in the SEM results; CLV and retention were not directly measured.

speculative | positive | Trust in AI-Driven Marketing and its Impact on Brand Loyalty... | Customer retention / Customer Lifetime Value (inferred, not directly measured)

Economic evaluations of AI adoption should include psychological and human-capital externalities (effects on self-efficacy, skill depreciation, job satisfaction) to fully account for welfare and productivity dynamics.

Argument grounded in experimental and survey findings showing psychological impacts of AI-use mode; general recommendation for research and evaluation rather than an empirical finding.

speculative | positive | Relying on AI at work reduces self-efficacy, ownership, and ... | recommended evaluation scope (inclusion of psychological/human-capital measures)

A research agenda for AI economists should include building multimodal detection models for greenwashing and earnings management using text, financials, satellite imagery, and supply‑chain data.

Prescriptive research agenda item in the paper; no empirical implementation or benchmark results presented here.

speculative | positive | SUSTAINABILITY ISSUES IN FINANCIAL ACCOUNTING RESEARCH | detection accuracy / precision-recall of greenwashing/earnings-management models

AI and NLP methods can be used to scale verification of ESG disclosures by cross‑checking them with regulatory filings, news, supply‑chain data, satellite imagery, and alternative data to flag inconsistencies.

Proposed methodological solution in the paper's implications and research agenda; suggestion is prescriptive and not validated by new experiments in this review.

speculative | positive | SUSTAINABILITY ISSUES IN FINANCIAL ACCOUNTING RESEARCH | detection of inconsistencies / flagged potential manipulation

If banks operationalize NLP for personalization and acquisition at scale, this could increase differentiation, raise switching costs, and potentially affect market concentration—warranting antitrust monitoring.

Theoretical implication extrapolated from identified capability gaps and economic reasoning about differentiation, switching costs, and scaling advantages; not empirically tested in the reviewed papers.

speculative | positive | Natural language processing in bank marketing: a systematic ... | market structure indicators (differentiation, switching costs, market concentrat...

Limited applied research on NLP for acquisition and personalization implies unrealized value in banking: NLP could enable more efficient, targeted customer acquisition and cross‑sell, potentially lowering customer‑acquisition cost (CAC) and increasing lifetime value (LTV).

Inference drawn from observed topical gaps (low article counts on acquisition/personalization) and standard marketing economics linking targeting/personalization to CAC and LTV; no direct causal evidence provided in the reviewed literature.

speculative | positive | Natural language processing in bank marketing: a systematic ... | customer‑acquisition cost (CAC), customer lifetime value (LTV), acquisition effi...

Standardizing these infra-level primitives could lower integration costs across ecosystems and accelerate enterprise adoption of agent-hosted services.

Policy/economic argument presented in the paper's implications and research directions; no empirical standardization impact study provided.

speculative | positive | Bridging Protocol and Production: Design Patterns for Deploy... | integration cost per deployment; enterprise adoption rate over time after standa...

Missing infraprotocol primitives in MCP create opportunities for platform differentiation—providers implementing CABP/ATBA/SERF-like extensions can capture value by offering more production-ready agent tooling.

Strategic/economic reasoning stated in the implications section; not supported by empirical market-share data in the summary.

speculative | positive | Bridging Protocol and Production: Design Patterns for Deploy... | market share or customer adoption of providers offering these extensions; differ...

Public archives of prompts and commits accelerate diffusion by lowering search/learning costs and enabling replication, thereby increasing adoption speed and lowering entry barriers.

Paper's asserted implication based on the existence of public artifacts and general reasoning about knowledge diffusion; this is an interpretive claim rather than an experimentally validated finding (argumentative, extrapolative).

speculative | positive | Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau E... | hypothesized effect on diffusion/adoption (not directly measured in the project)

Developing economic metrics linked to architecture (interoperability indices, expected upgrade cost, observability coverage, market concentration measures, systemic‑risk indicators) is recommended to guide policy and investment.

Policy recommendation grounded in the paper's normative analysis; no pilot metric development or empirical validation presented.

speculative | positive | The Internet of Physical AI Agents: Interoperability, Longev... | availability and use of architecture‑linked economic metrics

The benchmark provides a testbed useful for studying strategic behavior, coordination failures, and market-like interactions among agents, which can inform economic research and policy.

Paper claims the benchmark's multi-agent, strategic tasks can be used as experimental environments for economic and policy research; this is a normative claim supported by the benchmark's design rather than by empirical studies in the paper.

speculative | positive | The PokeAgent Challenge: Competitive and Long-Context Learni... | utility of benchmark as a research/testbed for studying strategic/multi-agent ph...

Open-source orchestration lowers entry barriers, broadening participation and potentially compressing rents that would otherwise accrue to well-resourced incumbents.

Paper's discussion section argues that releasing orchestration and evaluation tools publicly reduces the technical overhead for entrants; this is a theoretical/observational claim rather than empirically measured in the paper.

speculative | positive | The PokeAgent Challenge: Competitive and Long-Context Learni... | predicted change in barrier-to-entry and market rents (qualitative)

The clear performance gaps indicate high returns to specialized efforts (RL, domain-specific engineering) relative to generalist LLM-only approaches, shaping where teams invest labor and compute.

Paper links benchmarking results (performance gaps between baselines and humans) to economic implications, arguing specialization yields higher returns; this is an interpretive claim based on reported performance differentials.

speculative | positive | The PokeAgent Challenge: Competitive and Long-Context Learni... | economic return on investment inference based on performance differences between...

Benchmarks like PokeAgent will reallocate researcher and industry attention toward multi-agent, partial-observability, and long-horizon planning problems—likely increasing funding and compute investment in RL and hybrid LLM+RL methods.

Paper offers an economic/implication analysis arguing that introducing such a benchmark changes incentives and investment patterns; this is a reasoned projection rather than an empirical observation.

speculative | positive | The PokeAgent Challenge: Competitive and Long-Context Learni... | predicted shifts in researcher/industry attention and investment (qualitative fo...

Public investment in open environments, robotics testbeds, and safety research can reduce concentration risks and externalities and democratize access to embodied AI research.

Policy recommendation based on anticipated strategic importance of shared infrastructure; not empirically validated here.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | accessibility of research infrastructure; distribution of research capabilities ...

Value in the AI ecosystem may shift from passive text/image corpora toward rich interaction datasets and simulated/real environments; ownership and control of simulation platforms and testbeds could become strategically important assets.

Economic and strategic inference from the proposed technical emphasis on embodied/interaction learning; no supporting market data in the paper.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | asset valuations for simulation/testbed providers; transaction volumes for inter...

Increased sample efficiency and transfer will reduce compute and data costs, lowering barriers to entry for firms and broadening feasible AI applications.

Economic argument connecting technical metrics to cost and market effects; not empirically demonstrated in the paper.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | compute/data cost per task; market entry rates for firms

More autonomous learners that can self-experiment and learn from observation will lower deployment costs for adaptable agents and accelerate automation across more occupations, especially embodied and social tasks.

Economic reasoning and projection based on expected technical improvements; speculative without empirical economic analysis in the paper.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | cost of deploying adaptable agents; rate of automation adoption across occupatio...

Cross-cutting elements (hierarchical organization, curriculum/bootstrapping, intrinsic motivation, uncertainty estimation, memory consolidation, neuromodulatory analogs) are important for improving learning in the proposed architecture.

Conceptual recommendation based on known mechanisms from neuroscience and machine learning literature; not validated in the paper.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | improvements in sample efficiency, robustness, transfer when these elements are ...

System M (meta-control) should generate internal signals that decide when to prioritize A vs B, allocate attention, consolidate memory, and trade off uncertainty, novelty, expected information value, and effort costs.

Design proposal motivated by biological meta-control and decision theories; no empirical tests presented.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | accuracy/effectiveness of switching decisions; overall learning efficiency when ...

System B (action-driven learning) should learn through intervention, consequences, and trial-and-error, using active exploration, reinforcement learning, and hierarchical/skill learning.

Architectural proposal aligning with RL and hierarchical learning literature; theoretical description without experimental evidence.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | efficacy of skills learned through action (task success rates; learning speed fr...

System A (observation-driven learning) should build models of others, social contingencies, and passive affordances through imitation, self-supervised representation learning, and inverse RL.

Architectural specification and mapping to existing algorithms (imitation, SSL, inverse RL); no empirical validation provided.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | quality of models learned from observation; accuracy of inferred social continge...

Integrating observation-driven and action-driven learning with meta-control and evolutionary/developmental priors should improve sample efficiency, robustness, transfer, and lifelong adaptation.

Conceptual argument and proposed integration of methods; suggested but untested experimentally in the paper.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | sample efficiency; robustness to distribution shift; cross-domain transfer; life...

A biologically inspired three-part architecture (System A: observation-driven learning; System B: action-driven learning; System M: internally generated meta-control) can address these limitations.

Theoretical proposal and analogy to biological systems; no empirical validation reported in the paper.

speculative | positive | Why AI systems don't learn and what to do about it: Lessons ... | sample efficiency; robustness; transfer; lifelong adaptation

Embedding LLM coaching tools in platforms (employee onboarding, customer support, peer-support communities) could raise overall conversational quality by improving expressive outcomes rather than only informational accuracy.

Authors' implication drawn from trial results showing improved alignment to empathic norms after personalized coaching; no field deployment evidence provided in the paper.

speculative | positive | Practicing with Language Models Cultivates Human Empathic Co... | conversational quality (expressive empathy), extrapolated
