Evidence (8570 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	758	199	100	900	2007
Governance & Regulation	826	400	191	122	1563
Organizational Efficiency	777	193	124	84	1189
Technology Adoption Rate	635	233	124	97	1098
Research Productivity	422	128	57	336	954
Output Quality	476	179	59	47	761
Decision Quality	328	177	81	47	640
Firm Productivity	435	57	88	20	606
AI Safety & Ethics	218	277	65	33	599
Market Structure	180	170	123	24	502
Task Allocation	213	64	72	33	387
Skill Acquisition	170	61	61	17	309
Innovation Output	203	27	43	18	292
Employment Level	105	54	107	13	281
Fiscal & Macroeconomic	131	69	43	26	276
Consumer Welfare	117	63	42	11	233
Firm Revenue	153	48	26	3	230
Task Completion Time	173	31	8	12	225
Inequality Measures	44	122	49	6	221
Worker Satisfaction	89	65	22	12	188
Error Rate	69	92	10	2	173
Regulatory Compliance	77	69	14	5	165
Automation Exposure	56	56	26	13	154
Training Effectiveness	94	21	13	19	149
Wages & Compensation	77	36	25	6	144
Team Performance	86	17	27	10	141
Developer Productivity	95	17	14	6	133
Job Displacement	12	80	20	1	113
Hiring & Recruitment	52	7	8	3	70
Creative Output	31	18	8	3	61
Skill Obsolescence	5	46	6	1	58
Social Protection	27	16	8	2	53
Labor Share of Income	17	19	17	—	53
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1
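
For some rows the Total is larger than the sum of the four direction columns (for example, Other lists 758 + 199 + 100 + 900 = 1957 against a Total of 2007). The page does not explain the gap; a plausible reading, stated here as an assumption, is that the Total also counts claims with no listed direction. The sketch below works on a few rows copied from the matrix and computes each row's positive share and its unlisted remainder. The function names and the choice to treat a dash cell as zero are illustrative, not part of the source.

```python
# Illustrative only: a handful of rows copied from the matrix above,
# tab-separated as in the source. "\u2014" is the dash used for empty cells.
ROWS = """\
Other\t758\t199\t100\t900\t2007
Firm Productivity\t435\t57\t88\t20\t606
Job Displacement\t12\t80\t20\t1\t113
Labor Share of Income\t17\t19\t17\t\u2014\t53
"""

DIRECTIONS = ("positive", "negative", "mixed", "null")


def parse_row(line):
    """Split one tab-separated row into (name, counts by direction, total)."""
    name, *cells = line.split("\t")
    # Assumption: a dash cell means zero claims in that direction.
    counts = [0 if cell == "\u2014" else int(cell) for cell in cells]
    return name, dict(zip(DIRECTIONS, counts[:4])), counts[4]


def summarize(text):
    """Return the positive share and unlisted remainder for each row."""
    summary = {}
    for line in text.strip().splitlines():
        name, by_direction, total = parse_row(line)
        listed = sum(by_direction.values())
        summary[name] = {
            "positive_share": by_direction["positive"] / total,
            # Claims counted in Total but not in any of the four columns.
            "unlisted": total - listed,
        }
    return summary


SUMMARY = summarize(ROWS)
```

Dividing by the Total rather than by the sum of the four columns keeps the share comparable across rows even where the remainder is nonzero; dividing by the listed sum instead would be an equally defensible choice.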

Filter: Adoption

Investments in trustworthy AI systems (privacy, transparency, fairness) can increase retention and customer lifetime value because trust raises loyalty directly and via adoption.

Managerial implication inferred from observed positive direct and indirect effects of Trust on Brand Loyalty in the SEM results; CLV and retention were not directly measured.

speculative positive Trust in AI-Driven Marketing and its Impact on Brand Loyalty... Customer retention / Customer Lifetime Value (inferred, not directly measured)

Economic evaluations of AI adoption should include psychological and human-capital externalities (effects on self-efficacy, skill depreciation, job satisfaction) to fully account for welfare and productivity dynamics.

Argument grounded in experimental and survey findings showing psychological impacts of AI-use mode; general recommendation for research and evaluation rather than an empirical finding.

speculative positive Relying on AI at work reduces self-efficacy, ownership, and ... recommended evaluation scope (inclusion of psychological/human-capital measures)

A research agenda for AI economists should include building multimodal detection models for greenwashing and earnings management using text, financials, satellite imagery, and supply‑chain data.

Prescriptive research agenda item in the paper; no empirical implementation or benchmark results presented here.

speculative positive SUSTAINABILITY ISSUES IN FINANCIAL ACCOUNTING RESEARCH detection accuracy / precision-recall of greenwashing/earnings-management models

AI and NLP methods can be used to scale verification of ESG disclosures by cross‑checking them with regulatory filings, news, supply‑chain data, satellite imagery, and alternative data to flag inconsistencies.

Proposed methodological solution in the paper's implications and research agenda; suggestion is prescriptive and not validated by new experiments in this review.

speculative positive SUSTAINABILITY ISSUES IN FINANCIAL ACCOUNTING RESEARCH detection of inconsistencies / flagged potential manipulation

If banks operationalize NLP for personalization and acquisition at scale, this could increase differentiation, raise switching costs, and potentially affect market concentration—warranting antitrust monitoring.

Theoretical implication extrapolated from identified capability gaps and economic reasoning about differentiation, switching costs, and scaling advantages; not empirically tested in the reviewed papers.

speculative positive Natural language processing in bank marketing: a systematic ... market structure indicators (differentiation, switching costs, market concentrat...

Limited applied research on NLP for acquisition and personalization implies unrealized value in banking: NLP could enable more efficient, targeted customer acquisition and cross‑sell, potentially lowering customer‑acquisition cost (CAC) and increasing lifetime value (LTV).

Inference drawn from observed topical gaps (low article counts on acquisition/personalization) and standard marketing economics linking targeting/personalization to CAC and LTV; no direct causal evidence provided in the reviewed literature.

speculative positive Natural language processing in bank marketing: a systematic ... customer‑acquisition cost (CAC), customer lifetime value (LTV), acquisition effi...

Standardizing these infra-level primitives could lower integration costs across ecosystems and accelerate enterprise adoption of agent-hosted services.

Policy/economic argument presented in the paper's implications and research directions; no empirical standardization impact study provided.

speculative positive Bridging Protocol and Production: Design Patterns for Deploy... integration cost per deployment; enterprise adoption rate over time after standa...

Missing infraprotocol primitives in MCP create opportunities for platform differentiation—providers implementing CABP/ATBA/SERF-like extensions can capture value by offering more production-ready agent tooling.

Strategic/economic reasoning stated in the implications section; not supported by empirical market-share data in the summary.

speculative positive Bridging Protocol and Production: Design Patterns for Deploy... market share or customer adoption of providers offering these extensions; differ...

Public archives of prompts and commits accelerate diffusion by lowering search/learning costs and enabling replication, thereby increasing adoption speed and lowering entry barriers.

Paper's asserted implication based on the existence of public artifacts and general reasoning about knowledge diffusion; this is an interpretive claim rather than an experimentally validated finding (argumentative, extrapolative).

speculative positive Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau E... hypothesized effect on diffusion/adoption (not directly measured in the project)

Developing economic metrics linked to architecture (interoperability indices, expected upgrade cost, observability coverage, market concentration measures, systemic‑risk indicators) is recommended to guide policy and investment.

Policy recommendation grounded in the paper's normative analysis; no pilot metric development or empirical validation presented.

speculative positive The Internet of Physical AI Agents: Interoperability, Longev... availability and use of architecture‑linked economic metrics

The benchmark provides a testbed useful for studying strategic behavior, coordination failures, and market-like interactions among agents, which can inform economic research and policy.

Paper claims the benchmark's multi-agent, strategic tasks can be used as experimental environments for economic and policy research; this is a normative claim supported by the benchmark's design rather than by empirical studies in the paper.

speculative positive The PokeAgent Challenge: Competitive and Long-Context Learni... utility of benchmark as a research/testbed for studying strategic/multi-agent ph...

Open-source orchestration lowers entry barriers, broadening participation and potentially compressing rents that would otherwise accrue to well-resourced incumbents.

Paper's discussion section argues that releasing orchestration and evaluation tools publicly reduces the technical overhead for entrants; this is a theoretical/observational claim rather than empirically measured in the paper.

speculative positive The PokeAgent Challenge: Competitive and Long-Context Learni... predicted change in barrier-to-entry and market rents (qualitative)

The clear performance gaps indicate high returns to specialized efforts (RL, domain-specific engineering) relative to generalist LLM-only approaches, shaping where teams invest labor and compute.

Paper links benchmarking results (performance gaps between baselines and humans) to economic implications, arguing specialization yields higher returns; this is an interpretive claim based on reported performance differentials.

speculative positive The PokeAgent Challenge: Competitive and Long-Context Learni... economic return on investment inference based on performance differences between...

Benchmarks like PokeAgent will reallocate researcher and industry attention toward multi-agent, partial-observability, and long-horizon planning problems—likely increasing funding and compute investment in RL and hybrid LLM+RL methods.

Paper offers an economic-implications analysis arguing that introducing such a benchmark changes incentives and investment patterns; this is a reasoned projection rather than an empirical observation.

speculative positive The PokeAgent Challenge: Competitive and Long-Context Learni... predicted shifts in researcher/industry attention and investment (qualitative fo...

Public investment in open environments, robotics testbeds, and safety research can reduce concentration risks and externalities and democratize access to embodied AI research.

Policy recommendation based on anticipated strategic importance of shared infrastructure; not empirically validated here.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... accessibility of research infrastructure; distribution of research capabilities ...

Value in the AI ecosystem may shift from passive text/image corpora toward rich interaction datasets and simulated/real environments; ownership and control of simulation platforms and testbeds could become strategically important assets.

Economic and strategic inference from the proposed technical emphasis on embodied/interaction learning; no supporting market data in the paper.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... asset valuations for simulation/testbed providers; transaction volumes for inter...

Increased sample efficiency and transfer will reduce compute and data costs, lowering barriers to entry for firms and broadening feasible AI applications.

Economic argument connecting technical metrics to cost and market effects; not empirically demonstrated in the paper.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... compute/data cost per task; market entry rates for firms

More autonomous learners that can self-experiment and learn from observation will lower deployment costs for adaptable agents and accelerate automation across more occupations, especially embodied and social tasks.

Economic reasoning and projection based on expected technical improvements; speculative without empirical economic analysis in the paper.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... cost of deploying adaptable agents; rate of automation adoption across occupatio...

Cross-cutting elements (hierarchical organization, curriculum/bootstrapping, intrinsic motivation, uncertainty estimation, memory consolidation, neuromodulatory analogs) are important for improving learning in the proposed architecture.

Conceptual recommendation based on known mechanisms from neuroscience and machine learning literature; not validated in the paper.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... improvements in sample efficiency, robustness, transfer when these elements are ...

System M (meta-control) should generate internal signals that decide when to prioritize A vs B, allocate attention, consolidate memory, and trade off uncertainty, novelty, expected information value, and effort costs.

Design proposal motivated by biological meta-control and decision theories; no empirical tests presented.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... accuracy/effectiveness of switching decisions; overall learning efficiency when ...

System B (action-driven learning) should learn through intervention, consequences, and trial-and-error, using active exploration, reinforcement learning, and hierarchical/skill learning.

Architectural proposal aligning with RL and hierarchical learning literature; theoretical description without experimental evidence.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... efficacy of skills learned through action (task success rates; learning speed fr...

System A (observation-driven learning) should build models of others, social contingencies, and passive affordances through imitation, self-supervised representation learning, and inverse RL.

Architectural specification and mapping to existing algorithms (imitation, SSL, inverse RL); no empirical validation provided.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... quality of models learned from observation; accuracy of inferred social continge...

Integrating observation-driven and action-driven learning with meta-control and evolutionary/developmental priors should improve sample efficiency, robustness, transfer, and lifelong adaptation.

Conceptual argument and proposed integration of methods; suggested but untested experimentally in the paper.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... sample efficiency; robustness to distribution shift; cross-domain transfer; life...

A biologically inspired three-part architecture (System A: observation-driven learning; System B: action-driven learning; System M: internally generated meta-control) can address these limitations.

Theoretical proposal and analogy to biological systems; no empirical validation reported in the paper.

speculative positive Why AI systems don't learn and what to do about it: Lessons ... sample efficiency; robustness; transfer; lifelong adaptation

Embedding LLM coaching tools in platforms (employee onboarding, customer support, peer-support communities) could raise overall conversational quality by improving expressive outcomes rather than only informational accuracy.

Authors' implication drawn from trial results showing improved alignment to empathic norms after personalized coaching; no field deployment evidence provided in the paper.

speculative positive Practicing with Language Models Cultivates Human Empathic Co... conversational quality (expressive empathy) — extrapolated

LLM-driven personalized coaching can cheaply scale soft-skill training (empathy expression) that would otherwise require costly human trainers, suggesting a high-return application of AI in workforce development.

Implication drawn from observed efficacy of brief automated coaching in the trial and the scalable nature of LLM deployment; no direct economic field trial provided in the paper.

speculative positive Practicing with Language Models Cultivates Human Empathic Co... scalability and cost-effectiveness (extrapolated, not directly measured)

Barriers to entry may be larger for tacit‑capability‑driven systems than for rule‑based systems, potentially increasing market concentration.

Economic argument linking tacit capabilities to requirements for large data, compute, and specialized training dynamics; speculative and not empirically tested in the paper.

speculative positive Why the Valuable Capabilities of LLMs Are Precisely the Unex... market concentration / barriers to entry

HindSight-style retrospective matching could underpin markets or contingent contracts for ideas by providing an objective payoff rule based on later publications and citations.

Paper's implications section proposing that retrospective matching can be used as an objective payoff rule for markets; this is a proposed application rather than an empirical finding.

speculative positive HindSight: Evaluating LLM-Generated Research Ideas via Futur... Feasibility of using retrospective match-and-score rules as payoff mechanisms in...

By extracting more training value from the same environment interactions, LEAFE reduces marginal data/interaction costs and shifts the cost curve of deploying agentic systems (improves returns-to-sample-effort).

Economic implication argued in the paper based on reported increased sample efficiency under fixed budgets; no formal economic modeling provided—argumentative inference from performance gains per interaction.

speculative positive Internalizing Agency from Reflective Experience Effective cost per unit performance (implied reduction via higher Pass@k per int...

The methodology enables modular chiplet economics by removing a key validation bottleneck, which could support modular upgrade paths and lower manufacturing cost via mixed-node IP blocks.

Authors propose this as an implication of improved integration and repeatability; argumentative claim without accompanying manufacturing-cost or economic-case studies in the summary.

speculative positive ODIN-Based CPU-GPU Architecture with Replay-Driven Simulatio... manufacturing cost or modular upgrade feasibility (projected)

Replay-driven validation can reduce engineering labor hours spent chasing non-deterministic bugs, lowering validation cost per project and decreasing risk of late-stage silicon respins.

Economic implication presented by authors: deterministic, repeatable debugging is argued to reduce manual effort and risk; no empirical labor-hour or cost-savings data provided in the demonstration.

speculative positive ODIN-Based CPU-GPU Architecture with Replay-Driven Simulatio... engineering labor hours and validation cost per project (projected, not measured...

Replay-driven validation is positioned as a scalable pre-silicon validation strategy for future chiplet-based heterogeneous systems.

Authors articulate scalability as a key positioning argument and present the methodology applied to a non-trivial CPU+multiple-GPU-core+NoC demonstrator; however, no large-scale or multi-project scalability study or quantitative scaling metrics are provided.

speculative positive ODIN-Based CPU-GPU Architecture with Replay-Driven Simulatio... scalability/applicability to larger or varied chiplet-based systems (claimed, no...

A successful, stable parallel Newton software stack could spawn middleware and tooling ecosystems (sequence-parallel training/inference libraries), changing how cloud compute is sold and optimized for long-sequence workloads.

Forward-looking implication argued in the thesis based on observed algorithmic improvements and typical software-market dynamics; no empirical market studies provided.

speculative positive Unifying Optimization and Dynamics to Parallelize Sequential... emergence of middleware and market changes (speculative)

Higher utilization efficiency and lower memory footprints from the proposed methods can reduce energy per computation on sequence tasks, moderating environmental impacts of large-scale sequence modeling.

Argument based on measured reductions in runtime and memory in experimental results combined with standard relations between runtime/memory and energy; no direct energy-measurement experiments reported.

speculative positive Unifying Optimization and Dynamics to Parallelize Sequential... energy per computation (projected reduction)

If effective, these methods raise the value of parallel hardware (GPUs/TPUs) for sequence-heavy tasks and could increase demand for massively parallel accelerators over specialized sequential hardware.

Economic and systems-level reasoning extrapolating from algorithmic speedups and memory reductions; no market-deployment experiments presented.

speculative positive Unifying Optimization and Dynamics to Parallelize Sequential... relative demand for parallel accelerators in sequence-heavy workloads (projected...

Enabling parallelization across sequence length can substantially increase GPU utilization and throughput for workloads previously dominated by sequential bottlenecks, reducing amortized compute cost per inference/training pass on long sequences.

Analytical argument based on observed runtime/parallelization improvements and the structure of GPU hardware; no large-scale economic deployment experiments reported in the thesis (argumentative/implicational evidence).

speculative positive Unifying Optimization and Dynamics to Parallelize Sequential... GPU utilization, throughput, and amortized compute cost per pass (projected)

There is a market opportunity for scalable 'control-as-a-service' offerings and curated urban traffic datasets enabled by this data-driven control approach.

Authors' market and policy discussion extrapolating from technical results to business models and data infrastructure value; conceptual reasoning rather than empirical market analysis.

speculative positive Data-driven generalized perimeter control: Zürich case study commercialization potential / emergence of data-driven service offerings (qualit...

Reductions in travel time and CO2 emissions translate into measurable economic benefits (lower fuel consumption, productivity gains, reduced pollution-related health costs).

Economic implications discussed qualitatively in the paper as extrapolation from measured reductions in travel time and emissions; no direct empirical economic quantification within the traffic simulation experiments.

speculative positive Data-driven generalized perimeter control: Zürich case study economic proxies: fuel consumption, travel-time value (productivity), pollution-...

Benchmarks and standards are needed for evaluating high-frequency time series performance to guide procurement and contracting decisions.

Paper recommends establishing standards and benchmarking protocols specifically for high-frequency time series, motivated by observed TSFM brittleness on millisecond data. This is a policy/research recommendation rather than an empirical result.

speculative positive Bridging the High-Frequency Data Gap: A Millisecond-Resoluti... existence and adoption of high-frequency TS benchmarking standards (recommendati...

Improved short-term forecasting enabled by high-frequency data can translate into operational benefits such as better resource allocation (spectrum, scheduling), reduced service-level violations, and enablement of new latency-sensitive services.

Paper argues these application-level benefits as implications of better forecasting for telecom control; these are projected outcomes based on the relevance of the forecasting horizons to control tasks, not empirically demonstrated in the summary.

speculative positive Bridging the High-Frequency Data Gap: A Millisecond-Resoluti... operational improvements (resource allocation efficiency, reduction in service-l...

High-frequency datasets (like millisecond 5G traces) are economically valuable; firms that collect such domain-specific, high-resolution data can gain competitive advantages in low-latency applications.

Paper's implications for AI economics argue that access to high-frequency operational data improves model performance for latency-sensitive tasks and therefore has economic value. This is an economic argument grounded in the empirical observation of model brittleness but not supported by market-level empirical analysis in the summary.

speculative positive Bridging the High-Frequency Data Gap: A Millisecond-Resoluti... economic value / competitive advantage derived from proprietary high-frequency d...

Research and engineering efforts should develop architectures, multi-scale modeling, and fine-tuning protocols tailored to high-frequency time series.

Paper recommends these research directions based on benchmark limitations (poor TSFM performance on high-frequency data). This is a prescriptive claim (future research needed) rather than an empirical result.

speculative positive Bridging the High-Frequency Data Gap: A Millisecond-Resoluti... anticipated improvement in high-frequency time-series performance through specia...

Heterogeneous datasets and missing hardware evaluation create market opportunities for third parties supplying standardized datasets, verification suites, and end-to-end benchmarks (economically valuable public goods).

Market-structure inference based on observed heterogeneity in datasets and the Layer 3b gap across the surveyed systems; presented as an implication in the review.

speculative positive Generative AI for Quantum Circuits and Quantum Code: A Techn... market opportunity for dataset/benchmark providers

Adaptive, resource-aware control of reasoning can reduce operational compute costs and energy usage, increase throughput and resource utilization, and enable new pricing or provisioning strategies for deployed embodied systems.

Paper includes an 'Implications for AI Economics' section arguing these outcomes as consequences of fewer/shorter LLM invocations and improved per-task latency and utilization; these are presented as implications rather than directly measured results.

speculative positive When Should a Robot Think? Resource-Aware Reasoning via Rein... operational cost (compute), energy usage, throughput, provisioning/ pricing impl...

Platform design that implements robust context‑sensitive memory gating (fine‑grained policy engines, provenance, auditable suppression logic) can reduce downstream harms and may become a competitive product differentiator.

Policy and product recommendation based on BenchPreS results; the paper offers this as a plausible solution path but does not provide experimental validation of such platform mechanisms.

speculative positive BenchPreS: A Benchmark for Context-Aware Personalized Prefer... Effectiveness of context‑sensitive memory gating in reducing harms (proposed, no...

The approach has potential to scale to other cities and informal sectors, but generalizability needs empirical testing.

Paper's policy/scaling claim; supported by pilot feasibility but explicitly notes the need for further testing and validation across contexts.

speculative positive AI-Driven Skill Mapping and Gig Economy Matching Algorithm f... scalability / external validity

Richer profiles that capture informal experience and community endorsements improve signaling and may increase returns to informal learning/experience.

Conceptual claim supported by the system's use of nontraditional inputs (community recommendations, short-term histories); the pilot suggests immediate improved matches but does not quantify returns to informal human capital over time.

speculative positive AI-Driven Skill Mapping and Gig Economy Matching Algorithm f... returns to informal learning (wage premia, employment stability)

Dynamic skill extraction and real-time opportunity discovery can increase market thickness, making matches faster and better.

Theoretical/economic implication drawn from system mechanics and pilot outcomes (improved matches and wages); no direct measurement of market thickness or match speeds reported in the summary.

speculative positive AI-Driven Skill Mapping and Gig Economy Matching Algorithm f... market thickness (number of active participants), match speed

Improved predictive accuracy from AI tools can potentially improve screening, promotion, and retention decisions and thereby increase firm productivity by better allocating human capital.

Framing/implication in the paper: authors argue improved measurement and prediction could plausibly enhance managerial decision quality; this is presented as an implication rather than an empirically tested result within the study.

speculative positive Adoption of AI-Based HR Analytics and Its Impact on Firm Pro... Managerial decision quality and firm productivity (hypothesized, not directly me...

Fee-for-service payment structures may not reward efficiency gains from AI; value-based payment or shared-savings models are better aligned to incentivize adoption that reduces total cost and improves outcomes.

Health policy and reimbursement literature synthesizing incentives under different payment models; limited empirical testing of reimbursement models for AI-assisted services.

medium_high positive Human-AI interaction and collaboration in radiology: from co... reimbursement levels, adoption under different payment models, cost savings real...
