Evidence (2290 claims)

Claim counts by category:

- Adoption: 5187 claims
- Productivity: 4472 claims
- Governance: 4082 claims
- Human-AI Collaboration: 3016 claims
- Labor Markets: 2450 claims
- Org Design: 2305 claims
- Innovation: 2290 claims
- Skills & Training: 1920 claims
- Inequality: 1286 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 373 | 105 | 59 | 437 | 982 |
| Governance & Regulation | 366 | 172 | 114 | 55 | 717 |
| Research Productivity | 237 | 95 | 34 | 294 | 664 |
| Organizational Efficiency | 364 | 82 | 62 | 34 | 545 |
| Technology Adoption Rate | 290 | 115 | 66 | 27 | 502 |
| Firm Productivity | 274 | 33 | 68 | 10 | 390 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Output Quality | 231 | 61 | 23 | 25 | 340 |
| Market Structure | 107 | 121 | 85 | 14 | 332 |
| Decision Quality | 158 | 68 | 33 | 17 | 279 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 88 | 31 | 38 | 9 | 166 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 105 | 12 | 21 | 11 | 150 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 68 | 8 | 28 | 6 | 110 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 11 | 16 | 94 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 74 | 5 | 4 | 1 | 84 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 15 | 9 | 5 | 47 |
| Job Displacement | 5 | 29 | 12 | — | 46 |
| Developer Productivity | 27 | 2 | 3 | 1 | 33 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 8 | 4 | 9 | — | 21 |
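A matrix like the one above can be generated mechanically from direction-tagged claim records. A minimal sketch, assuming a hypothetical list of `(outcome, direction)` tuples (the actual claim schema is not shown in this page):

```python
from collections import Counter

# Hypothetical claim records: (outcome category, direction of finding).
claims = [
    ("Developer Productivity", "Positive"),
    ("Developer Productivity", "Positive"),
    ("Developer Productivity", "Mixed"),
    ("Error Rate", "Positive"),
    ("Error Rate", "Negative"),
]

DIRECTIONS = ["Positive", "Negative", "Mixed", "Null"]

# Count claims per (outcome, direction) cell.
cells = Counter(claims)

# Emit one markdown table row per outcome, with a row total;
# empty cells are rendered as "—" to match the table's convention.
for outcome in sorted({o for o, _ in claims}):
    counts = [cells.get((outcome, d), 0) for d in DIRECTIONS]
    row = " | ".join(str(c) if c else "—" for c in counts)
    print(f"| {outcome} | {row} | {sum(counts)} |")
```

Rows whose listed directions do not sum to the printed total would indicate claims tagged with a direction outside the four shown columns.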
Innovation
Support systems for digital services exporters, especially SMEs, are inadequate in China.
Review of policy documents and literature highlighting gaps in finance, legal support, and standards compliance assistance for SME internationalization (qualitative).
China's platform firms show uneven internationalization and platform infrastructure is not consistently internationally competitive.
Case examples and synthesis of domestic/international studies on platform internationalization included in the review (qualitative evidence).
China has limited influence in high‑level trade rule formation.
Policy review and comparative institutional analysis within the literature review; descriptive assessment of China's participation in multilateral rule‑making (no formal measurement of influence).
Current institutional, technological, and market shortcomings limit China’s ability to close the gap with economies operating under high‑standard trade regimes.
Qualitative comparative analysis of policy and institutional frameworks against high‑standard trade members; literature and case examples (no new microdata).
Absence of governance and observability could increase social costs of accidents and induce conservative regulation that stifles beneficial adoption.
Policy reasoning and historical regulatory responses to systemic risks; conceptual projection without quantitative modeling of regulatory impact.
Strong proprietary stacks and incompatible protocols could create winner‑take‑all or oligopolistic market outcomes due to network effects and switching costs.
Market‑structure theory and historical platform examples (e.g., dominant tech platforms); argument is conceptual and not backed by new empirical market analysis in the paper.
Without these architectural commitments, the economic costs — stranded assets, safety incidents, reduced innovation, and high coordination costs — will be substantial.
Predictive economic argument built from historical IoT/Internet lessons and systems reasoning; no quantitative cost estimates or econometric analysis in the paper.
Poor governance and observability in agent networks would make accountability, certification, and regulation difficult.
Policy and governance reasoning with illustrative domain examples; conceptual argument without empirical governance case studies or metrics.
Weak or brittle security and trust mechanisms across distributed agent ecosystems will pose serious risks.
Lessons drawn from IoT security failures and conceptual threat analysis; no new penetration testing or security metrics presented.
Lifecycle mismatch — rapidly evolving AI software embedded in long‑lived physical assets — risks premature ossification or expensive retrofits.
Systems engineering reasoning and historical analogies to embedded systems/IoT lifecycles; no quantitative lifecycle modeling or case study data in the paper.
Top-performing community submissions (including baselines and competition entries) still leave a performance gap relative to elite human play on battling tasks.
Paper reports comparative evaluation results showing win-rate and other metrics for heuristic, RL, LLM baselines and community submissions versus human (elite) benchmarks; analysis highlights a remaining gap.
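A "remaining gap" of this kind is usually reported as a win rate with an uncertainty band. A minimal sketch of the standard Wilson score interval for a win rate, using hypothetical numbers (the paper's actual counts are not given in this summary):

```python
import math

def win_rate_ci(wins: int, games: int, z: float = 1.96) -> tuple[float, float, float]:
    """Point estimate and 95% Wilson score interval for a win rate."""
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2))
    return p, center - half, center + half

# Hypothetical: a top submission winning 42 of 100 games against elite human play.
p, lo, hi = win_rate_ci(42, 100)
print(f"win rate {p:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

An interval whose upper bound stays below 0.5 is what would substantiate a claim of a persistent gap relative to human play.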
Misalignment or poor meta-control could produce persistent unsafe behaviors in autonomous learners; governance and oversight mechanisms will be crucial.
Risk analysis based on conceptual failure modes for meta-control; no empirical incidents reported in the paper.
Current models transfer poorly across domains, are brittle in nonstationary environments, and are inefficient in physical/embodied tasks.
Synthesis of known challenges from prior literature and practical experience; paper cites these as motivating observations rather than reporting new data.
Current models have limited meta-control and do not autonomously decide when to explore, imitate, consult prior knowledge, or consolidate.
Conceptual critique based on typical ML training pipelines and limited on-line decision-making modules; no empirical tests in paper.
There is weak integration between passive observation (supervised/representation learning) and active experimentation (reinforcement/exploratory learning) in current systems.
Observation of methodological separation in current literature and systems; conceptual discussion in the paper.
Current AI models lack the architectures and control mechanisms required for sustained, autonomous learning in dynamic real-world settings.
Conceptual/theoretical analysis presented in the paper; synthesis of limitations observed in existing literature and practices (no new empirical data provided).
Public‑interest concerns (bias, misuse, systemic risk) may be harder to mitigate via simple transparency rules; policies should emphasize outcome‑based regulations, mandatory behavioral testing, and marketplace disclosure obligations for stressed scenarios.
Policy implication derived from the non‑rule‑encodability thesis; no empirical policy evaluation included.
Standard contracts and regulatory audits that rely on inspection of rule sets or source code will be insufficient to assess model behavior or risk; regulators and buyers must rely more on behavior‑based testing, standards, and outcome measures.
Policy and regulatory argument derived from the main theorem about non‑rule‑encodability; no empirical regulatory studies presented.
Full interpretability via rule extraction may be impossible for the most valuable parts of LLM competence, limiting the utility of some transparency approaches for safety and auditing.
Argumentative consequence of the main theoretical claim and structural mismatch; supported by historical limitations of rule‑based systems; no empirical tests reported.
There is a structural mismatch between explicit human cognitive tools (rules, checklists) and the pattern‑rich, high‑dimensional competence encoded in LLMs.
Theoretical/structural argument about distributed statistical representations in LLMs versus discrete rules; no experimental quantification provided.
Historical expert systems failed to generalize or scale to complex, ambiguous tasks, contrasting with LLMs' broader empirical successes.
Historical case analysis and literature review-style discussion of expert systems versus contemporary LLM performance; no new quantitative historical dataset provided.
Existing idea-evaluation approaches (LLM judges or human panels) are subjective and disconnected from real research outcomes.
Framing and motivation in the paper arguing current approaches rely on subjective judgments and do not directly tie to later publication/citation outcomes; supported implicitly by the empirical mismatch (LLM-judge vs HindSight).
High governance costs in regulated/high-risk domains can slow adoption of agentic systems, concentrating deployment in less regulated uses or among large firms that can afford governance infrastructure.
Economic reasoning about fixed and marginal governance costs and firm-level adoption decisions; no empirical adoption data presented.
Path-dependent behavior increases the complexity of principal–agent contracting and moral hazard between platforms, enterprise customers, and downstream users, requiring richer contract terms (acceptable paths, logging, audit rights).
Economic theory reasoning and applied contract/design implications discussed; no empirical contract-study data.
Path-dependent policies complicate ex post auditing and simple rule-based regulation; regulators may prefer standards requiring runtime evaluation and logging to be enforceable in practice.
Conceptual argument about limits of auditing when important state is ephemeral and about how runtime logging enables ex post review; illustrative policy examples mapping to runtime requirements.
The poor TSFM performance is attributed to pretraining corpora lacking high-frequency, domain-diverse examples (temporal-scale and domain mismatch).
Paper interprets benchmark failures as resulting from pretraining data mismatch (TSFMs usually pretrained on low-frequency domains like energy/finance) and argues lack of high-frequency examples reduces effectiveness. This is a causal interpretation based on observed transfer failures rather than a controlled causal experiment.
Most TSFM configurations evaluated failed to achieve adequate predictive performance on this high-frequency distribution.
Benchmarking compares multiple TSFM configurations (and includes traditional ML baselines) on the 5G millisecond dataset and reports that most TSFMs did not reach acceptable performance levels. The summary does not provide exact performance numbers or how adequacy was defined.
Current time-series foundation models (TSFMs), typically pretrained on low-frequency data, generalize poorly to high-frequency wireless and traffic data in zero-shot transfer.
Benchmarks reported in the paper include zero-shot evaluations of multiple TSFM configurations on the high-frequency 5G dataset and find poor zero-shot predictive performance. Exact models, metrics, and sample sizes are not specified in the summary.
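Zero-shot transfer evaluation of this kind follows a simple pattern: hold out the tail of the target-domain series, forecast it without any fine-tuning, and compare errors against cheap baselines. A minimal sketch under stated assumptions (the series here is synthetic, and the TSFM call is left as a comment because the paper's models and data are not specified in this summary):

```python
import numpy as np

def naive_forecast(history: np.ndarray, horizon: int) -> np.ndarray:
    """Persistence baseline: repeat the last observed value."""
    return np.full(horizon, history[-1])

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error between actuals and forecasts."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic stand-in for a millisecond-scale 5G traffic series
# (the benchmark's actual dataset is not reproduced here).
rng = np.random.default_rng(0)
series = np.abs(rng.normal(10.0, 2.0, size=1_000))

history, target = series[:-50], series[-50:]

# A real study would call each pretrained TSFM here in zero-shot mode
# (no fine-tuning on the target domain) and report its error alongside
# the baselines; a TSFM that cannot beat persistence on this split is
# the kind of failure the paper describes.
baseline_err = mae(target, naive_forecast(history, horizon=50))
print(f"persistence baseline MAE: {baseline_err:.3f}")
```

Comparing against a persistence (or seasonal-naive) baseline is what makes "failed to achieve adequate performance" operational: a pretrained model that loses to the last-value forecast has not transferred.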
Estimates of productivity gains from automating quantum-program generation should be discounted given the current lack of hardware-execution validation; adoption timelines and returns remain contingent on resolving the Layer 3b gap.
Forward-looking inference in the review: because Layer 3b is unreported across systems, projected productivity/adoption gains derived from Layers 1–2 results are uncertain and should be treated conservatively.
The absence of Layer 3b reporting raises investment risk and valuation uncertainty for startups and investors building on generative quantum-code technologies.
Economic reasoning derived from the documented empirical gap (no real-device evaluation) in the review; the claim links missing validation to higher uncertainty in productization and revenue potential.
Because end-to-end hardware evaluation is missing, claims of model performance based only on syntactic and semantic tests may be over-optimistic when translated into hardware-deployed value.
Analytical inference in the review: observed evaluations stop at Layers 1–2 for most systems, so mapping to hardware outcomes is unvalidated; this underpins the caution about over-optimistic extrapolation.
Datasets and provenance vary in coverage and quality, and benchmarking practices are heterogeneous across systems, complicating cross-system comparisons.
Review of the 5 identified datasets and reported benchmarking across the 13 systems found variation in dataset provenance, size, task coverage, and bespoke evaluation metrics.
The absence of Layer 3b evaluations creates uncertainty about latency, fidelity, noise resilience, calibration dependence, and practical deployability of generated artifacts.
Logical inference based on the documented lack of real-hardware execution (Layer 3b) across 13 systems; review highlights these specific practical metrics as untested in real devices.
Operational sustainability is a challenge: coordinating long R&D timelines and ensuring expert governance for drug development within DAOs is difficult.
Case-study observations and discussion of organizational challenges; acknowledged lack of longitudinal performance data in the studied projects.
Token economics can create speculative behavior misaligned with long-horizon drug development incentives.
Theoretical analysis of token market dynamics and incentive misalignment; supported by general observations of crypto market speculative behavior, but no DAO-specific empirical causation demonstrated.
Traditional hierarchical firms struggle to coordinate dispersed expertise and finance public‑good stages of drug development.
Theoretical/organizational analysis and literature synthesis on coordination problems and financing gaps for public-good preclinical stages; qualitative argumentation rather than empirical causal inference.
If AI models encode prevailing consensus or measurement conventions, they risk locking in suboptimal conventions and creating path-dependent coordination failures in R&D.
Argument based on path-dependence and model-mediated coordination theory; conceptual exploration with illustrative scenarios; no empirical demonstrations.
Platformization of sensory models and proprietary digital twins could create winner-take-most market dynamics, raise barriers to entry, and concentrate rents in firms controlling large sensory-performance datasets.
Economic reasoning drawing on platform economics and data-monopoly literature; applied conceptually to sensory-model platforms; no empirical market-concentration measurement in the food domain provided.
Failures of translation—both literal (across languages/markets) and metaphorical (between disciplines, scales, and practices)—impede global adoption and ideation of food products and innovations.
Argumentative synthesis citing cross-cultural examples and theoretical literature on translation costs; qualitative examples rather than empirical measurement of translation failures.
Industrial food R&D tends toward conservatism, privileging established measurement and classification schemes that can obscure sensory nuance and cultural variation.
Critical review and synthesis of literature on industrial R&D practices and measurement norms; illustrative industry examples cited; no systematic surveys or quantitative industry-wide data presented.
Language and conceptual frameworks (drawing on Wittgenstein) constrain what can be noticed, measured, and communicated about texture and taste, creating epistemic limits in scientific practice.
Philosophical analysis using Wittgensteinian language theory and examples from food science and sensory studies; literature synthesis and illustrative examples; no systematic empirical validation.
Empirical evidence shows that each 1 percentage point increase in industrial robot density leads to a 0.8 percentage point decrease in the manufacturing global value chain (GVC) participation rate.
Empirical claim reported in the paper; method described as empirical analysis but the provided excerpt does not specify dataset, country sample, time period, model specification, controls, or sample size.
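A finding of this form is typically estimated with a two-way fixed-effects panel regression. A minimal sketch of such a specification, assuming a country-year panel (the excerpt does not report the actual model, sample, or controls, so every term below is an assumption):

$$
\mathrm{GVC}_{it} = \alpha + \beta\,\mathrm{RobotDensity}_{it} + \gamma' X_{it} + \mu_i + \lambda_t + \varepsilon_{it}
$$

where $\mathrm{GVC}_{it}$ is the manufacturing GVC participation rate of country $i$ in year $t$, $X_{it}$ is a vector of controls, $\mu_i$ and $\lambda_t$ are country and year fixed effects, and the reported finding would correspond to $\hat{\beta} \approx -0.8$.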
Developing countries face a triple barrier of technology embargoes, rule bundling, and capital concentration.
Theoretical and literature-based claim described by the authors; no empirical quantification of these barriers (e.g., number of embargoes, measures of rule bundling, capital concentration metrics) included in the excerpt.
Despite positive outcomes, challenges such as workforce displacement, ethical concerns, and limited access to AI technologies were identified as barriers to full adoption.
Study respondents reported barriers in the survey; descriptive statistics summarized the prevalence of workforce displacement concerns, ethical issues, and limited access to AI technologies as impediments to broader adoption.
The SCF is extended into a second-order layer (SCF-E) that incorporates a deficit of technocultural imagination and symbolic governance, explaining why AI remains stuck in pilots and does not convert into organizational capability.
Conceptual (second-order) extension reported in the article; methodologically supported by the QUAN→QUAL combination, including SCF-oriented ethnography (empirical details in the body of the article, not the abstract).
The technology-adoption literature (TAM, UTAUT, Diffusion of Innovations) tends to treat resistance as a generic behavioral variable or a 'training' deficiency, neglecting symbolic dimensions (rites, identities, and power), cognitive threat mechanisms (loss aversion, overload, and heuristics), and their economic effects.
Literature review and theoretical positioning stated in the article, comparing established models with the proposed perspective; no indication of meta-analysis or empirical counts in the abstract.
Psychoanthropological Friction (SCF) is proposed and detailed as a measurable coefficient of the cultural cost and cognitive resistance that reduces the capacity of small and medium-sized enterprises (SMEs) to turn Artificial Intelligence (AI) initiatives into value generation at scale.
Theoretical proposition and operationalization presented in the article; the methodological design is described as QUAN→QUAL, including construction of a psychometric scale and organizational ethnography. The abstract does not specify a validation sample size.
Over-reliance on data-driven insights without adequate human oversight can worsen market uncertainty.
Reported in the study's qualitative case studies and interpretive analysis as a potential negative consequence of improper AI/Big Data use (no quantified examples provided in the summary).
Algorithmic bias is a potential pitfall of using AI and Big Data that can exacerbate market uncertainty.
Identified as a risk in the paper's qualitative analysis and discussion of pitfalls (no incident counts or empirical quantification provided in the summary).
There are concerns that AI may undermine the right to privacy in India.
Legal and policy analysis in the paper discussing privacy risks associated with AI and data-driven governance (review of privacy frameworks and potential conflicts). No empirical sample size; based on normative/legal analysis.