Evidence (6869 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Governance
Regulated deployment imposes four load-bearing systems properties — deterministic replay, auditable rationale, multi-tenant isolation, statelessness for horizontal scale — and stateful architectures violate them by construction.
Conceptual/architectural argument presented in the paper (theoretical analysis), not an empirical measurement in the abstract.
Evaluation of four leading AI platforms shows that standard RAG-based approaches achieve an average of only 15% accuracy when information is insufficient.
Empirical evaluation described in paper: four AI platforms tested on benchmark; reported average accuracy of 15% for RAG-based approaches on cases with insufficient information.
Unemployment insurance adjudication has seen rapid integration of AI systems, and the question of additional fact-finding poses the most significant bottleneck for a system that affects millions of applicants annually.
Contextual/introductory claim in paper; references to domain-scale impact and bottleneck; no specific numeric study sample provided in excerpt.
A well-known limitation of AI systems is presumptuousness: the tendency of AI systems to provide confident answers when information may be lacking.
Statement in paper framing the problem; general literature/contextual claim (no specific experiment cited in the excerpt).
Brevity, semantic isolation and rhetorical register independently predict representational outcome (i.e., which submissions are included/excluded in summaries).
Statistical/semantic analysis (presumably regression or causal inference) reported in the paper linking textual features—brevity, semantic isolation, rhetorical register—to representational outcomes.
Exclusion concentrates in clusters expressing dissent, scepticism and critique of AI, with exclusion rates of 33%–88% in such clusters.
Cluster/semantic analysis reported in the paper showing higher exclusion rates for clusters labeled as dissent/scepticism/critique.
In topic B, 15.3% of participants are effectively excluded by the official summary.
Empirical measurement reported in the paper quantifying participants 'effectively excluded' when comparing source submissions to official summary coverage.
In topic A, 16.9% of participants are effectively excluded by the official summary.
Empirical measurement reported in the paper quantifying participants 'effectively excluded' when comparing source submissions to official summary coverage.
Both official government summaries underperform a random-participant baseline for topic B (coverage degradation of -8.0%).
Empirical comparison in the paper between official government summary and a random-participant baseline using the n=5,253 consultation responses.
Both official government summaries underperform a random-participant baseline for topic A (coverage degradation of -9.1%).
Empirical comparison in the paper between official government summary and a random-participant baseline using the n=5,253 consultation responses.
No single policy instrument is sufficient to produce high regional science and technology industrial competitiveness.
Result of fuzzy-set qualitative comparative analysis (fsQCA) on AI policy instruments issued by provincial-level governments in China, reported in the study; fsQCA finds no individual condition is sufficient.
None of the LLMs tested endorsed fraudulent investments (0% endorsement across all models).
Preregistered experiment across seven leading LLMs producing 3,360 AI advisory conversations; reported 0% endorsement of objectively fraudulent opportunities.
Endorsement reversal occurred in fewer than 3 in 1,000 observations.
Observed incidence reported from the preregistered experiment (3,360 AI advisory conversations); statement in paper reporting incidence <3/1,000.
Critical gaps persist in explainability, regulatory alignment, ethical governance, and context-specific validation.
Authors' synthesis and Conclusion listing persistent shortcomings identified across the reviewed literature.
Integration of decision intelligence principles into AI applications for financial risk management in emerging markets is nascent.
Authors' synthesis noting limited presence of decision intelligence frameworks or hybrid human-AI decision processes across the reviewed literature.
There is limited empirical validation of AI approaches in emerging market settings.
Review finding described in Results and Conclusion: comparatively few studies provide robust, context-specific empirical validation for emerging markets despite general claims of effectiveness.
Disparities emerge and compound across stages of the ML pipeline (training data, model predictions, and post-processing).
Pipeline-level analysis reported in paper showing sources of disparity at multiple stages and how effects accumulate from training data through prediction to post-processing.
Post-processing amplifies these disparities by collapsing heterogeneous probabilities into percentile-based risk tiers.
Analysis of the pipeline showing that converting model probabilities into percentile-based risk tiers (post-processing step) increases observed disparities across demographic groups.
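The collapsing step described above can be illustrated with a minimal sketch. It is not the audited system's method: the probabilities are synthetic, and the function name `percentile_tiers` and the cutoffs at the 50th and 80th percentiles are hypothetical choices made only for illustration. It shows how ranking students and cutting at fixed percentiles can place very different probabilities in one tier while splitting nearly identical ones across a boundary.

```python
from bisect import bisect_right

def percentile_tiers(probs, cutoffs=(50, 80)):
    """Map each probability to a tier index based on its percentile rank.

    `cutoffs` are hypothetical percentile boundaries (assumption):
    rank below the 50th percentile -> tier 0, 50th to below 80th -> tier 1,
    80th and above -> tier 2. Ties are broken by input order.
    """
    n = len(probs)
    # Indices of students sorted from lowest to highest probability.
    order = sorted(range(n), key=lambda i: probs[i])
    tiers = [0] * n
    for rank, i in enumerate(order):
        percent_below = 100 * rank // n  # share of students ranked lower
        tiers[i] = bisect_right(cutoffs, percent_below)
    return tiers

# Synthetic dropout probabilities for ten students (illustrative only).
probs = [0.02, 0.05, 0.10, 0.11, 0.29, 0.30, 0.31, 0.32, 0.60, 0.95]
tiers = percentile_tiers(probs)

for p, t in zip(probs, tiers):
    print(f"probability={p:.2f} -> tier {t}")
```

In this synthetic example, the students at 0.60 and 0.95 share the top tier, while the students at 0.29 and 0.30 fall into different tiers. The tier label therefore discards most of the information in the underlying probability, which is the mechanism the claim points to; whether this widens gaps between demographic groups depends on how each group's probabilities are distributed around the cutoffs.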
Older and female students with comparable dropout risk are under-identified by the EWS.
Audit comparison showing lower identification/flagging rates for older and female students who have comparable modeled or observed dropout risk to other groups; reported as part of the pipeline disparities analysis.
Younger, male, and international students are disproportionately flagged for support by the EWS, even when many ultimately succeed.
Empirical results from the replica-based audit comparing model predictions and post-processing flags against eventual student outcomes; disparities reported by demographic groups (age, gender, residency). Exact sample size and numerical metrics not provided in the abstract.
Recent policy and academic discourse has increasingly acknowledged the infeasibility of full-stack AI sovereignty, but has not yet provided an integrating theoretical architecture for governing dependence under these conditions.
Literature/policy-discourse claim made in the paper (review/interpretation). No empirical sampling or quantitative evidence reported in the provided text.
The concentration of AI-related infrastructures is coalescing into distinct geocognitive power poles whose competing infrastructural ecosystems generate structural asymmetries that position small and medium-sized states within regimes of cognitive-informational dependence.
Theoretical/geopolitical argument introduced in the paper (conceptual framing). No empirical sample size or quantitative measurement provided in the excerpt.
There is a growing concentration of computational capacity, data ecosystems, and advanced model architectures within a limited number of technological actors, signaling the emergence of a cognitive-informational order in which influence is exercised through the architectures that shape how knowledge is generated, interpreted, and operationalized.
Theoretical/observational assertion in the paper (conceptual synthesis). No empirical details, sample sizes, or quantitative analyses provided in the supplied text.
The policy and research challenge posed by platform-mediated automation is not merely job quantity (technological unemployment) but institutional continuity — how societies reproduce practical competence when platforms optimize for efficiency rather than formation.
Normative and conceptual claim developed through literature synthesis (institutional economics, platform governance, workforce development); presented as an analytical reframing rather than an empirically tested hypothesis.
Entry-level roles have historically functioned as apprenticeships in which workers acquire tacit knowledge and critical judgment; if platforms curtail these formative occupational layers, organizations may lack future workers capable of exercising contextual reasoning required to manage complex systems.
Institutional economics and workforce development literature cited in the paper; conceptual synthesis without original empirical measurement reported.
Platform-mediated automation risks hollowing out labor structures from both directions: eroding repetitive, junior roles from below and automating supervisory coordination functions from above.
Theoretical argument synthesizing institutional economics and platform literature; articulated as a conceptual risk rather than demonstrated with original empirical data.
Algorithmic systems are displacing routine tasks across both low-wage entry-level work and middle-management functions.
Stated in paper's argumentation; supported by a literature-based review drawing on platform governance literature and recent research on AI-enhanced automation (no original empirical sample or quantitative study reported).
For agentic systems, there are three structural breaks: decision diffusion, evidence fragmentation, and responsibility ambiguity.
Analytical identification and labeling of three specific structural problems for agentic AI within the paper's argumentation.
The paper introduces the 'cascade of uncertainty', showing how governance failures propagate through serial dependencies between framework layers.
Conceptual/theoretical model introduced and analyzed in the paper (cascade model linking framework layers and failure propagation).
Agentic AI systems encounter structural breaks that prevent normal framework fillability.
Paper's analytic assessment reports that agentic AI systems cause structural breaks undermining the framework's ability to fill DES-properties.
Classical ML systems achieve only minimal DES-property fillability.
Analytic comparison in the paper classifies classical ML systems as providing minimal governance evidence fillability.
When automated decision systems fail, organizations frequently discover that formally compliant governance infrastructure cannot reconstruct what happened or why.
Asserted by the paper as an observed problem motivating the study; presented as a general empirical/experiential claim (literature/examples synthesis) rather than a controlled empirical estimate.
Artificial intelligence introduces systemic risks through unprovenanced AI-derived metadata.
Cautionary claim made by the authors; stated as a systemic risk linked to provenance issues of AI-generated metadata, without empirical incident data in the excerpt.
The debate about scholarly knowledge infrastructure has long been framed as a contest between openness and commercial enclosure, and this framing distorts both policy and practice.
Conceptual/persuasive claim made in the paper's opening paragraph; no empirical data or sample reported in the excerpt.
AI is driving states to reconsider interdependence not as the source of peace, but as a battlefield of power.
Normative and interpretive conclusion drawn from the paper's analysis of AI's geopolitical implications; no empirical data or sample reported in the abstract.
AI is redefining foreign policy in a multipolar world by making the line between economic cooperation and strategic vulnerability indistinct.
Theoretical claim and synthesis in the paper's thesis; no empirical evidence or sample size provided in the abstract.
AI is reshaping economic relationships between countries that were previously sources of mutually beneficial relations into instruments of coercion.
The paper presents a theoretical analysis drawing on international political economy and foreign policy theory; no empirical measurements reported in the abstract.
AI enhances the weaponization of economic interdependence by enabling states to monitor, predict, manipulate, and disrupt transnational networks with unprecedented accuracy.
The paper advances a theoretical argument and synthesis of international political economy and foreign policy literatures; no empirical sample or quantitative data reported in the abstract.
The infrastructure for cross-user agent collaboration is entirely absent, let alone the governance mechanisms needed to secure it.
Authoritative claim in paper framing the research gap; presented as observational/argumentative (no empirical audit reported).
Current AI agent frameworks have made remarkable progress in automating individual tasks, yet all existing systems serve a single user.
Statement in paper's introduction/positioning; conceptual survey-style claim (no empirical study or systematic benchmark reported).
General-purpose LLMs pose misinformation risks for development and policy experts because they lack the epistemic humility needed to produce verifiable outputs.
Conceptual/argumentative claim stated in the paper's motivation; no empirical test reported in the abstract.
Current session-based context handling (sessions ending, context windows filling, memory APIs returning flat facts) produces intelligence that is powerful per session but amnesiac across time.
Descriptive diagnostic argument in the paper; no empirical measurement reported in this text.
To protect its advantage, the US restricts mobility and knowledge flows and challenges regulatory efforts.
Descriptive claim about US strategy (policy observation stated in the paper's framing; not quantified in the excerpt).
The AI race amplifies security risks and international tensions.
Introductory/interpretive claim motivating the study (no specific empirical quantification provided in the excerpt).
The US and China form two poles around which global AI research increasingly revolves (i.e., global AI research is polarizing around these two countries).
Longitudinal network analysis of international collaboration and citation patterns derived from publication data compared to random realizations.
The US and China have long diverged in both cross-country collaboration and citation links, forming two poles around which global AI research increasingly revolves.
Large-scale data of scientific publications spanning three decades; analysis comparing cross-country collaboration and citation links to their random realizations (null models).
Under logit demand and symmetric rivals, the QoS gap is strictly decreasing in API price and rival entry elasticity.
Comparative statics derived from the analytical model (logit demand, symmetric rivals).
The paper identifies governance challenges such as accountability gaps, digital sovereignty risks, ethical pluralism, and strategic weaponization arising from embedding AI in diplomatic practice.
Conceptual and normative analysis section of the paper outlining risks and governance challenges; illustrated by examples and argumentation.
Thin training coverage fosters anxiety about substitution and slows diffusion of AI tools.
Reported associations from surveys of mid-level managers and technical staff, interviews, and document analysis across cases; thematic coding identified links between limited training, worker anxiety, and slower diffusion. (Sample size not reported.)
The authors identify five 'decoys' that seemingly critique—but in actuality co-constitute—AI's emergent power relations and material political economy.
Analytical contribution of the paper: identification and conceptual description of five decoys based on literature synthesis; this is a descriptive/theoretical taxonomy rather than an empirical enumeration with sample size.