Evidence (4333 claims)
Adoption: 5539 claims
Productivity: 4793 claims
Governance: 4333 claims
Human-AI Collaboration: 3326 claims
Labor Markets: 2657 claims
Innovation: 2510 claims
Org Design: 2469 claims
Skills & Training: 2017 claims
Inequality: 1378 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
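As a reading aid, the sketch below derives two quantities from three rows copied from the matrix: the share of claims with a positive finding, and the number of claims counted in Total but not in any of the four direction columns. In several rows the direction columns sum to less than Total, and the page does not say how the remainder is classified. The function name `summarize` and the field name `unlisted` are illustrative, and reading a dash cell as zero is an assumption.

```python
# Three rows copied from the matrix above: (positive, negative, mixed, null, total).
rows = {
    "Firm Productivity": (306, 39, 70, 12, 432),
    "Task Completion Time": (88, 5, 4, 3, 100),
    "Job Displacement": (6, 38, 13, 0, 57),  # dash cell read as 0 (assumption)
}


def summarize(counts):
    """Return the positive share and the count not covered by the four direction columns."""
    positive, negative, mixed, null, total = counts
    listed = positive + negative + mixed + null
    return {
        "positive_share": positive / total,
        "unlisted": total - listed,  # meaning of this remainder is not stated in the source
    }


summary = {name: summarize(counts) for name, counts in rows.items()}
for name, s in summary.items():
    print(f"{name}: {s['positive_share']:.1%} positive, {s['unlisted']} unlisted")
```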
Governance
The unit of analysis is the country-year observation for G20 members, covering 2015–2023.
The paper states its sample and scope as a cross-country panel of G20 economies covering 2015–2023 (up to 20 countries × 9 years = 180 country-year observations, depending on coverage).
The paper's empirical approach is primarily qualitative and interpretive: a systematic literature review plus comparative qualitative case studies, using policy documents, public diplomacy examples, development initiatives, technology export and standards behaviour, and secondary empirical studies as evidence.
Methods section of the paper explicitly states the approach and evidence types; sample of four comparative cases (US, China, EU, Russia) is specified.
The paper applies the framework to the United States, China, the European Union, and Russia to show how their mixes and institutional practices of smart power differ.
Explicit comparative qualitative case studies of four major international actors (sample size: four cases) using policy documents, public diplomacy examples, and development/technology initiatives as illustrative evidence.
Empirical validation of the book’s proposals would require complementary case studies, model documentation, and outcome measurements.
Author/reviewer recommendation in the blurb about methodological limitations and next steps; not an empirical finding.
The book is predominantly conceptual and policy-analytic and uses illustrative case vignettes rather than presenting a single empirical study.
Explicit methodological description in the Data & Methods blurb: synthesis of technical ideas, governance requirements, and illustrative vignettes; no empirical sample or experimental protocol described.
The research program is grounded in 12 years of forensic legal research spanning 2014–2026.
Author-stated research timeline and methodology (2014–2026 forensic legal research).
The protocol is underpinned by a forensic audit of approximately 4,200 specialized texts (legal doctrine, regulation, standards, technical literature).
Stated corpus and audit in the Methods section: ~4,200 texts reviewed as part of the forensic audit.
The protocol systematizes arguments for 16 projected rulings at Mexico’s Supreme Court (SCJN) to anchor the proposed rights and rules in constitutional practice.
Doctrinal projection and constitutional strategy section of the compendium describing 16 projected SCJN rulings (method: legal projection/modeling).
The compendium’s findings and recommendations are based on a forensic audit of approximately 4,200 specialized texts covering doctrine, jurisprudence, regulation and technical literature.
Stated methodological claim in the compendium: forensic corpus audit of ~4,200 texts (sample size reported).
The evidence base is qualitative: the study uses conceptual framework synthesis, comparative analysis of multi-sector implementations, and case examples rather than randomized or large-sample empirical evaluation.
Methods and limitations section of the paper explicitly describing the evidence base and methods (qualitative synthesis, pattern extraction, cross-case lessons).
The paper presents a deployment pattern intended to be adapted by sector and regulatory context rather than a one-size-fits-all blueprint.
Explicit statement in the paper and the described pattern design; based on qualitative pattern extraction and prescriptive guidance.
Partial least squares structural equation modeling (PLS-SEM) was used to test hypothesized direct, mediated, and moderated paths.
Methods/analysis section states PLS-SEM was the statistical approach to estimate paths, mediation, and moderation effects.
The study employed a 2 × 2 between-subjects experimental design manipulating (1) identity disclosure (transparent vs. nondisclosed) and (2) conversational tone (empathetic/personalized vs. generic).
Explicit description of experimental factors and design in the methods (2 × 2 between-subjects).
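The 2 × 2 between-subjects design above can be sketched as four cells, with each participant assigned to exactly one. The factor levels come from the claim; the balanced round-robin assignment, the function name `assign`, and the participant count are assumptions for illustration, since the entry does not describe the randomization procedure.

```python
import itertools
import random

DISCLOSURE = ("transparent", "nondisclosed")
TONE = ("empathetic", "generic")

# The four cells of the 2 x 2 design: every disclosure level crossed with every tone level.
CONDITIONS = list(itertools.product(DISCLOSURE, TONE))


def assign(participant_ids, seed=0):
    """Assign each participant to one cell at random, keeping cell sizes balanced (assumed procedure)."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Deal shuffled participants round-robin so cell sizes differ by at most one.
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(ids)}


assignment = assign(range(12))  # 12 hypothetical participants
```

Because the design is between-subjects, no participant appears in more than one cell, so differences between cells cannot be driven by the same person seeing both versions.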
Stimuli (chatbot dialogues) were standardized and pretested using a large-language-model (LLM) workflow to ensure consistent experimental stimuli across conditions.
Methods section describing stimuli creation: LLM-generated dialogues were produced and pretested to standardize messages across the 2 × 2 conditions.
Quasi-experimental designs (difference-in-differences, instrumental variables, event studies) and panel regressions are useful methods for identifying causal effects of AI adoption where plausibly exogenous variation exists.
Methodological summary in the paper listing common empirical strategies used in the literature to estimate causal impacts of technology adoption.
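Of the strategies listed above, difference-in-differences is the simplest to illustrate. The sketch below computes the canonical two-group, two-period estimate: the change in the treated group minus the change in the control group. The data are synthetic and the function name `did_estimate` is illustrative. The estimate has a causal reading only under the parallel-trends assumption, which the code does not test.

```python
from statistics import mean


def did_estimate(records):
    """Two-group, two-period difference-in-differences on (treated, post, outcome) tuples."""

    def cell(treated, post):
        # Mean outcome for one group in one period.
        return mean(y for t, p, y in records if t == treated and p == post)

    treated_change = cell(True, True) - cell(True, False)
    control_change = cell(False, True) - cell(False, False)
    return treated_change - control_change


# Synthetic illustration only; not data from any study in this list.
records = [
    (True, False, 10.0), (True, False, 12.0),   # adopters, before
    (True, True, 15.0), (True, True, 17.0),     # adopters, after
    (False, False, 9.0), (False, False, 11.0),  # non-adopters, before
    (False, True, 11.0), (False, True, 13.0),   # non-adopters, after
]
estimate = did_estimate(records)
```

Here the adopters' mean rises by 5 and the non-adopters' by 2, so the estimate is 3.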
Current research is limited by measurement challenges in capturing AI capabilities and firm-level adoption, and by a lack of longitudinal worker-firm data and causal identification in many settings.
Explicit limitations noted by the paper: gaps in task measures, scarce longitudinal linked datasets, and methodological challenges in causal inference.
This paper's approach is qualitative and based on secondary literature synthesis; it does not collect primary survey, experimental, or administrative data.
Explicit statement in the Data & Methods section of the paper.
Key empirical gaps remain: better measurement of K_T (AI/software capital), more granular matched employer‑employee and wealth data, and improved estimates of task-substitution elasticities are required to precisely quantify incidence and policy impacts.
Authors’ stated research agenda and limitations section, including sensitivity analyses showing outcome variation with parameter choices and measurement uncertainty.
Models are prompted to assess profiles along dimensions of social acceptance, marital stability, and cultural compatibility.
Experimental procedure: prompts asked models to rate profiles on the three named dimensions.
We evaluate five LLM families (GPT, Gemini, Llama, Qwen, and BharatGPT).
Methods: models enumerated as the LLM families evaluated in the audit.
We vary caste identity across Brahmin, Kshatriya, Vaishya, Shudra, and Dalit, and income across five buckets.
Experimental design described: caste identity explicitly manipulated across five named caste categories; income varied across five buckets.
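The manipulation described above can be sketched as a grid of profile variants. The caste categories are taken from the claim. The income bucket labels are placeholders because the entry does not give the bucket boundaries, and fully crossing the two factors is an assumption.

```python
import itertools

CASTES = ("Brahmin", "Kshatriya", "Vaishya", "Shudra", "Dalit")
# Placeholder labels: the source says five buckets but not their boundaries.
INCOME_BUCKETS = ("bucket_1", "bucket_2", "bucket_3", "bucket_4", "bucket_5")


def build_grid(castes=CASTES, incomes=INCOME_BUCKETS):
    """Cross every caste label with every income bucket (full crossing is assumed)."""
    return [
        {"caste": caste, "income": income}
        for caste, income in itertools.product(castes, incomes)
    ]


grid = build_grid()
print(len(grid))
```

Holding the rest of a profile fixed while varying only these two fields is what lets an audit attribute rating differences to the manipulated attributes.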
We conduct a controlled audit of caste bias in LLM-mediated matchmaking evaluations using real-world matrimonial profiles.
Described methodology in the paper: a controlled audit using real-world matrimonial profiles to probe LLMs for caste bias.
Repositioning informal systems as co-creators in urban governance (relational public administration) enables transformative governance and effective localization of SDGs in sustainable cities in South Africa.
Conceptual/analytical argumentation (theoretical paper; no empirical sample reported).
Determinants that significantly increase the likelihood of participation in small-scale livestock production in Malawi include household size, access to credit, access to extension services, landholding size, distance to the market, and location in the Northern region.
Cross-sectional analysis of IHS5 (sample = 8,795 households); determinants identified as significant in the analysis.
Households engaged in small-scale livestock production in Malawi earned, on average, an additional MWK 36,405.76 compared to non-producing households.
Cross-sectional analysis of the Fifth Integrated Household Survey (IHS5) with a sample of 8,795 households.
Individuals in Thohoyandou used traditional healing practices (e.g., steam inhalation with stones and salt; herbal concoctions including various named plants and mixtures) to survive COVID-19 without hospitalization, underscoring the significance of traditional healing practices during the pandemic.
Narrative inquiry based on in-depth interviews with three respondents (sample size = 3).
Teacher unions function as a counter-hegemonic force challenging neoliberal geopolitics and political norms and are repositioning as intellectual activists rather than compliant officials.
Qualitative interpretivist analysis of narrative interviews with unionized educators and public union discussions (no sample size reported).
Digitalization significantly enhances market access and supplier diversity for SMMEs.
Qualitative secondary data thematic analysis (literature/reports/industry initiatives; no sample size reported).
Indigenous Knowledge Systems (IKS) represent a dynamic body of wisdom encompassing sustainable agriculture, natural resource management, and community resilience, and offer proven, contextually grounded solutions to modern challenges like climate change and food insecurity.
Qualitative desktop research synthesizing existing literature (literature review; no sample size reported).
Prompts can be treated as decision policies that allocate discretion between researcher and system, governing what is executed and when iteration stops.
Methodological framing advanced by the authors describing prompts as decision policies; conceptual claim based on the paper's analytic framework rather than empirical measurement.
Operational-constraint and decision-rule prompts deliver large and stable footprint reductions while preserving decision-equivalent topic outputs.
Experimental comparisons of prompt strategies in the benchmarked workflow, showing reductions in runtime and CO2e and assessing whether topic outputs remained decision-equivalent (asserted in the abstract; no numeric reductions or sample sizes provided).
We benchmark a modern economic survey workflow, an LDA-based literature mapping implemented with GenAI-assisted coding and executed in a fixed cloud notebook, measuring runtime and estimated CO2e with CodeCarbon.
Experimental benchmark described in the paper: single implemented workflow (LDA-based literature mapping) executed in a fixed cloud notebook with runtime and CO2e measured using CodeCarbon (methodological claim).
Training footprint is the largest cluster in the mapped Green AI literature.
Result from the paper's literature mapping / clustering (statement in abstract; no numeric cluster sizes given).
We map the recent Green AI literature into seven themes: training footprint is the largest cluster, while inference efficiency and system-level optimisation are growing rapidly, alongside measurement protocols, green algorithms, governance, and security and efficiency trade-offs.
Bibliometric / thematic mapping of recent Green AI literature described in the paper (method: literature mapping; exact number of papers or mapping procedure not specified in abstract).
Average ratings [for same-caste matches were] up to 25% higher (on a 10-point scale) than inter-caste matches.
Quantitative result reported in the analysis comparing average ratings (10-point scale) between same-caste and inter-caste matches; statement specifies magnitude 'up to 25%'.
Our analysis reveals consistent hierarchical patterns across models: same-caste matches are rated most favorably.
Reported results across evaluated LLMs showing consistent patterns where same-caste profile pairings received higher ratings than inter-caste pairings.
A representative incident (ISS-004) demonstrated boundary-based containment with 10-minute detection latency, zero user exposure, and 80-minute resolution.
Incident ISS-004 report in the paper giving specific timings for detection latency (10 minutes), user exposure (zero), and resolution (80 minutes).
The multi-agent approach improved reliability: audited handoffs detected and blocked a coordinate transformation error affecting all 2,452 stations before publication.
Incident detection reported in the SF2Bench deployment where audited handoffs prevented publication of a coordinate transformation error that would have affected all 2,452 stations.
The multi-agent approach improved efficiency: the SF2Bench deployment was completed by a single operator in two days, with repeated artifact reuse across deployments.
Operational report from the production deployment: single operator completion time of two days and reuse of artifacts across deployments as stated in the paper.
SF2Bench, a compound flooding benchmark comprising 2,452 monitoring stations and 8,557 published files spanning 39 years, validates the multi-agent workflow.
Reported dataset composition and use in the paper: SF2Bench with stated counts and temporal span used to validate the multi-agent workflow.
EnviSmart treats reliability as an architectural property through two mechanisms: (1) a three-track knowledge architecture that externalizes behaviors (governance constraints), domain knowledge (retrievable context), and skills (tool-using procedures) as persistent, interlocking artifacts; and (2) a role-separated multi-agent design where deterministic validators and audited handoffs restore fail-stop semantics at trust boundaries before irreversible steps.
System architecture and design description in the paper; presented as the core reliability mechanisms implemented in EnviSmart.
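The second mechanism, deterministic validators that restore fail-stop semantics before an irreversible step, can be sketched as follows. This is not EnviSmart's implementation: the names `Station`, `validate_coordinates`, and `publish` are hypothetical, and the latitude/longitude range check is an assumed example of a deterministic rule.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised to halt the pipeline before an irreversible step."""


@dataclass(frozen=True)
class Station:
    station_id: str
    lat: float
    lon: float


def validate_coordinates(stations):
    """Deterministic check: every station must lie within valid latitude/longitude ranges."""
    bad = [
        s.station_id
        for s in stations
        if not (-90.0 <= s.lat <= 90.0 and -180.0 <= s.lon <= 180.0)
    ]
    if bad:
        raise ValidationError(f"{len(bad)} station(s) out of range, e.g. {bad[:5]}")
    return stations


def publish(stations, validators):
    """Run every validator first; reach the publication step only if none raises."""
    for check in validators:
        check(stations)
    # Stand-in for the irreversible publication step.
    return [s.station_id for s in stations]


good = [Station("A1", 29.7, -95.4), Station("B2", 40.7, -74.0)]
published = publish(good, [validate_coordinates])
```

The design choice illustrated is that a failed check raises rather than logs, so nothing downstream of the trust boundary runs on data that did not pass.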
We introduce EnviSmart, a production data management system deployed on campus-wide storage infrastructure for environmental research.
System description and statement of deployment in the paper; presented as a production deployment (no randomized evaluation reported).
Embedding LLM-driven agents into environmental FAIR data management can externalize operational knowledge and scale curation across heterogeneous data and evolving conventions.
Conceptual / argumentative claim made in the paper as a motivation for the system; no quantitative experiment tied to this statement in the excerpt.
The agentic-specificity classification helps organizations distinguish challenges that require novel approaches from those that are addressable with established practices.
Authors' proposed classification (agentic-specific vs. carried-over/amplified) intended as a practical decision aid; derived from the coding and comparative analysis.
The taxonomy provides a diagnostic framework for identifying priority barrier dimensions and understanding cross-dimensional amplification mechanisms.
Authors present a taxonomy derived from the review and claim it can be used diagnostically by organizations; supported by the coded barrier classification and STS mapping.
Organizations and policymakers that treat work-time policy as foundational economic planning will better position their economies to harness AI's benefits while mitigating systemic instability.
Policy-prescriptive conclusion based on cross-disciplinary analysis; no empirical trial or quantification offered in the summary.
Work-time reduction can distribute productivity gains more equitably.
Argument supported by examination of historical work-time transitions and pilot programs referenced in the article; no empirical effect sizes or sample details in the summary.
Coordinated reduction in working hours helps maintain aggregate demand.
The paper's synthesis of historical transitions and pilot programs and argument about distribution of productivity gains; no quantitative evidence or sample sizes provided in the summary.
Gradual, policy-led reduction in standard working hours can preserve employment.
Claim based on examination of historical work-time transitions, contemporary pilot programs, and cross-sector implementation strategies referenced in the paper; no specific studies or sample sizes cited in the summary.
Competition law assessments of a dominant undertaking’s conduct must consider not only the product market but also the labor market, particularly in cases of significant market structure changes.
Conclusion stated in abstract summarizing the paper’s findings; supported by the paper's legal analysis and referenced case law (no empirical sample provided in abstract).