Evidence (13870 claims)
Adoption: 8467 claims
Productivity: 7558 claims
Governance: 6805 claims
Human-AI Collaboration: 6363 claims
Org Design: 4132 claims
Innovation: 4065 claims
Labor Markets: 3526 claims
Skills & Training: 2945 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 749 | 196 | 98 | 892 | 1984 |
| Governance & Regulation | 817 | 394 | 188 | 121 | 1544 |
| Organizational Efficiency | 771 | 189 | 124 | 83 | 1177 |
| Technology Adoption Rate | 627 | 233 | 123 | 96 | 1088 |
| Research Productivity | 411 | 123 | 56 | 332 | 933 |
| Output Quality | 467 | 178 | 59 | 47 | 751 |
| Decision Quality | 320 | 174 | 75 | 42 | 618 |
| Firm Productivity | 435 | 55 | 88 | 20 | 604 |
| AI Safety & Ethics | 214 | 276 | 65 | 33 | 593 |
| Market Structure | 178 | 167 | 122 | 24 | 496 |
| Task Allocation | 207 | 64 | 71 | 32 | 379 |
| Skill Acquisition | 165 | 59 | 60 | 17 | 301 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 52 | 107 | 13 | 279 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 116 | 63 | 42 | 11 | 232 |
| Firm Revenue | 150 | 48 | 26 | 3 | 227 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Task Completion Time | 169 | 29 | 8 | 12 | 219 |
| Worker Satisfaction | 89 | 63 | 20 | 12 | 184 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 76 | 68 | 14 | 5 | 163 |
| Training Effectiveness | 93 | 21 | 13 | 19 | 148 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Automation Exposure | 51 | 54 | 22 | 12 | 142 |
| Team Performance | 86 | 17 | 27 | 9 | 140 |
| Developer Productivity | 94 | 17 | 14 | 6 | 132 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 51 | 7 | 8 | 3 | 69 |
| Creative Output | 31 | 17 | 7 | 3 | 59 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 17 | 17 | — | 51 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Distinguishing between base models and fine-tuned systems is important for researchers using LLMs to study cultural patterns, because fine-tuning and alignment can change the behaviors relevant to behavioral research.
Analytical distinction and methodological guidance in the paper; claim grounded in conceptual reasoning about model development workflows rather than a specific experimental demonstration in the excerpt.
Contemporary artificial intelligence research has been organized around two dominant ambitions: productivity (treating AI systems as tools for accelerating work and economic output) and alignment (ensuring increasingly capable systems behave safely and in accordance with human values).
Literature synthesis and conceptual framing within the paper (review of prevailing research agendas and priorities in AI literature). No original empirical sample or experiment reported for this claim in the provided text.
This study analyzes comments and statements from party members in OECD countries from 2016 to 2025 through content analysis, examining media interviews, speeches, and debates.
Description of the study's data and method: content analysis of party member comments and statements drawn from media interviews, speeches, and debates across OECD countries over the 2016–2025 period (sample size and selection details not reported in the excerpt).
The study contributes to the literature by integrating evidence across higher education, vocational training, and lifelong learning to emphasize the need for balanced policy approaches to skill formation.
Stated contribution in the paper: cross-pathway synthesis of existing empirical evidence and secondary data (methods described as comparative synthesis; no primary empirical contribution reported in the summary).
The study uses secondary data and comparative evidence from prior empirical studies to analyze relationships between higher education, vocational education, and lifelong learning.
Stated methodology in the paper: analysis of secondary data and synthesis of prior empirical/comparative studies (no primary data collection; no sample sizes reported).
This study analyzed survey data from 466 Chinese food delivery riders using structural equation modeling and bootstrapping procedures, modeling work pressure as a mediator and perceived autonomy as a moderator.
Statement in abstract describing sample size (466 Chinese food delivery riders) and analytic approach (SEM and bootstrapping) and modeled variables (work pressure mediator, perceived autonomy moderator).
The proposed framework draws on leadership theory, emotional intelligence research, and AI ethics.
Methodological/design statement in the paper describing its intellectual grounding; indicates literature-based synthesis rather than primary data collection.
The study uses topic modeling on a corpus of over 4,600 academic papers to identify the dominant themes in the economics of AI literature.
Unsupervised topic modeling applied to a compiled corpus of >4,600 papers (authors' described methodology and sample size).
The paper explores risk frameworks, ethical constraints, and policy imperatives related to AI.
Descriptive claim about the paper's analytic content (thematic/policy analysis); no empirical details or measurement approach are given in the abstract.
This paper investigates societal applications of AI across domains such as healthcare, education, accessibility, environmental management, emergency response, and civic administration.
Descriptive statement of the paper's scope and methods (literature review / cross-domain analysis implied); the abstract lists the domains but does not specify empirical procedures or sample sizes.
Chatbot suggestions were artificially varied in aggregate accuracy across treatment conditions from low (53%) to high (100%).
Paper describes experimental manipulation of chatbot suggestion accuracy with aggregate accuracies ranging from 53% to 100%; manipulation method (how suggestions were generated or sampled) described in methods (not fully detailed in excerpt).
Caseworkers in the control condition (no chatbot suggestions) had a mean accuracy of 49%.
Reported experimental outcome: mean accuracy for control group = 49%; based on the randomized experiment using the 770-question benchmark.
We conducted a randomized experiment with caseworkers recruited from nonprofit outreach organizations in Los Angeles.
Paper describes a randomized experiment recruiting caseworkers from nonprofit outreach organizations in Los Angeles; sample size and recruitment details not given in the excerpt.
The benchmark questions have corresponding expert-verified answers.
Paper states benchmark questions have expert-verified answers; verification method and number/credentials of experts not specified in the excerpt.
We created a 770-question multiple-choice benchmark dataset of difficult, but realistic questions that a caseworker might receive.
Paper reports creation of a benchmark dataset containing 770 multiple-choice questions described as difficult and realistic; dataset construction is described in methods (no example questions or external validation details are provided in the excerpt).
The study's conclusions draw on three complementary evidence bases: (a) task-level evidence on what generative AI can already do in practice; (b) occupational exposure and complementarity analysis using Philippine labor force data; and (c) firm- and worker-level evidence on AI adoption.
Description of methods and data sources in the paper: task-level capability testing/assessment, analysis of national labor force/occupation data for exposure/complementarity, and firm/worker surveys or qualitative adoption evidence.
There is a need for more longitudinal and cross-country studies to better understand the long-term value creation of ERM in MSMEs.
Authors' conclusion and identified research gaps based on the scope and limitations of the existing literature reviewed (i.e., predominance of cross-sectional or single-country studies).
Extensive experiments were conducted using both synthetic and real hospital datasets to evaluate the framework.
Statement in the paper indicating experiments on synthetic and real datasets; exact sizes, sources, and composition of these datasets are not provided in the excerpt.
The paper explains the main legal frameworks that currently regulate AI in India, as well as proposals for future legislation.
Author's legal and policy analysis / document review of existing statutes and proposed laws (qualitative review). No quantitative sample size; based on review of legal texts and policy proposals cited in the article.
DDDM was quantified using AI language models, specifically BERT and ChatGLM2-6B.
Methodological description in the paper stating that BERT and ChatGLM2-6B were leveraged to quantify the extent of DDDM (implementation details, training/data specifics, and sample not provided in the excerpt).
The appropriate methodological response is a “macro approach” that (1) directly models the equilibrium behavior of large employers, (2) estimates the model by combining macro data with empirical estimates of employers’ responses (from the micro approach), and (3) uses the model to compute the aggregate costs of monopsony and optimal policies.
Methodological proposal set out by the paper; this is a description of the authors' recommended empirical/theoretical strategy rather than an empirical finding. The excerpt contains no implementation details, datasets, or estimation results.
The traditional theoretical and empirical “micro approach” to studying labor market power requires that firms are small and atomistic.
Conceptual/theoretical characterization of the micro approach stated by the paper; no empirical sample, dataset, or formal model provided in the excerpt.
The machine-learning based analytical approach used in the study captures complex, nonlinear relationships among emotional, psychological and economic variables.
Methodological claim: authors used machine learning (including ensembles) to model nonlinear and complex relationships. The excerpt does not provide algorithmic details, tuning, validation strategy, or sample size.
Work environment and digital/AI intensity were incorporated as contextual moderators in the analysis to reflect contemporary labor market conditions.
Methodological description in the excerpt states these variables were included as moderators; no details on measurement, operationalization, or sample size are provided.
Most evidence came from retrospective studies or meta-analyses, with limited prospective or randomized controlled trials.
Summary of study designs across the 40 included studies as reported in the review.
The impact of AI on patient outcomes (e.g., mortality, rebleeding) was rarely addressed.
Statement in results indicating few included studies reported patient-centered outcomes such as mortality or rebleeding.
This systematic review adhered to PRISMA 2020 guidelines.
Methods statement in the paper specifying adherence to PRISMA 2020; the review included 40 studies.
Coordination is treated as a structural property of the coupled dynamics (agents + incentives + persistent environment) rather than as the solution to a centralized global optimization objective or purely agent-centric learning problem.
Conceptual framing supported by the formal dynamical model and theorems showing properties of the closed-loop dynamics that do not rely on an underlying global objective.
The persistent environment component of the model stores accumulated coordination signals, and a distributed incentive field transmits those signals locally to adaptive agents, which update their states in response.
Model construction and definitions in the paper describing (i) an environmental state variable with persistent dynamics that accumulates signals, (ii) a spatially/distributed incentive field mapping environmental memory to local agent inputs, and (iii) adaptive update rules for agents.
The paper formalizes agents, incentives, and the environment as a recursively closed feedback architecture (i.e., a coupled dynamical system in which agents adapt to incentive signals that themselves depend on a persistent environmental memory produced by agent actions).
Mathematical model and definitions presented in the paper (formal system specification of agent states, incentive field, and persistent environment; no empirical data).
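The closed loop described in the three claims above (persistent environment, local incentive field, adaptive agents) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions: the ring topology, the exponential-decay memory, and the linear update rule are all hypothetical choices made for illustration and are not the paper's formal model or its theorems.

```python
import random


def simulate(n_agents=10, steps=50, decay=0.9, rate=0.2, seed=0):
    """Toy agents + incentives + persistent environment loop (all rules are assumed)."""
    rng = random.Random(seed)
    states = [rng.random() for _ in range(n_agents)]  # agent states in [0, 1)
    memory = [0.0] * n_agents                         # persistent environment, one cell per site
    for _ in range(steps):
        # (i) The environment accumulates signals from agent actions and retains
        #     a share `decay` of what it already stored.
        memory = [decay * m + (1 - decay) * s for m, s in zip(memory, states)]
        # (ii) The incentive field passes memory on locally: each agent sees only
        #      its own cell and its two neighbours on a ring.
        incentives = [
            (memory[i - 1] + memory[i] + memory[(i + 1) % n_agents]) / 3
            for i in range(n_agents)
        ]
        # (iii) Agents adapt their states toward the local incentive signal.
        states = [s + rate * (g - s) for s, g in zip(states, incentives)]
    return states, memory


final_states, final_memory = simulate()
print(final_states)
```

No global objective appears anywhere in the loop; whatever pattern emerges comes only from the coupling between the three components, which is the structural point the claims make.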
The review focuses on AI applications within small‑scale business environments, with a special focus on women‑owned micro firms in Jaipur, India.
Scope and aim articulated in the paper; geographic and demographic focus explicitly stated by the authors.
The systematic review follows PRISMA 2020 guidelines.
Methodological statement in the paper indicating adherence to PRISMA 2020 for the review process.
After screening and eligibility filtering, 55 open‑access journal articles were included for in‑depth analysis.
PRISMA‑guided screening and eligibility process reported in the review; final included sample explicitly stated as 55 open‑access journal articles.
A Scopus search identified 265 records using keywords related to women’s entrepreneurship and AI.
Systematic literature search reported in the paper following PRISMA 2020; search executed in Scopus with specified keywords; initial yield stated as 265 records.
This research examined three countries (China, the United States, and Germany) using panel vector autoregressive (panel VAR) and difference-in-differences (DID) methods to assess how technology and public policy interventions affect emissions reductions.
Study design reported in the paper: sample of three countries (China, US, Germany) and application of panel VAR and DID methods; specific time period and sample size not provided in the summary.
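For reference, the canonical two-group, two-period DID estimator is shown below. The excerpt does not report the study's actual specification, so the notation is an assumption: $\bar{Y}$ denotes mean emissions, $T$ and $C$ the treated and comparison groups, and pre/post the periods before and after the intervention.

```latex
\hat{\delta}_{\mathrm{DID}}
  = \left( \bar{Y}_{T,\mathrm{post}} - \bar{Y}_{T,\mathrm{pre}} \right)
  - \left( \bar{Y}_{C,\mathrm{post}} - \bar{Y}_{C,\mathrm{pre}} \right)
```

The estimate is interpretable as a policy effect only if the two groups would have followed parallel trends in the absence of the intervention.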
Social assistance (SA) is defined here as noncontributory social transfers (including cash, vouchers, or in-kind transfers to families or individuals, including the elderly), public works programs, fee waivers, and subsidies.
Explicit definitional statement in the introduction (authors' operational definition for the chapter).
This chapter focuses on low- and middle-income countries (LMICs) and uses a 'review of reviews' approach to summarize the policy discourse and evidence on social protection and gender in adulthood, concentrating on social assistance, social care, and social insurance.
Methodological and scope statement explicitly given in the introduction (author-declared approach and focus).
This study draws on a critical AI media literacy framework to analyze user-generated discussions in the two largest higher education subreddits on Reddit.com.
Author-reported study design: application of a critical AI media literacy theoretical framework to a qualitative dataset consisting of user-generated discussions from the two largest higher-education subreddits. (Sample size/number of posts/threads not specified in the provided excerpt.)
The study used a mixed-methods design incorporating surveys from 150 LEP immigrants, interviews with 50 employers, and interviews with 20 translation service providers in various linguistically diverse U.S. cities, with quantitative analysis performed in SPSS Version 28 and qualitative thematic coding in NVivo 14.
Reported study design and sample: survey n=150 LEP immigrants; employer interviews n=50; translation provider interviews n=20; analytic software specified as SPSS v28 (quantitative) and NVivo 14 (qualitative).
Viable transition pathways are operationally defined in this study as sharing at least 3 skills and achieving at least 50% skill transfer.
Methodological definition stated in the paper used to determine whether a job-to-job transition is considered viable.
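The viability rule above can be sketched as a simple check. The excerpt does not say how "skill transfer" is computed, so this sketch assumes it is the share of the target job's skills already present in the source job; the function names and the example jobs are hypothetical.

```python
def skill_transfer(source_skills, target_skills):
    """Share of the target job's skills already held in the source job (assumed definition)."""
    target = set(target_skills)
    if not target:
        return 0.0
    return len(set(source_skills) & target) / len(target)


def is_viable_transition(source_skills, target_skills, min_shared=3, min_transfer=0.5):
    """Apply the stated thresholds: at least 3 shared skills and at least 50% transfer."""
    shared = len(set(source_skills) & set(target_skills))
    return shared >= min_shared and skill_transfer(source_skills, target_skills) >= min_transfer


# Hypothetical example jobs, not taken from the Egyptian job-posting dataset.
cashier = {"customer service", "cash handling", "inventory", "pos systems"}
sales_assistant = {"customer service", "inventory", "pos systems", "product knowledge"}

# 3 shared skills and 3 of 4 target skills covered, so both thresholds are met.
print(is_viable_transition(cashier, sales_assistant))
```

Under this assumed definition the measure is directional: a transition from A to B can be viable while the reverse is not, because the denominator is the target job's skill set.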
We identified 4,534 feasible transitions between jobs in the dataset.
Count of feasible job-to-job transition pairs found in the knowledge graph analysis (4,534 transitions reported).
We constructed and validated a knowledge graph of 9,978 Egyptian job postings, 19,766 skill activities, and 84,346 job-skill relationships with a 0.74% error rate.
Empirical construction and validation of a knowledge graph using a dataset of 9,978 job postings, 19,766 distinct skill/activity nodes, and 84,346 job–skill edges; reported overall error rate 0.74% (validation method not detailed in the excerpt).
In a field experiment on the DiagnosUs medical crowdsourcing platform, the authors held the true prevalence of positives (blasts) in the unlabeled stream fixed at 20% while varying two factors: the prevalence of positives in the gold-standard feedback stream (20% vs. 50%) and the response interface (binary labels vs. elicited probabilities).
Field experiment conducted on the DiagnosUs platform with experimental manipulations: (i) true prevalence in unlabeled stream fixed at 20% blasts, (ii) feedback-stream prevalence manipulated to 20% vs 50%, (iii) response interface manipulated between binary labels and elicited probabilities. (Sample size and number of workers not specified in the provided excerpt.)
The study examines 268 Chinese cities from 2010 to 2023 and integrates theoretical analysis with empirical testing to study AI innovation's employment effects.
Study description specifying sample size (268 cities), period (2010–2023), and combined theoretical and empirical approach.
The framework was evaluated on 2,847 queries across 15 task categories.
Paper reports an evaluation dataset consisting of 2,847 queries spanning 15 task categories; used as the sample for reported empirical results.
Non-text processing paths use SLM-assisted modality decomposition.
Paper reports that non-text queries are decomposed using SLM-assisted modality decomposition; described as the non-text routing approach in the framework.
For text-only queries, the framework uses learned routing via RouteLLM.
Paper states text-only routing is handled by a learned model named RouteLLM; presented as part of the system architecture.
A central Supervisor dynamically decomposes user queries, delegates subtasks to modality-appropriate tools (e.g., object detection, OCR, speech transcription), and synthesizes results through adaptive routing strategies rather than predetermined decision trees.
Methodological description in the paper of a Supervisor component that performs dynamic decomposition, delegation to modality-appropriate tools (examples given), and adaptive routing; supported by the framework's implementation details.
We present an agentic AI framework for autonomous multimodal query processing that coordinates specialized tools across text, image, audio, video, and document modalities.
Paper describes the framework design and components (Supervisor, modality-specific tools) and states support for text, image, audio, video, and document modalities; no external benchmark cited for this capability beyond the paper's own implementation.
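The decompose, delegate, and synthesize flow described above can be sketched as follows. Everything here is a hypothetical stand-in: the tool functions are stubs, `route_text` is a word-count heuristic rather than RouteLLM, and the fixed lookup table is used only for brevity, which makes it the kind of static routing the paper says its adaptive strategies replace.

```python
from typing import Callable


# Hypothetical stubs; a real system would call OCR, detection, or speech models here.
def ocr_tool(payload: str) -> str:
    return f"[ocr text from {payload}]"


def detect_objects_tool(payload: str) -> str:
    return f"[objects detected in {payload}]"


def transcribe_tool(payload: str) -> str:
    return f"[transcript of {payload}]"


TOOLS: dict[str, Callable[[str], str]] = {
    "document": ocr_tool,
    "image": detect_objects_tool,
    "audio": transcribe_tool,
}


def route_text(query: str) -> str:
    """Placeholder for learned text routing; this heuristic is NOT RouteLLM."""
    return "strong-model" if len(query.split()) > 12 else "small-model"


def supervise(query: str, attachments: list[tuple[str, str]]) -> dict:
    """Split a query into per-modality subtasks, delegate each, and collect the results."""
    if not attachments:
        # Text-only path: hand the query to the (placeholder) text router.
        return {"route": route_text(query), "evidence": []}
    evidence = []
    for modality, payload in attachments:
        tool = TOOLS.get(modality)
        if tool is None:
            evidence.append(f"[no tool registered for {modality}]")
        else:
            evidence.append(tool(payload))
    return {"route": "multimodal", "evidence": evidence}


print(supervise("What does the sign say?", [("image", "street.jpg")]))
```

A fuller version would pass the collected evidence to a model for synthesis; that step is omitted because the excerpt gives no detail on how the results are combined.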
The study employs an input–output (I–O) modeling framework using IMPLAN 2022 data to estimate direct, indirect, and induced impacts of investments in greenhouse and robotics sectors for Northwest Indiana as part of Project TRAVERSE.
Explicit methodological statement in the paper: use of the IMPLAN 2022 I–O model; geographic scope Northwest Indiana (NWI); linkage to EDA Project TRAVERSE.
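The arithmetic behind direct and indirect I–O impacts can be illustrated with a two-sector Leontief model. This is a minimal sketch with made-up technical coefficients, not IMPLAN data or IMPLAN's method, and it leaves out induced effects, which require extending the model to include household spending.

```python
def leontief_total_output(A, final_demand, rounds=200):
    """Approximate x = (I - A)^-1 f by summing the rounds f + A f + A^2 f + ..."""
    n = len(final_demand)
    total = list(final_demand)    # round 0: the direct effect
    current = list(final_demand)
    for _ in range(rounds):
        # Each round adds the inputs needed to produce the previous round's output.
        current = [sum(A[i][j] * current[j] for j in range(n)) for i in range(n)]
        total = [t + c for t, c in zip(total, current)]
    return total


# Hypothetical coefficients: A[i][j] is the input from sector i needed per
# unit of output in sector j.
A = [[0.2, 0.1],
     [0.3, 0.4]]
investment = [1.0, 0.0]  # one unit of new final demand in sector 0

total = leontief_total_output(A, investment)
indirect = [t - d for t, d in zip(total, investment)]
print("total output by sector:", total)
print("indirect effect by sector:", indirect)
print("output multiplier:", sum(total) / sum(investment))
```

With these illustrative coefficients, one unit of investment raises economy-wide output by about two units: the direct unit plus roughly one unit of indirect supplier activity.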