Evidence (7953 claims)

Claim counts by topic (claims may carry multiple topic tags, so counts overlap):

- Adoption: 5539 claims
- Productivity: 4793 claims
- Governance: 4333 claims
- Human-AI Collaboration: 3326 claims
- Labor Markets: 2657 claims
- Innovation: 2510 claims
- Org Design: 2469 claims
- Skills & Training: 2017 claims
- Inequality: 1378 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
The paper constructs a multidimensional digitalization index composed of digital infrastructure, digital service capacity, and the digital development environment.
Index construction described in data/methods: composite indicator combining measures of connectivity/broadband (infrastructure), e-commerce/digital finance (service capacity), and policy/institutional/human capital indicators (development environment).
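The three-dimension construction described above can be sketched as a composite indicator. This is a minimal illustration, assuming min-max normalization and equal weights within and across dimensions; the paper may instead use entropy or PCA weighting, and the subindicator values below are hypothetical.

```python
# Sketch of a composite digitalization index: min-max normalize each
# subindicator across regions, average within each of the three
# dimensions, then average the dimension scores. Equal weights are an
# illustrative assumption, not the paper's stated weighting scheme.

def minmax(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def composite_index(regions):
    """regions: dict name -> {dimension: [raw subindicator values]}."""
    names = list(regions)
    dims = list(next(iter(regions.values())))
    scores = {n: [] for n in names}
    for d in dims:
        n_sub = len(regions[names[0]][d])
        dim_scores = {n: [] for n in names}
        for j in range(n_sub):
            # Normalize the j-th subindicator of dimension d across regions
            col = minmax([regions[n][d][j] for n in names])
            for n, v in zip(names, col):
                dim_scores[n].append(v)
        for n in names:
            scores[n].append(sum(dim_scores[n]) / n_sub)
    return {n: sum(s) / len(dims) for n, s in scores.items()}

regions = {
    "A": {"infrastructure": [80, 0.9], "service": [120], "environment": [3]},
    "B": {"infrastructure": [40, 0.5], "service": [60],  "environment": [5]},
}
idx = composite_index(regions)
```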
The study is observational (panel) and subject to limitations: residual confounding is possible; two-way fixed-effects estimators can be biased with heterogeneous treatment timing or dynamics; external validity beyond China and non-grain crops is not established.
Authors' stated limitations and caveats in the paper regarding identification and generalizability of results from the CLDS 2014–2018 observational panel.
The study uses two-way fixed-effects (household and year) models as the primary identification strategy and employs propensity score matching (PSM) as a robustness check.
Methods section of the paper describing estimation strategy applied to the CLDS 2014–2018 panel of grain-producing households.
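The primary strategy can be illustrated with a toy estimator. On a balanced panel, subtracting household means and year means and adding back the grand mean removes both sets of fixed effects exactly, and OLS on the transformed data recovers the slope. The data below are synthetic, and the PSM robustness check is not sketched here.

```python
# Minimal two-way fixed-effects (household + year) slope estimator for a
# balanced panel via double demeaning. Illustrative sketch only; applied
# work would use a panel package with clustered standard errors.

def twfe_slope(panel):
    """panel: list of (household, year, x, y) tuples, balanced."""
    def mean(vals):
        return sum(vals) / len(vals)

    households = {p[0] for p in panel}
    years = {p[1] for p in panel}
    gx = mean([p[2] for p in panel])
    gy = mean([p[3] for p in panel])
    hx = {h: mean([p[2] for p in panel if p[0] == h]) for h in households}
    hy = {h: mean([p[3] for p in panel if p[0] == h]) for h in households}
    tx = {t: mean([p[2] for p in panel if p[1] == t]) for t in years}
    ty = {t: mean([p[3] for p in panel if p[1] == t]) for t in years}
    num = den = 0.0
    for h, t, x, y in panel:
        x_t = x - hx[h] - tx[t] + gx  # within-transformed regressor
        y_t = y - hy[h] - ty[t] + gy  # within-transformed outcome
        num += x_t * y_t
        den += x_t * x_t
    return num / den

# Synthetic balanced panel: y = 2*x + household effect + year effect
panel = [(h, t, (h + t) % 3, 2 * ((h + t) % 3) + 10 * h + 3 * (t - 2014))
         for h in (1, 2, 3) for t in (2014, 2016, 2018)]
beta = twfe_slope(panel)  # recovers the true slope of 2
```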
The regional average minimum cost of salaried labor (MCSL) was 43.1% of GDP per worker in 2023.
Computed for the same 19-country sample (baseline 2023) using country statutory employer obligations and reporting MCSL relative to GDP per worker following the updated IDB approach.
The regional average non-wage cost of salaried labor (NWC) in Latin America and the Caribbean was 51.1% of formal wages in 2023.
Calculated for a sample of 19 Latin American and Caribbean countries for baseline year 2023 by compiling country-specific statutory employer obligations (payroll taxes, social contributions, mandated benefits, severance, etc.) and expressing employer non-wage costs relative to formal wages using the updated IDB methodology.
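The arithmetic behind the two indicators can be sketched with hypothetical numbers (not the report's country data). NWC expresses employer non-wage costs as a share of the formal wage; MCSL expresses the full statutory cost of a minimum-wage worker relative to GDP per worker. Whether the IDB methodology applies the NWC share to the minimum wage in exactly this way is an assumption of this sketch.

```python
# Illustrative labor-cost indicator arithmetic; all figures are made up.
formal_wage = 1000.0            # monthly formal wage, illustrative units
payroll_taxes = 180.0           # statutory employer obligations (hypothetical)
social_contributions = 220.0
mandated_benefits = 111.0       # bonuses, vacation, severance accrual, etc.

non_wage_costs = payroll_taxes + social_contributions + mandated_benefits
nwc_share = non_wage_costs / formal_wage         # NWC as share of formal wage

minimum_wage = 700.0
gdp_per_worker = 2600.0                          # same period, same units
mcsl = minimum_wage * (1 + nwc_share)            # wage plus statutory add-ons
mcsl_share = mcsl / gdp_per_worker               # MCSL relative to GDP/worker
```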
Attributing productivity changes specifically to AI requires causal identification beyond VIS accounting (e.g., experiments, instrumental variables, difference-in-differences).
Paper notes that VIS is an accounting framework and that causal attribution to AI requires econometric/experimental methods beyond input–output accounting.
The method uses BEA for industry output and industry-by-industry transactions, BLS for employment and hours worked, and IMPLAN for detailed input–output structure and sector mapping; coverage period is 2014–2023.
Explicit data sources and time coverage stated: public BEA, BLS, and IMPLAN annual data 2014–2023 used to construct input–output matrices and labor measures.
Limitations of the review include the small sample of studies, uneven geographic coverage, heterogeneity in methods across studies, and limited long‑run evidence (especially on generative AI), which complicate causal aggregation.
Author-reported limitations based on the meta-assessment of the 17 included studies (variation in methods, contexts, and time horizons).
Design of this work: a systematic literature review and meta‑synthesis of empirical findings from peer‑reviewed journals (2020–2025), based on 17 publications.
Stated methods and inclusion criteria of the paper: systematic review of peer‑reviewed literature (sample = 17).
Long-term evidence on generative AI’s structural labor‑market effects is scarce; few longitudinal studies exist.
Assessment of study horizons and methods among the 17 papers indicates limited long-run and longitudinal analyses specifically on generative AI impacts.
Empirical coverage is limited for low‑income countries; evidence from such settings is scarce.
Geographic distribution of the 17 reviewed studies shows concentration in advanced economies with few or no studies focused on low-income countries.
The literature shows a surge in research activity on AI and labor markets in 2023–2025 and a concentration of studies in advanced economies.
Meta-analytic summary of the publication years and geographic focus among the 17 selected publications (temporal and geographic count of included studies).
Results depend on accurate skill extraction from vacancy texts and valid measures of occupational exposure/complementarity; causal interpretation of diffusion effects may be limited by endogeneity (e.g., technology adoption responding to labor-market conditions).
Authors' stated methodological limitations: reliance on text-analysis identification of skills and on constructed measures of exposure/complementarity; acknowledgement of endogeneity concerns limiting causal claims.
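The text-analysis step the authors flag can be illustrated with a toy dictionary-based extractor: skills are found by matching lexicon phrases in vacancy text. The lexicon, tags, and matching rule below are illustrative, not the paper's pipeline, and the brittleness of such matching is exactly the limitation being acknowledged.

```python
import re

# Hypothetical skill lexicon mapping phrases to coarse skill tags.
SKILL_LEXICON = {
    "machine learning": "AI",
    "prompt engineering": "AI",
    "data analysis": "analytical",
    "negotiation": "social",
}

def extract_skills(vacancy_text):
    """Return lexicon skills found as whole phrases in the vacancy text."""
    text = vacancy_text.lower()
    return {skill: tag for skill, tag in SKILL_LEXICON.items()
            if re.search(r"\b" + re.escape(skill) + r"\b", text)}

ad = "Seeking analyst with data analysis and machine learning experience."
found = extract_skills(ad)
```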
The paper proposes two conceptual models (AI/ML‑Driven Labor Market Transformation Model and Sectoral Impact and Resilience Model) to organize heterogeneous findings and generate testable hypotheses about how AI reshapes labor across sectors and skill levels.
Conceptual synthesis integrating Technological Determinism, Socio‑Technical Systems Theory (STS), and Skill‑Biased Technological Change (SBTC); the models are theoretical outputs of the review used to map mechanisms and heterogeneity rather than empirical findings.
There are substantial measurement and identification gaps in the literature: heterogeneity in measuring 'AI adoption', limited long‑run causal evidence, and geographic bias toward advanced economies.
Methodological assessment within the review noting variability across studies in AI measures (patents, investment, task exposure proxies), paucity of long‑run causal designs, and concentration of empirical studies in advanced economies; this is a meta‑evidence limitation statement.
The Iceberg Index indicates where capability exists but does not indicate whether or when job losses will occur.
Explicit caution in the paper noting the distinction between technical exposure (capability overlap) and realized labor-market outcomes; methodological limitation described.
The Iceberg Index captures capability overlap but does not capture firm adoption choices, regulatory constraints, social acceptance, complementarity effects, or worker reallocation dynamics.
Limitations section in the paper explicitly listing these omitted factors; methodological boundaries of the Iceberg Index stated.
Model and simulations are implemented with the AgentTorch framework.
Implementation note in the paper indicating AgentTorch was used to build the agent-based models and run simulations.
The simulation model represents 151 million U.S. workers as autonomous agents, covers 32,000+ distinct skills, links agents to thousands of AI tools, and provides county-level resolution (~3,000 U.S. counties).
Model specification described in the paper: large-population agent-based model (AgentTorch) parameterized with occupation, skills portfolios, wages, and county locations; counts provided in the paper.
The Iceberg Index is a skills-centered metric that measures the wage value of specific skills AI systems can perform within each occupation; it quantifies technical exposure (capability overlap), not displacement, adoption timelines, or realized outcomes.
Methodological definition: mapping of ~32,000 skills to occupations with wage-value contributions, summing wages of skills that current AI capabilities cover to compute the index.
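The index logic as defined can be sketched directly: for each occupation, sum the wage value of skills current AI can perform, relative to the occupation's total wage value. The skill lists, wage-value shares, and AI coverage set below are hypothetical stand-ins for the paper's ~32,000-skill mapping.

```python
# Hypothetical AI capability set and occupation skill portfolios with
# wage-value shares; illustrative only, not the paper's data.
ai_capable = {"summarize text", "draft code", "classify documents"}

occupations = {
    "paralegal": {"summarize text": 0.35, "client interaction": 0.40,
                  "classify documents": 0.25},
    "plumber":   {"pipe fitting": 0.7, "client interaction": 0.3},
}

def iceberg_index(skill_wage_shares):
    """Share of wage value in AI-coverable skills (technical exposure,
    not displacement or adoption)."""
    total = sum(skill_wage_shares.values())
    covered = sum(v for s, v in skill_wage_shares.items() if s in ai_capable)
    return covered / total

exposure = {occ: iceberg_index(s) for occ, s in occupations.items()}
```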
The study maps employment channels for AI-competent graduates and documents the most frequent job titles/roles and associated wage levels.
Descriptive analysis of employer channels, occupational role frequencies, and wage data compiled in the monitoring dataset covering graduates and alternative-route entrants.
Quasi-experimental designs (difference-in-differences, instrumental variables, event studies) and panel regressions are useful methods for identifying causal effects of AI adoption where plausibly exogenous variation exists.
Methodological summary in the paper listing common empirical strategies used in the literature to estimate causal impacts of technology adoption.
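The simplest of these designs, a 2x2 difference-in-differences, compares the pre/post change for adopters against the pre/post change for non-adopters, netting out common time trends. The numbers below are illustrative, not from the paper.

```python
# Minimal 2x2 difference-in-differences estimator on group means.
def did_estimate(y_treat_pre, y_treat_post, y_ctrl_pre, y_ctrl_post):
    def mean(v):
        return sum(v) / len(v)
    # (change among adopters) minus (change among non-adopters)
    return ((mean(y_treat_post) - mean(y_treat_pre))
            - (mean(y_ctrl_post) - mean(y_ctrl_pre)))

# Hypothetical firm productivity before/after AI adoption
adopters_pre, adopters_post = [100, 102, 98], [112, 115, 110]
controls_pre, controls_post = [99, 101, 100], [103, 105, 104]
effect = did_estimate(adopters_pre, adopters_post, controls_pre, controls_post)
```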
Current research is limited by measurement challenges in capturing AI capabilities and firm-level adoption, and by a lack of longitudinal worker-firm data and causal identification in many settings.
Explicit limitations noted by the paper: gaps in task measures, scarce longitudinal linked datasets, and methodological challenges in causal inference.
This paper's approach is qualitative and based on secondary literature synthesis; it does not collect primary survey, experimental, or administrative data.
Explicit statement in the Data & Methods section of the paper.
Key empirical gaps remain: better measurement of K_T (AI/software capital), more granular matched employer‑employee and wealth data, and improved estimates of task-substitution elasticities are required to precisely quantify incidence and policy impacts.
Authors’ stated research agenda and limitations section, including sensitivity analyses showing outcome variation with parameter choices and measurement uncertainty.
Simulated teachers: for each LLM, we ran multiple independent runs treated as simulated teachers (typically around 30–40 per model in the Baseline Experiment and around 20–30 per condition × training-group cell in the Scaffolding Intervention Experiment); the conversation context was reset between teachers and preserved across trials within a teacher; LLMs were not given feedback about the outcome of their teaching to prevent learning during the task.
Methods section 'Simulated teachers' and prompt/instruction descriptions in Methods; sampling details provided in Methods 2.3.
Models are prompted to assess profiles along dimensions of social acceptance, marital stability, and cultural compatibility.
Experimental procedure: prompts asked models to rate profiles on the three named dimensions.
We evaluate five LLM families (GPT, Gemini, Llama, Qwen, and BharatGPT).
Methods: models enumerated as the LLM families evaluated in the audit.
We vary caste identity across Brahmin, Kshatriya, Vaishya, Shudra, and Dalit, and income across five buckets.
Experimental design described: caste identity explicitly manipulated across five named caste categories; income varied across five buckets.
We conduct a controlled audit of caste bias in LLM-mediated matchmaking evaluations using real-world matrimonial profiles.
Described methodology in the paper: a controlled audit using real-world matrimonial profiles to probe LLMs for caste bias.
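The audit's factorial manipulation can be sketched as crossing caste identity with income bucket while holding the rest of a profile fixed. The prompt template, base profile, and income bucket labels below are illustrative assumptions; only the five caste categories and the three rating dimensions come from the study description.

```python
from itertools import product

CASTES = ["Brahmin", "Kshatriya", "Vaishya", "Shudra", "Dalit"]
INCOME_BUCKETS = ["<5 LPA", "5-10 LPA", "10-20 LPA", "20-40 LPA", ">40 LPA"]

# Hypothetical prompt template; the fixed profile details are made up.
TEMPLATE = ("Profile: 29-year-old software engineer, caste: {caste}, "
            "annual income: {income}. Rate this match on social acceptance, "
            "marital stability, and cultural compatibility (1-10 each).")

prompts = [TEMPLATE.format(caste=c, income=i)
           for c, i in product(CASTES, INCOME_BUCKETS)]
# 5 castes x 5 income buckets = 25 variants per base profile
```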
Policy design to align high-tech industrial development with carbon-reduction goals should account for industrial life-cycle stages and value-chain positions.
Policy implication drawn from the empirical findings (inverted U-shape, stage-dependent mechanism, regional heterogeneity, and subsector differences) in the paper.
The modern labor market needs specialists in the teaching professions.
Interpretation by the authors based on the counts of current vacancies and much larger pool of potential positions from the Unified Register of Vacancies and Ministry of Education data.
Employers mostly give preference to teachers who are ready to work in institutions of professional (vocational and technical) and specialized pre-higher education, as well as in private training centers.
Analysis of vacancy postings and/or comparison of vacancy counts across institution types using the Unified Register of Vacancies and Ministry of Education data (paper reports this pattern of employer preference).
The number of potential teacher positions (not yet vacant but likely to become so) is much larger, at over 140,000.
Estimates of potential teacher positions taken from data of the Ministry of Education and Science of Ukraine (administrative data reported in the paper).
The total number of current vacancies for teachers in the specified educational institutions is over 1,000.
Count of vacant teacher positions from the Unified Register of Vacancies of the State Employment Service (administrative register analysis reported in the paper).
Repositioning informal systems as co-creators in urban governance (relational public administration) enables transformative governance and effective localization of SDGs in sustainable cities in South Africa.
Conceptual/analytical argumentation (theoretical paper; no empirical sample reported).
Determinants that significantly increase the likelihood of participation in small-scale livestock production in Malawi include household size, access to credit, access to extension services, landholding size, distance to the market, and location in the Northern region.
Cross-sectional analysis of IHS5 (sample = 8,795 households); determinants identified as significant in the analysis.
Households engaged in small-scale livestock production in Malawi earned, on average, an additional MWK 36,405.76 compared to non-producing households.
Cross-sectional analysis of the Fifth Integrated Household Survey (IHS5) with a sample of 8,795 households.
Individuals in Thohoyandou used traditional healing practices (e.g., steam inhalation with stones and salt; herbal concoctions including various named plants and mixtures) to survive COVID-19 without hospitalization, underscoring the significance of traditional healing practices during the pandemic.
Narrative inquiry based on in-depth interviews with three respondents (sample size = 3).
Teacher unions function as a counter-hegemonic force challenging neoliberal geopolitics and political norms and are repositioning as intellectual activists rather than compliant officials.
Qualitative interpretivist analysis of narrative interviews with unionized educators and public union discussions (no sample size reported).
Digitalization significantly enhances market access and supplier diversity for SMMEs.
Qualitative secondary data thematic analysis (literature/reports/industry initiatives; no sample size reported).
Indigenous Knowledge Systems (IKS) represent a dynamic body of wisdom encompassing sustainable agriculture, natural resource management, and community resilience, and offer proven, contextually grounded solutions to modern challenges like climate change and food insecurity.
Qualitative desktop research synthesizing existing literature (literature review; no sample size reported).
Overall, most LLMs achieve high Teaching Scores and are best fit by the Bayes Optimal Teacher, suggesting model-based (mentalizing) teaching strategies rather than model-free heuristics.
Synthesis of Baseline Experiment performance (high Teaching Scores, BOT BIC fits) and cognitive model comparisons shown in Figures 2–4.
In scaffolding conditions LLMs reliably executed the auxiliary selection step: under Reward Scaffolding they preferentially selected edges ranked highest by reward, and under Inference Scaffolding they preferentially selected edges ranked as more likely unknown to the learner.
Scaffolding Intervention Experiment; auxiliary-step selection probabilities plotted in Figure 5 showing edge-rank dependent selection patterns for each LLM and humans.
Bayes Optimal Teacher (BOT) provides the best overall account (via BIC) for the trial-by-trial teaching choices of most LLM models.
Cognitive model fitting and BIC model comparison applied to each simulated teacher's trial-by-trial choices; results shown in Figure 4 (BIC scores and fraction of simulated teachers best fit by each model).
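The comparison criterion is standard: BIC = k ln(n) - 2 ln(L), with the lowest BIC preferred. The log-likelihoods and parameter counts below are made up for illustration; only the comparison logic mirrors the paper's procedure.

```python
import math

def bic(log_likelihood, n_params, n_trials):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_trials) - 2 * log_likelihood

n_trials = 40  # hypothetical number of teaching trials for one teacher
candidates = {
    "Bayes Optimal Teacher": bic(log_likelihood=-45.0, n_params=2,
                                 n_trials=n_trials),
    "model-free heuristic":  bic(log_likelihood=-52.0, n_params=1,
                                 n_trials=n_trials),
}
best = min(candidates, key=candidates.get)  # lowest BIC wins
```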
Most LLM teachers were concentrated in the higher-performing range, close to the Bayes Optimal Teacher benchmark, overlapping with higher-performing human subjects; some models (notably GPT-o4-mini, Gemini 2.5 Flash, Claude Sonnet 4.5) showed more variability.
Distribution of individual-level average Teaching Scores in Baseline Experiment (Figure 3) comparing LLMs to humans and to cognitive-model benchmark scores.
Most LLMs showed strong alignment with humans in graph-by-graph performance: seven models had large positive Pearson correlations with human performance (r ≈ 0.76–0.89, all p < 10^-4), and two additional models showed moderate correlations (r ≈ 0.46–0.56, p < .05); GPT-3.5 and Llama-4 Maverick were not significantly correlated with humans.
Baseline Experiment; graph-wise mean Teaching Score computed for 20 unique graphs; Pearson correlations between each model's graph-wise profile and human profile reported in Figure 2 with r and p-values.
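The alignment check reduces to a Pearson correlation between two 20-element profiles: a model's mean Teaching Score per graph and the human mean per graph. The per-graph scores below are synthetic, constructed so the "model" tracks the "humans" with small noise.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(0)
human = [random.uniform(0.4, 0.9) for _ in range(20)]   # 20 graphs
model = [h + random.gauss(0, 0.05) for h in human]      # well-aligned model
r = pearson_r(model, human)
```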
Prompts can be treated as decision policies that allocate discretion between researcher and system, governing what is executed and when iteration stops.
Methodological framing advanced by the authors describing prompts as decision policies; conceptual claim based on the paper's analytic framework rather than empirical measurement.
Operational-constraint and decision-rule prompts deliver large and stable footprint reductions while preserving decision-equivalent topic outputs.
Experimental comparisons of prompt strategies in the benchmarked workflow showing reductions in runtime/CO2e and evaluated topic outputs' decision-equivalence (asserted in abstract; no numeric reductions or sample sizes provided).
We benchmark a modern economic survey workflow (an LDA-based literature mapping implemented with GenAI-assisted coding and executed in a fixed cloud notebook), measuring runtime and estimated CO2e with CodeCarbon.
Experimental benchmark described in the paper: single implemented workflow (LDA-based literature mapping) executed in a fixed cloud notebook with runtime and CO2e measured using CodeCarbon (methodological claim).
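What a tracker like CodeCarbon estimates can be approximated with back-of-envelope arithmetic: energy equals power draw times runtime, and emissions equal energy times grid carbon intensity. The power and intensity figures below are illustrative defaults, not values from the paper or from CodeCarbon itself.

```python
# Back-of-envelope CO2e estimate for a compute job; all inputs made up.
def co2e_kg(runtime_s, avg_power_w, grid_intensity_kg_per_kwh):
    """Emissions = (power in kW) x (runtime in hours) x grid intensity."""
    energy_kwh = (avg_power_w / 1000.0) * (runtime_s / 3600.0)
    return energy_kwh * grid_intensity_kg_per_kwh

# e.g. a 30-minute LDA run on a ~120 W cloud VM at 0.4 kgCO2e/kWh
emissions = co2e_kg(runtime_s=1800, avg_power_w=120,
                    grid_intensity_kg_per_kwh=0.4)
```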