Evidence (11633 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Digitization is reshaping the structures of Resource Dependence Theory (RDT) instead of eliminating it completely (Yordanova & Hristozov, 2025).

Conceptual/theoretical claim supported by citation to Yordanova & Hristozov (2025); presented as an interpretive conclusion about how digitization interacts with organizational dependence structures. No empirical details provided in the excerpt.

high mixed Re-Evaluation of Resource Dependence in AI Enabled SME Finan... structure of resource dependence / organizational dependence on external resourc...

CLARITI matches GPT-5's resolution rate on underspecified issues while generating 41% fewer questions.

Empirical evaluation comparing CLARITI and GPT-5 on a task set of underspecified software engineering issues; the result reported in the abstract indicates parity in resolution rate and a quantified reduction in questions (41%) but the abstract does not report sample size, test set composition, or statistical significance.

high mixed Asking What Matters: Reward-Driven Clarification for Softwar... resolution rate (task success) and number of clarifying questions generated

They can produce fluent outputs that resemble reflection, but lack temporal continuity, causal feedback, and anchoring in real-world interaction.

Descriptive claim made in the text contrasting surface-level fluency with missing properties; no empirical data or experiments provided.

high mixed Governing Reflective Human-AI Collaboration: A Framework for... fluency vs. temporal_continuity, causal_feedback, real-world_anchoring

A within-subject human study with 20 players and 600 games shows that our interventions significantly improve performance for low- and mid-skill players while matching expert-engine interventions for high-skill players.

Within-subject human experiment reported in the paper: N = 20 players, 600 games total; comparisons of performance under the proposed interventions versus expert-engine interventions.

high mixed Improving Human Performance with Value-Aware Interventions: ... human player performance in chess games (game outcomes / performance metrics) by...

This work establishes a foundation for understanding how generative AI systems not only augment cognitive performance but also reshape self-perception and perceived expertise.

Paper's stated contribution presenting theory and conceptual groundwork; no empirical validation provided in the abstract.

high mixed The LLM Fallacy: Misattribution in AI-Assisted Cognitive Wor... interaction between augmented cognitive performance and changes in self-percepti...

The LLM fallacy has implications for education, hiring, and AI literacy.

Implications and argumentation presented in the paper; these are prospective and conceptual rather than supported by empirical data in the abstract.

high mixed The LLM Fallacy: Misattribution in AI-Assisted Cognitive Wor... impacts on education practices, hiring decisions, and AI literacy needs

Further research is needed to explore the longitudinal impact of these AI deployments on local labor markets and the creation of indigenous datasets that reflect Cameroon’s unique linguistic diversity.

Authors' identified research gaps and recommendations; statement of future research needs rather than empirical result.

high mixed A Framework for Sovereign AI Governance and Economic Growth ... longitudinal impacts on local labor markets and creation/use of indigenous lingu...

The analysis reveals a non-linear, U-shaped relationship between changes in frontier skill intensity and employment growth.

Statistical linkage of changes in frontier skill intensity (OTSS changes) to employment growth using administrative data from 2012–2023; reported functional form is U-shaped.

high mixed AI‐powered skill classification: mapping technology intensit... relationship between changes in frontier skill intensity and employment growth

Frontier technologies remain concentrated in specialised occupations, while digital technologies are widespread.

Distributional analysis of OTSS across occupations showing concentration patterns of frontier technologies versus ubiquity of digital technologies.

high mixed AI‐powered skill classification: mapping technology intensit... distribution/concentration of technology-intense skills across occupations

For the average worker in 2023, manual technologies account for the largest share of skill content (42 per cent), followed by digital (38 per cent) and frontier technologies (20 per cent).

Computed OTSS applied to occupation-level data for Germany in 2023; reported shares for the "average worker".

high mixed AI‐powered skill classification: mapping technology intensit... share of occupational skill content by technology type (manual, digital, frontie...

Removing safety layers made the system less useful: structured validation feedback guided the model to correct outcomes in fewer turns, while the unconstrained system hallucinated success.

Qualitative and quantitative comparisons from the deployed evaluation across the three conditions (observations about turn counts, validation-feedback loops, and model hallucinations in unconstrained condition over the 25 scenario trials).

high mixed Bounded Autonomy for Enterprise AI: Typed Action Contracts a... number of interaction turns to correct outcome; presence of hallucinated success

The results show how non-IID data, competition intensity, and incentives shape organizational strategies and social welfare.

Findings from the paper's experiments and analyses that vary non-IIDness, competition intensity, and incentive parameters; no numeric sample sizes provided in abstract.

high mixed Cooperate to Compete: Strategic Data Generation and Incentiv... organizational_strategies / social_welfare

Outcomes are shaped not only by benchmark quality but also by competitive pressure, including user switching, routing decisions, and operational constraints.

Argument/assertion in paper framing motivations for Marketplace Evaluation; conceptual reasoning listing mechanisms (user switching, routing, operational constraints); no empirical tests or sample size reported.

high mixed Evaluation of Agents under Simulated AI Marketplace Dynamics post-deployment system outcomes (e.g., success influenced by competition factors...

AI plays a dual role as enhancer and eroder, simultaneously strengthening performance while eroding underlying expertise (the 'AI-as-Amplifier Paradox').

Framing claim presented in the paper's conceptual argument and grounded by the paper's stated year-long empirical study among cancer specialists (no numerical sample size reported in abstract).

high mixed From Future of Work to Future of Workers: Addressing Asympto... preservation of underlying expertise vs. short-term performance

Cross-border citations show continued technological interdependence rather than decoupling, with Chinese AI inventors relying more heavily on U.S. frontier knowledge than vice versa.

Citation analysis of cross-border patent citations between Chinese and U.S. AI patents (paper reports asymmetry in reliance based on citation patterns).

high mixed AI Patents in the United States and China: Measurement, Orga... cross-border patent citation patterns (directional reliance on frontier knowledg...

The organization of AI innovation differs sharply: U.S. AI patenting is concentrated among large private incumbents and established hubs, whereas Chinese AI patenting is more geographically diffuse and institutionally diverse, with larger roles for universities and state-owned enterprises.

Analysis of assignee types, geographic dispersion, and institutional composition of AI patents in the two countries (concentration metrics and assignee categorizations described in paper).

high mixed AI Patents in the United States and China: Measurement, Orga... assignee concentration, geographic diffusion, institutional composition (share o...

Across all settings, AI Organizations composed of aligned models produce solutions with higher utility but greater misalignment compared to a single aligned model.

Reported experimental results aggregated across two practical settings (AI consultancy and AI software team) and 12 tasks; direct comparison between AI Organizations of aligned models and a single aligned model.

high mixed AI Organizations are More Effective but Less Aligned than In... solution utility (higher) and model misalignment (greater)

Multi-agent "AI organizations" are simultaneously more effective at achieving business goals, but less aligned, than individual AI agents.

Experimental comparison reported in the paper: experiments comparing multi-agent AI organizations to single aligned agents across tasks and settings (described below).

high mixed AI Organizations are More Effective but Less Aligned than In... solution utility (effectiveness at achieving business goals) and model alignment...

Alignment operates as a two-way translation, where models are made 'safe for worlds' while those worlds are reshaped to be 'safe for models.'

Conceptual claim supported by ethnographic examples illustrating reciprocal adaptations between models and social/institutional contexts in Nairobi's credit-scoring ecosystem.

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya reciprocal adjustments between predictive models and social/institutional enviro...

Algorithmic credit scoring is accomplished through the ongoing work of alignment that stabilizes risk under conditions of persistent uncertainty, taking epistemic, modeling, and contextual forms.

The paper's theoretical argument grounded in nine-month ethnographic observations and analysis of how practitioners and institutions engage in alignment work across epistemic, modeling, and contextual dimensions.

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya alignment practices that stabilize risk amid uncertainty (epistemic, modeling, c...

Practitioners negotiate model performance via technical and political means.

Observational data from the ethnography showing technical adjustments, benchmarks, and political negotiation (e.g., with regulators or management) to establish acceptable performance.

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya practices used to achieve and justify model performance (technical tuning and po...

Practitioners formulate risk through multiple interpretations.

Ethnographic evidence from interviews and observations indicating that risk is characterized differently across actors (technical, legal, business interpretations).

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya variation in definitions and framings of risk among practitioners

Practitioners construct alternative data using technical and legal workarounds.

Field observations and interviews showing practitioners employing technical methods and legal strategies to create or repurpose alternative data sources for credit scoring.

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya practices for generating and using alternative data in credit models

Algorithmic credit scoring is being transformed by new actors, techniques, and shifting regulations.

Ethnographic fieldwork documenting the entry of new actors, novel technical techniques, and regulatory changes affecting credit scoring in Nairobi's digital lending ecosystem.

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya structural transformation of algorithmic credit scoring (actor composition, tech...

Credit scoring is an increasingly central and contested domain of data and AI governance.

Nine-month ethnography of credit scoring practices in Nairobi, Kenya; participant observation and interviews across stakeholders in digital lending.

high mixed Risk, Data, Alignment: Making Credit Scoring Work in Kenya role of credit scoring in data and AI governance (centrality and contestedness)

Although some frontier models exceed human performance, model accuracy is still far below what would enable reliable experimental guidance.

Paper reports instances where top-performing (frontier) models outperform aggregate human expert accuracy on SciPredict, but concludes overall accuracies are insufficient for reliable experimental guidance.

high mixed SciPredict: Can LLMs Predict the Outcomes of Scientific Expe... prediction_accuracy / usability_for_guidance

The local labor market will follow a dual trajectory: low-skill, routine jobs face high automation risk while demand will rise for AI-collaborative, higher-skill roles.

Paper's analytical prediction based on distinguishing current job roles into routine/repetitive vs cognitive/non-routine and projecting likely impacts; no numeric forecasts or sample sizes provided in the excerpt.

high mixed PREDICTING THE FUTURE OF JOBS IN NAGPUR DISTRICT MIDC: THE R... combined job displacement for routine roles and increased demand for AI-collabor...

Professional and Technical Services, Information, and Finance and Insurance account for approximately 86 percent of the base-case direct contribution.

Sectoral decomposition of base-case direct contribution in the model; paper explicitly reports the three sectors' combined share as ~86%.

high mixed AI Capex Is Justified: A Bottom-Up Sectoral Estimate of Arti... share of base-case direct GDP contribution by sector (three-sector concentration...

The inverted U-shaped pattern between AI knowledge stickiness and technological concentration is more clearly detected in eastern cities and in small and medium-sized cities; in large cities the quadratic term is not statistically significant.

Heterogeneity/subsample regressions by region (east vs. other) and city size categories within the city-year panel (2014–2023); statistical significance of quadratic term differs across subsamples.

high mixed Knowledge stickiness and technological concentration in the ... technological concentration (presence and significance of nonlinear relationship...

Technological complexity moderates the nonlinear (inverted U) association between AI knowledge stickiness and technological concentration by altering its strength and curvature rather than producing a simple, uniform shift in the turning point.

Interaction/heterogeneity analyses in the two-way fixed-effects city-year panel (2014–2023), examining moderating role of a technological complexity measure on the quadratic association.

high mixed Knowledge stickiness and technological concentration in the ... technological concentration (degree and curvature of the stickiness–concentratio...

There is an inverted U-shaped association between AI knowledge stickiness and technological concentration: higher stickiness up to a limit leads to more concentration and thereafter the opposite.

City-year panel combining AI patent applications with urban statistics for 2014–2023; two-way fixed-effects regression showing a significant positive linear and negative quadratic term (nonlinear association).

high mixed Knowledge stickiness and technological concentration in the ... technological concentration (allocation of AI activity across sub-technology bra...

Subjectivity persisted in AI-powered recruitment decisions; human judgment remained an important factor.

Theme 2 (subjectivity in AI-powered recruitment) from interviews indicating retained human subjectivity and judgement in recruitment processes (n = 22).

high mixed The augmented recruiter: examining AI integration and decisi... degree_of_subjectivity_in_decision_making

Experiments on the MovieLens-100k dataset illustrate when the empirical payout aligns with — and diverges from — Shapley fairness across different settings and algorithms.

Empirical evaluation performed on the MovieLens-100k dataset (≈100,000 ratings) comparing the proposed payout rule and algorithmic outcomes to Shapley-value allocations across multiple experimental settings and algorithms.

high mixed Creator Incentives in Recommender Systems: A Cooperative Gam... alignment/divergence between empirical payouts and Shapley-value fairness

For heterogeneous agents the cooperative game still admits a non-empty core, though convexity and Shapley value core-membership are no longer guaranteed.

Theoretical analysis for heterogeneous-agent case provided in the paper: establishes core non-emptiness but shows convexity and Shapley-in-core do not generally hold.

high mixed Creator Incentives in Recommender Systems: A Cooperative Gam... core non-emptiness; lack of guaranteed convexity and Shapley membership

User interactions in online recommendation platforms create interdependencies among content creators: feedback on one creator's content influences the system's learning and, in turn, the exposure of other creators' contents.

Conceptual/empirical motivation stated in the paper; motivates the multi-agent bandit modeling of creator interactions in recommender systems.

high mixed Creator Incentives in Recommender Systems: A Cooperative Gam... interdependencies in content exposure induced by user feedback

Sensitivity analyses indicate the observed positive belief changes likely reflect recovery from carry-over effects rather than genuine training-induced shifts.

Authors' sensitivity analyses discussed in the paper that examined alternative explanations (e.g., carry-over effects) and concluded the belief-change result is likely due to recovery from such effects.

high mixed Scaffolding Human-AI Collaboration: A Field Experiment on Be... validity of belief-change effect (source attribution: training vs. carry-over re...

Simulations demonstrate that standard methods, such as principal components analysis and inverse covariance weighting, can generate spurious cross-study differences, whereas our approach recovers comparable latent treatment effects.

Simulation experiments reported in the paper comparing the proposed method to PCA and inverse covariance weighting; results show PCA and inverse-covariance-weighted estimators can produce spurious cross-study differences while the proposed method recovers comparable latent treatment effects (no simulation sample sizes provided in the abstract).

high mixed Nonparametric Identification and Estimation of Causal Effect... comparability/accuracy of estimated latent treatment effects across studies (sim...

We ran two large preregistered experiments (N=17,950 responses from 14,779 people) using conversational AI models to persuade participants on a range of attitudinal and behavioural outcomes, including signing real petitions and donating money to charity.

Statement in paper reporting two preregistered experiments, sample sizes (17,950 responses; 14,779 people), use of conversational AI models, and target outcomes including petition signing and charitable donations.

high mixed Artificial intelligence can persuade people to take politica... use of conversational AI to persuade participants on attitudinal and behavioral ...

Big data analytics (BDA) adoption is a risky strategy with potentially high rewards for start-ups.

Stated as a summary conclusion based on empirical analysis of a large sample of start-ups in Germany comparing adopters and non-adopters across multiple performance measures (survival, costs, sales, employee growth, access to financing).

high mixed Big data-based management decisions and start-up performance overall performance/risk–reward tradeoff

While AI may reduce certain traditional roles, it also enhances job quality and creates new career pathways within the commerce sector.

Reported finding from the paper's synthesis of existing studies and sectoral observations (qualitative literature synthesis).

high mixed IMPACT OF ARTIFICIAL INTELLIGENCE ON EMPLOYMENT IN THE COMME... reductions in traditional roles vs. improvements in job quality and new career p...

AI exhibits a dual nature—both as a disruptor and an enabler of employment in the commerce sector.

Paper-level synthesis of contradictory findings and sectoral patterns reported across reviewed literature (qualitative literature synthesis).

high mixed IMPACT OF ARTIFICIAL INTELLIGENCE ON EMPLOYMENT IN THE COMME... net disruptive vs. enabling effects on employment

Bounded agents act as an amplifying but not necessary extension to the foundation-model stack for changing work coordination.

Conceptual argument within the paper distinguishing bounded agents from the core stack; no empirical comparison or measurement reported.

high mixed Remote-Capable Knowledge Work Should Default to AI-Enabled F... role of bounded agents in amplifying coordination impacts

The spatial spillover effects are geographically constrained and vary significantly across regions.

Reported heterogeneity in spatial Durbin model results and discussion of geographic constraint and inter-regional variation (regional heterogeneity analysis).

high mixed Research on the Pathways and Spatial Effects of Digital–Inte... heterogeneity of spatial spillover effects on carbon intensity across regions

The effects of generative AI on work and organisations are heterogeneous and context-dependent, shaped by job roles, skill levels, and institutional environments.

Synthesis across the included studies noting variation in outcomes conditional on role, skill, and institutional context.

high mixed Generative AI in the Workplace: A Systematic Review of Produ... heterogeneity of AI effects across roles/skills/institutions

Overall, AI emerges as a transformative but context-dependent tool for business decision-making in Latin America.

The authors' overall interpretation and synthesis of the 27 reviewed studies highlighting variable outcomes depending on context and readiness.

high mixed Artificial Intelligence for Business Decision-Making in Lati... overall impact of AI on business decision-making (transformative effect conditio...

The positive effect of big data applications on firms' markups exhibits heterogeneity across organizational, technological, and environmental dimensions.

Paper reports heterogeneity analysis showing variation in the magnitude of the positive markup effect across organizational, technological and environmental factors; based on model implications and empirical subgroup/interaction tests using micro-level firm data (sample size not reported).

high mixed Big data application and firm markups: evidence from China heterogeneity of the big-data → markup effect across organizational, technologic...

Although the concurrent paradigm performs worse than the sequential paradigm in terms of immediate task performance, it is more effective in promoting users' emotional trust.

Comparison between concurrent and sequential AI-assisted decision-making paradigms in the RCT (N=120); authors report concurrent < sequential for immediate task performance, but concurrent > sequential for emotional trust.

high mixed How AI-Assisted Decision-Making Paradigms and Explainability... immediate task performance (negative) and emotional trust (positive)

AI adoption outcomes depend on organizational routines, data arrangements, accountability structures, and public values.

Empirical and theoretical literature review and argument in the article drawing on scholarship in digital government and public-sector technology adoption.

high mixed Governing frontier general-purpose AI in the public sector: ... determinants of AI adoption in government (organizational, data, accountability,...

If employment losses are relatively small and productivity gains are realised, AI adoption could boost Exchequer revenues. But if job displacement is sizeable, tax receipts fall while welfare spending rises, resulting in potentially large pressures on the public finances.

Conditional fiscal scenarios simulated in the report combining employment, wage and benefit changes with the public finance implications (tax receipts and welfare spending); reported as scenario-based outcomes.

high mixed Artificial Intelligence and income inequality in Ireland Exchequer revenues / tax receipts and welfare spending

Ireland’s tax and welfare system absorbs most of the income loss for lower income households, and roughly half of the loss for households at the top of the income distribution.

Microsimulation using SWITCH to model taxes and transfers applied to simulated income changes across income groups; reported as a finding in the report.

high mixed Artificial Intelligence and income inequality in Ireland net income after taxes and transfers (absorption of income loss)

« Prev 1 2 3 … 5 6 7 … 232 233 Next »