Evidence (7278 claims)

Search and filter individual claims extracted from the papers. If you are looking for a specific finding (for example, "what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other, use the Evidence Explorer instead.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	795	210	105	955	2131
Governance & Regulation	886	414	197	126	1654
Organizational Efficiency	826	204	129	87	1257
Technology Adoption Rate	681	259	128	110	1189
Research Productivity	464	138	65	349	1028
Output Quality	503	196	61	53	813
Decision Quality	351	180	84	51	673
AI Safety & Ethics	238	288	71	34	637
Firm Productivity	455	58	92	20	631
Market Structure	186	172	123	25	511
Task Allocation	222	70	76	34	407
Innovation Output	238	28	48	18	334
Skill Acquisition	177	62	62	17	318
Employment Level	107	57	108	13	287
Fiscal & Macroeconomic	135	72	44	26	284
Firm Revenue	172	50	28	5	256
Consumer Welfare	121	68	45	12	246
Task Completion Time	183	33	10	13	240
Inequality Measures	45	126	50	6	227
Worker Satisfaction	95	74	23	12	204
Error Rate	77	98	11	4	190
Regulatory Compliance	84	73	17	7	181
Automation Exposure	61	61	27	14	166
Training Effectiveness	98	21	14	19	154
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	23	1	119
Hiring & Recruitment	53	8	8	3	72
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	50	6	1	62
Labor Share of Income	17	20	17	—	54
Worker Turnover	15	15	—	3	33
Industry	—	—	—	1	1

Filter: Governance

The report has limited primary quantitative impact evaluation and relies on policy texts and secondary sources rather than large-scale empirical measurement of AI’s economic effects.

Explicit limitations section in the report describing methods and data constraints.

high | null result | AI Governance and Data Privacy: Comparative Analysis of U.S.... | presence/absence of primary quantitative impact evaluation of AI's economic effe...

The paper's empirical and policy conclusions are limited by its jurisdictional sample size (eleven) and reliance on available empirical/operational data, which the authors note is increasingly patchy due to declining transparency.

Methods and limitations sections explicitly noting sample size (eleven jurisdictions) and data availability constraints.

high | null result | The Global Landscape of Environmental AI Regulation: From th... | limitations in generalizability (scope of jurisdictional mapping) and data compl...

Methodological needs for AI-era labor models include dynamic skill taxonomies, high-frequency labor data (job postings, firm-level automation measures), and uncertainty quantification.

Paper's Research & policy recommendations and Methodological needs section (explicit recommendations).

high | null result | AI-Based Predictive Skill Gap Analysis for Workforce Plannin... | requirements for model inputs and design (dynamic taxonomies, data frequency, un...

The scenario analysis framework varies economic growth, automation rates, policy interventions, and investment to produce probabilistic demand–supply gaps.

Methods description of scenario analysis components and the variables varied in scenario experiments (explicit in Data & Methods).

high | null result | AI-Based Predictive Skill Gap Analysis for Workforce Plannin... | probabilistic demand–supply gap distributions produced under varied scenario par...

Intended users of the Hub include organizations, educational institutions, and policymakers, who would use it to inform reskilling/education strategies, regional economic policy, and labor-market interventions.

Explicit statement of target users and use cases in the Key Points / Implications sections.

high | null result | AI-Based Predictive Skill Gap Analysis for Workforce Plannin... | targeting of outputs to specified stakeholder groups (intended adoption/use-case...

The system produces interpretable outputs for stakeholders: demand–supply trend analysis, geospatial hotspot maps, skill-gap radar charts, and policy simulation dashboards.

Paper's description of outputs and interactive visual analytics (listed output modalities).

high | null result | AI-Based Predictive Skill Gap Analysis for Workforce Plannin... | generation of interpretable visual/analytic artifacts (trend charts, hotspot map...

The core modeling approach uses probabilistic growth modeling combined with intelligent skill synthesis to estimate future workforce requirements under alternative economic and policy scenarios.

Methods section describing the modeling components: probabilistic growth modeling and intelligent skill synthesis (architectural description).

high | null result | AI-Based Predictive Skill Gap Analysis for Workforce Plannin... | probabilistic forecasts of future workforce requirements by sector/region under ...

The platform integrates multiple indicators such as regional economic growth projections, automation velocity, policy intervention strength, investment intensity, and market volatility (macro- and micro-level indicators).

List of input indicators given in the Data & Methods section of the paper (explicit enumeration of macro and micro variables).

high | null result | AI-Based Predictive Skill Gap Analysis for Workforce Plannin... | integration of listed macro- and micro-level indicators into the modelling pipel...

Significant empirical gaps remain on long-term impacts (wage trajectories, employment composition, firm-level returns), verification/remediation cost quantification, and public-good risks of insecure code proliferation.

Cross-study synthesis explicitly identifying missing longitudinal and firm-level empirical research in the reviewed literature.

high | null result | ChatGPT as a Tool for Programming Assistance and Code Develo... | absence or paucity of longitudinal studies and firm-level quantitative measureme...

The paper's conclusions are limited by reliance on secondary sources, heterogeneous cross‑study comparisons, limited causal identification of long‑run macro effects, and measurement challenges for AI‑driven intangible capital.

Authors' stated limitations section summarizing the nature of evidence used (qualitative literature review, secondary macro indicators, sectoral examples); this is an explicit self‑reported methodological limitation rather than an external empirical finding.

high | null result | AI and Robotics Redefine Output and Growth: The New Producti... | strength of causal inference and measurement validity

Methodology used in the paper is a narrative review relying on secondary sources (literature, legal cases, policy reports, empirical perception studies) and conceptual synthesis; no new primary data were collected.

Paper's Data & Methods section explicitly states narrative review and secondary-data analysis.

high | null result | Ethical and societal challenges to the adoption of generativ... | study methodology (use of secondary sources; absence of primary data)

Important empirical research gaps remain (consumer willingness-to-pay for authenticated vs. synthetic content, labor-displacement elasticities, market concentration dynamics, and cost–benefit evaluations of regulatory options).

Explicit statement of limitations and research needs in the paper, based on the authors' narrative review and absence of primary empirical studies within the paper.

high | null result | Ethical and societal challenges to the adoption of generativ... | identified gaps in empirical knowledge and priority research questions

The paper's methodology is a secondary-data, narrative (qualitative) literature review; it contains no original empirical data or primary quantitative analysis.

Explicit methodological statement in the paper describing secondary data analysis and narrative synthesis; absence of primary datasets or statistical analyses.

high | null result | Ethical and societal challenges to the adoption of generativ... | presence or absence of original empirical data

This paper is conceptual/theoretical and does not conduct primary empirical data collection.

Explicit methodological statement in the paper's Data & Methods section.

high | null result | Continental shift: operations and supply chain management re... | study type (conceptual vs empirical)

Further causal, experimental research (randomized deployments) is needed to precisely quantify net productivity and labor reallocation effects of AI agents.

Paper's stated research priorities and explicit acknowledgement of limitations from observational design; no randomized trials reported in the study.

high | null result | Artificial Intelligence Agents in Knowledge Work: Transformi... | need for randomized causal estimates of productivity and labor reallocation

There are measurement challenges for quality-adjusted productivity—errors and downstream effects may reduce net benefits of agent automation and are under-measured in the study.

Authors' noted limitations and concerns about quality-adjusted productivity measurement (error rates, downstream externalities) based on observational deployment experience; no formal measurement of downstream costs reported.

high | null result | Artificial Intelligence Agents in Knowledge Work: Transformi... | quality-adjusted productivity (including errors and downstream effects)

Small-scale, domain-specific deployments of Alfred AI limit external validity to other industries or larger firms.

Deployment context described as small-scale e-commerce; authors note generalizability limitations stemming from domain- and scale-specific nature of the experiments.

high | null result | Artificial Intelligence Agents in Knowledge Work: Transformi... | external validity / generalizability

Because the study is observational and non-randomized, causal claims about the effect of AI agents on productivity and labor are limited.

Study design explicitly described as applied experimentation and observational deployments (non-randomized); potential confounding and selection biases acknowledged by the authors.

high | null result | Artificial Intelligence Agents in Knowledge Work: Transformi... | causal identification ability (limits on attributing observed effects to the age...

Priority research areas include evaluating long‑run distributional impacts of AI diffusion in agriculture, interactions between digital technologies and labor markets, inclusive financing models for adoption, and macroeconomic effects on food prices and trade.

Stated research agenda and gap analysis in the paper’s conclusions, derived from the review of existing literature and identified gaps.

high | null result | MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION | research coverage (presence/absence of long‑run distributional studies, labor ma...

The current evidence base has gaps: more rigorous impact evaluations, long‑term soil and emissions accounting, and studies on distributional outcomes are needed.

Meta‑assessment within the paper noting limitations of existing literature (many short‑term pilots, limited long‑run soil/emissions data, few studies on who captures value); the claim is based on the review's appraisal of methods used in cited studies.

high | null result | MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION | research evidence sufficiency (availability of long‑term causal estimates, soil/...

Economists and policymakers should fund long‑run evaluations (RCTs, quasi‑experimental designs) to estimate causal effects of AI interventions on productivity, welfare, and environmental outcomes.

Evidence‑gap analysis and policy recommendations in the paper; explicit call for rigorous impact evaluation methods given current paucity of long‑run causal evidence.

high | null result | MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION | existence and number of long‑run RCTs/quasi‑experimental studies measuring produ...

There are limited long‑run randomized controlled trials (RCTs) on AI/IoT impacts for smallholders and scarce cross‑country data on distributional effects.

Literature review and evidence‑gap identification within the study; explicit statement that long‑run RCTs and cross‑country distributional data are scarce.

high | null result | MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION | availability of long‑run RCT evidence, number of cross‑country distributional st...

Heterogeneous contexts mean impacts vary; careful piloting, monitoring, and adaptive policy are necessary to manage uncertainty in outcomes.

Synthesis and explicit discussion of uncertainties; evidence gaps section noting variable results across regions and interventions.

high | null result | MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION | variation in intervention impacts across contexts (heterogeneity measures), need...

There are limited standardized measures of 'AI capital,' scarce data on firm-level AI investment and implementation quality, and few long-run causal estimates of AI’s effects on managerial productivity and labor outcomes.

Gap analysis based on literature review and methodological discussion within the book; observation about the state of available empirical evidence.

high | null result | Modern Management in the Age of Artificial Intelligence: Str... | availability and standardization of AI investment/asset measures; existence of l...

The paper is primarily conceptual/architectural and does not present large empirical studies quantifying the phenomenon across firms or repositories.

Explicit methodological statement in the paper describing its use of thought experiments, mechanism reasoning, and illustrative examples rather than empirical datasets.

high | null result | Overton Framework v1.0: Cognitive Interlocks for Integrity i... | presence/absence of empirical studies within the paper (binary)

There is a lack of large‑scale causal evidence on generative AI’s effects; the paper recommends RCTs, difference‑in‑differences, matched employer–employee panels, and longitudinal studies to fill empirical gaps.

Methodological critique and research agenda provided in the review; observation based on the authors' survey of the literature.

high | null result | The Use of ChatGPT in Business Productivity and Workflow Opt... | n/a (research design recommendation; outcome is future evidence generation)

Policy interventions are needed for data protection, bias mitigation, model transparency, accountability, and public investments in workforce retraining to smooth transitions and reduce inequality.

Normative policy recommendations grounded in the review's synthesis of risks and distributional concerns; not an empirical claim but a recommendation.

high | null result | The Use of ChatGPT in Business Productivity and Workflow Opt... | policy adoption (existence of regulations, programs), outcomes: retraining parti...

New productivity metrics are needed to capture AI impacts, including time‑use changes, quality‑adjusted output, and accounting for intangible AI capital.

Methodological recommendation from the conceptual synthesis, motivated by limitations of existing measures discussed in the paper.

high | null result | The Use of ChatGPT in Business Productivity and Workflow Opt... | n/a (recommendation for metrics: time use, quality‑adjusted output, AI capital a...

The paper is a policy-design and conceptual-architecture work and presents no original microdata or econometric estimates.

Methods section explicitly states absence of original empirical data; document contains policy proposals and modeling agenda only.

high | null result | Token Taxes: mitigating AGI's economic risks | presence/absence of original empirical data in the paper

Token taxes are usage-based surcharges applied at the point of sale for model inference (i.e., charged per token or per inference request).

Paper's definitional specification and conceptual description; policy-design discussion (no empirical data).

high | null result | Token Taxes: mitigating AGI's economic risks | tax charged per token / per inference request (tax base definition)

Further empirical calibration and validation against observed behavioral and economic data are necessary; the framework primarily demonstrates method and emergent phenomena rather than ready predictive deployment.

Paper explicitly notes the necessity of further empirical calibration and frames results as demonstration of method and emergent phenomena. This is an explicit limitation statement in the summary.

high | null result | An LLM-Driven Multi-Agent Simulation Framework for Coupled E... | level of empirical calibration/validation (current framework not yet empirically...

This paper is a narrative review synthesizing heterogeneous studies and case reports rather than providing meta-analytic estimates of effect sizes.

Methods statement in the paper describing review type as narrative synthesis and noting limitations (no meta-analysis).

high | null result | Artificial Intelligence in Drug Discovery and Development: R... | presence/absence of pooled/meta-analytic effect size estimates

The paper proposes measurable metrics such as projection congruence indices, alignment persistence measures, monitoring/oversight burden, and outcome variability/tail risks attributable to agentic autonomy.

Explicit metric proposals in the methods and metrics section of the paper; presented as part of a research agenda rather than empirically implemented.

high | null result | Visioning Human-Agentic AI Teaming: Continuity, Tension, and... | proposed measurement constructs (projection congruence, alignment persistence, m...

The paper proposes specific empirical and analytic follow-ups — multi-agent simulations, lab experiments with humans and adaptive agents, field case studies, econometric analyses, and formal economic models — to test the conceptual claims.

Explicit methods and research agenda listed in the paper; these are recommended future methods, not evidence.

high | null result | Visioning Human-Agentic AI Teaming: Continuity, Tension, and... | feasibility and design of empirical/analytic methods for studying agentic HAT

Agentic AI is characterized by three properties that drive structural uncertainty: open-ended action trajectories, generative representations/outputs, and evolving objectives.

Definitions and taxonomy developed in the paper based on conceptual synthesis; presented as framing rather than empirically measured properties.

high | null result | Visioning Human-Agentic AI Teaming: Continuity, Tension, and... | presence of specified agentic properties

The framework provides sector-specific implementation guidance tailored to healthcare and public administration, accounting for existing governance and regulatory structures.

Case/sector guidance sections offering practical recommendations and considerations for deployment in those sectors; design-oriented, not empirically piloted in the paper.

high | null result | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | implementation_guidance_presence; sector_adaptation_features

DAR identifies four trigger classes that govern transitions between authority states: data superiority, contextual judgment requirements, risk thresholds, and ethics/legal overrides.

Conceptual derivation and classification in the framework; mapping of trigger types to transition rules. Theoretical, no empirical data.

high | null result | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | trigger_class (categorical) and resulting authority_state_transitions

The Dynamic Authority Reversal (DAR) framework formalizes four discrete intra-episode authority states: Human-Leader/AI-Follower (HL), AI-Leader/Human-Follower (AL), Co-Leadership (CO), and Mutual Override (MO).

Formal conceptual specification and formal modeling within the paper; definitions of the four states and their roles. No empirical sample; theoretical/design artifact.

high | null result | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | authority_state (categorical: HL, AL, CO, MO)

Further quantitative and comparative research is needed to measure net productivity effects, skill trajectories, and generalizability across firm types and industries.

Authors' methodological assessment and limitations section noting single-firm qualitative design (Netlight) and rapidly evolving toolchains; recommendation for future empirical work.

high | null result | Rethinking How IT Professionals Build IT Products with Artif... | gaps in current empirical evidence (lack of quantitative, longitudinal, cross-fi...

Long-term effects of adaptive marketing (habit formation, churn, lifetime value) are important for welfare and valuation but are harder to measure and require longitudinal or structural economic models.

Conceptual claim in measurement challenges; argues that short-horizon A/B tests may miss long-run harms or benefits, recommending longitudinal studies and structural models; no empirical long-term study presented.

high | null result | Personalized Content Selection in Marketing Using BERT and G... | long-term churn rates, habit formation indicators, lifetime value (LTV)

Offline evaluation metrics (intent/sentiment classification accuracy, human-rated generation quality and factuality, simulated policy evaluation) are useful for pipeline development but do not fully capture online performance.

Paper contrasts offline metrics with online A/B testing and notes the need for online experiments; this is a methodological claim supported by the described evaluation pipeline rather than a presented empirical study.

high | null result | Personalized Content Selection in Marketing Using BERT and G... | offline classification accuracy, human-rated generation quality vs online CTR/en...

Another important gap is quantifying complementarities between AI and different skill types (evaluative vs. generative tasks).

Review observation that existing empirical work has not systematically quantified how AI productivity gains vary with worker skill composition and complementary roles.

high | null result | ChatGPT as an Innovative Tool for Idea Generation and Proble... | magnitude of complementarities between AI assistance and various human skill typ...

Key research gaps include a lack of long-run causal evidence on the effects of LLMs on firm-level innovation rates, business formation, and industry structure.

Explicit identification of gaps in the literature within the nano-review; the review states that most studies are short-term, task-level, or descriptive.

high | null result | ChatGPT as an Innovative Tool for Idea Generation and Proble... | long-run causal impacts of LLM adoption on firm innovation, business formation, ...

High-priority research includes randomized controlled trials on hybrid vs. automated routing, long-run studies on labor markets in service sectors, and models quantifying trust externalities and governance costs.

Paper's stated research agenda based on identified evidence gaps and limitations (lack of randomized long-run studies).

high | null result | The Effectiveness of ChatGPT in Customer Service and Communi... | research output (RCTs, long-run studies, models) addressing the specified gaps

Current evidence is promising but early: case studies, pilot deployments, and short-run experiments dominate; long-run causal evidence on labor and welfare effects is limited.

Explicit methodological assessment in the paper noting source types (deployments, pilots, vendor reports, short-run experiments) and limitations (heterogeneity, lack of randomized controls, short horizons).

high | null result | The Effectiveness of ChatGPT in Customer Service and Communi... | quality and duration of evidence (study types, presence of randomized controls)

The authors elicited additional insights via a survey of paper authors plus follow-up interviews to collect self-assessments of reproducibility and qualitative explanations for obstacles and motivations.

Methods section describing the mixed-methods approach: empirical reproduction attempts triangulated with surveys and interviews of original authors.

high | null result | On the Computational Reproducibility of Human-Computer Inter... | use of surveys and interviews as data sources for qualitative corroboration and ...

Reproducibility (as used in this study) is defined as producing the reported results from the shared data and analysis code, distinct from replicability, which involves independent re-collection of data.

Authors' definitional statement in the paper clarifying reproducibility vs. replicability.

high | null result | On the Computational Reproducibility of Human-Computer Inter... | operational definition of 'reproducibility' (ability to re-run provided data+cod...

Study limitations include reliance on perceptual measures (rather than solely objective performance), heterogeneity across institutional samples, and likely correlational rather than strictly causal identification.

Authors' own noted limitations in the paper's methods section: mixed-methods design using perceptions from questionnaires and interviews, sample heterogeneity across multinational institutions, and quantitative analyses that are associative rather than strictly causal.

high | null result | Human-AI Synergy in Financial Decision-Making: Exploring Tru... | validity/causal identification of study findings

Measurement and research gaps (data scarcity, informality) complicate robust economic assessment of AI impacts; improved metrics, granular labour and firm‑level data, and mixed‑methods evaluation are required.

Methodological critique based on reviewed literature and identified gaps; no new data collection in the paper.

high | null result | Towards Responsible Artificial Intelligence Adoption: Emergi... | availability and granularity of labour and firm-level datasets, prevalence of mi...

There is a lack of causal evidence on the long-run impacts of AI-driven HRM on employment, wages, and firm survival—this is a key research gap identified by the review.

Explicitly stated research gap in the review based on assessment of methodologies and findings across the 47 included studies.

high | null result | Data-Driven Strategies in Human Resource Management: The Rol... | availability of causal studies on long-run employment, wage, and firm survival i...
