Evidence (4793 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	402	112	67	480	1076
Governance & Regulation	402	192	122	62	790
Research Productivity	249	98	34	311	697
Organizational Efficiency	395	95	70	40	603
Technology Adoption Rate	321	126	73	39	564
Firm Productivity	306	39	70	12	432
Output Quality	256	66	25	28	375
AI Safety & Ethics	116	177	44	24	363
Market Structure	107	128	85	14	339
Decision Quality	177	76	38	20	315
Fiscal & Macroeconomic	89	58	33	22	209
Employment Level	77	34	80	9	202
Skill Acquisition	92	33	40	9	174
Innovation Output	120	12	23	12	168
Firm Revenue	98	34	22	—	154
Consumer Welfare	73	31	37	7	148
Task Allocation	84	16	33	7	140
Inequality Measures	25	77	32	5	139
Regulatory Compliance	54	63	13	3	133
Error Rate	44	51	6	—	101
Task Completion Time	88	5	4	3	100
Training Effectiveness	58	12	12	16	99
Worker Satisfaction	47	32	11	7	97
Wages & Compensation	53	15	20	5	93
Team Performance	47	12	15	7	82
Automation Exposure	24	22	9	6	62
Job Displacement	6	38	13	—	57
Hiring & Recruitment	41	4	6	3	54
Developer Productivity	34	4	3	1	42
Social Protection	22	10	6	2	40
Creative Output	16	7	5	1	29
Labor Share of Income	12	5	9	—	26
Skill Obsolescence	3	20	2	—	25
Worker Turnover	10	12	—	3	25

Productivity

Actionable takeaway: organizations should measure inter-model similarity and response diversity as part of ROI and procurement analyses and factor in governance and role-redesign costs when estimating net returns to LLM deployment.

Explicit recommendation in the paper grounded in empirical analyses of output similarity and diversity metrics; presented as operational guidance rather than tested via field ROI studies.

high positive The Artificial Hivemind: Rethinking Work Design and Leadersh... inclusion of diversity metrics and governance cost estimates in ROI/procurement ...

The paper provides practical diagnostic tools and metrics (e.g., inter-model similarity, response entropy) for detecting and tracking AI homogenization in workflows.

Methodological section describing diagnostic framework and example metrics used in the empirical analyses (semantic similarity measures, entropy, distinct-n), intended for operational use.

high positive The Artificial Hivemind: Rethinking Work Design and Leadersh... operational diagnostic metrics (inter-model similarity, entropy, distinct-n)
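The diagnostic metrics named in this entry (distinct-n, response entropy, inter-response similarity) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, tokenization is plain whitespace splitting, and token-set Jaccard overlap stands in for the semantic similarity measures the paper describes.

```python
import math
from collections import Counter


def distinct_n(responses, n=2):
    """Share of unique n-grams among all n-grams across responses (0 to 1)."""
    ngrams = []
    for text in responses:
        tokens = text.lower().split()  # assumption: whitespace tokenization
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def response_entropy(responses):
    """Shannon entropy (bits) of the empirical distribution over whole responses."""
    counts = Counter(r.strip().lower() for r in responses)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_pairwise_jaccard(responses):
    """Average token-set Jaccard similarity over all response pairs.

    Assumption: a crude lexical stand-in for embedding-based semantic similarity.
    """
    sets = [set(r.lower().split()) for r in responses]
    pairs = [(a, b) for i, a in enumerate(sets) for b in sets[i + 1:]]
    if not pairs:
        return 0.0
    scores = [len(a & b) / len(a | b) if (a | b) else 1.0 for a, b in pairs]
    return sum(scores) / len(scores)


# Toy example: two models give the same answer, a third differs.
outputs = ["the cat sat", "the cat sat", "a dog ran"]
print(distinct_n(outputs, n=2))
print(response_entropy(outputs))
print(mean_pairwise_jaccard(outputs))
```

Lower distinct-n and entropy together with higher pairwise similarity would point toward homogenized outputs; any thresholds for flagging a workflow would have to be chosen by the organization.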

Organizational responses to homogenization include leadership communication strategies, work redesign (contrarian roles, ensemble workflows, mandated diversity checks), and governance frameworks (auditing, procurement policies avoiding monoculture).

Prescriptive recommendations in the paper synthesizing empirical results with organizational-design principles; proposed interventions are not evaluated empirically in the paper but are presented as actionable responses.

high positive The Artificial Hivemind: Rethinking Work Design and Leadersh... proposed organizational interventions to preserve cognitive and stylistic divers...

The analysis dataset comprises approximately 26,000 real-world user queries paired with outputs from over 70 distinct language models spanning different providers, architectures, and scales.

Explicit data description in the paper: ≈26,000 queries and outputs from 70+ models (paper lists model sets and sampling procedures in methods section).

high positive The Artificial Hivemind: Rethinking Work Design and Leadersh... dataset size and model count

The task frontier expands: new tasks become profitable and are created endogenously as coordination costs decline.

Analytical derivation in the model (proposition about task frontier) and simulation exercises that permit endogenous task entry.

high positive AI as Coordination-Compressing Capital: Task Reallocation, O... task frontier (set/number of profitable tasks)

Aggregate output increases when coordination costs fall because reduced frictions and endogenous task creation raise productive capacity.

Analytical result (one of the five propositions) showing comparative statics of output with respect to coordination compression; supported by calibrated numerical simulations.

high positive AI as Coordination-Compressing Capital: Task Reallocation, O... aggregate output (economy-wide production)

Lower coordination costs expand managers’ spans of control (managers can supervise more subordinates).

Analytical comparative statics derived in the model (one of the five propositions) and corroborating numerical simulations with heterogeneous agents.

high positive AI as Coordination-Compressing Capital: Task Reallocation, O... span of control (number of subordinates per manager)

Overinvestment increases inequality (greater tail concentration of income).

Model computations showing that exponential returns amplify income at the top; comparative statics indicate that inequality measures rise with greater investment/technology under the lognormal wage assumption.

high positive Janus-Faced Technological Progress and the Arms Race in the ... income inequality (tail concentration measures/Gini-like outcomes)

Overinvestment increases measured GDP (output).

Comparative statics in the theoretical model linking higher private investment/technology adoption to higher aggregate output; model shows positive effect on measured GDP despite welfare loss possibilities.

high positive Janus-Faced Technological Progress and the Arms Race in the ... aggregate GDP/output

The exponential returns to skill and technology create strong private incentives for agents to escalate skill (education) investment toward the high tail of the distribution (an educational arms race).

Equilibrium analysis and comparative statics in the theoretical model showing that marginal returns to additional investment are increasing toward the distribution tail, producing higher optimal private investment at the top relative to social optimum.

high positive Janus-Faced Technological Progress and the Arms Race in the ... individual education/skill investment level

When wages follow a lognormal distribution, technological progress makes wages increase exponentially in both skill and technology.

Analytical derivation in the paper's economic model that assumes a lognormal wage distribution and specifies wages as an exponential function of skill and a technology parameter; result follows from model algebra (no empirical data).

high positive Janus-Faced Technological Progress and the Arms Race in the ... individual wage level

The paper proposes a research agenda prioritizing interoperable, ethical‑by‑design platforms; metrics to measure social equity impacts; and adaptation of global standards to local institutional capacities.

Explicit list of three prioritized research directions provided in the paper, derived from the systematic synthesis of the 103 items.

high positive Models, applications, and limitations of the responsible ado... research priorities and agenda items

High‑income examples (e.g., Estonia, Singapore) demonstrate mature integration of digital/AI systems in e‑government, urban mobility, and e‑health.

Empirical case examples drawn from the reviewed literature and institutional reports cited in the review; specific country examples (Estonia, Singapore) repeatedly referenced as mature adopters.

high positive Models, applications, and limitations of the responsible ado... integration maturity of AI/digital systems in e‑government, urban mobility, and ...

Research priorities include developing robust measures of AI adoption and using causal methods (difference-in-differences, synthetic controls, RDD, IV) to estimate effects of AI and regulation on productivity, employment, and inequality.

Methodological recommendations in the report based on identified evidence gaps and normative evaluation of empirical priorities.

high positive AI Governance and Data Privacy: Comparative Analysis of U.S.... quality of AI adoption measures and causal estimates for productivity, employmen...

The American Artificial Intelligence Initiative emphasizes R&D and innovation leadership, standards development, workforce readiness, and fostering 'trustworthy AI' (transparency, fairness, accountability).

Primary-source policy documents from the American Artificial Intelligence Initiative, as reviewed in the report.

high positive AI Governance and Data Privacy: Comparative Analysis of U.S.... policy emphasis areas (R&D investment, standards, workforce readiness, trustwort...

Vendor support, warranties, and service-level agreements (SLAs) are important for clinical adoption and liability management.

Policy and implementation literature, industry reports, and stakeholder feedback synthesized in the paper highlighting the role of vendor contractual commitments in adoption decisions.

high positive Framework for Government Policy on Agentic and Generative AI... clinical adoption / liability mitigation

Proprietary systems lead on reliability, maintenance, and validated integrations with clinical systems.

Literature synthesis including vendor case studies, deployment reports, and stakeholder surveys indicating more mature productization and validated integrations for proprietary offerings.

high positive Framework for Government Policy on Agentic and Generative AI... system reliability / maintenance burden / integration maturity

Open-source deployment options (e.g., on-premises) reduce data-sharing exposure and improve privacy.

Aggregated evidence from deployment reports and technical papers describing on-premises and local inference architectures; industry analyses of data governance tradeoffs.

high positive Framework for Government Policy on Agentic and Generative AI... data privacy / data-sharing exposure

Open-source models provide greater transparency and inspectability, enabling better auditability and explainability.

Systematic literature synthesis of peer-reviewed studies, industry reports, and case studies comparing open-source and proprietary systems; comparative analysis highlights inspectability of open-source code/models. No new primary experiments reported.

high positive Framework for Government Policy on Agentic and Generative AI... transparency / auditability / explainability

Coordinated policy reform, targeted infrastructure investment, workforce training, and equity-focused implementation are strategic priorities to realize AI’s potential in Indonesian healthcare.

Consensus recommendations drawn from the narrative synthesis, thematic analysis, and Delphi consensus studies included among the 42 supplementary documents and the broader body of 2020–2025 literature.

high positive Artificial Intelligence in Healthcare in Indonesia: Are We R... policy adoption of coordinated reforms, level of infrastructure investment, work...

Recommended research priorities for economists include measuring how adoption changes task mixes and wages, quantifying verification/remediation costs, estimating productivity gains net of security/IP costs, and studying market dynamics from centralized model providers.

Author recommendations based on identified gaps in the empirical literature synthesized by the paper.

high positive ChatGPT as a Tool for Programming Assistance and Code Develo... generation of targeted empirical studies addressing task mix, wage impacts, veri...

Recommended policy levers include data-governance rules, provenance and watermarking standards, liability frameworks, copyright clarifications, competition policy, and taxes/subsidies to internalize externalities.

Policy recommendations synthesized from legal, regulatory, and economic literatures within the review; presented as qualitative guidance rather than tested policy interventions.

high positive Ethical and societal challenges to the adoption of generativ... effectiveness of specified regulatory instruments in mitigating harms from gener...

A structured three-stage framework (input/process/output) clarifies where different risks and regulatory rules apply to generative audiovisual systems.

Framework presented in the paper as a conceptual synthesis of reviewed literatures; supported by cross-references to legal, technical, and ethical sources within the review.

high positive Ethical and societal challenges to the adoption of generativ... clarity and mapping of risk types to development/use stages

The paper introduces IJOPM’s Africa Initiative (AfIn) to support Africa-based OSCM research, outlining motivation, objectives, review process, and researcher support mechanisms.

Descriptive account within the paper (administrative/initiative description rather than empirical evidence).

high positive Continental shift: operations and supply chain management re... institutional support mechanisms for Africa-based OSCM research and publication ...

Cognitive interlocks include concrete mechanisms such as policy-enforced gates, automated verification thresholds, role-based checks, and mandatory rebuttal workflows to force verification before outputs are trusted or deployed.

Design details and enumerated mechanisms within the Overton Framework as presented in the paper; no implementation case studies reported.

high positive Overton Framework v1.0: Cognitive Interlocks for Integrity i... existence and configuration of interlock mechanisms; number of outputs blocked u...

The Overton Framework is an architectural remedy that embeds 'cognitive interlocks' into development environments to enforce verification boundaries and restore system integrity.

Prescriptive architectural proposal described in the paper (design specification and principles); presented conceptually without empirical validation.

high positive Overton Framework v1.0: Cognitive Interlocks for Integrity i... presence/implementation of cognitive interlocks in dev environments; intended re...

High‑frequency sensor and satellite data, processed with AI, improve precision in measuring yields, input use, and environmental externalities, enhancing the quality of economic impact evaluations and policy targeting.

Methodological and validation studies using high‑resolution satellite imagery and field sensors that show improved measurement accuracy versus traditional survey methods; referenced empirical demonstrations in the literature.

high positive MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION measurement precision for yields, input use, emissions/environmental externaliti...

The paper proposes specific metrics and empirical follow-ups (e.g., generation-to-verification throughput ratios, defect accumulation rates, time-to-acceptance for machine-generated artifacts, incident rates attributable to unverified AI outputs) to validate the model.

Explicit recommendations and measurement proposals listed in the paper; no empirical implementation provided.

high positive Overton Framework v1.0: Cognitive Interlocks for Integrity i... proposed measurement constructs (generation:verification ratio, defect accumulat...

The paper’s own drafting began via casual AI conversation, presented as an illustrative case supporting the model.

Author-reported anecdote (N=1; the paper's drafting process).

high positive A Model of Action Initiation Barrier Reduction through AI Co... narrative/example of task initiation (author's time-to-start/draft generation) —...

Enhanced gross‑flows estimation using longitudinal microdata can better track transitions (job-to-job, upskilling, unemployment spells) and measure occupational churn and reallocation.

Established econometric practice cited in paper; recommendation to use panel/admin microdata (CPS longitudinal supplements, LEHD/LODES, UI records); no new empirical results but aligns with standard methods.

high positive Enhancing BLS Methodologies for Projecting AI's Impact on Em... transition rates, spell durations, occupation-to-occupation flows, upskilling in...

Team Situation Awareness (shared perception, comprehension, projection) remains a useful analytic anchor for HAT even with agentic AI.

Conceptual analysis mapping Team SA components onto agentic AI interactions; literature review of Team SA utility in HAT contexts.

high positive Visioning Human-Agentic AI Teaming: Continuity, Tension, and... usefulness of Team Situation Awareness as an analytic framework

DAR produces ten falsifiable propositions explicitly mapped to measurement constructs, making the framework empirically testable.

Derivation and listing of ten testable propositions in the paper, each linked to observable measures and prioritized by feasibility. Theoretical derivation, no empirical tests provided.

high positive Human–AI Handovers: A Dynamic Authority Reversal Framework f... testable_hypotheses_count; mapping_quality_to_measures

Common uses of AI among practitioners include generating code snippets, suggesting fixes, accelerating routine tasks, surfacing design patterns or documentation, and scaffolding prototypes.

Practice-focused qualitative data from interviews and workflow analysis at Netlight; authors list these use-cases as commonly reported by practitioners; frequency counts not provided.

high positive Rethinking How IT Professionals Build IT Products with Artif... frequency and nature of AI-assisted activities (code generation, suggestions, pr...

Practitioners use AI primarily as a practical assistant (coding, debugging, prototyping, knowledge retrieval) rather than as a fully autonomous developer.

Reported practitioner accounts and observations from the Netlight field study (interviews/observations); examples of tasks AI is used for were documented in the paper; sample limited to experienced consultants at one firm.

high positive Rethinking How IT Professionals Build IT Products with Artif... types of tasks assigned to AI (assistant vs autonomous development)

Experienced IT professionals at Netlight are already integrating AI tools into everyday development work.

Qualitative field study conducted at Netlight Consulting GmbH using interviews, observations, and analysis of practitioner workflows; single-firm sample (Netlight); exact number of participants not reported.

high positive Rethinking How IT Professionals Build IT Products with Artif... extent of AI tool use in day-to-day development workflows

BERT-family encoders provide superior contextual understanding for sentiment analysis, intent detection, behavioural segmentation, and feature extraction from user signals compared to simpler feature pipelines.

BERT encoders are applied to classification tasks, with reported offline metrics such as classification accuracy for intent and sentiment and user-embedding quality for segmentation. (Specific datasets and sample sizes are not provided.)

high positive Personalized Content Selection in Marketing Using BERT and G... intent classification accuracy, sentiment scoring accuracy, quality of user embe...

Enablers of value realization are high-quality, integrated data; explicit data governance and metadata; process standardization; clear KPIs; user training and change management; and executive sponsorship.

Consistent findings across standards-based guidance, practitioner reports, and case studies from the 2020–2025 review highlighting these enablers as prerequisites or facilitators of success.

high positive Integrating Artificial Intelligence and Enterprise Resource ... implementation success indicators (e.g., adoption levels, KPI improvements, proj...

Value pathways enabled by ERP-integrated AI include improved visibility and real-time decisioning, automation of routine tasks, better forecasts and risk detection, and faster exception handling.

Thematic analysis across the reviewed literature (case studies and conceptual papers) identifying recurring mechanisms by which AI produced value in ERP contexts.

high positive Integrating Artificial Intelligence and Enterprise Resource ... intermediate process measures (e.g., decision latency, automation rates, detecti...

Observed AI techniques used in ERP contexts include supervised and unsupervised machine learning, predictive forecasting, anomaly/fraud detection, optimization, and explainable AI.

Systematic review of peer-reviewed articles, technical evaluations, and practitioner reports (2020–2025) documenting the methods applied in ERP and enterprise planning/control systems.

high positive Integrating Artificial Intelligence and Enterprise Resource ... presence and reporting of specific AI techniques within ERP implementations (fre...

Durable benefits require the co‑evolution of technology, people, and process capabilities rather than technology deployment alone.

Interpretive framing and synthesis of multiple empirical case studies and best-practice guidance included in the 2020–2025 literature review; recurring theme across studies.

high positive Integrating Artificial Intelligence and Enterprise Resource ... durability of performance improvements following AI deployment (e.g., sustained ...

Continuous monitoring and observability for performance, compliance, and drift are essential to maintain operational stability and detect model or process degradation.

Prescriptive claim grounded in engineering practice and comparative analysis of failure modes; supported by illustrative deployments; no quantitative evaluation of monitoring impact reported.

high positive Governed Hyperautomation for CRM and ERP: A Reference Patter... detection rate/time for performance degradation, compliance violations, model dr...

Core governance components should include policy enforcement integrated into development and deployment pipelines, risk controls for data/model behavior/automated actions, explicit human-in-the-loop and human-on-the-loop oversight, continuous monitoring/logging/incident-response, and role-based governance structures linking legal, compliance, IT, and business units.

Prescriptive design based on literature synthesis and practitioner experience; described as core components in the proposed reference pattern (conceptual, case-illustrated).

high positive Governed Hyperautomation for CRM and ERP: A Reference Patter... presence and integration of specified governance controls and organizational rol...

Research needs include empirically measuring prevalence and average loss from prompt fraud incidents, evaluating effectiveness and cost-effectiveness of technical mitigations (watermarking, provenance), and modeling firm-level investment decisions under varying regulatory/insurance regimes.

Authors' recommended agenda for further research based on identified gaps in the paper's qualitative analysis.

high positive Prompt Engineering or Prompt Fraud? Governance Challenges fo... existence and quality of empirical datasets and models addressing prevalence, lo...

The United States manages the openness–security trade-off through decentralized, rights‑based coordination that emphasizes procedural transparency and public accountability.

Qualitative content analysis of national‑level policy texts: 18 U.S. policy documents coded across the same four analytical dimensions.

high positive Balancing openness and security in scientific data governanc... governance logic / institutional coordination type (decentralized, rights‑based)

Systems biology, constraint‑based metabolic modeling (e.g., FBA), kinetic modeling, and hybrid models are effective tools to predict fluxes and identify metabolic bottlenecks.

Discussion and aggregation of modeling studies using COBRA/OptFlux frameworks, FBA simulations, and kinetic/dynamic modeling applied to engineered strains to predict flux changes and suggest genetic interventions; validated in multiple reported DBTL cycles.

high positive Harnessing Microbial Factories: Biotechnology at the Edge of... accuracy/usefulness of flux predictions and identification of bottlenecks leadin...

Engineered microorganisms are maturing into modular, programmable “microbial factories” capable of producing complex chemicals, specialty compounds, and next‑generation biofuels.

Synthesis of multiple experimental case studies reported in the literature (bench and pilot scale fermentations) demonstrating microbial production of natural products, specialty chemicals, and biofuel molecules using engineered strains and heterologous pathways; methods include pathway assembly, enzyme engineering, and fermentation optimization.

high positive Harnessing Microbial Factories: Biotechnology at the Edge of... demonstrated ability to produce target complex molecules (presence/identity of p...

Cluster-level interpretation can be performed via LLM-based semantic decoding to generate concise human-readable labels and descriptions for discovered themes.

Pipeline step implemented: use of an LLM to decode cluster content and produce labels/descriptions; reported in experimental workflow on ICML and ACL abstracts.

high positive Soft-Prompted Semantic Normalization for Unsupervised Analys... quality of cluster labels / human-readability of cluster descriptions

Normalized representations can be embedded into a continuous vector space and then clustered using density-based clustering to identify latent themes without pre-specifying the number of topics.

Methodological pipeline: embedding model applied to normalized representations followed by density-based clustering (algorithmic property: density-based methods do not require pre-specified cluster count). Demonstrated in experiments on ICML and ACL 2025 abstracts.

high positive Soft-Prompted Semantic Normalization for Unsupervised Analys... latent theme detection (cluster discovery) without predefining cluster count
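The property this entry relies on, that density-based clustering discovers the number of themes rather than taking it as input, can be seen in a minimal DBSCAN sketch. DBSCAN is used here only as a familiar example of a density-based method; the entry does not say which algorithm the paper uses, and the 2-D points, `eps`, and `min_pts` values are illustrative assumptions standing in for real embeddings and tuned parameters.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # new cluster found; no cluster count was specified
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


# Toy "embeddings": two dense groups and one isolated point.
embeddings = [(0, 0), (0, 0.1), (0.1, 0), (5, 5), (5, 5.1), (5.1, 5), (20, 20)]
labels = dbscan(embeddings, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The two clusters and the single noise point emerge from the density parameters alone. In a real pipeline the inputs would be high-dimensional embedding vectors, and the distance metric and parameters would need tuning.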

Training improved exam scores by 0.27 grade points relative to optional access without training (p = 0.027).

Intent-to-treat comparison between the optional-access-with-training arm and the optional-access-without-training arm in the randomized trial (n = 164); reported effect size = +0.27 grade points and p-value = 0.027.

high positive Training for Technology: Adoption and Productive Use of Gene... Exam score (grade points) on a law-school issue-spotting exam

A brief, targeted training increased voluntary LLM use from 26% (optional access without training) to 41% (optional access with training).

Randomized experiment with 164 law students assigned to three arms (no access, optional access, optional access + ~10-minute training). Observed adoption rates in the two optional-access arms were 26% (untrained) vs. 41% (trained).

high positive Training for Technology: Adoption and Productive Use of Gene... LLM adoption (whether the student used the LLM)
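The adoption figures in this entry can be read either as an absolute or as a relative change. The short sketch below works through that arithmetic using only the two reported rates (26% and 41%). The function name is hypothetical, and no per-arm sample sizes are assumed, so it makes no claim about statistical significance.

```python
def adoption_change(untrained_pct, trained_pct):
    """Return (percentage-point difference, relative increase) between two rates."""
    pp_diff = trained_pct - untrained_pct
    relative = pp_diff / untrained_pct
    return pp_diff, relative


# Reported adoption rates: 26% without training, 41% with training.
pp_diff, relative = adoption_change(26, 41)
print(f"{pp_diff} percentage points, {relative:.1%} relative increase")
# prints "15 percentage points, 57.7% relative increase"
```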
