Evidence (4175 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
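The matrix is easier to compare across outcomes when counts are turned into shares. A minimal sketch, using four rows copied from the table above: shares are computed over the four listed direction columns rather than the Total column, because the two do not always agree (for example, the Firm Productivity directions sum to 600 against a listed total of 606). The function name `direction_shares` is illustrative, not part of the source.

```python
# Direction counts (positive, negative, mixed, null) copied from the matrix.
rows = {
    "Firm Productivity": (435, 57, 88, 20),
    "Error Rate": (69, 92, 10, 2),
    "Job Displacement": (12, 80, 20, 1),
    "Inequality Measures": (44, 122, 49, 6),
}


def direction_shares(counts):
    """Return each direction's share of the four listed counts."""
    total = sum(counts)
    return tuple(c / total for c in counts)


for outcome, counts in rows.items():
    positive, negative, mixed, null = direction_shares(counts)
    print(f"{outcome:20s} positive={positive:.1%} negative={negative:.1%}")
```

What counts as a "positive" finding depends on how each outcome was coded, so the shares describe the direction labels only.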
Org Design
AI has moved from a peripheral digital capability to a central driver of corporate strategy, reshaping decision-making, customer engagement, operations, and risk exposure.
Statement presented in the paper's introduction and motivation; supported by integrative conceptual design and literature grounding (theory and descriptive citations). No empirical sample or quantitative analysis reported.
Adoption of AI can reduce procurement costs by 15.7%.
Field survey data (n=326) and regression analysis; authors report a 15.7% reduction in procurement costs associated with AI adoption.
Adoption of AI can shorten the procurement decision-making cycle by 21.3%.
Field survey data (n=326) collected by questionnaire and analyzed with multiple linear regression; authors report a 21.3% reduction in the procurement decision-making cycle associated with AI adoption.
Supplier AI capability positively drives AI adoption in procurement (β = 0.28, p < 0.01).
Same questionnaire survey (n=326) and multiple linear regression analysis; reported coefficient β=0.28 with p<0.01.
Perceived usefulness positively drives AI adoption in procurement (β = 0.32, p < 0.01).
Questionnaire survey of 326 procurement managers/supply chain managers in SMEs (Yangtze River Delta and Pearl River Delta) analyzed using multiple linear regression; reported coefficient β=0.32 with p<0.01.
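The procurement claims above rest on multiple linear regression. The sketch below shows the general technique on synthetic data, not the study's data or code: the sample size (326) and the two coefficients (0.32, 0.28) are taken from the claims, while the variable names, the unit-variance predictors, and the noise level are assumptions made for illustration. Whether the reported β values are standardized is not stated in the excerpt.

```python
import random

random.seed(0)
n = 326  # sample size from the source; all data below is simulated

# Hypothetical predictors, drawn as standard normal scores.
usefulness = [random.gauss(0, 1) for _ in range(n)]
supplier_capability = [random.gauss(0, 1) for _ in range(n)]

# Simulated outcome built from the reported coefficients plus noise.
adoption = [0.32 * u + 0.28 * s + random.gauss(0, 1)
            for u, s in zip(usefulness, supplier_capability)]


def ols(columns, target):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    X = [[1.0, *vals] for vals in zip(*columns)]  # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, target))]
         for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(k):
            if r != i:
                factor = A[r][i] / A[i][i]
                A[r] = [a - factor * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]


intercept, b_usefulness, b_supplier = ols([usefulness, supplier_capability], adoption)
print(f"usefulness: {b_usefulness:.2f}, supplier capability: {b_supplier:.2f}")
```

With a sample of this size the fitted coefficients should land near the values used to simulate the data, which is all the sketch is meant to show.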
The paper provides recommendations for designing strategic indicators to drive adoption, foster innovation, and objectively assess whether digital tools are delivering top-line impact.
Descriptive claim about the content of the perspective article (the authors state they provide these recommendations); the excerpt itself summarizes this contribution.
The shift from expert-driven computer-aided drug design (CADD) to semiautonomous AI necessitates a new framework of impact-oriented KPIs.
Stated by the EFMC2 community authors as a normative conclusion in the perspective piece; based on the characterisation of a technological shift rather than on presented empirical tests in the excerpt.
Harnessing AI's potential requires moving beyond measuring technical model performance (e.g., predictive accuracy) to measuring strategic impact.
Authors argue this as a conceptual requirement for realizing AI's benefits in R&D; presented as a recommendation rather than supported by quantified empirical evidence in the excerpt.
Preliminary analyses suggest that 'AI-native' companies may be outpacing traditional peers.
Explicitly stated in the paper as based on preliminary analyses; the excerpt provides no details on the analyses, metrics, or sample sizes.
The broad introduction of AI into the R&D landscape in recent years holds the promise of lifting pharmaceutical R&D out of its productivity problem.
Framed as an expectation/promise in the paper; based on recent broad adoption trends of AI in R&D (no specific empirical evaluation or sample size reported in the excerpt).
The visualization preserved human control.
Reported result from the within-subjects experiment (N=32) indicating that using the visualization did not reduce human control/agency in the negotiation process.
In a within-subjects experiment (N=32), the visualization improved efficiency.
Within-subjects experiment (N=32) reported in the paper; the authors state the visualization improved efficiency (likely measured as time, number of rounds, or steps to reach agreement).
In a within-subjects experiment (N=32), the uncertainty-based visualization improved human outcomes.
Within-subjects user experiment reported in the paper with N=32 participants comparing performance with and without the visualization.
We introduce a novel uncertainty-based visualization driven by Bayesian estimation of agreement probability that shows how the space of mutually acceptable agreements narrows as negotiation progresses, helping users identify promising options.
Design and implementation of a visualization technique described in the paper; the visualization is driven by Bayesian estimation of agreement probability and is presented as a tool to reveal the shrinking feasible agreement space during negotiation.
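The excerpt does not say which Bayesian model drives the visualization. As one simple illustration of the idea, the sketch below keeps a Beta belief about the acceptance probability of each candidate agreement and updates it after every accept or reject signal, so the set of promising options shrinks as evidence accumulates. The Beta-Bernoulli model, the option names, the signals, and the 0.5 threshold are all assumptions, not the paper's method.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief about the probability an option is accepted."""
    alpha: float = 1.0  # uniform prior
    beta: float = 1.0

    def update(self, accepted: bool) -> None:
        # Conjugate update: one more success or one more failure.
        if accepted:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


# Hypothetical candidate agreements and observed counterpart reactions.
beliefs = {name: BetaBelief() for name in ("option_a", "option_b", "option_c")}
signals = [("option_a", True), ("option_b", False), ("option_a", True),
           ("option_c", False), ("option_b", False)]

for option, accepted in signals:
    beliefs[option].update(accepted)

# Options whose estimated acceptance probability stays at or above 0.5.
promising = [name for name, belief in beliefs.items() if belief.mean >= 0.5]
print(promising)
```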
Generative AI can autonomously produce novel content, including text, images, models, and scenarios.
General technical/descriptive claim stated in the paper's background/introduction; not an empirically tested claim within the provided excerpt.
Generative AI facilitates the synthesis of structured and unstructured information from diverse sources, enabling managers to explore multiple decision pathways, identify potential risks, and optimize strategic choices.
Descriptive/functional claim made in the paper's introduction and conceptual framing; the empirical component (survey + SEM) is described generally but no specific measures or effect sizes for information synthesis or these capabilities are provided in the excerpt.
Generative AI augments human creativity by producing innovative solutions and scenario-planning alternatives that may not emerge through conventional analytical approaches.
Stated in the conceptual/argumentative portion of the paper; may be supported by survey items, but the excerpt provides no explicit empirical measure or effect size for creativity.
Decision quality and strategic agility positively influence organizational performance.
Reported SEM results from the paper linking the constructs (decision quality and strategic agility) to organizational performance using survey data from senior managers and AI adoption specialists; method = SmartPLS.
Generative AI adoption significantly enhances strategic agility.
Same empirical source as the decision-quality claim: survey of senior managers, decision-makers, and AI adoption specialists; tested via Structural Equation Modeling (SmartPLS) as reported in the paper.
Generative AI adoption significantly enhances decision quality.
Empirical analysis reported in the paper: survey data collected from senior managers, decision-makers, and AI adoption specialists across multiple industries; relationships assessed using Structural Equation Modeling (SmartPLS). No numeric sample size or effect estimate reported in the provided text.
Human-like presentations increased perceived usefulness and agency in certain tasks.
Experimental manipulation of the human-likeness of AI presentation in the study's three tasks; the abstract reports increased perceived usefulness and agency for human-like presentations in some tasks. No sample sizes, task specifics, or effect magnitudes reported in abstract.
A single dissent within a panel reduced pressure to conform.
Experimental manipulation of within-panel consensus (introducing a single dissent) in the study's three tasks; abstract reports that a single dissent lowered conformity pressure. No numerical data provided in abstract.
Accuracy improved for small panels relative to a single AI.
Reported experimental result from the paper's study: participants completed three tasks and received advice from AI panels; panel size was manipulated (small panels vs single AI). The abstract states this accuracy improvement for small panels. (Sample size and exact tasks not reported in abstract.)
By enabling developers without initial capital to participate in the digital economy, RSI could unlock the 'latent jobs dividend' in low-income countries and help address local challenges in health, agriculture, and services.
Societal-impact argument in the paper linking the RSI model to potential employment gains and localized solutions; speculative extrapolation, no empirical employment estimates or pilot studies reported.
The RSI model could stimulate innovation in the ecosystem.
Argument based on lowered financial barriers and incentive structures from the paper's theoretical comparative analysis; no empirical measures of innovation provided.
The RSI model aligns stakeholder interests (platforms and developers).
Theoretical argument and incentive-alignment reasoning in the paper's comparative framework; no empirical validation presented.
A comparative analysis in the paper shows that the RSI model lowers entry barriers for developers.
Detailed comparative (theoretical) analysis within the paper contrasting existing models and RSI; no empirical trial, sample, or randomized test reported.
Generative AI platforms (Google AI Studio, OpenAI, Anthropic) provide infrastructures (APIs, models) that are transforming the application development ecosystem.
Statement in paper based on literature review and descriptive framing of current platforms; no empirical sample or quantitative test reported.
In production, the system received high satisfaction from both domain experts and developers, with all participants reporting full satisfaction with communication efficiency.
Post-deployment user feedback / satisfaction reports mentioned in paper (no numeric participant count provided).
The automated workflow saved an estimated 979 engineering hours.
Aggregate time-savings estimate reported in paper (derived from per-API time reduction × number of APIs).
The automated workflow reduces per-API development time from approximately 5 hours to under 7 minutes.
Time-per-API comparison reported in paper based on evaluation on spapi (comparison of manual vs automated per-API time).
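The per-API figures and the 979-hour total can be checked against each other with simple arithmetic. The sketch below treats "approximately 5 hours" and "under 7 minutes" as point values, which is an assumption, so the implied number of APIs is a rough back-of-envelope figure and not a count reported in the excerpt.

```python
manual_hours_per_api = 5.0        # "approximately 5 hours" (source)
automated_hours_per_api = 7 / 60  # "under 7 minutes", taken at the upper bound
total_hours_saved = 979           # aggregate estimate (source)

saving_per_api = manual_hours_per_api - automated_hours_per_api
implied_api_count = total_hours_saved / saving_per_api
speedup = manual_hours_per_api / automated_hours_per_api

print(f"saving per API: {saving_per_api:.2f} h")
print(f"implied API count: {implied_api_count:.0f}")
print(f"speedup: {speedup:.1f}x")
```

Under these assumptions the total is consistent with an evaluation on the order of 200 APIs.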
The automated workflow achieves a 93.7% F1 score.
Empirical evaluation on spapi (F1 reported); presumably computed over the evaluated API items/endpoints.
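For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below shows the standard definition; the counts in the usage line are hypothetical and are not taken from the spapi evaluation.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
print(f"{f1_score(true_pos=8, false_pos=2, false_neg=1):.3f}")
```

Because F1 penalizes both missed items and spurious ones, a single score near 0.94 does not reveal which kind of error dominates.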
We address this gap through a graph-based workflow optimization approach that progressively replaces manual coordination with LLM-powered services, enabling incremental adoption without disrupting established practices.
Description of proposed method (graph-based workflow + LLM-powered services) and claim of design enabling incremental adoption; supported by subsequent case evaluation.
A large portion of the interactive activities' AI market value (26%) involves transferring information.
Descriptive subcategory statistic: within interactive activities, authors report 26% of market value pertains to information transfer tasks.
Interactive activities (which include both information-based and physical activities) account for 48% of AI market value.
Descriptive aggregate: authors define an 'interactive' category spanning info and physical activities and report it holds 48% of AI market value.
A substantial portion of AI market value (36%) is used in activities that involve creating information.
Descriptive aggregate: subcategory within information-based activities—authors report 36% of market value allocated to 'creating information'.
Most of the AI market value is used in information-based activities (72%).
Descriptive aggregate: authors categorize activities into information-based vs physical and report that 72% of estimated AI market value maps to information-based activities.
There is a highly uneven distribution of AI market value across activities: the top 1.6% of activities account for over 60% of AI market value.
Descriptive statistical result from mapping estimated AI market values to the ~20K activities; authors report concentration metrics (top 1.6% share >60%).
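A concentration metric of this kind is the share of total value held by the top fraction of items. The sketch below computes it on a synthetic Zipf-like distribution over 20,000 activities; the distribution and the function name `top_share` are assumptions for illustration, and the authors' actual market-value estimates are not reproduced here.

```python
def top_share(values, top_fraction):
    """Share of the total held by the largest `top_fraction` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


n_activities = 20_000                   # "~20K activities" (source)
top_count = round(n_activities * 0.016)  # number of activities in the top 1.6%

# Synthetic heavy-tailed values: the activity at rank r gets value 1/r.
synthetic_values = [1 / rank for rank in range(1, n_activities + 1)]

print(f"top 1.6% = {top_count} activities")
print(f"synthetic top-1.6% share: {top_share(synthetic_values, 0.016):.1%}")
```

At roughly 20,000 activities, the top 1.6% corresponds to about 320 activities.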
We use the data about AI software and robotic systems to generate graphical displays of how the estimated units and market values of all worldwide AI systems used today are distributed across the work activities that these systems help perform.
Analytic/mapping procedure: authors combine classifications of software (13,275) and robots (20.8M) with market-value estimates to create visual distributions across activities.
We classify a worldwide tally of 20.8 million robotic systems using the developed work-activity ontology.
Empirical classification/counting: authors report mapping 20.8 million robotic systems worldwide to the activity ontology.
We classify descriptions of 13,275 AI software applications using the developed work-activity ontology.
Empirical classification: authors state they mapped 13,275 AI software application descriptions to the ontology.
We disaggregate and then substantially reorganize the approximately 20K activities in the US Department of Labor's O*NET occupational database to produce a comprehensive ontology of work activities.
Methodological: authors report transforming the O*NET activity taxonomy (~20,000 activity-level records) by disaggregation and reorganization into a new ontology.
Models trained in EnterpriseLab remain robust across diverse enterprise benchmarks, including EnterpriseBench (+10%) and CRMArena (+10%).
Benchmark evaluations in the paper report +10% improvements on EnterpriseBench and CRMArena relative to baseline; exact baselines, statistical tests, and sample sizes are not specified in the abstract.
8B-parameter models trained in EnterpriseLab reduce inference costs by 8-10x compared to frontier models (implicitly GPT-4o).
Empirical cost comparison reported in the paper; the abstract states an 8-10x reduction in inference costs for the 8B models trained in EnterpriseLab versus the referenced frontier model(s). Detailed cost accounting and sample sizes not provided in the abstract.
8B-parameter models trained within EnterpriseLab match GPT-4o's performance on complex enterprise workflows.
Empirical evaluation reported in the paper comparing 8B-parameter models trained in EnterpriseLab to GPT-4o on complex enterprise workflows; specific benchmark tests and metrics are referenced but details (sample sizes, exact metrics) are not provided in the abstract.
We validate the platform through EnterpriseArena, an instantiation with 15 applications and 140+ tools across IT, HR, sales, and engineering domains.
Reported instantiation/experimental setup in the paper: EnterpriseArena contains 15 applications and 140+ tools spanning specified domains.
EnterpriseLab provides integrated training pipelines with continuous evaluation.
System/design claim in paper describing integrated training and evaluation tooling as part of the platform.
EnterpriseLab includes automated trajectory synthesis that programmatically generates training data from environment schemas.
System/design claim described in paper; supported by the authors' description of an automated data-generation component.
EnterpriseLab provides a modular environment exposing enterprise applications via a Model Context Protocol, enabling seamless integration of proprietary and open-source tools.
Feature/design claim in paper; supported by implementation details of the 'Model Context Protocol' and reported integration capabilities in the platform description.
We introduce EnterpriseLab, a full-stack platform that unifies tool integration, data generation, and training into a closed-loop framework.
System/design claim describing the contribution of the paper (platform implementation and architecture); supported by the paper's implementation description rather than independent validation.