The Commonplace

Evidence (2215 claims)

Adoption: 5126 claims
Productivity: 4409 claims
Governance: 4049 claims
Human-AI Collaboration: 2954 claims
Labor Markets: 2432 claims
Org Design: 2273 claims
Innovation: 2215 claims
Skills & Training: 1902 claims
Inequality: 1286 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome                   Positive  Negative  Mixed  Null  Total
Other                          369       105     58   432    972
Governance & Regulation        365       171    113    54    713
Research Productivity          229        95     33   294    655
Organizational Efficiency      354        82     58    34    531
Technology Adoption Rate       277       115     63    27    486
Firm Productivity              273        33     68    10    389
AI Safety & Ethics             112       177     43    24    358
Output Quality                 228        61     23    25    337
Market Structure               105       118     81    14    323
Decision Quality               154        68     33    17    275
Employment Level                68        32     74     8    184
Fiscal & Macroeconomic          74        52     32    21    183
Skill Acquisition               85        31     38     9    163
Firm Revenue                    96        30     22          148
Innovation Output              100        11     20    11    143
Consumer Welfare                66        29     35     7    137
Regulatory Compliance           51        61     13     3    128
Inequality Measures             24        66     31     4    125
Task Allocation                 64         6     28     6    104
Error Rate                      42        47      6            95
Training Effectiveness          55        12     10    16     93
Worker Satisfaction             42        32     11     6     91
Task Completion Time            71         5      3     1     80
Wages & Compensation            38        13     19     4     74
Team Performance                41         8     15     7     72
Hiring & Recruitment            39         4      6     3     52
Automation Exposure             17        15      9     5     46
Job Displacement                 5        28     12            45
Social Protection               18         8      6     1     33
Developer Productivity          25         1      2     1     29
Worker Turnover                 10        12      3            25
Creative Output                 15         5      3     1     24
Skill Obsolescence               3        18      2            23
Labor Share of Income            7         4      9            20
Filter: Innovation
The CNN EM surrogate evaluates candidate layouts orders of magnitude faster than full-wave EM simulation, making global search of the discrete pixel design space tractable.
Authors state the surrogate provides orders-of-magnitude speedups over full-wave EM, enabling global search; no quantitative speedup numbers or benchmarking details appear in the summary.
medium positive Deep Learning-Driven Black-Box Doherty Power Amplifier with ... evaluation time per candidate layout (surrogate inference time vs full-wave EM s...
A deep convolutional neural network (CNN) trained as an electromagnetic (EM) surrogate can predict S-parameters of pixelated passive networks quickly and with sufficient accuracy to be used inside an optimizer loop.
Paper reports development and use of a CNN surrogate mapping pixelated network layouts to S-parameters; the surrogate was embedded in the optimizer and used to evaluate candidate layouts during global search. (Note: exact training dataset size, architecture, and error metrics are not provided in the summary.)
medium positive Deep Learning-Driven Black-Box Doherty Power Amplifier with ... S-parameter prediction accuracy and inference runtime sufficient for optimizer u...
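The mechanism behind these two claims, a fast learned surrogate evaluated inside a discrete global-search loop, can be sketched as follows. Everything here is illustrative: the fixed random linear map stands in for the paper's trained CNN, the scalar cost stands in for an S-parameter objective, and the bit-flip search stands in for whatever global optimizer the authors actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in surrogate: the paper trains a CNN mapping a pixelated layout
# to S-parameters; here a fixed random linear map plays that role so the
# optimizer loop itself is runnable.
N_PIX = 16 * 16
W = rng.normal(size=N_PIX)

def surrogate_s21(pixels):
    # Pretend |S21| at the design frequency; a real surrogate would
    # return S-parameters over a frequency grid.
    return np.tanh(pixels @ W / np.sqrt(N_PIX))

def cost(pixels, target=0.8):
    return (surrogate_s21(pixels) - target) ** 2

# Global search over the discrete pixel space: random bit-flip hill
# climbing, feasible only because each surrogate evaluation is cheap
# compared with a full-wave EM run.
def optimize(iters=2000):
    best = rng.integers(0, 2, N_PIX).astype(float)
    best_c = cost(best)
    for _ in range(iters):
        cand = best.copy()
        i = rng.integers(N_PIX)
        cand[i] = 1 - cand[i]          # flip one pixel on/off
        c = cost(cand)
        if c < best_c:
            best, best_c = cand, c
    return best, best_c

layout, c = optimize()
print(c)   # surrogate-predicted response driven toward the target
```

In a real flow, only the final shortlisted layouts would be re-verified in the full-wave EM solver.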
Adopting this approach shifts required skills and organizational roles away from lengthy parametric modeling toward data engineering, controller integration, and monitoring.
Authors' discussion of practical/organizational implications (qualitative); argument based on removal of model-building step and increased emphasis on data infrastructure and online operations.
medium positive Data-driven generalized perimeter control: Zürich case study changes in required skills/organizational roles (qualitative workforce compositi...
DeePC outperforms baseline controllers (e.g., fixed-time and standard adaptive schemes) in the simulated experiments.
Comparative simulation experiments reported in the paper where DeePC-controlled signals achieve superior system-level metrics relative to baseline controllers.
medium positive Data-driven generalized perimeter control: Zürich case study system-level outcomes (total travel time, CO2 emissions) compared across control...
The method was validated on a very large, high-fidelity microscopic closed-loop simulator of Zürich; the paper reports this as the largest such closed-loop urban-traffic simulation in the literature.
Authors' description of the experimental environment: city-scale microscopic simulator of Zürich with controller in the loop; explicit statement in the paper claiming it is the largest closed-loop urban-traffic simulation reported in the literature.
medium positive Data-driven generalized perimeter control: Zürich case study scale of validation (city-scale microscopic closed-loop simulation)
Regularization and the use of measured Hankel/data matrices make the method more robust to measurement noise and limited data.
Method description includes regularization terms in the DeePC optimization and use of Hankel matrices built from measured trajectories; simulation experiments show continued performance under noisy / limited-data conditions.
medium positive Data-driven generalized perimeter control: Zürich case study robustness to measurement noise and limited data (performance degradation metric...
DeePC handles sparse or limited traffic measurements better than many machine-learning methods.
Claims in the paper supported by experiments and methodological notes: use of Hankel structures and regularization in DeePC to operate with limited/sparse sensing; comparative statements versus generic ML methods (qualitative and simulation evidence).
medium positive Data-driven generalized perimeter control: Zürich case study controller performance (e.g., travel time, emissions) under sparse sensing / lim...
The DeePC-based approach avoids the expensive, time-consuming model-building step required by model-based control methods.
Methodological argument and demonstration that controller uses historical input–output trajectories directly rather than requiring separate parametric model identification; supported by simulation implementation that bypasses model identification.
medium positive Data-driven generalized perimeter control: Zürich case study need for explicit parametric model identification (development time/effort proxy...
The model provides multi-mode reasoning: non-reasoning, Italian/English reasoning, and a 'turbo-reasoning' concise bullet-point mode intended for real-time use cases.
Model functionality described by authors: the paper documents multiple operating modes including a concise 'turbo' mode for low-latency outputs. The summary lists these modes but does not provide quantitative latency/quality tradeoff metrics.
medium positive EngGPT2: Sovereign, Efficient and Open Intelligence existence of distinct inference modes and their intended behavioral differences ...
EngGPT2 uses far less training data (and, by implication, training compute) than some large models—reported as about 1/10–1/6 of the data used by larger dense models (e.g., vs. Qwen3 or Llama3).
Comparison of reported token counts: EngGPT2 at ~2.5T tokens vs. stated baselines (Qwen3 36T, Llama3 15T); authors assert training-data reduction in the 1/10–1/6 range. The paper reports token counts but does not provide matched compute/FLOP or training-time comparisons.
medium positive EngGPT2: Sovereign, Efficient and Open Intelligence relative training-data volume (tokens) compared to named baseline models
On benchmarks (MMLU-Pro, GSM8K, IFEval, HumanEval) EngGPT2 matches or is comparable to dense models in the 8B–16B parameter range.
Evaluation reported on the named benchmarks; the paper states comparable benchmark performance to dense 8B–16B models. The summary does not include exact scores, standard deviations, prompt engineering details, dataset overlap checks, or sample sizes per benchmark.
medium positive EngGPT2: Sovereign, Efficient and Open Intelligence benchmark performance metrics (accuracy/score) on MMLU-Pro, GSM8K, IFEval, Human...
Model-merging and targeted continual pre-training were used to amplify limited compute and improve performance without full from-scratch pre-training.
Paper describes using model-merging and targeted continual pre-training to leverage existing strong weights and inject language/domain data efficiently.
medium positive Fanar 2.0: Arabic Generative AI Stack performance improvement attributable to model-merging/continual pre-training met...
Prioritizing data quality over raw scale (curated 120B tokens instead of maximizing token counts) produced better Arabic and cross-lingual performance for the resource budget used.
Paper emphasizes a 'data quality over brute-force scale' strategy and reports benchmark improvements from the curated corpus and targeted training; the causal link is asserted via these results.
medium positive Fanar 2.0: Arabic Generative AI Stack model performance relative to data curation strategy
Those benchmark gains were achieved with roughly 1/8th the pre-training tokens of Fanar 1.0.
Paper states the approach used approximately 1/8th the pre-training tokens of Fanar 1.0 while improving benchmarks; exact token counts for Fanar 1.0 not provided in the summary.
medium positive Fanar 2.0: Arabic Generative AI Stack relative pre-training token count (Fanar 2.0 vs Fanar 1.0)
Fanar-27B reports benchmark gains relative to Fanar 1.0: Arabic knowledge +9.1 points, language ability +7.3 points, dialect handling +3.5 points, and English capability +7.6 points.
Paper reports these specific numeric benchmark improvements across Arabic knowledge, general language ability, dialects, and English capability; evaluation suite names, sample sizes, and statistical details are not specified in the summary.
medium positive Fanar 2.0: Arabic Generative AI Stack benchmark scores (Arabic knowledge, language ability, dialect handling, English ...
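As a minimal illustration of the model-merging idea invoked in the claims above (the summary does not specify Fanar 2.0's actual merging recipe, so treat this purely as a sketch of the general technique), weight-space interpolation of two same-shape checkpoints looks like this:

```python
import numpy as np

# Simple parameter interpolation between two checkpoints with identical
# shapes; real merging recipes vary (task vectors, spherical interpolation,
# per-layer weights), but the element-wise form is the common core.
def merge(state_a, state_b, alpha=0.5):
    assert state_a.keys() == state_b.keys()
    return {k: alpha * state_a[k] + (1 - alpha) * state_b[k] for k in state_a}

rng = np.random.default_rng(0)
base = {"w": rng.normal(size=(4, 4)), "b": np.zeros(4)}
tuned = {"w": base["w"] + 0.1, "b": base["b"] + 0.1}   # e.g. domain-adapted

merged = merge(base, tuned, alpha=0.5)
print(np.allclose(merged["w"], base["w"] + 0.05))   # True
```

Targeted continual pre-training would then resume training the merged weights on the curated domain corpus, rather than starting from scratch.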
Pretraining corpora must be broadened across temporal scales and domains (including high-frequency domains) to improve the generalization of time-series foundation models (TSFMs).
Recommendation follows from observed poor transfer and fine-tuning results; paper argues for inclusion of high-frequency, domain-diverse data in pretraining. This is prescriptive and driven by the benchmarking observations rather than an experiment demonstrating improved outcomes after broadened pretraining.
medium positive Bridging the High-Frequency Data Gap: A Millisecond-Resoluti... expected improvement in model generalization (forecasting performance) if pretra...
FederatedFactory recovers centralized-model performance without pooling raw data or relying on a central dataset, thereby weakening dependence on foundation-model vendors and their pretrained priors.
Empirical claims that federated results match centralized upper bounds on tested datasets and methodological statement that no external pretrained priors are required; the economic interpretation is drawn from these empirical and methodological properties.
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... performance gap vs. centralized model; dependence on external pretrained priors
FederatedFactory enables exact modular unlearning: deterministic deletion of a client's generative module exactly removes that client's contribution to synthesized datasets.
Design claim in the paper: generative modules are modular assets, and deleting a module deterministically prevents its use when synthesizing the balanced dataset; paper asserts exact modular unlearning and reports it as a property of the method. (No formal auditing metrics or proofs provided in the summary.)
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... unlearning correctness (module-level removal effect on synthesized dataset compo...
Downstream discriminative models trained on the synthesized, balanced datasets avoid conflicting optimization trajectories that cause collapse in standard federated learning under mutually exclusive labels.
Methodological reasoning (balanced synthesized training data removes label heterogeneity across clients) plus empirical demonstrations where standard FL collapses under mutual exclusivity (e.g., CIFAR baseline) and FederatedFactory recovers performance.
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... optimization stability / avoidance of collapsed training (measured indirectly vi...
Across diverse medical imagery benchmarks (including MedMNIST and ISIC2019), FederatedFactory matches centralized upper-bound performance.
Empirical comparisons reported in the paper: FederatedFactory results are compared against a centralized upper bound on the same datasets and reported to be matched. (Details of which datasets and exact numeric comparisons beyond ISIC2019 are not enumerated in the summary.)
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... classification performance vs. centralized upper bound (accuracy/AUROC)
FederatedFactory restores ISIC2019 performance to AUROC = 90.57% under the tested regime.
Empirical experiment reported on ISIC2019 (dermatology images); paper reports AUROC value of 90.57% for FederatedFactory. (Exact train/test splits and client partitioning not specified in the summary.)
FederatedFactory operates without relying on external pretrained foundation models (zero-dependency).
Paper explicitly states the framework does not depend on pretrained foundation models; experiments are reported without using external pretraining (datasets: MedMNIST suite, ISIC2019, CIFAR-10).
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... dependency on pretrained models (binary: uses / does not use)
By synthesizing class-balanced datasets locally from exchanged generative modules, FederatedFactory eliminates gradient conflict among clients' discriminative updates.
Mechanistic argument in the paper (training discriminative models on locally synthesized, balanced data avoids heterogeneity-induced conflicting gradients) supported by empirical recovery of performance in experiments where baselines collapse under label heterogeneity.
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... reduction/elimination of gradient conflict (inferred via improved downstream per...
FederatedFactory reframes federated learning by exchanging generative modules (priors) instead of exchanging discriminative model weights.
Methodological description in the paper: design of FederatedFactory where each client trains/contributes generative modules (class-specific priors) and shares those modules rather than classifier weights. Evidence is the described protocol and experiments that implement that protocol on the reported datasets.
medium positive FederatedFactory: Generative One-Shot Learning for Extremely... unit of federation / protocol (generative modules vs. discriminative weights)
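The protocol these claims describe, exchanging per-class generative modules and training discriminators on locally synthesized class-balanced data, can be caricatured on toy 2-D data. The Gaussian "modules" and nearest-mean classifier below are stand-ins for the paper's generative and discriminative architectures, chosen only so the exchange, synthesis, and module-deletion (unlearning) steps are runnable.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each client holds ONE class (mutually exclusive labels) and publishes
# a tiny generative module (here a Gaussian fit) instead of classifier
# weights.
def fit_module(x):
    return {"mean": x.mean(axis=0), "cov": np.cov(x.T) + 1e-6 * np.eye(x.shape[1])}

def sample_module(m, n, rng):
    return rng.multivariate_normal(m["mean"], m["cov"], size=n)

# Private client data: client 0 only sees class 0, client 1 only class 1.
x0 = rng.normal(loc=[0, 0], scale=0.5, size=(200, 2))
x1 = rng.normal(loc=[3, 3], scale=0.5, size=(200, 2))
modules = {0: fit_module(x0), 1: fit_module(x1)}   # what gets exchanged

# Any participant synthesizes a class-balanced dataset from the modules
# and trains a discriminative model locally (nearest class mean here),
# so no client's raw data or conflicting gradients are ever pooled.
def synthesize(modules, n_per_class, rng):
    xs, ys = [], []
    for label, m in modules.items():
        xs.append(sample_module(m, n_per_class, rng))
        ys.append(np.full(n_per_class, label))
    return np.vstack(xs), np.concatenate(ys)

X, y = synthesize(modules, 100, rng)
means = {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(x, means):
    labels = list(means)
    d = [np.linalg.norm(x - means[c], axis=1) for c in labels]
    return np.array(labels)[np.argmin(d, axis=0)]

# Modular unlearning: deleting client 1's module removes its
# contribution from every future synthesized dataset.
del modules[1]
X2, y2 = synthesize(modules, 100, rng)
print(sorted({int(v) for v in y2}))   # [0]
```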
There is an economic case for funding access to quantum hardware, standardized benchmarking infrastructure, and shared datasets to reduce deployment uncertainty and enable credible claims of usefulness.
Policy and R&D recommendation inferred from the review's finding of heterogeneous benchmarking and missing hardware tests; argued as a mitigation to the identified deployment gap.
medium positive Generative AI for Quantum Circuits and Quantum Code: A Techn... recommendation for funding/hardware access and standardized benchmarking
Most of the surveyed systems address semantic correctness (Layer 2) to some degree.
The review's application of Layer 2 found that a majority of the 13 systems include semantic-level evaluations (e.g., unitary equivalence tests, functional tests, simulator-based correctness checks), though the depth varied.
medium positive Generative AI for Quantum Circuits and Quantum Code: A Techn... presence and extent of semantic-correctness evaluation
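A typical Layer-2 (semantic correctness) check of the kind the review tallies is unitary equivalence: two circuits are functionally equivalent iff their unitaries agree up to a global phase. A minimal single-qubit sketch (the gate set and tolerance are illustrative, not taken from any surveyed system):

```python
import numpy as np

# Single-qubit gates as matrices.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Z = np.diag([1, -1]).astype(float)
X = np.array([[0, 1], [1, 0]], dtype=float)

def equivalent_up_to_phase(U, V, tol=1e-9):
    # U ~ V iff V^dagger U is a global phase times the identity; if they
    # are equivalent, M[0, 0] carries that phase and is nonzero.
    M = V.conj().T @ U
    phase = M[0, 0] / abs(M[0, 0]) if abs(M[0, 0]) > tol else 1.0
    return np.allclose(M / phase, np.eye(M.shape[0]), atol=1e-8)

# Well-known identity: H Z H = X (gate list composed right-to-left).
circuit_unitary = H @ Z @ H
print(equivalent_up_to_phase(circuit_unitary, X))   # True
```

Checks like this scale poorly with qubit count, which is why surveyed systems often fall back to simulator-based functional tests on sampled inputs.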
Policies improving data sharing, standardization, and model transparency would increase overall welfare by reducing duplication and improving model performance.
Policy argumentation in the paper drawing on economic theory and examples where shared datasets/standards improved research productivity.
medium positive Has AI Reshaped Drug Discovery, or Is There Still a Long Way... research productivity and welfare as affected by data-sharing, standardization, ...
Organizations that tightly integrate AI teams with experimental groups achieve higher productivity.
Case studies and internal metrics cited in the paper showing improved throughput and candidate progression in integrated teams versus siloed approaches.
medium positive Has AI Reshaped Drug Discovery, or Is There Still a Long Way... organizational productivity (throughput, candidate progression) as a function of...
Value accrues to firms that control high-quality data, integrated platforms, and wet-lab validation—data and experimental capacity are strategic assets.
Market and organizational analysis in the paper citing examples of firms leveraging proprietary data/platforms and wet-lab capabilities to advance candidates more effectively.
medium positive Has AI Reshaped Drug Discovery, or Is There Still a Long Way... firm success/value correlated with possession of high-quality data, integrated p...
AI reduces time and cost in early-stage discovery (discovery-to-candidate), lowering per-candidate screening and design costs.
Reported case studies and cost/time comparisons in the paper showing faster candidate identification and reduced experimental burden in early stages; aggregated industry claims summarized.
medium positive Has AI Reshaped Drug Discovery, or Is There Still a Long Way... time and monetary cost from discovery to candidate selection; per-candidate scre...
Several AI-guided molecules have entered clinical trials and show encouraging early-phase indicators.
Industry reports and trial registries summarized in the paper reporting multiple AI-guided programs reaching Phase I/II; company disclosures and early-phase biomarker or safety readouts referenced.
medium positive Has AI Reshaped Drug Discovery, or Is There Still a Long Way... number of AI-guided molecules entering clinical trials and their early-phase cli...
Recommendations for policy include investing in public data infrastructure and standards, promoting regulatory clarity for AI validation, and supporting equitable access to AI-driven innovations.
Policy recommendations derived from synthesis of challenges and potential remedies presented in the narrative review; based on conceptual policy analysis and examples rather than empirical testing of interventions.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... policy adoption (infrastructure, standards); measures of equitable access and re...
Policies that incentivize interoperable, privacy-preserving data sharing (e.g., federated data, common standards) can reduce entry barriers and improve social returns from AI in drug R&D.
Policy analysis and recommendations from the review, supported by conceptual arguments and examples of federated/privacy-preserving platforms; limited empirical validation of large-scale impact.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... data-sharing uptake; entry barriers; measures of social return (access, innovati...
AI has the potential to raise R&D productivity by shortening timelines and reducing certain failure modes, thereby increasing the net present value (NPV) of successful drug projects.
Economic reasoning and projections based on documented process improvements in the reviewed studies and reports; not validated by longitudinal, generalized financial analyses in the literature.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... R&D productivity metrics (time, success probability) and financial outcomes (NPV...
AI enhances post-market safety signal detection using real-world data analytics.
Industry and regulatory reports and published studies in the review documenting improved detection or earlier identification of safety signals in pharmacovigilance applications using ML on real-world datasets.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... sensitivity/timeliness of safety signal detection; false positive/negative rates...
AI-enabled adaptive and enrichment trial designs increase trial efficiency and statistical power.
Methodological studies, clinical-trial case studies, and regulatory guidance summarized in the review showing applications of ML to adaptive/enrichment designs; evidence mainly illustrative and context-specific.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... trial efficiency metrics (sample size, duration, cost) and statistical power or ...
AI improves predictive toxicity and ADMET models, which can reduce late-stage failures.
Multiple empirical studies and industry case reports aggregated in the narrative review demonstrating improved in silico toxicity/ADMET prediction performance in specific settings; heterogeneity across datasets and endpoints; not a formal meta-analysis.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... predictive accuracy of toxicity/ADMET models; late-stage failure rates
AI can reduce time-to-market and lower some drug development costs.
Synthesis of case studies, industry reports, and empirical studies reported in the narrative review that document examples of compressed timelines and cost savings in parts of the pipeline; review notes lack of long-run, generalized ROI estimates.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... time-to-market; development costs (component-level, not comprehensive program-le...
AI is materially accelerating discovery and development steps in pharmaceutical R&D, improving target identification, lead optimization, safety prediction, and adaptive trial design.
Narrative review synthesizing published studies, review articles, industry and regulatory reports; evidence primarily consists of empirical studies and case studies covering preclinical and clinical-stage applications. No pooled quantitative meta-analysis; heterogeneous methods and therapeutic areas.
medium positive From Algorithm to Medicine: AI in the Discovery and Developm... discovery and development timeline (time-to-market); stage-specific process metr...
Firms with superior proprietary data and integration capability gain competitive advantage, increasing firm-level heterogeneity in AI returns.
Narrative analysis of market structure implications and examples; no cross-firm empirical heterogeneity study included.
medium positive Learning from the successes and failures of early artificial... differential R&D productivity / market performance across firms
Returns to complementary investments (data infrastructure, experiment automation, cross-disciplinary teams) increase as AI becomes more central to discovery workflows.
Synthesis of adoption lessons and case examples emphasizing complementary capital; no quantitative ROI estimates provided.
medium positive Learning from the successes and failures of early artificial... incremental R&D productivity attributable to complementary investments
Embedding AI into organizational processes, decision-making, and wet-lab validation is crucial to capturing its value.
Narrative review of adoption and integration lessons from large biopharma experience and illustrative case studies.
medium positive Learning from the successes and failures of early artificial... realized R&D productivity gains attributable to AI integration
Successful AI adoption requires investment in data, talent, and workflows rather than reliance on bolt-on point solutions.
Thematic analysis of adoption-level lessons and industry case examples indicating organizational and infrastructural requirements for realized value.
medium positive Learning from the successes and failures of early artificial... likelihood of successful AI-driven productivity gains / ROI from AI initiatives
AI has produced genuine early-stage breakthroughs in drug discovery, accelerating hit identification and early design cycles.
Narrative expert synthesis and thematic analysis of industry experience over the first decade of AI adoption, illustrated by early-case successes and firm-reported accelerations; no new primary experimental data or causal econometric estimates provided.
medium positive Learning from the successes and failures of early artificial... time-to-hit / hit identification rate / iteration cycle time in early discovery
Public policies that lower frictions for secure data sharing, standardize validation metrics, and support workforce retraining can accelerate beneficial diffusion of AI while managing risks.
Policy recommendation based on the paper's synthesis of enablers and constraints; not empirically tested within the paper.
medium positive AI as the Catalyst for a New Paradigm in Biomedical Research speed and equity of AI diffusion and risk management
AI has the potential to reduce marginal cost and time per candidate (shorter design loops, in silico screening), increasing effective productivity of R&D spend if improvements are validated.
Theoretical and conceptual argument referencing capabilities of generative models and simulation; paper states no new quantitative estimates were produced.
medium positive AI as the Catalyst for a New Paradigm in Biomedical Research marginal cost per candidate, time per candidate, R&D productivity
Workforce upskilling and new roles (e.g., ML engineers embedded in biology teams, AI product managers) are required for effective AI integration in pharma R&D.
Descriptive projection based on observed industry hiring trends and organizational needs; no workforce survey data provided.
medium positive AI as the Catalyst for a New Paradigm in Biomedical Research availability of AI-skilled workforce and role integration
Cloud/federated approaches reduce upfront infrastructure investments and facilitate distributed collaboration.
Conceptual argument based on cloud economics and federated architectures; no quantitative cost-savings or collaboration metrics presented.
medium positive AI as the Catalyst for a New Paradigm in Biomedical Research upfront infrastructure investment and degree of distributed collaboration
Cloud and federated approaches enable access to powerful pre-trained or fine-tunable models while allowing proprietary data to remain controlled (privacy-preserving sharing and model-to-data patterns).
Technological synthesis and examples of federated learning and cloud-hosted ML patterns; no empirical performance or privacy-utility tradeoff measurements reported.
medium positive AI as the Catalyst for a New Paradigm in Biomedical Research access to models, data control/privacy preservation, infrastructure investment n...
Startups can leverage pre-trained models, cloud compute, and hosted toolchains to compete on speed and niche innovation against larger incumbents.
Conceptual observation and illustrative examples; not supported by systematic comparison of startup vs incumbent performance metrics in the paper.
medium positive AI as the Catalyst for a New Paradigm in Biomedical Research startup competitive speed and niche innovation capability