The Commonplace

Evidence (2954 claims)

Adoption: 5126 claims
Productivity: 4409 claims
Governance: 4049 claims
Human-AI Collaboration: 2954 claims
Labor Markets: 2432 claims
Org Design: 2273 claims
Innovation: 2215 claims
Skills & Training: 1902 claims
Inequality: 1286 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 369 105 58 432 972
Governance & Regulation 365 171 113 54 713
Research Productivity 229 95 33 294 655
Organizational Efficiency 354 82 58 34 531
Technology Adoption Rate 277 115 63 27 486
Firm Productivity 273 33 68 10 389
AI Safety & Ethics 112 177 43 24 358
Output Quality 228 61 23 25 337
Market Structure 105 118 81 14 323
Decision Quality 154 68 33 17 275
Employment Level 68 32 74 8 184
Fiscal & Macroeconomic 74 52 32 21 183
Skill Acquisition 85 31 38 9 163
Firm Revenue 96 30 22 148
Innovation Output 100 11 20 11 143
Consumer Welfare 66 29 35 7 137
Regulatory Compliance 51 61 13 3 128
Inequality Measures 24 66 31 4 125
Task Allocation 64 6 28 6 104
Error Rate 42 47 6 95
Training Effectiveness 55 12 10 16 93
Worker Satisfaction 42 32 11 6 91
Task Completion Time 71 5 3 1 80
Wages & Compensation 38 13 19 4 74
Team Performance 41 8 15 7 72
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 17 15 9 5 46
Job Displacement 5 28 12 45
Social Protection 18 8 6 1 33
Developer Productivity 25 1 2 1 29
Worker Turnover 10 12 3 25
Creative Output 15 5 3 1 24
Skill Obsolescence 3 18 2 23
Labor Share of Income 7 4 9 20
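As a reading aid for the matrix above (values copied as printed; note that some rows omit a zero cell), the share of claims with a positive direction for an outcome is its Positive count divided by its row Total. A minimal Python sketch:

```python
# Direction counts for two rows of the evidence matrix, as printed above:
# (Positive, Negative, Mixed, Null, Total)
matrix = {
    "Firm Productivity": (273, 33, 68, 10, 389),
    "AI Safety & Ethics": (112, 177, 43, 24, 358),
}

def positive_share(row):
    """Fraction of claims in this outcome with a positive direction."""
    positive, *_, total = row
    return positive / total

for outcome, row in matrix.items():
    print(f"{outcome}: {positive_share(row):.1%} positive")
# Firm Productivity: 70.2% positive
# AI Safety & Ethics: 31.3% positive
```

The contrast makes the table's point concrete: productivity outcomes skew positive while safety and ethics outcomes skew negative.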
Active filter: Human-AI Collaboration
These operators are presented as conceptual/theoretical bridges rather than immediately quantifiable laboratory units.
Explicit methodological statement in the paper emphasizing interpretive/theoretical intent; no empirical operationalization reported.
high | null result | XChronos and Conscious Transhumanism: A Philosophical Framew... | operationalizability (current lack of direct quantification) of Chronons/Hexachr...
The literature is heterogeneous (different LLM families/sizes, prompting techniques, participant persona modeling, environments, and evaluation protocols), which impedes general conclusions about when LLMs reliably mimic humans.
Review notes wide variation across study designs and methods in the 182 studies; inability to produce a single performance estimate motivated unified conceptual framing.
high | null result | Synthetic Participants Generated by Large Language Models: A... | methodological heterogeneity across studies (variance in models, prompts, evalua...
Evaluations reporting outcomes predominantly relied on learner surveys, knowledge/skill tests, or self‑reported behavior change measures.
Methods of evaluation extracted from the included studies: most used surveys, tests, or self-report measures to assess Kirkpatrick‑Barr levels 1–3.
high | null result | Assessing the effectiveness of artificial intelligence educa... | evaluation methods (surveys, tests, self-report behavior change)
The study used a cross-sectional quantitative survey (purposive sampling) of pharmaceutical-sector employees in Karnataka, India (N = 350) and analyzed relationships using PLS-SEM (SmartPLS 4.0).
Study design and methods as reported in the paper summary: cross-sectional survey, purposive sampling, N = 350, analysis via Partial Least Squares Structural Equation Modeling (SmartPLS 4.0).
high | null result | AI-driven stress management and performance optimization: A ... | study design / methodological characteristics
Policy recommendations include: invest in open metadata standards; fund pilot programs to evaluate ROI (earnings, placement, employer satisfaction); require model governance and periodic external audits for AI-assisted curriculum tools; and support smaller providers via shared infrastructure or accreditation hubs.
Explicit policy recommendations in paper (prescriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | implementation of open metadata standards, number and outcomes of funded pilots,...
Careful attention is needed to validity/reliability of assessments and to selection bias in employment outcome measurement.
Paper's methodological caveat (prescriptive); no empirical bias analysis provided.
high | null result | Curriculum engineering: organisation, orientation, and manag... | assessment validity/reliability metrics; selection bias indicators in outcome me...
Suggested evaluation metrics include placement rates, wage premiums, competency attainment, compliance scores, cost per qualification, and update latency.
Paper's recommended evaluation metrics (prescriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | placement rates, wage premiums, competency attainment, compliance scores, cost p...
Implementation requires integration with information systems for documentation, versioning, metadata, and audit trails, and benefits from continuous monitoring dashboards.
Paper's technical implementation recommendations (prescriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | IT integration level: documentation/versioning/metadata/audit trail availability...
Recommended analysis methods are qualitative (semi-structured interviews, focus groups, document review) and quantitative (surveys, competency mapping, statistical analysis of outcomes), plus systematic audit methods including traceability checks.
Paper's methods section (methodological specification).
high | null result | Curriculum engineering: organisation, orientation, and manag... | use of specified qualitative, quantitative, and audit methods
Data inputs for the framework should include competency taxonomies, labor-market signals, regulatory requirements, learner assessment results, and stakeholder interviews.
Paper's data-input specification (descriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | presence and use of specified data inputs
Management principles emphasised are transparency, traceability of outcomes, IT integration for documentation, and continuous monitoring/evaluation.
Explicit management principles in paper (prescriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | degree of adherence to transparency, traceability, IT integration, continuous mo...
Research and audit should emphasise validity, reliability, and compliance using mixed methods (qualitative interviews/focus groups; quantitative surveys/statistics) and systematic curriculum audits.
Recommended research & audit approach in paper (methodological guidance).
high | null result | Curriculum engineering: organisation, orientation, and manag... | application of mixed-methods and systematic audits to assess validity/reliabilit...
Tools recommended include logigrams (visual decision/compliance flows) and algorigrams (algorithmic step-flows for planning, assessment, and audit).
Tool definitions and recommendations in paper (descriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | adoption of logigrams and algorigrams in curricula tooling
Core components of the framework are inputs (learner needs, industry requirements, regulatory standards), processes (curriculum mapping, competency alignment, career assessment), and outputs (structured lesson plans, compliance-ready frameworks, career-path documentation).
Framework component list provided in paper (descriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | presence and completeness of inputs/processes/outputs in implementation
Scope of the program includes curriculum design, organisational management, career-alignment, and audit/compliance processes.
Explicit scope statement in paper (descriptive).
high | null result | Curriculum engineering: organisation, orientation, and manag... | inclusion of specified scope elements in program design
The framework foregrounds logical modelling (logigrams, algorigrams) and mixed-methods data analysis to support design, auditability, and alignment with industry and regulatory standards.
Paper's methodological design and tool recommendations (conceptual). No empirical implementation data reported.
high | null result | Curriculum engineering: organisation, orientation, and manag... | use of logical modelling tools and mixed-methods analysis in curriculum design
The program offers a comprehensive curriculum-engineering framework linking organizational orientation, management systems, lesson planning, and career assessment into traceable, compliance-ready curriculum products.
Paper's program description and framework specification (conceptual); no empirical evaluation or sample size reported.
high | null result | Curriculum engineering: organisation, orientation, and manag... | availability of traceable, compliance-ready curriculum products (framework prese...
The paper calls for subsequent quantitative validation (using task-based, matched employer-employee, and provider-level panel data) to estimate causal impacts on productivity, health outcomes, wages, and employment composition across the three interaction levels.
Stated research agenda and measurement recommendations in the paper's discussion section.
high | null result | Toward human+ medical professionals: navigating AI integrati... | need for causal estimates of productivity, health outcomes, wages, employment co...
The study is qualitative and small-sample (four cases) and therefore interpretive and illustrative rather than statistically generalizable.
Explicit methodological statement in the paper: design = qualitative multiple case study, sample = four AI healthcare applications.
high | null result | Toward human+ medical professionals: navigating AI integrati... | generalizability/external validity
The study identifies a three-level taxonomy of human–AI interaction in healthcare: AI-assisted, AI-augmented, and AI-automated.
Conceptual taxonomy derived from multiple qualitative case studies (n=4) using cross-case comparison and Bolton et al. (2018)'s three-dimensional service-innovation framework.
high | null result | Toward human+ medical professionals: navigating AI integrati... | classification of AI–human interaction (taxonomic mapping)
Few longitudinal or randomized studies were found, which limits the evidence base for causal claims about digital transformation's effect on productivity.
Review recorded a limited number of longitudinal analyses and quasi-experimental designs among the 145 studies; randomized studies were scarce or absent.
high | null result | Digital transformation and its relationship with work produc... | presence/absence of longitudinal/randomized designs relevant to causal inference
Measurement heterogeneity across studies includes self-reported productivity, output-per-worker metrics, and process efficiency indicators.
Extraction of productivity indicators from included studies (detailed in Methods/Extraction fields) showed multiple distinct measurement approaches.
high | null result | Digital transformation and its relationship with work produc... | types of productivity measures used in studies
There is a lack of standardized instruments and inconsistent controls for confounding factors across studies, limiting causal inference about the effect of digital transformation on productivity.
Review extraction documented varied instruments/measures and inconsistent adjustment for confounders across the included studies; few randomized or robust longitudinal designs were found.
high | null result | Digital transformation and its relationship with work produc... | quality of causal inference (control for confounding, presence of randomized/lon...
Heterogeneous definitions of 'digital transformation' and a variety of productivity measurement approaches prevented a formal quantitative meta-analysis.
Extraction found wide variation in how digital transformation and productivity were defined and measured across the 145 studies (self-reported productivity, output per worker, process efficiency metrics, etc.), leading authors to forgo meta-analysis.
high | null result | Digital transformation and its relationship with work produc... | feasibility of quantitative meta-analysis / cross-study comparability
535 records were identified across Scopus, Web of Science, ScienceDirect, IEEE Xplore, and Google Scholar, of which 145 met PRISMA 2020 inclusion criteria.
Search and screening procedure documented in the review: initial database searches yielded 535 records → duplicates removed → screening → full-text evaluation → 145 included studies.
high | null result | Digital transformation and its relationship with work produc... | study selection counts (records identified and studies included)
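The screening counts above imply an overall inclusion rate; a one-line check (counts as quoted from the review):

```python
identified = 535   # records found across the five databases
included = 145     # studies meeting PRISMA 2020 inclusion criteria

inclusion_rate = included / identified
print(f"{included}/{identified} records retained ({inclusion_rate:.1%})")
# 145/535 records retained (27.1%)
```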
There are few large-scale randomized controlled trials (RCTs) showing direct patient outcome improvements from GenAI CDS; high-quality real-world and longitudinal studies are limited but essential.
Evidence-maturity statement in the paper summarizing the literature; the paper explicitly notes scarcity of large RCTs and longitudinal evaluations.
high | null result | GenAI and clinical decision making in general practice | number of large-scale RCTs reporting patient outcome improvements; availability ...
The paper's empirical scope is primarily conceptual/theoretical and literature‑based rather than an empirical case study or large‑scale data experiment; it emphasizes the need for future empirical validation.
Explicit methodological description within the paper stating reliance on literature review and conceptual development; absence of empirical sample or case study.
high | null result | A Review of Manufacturing Operations Research Integration in... | presence/absence of empirical validation within the study
This work is a conceptual framework and design proposal synthesizing methods from recommender systems and HRI rather than a report of novel empirical experiments.
Explicit statement in the Data & Methods section of the paper.
high | null result | Reimagining Social Robots as Recommender Systems: Foundation... | presence/absence of original empirical experiments (absence)
The abstract does not report the study sample size, sectoral scope, or country/context—limiting assessment of external validity and generalizability.
Observation of reporting in the paper's abstract (absence of sample size, sectoral/country context information in the abstract as provided).
high | null result | Reimagining Stakeholder Engagement Through Generative AI: A ... | Completeness of methodological reporting (sample/context disclosure)
The study used a two-stage mixed-methods design: a qualitative exploratory phase to surface determinants of trust and inertia, followed by a quantitative phase to validate the conceptual framework.
Methods description in the paper: explicit two-stage mixed-methods approach (qualitative then quantitative) used to identify and test determinants of initial trust and inertia toward GAICS.
high | null result | Reimagining Stakeholder Engagement Through Generative AI: A ... | Study design / methodological approach
The study has potential selection and ecological-validity constraints because it was conducted at two institutions across six courses, limiting generalizability.
Authors note limitations regarding sample scope (two institutions, six courses) and the ecological validity of the experimental tasks/settings.
high | null result | Expanding the lens: multi-institutional evidence on student ... | external validity/generalizability (limitation)
The study employed a multi-method approach combining experimental quantitative analysis (descriptives, GLM, non-parametric robustness checks) with qualitative topic-based coding of open-ended survey responses.
Methods description: randomized/experimental assignment; quantitative analyses using GLM and non-parametric tests; qualitative topic-based coding of student responses; sample N = 254 across six courses at two institutions.
high | null result | Expanding the lens: multi-institutional evidence on student ... | study methodology (mixed-methods design)
The study did not directly measure accessibility or impacts on students with disabilities, though qualitative results suggest possible intersections with inclusive and multimodal learning design.
Limitation stated by authors: no direct measurement of accessibility outcomes; qualitative responses hinted at potential relevance to inclusive design but no empirical measurement of disability-related impacts.
high | null result | Expanding the lens: multi-institutional evidence on student ... | accessibility/disability-related educational outcomes (not measured)
The study focused on short-term, knowledge-based tasks and did not measure long-term learning or retention.
Authors explicitly note as a limitation that the experimental tasks were short-term and knowledge-based and that long-term retention was not measured.
high | null result | Expanding the lens: multi-institutional evidence on student ... | long-term learning/retention (not measured)
Empirical generalization across all climate-AI systems is constrained by heterogeneous data availability and proprietary models, limiting the ability to produce universal quantitative claims.
Stated methodological limitation in the paper, noting heterogeneous data and the proprietary nature of some models restrict broad generalization.
high | null result | The Rise of AI in Weather and Climate Information and its Im... | Extent of empirical generalizability across climate-AI systems
The paper does not provide granular quantitative estimates of the economic cost of infrastructural asymmetries in climate-AI.
Explicit limitation stated by the authors in the Methods/Limitations section.
high | null result | The Rise of AI in Weather and Climate Information and its Im... | Absence of quantified economic cost estimates in the paper
There is a need for empirical research quantifying earnings dispersion, labor substitution effects, and the welfare impacts of GenAI-driven content economies over time.
Explicit research recommendation made in the paper based on gaps identified during analysis of the 377 videos (study is qualitative and does not measure these outcomes).
high | null result | Monetizing Generative AI: YouTubers' Collective Knowledge on... | absence of quantitative measures in current study / identified need for future m...
The analysis identifies ten shared use cases that creators present as pathways to income using GenAI.
Coding of the 377-video corpus resulted in a catalog of ten use cases (as reported in the paper).
high | null result | Monetizing Generative AI: YouTubers' Collective Knowledge on... | count and identification of distinct use-case categories (ten)
Because the sample is small and purposive and the design is qualitative, insights are rich but not statistically representative or quantified across the broader research landscape.
Authors' stated study limitations in the paper acknowledging small purposive sample (n=16) and qualitative design.
high | null result | RCTs & Human Uplift Studies: Methodological Challenges and P... | representativeness and generalizability of study findings
The study's data come from semi-structured interviews with 16 expert practitioners across biosecurity, cybersecurity, education, and labor.
Study methods reported in the paper: qualitative data source explicitly stated as 16 semi-structured interviews across listed domains.
high | null result | RCTs & Human Uplift Studies: Methodological Challenges and P... | sample size and domain coverage of interviews
The authors released their code and data for reproducibility at https://github.com/blocksecteam/ReEVMBench/.
Statement in the paper indicating public release of code and dataset at the provided GitHub URL.
high | null result | Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contra... | code_and_data_availability (repository_link)
Crystallization Efficiency (CE) is defined as Useful_Crystallized_Knowledge / (Human_Effort × Time).
Operational formalism and metric definitions presented in the paper (explicit formula provided). This is a proposed metric, not an empirically validated measure.
high | null result | Nurture-First Agent Development: Building Domain-Expert AI A... | Crystallization Efficiency as defined
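The CE formula above is presented in the paper as a proposed, not empirically validated, metric. A minimal sketch of how it could be computed, with all units and values hypothetical since the paper leaves operationalization open:

```python
def crystallization_efficiency(useful_knowledge: float,
                               human_effort: float,
                               time: float) -> float:
    """CE = Useful_Crystallized_Knowledge / (Human_Effort x Time).

    Units are whatever a team standardizes on, e.g. validated
    knowledge items, person-hours of curation, elapsed days.
    """
    return useful_knowledge / (human_effort * time)

# Hypothetical example: 30 validated knowledge items crystallized
# with 10 person-hours of curation over 5 days.
ce = crystallization_efficiency(30, 10, 5)
print(ce)  # 0.6
```

Higher CE means more reusable knowledge captured per unit of human effort and time, which is the trade-off the metric is meant to track.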
The paper proposes operational patterns (Dual-Workspace Pattern separating live interaction workspace and persistent knowledge workspace) and a Spiral Development Model (iterative interaction → crystallization → validation → redeployment).
Operational framework section describing patterns and workflows; illustrated in the case study implementation.
high | null result | Nurture-First Agent Development: Building Domain-Expert AI A... | existence and application of dual-workspace and spiral development workflows
The Knowledge Crystallization Cycle formalizes operations (extract, synthesize, validate, integrate) and proposes efficiency and quality metrics including Crystallization Efficiency (CE), Fidelity, Reuse Rate, and Freshness/Volatility Score.
Operational formalism section of the paper presenting metric definitions and proposed calculations (e.g., CE = Useful_Crystallized_Knowledge / (Human_Effort × Time)). These are proposed metrics, not validated at scale.
high | null result | Nurture-First Agent Development: Building Domain-Expert AI A... | Crystallization Efficiency and related proposed metrics
The paper introduces a Three-Layer Cognitive Architecture that organizes agent knowledge by volatility and degree of personalization (stable/core knowledge; institutionalized heuristics/patterns; volatile/session-level tacit details).
Architectural specification presented in the paper (conceptual design document). No experimental validation beyond the illustrative case study.
high | null result | Nurture-First Agent Development: Building Domain-Expert AI A... | categorization of knowledge artifacts into three volatility/personalization laye...
Nurture-First Development (NFD) reframes agent creation from a one-time engineering task into a continuous, conversational growth process.
Conceptual formalization in the paper (architectural and operational descriptions). No large-scale empirical test reported; supported by theoretical argumentation and illustrative examples.
high | null result | Nurture-First Agent Development: Building Domain-Expert AI A... | characterization of development process (one-time vs. continuous conversational ...
Findings are based on a student sample rating decontextualized messages, so external validity to industry communication or real project logs is uncertain and requires replication.
Study sample consisted of 81 students in team-based software projects labeling decontextualized statements; authors explicitly note this limitation as a caveat.
high | null result | Exploring Indicators of Developers' Sentiment Perceptions in... | generalizability/external validity of the study findings to non-student, context...
Many apparent correlations between predictors and sentiment labels do not remain significant after global multiple-testing correction.
Correlation analyses across many predictors with explicit application of multiple-testing correction procedures; many initial signals failed to survive correction.
high | null result | Exploring Indicators of Developers' Sentiment Perceptions in... | statistical significance of correlations between predictors (e.g., mood, team me...
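To illustrate the mechanics behind that finding (with hypothetical p-values, not the paper's data), a Holm-Bonferroni step-down correction shows how predictors that clear the nominal 0.05 threshold individually can still fail a global correction:

```python
def holm_correction(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: visit p-values in ascending order and
    compare the k-th smallest (0-indexed) against alpha / (m - k);
    stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for k, i in enumerate(order):
        if p_values[i] <= alpha / (m - k):
            rejected[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return rejected

# Four of these five hypothetical p-values are below 0.05 on their own,
# but only the first survives the global correction.
p = [0.001, 0.014, 0.03, 0.045, 0.2]
print(holm_correction(p))  # [True, False, False, False, False]
```

This is the general pattern the review describes: many nominally significant correlations, few survivors after correcting for the number of tests.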
The paper does not provide quantitative estimates of time saved per report, cost reductions, or effects on employment/wages; such economic impacts remain to be quantified.
Caveats noted in the paper: absence of quantitative estimates for time/cost/employment effects and a call for field trials and economic modeling. This is explicitly stated in the summary.
high | null result | Bridging the Skill Gap in Clinical CBCT Interpretation with ... | Absence of quantitative economic impact estimates (time saved, cost reduction, e...
The paper used a clinically grounded, multi-level evaluation framework that separately assessed raw AI drafts (automatic metrics + clinician review) and radiologist-AI collaborative final reports (how radiologists edit and downstream clinical effects), including comparisons across radiologist experience levels.
Methodology section summarized in the paper: multi-level assessment covering AI drafts and radiologist-edited collaborative reports; combination of automatic metrics and radiologist-/clinician-centered evaluations; experience-level stratified analyses (novice/intermediate/senior).
high | null result | Bridging the Skill Gap in Clinical CBCT Interpretation with ... | Evaluation framework components (draft assessment, collaborative report assessme...