The Commonplace

Evidence (7448 claims)

Adoption: 5267 claims
Productivity: 4560 claims
Governance: 4137 claims
Human-AI Collaboration: 3103 claims
Labor Markets: 2506 claims
Innovation: 2354 claims
Org Design: 2340 claims
Skills & Training: 1945 claims
Inequality: 1322 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 378 106 59 455 1007
Governance & Regulation 379 176 116 58 739
Research Productivity 240 96 34 294 668
Organizational Efficiency 370 82 63 35 553
Technology Adoption Rate 296 118 66 29 513
Firm Productivity 277 34 68 10 394
AI Safety & Ethics 117 177 44 24 364
Output Quality 244 61 23 26 354
Market Structure 107 123 85 14 334
Decision Quality 168 74 37 19 301
Fiscal & Macroeconomic 75 52 32 21 187
Employment Level 70 32 74 8 186
Skill Acquisition 89 32 39 9 169
Firm Revenue 96 34 22 152
Innovation Output 106 12 21 11 151
Consumer Welfare 70 30 37 7 144
Regulatory Compliance 52 61 13 3 129
Inequality Measures 24 68 31 4 127
Task Allocation 75 11 29 6 121
Training Effectiveness 55 12 12 16 96
Error Rate 42 48 6 96
Worker Satisfaction 45 32 11 6 94
Task Completion Time 78 5 4 2 89
Wages & Compensation 46 13 19 5 83
Team Performance 44 9 15 7 76
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 18 17 9 5 50
Job Displacement 5 31 12 48
Social Protection 21 10 6 2 39
Developer Productivity 29 3 3 1 36
Worker Turnover 10 12 3 25
Skill Obsolescence 3 19 2 24
Creative Output 15 5 3 1 24
Labor Share of Income 10 4 9 23
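
The matrix can also be read as shares rather than raw counts. The Python sketch below copies four rows from the table above and computes, for each, the positive and negative shares of Total and the number of claims not accounted for by the four direction columns. In several rows the four columns sum to less than Total (729 versus 739 for Governance & Regulation); the page does not explain the gap, so the label `unlisted` is an assumption, as are the function and variable names.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS = {
    "Governance & Regulation": (379, 176, 116, 58, 739),
    "AI Safety & Ethics": (117, 177, 44, 24, 364),
    "Firm Productivity": (277, 34, 68, 10, 394),
    "Task Completion Time": (78, 5, 4, 2, 89),
}


def summarize(row):
    """Return direction shares of Total and the count outside the four columns."""
    positive, negative, mixed, null, total = row
    listed = positive + negative + mixed + null
    return {
        "positive_share": positive / total,
        "negative_share": negative / total,
        # Assumption: the shortfall reflects claims with a direction
        # the matrix does not display; the page does not say so.
        "unlisted": total - listed,
    }


for outcome, row in ROWS.items():
    s = summarize(row)
    print(
        f"{outcome}: {s['positive_share']:.1%} positive, "
        f"{s['negative_share']:.1%} negative, {s['unlisted']} unlisted"
    )
```

Shares make rows of very different size comparable: AI Safety & Ethics is the only one of these four rows in which negative findings outnumber positive ones.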
High upfront costs, weak digital/physical infrastructure, limited access to credit, low digital literacy, insecure land tenure, and sociocultural factors (including gendered access) limit uptake of digital and precision technologies among smallholders.
Consistent findings across program evaluations, qualitative stakeholder interviews, participatory assessments, and case studies cited in the synthesis.
high negative MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION technology adoption rates (uptake), barriers to adoption
Limited access to capital, data, digital infrastructure, and skills, together with insecure land tenure, reduces adoption rates for advanced innovations among smallholders.
Multiple empirical studies and program evaluations synthesized in the review documenting adoption barriers; policy review identifying structural constraints across regions.
high negative MODERN APPROACHES TO SUSTAINABLE AGRICULTURAL TRANSFORMATION adoption rates of AI/IoT/precision tools, uptake of new practices
Integrating AI raises questions of accountability, transparency, fairness, privacy, and bias; managerial responsibility includes governance design, validation, and audit of AI decisions.
Normative and governance-focused synthesis citing ethical frameworks and illustrative cases; identifies governance tasks and validation/audit needs rather than empirical prevalence rates.
high negative Modern Management in the Age of Artificial Intelligence: Str... presence and quality of AI governance mechanisms (accountability frameworks, tra...
Deficits in governance, auditing, and interpretability constrain the safe deployment of generative AI in firms.
Synthesis of industry reports and conceptual literature noting gaps in governance and interpretability; no quantitative governance dataset reported.
high negative The Use of ChatGPT in Business Productivity and Workflow Opt... presence/absence of governance processes, frequency of audit findings, deploymen...
Algorithmic biases in generative AI can amplify and codify discriminatory patterns in organizational decisions.
Extensive literature on algorithmic bias synthesized in the review and applied to generative models; case examples referenced.
high negative The Use of ChatGPT in Business Productivity and Workflow Opt... disparities in decision outcomes (error rates, disparate impact metrics by group...
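
The outcome field above names disparate impact metrics by group. As a hedged illustration of one such metric, the Python sketch below computes per-group selection rates and their ratio to a reference group. The data, group labels, and function names are hypothetical, and the 0.8 threshold is the commonly cited four-fifths heuristic, not something taken from the reviewed paper.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Map each group to its share of favourable decisions.

    decisions: iterable of (group, selected) pairs, selected being a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [selected, total]
    for group, selected in decisions:
        counts[group][0] += int(selected)
        counts[group][1] += 1
    return {group: sel / tot for group, (sel, tot) in counts.items()}


def disparate_impact_ratio(decisions, reference_group):
    """Ratio of each group's selection rate to the reference group's rate."""
    rates = selection_rates(decisions)
    reference_rate = rates[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group has no favourable decisions")
    return {
        group: rate / reference_rate
        for group, rate in rates.items()
        if group != reference_group
    }


# Hypothetical decisions: group A is selected 60% of the time, group B 30%.
decisions = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)

ratios = disparate_impact_ratio(decisions, reference_group="A")
for group, ratio in ratios.items():
    flag = "below" if ratio < 0.8 else "at or above"
    print(f"group {group}: ratio {ratio:.2f} ({flag} the 0.8 heuristic)")
```

A ratio well below 1 signals that a group receives favourable decisions less often than the reference group; it does not by itself establish the cause.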
Generative AI use introduces significant organizational risks including data privacy breaches and leakage when models or third-party services are used.
Conceptual analysis and references to documented incidents and industry reports within the review; no single aggregated incident dataset provided.
high negative The Use of ChatGPT in Business Productivity and Workflow Opt... incidence of data breaches/leakage, number of privacy violations
Generated code can introduce security vulnerabilities.
Security analyses and code audits documenting examples where LLM-generated code contains known vulnerability patterns; incident-oriented case studies and controlled experiments assessing vulnerability incidence.
high negative ChatGPT as a Tool for Programming Assistance and Code Develo... incidence of security vulnerabilities in AI-generated code
LLMs can produce plausible-looking but incorrect or insecure code (so-called 'hallucinations').
Benchmarks and controlled tests demonstrating incorrect outputs; security analyses and replicated examples showing erroneous or insecure snippets produced by LLMs across multiple models and prompts.
high negative ChatGPT as a Tool for Programming Assistance and Code Develo... code correctness/error rate and frequency of insecure code returned
The technical feasibility of robust token verification and resistance to spoofing needs demonstration; it is not yet proven.
Authors explicitly acknowledge this limitation in the paper; no prototypes or red-team results are presented.
high negative Token Taxes: mitigating AGI's economic risks robustness of token verification to spoofing/evasion
Risks: dependence on LLM behavior means hallucinations, bias, or misaligned reasoning can propagate into simulated outcomes; Chain-of-Thought reasoning may be hard to fully verify, posing interpretability/auditability challenges.
Paper's cautions section listing potential failure modes and ethical/interpretability risks; these are identified risks rather than quantified failures observed in experiments.
high negative An LLM-Driven Multi-Agent Simulation Framework for Coupled E... propagation of LLM-induced errors/bias into simulation outcomes and interpretabi...
AI-driven impacts will be heterogeneous across education, race, gender, age, firm size, and geography, implying crucial equity concerns and the need for disaggregated reporting and targeted validation.
Policy analysis and literature synthesis in the paper; this claim reflects widely documented labor economics findings about heterogeneous technological impacts, though no new empirical breakdowns are provided here.
high negative Enhancing BLS Methodologies for Projecting AI's Impact on Em... distribution of employment/wage/transition impacts across demographic and firm/r...
The study is limited by being a single-domain (CMM) case study with a likely modest sample size and dependence on specific AR hardware and MLLM capabilities; further validation across other machines and larger samples is needed.
Authors note these limitations in their discussion; the summary explicitly lists single-case domain, likely modest sample size, and dependency on particular hardware/MLLM as limitations.
high negative Augmented Reality-Based Training System Using Multimodal Lan... External validity/generalizability of findings (limitations stated)
Key failure modes for AI in drug R&D include overfitting, poor generalizability, dataset bias, insufficient external validation, and misalignment with evolving regulatory expectations.
Synthesis of literature and case reports in the narrative review describing observed failures and risks across projects (qualitative evidence).
high negative Artificial Intelligence in Drug Discovery and Development: R... failure incidence of AI projects (model performance collapse, regulatory rejecti...
Absent rigorous controls (validation, applicability-domain reporting, attention to dataset bias), AI models risk overfitting, inequitable outcomes, and regulatory friction, all of which can undermine economic benefits.
Theoretical arguments plus case reports and literature cited in the review documenting instances and mechanisms of overfitting, dataset bias, and regulatory challenges; narrative summary rather than systematic quantification.
high negative Artificial Intelligence in Drug Discovery and Development: R... model generalizability (out-of-sample performance), subgroup performance dispari...
Governing-logic stability uncertainty (whether decision logic or objectives remain stationary) is a distinct risk posed by agentic AI.
Conceptual argument and proposed taxonomy; no empirical tests reported.
high negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... stability of AI decision logic/objectives over time
Epistemic grounding uncertainty (uncertainty about how/why an AI produced a particular output) increases with agentic AI.
Literature synthesis on model-level opacity and causal explanation limits; conceptual reasoning in the paper.
high negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... ability to explain/ground AI outputs
Behavioral trajectory uncertainty (difficulty predicting long-run actions) is a primary form of uncertainty introduced by agentic AI.
Conceptual classification and argument; proposed as one of three principal uncertainties; no empirical estimation.
high negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... predictability of long-run agentic AI actions
Integration cost: AI-generated outputs often require human revision, testing, and manual integration into existing systems.
Reported practitioner experience and observed practices from the field study at Netlight; authors note time and effort spent on revision and integration; no quantitative time-cost estimates provided.
high negative Rethinking How IT Professionals Build IT Products with Artif... human time/effort required to adapt AI outputs for production
AI systems lack full project context, design rationale, and long-term constraints, creating context gaps for development tasks.
Interviews and workflow observations at Netlight where practitioners reported contextual limitations of AI tools; qualitative examples provided; single-firm qualitative evidence.
high negative Rethinking How IT Professionals Build IT Products with Artif... degree of project/contextual awareness in AI-produced recommendations
AI outputs commonly contain errors and hallucinations: generated code can be incorrect, incomplete, or misleading.
Practitioner reports and observed interactions with AI tools documented in the Netlight qualitative study; specific instances and practitioner concerns described in the paper; no quantitative error rates provided.
high negative Rethinking How IT Professionals Build IT Products with Artif... accuracy and correctness of AI-generated outputs
Adaptive RL-driven campaigns complicate attribution and causal inference, so rigorous experimental designs (multi-armed trials, off-policy evaluation) are required for valid measurement.
Methodological claim in the implications section; supported by discussion of policy adaptivity and the need for specific evaluation techniques. No empirical demonstration provided.
high negative Personalized Content Selection in Marketing Using BERT and G... bias in causal estimates, validity of attribution, off-policy evaluation error
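
The claim above names off-policy evaluation as one required technique. A minimal Python sketch of one standard estimator, inverse propensity scoring (IPS), follows. It is not the reviewed paper's method: the channel names, response rates, and both policies are synthetic assumptions chosen so that the true value of the target policy is known and the estimate can be checked against it.

```python
import random


def ips_estimate(logs, target_policy):
    """Inverse propensity scoring estimate of a target policy's mean reward.

    logs: list of (context, action, reward, logging_prob) tuples, where
    logging_prob is the probability the logging policy gave to the action.
    target_policy(context, action) returns the target policy's probability.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        total += reward * target_policy(context, action) / logging_prob
    return total / len(logs)


# Synthetic setup (all values hypothetical).
ACTIONS = ("email", "banner")
TRUE_RESPONSE_RATE = {"email": 0.10, "banner": 0.30}


def logging_policy(context, action):
    return 0.5  # the logged campaign picked each channel uniformly at random


def target_policy(context, action):
    return 0.9 if action == "banner" else 0.1  # candidate policy to evaluate


rng = random.Random(0)  # fixed seed so the run is repeatable
logs = []
for _ in range(20000):
    action = rng.choice(ACTIONS)
    reward = 1.0 if rng.random() < TRUE_RESPONSE_RATE[action] else 0.0
    logs.append((None, action, reward, logging_policy(None, action)))

estimate = ips_estimate(logs, target_policy)
true_value = sum(target_policy(None, a) * TRUE_RESPONSE_RATE[a] for a in ACTIONS)
print(f"IPS estimate: {estimate:.3f}, true value: {true_value:.3f}")
```

IPS needs the logging probabilities to be recorded and nonzero for every action the target policy might take. When the logging policy itself adapts over time, as in the RL-driven campaigns described above, those probabilities change between records and must be logged per decision.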
The system raises privacy, fairness, and safety risks including data leakage, demographic bias in generated content, manipulative targeting, and potential regulatory non-compliance.
Risk assessment and red-team / audit practices described; paper cites known classes of ML deployment risks and recommends logs/audits. This is a conceptual identification rather than a quantified empirical finding.
high negative Personalized Content Selection in Marketing Using BERT and G... incidence/risk of data leakage, demographic bias metrics, examples of manipulati...
Integration and engineering complexity (legacy systems, privacy/compliance pipelines, multi-channel platforms) is a persistent barrier to deployment.
Industry case studies and practitioner reports synthesized in the review documenting integration challenges; no systematic cost accounting or sample sizes presented.
high negative The Effectiveness of ChatGPT in Customer Service and Communi... integration complexity metrics, implementation time/cost, number of integration ...
Hallucinations and factual errors from generative AI can damage service quality and customer trust.
Documented failure cases and empirical reports from the literature aggregated by the review; no novel incident count or experimental data in this paper.
high negative The Effectiveness of ChatGPT in Customer Service and Communi... incidence of factual errors/hallucinations, measures of service quality and cust...
Generative AI is susceptible to social and representational biases and to factual errors or hallucinations; it lacks tacit, contextual domain expertise.
Documented examples in the literature of biased outputs and hallucinations; controlled evaluations and audits of model outputs; qualitative reports highlighting lack of tacit knowledge in domain-specific tasks.
high negative ChatGPT as an Innovative Tool for Idea Generation and Proble... incidence of biased content; factual error/hallucination rate; performance on do...
The quality of AI-generated outputs is highly variable; models frequently produce mediocre but plausible-sounding content that requires human filtering.
Multiple user studies and qualitative reports documenting variability in output quality and the need for human curation; outcome measures include error rates, user-rated quality, and time spent vetting.
high negative ChatGPT as an Innovative Tool for Idea Generation and Proble... output quality distributions; user-perceived quality; time/effort for human filt...
Factual errors and 'hallucinations' create misinformation risks and can produce costly service failures.
Model evaluation studies, incident case reports from deployments, and academic/industry analyses documenting hallucination rates and concrete failure examples.
high negative The Effectiveness of ChatGPT in Customer Service and Communi... factual accuracy / hallucination rate; incidents of service failure (operational...
The study population was restricted to CHI conference papers that had publicly shared study data and analysis code; this self-selected subset may overestimate reproducibility rates for the broader set of CHI papers.
Authors' stated sampling strategy and limitations noted in the paper (sample restricted to artifact-sharing papers and potential overestimation of reproducibility).
high negative On the Computational Reproducibility of Human-Computer Inter... generalizability of the measured reproducibility rate (bias due to sampling)
Ethical, privacy, and legal restrictions sometimes limit the ability to share data and thereby hamper reproducibility.
Authors' observations from reproduction work and survey/interview responses indicating that some datasets could not be shared for legal/ethical reasons.
high negative On the Computational Reproducibility of Human-Computer Inter... incidence of data-sharing restrictions affecting reproducibility
Resource, compute, privacy, and deployment costs associated with CRAEA were not fully quantified in the paper.
Authors note that resource, compute, privacy, and deployment costs were not fully quantified; no cost analyses or benchmarks provided in the summary.
high negative Context-Rich Adaptive Embodied Agents: Enhancing LLM-Powered... Quantification of resource/compute/privacy/deployment costs (absence of measurem...
Evaluation was performed in an artificial/simulated home environment; therefore real-world transfer, robustness to noisy perception, and hardware constraints remain open questions.
Authors explicitly state evaluations occurred in a simulated home environment and acknowledge limits on real-world transfer and robustness. This is a stated limitation rather than an experimental finding.
high negative Context-Rich Adaptive Embodied Agents: Enhancing LLM-Powered... Generalizability/real-world transfer (qualitative limitation)
High linguistic diversity in Africa makes building and evaluating multilingual language technologies more difficult and is a barrier to inclusive AI.
Synthesis of technical literature on NLP and multilingual model development and policy/NGO reports highlighting missing language resources; no original model evaluation reported.
high negative Towards Responsible Artificial Intelligence Adoption: Emergi... language technology availability, model performance across African languages, nu...
Structural constraints—limited digital infrastructure, scarce and skewed data, and high linguistic diversity—complicate AI development, deployment and evaluation in African contexts.
Desk review of infrastructure and data availability reports and scholarly literature demonstrating gaps and their effects; no new measurement in this paper.
high negative Towards Responsible Artificial Intelligence Adoption: Emergi... internet/digital infrastructure coverage, availability and representativeness of...
Privacy concerns, regulatory/compliance issues, biased or opaque models, and the need for change management and HR analytics capability building are significant risks constraining adoption.
Recurring risks and constraints reported by multiple included studies; summarized in the review's 'risks and constraints' theme.
high negative Data-Driven Strategies in Human Resource Management: The Rol... adoption constraints, incidence of privacy/regulatory/bias issues
Implementation of data-driven HRM faces recurring challenges: data quality, privacy and ethics, algorithmic bias, and deficiencies in skills and organizational readiness.
Commonly reported implementation issues across the 47 reviewed studies; extracted as a central theme in the review's thematic analysis.
high negative Data-Driven Strategies in Human Resource Management: The Rol... implementation success/failure factors, incidence of data/ethical issues
Rapid skill obsolescence in AI necessitates frequent curriculum updates and responsive governance.
Identified as a risk: the paper notes AI skill change rates and recommends frequent updates and governance mechanisms. This aligns with general domain knowledge; the paper does not provide empirical measurement of obsolescence rates.
high negative Curriculum engineering: organisation, orientation, and manag... update frequency, lag between skill demand change and curriculum update
The complexity of aligning multiple standards is a disadvantage and an implementation risk.
Stated explicitly in Disadvantages/Risks: complexity of aligning multiple standards is listed. This is a reasoned observation in the paper rather than empirically demonstrated.
high negative Curriculum engineering: organisation, orientation, and manag... complexity measures (number of standards to reconcile, conflicts identified), ti...
Implementing this framework requires significant resources and continuous updating.
Stated explicitly under Main Finding and Disadvantages/Risks; paper lists cost/time metrics to track (cost-per-curriculum, time-to-update) and highlights resource intensity. Support is descriptive/analytic rather than empirical.
high negative Curriculum engineering: organisation, orientation, and manag... resource intensity (cost-per-curriculum), time-to-update, maintenance burden
Constraints and risks include model risk (overfitting, drift), algorithmic bias, privacy and data-sharing limits, legacy ERP complexity, interoperability challenges, and limited organizational readiness and skills.
Reviewed literature (empirical studies, technical evaluations, and standards) documenting technical and organizational failures, risk incidents, and common barriers to implementation.
high negative Integrating Artificial Intelligence and Enterprise Resource ... risk-related outcomes (e.g., model degradation rates, incidence of biased decisi...
Algorithmic bias, unequal digital financial literacy, caregiving time constraints, and limited access to personalized solutions can sustain or reproduce gender investment gaps if not addressed.
Synthesis of literature on barriers to financial inclusion and AI fairness concerns, plus platform report observations (review of empirical and conceptual studies; not a single empirical test).
high negative Women's Investment Behaviour and Technology: Exploring the I... gender investment gap, differential product offerings, access metrics
Women exhibit statistically greater risk aversion than men in some settings.
Summary of empirical survey and experimental studies on gender differences in risk attitudes discussed in the review (multiple cross-sectional and lab/field experiments referenced).
high negative Women's Investment Behaviour and Technology: Exploring the I... measured risk aversion / willingness to take financial risk
The digital divide (lack of reliable electricity and connectivity) constrains adoption of MIS and AI, creating geographic and regional inequities in who benefits from the framework.
Infrastructure constraint argument presented in the paper; no quantified coverage maps or population-level access statistics included.
high negative Establishes a technical and academic bridge between the educ... coverage of system access, differential adoption rates by region, inequality in ...
AI-driven equivalency systems carry risks including algorithmic bias, opaque decisions without explainability, and potential reinforcement of inequities when training data under-represents some regions/institutions.
Risk assessment drawing on established AI ethics literature; no empirical bias audit from the proposed system is provided.
high negative Establishes a technical and academic bridge between the educ... measures of algorithmic bias (disparate impact), explainability scores, unequal ...
The major disadvantage of an MIS is dependency on reliable electricity and internet, creating systemic vulnerability due to the digital divide.
Paper notes infrastructure dependency as a constraint; assertion grounded in common infrastructural realities but no measured connectivity or outage statistics from DRC/SA are provided.
high negative Establishes a technical and academic bridge between the educ... geographic/regional access to equivalency services and system uptime availabilit...
Key audit/control weaknesses with respect to prompt fraud include lack of provenance for inputs/prompts and model outputs, inadequate access controls, and missing or ineffective monitoring and anomaly detection for AI outputs.
Qualitative control analysis and adaptation of established auditing principles to GenAI workflows; recommendations based on threat modeling rather than field data.
high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... presence or absence of specific control capabilities (provenance, access control...
GenAI outputs can be tailored to mimic corporate styles, templates, and evidence artifacts (e.g., summaries, memos, audit trails), which increases their credibility to auditors, managers, or customers.
Illustrative examples and scenario mapping demonstrating templated output mimicry; no controlled experiments or corpus analysis provided.
high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... perceived credibility of machine-generated artifacts when formatted to corporate...
Large language models produce fluent, human-like outputs that can mask falsehoods (hallucinations) as facts, making prompt fraud effective.
Well-established LLM behavior cited conceptually and supported in the paper by illustrative examples; no new empirical measurement in this article.
high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... propensity of LLM outputs to present fabricated information as authoritative
Prompt fraud does not require system intrusion, credential theft, or software exploits; it operates at the reasoning/language layer of large language models and therefore can be executed without technical breaches.
Logical/technical argumentation built from properties of LLMs and illustrative hypothetical attack chains; threat modeling rather than empirical attack logs.
high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... necessity of technical breach for successful fraud (binary: required/not require...
Prompt fraud is a new, distinct fraud modality in which adversaries intentionally craft natural-language prompts (or manipulate prompt inputs) to steer generative AI outputs into producing misleading, fabricated, or compliance-evading artifacts that bypass traditional internal controls.
Conceptual definition presented by the paper based on threat taxonomy and scenario mapping; illustrated with case-style examples. No empirical incident dataset or prevalence statistics provided.
high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... existence/recognition of a distinct fraud modality ('prompt fraud')
Potential limitations include limited methodological detail on case selection and measurement, possible selection and reporting bias from practitioner-sourced examples, and variable generalizability to small firms or highly regulated industries.
Authors' self-reported limitations in the Methods/Limitations section (qualitative assessment).
high negative Governed Hyperautomation for CRM and ERP: A Reference Patter... methodological completeness and generalizability (qualitative limitation)