Evidence (14055 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
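The direction counts in the matrix can be converted into shares, which makes outcomes of very different sizes easier to compare. The following is a minimal sketch using three rows copied from the table above; the function name `direction_shares` is illustrative and not part of the source. Shares are computed over the sum of the four listed direction columns, because for some rows that sum differs from the listed Total.

```python
# Rows copied from the evidence matrix: (positive, negative, mixed, null).
ROWS = {
    "Job Displacement": (12, 80, 20, 1),
    "Error Rate": (69, 92, 10, 2),
    "Task Completion Time": (173, 31, 8, 12),
}

def direction_shares(counts):
    """Return each direction's share of the four listed counts."""
    total = sum(counts)
    names = ("positive", "negative", "mixed", "null")
    return {name: value / total for name, value in zip(names, counts)}

for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    print(f"{outcome}: " + ", ".join(f"{k} {v:.0%}" for k, v in shares.items()))
```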
Hallucinations and factual errors from generative AI can damage service quality and customer trust.
Documented failure cases and empirical reports from the literature aggregated by the review; no novel incident count or experimental data in this paper.
Generative AI is susceptible to social and representational biases and to factual errors or hallucinations; it lacks tacit, contextual domain expertise.
Documented examples in the literature of biased outputs and hallucinations; controlled evaluations and audits of model outputs; qualitative reports highlighting lack of tacit knowledge in domain-specific tasks.
The quality of AI-generated outputs is highly variable; models frequently produce mediocre but plausible-sounding content that requires human filtering.
Multiple user studies and qualitative reports documenting variability in output quality and the need for human curation; outcome measures include error rates, user-rated quality, and time spent vetting.
Factual errors and 'hallucinations' create misinformation risks and can produce costly service failures.
Model evaluation studies, incident case reports from deployments, and academic/industry analyses documenting hallucination rates and concrete failure examples.
The study population was restricted to CHI conference papers that had publicly shared study data and analysis code. This self-selected subset introduces a bias that may lead to overestimating reproducibility rates for the broader set of CHI papers.
Authors' stated sampling strategy and limitations noted in the paper (sample restricted to artifact-sharing papers and potential overestimation of reproducibility).
Ethical, privacy, and legal restrictions sometimes limit the ability to share data and thereby hamper reproducibility.
Authors' observations from reproduction work and survey/interview responses indicating that some datasets could not be shared for legal/ethical reasons.
Resource, compute, privacy, and deployment costs associated with CRAEA were not fully quantified in the paper.
Authors note that resource, compute, privacy, and deployment costs were not fully quantified; no cost analyses or benchmarks provided in the summary.
Evaluation was performed in an artificial/simulated home environment; therefore real-world transfer, robustness to noisy perception, and hardware constraints remain open questions.
Authors explicitly state evaluations occurred in a simulated home environment and acknowledge limits on real-world transfer and robustness. This is a stated limitation rather than an experimental finding.
High linguistic diversity in Africa makes building and evaluating multilingual language technologies more difficult and is a barrier to inclusive AI.
Synthesis of technical literature on NLP and multilingual model development and policy/NGO reports highlighting missing language resources; no original model evaluation reported.
Structural constraints—limited digital infrastructure, scarce and skewed data, and high linguistic diversity—complicate AI development, deployment and evaluation in African contexts.
Desk review of infrastructure and data availability reports and scholarly literature demonstrating gaps and their effects; no new measurement in this paper.
Privacy concerns, regulatory/compliance issues, biased or opaque models, and the need for change management and HR analytics capability building are significant risks constraining adoption.
Recurring risks and constraints reported by multiple included studies; summarized in the review's 'risks and constraints' theme.
Implementation of data-driven HRM faces recurring challenges: data quality, privacy and ethics, algorithmic bias, and deficiencies in skills and organizational readiness.
Commonly reported implementation issues across the 47 reviewed studies; extracted as a central theme in the review's thematic analysis.
Rapid skill obsolescence in AI necessitates frequent curriculum updates and responsive governance.
Identified as a risk: the paper notes AI skill change rates and recommends frequent updates and governance mechanisms. This aligns with general domain knowledge; the paper does not provide empirical measurement of obsolescence rates.
Aligning multiple standards is complex, posing a disadvantage and implementation risk.
Stated explicitly in Disadvantages/Risks: complexity of aligning multiple standards is listed. This is a reasoned observation in the paper rather than empirically demonstrated.
Implementing this framework requires significant resources and continuous updating.
Stated explicitly under Main Finding and Disadvantages/Risks; paper lists cost/time metrics to track (cost-per-curriculum, time-to-update) and highlights resource intensity. Support is descriptive/analytic rather than empirical.
Constraints and risks include model risk (overfitting, drift), algorithmic bias, privacy and data-sharing limits, legacy ERP complexity, interoperability challenges, and limited organizational readiness and skills.
Reviewed literature (empirical studies, technical evaluations, and standards) documenting technical and organizational failures, risk incidents, and common barriers to implementation.
Algorithmic bias, unequal digital financial literacy, caregiving time constraints, and limited access to personalized solutions can sustain or reproduce gender investment gaps if not addressed.
Synthesis of literature on barriers to financial inclusion and AI fairness concerns, plus platform report observations (review of empirical and conceptual studies; not a single empirical test).
In some settings, women exhibit statistically greater risk aversion than men.
Summary of empirical survey and experimental studies on gender differences in risk attitudes discussed in the review (multiple cross‑sectional and lab/field experiments referenced).
The digital divide (lack of reliable electricity and connectivity) constrains adoption of MIS and AI, creating geographic and regional inequities in who benefits from the framework.
Infrastructure constraint argument presented in the paper; no quantified coverage maps or population-level access statistics included.
AI-driven equivalency systems carry risks including algorithmic bias, opaque decisions without explainability, and potential reinforcement of inequities when training data under-represents some regions/institutions.
Risk assessment drawing on established AI ethics literature; no empirical bias audit from the proposed system is provided.
The major disadvantage of an MIS is its dependency on reliable electricity and internet access, which creates systemic vulnerability due to the digital divide.
Paper notes infrastructure dependency as a constraint; assertion grounded in common infrastructural realities but no measured connectivity or outage statistics from DRC/SA are provided.
Key audit/control weaknesses with respect to prompt fraud include lack of provenance for inputs/prompts and model outputs, inadequate access controls, and missing or ineffective monitoring and anomaly detection for AI outputs.
Qualitative control analysis and adaptation of established auditing principles to GenAI workflows; recommendations based on threat modeling rather than field data.
GenAI outputs can be tailored to mimic corporate styles, templates, and evidence artifacts (e.g., summaries, memos, audit trails), which increases their credibility to auditors, managers, or customers.
Illustrative examples and scenario mapping demonstrating templated output mimicry; no controlled experiments or corpus analysis provided.
Large language models produce fluent, human-like outputs that can mask falsehoods (hallucinations) as facts, making prompt fraud effective.
Well-established LLM behavior cited conceptually and supported in the paper by illustrative examples; no new empirical measurement in this article.
Prompt fraud does not require system intrusion, credential theft, or software exploits; it operates at the reasoning/language layer of large language models and therefore can be executed without technical breaches.
Logical/technical argumentation built from properties of LLMs and illustrative hypothetical attack chains; threat modeling rather than empirical attack logs.
Prompt fraud is a new, distinct fraud modality in which adversaries intentionally craft natural-language prompts (or manipulate prompt inputs) to steer generative AI outputs into producing misleading, fabricated, or compliance-evading artifacts that bypass traditional internal controls.
Conceptual definition presented by the paper based on threat taxonomy and scenario mapping; illustrated with case-style examples. No empirical incident dataset or prevalence statistics provided.
Potential limitations include limited methodological detail on case selection and measurement, possible selection and reporting bias from practitioner-sourced examples, and variable generalizability to small firms or highly regulated industries.
Authors' self-reported limitations in the Methods/Limitations section (qualitative assessment).
Prompt fraud exploits the natural-language interface of large language models (LLMs) to produce outputs that appear authoritative (reports, audit trails, explanations) without system intrusion, credential theft, or software exploitation.
Definition and threat-model description using conceptual examples and case vignettes; literature/regulatory review to position the threat relative to traditional fraud vectors.
Data privacy and cross-border compliance issues arise from using cloud and SECaaS, complicating legal compliance for firms.
Regulatory analyses and compliance reports; documented examples in case studies and industry guidance on cross-border data flows.
The cloud shared responsibility model creates potential ambiguities in liability between providers and customers.
Regulatory guidance, legal analyses, and documented post-incident case studies showing confusion over responsibilities.
China manages the openness–security trade-off through a centralized, developmentalist, techno‑sovereignty approach that privileges coordinated state direction and control.
Qualitative content analysis of national‑level policy texts: 18 Chinese policy documents coded across four analytical dimensions (coordination objectives, institutional actors, governance mechanisms, stakeholder legitimacy).
Antibiotic use in humans and animals, along with environmental antibiotic residues, generates converging selection pressures that drive AMR relevant to children.
Well-established ecological and microbiological literature summarized in the review showing cross-sector selection pressures; narrative integration rather than new empirical analysis.
Child behaviors (hand-to-mouth activity, play, outdoor exposure) increase contact with environmental and animal reservoirs and therefore exposure risk.
Behavioral and exposure studies synthesized narratively; observational evidence from exposure assessments and pediatric environmental health studies cited in review (no meta-analysis).
Developmental windows imply early-life exposures can have long-term consequences for health and human capital.
Developmental and epidemiologic literature integrated in the review; narrative citations of studies linking early exposures to later health and cognitive outcomes (no single longitudinal dataset presented).
Physiological and immunological immaturity (including neonatal risks) increases children's susceptibility to infectious disease and related harms.
Established biological and clinical literature synthesized in the review; references to neonatal clinical risks and immunological immaturity across pediatric literature (no pooled effect sizes reported).
Automation and LLM-driven orchestration add opacity; errors in instrument control or analysis could propagate quickly, raising liability, insurance, and reproducibility concerns.
Analytical discussion of risks and analogies to automated systems in other domains; no incident-level empirical data from microscopy given.
Ethical and governance issues related to LLM-driven microscopy include accountability, reproducibility, access inequities, data privacy, and concentration of capabilities in large providers.
Policy-oriented synthesis and analogies to governance challenges observed in other AI deployments; no new empirical measurement in microscopy contexts.
Integration of LLMs with microscopes faces challenges including safety and reliability of instrument control, verification of scientific outputs, data provenance, and alignment with experimental constraints.
Analytical discussion based on known reliability and safety issues in automated systems and AI tool use; no empirical incident data from microscopy provided.
There is substantial uncertainty in economic forecasts due to possible scale-up failures, regulatory constraints, feedstock price volatility, and path‑dependent lock‑in effects.
Synthesis of technical failure modes, regulatory uncertainty, and sensitivity analyses reported in TEA/LCA literature and economic modeling sections of the review.
Regulatory and biosafety concerns (including environmental release risks and dual‑use issues) increase fixed costs and create entry barriers that shape industry structure and diffusion.
Policy and governance literature reviewed alongside technical case studies; citations of regulatory requirements, biosafety frameworks, and examples of compliance costs affecting project viability.
Engineering and economic challenges—scale‑up hurdles, process robustness, feedstock cost, and downstream purification—limit industrial deployment of many bio-based processes.
Case study TEA/LCA summaries and process reports in the review highlighting scale-up failures or increased costs at larger scales, purification complexity for low‑concentration products, and sensitivity to feedstock prices.
Technical biological limitations—metabolic burden, pathway crosstalk, byproduct formation, and genetic instability—remain major constraints on strain performance and scalability.
Multiple experimental reports and method papers cited in the review documenting decreased growth/productivity due to engineered pathway burden, unintended interactions between pathways, accumulation of byproducts, and genetic mutations during production runs.
The described pipeline is cross-sectional as presented and should be extended to dynamic models (temporal embeddings, change-point detection) for trend or causal analyses.
Method description in summary indicates cross-sectional pipeline; recommendation to extend for temporal/dynamic modeling when analyzing trends or causal effects.
LLMs and corpora may reflect disciplinary, geographic, or language biases; analyses should adjust or stratify accordingly.
Caveat explicitly stated in summary noting potential biases in LLMs and corpora; recommendation to adjust/stratify analyses.
Cluster reliability should be validated (e.g., bootstrap, perturbations) and automatic labels complemented with expert human validation for critical analyses.
Caveat and recommended validation steps provided in summary; suggests bootstrap/perturbation and manual validation as best practices. No empirical stability metrics provided in summary.
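The bootstrap check recommended above could look roughly like the following. This is a sketch under stated assumptions, not the summarized paper's procedure: it uses a tiny 1-D k-means as a stand-in for whatever clustering the pipeline applies, and it scores each original cluster by its best Jaccard overlap with the clusters found in each resample. The names `kmeans_1d`, `jaccard`, and `bootstrap_stability`, and the toy values, are all hypothetical.

```python
import random

def kmeans_1d(values, k=2, iters=50):
    """Tiny 1-D k-means (k >= 2) with evenly spaced initial centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members (if it has any).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def bootstrap_stability(values, k=2, n_boot=200, seed=0):
    """Mean best-match Jaccard score per original cluster across resamples."""
    rng = random.Random(seed)
    n = len(values)
    base = kmeans_1d(values, k)
    base_clusters = [{i for i in range(n) if base[i] == c} for c in range(k)]
    scores = [0.0] * k
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boot_labels = kmeans_1d([values[i] for i in idx], k)
        drawn = set(idx)
        boot_clusters = [
            {i for i, lab in zip(idx, boot_labels) if lab == c} for c in range(k)
        ]
        for c in range(k):
            # Compare only on points that actually appear in this resample.
            restricted = base_clusters[c] & drawn
            scores[c] += max(jaccard(restricted, b) for b in boot_clusters)
    return [s / n_boot for s in scores]

# Toy, made-up values forming two well-separated groups.
toy = [0.9, 1.0, 1.1, 1.2, 4.9, 5.0, 5.1, 5.2]
print(bootstrap_stability(toy))
```

Scores near 1 suggest a cluster reappears under resampling; low scores flag clusters whose automatic labels most need the expert review mentioned above.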
Results are sensitive to model and prompt choice; researchers should perform robustness checks across LLMs, soft prompts, and embedding models.
Caveat explicitly stated in the paper summary noting model and prompt sensitivity; recommended validation steps include robustness checks across models and prompts.
Empirical validation is concentrated on the Agora-12 corpus; generalizability to other architectures, scales, or deployment contexts is unproven and identified as a limitation.
Authors' own limitations section and scope of empirical tests (analyses limited to Agora-12 and four clinical cases).
Higher complaint volume is significantly associated with near-term stock price declines.
Fixed-effects panel path models estimated on monthly data for 261 financial firms (2018–2023) report statistically significant negative associations between firm–month complaint volume and subsequent abnormal returns.
Consumer complaints—measured by monthly volume, topic composition, and VADER sentiment of complaint narratives—contain behavioral signals that predict short-term abnormal stock returns in U.S. financial firms.
CFPB complaint records matched to 261 publicly traded U.S. financial firms (monthly observations, 2018–2023); analyses use fixed-effects panel path models to link firm–month complaint features (volume, LDA topic prevalences, aggregated VADER sentiment) to firm-level abnormal returns; complementary machine-learning models evaluate out-of-sample predictive performance.
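The fixed-effects idea behind these estimates can be illustrated with the within transformation: subtract each firm's own mean from its observations, then fit a slope on the demeaned data. This is a simplified sketch only. The study reports panel path models with several complaint features, whereas the code below handles a single regressor, and the function name `within_slope` and all numbers are made up for illustration.

```python
from collections import defaultdict

def within_slope(rows):
    """rows: iterable of (firm_id, x, y).

    Returns the OLS slope of y on x after removing firm-specific means,
    i.e. the one-regressor fixed-effects (within) estimator.
    """
    by_firm = defaultdict(list)
    for firm, x, y in rows:
        by_firm[firm].append((x, y))
    sxy = sxx = 0.0
    for obs in by_firm.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx

# Made-up firm-month data: x = complaint volume, y = abnormal return.
# Each firm has its own baseline return but shares the slope -0.002.
rows = [("A", x, 0.05 - 0.002 * x) for x in (10, 20, 30)]
rows += [("B", x, -0.01 - 0.002 * x) for x in (100, 120, 140)]

print(within_slope(rows))
```

Demeaning by firm removes time-invariant differences between firms, so the slope reflects how a firm's returns move with changes in its own complaint volume rather than differences in levels across firms.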
Platforms benefit from data-driven scalability and network effects, creating barriers to entry and affecting consumer surplus, innovation incentives, and pricing.
Economic theory of platforms and empirical cases from platform markets synthesized in the literature review; argument supported by secondary empirical studies cited.