Evidence (4175 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
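As a reading aid, the sketch below computes directional shares for three rows copied from the matrix. It also shows that the four direction columns do not always add up to the Total column. The page does not explain the difference, so the `unclassified` label is an assumption made here for illustration, not a term from the source.

```python
# Counts copied from three rows of the matrix:
# (positive, negative, mixed, null, total)
rows = {
    "Firm Productivity": (435, 57, 88, 20, 606),
    "Job Displacement": (12, 80, 20, 1, 113),
    "Other": (758, 199, 100, 900, 2007),
}


def summarize(pos, neg, mixed, null, total):
    """Return directional shares and the gap between Total and the four columns."""
    directional = pos + neg + mixed + null
    return {
        "positive_share": round(pos / total, 3),
        "negative_share": round(neg / total, 3),
        # "unclassified" is a hypothetical label: the source does not say
        # why Total can exceed the sum of the direction columns.
        "unclassified": total - directional,
    }


summary = {name: summarize(*counts) for name, counts in rows.items()}

for name, stats in summary.items():
    print(name, stats)
```

Shares are taken against the reported Total rather than the column sum, so they need not add up to 1 for rows with a gap.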
Org Design
Human–AI chats contain fewer emotional and social messages than human–human chats.
Content coding of chat transcripts compared the frequencies of emotional/social message categories across human–AI (n = 126) and human–human (n = 108) conditions and found lower counts/proportions of social/emotional content in human–AI dialogs.
Public‑interest concerns (bias, misuse, systemic risk) may be harder to mitigate via simple transparency rules; policies should emphasize outcome‑based regulations, mandatory behavioral testing, and marketplace disclosure obligations for stressed scenarios.
Policy implication derived from the non‑rule‑encodability thesis; no empirical policy evaluation included.
Standard contracts and regulatory audits that rely on inspection of rule sets or source code will be insufficient to assess model behavior or risk; regulators and buyers must rely more on behavior‑based testing, standards, and outcome measures.
Policy and regulatory argument derived from the main theorem about non‑rule‑encodability; no empirical regulatory studies presented.
Full interpretability via rule extraction may be impossible for the most valuable parts of LLM competence, limiting the utility of some transparency approaches for safety and auditing.
Argumentative consequence of the main theoretical claim and structural mismatch; supported by historical limitations of rule‑based systems; no empirical tests reported.
There is a structural mismatch between explicit human cognitive tools (rules, checklists) and the pattern‑rich, high‑dimensional competence encoded in LLMs.
Theoretical/structural argument about distributed statistical representations in LLMs versus discrete rules; no experimental quantification provided.
Historical expert systems failed to generalize or scale to complex, ambiguous tasks, contrasting with LLMs' broader empirical successes.
Historical case analysis and literature review-style discussion of expert systems versus contemporary LLM performance; no new quantitative historical dataset provided.
High governance costs in regulated/high-risk domains can slow adoption of agentic systems, concentrating deployment in less regulated uses or among large firms that can afford governance infrastructure.
Economic reasoning about fixed and marginal governance costs and firm-level adoption decisions; no empirical adoption data presented.
Path-dependent behavior increases the complexity of principal–agent contracting and moral hazard between platforms, enterprise customers, and downstream users, requiring richer contract terms (acceptable paths, logging, audit rights).
Economic theory reasoning and applied contract/design implications discussed; no empirical contract-study data.
Path-dependent policies complicate ex post auditing and simple rule-based regulation; regulators may prefer standards requiring runtime evaluation and logging to be enforceable in practice.
Conceptual argument about limits of auditing when important state is ephemeral and about how runtime logging enables ex post review; illustrative policy examples mapping to runtime requirements.
Current models appear to internalize preferences as persistent, high‑priority rules rather than conditional behavioral signals contingent on conversational norms and context.
Behavioral patterns observed across BenchPreS scenarios (preference application persisting in inappropriate contexts) and ablation results; interpretive claim based on empirical behavior rather than direct model internals inspection.
BenchPreS detects a pervasive context‑sensitivity failure: models often treat stored preferences as globally enforceable rules rather than conditional, context‑dependent signals.
Pattern of results across the benchmark showing high MR alongside cases where preference application should have been suppressed; qualitative interpretation of model behavior across varied interaction partners and normative contexts in the dataset.
Modern frontier LLMs frequently misapply stored user preferences in contexts where social or institutional norms require suppression (third‑party communication).
Empirical evaluation using the BenchPreS benchmark: models were provided stored preferences and asked to generate responses across contexts requiring either application or suppression; Misapplication Rate (MR) computed as fraction of instances where preferences were applied despite required suppression. Multiple state‑of‑the‑art models were tested (described generically as “frontier models”) across the scenario set.
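The Misapplication Rate described above can be sketched as a simple ratio. This is a minimal illustration, not the benchmark's implementation. It assumes the denominator is the set of instances where suppression was required, which the summary does not state explicitly. The `Trial` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical record of one benchmark instance.
    should_suppress: bool  # context requires that the stored preference NOT be applied
    applied: bool          # the model applied the preference in its response


def misapplication_rate(trials):
    """Fraction of suppression-required instances where the preference was applied anyway."""
    suppress_cases = [t for t in trials if t.should_suppress]
    if not suppress_cases:
        raise ValueError("no suppression-required instances to score")
    return sum(t.applied for t in suppress_cases) / len(suppress_cases)


# Made-up data: four suppression-required instances, three of them misapplied.
# The last trial is an application-appropriate context and is ignored by MR.
trials = [
    Trial(should_suppress=True, applied=True),
    Trial(should_suppress=True, applied=True),
    Trial(should_suppress=True, applied=False),
    Trial(should_suppress=True, applied=True),
    Trial(should_suppress=False, applied=True),
]
mr = misapplication_rate(trials)
print(mr)
```

Instances where applying the preference is appropriate are left out of the denominator, so a model cannot lower its MR simply by applying preferences correctly elsewhere.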
If left unchecked, managerial short-termism combined with AI adoption can create a feedback loop where firms cut labor to boost short-term profits, undermining aggregate demand and eroding the market that sustains those profits.
Conceptual macroeconomic and organizational synthesis drawing on theory and historical patterns; no new empirical time-series demonstrating this loop in current AI-driven layoffs.
Work-time reduction policies carry distributional and implementation risks (heterogeneous effects by occupation, firm size, capital intensity; risk of hidden wage cuts) that require careful compensation rules and monitoring.
Theoretical reasoning and references to heterogeneous outcomes in prior work-hour studies; no new empirical quantification of heterogeneity in AI-era implementations.
Lower household demand resulting from payroll cuts can precipitate further cost-cutting and automation, creating a self-reinforcing feedback loop that risks persistent demand shortfalls and higher structural unemployment.
Theoretical models of demand-driven adjustment and cited historical patterns; conceptual argument rather than empirical causal identification in contemporary AI contexts.
AI-justified layoffs are driven more by managerial short-termism and misaligned executive incentives than by immediate technological necessity.
Interdisciplinary conceptual synthesis drawing on labor-economics theory, organizational behavior literature linking executive compensation/short-termism to layoffs, and selected prior empirical studies; no new firm-level causal identification or large-scale dataset provided.
Passive monitoring and predictive models are insufficient for governing the complex dynamics of a tech-driven economy.
Conceptual critique based on economic cybernetics literature and the author's expert assessment; no empirical test comparing governance regimes is provided.
Digitalization is deepening digital inequality (unequal access to digital tools, skills, and benefits) across social groups and regions.
Qualitative analysis and expert assessment; the paper calls for new metrics but does not present systematic empirical measures of inequality.
Digital transformation can generate technological unemployment if not managed with appropriate retraining and social protection measures.
Expert assessment and literature-informed argumentation in the paper; no empirical longitudinal analysis isolating technology-driven job losses presented.
Forced or poorly regulated digitalization risks exacerbating social stratification.
Conceptual argument supported by qualitative analysis of policy documents and expert assessment; no empirical causal estimates provided.
Industry-level AI substitution risk moderates the AI–ECSR relationship: higher substitution risk sharpens the inverted U and shifts its peak left (firms in high-substitution-risk industries reach the turning point earlier and suffer stronger negative effects at high AI adoption).
Interaction terms between AI (and AI^2) and an industry AI substitution-risk measure in panel regressions show heterogeneity consistent with a leftward shift and steeper decline in high-risk industries; results reported across the 2,575-firm panel with controls and robustness checks.
Beyond a certain threshold of AI embedding, deeper AI adoption shifts managerial attention toward AI systems and away from employees, reducing ECSR (AI attention shift mechanism).
Negative AI^2 coefficient in quadratic panel regressions indicates declining ECSR at high AI adoption; supported by theoretical dual-agent model arguing attention shift; robustness checks reported. (Sample: same 2,575 firms, 2013–2023.)
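The inverted U and its moderation by substitution risk can be illustrated with the turning point of a quadratic. The sketch assumes a specification of the form ECSR = b0 + b1·AI + b2·AI² + b3·AI·R + b4·AI²·R + ..., where R is the industry substitution-risk measure. The paper's exact specification is not given in the summary, and all coefficient values below are invented purely for illustration.

```python
def turning_point(b1, b2, b3=0.0, b4=0.0, risk=0.0):
    """AI level at which predicted ECSR peaks for a given substitution-risk value.

    The effective linear term is (b1 + b3 * risk) and the effective
    quadratic term is (b2 + b4 * risk); the peak is -linear / (2 * quadratic).
    """
    linear = b1 + b3 * risk
    quadratic = b2 + b4 * risk
    if quadratic >= 0:
        raise ValueError("an inverted U requires a negative quadratic term")
    return -linear / (2 * quadratic)


# Hypothetical coefficients, not estimates from the paper.
coefs = dict(b1=0.8, b2=-0.5, b3=-0.1, b4=-0.3)

low_risk_peak = turning_point(**coefs, risk=0.0)
high_risk_peak = turning_point(**coefs, risk=1.0)
print(low_risk_peak, high_risk_peak)
```

With these made-up values, a negative b4 makes the curve steeper at high risk, and the peak moves to a lower AI level. That is the pattern the entries describe as a sharper inverted U with a leftward-shifted peak.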
Trust, verification costs, and legal/governance requirements remain consequential even with AI mediation and may limit or shape adoption.
Theoretical discussion of governance and verification costs; no empirical measurement of these costs in adopter firms provided.
AI-mediated interpretation and action carry risks related to quality, bias, and misalignment, which can produce miscommunication or incorrect automated actions.
Paper's discussion section raising caveats; conceptual risk analysis without empirical incident data; references to general concerns in AI safety literature (no new empirical evidence provided).
Organisations struggle to optimise human–AI collaboration in knowledge‑intensive decision‑making.
Statement based on a systematic synthesis of human–AI interaction and knowledge management literature presented in the paper; no primary empirical sample or dataset reported in the abstract.
Despite increased deployment, the field lacks a principled framework for answering when a team is helpful, how many agents to use, how team structure impacts performance, and whether a team is better than a single agent.
Authors' assessment of the literature and gaps; presented as a motivation for their work (no empirical count of missing frameworks given in excerpt).
Tasks that workers associate with a sense of agency or happiness may be disproportionately exposed to AI.
Empirical finding based on the paper's worker and developer surveys on 171 tasks, with LM scaling to 10,131 tasks; phrased cautiously in the paper as 'may be' disproportionately exposed.
There is a growing tension between relatively rigid education and training systems and the rapidly changing skill requirements of digitally driven labor markets.
Argument motivated and supported by comparative assessment of international practices and systemic analysis; descriptive/comparative evidence rather than quantified empirical testing.
Information saturation from AI output contributes to cognitive overload among employees.
Grounded in the paper's application of cognitive load theory to findings from surveys and organizational research; the excerpt gives no direct measures of information volume or its direct cognitive effects.
Extensive AI use correlates with measurable productivity losses.
Paper states this correlation is observed in organizational research and large-scale surveys; the excerpt lacks details on productivity measures, sample sizes, or statistical controls.
Extensive AI use correlates with increased decision fatigue.
Reported correlation based on the same cited large-scale surveys and organizational research; no methodological details or effect sizes provided in the excerpt.
Extensive AI use correlates with increased turnover intention among employees.
Paper reports correlations observed in recent large-scale surveys and organizational research; the excerpt does not provide correlation coefficients, sample sizes, or control variables.
AI-augmented work environments create cognitive overload through information saturation, relentless task-switching, and the demanding oversight of multiple AI agents.
Synthesis in the paper drawing on research on human-AI collaboration and cognitive load theory and citing organizational research; specific empirical methods or sample sizes not provided in the excerpt.
Employees using AI extensively report significant mental fatigue, dubbed 'AI brain fry.'
Stated in the paper as derived from recent large-scale surveys and organizational research; no specific sample size, survey instrument, or statistical details provided in the text excerpt.
The SCF is extended to a second-order layer (SCF-E) that incorporates a technocultural imagination deficit and symbolic governance, explaining why AI remains in pilots and does not become an organizational capability.
Conceptual (second-order) extension reported in the article; methodologically supported by the QUAN→QUAL combination, including SCF-oriented ethnography (empirical details are in the body of the article, not in the abstract).
The technology adoption literature (TAM, UTAUT, Diffusion of Innovations) tends to treat resistance as a generic behavioral variable or a 'training' deficiency, neglecting symbolic dimensions (rites, identities, and power), cognitive threat mechanisms (loss aversion, overload, and heuristics), and their economic effects.
Literature review and theoretical positioning stated in the article, comparing established models with the proposed perspective; no meta-analysis or empirical count is indicated in the abstract.
Psychoanthropological Friction (SCF) is proposed and detailed as a measurable coefficient of the cultural cost and cognitive resistance that reduces the capacity of small and medium-sized enterprises (SMEs) to turn Artificial Intelligence (AI) initiatives into value generation at scale.
Theoretical proposition and operationalization presented in the article; the methodological design is described as QUAN→QUAL, including construction of a psychometric scale and organizational ethnography. The abstract does not specify a sample size for validation.
Over-reliance on data-driven insights without adequate human oversight can worsen market uncertainty.
Reported in the study's qualitative case studies and interpretive analysis as a potential negative consequence of improper AI/Big Data use (no quantified examples provided in the summary).
Algorithmic bias is a potential pitfall of using AI and Big Data that can exacerbate market uncertainty.
Identified as a risk in the paper's qualitative analysis and discussion of pitfalls (no incident counts or empirical quantification provided in the summary).
External pressures (e.g., pandemics, extreme weather, geopolitical conflicts) disproportionately affect peripheral suppliers in the construction supply chain network.
Mapping of challenge categories to network positions in the study showed external pressures concentrating at peripheral supplier nodes; based on interview reports and network coding (quantitative support not detailed in abstract).
Relationship and contract issues accumulate at high-centrality brokers, which exhibit a reported degree centrality of 0.818.
Result reported in the paper linking the thematic category (relationship/contract issues) to network nodes identified as high-centrality brokers; a numeric degree centrality value (0.818) is reported for these brokers. Underlying network constructed from thematic coding of interviews; sample size not provided in abstract.
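For readers unfamiliar with the metric, the sketch below computes normalized degree centrality, assuming the common definition (a node's number of ties divided by n − 1). The abstract does not state the paper's exact formula or the size of its network. The 12-node network here is hypothetical and was chosen only because 9 ties out of 11 possible rounds to 0.818.

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected network given as an edge list."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n < 2:
        raise ValueError("need at least two connected nodes")
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical network: broker "B" tied to nine suppliers S1..S9,
# plus two peripheral nodes P1 and P2 reachable only through a supplier.
edges = [("B", f"S{i}") for i in range(1, 10)] + [("S1", "P1"), ("S2", "P2")]

centrality = degree_centrality(edges)
print(round(centrality["B"], 3), round(centrality["P1"], 3))
```

In this toy network the broker is tied to most other nodes, while each peripheral node has a single tie. This mirrors the contrast the study draws between high-centrality brokers and peripheral suppliers.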
Six main challenge categories (comprising 16 open codes) concentrate systematically at specific network positions.
Results reported: thematic grouping produced six challenge categories and 16 open codes, and these were mapped to positions in the network showing systematic concentration; underlying data derive from coded interviews and network mapping (sample size not given in abstract).
Short-run labor market disruptions raise concerns regarding wage inequality and workforce adaptation.
Claims based on observed short-run labor market adjustments in publicly available data and theoretical implications for inequality and adaptation; specific empirical measures, time horizons, and sample sizes are not reported in the excerpt.
AI simultaneously increases adjustment pressures for routine tasks.
Argument and cited observations from publicly available labor market data indicating displacement or adjustment in routine-task-intensive occupations (no specific empirical estimates or samples provided).
The Cautious are held in organizational stasis: without early adopter examples they don't enter the virtuous adoption cycle, never accumulate the usage frequency that drives intent, and never attain high efficacy.
Comparative analysis of archetype subgroups in the survey (N=147) showing the 'Cautious' group has lower reported usage frequency, lower intent to increase usage, and lower self-reported efficacy relative to 'Enthusiasts' and 'Pragmatists'.
Adoption of AI testing tools lags that of coding tools, creating a 'Testing Gap'.
Within-sample comparison of reported adoption rates for coding-oriented AI tools versus testing-oriented AI tools among 147 developers, showing lower adoption for testing tools.
Security concerns remain a moderate and statistically significant barrier to adoption.
Survey-derived security-concern metric (N=147) that shows a statistically significant negative association with future adoption intention (reported as moderate in effect size).
Traditional human resource management (HRM) approaches in hospitals rely on manual processes that are prone to errors, lack adaptability, and fail to adequately balance staff preferences with patient care requirements.
Background/positioning statement in the paper; asserted based on literature and authors' motivation for proposing an AI-driven framework (no specific dataset or quantitative analysis provided for this claim).
Simulations project measurable reductions in defect rates under AI-HRM scenarios.
Regression-based simulations of the counterfactual model include defect reduction as an organizational outcome and project decreases in defect rates when HR processes are AI-supported.
Simulations show notable reductions in absenteeism under the AI-HRM scenario.
Predictive estimation and regression-based simulations projecting absenteeism rates under counterfactual AI-supported HR processes using the industrial firm dataset.