Evidence (13827 claims)
| Topic | Claims |
|---|---|
| Adoption | 8454 |
| Productivity | 7544 |
| Governance | 6789 |
| Human-AI Collaboration | 6327 |
| Org Design | 4126 |
| Innovation | 4058 |
| Labor Markets | 3520 |
| Skills & Training | 2924 |
| Inequality | 2057 |
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 749 | 195 | 97 | 889 | 1979 |
| Governance & Regulation | 815 | 391 | 188 | 121 | 1539 |
| Organizational Efficiency | 771 | 189 | 124 | 83 | 1177 |
| Technology Adoption Rate | 624 | 233 | 123 | 96 | 1084 |
| Research Productivity | 410 | 121 | 56 | 331 | 929 |
| Output Quality | 466 | 177 | 59 | 47 | 749 |
| Decision Quality | 320 | 174 | 75 | 42 | 618 |
| Firm Productivity | 435 | 55 | 88 | 20 | 604 |
| AI Safety & Ethics | 214 | 276 | 65 | 33 | 593 |
| Market Structure | 178 | 166 | 122 | 24 | 495 |
| Task Allocation | 206 | 64 | 70 | 31 | 376 |
| Skill Acquisition | 165 | 57 | 60 | 17 | 299 |
| Innovation Output | 201 | 27 | 41 | 18 | 288 |
| Employment Level | 105 | 51 | 107 | 13 | 278 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 116 | 63 | 42 | 11 | 232 |
| Firm Revenue | 149 | 46 | 26 | 3 | 224 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Task Completion Time | 169 | 29 | 8 | 12 | 219 |
| Worker Satisfaction | 89 | 61 | 20 | 12 | 182 |
| Error Rate | 69 | 91 | 10 | 2 | 172 |
| Regulatory Compliance | 76 | 68 | 14 | 5 | 163 |
| Training Effectiveness | 92 | 19 | 13 | 19 | 145 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Automation Exposure | 51 | 54 | 22 | 12 | 142 |
| Team Performance | 86 | 17 | 27 | 9 | 140 |
| Developer Productivity | 94 | 17 | 14 | 6 | 132 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 51 | 7 | 8 | 3 | 69 |
| Skill Obsolescence | 5 | 45 | 6 | 1 | 57 |
| Creative Output | 31 | 16 | 7 | 2 | 57 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 17 | 17 | — | 51 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
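The matrix can also be read as shares rather than raw counts. The sketch below is illustrative only: it copies three rows from the table and computes each direction's share of the four listed directions. The `Row` type and the `direction_shares` helper are hypothetical names, not part of the source. The Total column is not always the sum of the four direction columns (Firm Productivity lists 604 against a directional sum of 598), so the sketch uses the directional sum as the denominator; that choice is an assumption.

```python
from typing import Dict, NamedTuple


class Row(NamedTuple):
    outcome: str
    positive: int
    negative: int
    mixed: int
    null: int


# Three rows copied from the matrix above.
ROWS = [
    Row("Firm Productivity", 435, 55, 88, 20),
    Row("Job Displacement", 12, 80, 20, 1),
    # The Null cell is blank in the table; treating it as 0 is an assumption.
    Row("Labor Share of Income", 17, 17, 17, 0),
]


def direction_shares(row: Row) -> Dict[str, float]:
    """Return each direction's share of the four listed direction counts."""
    counts = {
        "positive": row.positive,
        "negative": row.negative,
        "mixed": row.mixed,
        "null": row.null,
    }
    total = sum(counts.values())  # directional sum, not the table's Total column
    return {name: count / total for name, count in counts.items()}


for row in ROWS:
    shares = direction_shares(row)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{row.outcome}: {summary}")
```

Read this way, Job Displacement is dominated by negative findings, while Labor Share of Income splits evenly across the three populated directions.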
Adoption of integrated recruitment technology yielded a 45% improvement in candidate quality as measured by first-year performance ratings.
Reported quantitative result from the survey (N=150) and case study evidence using first-year performance ratings as the quality metric.
Organizations adopting integrated technology-driven recruitment platforms experienced an average reduction in time-to-hire of 38%.
Reported quantitative finding based on the paper's mixed-methods analysis (survey of 150 HR professionals and corroborating qualitative case studies of 4 organizations).
These results suggest that LinuxArena has meaningful headroom for both attackers and defenders, making it a strong testbed for developing and evaluating future control protocols.
Authors synthesize results from sabotage evaluations, monitor evaluations, and the LaStraj human-attack dataset to conclude there is room for improvement on both attacker and defender sides; this is presented as an implication/recommendation rather than a strictly measured outcome.
LinuxArena contains 184 side tasks representing safety failures such as data exfiltration and backdooring.
Authors report the number of side tasks and describe their nature (safety failures) in the dataset/control setting documentation.
LinuxArena contains 1,671 main tasks representing legitimate software engineering work.
Authors report the number of main tasks when describing the contents of LinuxArena.
LinuxArena contains 20 environments.
Authors report constructing LinuxArena and state the number of environments explicitly in the paper's description of the dataset/control setting.
We introduce DELEGATE-52 to study the readiness of AI systems in delegated workflows; DELEGATE-52 simulates long delegated workflows that require in-depth document editing across 52 professional domains (e.g., coding, crystallography, and music notation).
Paper describes creation of a benchmark/dataset called DELEGATE-52 covering 52 professional domains and designed to simulate long delegated document-editing workflows.
Drawing on Moral Foundations Theory and a multi-stakeholder perspective, moral (mis)alignment matters for the meaningful integration of AI in sensitive contexts.
Paper's theoretical framing and normative claim (method: conceptual synthesis using Moral Foundations Theory and multi-stakeholder argumentation; no empirical sample or quantitative results reported in the supplied text).
Moral alignment is defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders.
Explicit definitional statement in the paper (conceptual definition; no empirical measurement reported in the supplied text).
Moral alignment may be a more fundamental dimension of human-AI decision-making than functional or behavioral alignment.
Paper's central argumentative claim (theoretical proposition building on conceptual reasoning and prior theory; no empirical evidence or sample size reported in the supplied text).
In high-stakes AI-supported decisions, considerations are not purely technical but involve moral judgments about fairness, responsibility, and harm.
Stated as a conceptual assertion in the paper's framing/abstract; presented as an observation building on prior literature (no empirical method or sample size reported in the supplied text).
Our paper contributes to the emerging discourse on AI overreliance and presents an understanding of the appropriate degree of reliance as essential for developers to make the most of these powerful technologies.
Authors' claimed contribution based on synthesis of themes from twenty-two interviews and presentation of the reliance-control framework.
The reliance-control framework can be used to recommend future research to explore different control levels supported by current and emergent LLM-driven tools.
Paper explicitly uses the framework to motivate and recommend directions for future research; based on qualitative interview findings (n=22) and authors' synthesis.
We propose a preliminary reliance-control framework where the level of control can be used to identify AI overreliance and underreliance.
Authors present a conceptual/framework contribution derived from analysis of the twenty-two interviews; this is a proposed (theoretical) framework rather than an experimentally validated one.
The model's contribution lies in integrating four interdependent governance layers—technical, organizational, workforce, and regulatory—within a single labor-market framework.
Paper's stated conceptual contribution describing the four-layer governance model derived from the evidence map and synthesis.
Based on an evidence map of the included studies, we propose a hybrid governance model combining technical and organizational audits, inclusive upskilling/reskilling, participatory regulation, and responsible HR policies to align AI innovation with decent and inclusive work.
Conceptual proposal grounded in the paper's evidence map and qualitative synthesis of the 19 studies; model components explicitly listed in the text.
The evidence indicates that AI can support inclusion through assistive technologies and improved matching in labor-market settings.
Synthesis claim based on thematic analysis of the 19 included peer-reviewed studies (qualitative evidence across the corpus pointing to assistive technologies and improved matching as inclusion-supporting mechanisms).
Strategic adoption of AI can significantly improve project outcomes and operational performance in the construction industry.
Synthesis of case study findings indicating improved scheduling, risk management, resource allocation, reduced delays and costs, and improved productivity; support is based on the analysed cases rather than a large-scale representative sample.
Artificial Neural Networks (ANN) and predictive modelling support data-driven decision-making in construction.
Paper highlights the use of ANN and predictive modelling in case studies and their role in supporting data-driven decision-making; the summary does not provide quantitative performance metrics for these models.
Quantitative results demonstrate notable improvements in productivity and time efficiency across the analysed cases.
Summary reports quantitative analyses across the case studies showing improvements in productivity and time efficiency; no explicit sample size, statistical significance values, or effect magnitudes provided in the summary.
These enhancements lead to measurable reductions in project delays, operational costs, and safety risks.
Authors state quantitative measurements from analysed cases indicate reductions in delays, costs, and safety risks attributable to AI-driven tools. The summary does not provide numeric magnitudes or sample counts.
AI-driven tools enhance project scheduling, risk management, and resource allocation.
Reported findings across multiple case studies (qualitative and quantitative analyses) where AI applications were applied to scheduling, risk management, and resource allocation tasks. The summary does not specify the number of cases or the statistical tests used.
The work holds important practical significance for promoting the coordinated and sustainable development of efficiency and fairness in the field of digital recruitment in China.
Concluding claim in abstract about practical significance and intended impact on efficiency and fairness; no empirical measures of nationwide impact provided.
These individual adaptation strategies provide important microlevel references for platform algorithm optimization and the improvement of relevant regulatory policies.
Paper's implication/discussion claim in abstract that findings can inform platform design and policy; presented as an application rather than empirically proven policy impact.
An empirical study revealed that active and targeted individual adaptation can effectively avoid the negative impact of algorithmic bias and significantly improve the overall job search success rates of different groups.
Statement in abstract reporting results of an empirical study conducted by the authors; however, the abstract does not report sample size, experimental design, statistical significance levels, or effect sizes.
A scientific four-in-one adaptation strategy system encompassing resume optimization, channel selection, proactive communication, and ability enhancement is constructed.
Paper's stated contribution: construction of a four-part adaptation strategy for job seekers described in abstract; no empirical validation details provided in abstract.
With the popularization of digital recruitment platforms in the era of artificial intelligence, algorithmic screening has become a core and indispensable component of talent matching in the modern labor market.
Statement in paper's introduction/abstract asserting widespread adoption of digital recruitment platforms and centrality of algorithmic screening; no specific adoption figures or data reported in the abstract.
The positive effect of supply chain digitalization on human capital structure is stronger for enterprises located in the eastern region of China.
Heterogeneity analysis in the paper using the DID framework on A-share listed companies (2013–2022); regional subsample analysis shows a larger effect in eastern China.
The positive effect of supply chain digitalization on human capital structure is stronger for enterprises operating in more competitive industries.
Heterogeneity analysis reported in the paper using DID on A-share listed firms (2013–2022); industry competition intensity is used to split sample and examine differential effects.
The positive effect of supply chain digitalization on optimizing human capital structure is stronger for enterprises facing higher external environmental uncertainty.
Heterogeneity analysis in the paper using the DID sample of A-share listed firms (2013–2022); authors report the effect is more pronounced under higher environmental uncertainty.
Supply chain digitalization enhances enterprises' capacity to absorb high-skilled labor by promoting the accumulation of digital intangible assets.
Mechanism analysis in the paper using DID on A-share listed companies (2013–2022); accumulation of digital intangible assets is cited as a channel increasing firms' demand for, and ability to hire, high-skilled workers.
Supply chain digitalization enhances enterprises' capacity to absorb high-skilled labor by alleviating financing constraints.
Mechanism analysis reported in the paper using the quasi-natural experiment and DID approach on A-share listed firms; easing financing constraints is presented as one channel.
Supply chain digitalization enhances enterprises' capacity to absorb high-skilled labor by boosting public trust in brands.
Mechanism analysis in the paper using the DID design on A-share listed firms (2013–2022); brand/public trust is reported as a mediating channel.
Supply chain digitalization enhances enterprises' capacity to absorb high-skilled labor by increasing firms' market attention.
Mechanism analysis reported in the paper using the same DID framework and sample (A-share listed firms 2013–2022); market attention is listed as an identified channel through which digitalization affects human capital.
Supply chain digitalization drives the optimization of the human capital structure of enterprises.
Empirical analysis on A-share listed companies on the Shanghai and Shenzhen Stock Exchanges from 2013 to 2022; authors treat pilots of supply chain innovation and application as a quasi-natural experiment and employ a difference-in-differences (DID) approach to identify the effect.
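The difference-in-differences logic behind these supply chain digitalization claims can be sketched in its simplest two-group, two-period form. This is a minimal illustration, not the authors' specification: a firm-level panel study would normally estimate the effect by regression with additional controls, which this sketch omits. The function name `did_estimate` and all figures are hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import Sequence


def did_estimate(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Two-by-two difference-in-differences on group means.

    The change in the control group stands in for what would have happened
    to the treated group without treatment (the parallel-trends assumption).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical shares of high-skilled employees, invented for illustration.
pilot_before = [0.20, 0.22, 0.18]      # pilot firms, before the pilot
pilot_after = [0.27, 0.29, 0.25]       # pilot firms, after the pilot
non_pilot_before = [0.19, 0.21, 0.20]  # non-pilot firms, same pre-period
non_pilot_after = [0.22, 0.24, 0.23]   # non-pilot firms, same post-period

effect = did_estimate(pilot_before, pilot_after, non_pilot_before, non_pilot_after)
print(f"Estimated effect on high-skilled share: {effect:.3f}")
```

Subtracting the control group's change removes any trend common to both groups, so the remainder is attributed to the pilot only if the two groups would otherwise have moved in parallel.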
Successful implementation requires tailored strategies that address contextual, technical, and human factors.
Authors' synthesis and recommendations based on patterns and barriers identified in the included studies.
Client technological readiness plays a positive role in remote auditing.
Moderating and mediating findings reported across the included studies, as summarized in the review, indicate that client readiness supports remote audit processes.
Technologies such as big data analytics, artificial intelligence, and federated learning have a transformative impact on audit quality and efficiency.
Synthesis of findings from the 10 included empirical studies reporting effects of these technologies on auditing outcomes.
Fairness should be evaluated at the system level (the interacting agents) rather than solely at the level of individual models, because fairness can be an emergent, procedural property of decentralized agent interaction.
Conceptual framing supported by the triage experiments showing emergent fairness properties from agent interaction that were not present at the single-agent level.
Aligned agents partially moderate bias through contestation rather than override, acting as corrective patches that restore access for marginalized groups without fully converting a biased counterpart.
Behavioral observations from the triage negotiation trials where aligned agents contested allocations proposed by biased or unaligned agents and adjusted final allocations in ways that increased access for marginalized groups while not fully changing the adversarial agent's preferences.
Neither agent's allocation is ethically adequate in isolation, yet their joint final allocation can satisfy fairness criteria that neither would have reached alone.
Comparative analysis of individual-agent allocations versus joint allocations after three rounds of negotiation in the hospital triage simulation; claim based on observed differences between solitary and joint outcomes.
Fairness in language models emerges through interaction and exchange among agents, rather than being solely a property of a single, centrally optimized model.
Controlled simulation using a hospital triage framework in which two agents negotiate over three structured debate rounds; one agent is aligned via retrieval-augmented generation (RAG) and the other is unaligned or adversarially prompted. Observed final allocations and negotiation dynamics used to support the claim.
Policy implications: strengthening digital infrastructure, human capital, and innovation capacity is important to ensure inclusive productivity gains from the AI revolution in BRICS economies.
Normative recommendation derived from empirical findings that digital infrastructure complements AI-driven technological change (TC) and efficiency change (EC) and that differential AI effects are linked to country-level capacities; recommendation follows from observed divergence across economies.
The study contributes methodologically by providing a comparative, frontier-based assessment of AI-driven productivity in emerging economies and by distinguishing innovation (frontier-shifting) and diffusion (efficiency) effects of AI.
Two-stage empirical approach combining Malmquist TFP decomposition (frontier analysis) with panel regressions linking TFP components to multiple AI penetration indicators (patents, investment, robot density, digital infrastructure) across BRICS, 2005–2023.
Digital infrastructure is a critical complementary factor influencing both efficiency improvements and frontier-shifting technological change.
Regression analysis includes digital infrastructure indicators and reports that better digital infrastructure is associated with positive effects on both EC and TC (either directly or via interaction terms with AI indicators). Panel data over BRICS, 2005–2023.
Adoption-oriented AI indicators, including robot density, contribute to efficiency improvements (EC).
Panel regressions linking Efficiency Change (EC) to adoption-oriented indicators (robot density and similar diffusion measures) show positive associations, interpreted as diffusion improving efficiency rather than shifting the frontier.
Innovation-oriented AI activities (AI patents and research investment) are strongly associated with frontier-shifting technological change (TC).
Second-stage panel regression analysis relating TC to AI penetration indicators (AI patents, AI research investment), using BRICS panel data (2005–2023). Reported statistically significant positive associations between patent/research investment indicators and TC.
China and India exhibit sustained productivity growth over 2005–2023 driven primarily by technological progress.
Malmquist Total Factor Productivity (TFP) index computed for BRICS and decomposed into Efficiency Change (EC) and Technological Change (TC); time series patterns show sustained TFP growth for China and India with TC as the dominant component. Panel covers BRICS economies (Brazil, Russia, India, China, South Africa) for 2005–2023.
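The split of the Malmquist index into Efficiency Change (EC) and Technological Change (TC) can be sketched with the standard textbook decomposition, in which TFP change equals EC multiplied by TC. The excerpt does not say how the distance functions were estimated or which orientation was used, so the sketch takes four distance-function values as given; the function name `malmquist_decomposition` and the input values are hypothetical.

```python
import math
from typing import Tuple


def malmquist_decomposition(
    d_t_t: float,    # D^t(x^t, y^t): period-t data against the period-t frontier
    d_t_t1: float,   # D^t(x^{t+1}, y^{t+1}): period-(t+1) data against the period-t frontier
    d_t1_t: float,   # D^{t+1}(x^t, y^t): period-t data against the period-(t+1) frontier
    d_t1_t1: float,  # D^{t+1}(x^{t+1}, y^{t+1}): period-(t+1) data against its own frontier
) -> Tuple[float, float, float]:
    """Return (EC, TC, TFP change) from four distance-function values."""
    # Efficiency change: movement relative to the contemporaneous frontier.
    ec = d_t1_t1 / d_t_t
    # Technological change: geometric mean of the frontier shift measured
    # at the period-(t+1) data point and at the period-t data point.
    tc = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    return ec, tc, ec * tc


# Hypothetical distance values for one economy between two adjacent years.
ec, tc, tfp = malmquist_decomposition(d_t_t=0.8, d_t_t1=1.1, d_t1_t=0.7, d_t1_t1=0.9)
print(f"EC = {ec:.3f}, TC = {tc:.3f}, TFP change = {tfp:.3f}")
```

A value above 1 indicates improvement. A pattern like the one reported for China and India would show TC accounting for most of the TFP change, with EC close to 1.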
Current LLMs are imperfect spatial reasoners, a problem that AADvark addresses by incorporating external constraint solver tools with a specialized visual feedback mechanism.
Diagnosis followed by methodological response: authors argue LLM spatial reasoning is imperfect and describe AADvark's use of external constraint solvers and visual feedback to mitigate this; empirical evidence not provided in this excerpt.
Unlike previous state-of-the-art systems, AADvark captures dynamic part interactions with one or more degrees of freedom.
Design claim about the system's modeling of dynamic part interactions (method/architecture difference); supported by the authors' system design and comparison to prior state-of-the-art as asserted in the paper excerpt.