Evidence (7953 claims)
Adoption: 5539 claims
Productivity: 4793 claims
Governance: 4333 claims
Human-AI Collaboration: 3326 claims
Labor Markets: 2657 claims
Innovation: 2510 claims
Org Design: 2469 claims
Skills & Training: 2017 claims
Inequality: 1378 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
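The matrix is easier to compare across outcomes as shares than as raw counts. The sketch below parses a row and computes the share of positive and negative findings. In several rows the four direction counts do not add up to the Total (for example, Other lists 402 + 112 + 67 + 480 = 1061 against a Total of 1076), so the shares are computed over the listed directions and the gap is reported separately. The function name `parse_matrix_row`, the field names, and the choice to read the dash placeholder as zero are assumptions made for illustration.

```python
def parse_matrix_row(line):
    """Parse one Markdown row of the Evidence Matrix into summary figures."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    outcome = cells[0]
    # Assumption: a cell that is not plain digits (the dash placeholder) means 0.
    positive, negative, mixed, null, total = (
        int(cell) if cell.isdigit() else 0 for cell in cells[1:6]
    )
    listed = positive + negative + mixed + null
    return {
        "outcome": outcome,
        "positive_share": positive / listed if listed else 0.0,
        "negative_share": negative / listed if listed else 0.0,
        # Claims counted in Total but not in any of the four listed directions.
        "unlisted": total - listed,
    }


rows = [
    "| Other | 402 | 112 | 67 | 480 | 1076 |",
    "| Task Completion Time | 88 | 5 | 4 | 3 | 100 |",
    "| Job Displacement | 6 | 38 | 13 | — | 57 |",
]
summaries = [parse_matrix_row(row) for row in rows]
for summary in summaries:
    print(summary)
```

Dividing by the listed directions rather than by Total keeps the shares summing to one across the four columns; dividing by Total would be an equally defensible choice.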
Qualitative analyses reveal emergent self-regulated efficiency: models autonomously eliminate redundant metacognitive loops without explicit length supervision.
Qualitative analysis of model behavior reported in the paper (no quantitative effect sizes provided in the excerpt).
BCR challenges the traditional accuracy-efficiency trade-off by demonstrating a 'free lunch' phenomenon at standard single-problem inference (i.e., reduced token usage with maintained or improved accuracy even at N=1).
Reported experimental results on 1.5B and 4B model families showing token reductions and maintained/improved accuracy at standard single-problem inference.
As the number of concurrent problems N increases, accuracy degrades far more gracefully than it does for baselines, establishing N as a controllable throughput dimension.
Comparative experiments versus baselines varying concurrent-problem count N; qualitative claim that accuracy degradation is 'far more graceful' than baselines.
As the number of concurrent problems N increases during inference, per-problem token usage decreases monotonically.
Reported experimental finding described as a novel task-scaling law observed when varying N at inference time; no numeric effect sizes provided in the excerpt.
Batched Contextual Reinforcement (BCR) reduces token usage by 15.8% to 62.6% while consistently maintaining or improving accuracy across five major mathematical benchmarks.
Empirical evaluation reported in the paper across two model families (1.5B and 4B) and five mathematical benchmarks; token usage reduction range and qualitative accuracy statement provided.
This research contributes to debates about the future of work, power asymmetries in platform economies, and the development of worker-protective regulatory frameworks, engaging perspectives from feminist economics, institutional theory, and surveillance capitalism studies.
Stated contribution in the abstract based on theoretical engagement and literature synthesis (conceptual claim; no empirical citation in abstract).
Theoretical frameworks developed in the paper require future empirical validation via case studies, quantitative analysis, and ethnographic research.
Methodological statement within the abstract describing the paper's limitations and next steps (self-report about the paper's status).
The study proposes institutional frameworks for realizing labor value, together with worker-protective regulatory frameworks applicable to digital/platform economies.
Normative/theoretical proposals derived from conceptual analysis and engagement with feminist economics, institutional theory, and surveillance capitalism literature (no empirical testing reported).
The paper identifies key characteristics of value formation specific to platform economies.
Theoretical framework and literature synthesis presented in the study (conceptual; no empirical cases reported in abstract).
Living labor remains the sole source of new value; the core insights of the labor theory of value remain essential for critiquing contemporary digital capitalism.
Argumentative/theoretical development grounded in Marxist political economy and literature synthesis (conceptual paper, no empirical testing reported).
AI should be classified as constant capital rather than as labor.
Theoretical analysis and critical literature synthesis in a conceptual study (no empirical sample reported).
Results may be applied in the development of financial institution strategies, regulatory frameworks, risk management systems and professional training programmes.
Applied implications drawn from the literature synthesis and comparative analysis; presented as potential uses rather than empirically validated interventions.
Significant changes in human resource needs are occurring, with growing demand for analysts and specialists combining financial and technological competencies.
Conclusion from literature review and synthesis of international studies on labour demand in finance under Big Data/AI adoption; no original labour-market survey included.
Big Data and AI technologies significantly improve efficiency, risk assessment accuracy, fraud detection and financial inclusion.
The paper reports results from a qualitative analysis of recent academic literature, comparative analysis of sector-specific applications, and synthesis of empirical findings from international studies; no primary sample size reported.
Overall, the findings indicate that AI is a transformative tool rather than merely a substitute for human employment: it changes the nature of human work rather than simply displacing it.
Synthesis conclusion in the paper drawing on the literature review and the authors' empirical results indicating task reallocation and changing job content.
The paper argues for equitable technology governance as a necessary policy response to AI's labor market effects.
Policy recommendations discussed in the paper that call for equitable governance of AI; based on literature synthesis and empirical findings.
The analysis raises policy implications emphasizing reskilling and education to address AI-driven changes in the labor market.
Policy discussion section summarized in the paper; draws on empirical findings and literature to recommend reskilling/education.
Moderate AI usage is associated with employment growth.
Part of the U-shaped relationship reported in the paper's empirical results; described qualitatively in the abstract/summary.
Secondary empirical evidence from Colombia's EDIT manufacturing survey (N=6,799 firms) shows that management practice quality amplifies the return to technology investment (interaction coefficient 0.304, p<0.01).
Secondary empirical analysis of EDIT manufacturing survey data; sample size reported as N = 6,799 firms; regression interaction term reported as coefficient 0.304 with p < 0.01.
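To make concrete what an interaction coefficient of this kind measures, here is a minimal sketch that fits an ordinary least squares model with a technology × management interaction term. It runs on synthetic data, not the EDIT survey; the variable names (`tech`, `mgmt`), the data-generating coefficients, and the plain OLS specification are all assumptions, and the paper's actual specification may differ.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate_and_fit(n=2000, true_interaction=0.3, seed=0):
    """Generate synthetic firms and recover the interaction coefficient."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        tech = rng.gauss(0, 1)   # hypothetical technology investment (standardized)
        mgmt = rng.gauss(0, 1)   # hypothetical management practice score
        noise = rng.gauss(0, 0.5)
        rows.append([1.0, tech, mgmt, tech * mgmt])
        y.append(1.0 + 0.5 * tech + 0.4 * mgmt
                 + true_interaction * tech * mgmt + noise)
    return ols(rows, y)


intercept, b_tech, b_mgmt, b_interaction = simulate_and_fit()
print(f"estimated interaction coefficient: {b_interaction:.3f}")
```

A positive interaction coefficient means the marginal return to `tech` is `b_tech + b_interaction * mgmt`, so it rises with management quality, which is the sense in which management practices "amplify" the return to technology.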
We endogenize the augmentation function as phi(D, W), where W is a five-dimensional workplace design vector (AI interface design, decision authority allocation, task orchestration, learning loop architecture, psychosocial work environment), and prove that human-centric design is profit-maximizing when the workforce's augmentable cognitive capital exceeds a critical threshold.
Theoretical model and formal proof presented in the paper (analytical derivation of phi(D,W) and threshold condition).
There is a need for energy-efficient AI development to align technological progress with sustainable energy consumption.
Policy recommendation based on the paper's empirical findings that AI adoption increases firm-level electricity demands in the short run; normative argument rather than a directly tested empirical claim.
The AI-related widening of the electricity output growth gap is stronger among manufacturing firms, non-state-owned firms, small firms, low-tech firms, and low-energy-consumption and low-pollution firms.
Heterogeneity/subgroup analyses across firm characteristics (ownership type, size, sector, technology intensity, baseline energy use and pollution levels) showing larger estimated effects in the listed subgroups. Specific subgroup sample sizes and coefficients not reported in the summary.
The effect of AI adoption on the electricity output growth gap is more pronounced for firms operating in highly competitive industries.
Heterogeneity analysis by industry competition intensity (likely via industry-level measures of competition); interaction regressions showing larger estimated effects in more competitive sectors. Sample/subgroup sizes not specified in the summary.
The effect of AI adoption on widening the electricity output growth gap is more pronounced for firms located in economically advanced regions.
Heterogeneity analysis by regional economic development level using the firm-level electricity consumption dataset; stratified or interaction regressions showing larger estimated effects in more advanced regions. Exact subgroup sizes not provided in the summary.
The main result (the initial widening of the electricity output growth gap) is robust to alternative variable definitions, to the exclusion of firms that rely on outsourced AI services and of non-AI-adoption samples, and to controls for endogeneity.
Robustness checks reported in the paper: alternative variable definitions, sample restrictions (excluding outsourced-AI-reliant firms and non-AI samples), and application of endogeneity control methods (e.g., instrumental variables or panel fixed effects). Exact methods and sample sizes not specified in the summary.
AI adoption initially widens the corporate electricity output growth gap at the firm level in China.
Empirical analysis using unique firm-level data on corporate electricity consumption in China; econometric estimation comparing electricity output growth between AI-adopting firms and non-adopting peers (panel/firm-level analysis). Sample size not stated in the summary.
Strong governance and advanced digital infrastructure are critical for realizing AI’s potential as a sustainable technology; governance-driven digital transformation is important for achieving sustainable growth.
Interpretation and policy implication drawn from the empirical findings that GQI and DII mitigate the AI→CO2 relationship in the 104-country panel analysis (2000–2023) employing GMM and 2SLS.
The environmental impact of AI is stronger in energy-inefficient and AI-advanced contexts.
Heterogeneity analysis in which the AI→CO2 effect is reported as larger for energy-inefficient countries and for countries in more advanced stages of AI diffusion (same 104-country panel, 2000–2023).
Adoption of AI currently contributes to higher CO2 emissions.
Empirical panel analysis of 104 countries over 2000–2023 using two-step system GMM and two-stage least squares (2SLS) estimations; AI adoption variable positively associated with country-level CO2 emissions in the reported regressions.
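For readers unfamiliar with the estimators named above, the following is a minimal sketch of two-stage least squares with one endogenous regressor and one instrument, run on synthetic data. It does not reproduce the paper's panel specification and does not cover system GMM; the variable roles (`z` as instrument, `x` as the AI adoption measure, `y` as emissions) and all coefficients are assumptions for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on a single regressor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """2SLS: regress x on z, then regress y on the fitted values of x."""
    a0, a1 = simple_ols(z, x)                  # first stage
    x_hat = [a0 + a1 * zi for zi in z]
    _, beta = simple_ols(x_hat, y)             # second stage
    return beta


def simulate(n=5000, true_effect=0.5, seed=1):
    """Synthetic data in which an unobserved confounder biases plain OLS."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                   # instrument, unrelated to confounder
        u = rng.gauss(0, 1)                    # unobserved confounder
        xi = 0.8 * zi + u + rng.gauss(0, 1)    # endogenous regressor
        yi = true_effect * xi + u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
_, ols_slope = simple_ols(x, y)
iv_slope = two_stage_least_squares(z, x, y)
print(f"OLS slope: {ols_slope:.3f}  2SLS slope: {iv_slope:.3f}")
```

In this simulation plain OLS overstates the effect because the confounder moves `x` and `y` together, while the 2SLS estimate uses only the variation in `x` that comes from the instrument.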
To optimize agentic AI integration and ensure responsible innovation across financial services, interdisciplinary, longitudinal research and robust governance frameworks are needed.
Authors' conclusions and recommendations based on the identified findings and gaps in the reviewed literature.
Diverse architectural models such as multi-agent systems and cloud-based frameworks enable scalable, adaptive agentic AI deployments in financial services.
Synthesis of architecture-focused studies and framework descriptions within the reviewed literature (architectural benchmarking across papers).
Findings reveal substantial productivity gains and operational efficiencies predominantly in banking and investment.
Systematic review synthesizing multidisciplinary qualitative, quantitative, and bibliometric studies of agentic AI applications in financial services published up to mid-2024 (review-level synthesis).
The ManagerWorker two-agent pipeline pairs an expensive text-only manager with a cheaper worker that has repository access, so that expensive-model capacity is spent on reasoning in the manager while execution is delegated to the cheaper worker.
System design description plus empirical results on 200 SWE-bench Lite instances showing parity in success rates between a strong-manager/weak-worker pipeline and a strong single agent while using fewer strong-model tokens.
A minimal review-only manager loop adds only 2 percentage points over the baseline, whereas structured exploration and planning by the manager add 11 percentage points, demonstrating that active direction (not mere reviewing) produces most of the benefit.
Ablation-style comparison of pipeline variants on the 200-instance SWE-bench Lite evaluation: review-only manager loop versus manager with structured exploration and planning; reported improvements in percentage points.
A strong manager directing a weak worker achieves a 62% success rate on software-engineering tasks, matching a strong single agent (60%) while using only a fraction of the strong-model tokens.
Empirical evaluation on 200 instances from SWE-bench Lite across five pipeline configurations and model pairings; measured task success rates and token usage for manager-worker pipelines versus single-agent baselines.
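The division of labor described in these claims can be sketched as a simple control loop. This is a hypothetical illustration, not the paper's implementation: the function `run_manager_worker`, the `PipelineResult` type, the round limit, and the toy stand-ins for the two models are all assumptions. The only properties taken from the source are that the manager works from text alone and that the worker is the component with repository access.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    solved: bool
    rounds: int
    transcript: List[str] = field(default_factory=list)


def run_manager_worker(
    task: str,
    manager: Callable[[str, List[str]], str],
    worker: Callable[[str], str],
    check: Callable[[str], bool],
    max_rounds: int = 3,
) -> PipelineResult:
    """Alternate between a planning manager and an executing worker."""
    reports: List[str] = []
    transcript: List[str] = []
    for round_number in range(1, max_rounds + 1):
        # The manager sees only text: the task and the worker's past reports.
        instruction = manager(task, reports)
        transcript.append(f"manager: {instruction}")
        # The worker is the only component that touches the repository.
        report = worker(instruction)
        transcript.append(f"worker: {report}")
        reports.append(report)
        if check(report):
            return PipelineResult(True, round_number, transcript)
    return PipelineResult(False, max_rounds, transcript)


# Toy stand-ins for the strong (manager) and weak (worker) models.
def toy_manager(task: str, reports: List[str]) -> str:
    if not reports:
        return f"Explore the repository and locate code related to: {task}"
    return f"Apply a patch based on this finding: {reports[-1]}"


def toy_worker(instruction: str) -> str:
    if instruction.startswith("Explore"):
        return "found a suspect function"
    return "patch applied; tests pass"


def toy_check(report: str) -> bool:
    return "tests pass" in report


result = run_manager_worker("fix off-by-one bug", toy_manager, toy_worker, toy_check)
print(result.solved, result.rounds)
```

In this toy run the manager first directs exploration and then a patch, which mirrors the reported finding that active direction, rather than review alone, accounts for most of the gain.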
Overall, the HCT is a robust, accurate, and transparent alternative to the AI-as-advisor approach, offering a simple mechanism to tap into the wisdom of hybrid crowds.
Overall conclusion drawn from the empirical comparisons across datasets and analyses described in the paper (summary statement in abstract).
Using signal detection theory, the paper finds that the HCT outperforms the AI-as-advisor approach because people cannot discriminate well enough between correct and incorrect AI advice.
Analysis in the paper applying signal detection theory to the empirical results (as stated in abstract).
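The discrimination ability referred to here is conventionally summarized in signal detection theory by the sensitivity index d′ = z(hit rate) − z(false-alarm rate). The sketch below computes it under the assumption that a "hit" is following AI advice that was correct and a "false alarm" is following AI advice that was incorrect; that mapping, and the example rates, are illustrative and not taken from the paper.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity: how well correct advice is told apart from incorrect advice."""
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate: float, false_alarm_rate: float) -> float:
    """Response bias: negative values indicate a tendency to follow advice."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical rates: advice is followed 80% of the time when it is correct
# and 70% of the time when it is incorrect.
print(round(d_prime(0.8, 0.7), 3), round(criterion(0.8, 0.7), 3))
```

A d′ near zero means people follow correct and incorrect advice at similar rates, which is the situation in which deferring to an advisor adds little. Rates of exactly 0 or 1 have no finite z-score, so in practice they are usually adjusted slightly before applying the formula.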
The HCT also performed better in almost all cases in which the AI offered an explanation of its judgment.
Empirical results on the subset of four datasets with AI explanations (abstract reports HCT performed better in 'almost all' of these cases).
The HCT outperformed the AI-as-advisor approach in all datasets.
Empirical comparisons reported across the 10 datasets (statement in abstract that HCT 'outperformed' in all datasets). Specific performance metrics not provided in abstract.
The study points to the need for longitudinal, experimental, or platform-log-based designs to establish causality and measure heterogeneity across platforms.
Authors' methodological recommendations and proposed empirical agenda built on limitations of their cross-sectional survey (N = 450) and literature gaps.
Policy and practice interventions (media literacy, platform design changes, mandated diversity, etc.) are recommended to increase informational diversity and mitigate polarization.
Policy recommendations derived from study findings and literature discussion; not evaluated experimentally in the paper (authors propose interventions as implications).
Algorithmic recommendation (structural) and user selective consumption (behavioural) jointly reinforce ideological positions in digital spaces.
Interpretation based on observed associations between selective exposure and polarization plus reported heterogeneity in perceived algorithmic influence from the N = 450 survey; authors frame results as indicating interacting structural and behavioural mechanisms.
Higher levels of selective exposure are positively associated with increased ideological polarization.
Correlational analyses (reported associations / regression-style tests) using survey measures of selective exposure and measures of opinion/political polarization in the same cross-sectional sample (N = 450).
A large majority of respondents reported frequent exposure to content aligned with their preexisting views (widespread echo chambers / filter bubbles).
Quantitative cross-sectional survey of N = 450 active social media users; self-reported measures of content consumption and indicators of selective exposure; descriptive statistics showing most respondents frequently encounter ideologically consonant content.
An AI agent given revealed-preference data predicts subjects' choices more accurately than an AI agent given stated-preference prompts.
Online experiment in which subjects provided written instructions (prompts) and revealed preferences via choices in a series of binary lottery questions; AI agents were given either the revealed-preference data or the stated-preference prompts and their prediction accuracy on subjects' choices was compared.
Under economy-wide deployment, the share of computer-vision-exposed labor compensation that is cost-effectively automatable rises sharply (relative to the firm-level 11% estimate).
Model counterfactuals or calibration scenarios comparing firm-level deployment vs economy-wide deployment; qualitative statement that share increases substantially.
At the firm level, cost-effective automation captures approximately 11% of computer-vision-exposed labor compensation.
Calibration and implementation in computer vision; reported firm-level estimate from the framework.
Scale of deployment is a key determinant: AI-as-a-Service and AI agents spread fixed costs across users, sharply expanding economically viable tasks.
Modeling and calibration arguments showing fixed-cost spreading effects increase set of tasks for which automation is cost-effective; qualitative and quantitative comparisons in implementation.
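The fixed-cost-spreading argument comes down to simple arithmetic, sketched below. The cost structure (one fixed development cost shared across all users plus a constant marginal cost per task) and every number in the example are assumptions for illustration; they are not the paper's calibration.

```python
import math
from typing import Optional


def ai_cost_per_task(fixed_cost: float, marginal_cost: float,
                     users: int, tasks_per_user: int) -> float:
    """Average AI cost per task when the fixed cost is shared by all users."""
    return fixed_cost / (users * tasks_per_user) + marginal_cost


def min_users_for_viability(fixed_cost: float, marginal_cost: float,
                            tasks_per_user: int,
                            human_cost_per_task: float) -> Optional[int]:
    """Smallest user count at which AI is strictly cheaper than human labor."""
    saving_per_task = human_cost_per_task - marginal_cost
    if saving_per_task <= 0:
        return None  # AI is never cheaper, whatever the scale
    break_even = fixed_cost / (tasks_per_user * saving_per_task)
    return max(1, math.floor(break_even) + 1)


# Hypothetical task: 100,000 to develop, 0.10 per task to run,
# 1,000 tasks per user, and 2.00 per task when done by a person.
single_firm = ai_cost_per_task(100_000, 0.10, users=1, tasks_per_user=1000)
as_a_service = ai_cost_per_task(100_000, 0.10, users=100, tasks_per_user=1000)
threshold = min_users_for_viability(100_000, 0.10, 1000, 2.00)
print(single_firm, as_a_service, threshold)
```

With these illustrative numbers the task is far from viable for a single firm but becomes cheaper than human labor once roughly fifty or more users share the fixed cost, which is the mechanism by which service-style deployment expands the set of viable tasks.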
Because higher accuracy is disproportionately costly (convex cost), full automation is often not cost-minimizing; partial automation, where firms retain human workers for residual tasks, frequently emerges as the equilibrium.
Theoretical model combined with calibration (scaling laws + task mappings); equilibrium outcomes reported from the framework implementation.
We model automation intensity as a continuous choice in which firms minimize costs by selecting an AI accuracy level, from no automation through partial human-AI collaboration to full automation.
The paper develops a theoretical framework / model that treats automation intensity as a continuous decision variable; described as the central modeling approach.
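A minimal numerical sketch can show how a convex cost of accuracy produces interior (partial-automation) solutions as well as both corner cases. The functional form here is an assumption, not the paper's model: a firm pays a fixed setup cost for any automation, an AI cost `k * a**p` that is convex in accuracy `a`, and a wage bill `w * (1 - a)` for the residual share of work left to humans.

```python
def total_cost(a: float, k: float, w: float, p: float = 2.0,
               setup: float = 0.0) -> float:
    """Cost of running at AI accuracy a in [0, 1] under the assumed form."""
    ai_cost = (setup if a > 0 else 0.0) + k * a ** p
    human_cost = w * (1.0 - a)   # humans handle the residual share
    return ai_cost + human_cost


def best_accuracy(k: float, w: float, p: float = 2.0,
                  setup: float = 0.0, steps: int = 100) -> float:
    """Grid search for the cost-minimizing accuracy level."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: total_cost(a, k, w, p, setup))


def regime(a: float) -> str:
    if a == 0.0:
        return "no automation"
    if a == 1.0:
        return "full automation"
    return "partial automation"


# Hypothetical parameter sets illustrating the three regimes.
partial = best_accuracy(k=100, w=100)            # interior optimum
full = best_accuracy(k=100, w=300)               # labor is expensive
none = best_accuracy(k=100, w=100, setup=80)     # setup cost outweighs savings
for a in (partial, full, none):
    print(a, regime(a))
```

With `p = 2` and no setup cost, the first-order condition gives `a = w / (2k)`, so the optimum is interior whenever the wage is below twice the AI cost scale. That is the sense in which disproportionately costly accuracy makes partial automation, with humans retained for residual tasks, the cost-minimizing outcome.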