Evidence (2954 claims shown; Human-AI Collaboration filter applied)

Claims by category:
- Adoption: 5126 claims
- Productivity: 4409 claims
- Governance: 4049 claims
- Human-AI Collaboration: 2954 claims
- Labor Markets: 2432 claims
- Org Design: 2273 claims
- Innovation: 2215 claims
- Skills & Training: 1902 claims
- Inequality: 1286 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
Human-AI Collaboration
A high-level RL agent dynamically adjusts end-effector interaction forces (contact wrench) in real time based on perception feedback of material location.
Method description: the high-level agent outputs adjustments to interaction force/wrench informed by perception of material location inside the vial; the RL algorithm and detailed observation/action representations are not specified in the summary.
A low-level Cartesian impedance controller provides stable, compliant physical interaction for contact stability during scraping.
Control architecture description: the paper uses Cartesian impedance control as the low-level controller intended to handle contact compliance and stability; empirical stability metrics are not given in the summary.
The learned policy trained in simulation was successfully transferred to a real Franka Research 3 robot (sim-to-real transfer).
Training in a task-representative simulator followed by deployment on a Franka Research 3 setup in real-world scraping experiments; transfer success is asserted in the paper summary. The evaluation included five material setups on the real robot (exact number of trials per setup not specified).
An adaptive control framework that combines a low-level Cartesian impedance controller with a high-level reinforcement learning (RL) agent — guided by perception of material location — enables a robot to learn and adapt the optimal contact wrench for scraping heterogeneous samples in a constrained vial environment.
System design and experiments: the paper describes a two-level control architecture (Cartesian impedance + high-level RL) trained in a task-representative simulation and deployed on a real Franka Research 3 robot. Real-world experiments were performed in a constrained vial scraping task (details on trial counts per condition not provided in the summary).
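The two-level architecture described above can be sketched in code. This is a minimal illustration, not the paper's implementation: the class and function names (`ImpedanceController`, `high_level_rl_step`), the gain values, and the zero stand-in policy are all assumptions, since the summary does not specify the RL algorithm or observation/action representations.

```python
import numpy as np

class ImpedanceController:
    """Low-level Cartesian impedance law: F = K (x_d - x) + D (v_d - v) + F_ff.
    Gains here are illustrative, not values from the paper."""
    def __init__(self, stiffness, damping):
        self.K = np.diag(stiffness)   # 6x6 Cartesian stiffness (trans + rot)
        self.D = np.diag(damping)     # 6x6 Cartesian damping

    def command(self, x_d, v_d, x, v, wrench_ff):
        # Compliant pose tracking plus a feed-forward contact wrench
        # supplied by the high-level agent
        return self.K @ (x_d - x) + self.D @ (v_d - v) + wrench_ff

def high_level_rl_step(policy, material_location, current_wrench):
    """High-level agent: map perception of material location inside the vial
    to an adjustment of the commanded contact wrench."""
    obs = np.concatenate([material_location, current_wrench])
    delta = policy(obs)               # learned policy outputs a wrench delta
    return current_wrench + delta

# Toy usage with a zero policy standing in for the trained RL agent
ctrl = ImpedanceController(stiffness=[800.0] * 3 + [30.0] * 3,
                           damping=[40.0] * 3 + [2.0] * 3)
wrench = np.zeros(6)
wrench = high_level_rl_step(lambda o: np.zeros(6),
                            np.array([0.0, 0.0, 0.05]), wrench)
f_cmd = ctrl.command(x_d=np.zeros(6), v_d=np.zeros(6),
                     x=np.full(6, 0.01), v=np.zeros(6), wrench_ff=wrench)
```

The split matters for contact stability: the impedance loop reacts at control rate to unmodeled contact, while the slower RL loop only shifts the wrench setpoint.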
Automating routine SE tasks plausibly yields measurable productivity gains at team and firm levels, but quantifying them requires causal, outcome-based studies (e.g., throughput, defect rates, time-to-market).
Interpretation of literature review findings and survey-reported perceived productivity gains; no causal empirical estimates provided in the paper.
Empirical survey evidence shows generally positive perceptions of AI tools among software engineering professionals and growing adoption.
Cross-sectional survey of software engineering professionals asking about current tool usage and perceived benefits (productivity, quality, speed); absolute respondent count and sampling frame not provided in the summary.
ML enables predictive features in software engineering: effort estimation, defect prediction, work prioritization, and risk forecasting that support Agile planning and continuous delivery.
Literature review of ML-for-SE research and practitioner survey reporting use or expectations of predictive features; specific model performance metrics or dataset sizes not reported in the summary.
NLP techniques improve requirements management and team collaboration by extracting intent from natural-language artifacts (tickets, specs, PRs) and reducing miscommunication.
Synthesis of prior studies in the literature review and survey responses indicating perceived improvement in requirements handling and communication; survey sample size not reported.
Including task cluster features yields measurable improvements in predictive probes under stratified 5-fold cross-validation (i.e., the results are robust to cross-validated evaluation).
Empirical claim explicitly stating the evaluation methodology: two predictive probes evaluated with stratified 5-fold cross-validation showed improved winner prediction accuracy and reduced difficulty prediction error when cluster features were included. Exact numerical results are not provided in the summary.
Clusters and derived priors are human-interpretable and suitable to surface to end users as decision primitives.
Interpretability claim based on the semantic clustering approach and the intelligibility of win-rate and tie-rate maps; paper emphasizes interpretability but does not report user studies measuring comprehension or usability in this summary.
The proposed protocol (routing primary vs primary+auditor, rationale disclosure, privacy-preserving logs) enables routable, verifiable, and auditable delegation decisions.
Protocol design claim: authors describe a closed-loop system that uses Capability Profiles and Coordination-Risk Cues to route requests, request rationale, and log interactions. This is a systems/protocol proposal rather than a field-evaluated result; no deployment-scale evaluation reported here.
Including task cluster features reduces error in difficulty prediction (regression probe).
Empirical result from regression predictive probe comparing models with and without cluster features; evaluation used stratified 5-fold cross-validation. Specific error metrics and magnitudes not provided in the summary.
Including task cluster features improves winner prediction accuracy in predictive probes.
Empirical result from two predictive probes (classification/regression) reported in the paper; models trained with and without cluster features evaluated using stratified 5-fold cross-validation. Exact effect sizes or absolute accuracy numbers are not provided in the summary.
Introducing a task-aware collaboration signaling layer built from offline pairwise preference data can substantially reduce information asymmetry between humans and LLM agents.
Empirical claim supported by the proposed signaling layer derived from Chatbot Arena pairwise preference comparisons; validated via two predictive probes (classification/regression) showing improved predictive performance when cluster features are included. Data source: Chatbot Arena pairwise comparisons (dataset size not specified). Evaluation used stratified 5-fold cross-validation.
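The probe design described above (models with vs. without cluster features, evaluated under stratified 5-fold cross-validation) can be sketched as follows. The data here is synthetic and the feature names are invented; the paper's actual Arena-derived features, probe models, and effect sizes are not reproduced in the summary.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: base covariates plus a hypothetical task-cluster label
rng = np.random.default_rng(0)
n = 500
base_features = rng.normal(size=(n, 4))
cluster_id = rng.integers(0, 5, size=n)
# Outcome ("winner") depends on a base feature and on membership in one cluster
y = (base_features[:, 0] + 0.8 * (cluster_id == 2)
     + rng.normal(scale=0.5, size=n)) > 0

cluster_onehot = np.eye(5)[cluster_id]
X_base = base_features                              # probe without cluster features
X_full = np.hstack([base_features, cluster_onehot])  # probe with cluster features

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
acc_base = cross_val_score(LogisticRegression(max_iter=1000), X_base, y, cv=cv).mean()
acc_full = cross_val_score(LogisticRegression(max_iter=1000), X_full, y, cv=cv).mean()
```

Stratification keeps the class balance equal across folds, which is what makes the with/without comparison fair when outcomes are unevenly distributed across clusters.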
RAT data could be valuable for training models that better emulate human interpretive processes; firms owning such data may gain competitive advantage.
Argument in the AI economics section; no empirical model-training experiments or market analyses provided.
RATs make readable and potentially quantifiable the preparatory interpretive work that contributes to downstream outputs, with implications for labor accounting and human capital valuation.
Theoretical economic and policy discussion in the paper; no empirical measurement or case studies provided to quantify how much preparatory work is captured or its economic value.
RATs can enable collective sensemaking via shared trails and networked associations among readers.
Conceptual argument and suggested network-analysis methods; illustrated with the speculative WikiRAT use case. No group-level empirical studies reported.
RATs can support richer reader models (personalization and modeling of interpretive behavior) through sequence analysis, embedding/clustering of trajectories, and other analytic techniques.
Proposed analytical methods (sequence analysis, embedding/clustering, network analysis) listed in the paper; no implementation results or quantitative evaluations provided.
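One of the proposed analyses (embedding and clustering of reading trajectories) could look roughly like the sketch below. The trails, their vocabulary, and the choice of TF-IDF plus k-means are all illustrative assumptions; the paper proposes these method families without an implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Invented reading trails: each string is one reader's sequence of visited pages
trails = [
    "intro methods results",          # a linear, document-order read
    "intro results discussion",
    "index topicA topicB topicA",     # associative hopping between topics
    "index topicB topicA topicB",
]
X = TfidfVectorizer().fit_transform(trails)   # embed each trail as a sparse vector
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

Even this crude bag-of-pages embedding separates linear reads from associative ones; sequence- or transition-aware embeddings would preserve order information that TF-IDF discards.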
RATs enable reflective practice by helping readers see and revise their own processes.
Proposed affordance in the paper based on the inspectable nature of RATs and the WikiRAT illustration; suggested as a potential use case rather than empirically demonstrated.
RATs treat reading as a dual kind of creation: (a) creative input work that shapes future artifacts, and (b) a form of creation whose traces are valuable artifacts themselves.
Theoretical proposal and design rationale presented in the paper; illustrated via a speculative prototype (WikiRAT). No empirical validation provided.
Reading Activity Traces (RATs) reconceptualize reading — including navigation, interpretation, and curation across interconnected sources — as creative labor.
Conceptual argument in the paper; supported by theoretical framing and literature review rather than empirical data. No sample size or deployment reported.
Empirically, RAD improves out-of-distribution (OOD) robustness (OOD harmlessness) compared to baselines.
Out-of-distribution harmlessness evaluations reported in the paper showing RAD performs better than baselines on OOD safety tests (exact experimental details not provided in the summary).
Empirically, RAD improves harmlessness relative to baseline RLHF methods.
Empirical evaluations reported in the paper comparing RAD to baseline RLHF methods on harmlessness metrics (specific datasets, sample sizes, and exact metrics not provided in the summary).
Entropic regularization plus Sinkhorn iterations yields a differentiable, computationally tractable objective suitable for end-to-end optimization with policy gradient methods.
Algorithmic design and implementation details in the paper showing use of entropic-regularized OT and Sinkhorn; claimed compatibility with policy-gradient/end-to-end training (no concrete runtime benchmarks or sample-complexity numbers in the summary).
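The Sinkhorn iterations referenced above can be sketched in a few lines. The regularization strength, iteration count, and toy cost matrix are illustrative; the paper's actual objective and its coupling to policy gradients are not reproduced here.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iter=200):
    """Entropic-regularized OT plan between histograms a and b for cost matrix C."""
    K = np.exp(-C / eps)                 # Gibbs kernel from the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # alternating diagonal scaling updates
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # transport plan P = diag(u) K diag(v)

# Toy example: uniform marginals over 3 bins, absolute-difference cost
a = np.ones(3) / 3
b = np.ones(3) / 3
C = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :]).astype(float)
P = sinkhorn(a, b, C)
```

Because every operation in the loop is smooth (exponentials, matrix-vector products, divisions), the same updates can be unrolled inside an autodiff framework, which is what makes the entropic objective usable in end-to-end policy-gradient training.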
AI-enabled forecasting can raise operational productivity by reducing forecasting error, stockouts, and excess inventory, but realized returns depend on organizational complements (processes, governance).
Authors' synthesis of case evidence where AI forecasting reduced errors and inventory problems, combined with the theoretical claim that organizational complements condition realized gains.
Critical enablers for successful ISP adoption include executive sponsorship, cross-functional processes, data quality/governance, shared KPIs, and continuous learning cycles.
Recurring themes identified across the five case studies and synthesized in the authors' cross-case analysis as necessary organizational complements.
AI-enabled forecasting combined with ERP integration leads to better synchronization across procurement, production, inventory, and distribution; improved decision visibility; and reduced forecasting errors where implemented.
Reported outcomes from cases in which firms implemented AI forecasting and ERP integration; interviewees described improved synchronization and lower forecasting errors (qualitative reports rather than quantified effect sizes).
Policy recommendations: economists and policymakers should perform cost–benefit analyses of explainability mandates, incentivize research into human-centered explanation methods, subsidize standards and certification infrastructure, and consider staged regulation balancing innovation with accountability in high-risk domains.
Prescriptive recommendations drawn by the paper's authors from the review of technical, social-science, and policy literatures; based on synthesis rather than empirical testing of policy impacts.
Clearer explanations and audit trails make it easier to assign responsibility and price risk (insurance markets, contract terms), potentially reducing uncertainty in public procurement and private contracts.
Economic and legal literature included in the review providing conceptual arguments and illustrative cases; no new empirical risk-pricing estimates provided in the paper.
Better explainability (when usable) raises willingness-to-adopt AI in regulated, risk-averse sectors by reducing information asymmetries and perceived liability—potentially expanding market size for explainable systems.
Economic and conceptual arguments synthesized from the reviewed literature; the review aggregates studies and arguments but does not present new quantitative adoption estimates.
Implementation requires organizational practices—governance, training, monitoring, and incentives—to translate explainability into safer, more legitimate AI use.
Synthesis of organizational, policy, and case-study literature in the review that identifies organizational measures correlated with effective deployment of explainable systems; descriptive evidence rather than causal experiments.
Regulatory frameworks, auditability, documentation (e.g., model cards, datasheets), and clear lines of responsibility amplify the effectiveness of explainability for accountability and compliance.
Synthesis of policy and governance literature included in the review that discusses how institutional mechanisms interact with technical explainability to produce accountability; descriptive evidence from case studies and governance proposals in the literature.
Labor demand will increasingly favor skills that support effective Human–AI teaming (interpretation, interrogation of AI, systems orchestration, shared-model building) rather than routine task execution.
Implication drawn from the framework and literature on complementarity and skill-biased technological change; presented as an expectation rather than quantified by labor market data in the paper.
Instituting continuous training, evaluation, and feedback loops is required to adapt Human–AI teams over time and maintain performance.
Prescriptive inference from organizational learning and human factors literature synthesized in the paper; suggested as best practice without empirical evaluation within the paper.
Building knowledge infrastructures that capture, curate, and make provenance accessible is necessary for team knowledge continuity, accountability, and learning.
Conceptual recommendation informed by literature on knowledge management and provenance; no empirical measures or case studies reported to quantify impact.
Partitioning roles — assigning pattern-detection tasks to AI and normative or contextual judgment to humans — improves task allocation based on comparative strengths.
Design recommendation derived from matching cognitive primitives to task types, supported conceptually by literature; not validated with empirical experiments in this paper.
Complementarity requires structuring interactions so humans and AI amplify each other's strengths rather than substitute for one another.
Conceptual argument based on theoretical review of complementarity and collective intelligence; no empirical tests included.
Aligning AI capabilities with human cognitive processes — reasoning, memory, and attention — is foundational to effective Human–AI teaming.
Theoretical grounding and literature synthesis drawing on cognitive science and human factors; proposed as a core lens for the framework rather than validated empirically in the paper.
Human–AI teams can achieve true complementarity such that joint team performance exceeds that of humans or AI alone.
Conceptual claim supported by an integrative, cross-disciplinary framework synthesizing literature from collective intelligence, cognitive science, AI, human factors, organizational behavior, and ethics. No primary empirical dataset or controlled experiments reported in the paper.
Operationalizing explainability alongside monitoring (data-drift detection, retraining schedules) and usage rules stabilizes managerial outcomes and raises adoption/trust.
Argument supported by the pilot illustration and the paper's operational design; evidence primarily from single-case pilot and conceptual reasoning rather than multi-site causal testing.
Explainability (XAI) tools were integrated with the model and, together with operational quality controls (data-drift monitoring, retraining routines, and usage regulations), increased user trust and improved reproducibility of managerial impact in the pilot.
Pilot case study reporting integration of XAI and operational controls and reporting increases in user trust and reproducibility of managerial outcomes (single SME pilot; qualitative and quantitative details referenced but not listed in the summary).
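A data-drift check of the kind these operational controls describe can be sketched with a two-sample test: compare a live feature window against the training distribution and alert when they diverge. The threshold, window sizes, and simulated shift are assumptions, not values from the pilot.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
train_feature = rng.normal(loc=0.0, scale=1.0, size=2000)   # reference (training) window
live_feature = rng.normal(loc=0.6, scale=1.0, size=500)     # shifted live data

# Kolmogorov-Smirnov two-sample test between reference and live windows
stat, p_value = ks_2samp(train_feature, live_feature)
drift_detected = p_value < 0.01       # conservative alert threshold (assumed)
```

In a deployment, a triggered alert would feed the retraining routine and usage rules mentioned above rather than silently degrading forecasts.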
A pilot implementation in an SME for inventory-demand forecasting used a gradient-boosting model which outperformed a business-as-usual baseline on forecasting accuracy metrics.
Single pilot case study reported in the paper: inventory-demand forecasting pilot comparing a gradient-boosting model to a baseline forecasting approach (sample: one SME pilot; specific implementation details and exact metrics not provided in the summary).
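The pilot's comparison, a gradient-boosting forecaster against a business-as-usual baseline, can be illustrated on synthetic data. The weekly demand pattern, lag features, and naive last-value baseline are assumptions; the SME's actual features, model configuration, and metrics are not given in the summary.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic daily demand with a weekly cycle plus noise
rng = np.random.default_rng(1)
t = np.arange(300)
demand = 50 + 10 * np.sin(2 * np.pi * t / 7) + rng.normal(scale=2, size=300)

# Lag features: the previous 7 days predict the next day
X = np.stack([demand[i - 7:i] for i in range(7, 300)])
y = demand[7:]
X_train, X_test = X[:250], X[250:]
y_train, y_test = y[:250], y[250:]

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
mae_gb = mean_absolute_error(y_test, model.predict(X_test))
mae_naive = mean_absolute_error(y_test, X_test[:, -1])  # baseline: yesterday's demand
```

The point of the comparison is the one the paper makes qualitatively: the learned model exploits structure (here, the weekly cycle) that a naive carry-forward rule cannot.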
Firms and governments should invest in continuous training, certification for AI‑augmented skills, and transition assistance to mitigate frictions.
Policy recommendation grounded in the paper's assessment of transition risks and complementarities; not based on program evaluation data.
Likely increase in the skill premium for workers who can coordinate with and supervise AI (architecture, ethics, systems thinking), creating upward pressure on wages for those skill sets.
Economic reasoning about complementarity between AI capital and high‑skill labor; no wage‑level empirical analysis presented.
Short‑ to medium‑term productivity gains in software and digital‑product development are likely, lowering per‑unit development costs and accelerating release cycles.
Scenario reasoning and task automation/complementarity arguments extrapolating from current tools; no firm‑level productivity data analyzed.
Personalized, continuous learning through AI tutors and on‑the‑job assistants will lower some training frictions but raise the returns to upskilling.
Conceptual reasoning and examples of tutoring/assistive AI; not supported by empirical evaluation of learning outcomes or labor market returns.
AI will change how teams coordinate (automated status summaries, intelligent task routing, synthesis of asynchronous work), potentially speeding product cycles.
Scenario reasoning based on possible AI features in PM and collaboration tools; no measured changes in product cycle times presented.
Demand will grow for skills complementary to AI: prompt‑engineering‑like skills, validation/verification, interpretability, governance, and stakeholder communication.
Qualitative reasoning about complementarities between human skills and AI capabilities and illustrative examples; no labor market data analyzed.
Practitioners will shift focus toward problem framing, architecture, system‑level reasoning, domain expertise, human‑centered design, and ethics as AI handles more routine tasks.
Task decomposition analysis identifying which tasks become complementary versus automatable; scenario reasoning about how remaining human tasks change; no empirical occupational data.
AI will assist with design through adaptive interfaces, automated usability testing, and rapid prototype generation.
Illustrative examples of AI in design tooling and conceptual reasoning about model capabilities; not supported by systematic user studies in the paper.