The Commonplace

Evidence (6491 claims)

Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Human-AI Collaboration
The per-task ceiling does not bind the windowed measure, though both remain bounded: L_task by per-task novelty, L_window by the stock of accumulated planning investment that pays out within the window.
Theoretical derivation/argument in the paper distinguishing bounds on per-task leverage (L_task) and windowed leverage (L_window) and identifying their respective limiting factors; no empirical evidence provided.
high positive Leverage Laws: A Per-Task Framework for Human-Agent Collabor... bounds on L_task and L_window (per-task novelty and accumulated planning investm...
We extend this per-task analysis to a windowed leverage measure that accommodates recurring tasks, spawned subtasks, and amortized system-design investment.
Conceptual/theoretical extension in the paper defining a windowed leverage metric and describing how it accounts for recurring tasks, subtasks, and amortized design investments; no empirical tests reported.
high positive Leverage Laws: A Per-Task Framework for Human-Agent Collabor... windowed leverage (aggregated leverage over a time window accounting for amortiz...
The asymptotic behavior of leverage decomposes into two scaling axes (capability and memory) with a non-zero floor on the planning term set by irreducible task novelty bounded by human throughput.
Mathematical/theoretical asymptotic analysis within the paper; conceptual derivation linking capability and memory as scaling axes and asserting a lower bound on planning cost due to task novelty and human throughput.
high positive Leverage Laws: A Per-Task Framework for Human-Agent Collabor... leverage scaling behavior and lower bound on planning term
Information density itself is directional and bounded by separate ceilings on human-to-agent and agent-to-human flow.
Theoretical argument/derivation in the paper establishing directional information-density and distinct upper bounds for each flow direction; no empirical validation reported.
high positive Leverage Laws: A Per-Task Framework for Human-Agent Collabor... directional information flow bounds between human and agent
The denominator decomposes into three channels through which a conserved per-task information requirement must flow, each with its own time-cost scalar (specify the task, resolve mid-run interrupts, and review the result).
Analytic decomposition within the paper's theoretical framework; conceptual argument rather than empirical measurement.
high positive Leverage Laws: A Per-Task Framework for Human-Agent Collabor... components of human time cost (specification, interrupt resolution, review)
We propose a per-task leverage ratio for human-agent collaboration: human work displaced by an agent, divided by the human time required to specify the task, resolve mid-run interrupts, and review the result.
Theoretical/conceptual proposal and formal definition provided in the paper; no empirical sample or experimental data reported.
high positive Leverage Laws: A Per-Task Framework for Human-Agent Collabor... human work displaced per unit human time (per-task leverage)
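The ratio defined in the claim above can be illustrated with a minimal sketch. The class name, field names, and all numbers are hypothetical and chosen only for illustration. The listing does not give the paper's notation beyond L_task and L_window, and this sketch does not model the time-cost scalars or the windowed measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskTimes:
    """Human-time accounting for one delegated task (all values in minutes)."""
    displaced_work: float  # human time the task would have taken without the agent
    specify: float         # time spent specifying the task
    interrupts: float      # time spent resolving mid-run interrupts
    review: float          # time spent reviewing the result


def per_task_leverage(t: TaskTimes) -> float:
    """Displaced human work divided by the human time still spent on the task."""
    human_time = t.specify + t.interrupts + t.review
    if human_time <= 0:
        raise ValueError("human time must be positive")
    return t.displaced_work / human_time


# Hypothetical numbers, not taken from the paper.
example = TaskTimes(displaced_work=120.0, specify=10.0, interrupts=5.0, review=15.0)
leverage = per_task_leverage(example)  # 120 / (10 + 5 + 15) = 4.0
```

With these inputs, 30 minutes of human attention displaces 120 minutes of human work. Any of the three denominator channels can dominate the total, which is why the claims above treat them separately.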
Grounding recommendations in validated research offers leaders a framework for navigating AI's labor implications responsibly.
Paper asserts that its synthesis and recommendations provide a practical framework for leaders; no empirical validation of the framework is reported in the abstract.
high positive AI Displacement Risk in the Labor Market: Evidence, Exposure... ability of leaders to navigate AI labor implications and mitigate harm
Evidence-based organizational responses (transparent workforce planning, skills investment, redesigned roles, adaptive governance, and long-term capability-building) can mitigate harm and prepare organizations for workplace transformation.
Paper proposes these organizational responses grounded in the synthesized empirical literature; this is a recommendation rather than an empirically tested intervention in the paper abstract.
high positive AI Displacement Risk in the Labor Market: Evidence, Exposure... organizational readiness and mitigation of AI-related harms
Our evolved prefetcher achieves a 1.76x geomean IPC speedup over no prefetching, 17% over its VA/AMPM Lite seed (1.59x) and 21% over SMS (1.55x).
Reported experimental result comparing geomean IPC across benchmark set; comparisons made to no prefetching and two specific prefetcher baselines. Benchmark details not included in abstract.
Our evolved branch predictor achieves a 1.100x geomean IPC speedup over Bimodal, 1.5% over its Hashed Perceptron seed (1.085x).
Reported experimental result comparing geomean IPC across benchmark set; compared to Bimodal and Hashed Perceptron seed as baselines. Benchmark details not given in abstract.
Our best evolved cache replacement design achieves a 1.062x geomean IPC speedup over LRU, 0.6% over Mockingjay (1.056x).
Reported experimental result comparing geomean IPC across benchmark set; exact benchmark count/split not provided in abstract. Comparison reported against LRU and Mockingjay baselines.
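The "X% over" figures in the three results above can be checked arithmetically. The sketch below suggests that they match the difference between two speedup factors (percentage points of the common reference design) rather than the ratio of one speedup to the other. The function names are illustrative, and this reading is an inference from the quoted numbers, not something the abstracts state.

```python
def gap_in_points(evolved: float, comparison: float) -> float:
    """Difference between two speedup factors, in percentage points of the shared reference."""
    return (evolved - comparison) * 100.0


def relative_gain(evolved: float, comparison: float) -> float:
    """Ratio-based improvement of one design over another, in percent."""
    return (evolved / comparison - 1.0) * 100.0


# (evolved speedup, comparison speedup) pairs as quoted in the claims above.
pairs = {
    "prefetcher vs VA/AMPM Lite seed": (1.76, 1.59),
    "prefetcher vs SMS": (1.76, 1.55),
    "branch predictor vs Hashed Perceptron seed": (1.100, 1.085),
    "cache replacement vs Mockingjay": (1.062, 1.056),
}

for name, (evolved, comparison) in pairs.items():
    print(f"{name}: {gap_in_points(evolved, comparison):.1f} points, "
          f"{relative_gain(evolved, comparison):.1f}% relative")
```

The point gaps come out at 17.0, 21.0, 1.5, and 0.6, which match the quoted percentages. The ratio-based gains are smaller for the prefetcher (about 10.7% and 13.5%) and nearly identical to the point gaps for the two components whose speedups are close to 1.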
Across cache replacement, data prefetching, and branch prediction, Agentic Architect matches or exceeds state-of-the-art designs.
Experimental evaluation across three microarchitectural component domains (cache replacement, prefetching, branch prediction) reported in the paper with comparative performance results versus baselines.
We introduce Agentic Architect, an agentic AI framework for computer architecture design exploration and optimization that combines LLM-driven code evolution with cycle-accurate simulation.
Authors' description of the system and methodology in the paper (introduction and methods). No numeric sample size reported in the abstract; evidence is the implemented framework and accompanying descriptions; authors state it is open-source.
Through targeted prompting inspired by these findings, we modify agents' negotiation behavior and improve win rates from 22.2% to 32.7%.
Intervention experiment reported in the paper where prompts were changed and resulting agent win rates were measured.
In clinical utility evaluation across three abstraction tasks, semantic search reduced time-to-completion by 24 to 89% compared to clinician-performed chart review.
Clinical utility assessment compared chart abstraction efficiency across three tasks and reported percentage reductions in time-to-completion ranging from 24% to 89%.
Qwen3 embeddings with 300-token chunk size achieved 94.6% accuracy on a clinical question-answering benchmark.
Optimization experiment on a physician-authored clinical question-answering benchmark; best-performing configuration reported as qwen3 embeddings with 300-token chunks and 94.6% accuracy.
high positive Health System Scale Semantic Search Across Unstructured Clin... accuracy_on_clinical_question_answering_benchmark
The system delivers sub-second query latency: median 237 ms single-user, 451 ms at 20-user concurrency.
Full-scale performance characterization reported exact median latencies for single-user and 20-user concurrency.
Technological advancement alone is insufficient—maximizing AI's economic potential requires strategic investments in workforce capability development (e.g., widespread AI fluency programs and targeted cultivation of higher-order judgment skills).
Policy recommendation based on the article's synthesis of task-based models and empirical literature; the excerpt does not report specific interventions, trials, or sample sizes.
high positive AI as Augmentation: How Human Capital Shapes Technology's Im... effectiveness of workforce capability investments for realizing AI-driven produc...
The supply of AI-literate workers amplifies productivity gains.
Stated as a mechanism in the task-based model synthesis; described qualitatively in the article without specific empirical method or sample sizes in the excerpt.
high positive AI as Augmentation: How Human Capital Shapes Technology's Im... productivity gains from AI adoption
Aggregate productivity improvements from AI advancement depend critically on two forms of human capital: specialized AI expertise and complementary non-AI skills.
Claim is presented as a theoretical result drawn from 'task-based economic models' in the article; empirical corroboration is referenced generally but no specific datasets or sample sizes are reported in the excerpt.
high positive AI as Augmentation: How Human Capital Shapes Technology's Im... aggregate productivity improvements
Mounting empirical evidence indicates AI primarily functions as augmentation technology—amplifying human capabilities rather than replacing workers.
Article states it draws on 'mounting empirical evidence' and synthesizes recent theoretical and empirical findings; no specific studies, methods, or sample sizes are cited in the excerpt.
high positive AI as Augmentation: How Human Capital Shapes Technology's Im... degree of workforce displacement versus augmentation (replacement vs. amplificat...
The proposed approach reframes AI control from optimizing decisions to governing their admissibility, introducing a protocol-level abstraction that operates independently of model architecture or training methodology.
Conceptual argument and proposal in the paper asserting architecture-agnostic protocol abstraction. No empirical tests across architectures or training methods reported.
high positive Right-to-Act: A Pre-Execution Non-Compensatory Decision Prot... shift in control paradigm (from decision optimization to admissibility governanc...
Through a scenario-based case study, we demonstrate how identical AI outputs can lead to divergent outcomes when evaluated under a Right-to-Act protocol, preserving reversibility and preventing premature or irreversible actions.
Scenario-based case study (illustrative demonstration). The paper reports example scenarios rather than empirical experiments; no sample size or quantitative evaluation reported.
high positive Right-to-Act: A Pre-Execution Non-Compensatory Decision Prot... divergent outcomes from identical AI outputs under the protocol; preservation of...
Unlike compensatory systems, where high-confidence signals can override failed conditions, the proposed framework enforces strict structural constraints: if any required condition is unmet, execution is halted or deferred.
Conceptual distinction and protocol rule specification in the paper (formal description of non-compensatory enforcement). No empirical testing reported.
high positive Right-to-Act: A Pre-Execution Non-Compensatory Decision Prot... whether execution proceeds when required conditions are unmet (halt/defer behavi...
We introduce the Right-to-Act protocol, a deterministic, non-compensatory pre-execution decision layer that evaluates whether an AI-generated decision is permitted to be realized at all.
Proposed method / conceptual contribution and formal definition provided in the paper (formalization and protocol specification). No empirical validation or sample size reported.
high positive Right-to-Act: A Pre-Execution Non-Compensatory Decision Prot... eligibility/admissibility of AI-generated decisions prior to execution
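The non-compensatory rule described in the claims above (any unmet required condition halts or defers execution, regardless of confidence) can be sketched as follows. This is a minimal illustration, not the paper's specification: the condition names, the context fields, and the 0.9 threshold in the contrast case are all hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, List, Mapping, Tuple


class Verdict(Enum):
    EXECUTE = "execute"
    HALT = "halt"  # stands in for "halted or deferred"


Context = Mapping[str, object]
Condition = Callable[[Context], bool]


def non_compensatory_gate(
    context: Context, conditions: Dict[str, Condition]
) -> Tuple[Verdict, List[str]]:
    """Permit execution only if every required condition holds."""
    failed = [name for name, check in conditions.items() if not check(context)]
    return (Verdict.HALT if failed else Verdict.EXECUTE), failed


def compensatory_gate(
    context: Context, conditions: Dict[str, Condition], threshold: float = 0.9
) -> Verdict:
    """Contrast case: high model confidence can override a failed condition."""
    all_met = all(check(context) for check in conditions.values())
    confident = float(context["confidence"]) >= threshold
    return Verdict.EXECUTE if (all_met or confident) else Verdict.HALT


# Hypothetical required conditions for illustration.
REQUIRED: Dict[str, Condition] = {
    "operator_authorized": lambda ctx: ctx["operator_authorized"] is True,
    "action_reversible": lambda ctx: ctx["action_reversible"] is True,
}

# The same AI output and confidence, evaluated in two different contexts.
ctx_ok = {"confidence": 0.97, "operator_authorized": True, "action_reversible": True}
ctx_blocked = {"confidence": 0.97, "operator_authorized": True, "action_reversible": False}

print(non_compensatory_gate(ctx_ok, REQUIRED))       # executes: no failed conditions
print(non_compensatory_gate(ctx_blocked, REQUIRED))  # halts: action_reversible failed
print(compensatory_gate(ctx_blocked, REQUIRED))      # executes: confidence overrides
```

The gate is deterministic and never reads the confidence score. That is how identical outputs lead to different outcomes under the two schemes.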
Taken together, these insights provide theoretical clarity and practical guidance for responsible GenAI integration into creative work.
Authors' stated contribution and practical recommendations derived from the conceptual framework; no empirical evaluation of guidance effectiveness provided.
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... theoretical clarity and practical guidance for responsible GenAI integration
The study reinterprets process-oriented creativity theories through structural parallels with GenAI.
Conceptual reanalysis and theoretical reinterpretation based on literature synthesis (paper's theoretical contribution).
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... process-oriented creativity theory reinterpretation
The authors propose a role-based integration model that aligns GenAI capabilities with key creative functions: idea generation, synthesis, strategic framing, and facilitation.
Presentation of a novel conceptual model / framework in the paper (theoretical design); no empirical validation or measured outcomes reported.
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... alignment of GenAI capabilities with creative functions (idea generation, synthe...
The paper repositions GenAI as a cognitive collaborator rather than merely a productivity tool.
Argumentative / conceptual claim supported by the proposed theoretical reframing and role-based model in the paper; no empirical testing reported.
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... role of GenAI in organizational workflows (cognitive collaborator vs productivit...
There are structural parallels between GenAI architectures and human cognition—such as heuristic search, divergent thinking, and iterative refinement.
Conceptual mapping and theoretical comparison between GenAI architecture characteristics and cognitive/creativity constructs presented in the paper (literature synthesis / theoretical argument).
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... structural parallels between GenAI architectures and human cognition (heuristic ...
The study revisits foundational creativity theories to develop a framework for integrating GenAI into creative workflows.
Paper describes a conceptual review and theoretical synthesis of foundational creativity theories leading to a proposed integration framework; methodological (theoretical / conceptual) contribution rather than empirical validation.
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... framework for integrating GenAI into creative workflows
Generative Artificial Intelligence (GenAI) is reshaping organisational creativity by emulating cognitive processes traditionally associated with human innovation.
Paper's theoretical argument and literature-grounded conceptual claims (conceptual analysis / literature review); no empirical sample or quantitative data reported.
high positive Beyond the Creativity Paradox: A Theory-informed Framework f... organisational creativity
That compliance layer can improve oversight by making departures from law easier to detect.
Claim supported by the paper's analytical argumentation (no empirical evidence reported).
high positive AI Governance under Political Turnover: The Alignment Surfac... detectability of departures from law (oversight effectiveness)
For probabilistic AI to be incorporated into public administration it must be embedded in a compliance layer that makes decisions reviewable, repeatable, and legally defensible.
Stated as a normative/architectural claim in the paper; supported by conceptual argument rather than empirical testing.
high positive AI Governance under Political Turnover: The Alignment Surfac... requirements for legal/administrative incorporation of probabilistic AI
Governments are increasingly interested in using AI to make administrative decisions cheaper, more scalable, and more consistent.
Stated as background motivation in the paper (no empirical data or sample size reported).
high positive AI Governance under Political Turnover: The Alignment Surfac... government interest in AI adoption for administrative decisions (cost, scale, co...
There is an open opportunity to support collaborative construction where users and AI jointly develop an evolving knowledge representation.
Paper's stated research opportunity and motivation based on gaps identified in prior tools and systems (conceptual argument).
high positive MindTrellis: Co-Creating Knowledge Structures with AI throug... potential benefits of joint user-AI collaborative knowledge representation (prop...
In a user study where 12 participants created slide decks, MindTrellis outperformed retrieval-only baselines in knowledge organization and cognitive load, as measured by expert ratings of content coverage and structural quality.
Controlled user study reported in the paper: N = 12 participants performing slide-deck creation tasks; outcomes assessed via expert ratings of content coverage and structural quality (comparison to retrieval-only baseline).
high positive MindTrellis: Co-Creating Knowledge Structures with AI throug... knowledge organization and cognitive load (operationalized via expert ratings of...
MindTrellis is an interactive visual system where users and AI collaboratively build a dynamic knowledge graph; users can query the graph for document-grounded information and contribute by introducing new concepts, modifying relationships, and reorganizing the hierarchy.
System design and implementation described in the paper (feature description and demonstration).
high positive MindTrellis: Co-Creating Knowledge Structures with AI throug... system capability to support collaborative construction and manipulation of a dy...
Generative artificial intelligence (genAI) is rapidly reshaping how knowledge and culture are produced and consumed.
Author's descriptive statement based on observed changes in production/consumption patterns (no empirical sample reported in paper abstract).
high positive Generative artificial intelligence reduces social welfare th... production and consumption of knowledge and culture
The system reduced variability in solder-joint quality and cycle time.
Abstract statement that variability in solder-joint quality and cycle time was reduced during the deployment (no quantitative variability metrics provided in the abstract).
high positive Learning-augmented robotic automation for real-world manufac... variability of solder-joint quality; variability of cycle time
The system maintained near-human takt time.
Abstract claim comparing the system's cycle/takt time to human performance during the deployment (no numeric takt-time comparison provided in the abstract).
high positive Learning-augmented robotic automation for real-world manufac... takt time (cycle time) relative to human workers
The system achieved a 99.4% pass rate on product-level quality-control tests.
Reported QC pass rate from the production run in the abstract (presumably based on the produced motors).
high positive Learning-augmented robotic automation for real-world manufac... product-level quality-control pass rate
The system operated without physical fencing.
Abstract statement that the run occurred "without physical fencing" (implying operation around people without traditional fences).
high positive Learning-augmented robotic automation for real-world manufac... use of physical fences for safety (absent)
The system produced 108 motors during the continuous run.
Count of products produced during the continuous run reported in the abstract.
high positive Learning-augmented robotic automation for real-world manufac... number of motors produced during the run
The system operated continuously for 5 h 10 min.
Reported continuous operation duration from the production run described in the abstract.
high positive Learning-augmented robotic automation for real-world manufac... continuous operational time without interruption
The system required less than 20 min of real-world data per task.
Reported training data requirement for the deployed tasks in the authors' field experiment (abstract statement).
high positive Learning-augmented robotic automation for real-world manufac... amount of real-world training data per task
With less than 20 min of real-world data per task, the system operated continuously for 5 h 10 min, producing 108 motors without physical fencing and achieving a 99.4% pass rate on product-level quality-control tests.
Single field deployment / production run reported in the paper; numbers reported in the abstract (training data time, continuous operation duration, number of motors produced, fencing status, QC pass rate).
high positive Learning-augmented robotic automation for real-world manufac... training data required; continuous operational duration; production quantity; pr...
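The run duration and output count quoted above imply an average production rate, worked out in the short sketch below. This is only an average over the whole run and assumes nothing about how evenly the motors were produced. The abstract itself gives no numeric takt time, so the result should not be read as one.

```python
# Figures quoted in the claim above.
run_minutes = 5 * 60 + 10  # 5 h 10 min of continuous operation
motors_produced = 108

# Average rates over the whole run.
minutes_per_motor = run_minutes / motors_produced
seconds_per_motor = minutes_per_motor * 60
motors_per_hour = motors_produced / (run_minutes / 60)

print(f"{minutes_per_motor:.2f} min per motor")
print(f"{seconds_per_motor:.0f} s per motor")
print(f"{motors_per_hour:.1f} motors per hour")
```

On these figures the run averaged just under three minutes per motor, or roughly 21 motors per hour.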
We deployed the system on an electric-motor production line to automate deformable cable insertion and soldering under real manufacturing constraints, a step previously performed manually by human workers.
Field deployment on an actual electric-motor production line described by the authors (deployment + task specification).
high positive Learning-augmented robotic automation for real-world manufac... automation of previously manual deformable cable insertion and soldering tasks
We present Learning-Augmented Robotic Automation, a hybrid system that integrates learned task controllers and a neural 3D safety monitor into conventional industrial workflows.
Description of the system developed by the authors (system design/development reported in the paper).
high positive Learning-augmented robotic automation for real-world manufac... integration of learned controllers and 3D safety monitoring
Self-correction should be treated not as a default behavior, but as a control decision governed by measurable error dynamics.
Synthesis of theoretical framing (Markov model and diagnostic inequality) and empirical results across multiple models/datasets showing thresholds and promptability of EIR.
high positive When Does LLM Self-Correction Help? A Control-Theoretic Mark... policy/recommendation about when to enable iterative self-correction to improve ...
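A generic two-state Markov sketch shows what it means to treat self-correction as a decision governed by error dynamics. It is not the paper's model: the listing does not reproduce the paper's diagnostic inequality or define EIR. The two rates below (`fix_rate`, the chance a wrong answer is corrected in a round, and `break_rate`, the chance a correct answer is made wrong) are assumptions made for illustration.

```python
def next_accuracy(acc: float, fix_rate: float, break_rate: float) -> float:
    """Expected accuracy after one more self-correction round."""
    return acc * (1.0 - break_rate) + (1.0 - acc) * fix_rate


def correction_helps(acc: float, fix_rate: float, break_rate: float) -> bool:
    """A round helps only if expected fixes outnumber expected newly introduced errors."""
    return (1.0 - acc) * fix_rate > acc * break_rate


def steady_state_accuracy(fix_rate: float, break_rate: float) -> float:
    """Accuracy the chain converges to under repeated rounds."""
    return fix_rate / (fix_rate + break_rate)


# Hypothetical rates, not taken from the paper.
fix_rate, break_rate = 0.30, 0.05

for acc in (0.80, 0.90):
    print(acc,
          correction_helps(acc, fix_rate, break_rate),
          round(next_accuracy(acc, fix_rate, break_rate), 3))
```

Under these assumed rates the chain converges to about 0.857 accuracy. A model starting at 0.80 gains from another round (0.82), while one starting at 0.90 loses (0.885). In this toy setting the decision to self-correct depends on where current accuracy sits relative to that fixed point.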