The Commonplace

Evidence (6574 claims)

Adoption
8625 claims
Productivity
7686 claims
Governance
6917 claims
Human-AI Collaboration
6574 claims
Org Design
4189 claims
Innovation
4131 claims
Labor Markets
3588 claims
Skills & Training
2985 claims
Inequality
2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 761 200 101 904 2020
Governance & Regulation 829 400 191 122 1566
Organizational Efficiency 784 193 125 84 1197
Technology Adoption Rate 637 236 124 97 1103
Research Productivity 431 131 58 340 972
Output Quality 481 183 59 47 770
Decision Quality 332 177 82 49 647
Firm Productivity 439 57 88 20 610
AI Safety & Ethics 218 279 66 33 602
Market Structure 181 170 123 24 503
Task Allocation 214 64 72 33 388
Skill Acquisition 174 62 62 17 315
Innovation Output 204 27 45 18 295
Employment Level 105 54 108 13 282
Fiscal & Macroeconomic 132 69 43 26 277
Consumer Welfare 117 63 42 11 233
Firm Revenue 154 48 26 3 231
Task Completion Time 173 31 8 12 225
Inequality Measures 44 123 50 6 223
Worker Satisfaction 89 65 22 12 188
Error Rate 71 92 10 2 175
Regulatory Compliance 77 69 14 5 165
Automation Exposure 58 56 26 13 156
Training Effectiveness 96 21 14 19 152
Wages & Compensation 77 37 25 6 145
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 81 21 1 115
Hiring & Recruitment 52 7 8 3 70
Creative Output 32 20 8 3 64
Skill Obsolescence 5 47 6 1 59
Social Protection 28 16 8 2 54
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Human-AI Collaboration
A modular four-script Python pipeline processes synthetic FHIR-based claims data and real claims documents, extracting 36 actuarial variables across reserving, ratemaking, and claims management categories.
Authors report implementation of a four-script Python pipeline applied to synthetic FHIR-based claims and real documents, with 36 target variables defined.
high positive Leveraging LLMs for Unstructured Claims Data Analysis number of actuarial variables extractable by the pipeline
We present a proof-of-concept framework using large language models (LLMs) to extract structured actuarial variables from unstructured claims data.
Authors implemented a prototype framework described in the paper (implementation details and pipeline described).
high positive Leveraging LLMs for Unstructured Claims Data Analysis ability to extract structured actuarial variables from unstructured text
I have developed LLMbench, a research instrument for the comparative close reading of LLM outputs that visualises token probability distributions, entropy curves, and cross-model divergence.
Description of a tool/method developed by the author (LLMbench); claim about the tool's features as stated in the abstract; no implementation details or evaluation sample sizes provided in the abstract.
high positive Prompt anxiety and the algorithmic politics of uncertainty tool capabilities (visualisation of token probabilities, entropy, cross-model di...
Journalists and editors exercise bounded and situational agency through local adaptation, self-training, and development of ethical guardrails that institutionalise responsible AI use.
Based on in-depth interviews with newsroom staff (journalists, editors, technical personnel) at Al-Masry Al-Youm; qualitative accounts of local practices such as self-training and the creation of internal ethical rules. Sample size not reported in the excerpt.
high positive Platformisation, Power, and AI Governance in the Newsroom: I... local adaptation, skill development, and internal governance practices
By bridging established knowledge with emerging governance challenges, this study advances a more comprehensive understanding of platform governance and outlines future research avenues related to technological change, dynamic capabilities, and ecosystem perception.
Authors' stated contribution based on their integrative framework and literature synthesis of 644 publications.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... advancement of understanding of platform governance and identification of future...
The paper proposes a research agenda that examines how emerging technologies, including algorithmic governance, generative AI, and agentic systems, are reshaping governance practices.
Paper's concluding/prospective section proposing future research directions; conceptual proposal rather than empirical test.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... proposed future research topics concerning the impact of emerging technologies o...
The identified governance mechanisms foster innovation in platform ecosystems.
Claim based on the paper's integrative synthesis of 644 publications indicating governance's role in fostering innovation.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... innovation outcomes in platform ecosystems associated with governance mechanisms
The identified governance mechanisms ensure quality in platform ecosystems.
Argument and synthesis from the systematic literature review of 644 publications as presented in the paper's framework.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... quality assurance in platform ecosystems via governance mechanisms
The identified governance mechanisms (incentives, control, boundary resources) enable platform owners to coordinate value creation.
Argument based on the integrative framework derived from the systematic literature review (644 publications).
high positive Mission: Orchestration – Governance Mechanisms And Future Re... coordination of value creation by platform owners using governance mechanisms
There are three core types of governance mechanisms that enable platform owners to coordinate value creation, ensure quality, and foster innovation: incentives, control, and boundary resources.
Synthesis and classification resulting from the systematic literature review of 644 publications, producing an integrative framework that identifies the three mechanism types.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... identification of three governance mechanism types (incentives, control, boundar...
This study conducts a systematic literature review of 644 publications to synthesize the governance landscape and develop an integrative framework.
Methodological statement from the paper reporting the authors performed a systematic literature review analyzing 644 publications.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... number of publications reviewed and use of SLR to develop framework
Platform owners orchestrate complementor participation through governance mechanisms.
Synthesis and conceptual argument based on the systematic literature review of 644 publications.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... use of governance mechanisms by platform owners to orchestrate participation
Digital platform ecosystems rely on loosely coupled complementors to jointly create value with platform owners.
Synthesis of prior literature via the paper's systematic literature review (644 publications); conceptual framing in the literature on platform ecosystems.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... reliance of platform ecosystems on loosely coupled complementors for joint value...
Work Flexibility is the strongest predictor of Employee Productivity (β = 0.562, p < 0.001), indicating flexible working conditions play an important role in improving employee performance and work efficiency.
Reported quantitative result from the study using PLS-SEM; β and p-value provided in the paper indicating the largest standardized effect among predictors. Sample size not reported in the excerpt.
Human-Centric AI Adoption has a positive and statistically significant effect on Employee Productivity (β = 0.263, p = 0.028).
Reported quantitative result from the study using Partial Least Squares Structural Equation Modeling (PLS-SEM); β and p-value provided in the paper. Sample size not reported in the excerpt.
Artificial intelligence (AI) increasingly participates in strategic decision-making, challenging leadership theories that assume human agency at the top of organizations.
Concept-centric literature review integrating management and information systems (IS) research; theoretical synthesis of prior empirical and conceptual studies (no primary empirical sample reported).
high positive Hybrid Upper Echelons: A Theorizing Review On AI In Executiv... participation of AI in strategic decision-making
The findings suggest that twin-based market research is no longer gated by data design, but by item volume, model selection, and a small set of construction-level decisions.
Interpretive conclusion based on empirical results across the construction-method grid and performance patterns (discussion/implication in paper).
high positive Synthetic Personalities: How Well Can LLMs Mimic Individual ... primary constraints on successful twin-based market research (data design vs. it...
Best-cell Fisher-z rank-order correlation reaches r = 0.590 on the SOEP held-out evaluation set.
Fisher-z-transformed rank-order correlation of the best-performing cell, reported from the held-out evaluation on SOEP.
high positive Synthetic Personalities: How Well Can LLMs Mimic Individual ... rank-order correlation (Fisher-z) of twin responses vs. held-out answers
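The excerpt does not say how the paper aggregates correlations into a single "Fisher-z" figure. One common convention, assumed here, is to transform each per-unit correlation with z = atanh(r), average on the z scale, and transform back with tanh. The function name and the sample values below are hypothetical and serve only to illustrate that convention.

```python
import math


def fisher_z_mean(correlations):
    """Average correlations on the Fisher z scale, then back-transform to r.

    Assumption: this mirrors a common reporting convention, not necessarily
    the exact procedure used in the paper.
    """
    if not correlations:
        raise ValueError("need at least one correlation")
    # atanh diverges at |r| = 1, so perfect correlations are rejected.
    if any(abs(r) >= 1.0 for r in correlations):
        raise ValueError("correlations must lie strictly between -1 and 1")
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


# Hypothetical per-respondent rank-order correlations (illustrative only).
example_correlations = [0.40, 0.60, 0.75]
pooled_r = fisher_z_mean(example_correlations)
```

For positive correlations the z-scale average is never smaller than the plain arithmetic mean, which is one reason the two conventions should not be compared directly.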
Best-cell accuracy reaches 78.8% on the SOEP held-out evaluation set.
Reported best-performing cell accuracy from held-out evaluation on SOEP.
high positive Synthetic Personalities: How Well Can LLMs Mimic Individual ... hold-out accuracy (best-performing cell)
Switching the embedding from a narrative persona summary to a raw dialog history of past responses raises hold-out accuracy in every model-by-reasoning cell at the 100 percent depth.
Empirical comparison between two embedding methods at 100% information depth across all model-by-reasoning cells (reported in results).
high positive Synthetic Personalities: How Well Can LLMs Mimic Individual ... hold-out accuracy as a function of embedding method (narrative persona summary v...
Twin quality rises with information depth but with diminishing returns past the 75 percent entropy quartile, which acts as a cost-efficient Pareto point relative to the best-performing 100 percent cells.
Empirical evaluation across information-depth conditions, comparing hold-out performance by normalized Shannon entropy quartiles (reported in results).
high positive Synthetic Personalities: How Well Can LLMs Mimic Individual ... twin quality (hold-out performance / accuracy) as a function of information dept...
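The evidence note refers to normalized Shannon entropy without defining it. As a minimal sketch, assuming the standard definition (entropy of an answer distribution divided by the log of the number of possible answer categories), it can be computed as follows. How the paper turns these values into quartiles of information depth is not stated in the excerpt, and the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(responses, n_categories):
    """Shannon entropy of observed responses, scaled to the range [0, 1].

    Assumption: normalization divides by log(n_categories), the entropy of
    a uniform distribution over all possible answer options.
    """
    if not responses:
        raise ValueError("need at least one response")
    if n_categories < 2:
        raise ValueError("need at least two possible categories")
    total = len(responses)
    counts = Counter(responses)
    entropy = -sum(
        (count / total) * math.log(count / total) for count in counts.values()
    )
    return entropy / math.log(n_categories)


# Hypothetical answers to a single two-option survey item.
item_answers = ["yes", "yes", "yes", "no"]
item_entropy = normalized_entropy(item_answers, n_categories=2)
```

A value near 1 means answers are spread evenly across the options, and a value of 0 means every respondent gave the same answer.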
The field can be organized around an integrated decision-system framework consisting of five connected constructs—delegation frontier, reliance wedge, decision-useful XAI, meaningful oversight, and reflexive AI loop—to support cumulative research on investment, trading, credit, asset management, risk, compliance, and financial regulation.
Proposal of a conceptual framework grounded in the paper’s integrative literature review (no empirical validation or sample size reported in the abstract).
high positive Human–AI hybrid finance: from AI tools to decision systems utility of the proposed decision-system framework for structuring future researc...
The review integrates evidence on methods, data, scenarios, explainability, trust, governance, financial large language models (FinLLMs), and agentic finance.
Descriptive claim about the scope of this paper’s literature synthesis (the review itself; content-based rather than empirical).
high positive Human–AI hybrid finance: from AI tools to decision systems breadth of topics integrated in the review
The central question is moving from model performance to decision architecture: how authority, oversight, and accountability should be allocated across financial workflows.
Argument based on synthesis of prior literature across relevant fields (conceptual review; no single empirical study or sample size reported).
high positive Human–AI hybrid finance: from AI tools to decision systems allocation of authority, oversight, and accountability in financial decision wor...
AI is moving from a predictive tool to a component of human–AI hybrid financial decision systems.
Integrative conceptual literature review synthesizing work across finance, management, human–computer interaction (HCI), and AI (no primary empirical sample reported).
high positive Human–AI hybrid finance: from AI tools to decision systems role of AI within financial decision workflows (predictive tool vs. integrated d...
Under linear local composition, every protocol tree defines a barycentric coordinate chart on the simplex of leaf weights; Tamari-cover reparameterizations of protocol trees preserve complementarity, and for N = 4 these reparameterizations satisfy the pentagon identity.
Mathematical construction and proofs in the paper linking protocol trees, barycentric coordinates, Tamari lattice reparameterizations, and the pentagon identity (theoretical work; no empirical sample).
high positive Tree-Based Formalization of Multi-Agent Complementarity in H... structural invariances of aggregation protocols (preservation of complementarity...
For N = 2 in regression under squared loss, the optimal linear-pooling weight has a closed form and admits a residual-correction interpretation.
Closed-form derivation and interpretation provided in the paper (mathematical derivation; no empirical sample).
high positive Tree-Based Formalization of Multi-Agent Complementarity in H... optimal pooled prediction weight (performance of 2-agent aggregation)
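The closed form itself is not reproduced in this listing. The following is the textbook derivation for two predictors under squared loss, with the weight left unconstrained. The symbols (f_1, f_2, w, e_i) are assumptions chosen for illustration, and the paper's notation and constraints may differ.

```latex
% Pooled prediction of target Y from agents f_1 and f_2 with weight w:
\hat{f}_w = w f_1 + (1 - w) f_2 = f_2 + w\,(f_1 - f_2)

% Squared-loss risk as a function of w:
R(w) = \mathbb{E}\!\left[\bigl(Y - f_2 - w\,(f_1 - f_2)\bigr)^2\right]

% First-order condition R'(w) = 0:
-2\,\mathbb{E}\!\left[(f_1 - f_2)\bigl(Y - f_2 - w\,(f_1 - f_2)\bigr)\right] = 0

% Optimal weight, assuming \mathbb{E}[(f_1 - f_2)^2] > 0:
w^{*} = \frac{\mathbb{E}\!\left[(Y - f_2)(f_1 - f_2)\right]}
             {\mathbb{E}\!\left[(f_1 - f_2)^2\right]}

% Equivalent form in terms of the errors e_i = Y - f_i:
w^{*} = \frac{\mathbb{E}[e_2^2] - \mathbb{E}[e_1 e_2]}
             {\mathbb{E}[e_1^2] + \mathbb{E}[e_2^2] - 2\,\mathbb{E}[e_1 e_2]}
```

Read this way, w* is the no-intercept least-squares slope from regressing agent 2's residual Y - f_2 on the disagreement f_1 - f_2. Pooling therefore corrects agent 2's residual using whatever part of it agent 1's disagreement can explain.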
Across our large-scale empirical analysis, Parthenon substantially improves the performance of state-of-the-art models and harnesses on legal-matter tasks.
Reported evaluations in the paper comparing baseline state-of-the-art models/harnesses to the Parthenon framework across their empirical dataset (Harvey LAB), claiming substantial performance gains.
high positive Parthenon Law: A Self-Evolving Legal-Agent Framework performance on legal-matter tasks (aggregate metric unspecified in abstract)
An anti-leakage learning loop converts scored failures into task-agnostic edits to skills, tools, and knowledge, letting the system improve with experience without touching model weights.
Paper describes a proposed/implemented learning loop (anti-leakage) that translates scored agent failures into edits to non-weight system components (skills, tools, knowledge) and claims this enables improvement without model weight updates.
high positive Parthenon Law: A Self-Evolving Legal-Agent Framework system improvement via edits to skills/tools/knowledge (no model weight changes)
We introduce Parthenon, a self-evolving legal-agent framework that factors Model, Harness, Agent roles, legal Knowledge, deterministic Tools, and procedural Skills into auditable surfaces for source traceability, date and number grounding, deliverable compliance, and issue closure.
Paper describes the design and implementation of the Parthenon framework and its modular decomposition into Model, Harness, Agent roles, Knowledge, Tools, and Skills, claiming these enable auditable traces and grounding.
high positive Parthenon Law: A Self-Evolving Legal-Agent Framework auditability (source traceability), date/number grounding, deliverable complianc...
Per-criterion accuracy climbs with stronger models.
Empirical comparison across model strengths reported in the Harvey LAB study (12,510 trajectories) showing per-criterion accuracy trends correlated with model strength.
high positive Parthenon Law: A Self-Evolving Legal-Agent Framework per-criterion accuracy
TAs remained fully in control and could use, edit, or ignore AI-generated drafts at their discretion.
Study design statement from the randomized field experiment: the intervention provided AI-assisted feedback drafts to TAs after grading, and TAs remained free to accept, edit, or ignore them. The course had 11 TAs.
high positive AI Assistance for Discretionary Work: Increasing Feedback Pr... degree of human control over AI-generated artifacts (procedural/design feature)
Qualitative findings indicate AI-assisted drafts function as editable scaffolds that lower barriers to initiating feedback rather than reducing overall effort.
Qualitative interviews conducted as part of the mixed-methods study (course included 11 TAs and 88 students); thematic/qualitative analysis reported that TAs described drafts as scaffolds that made starting feedback easier and did not simply replace TA effort.
high positive AI Assistance for Discretionary Work: Increasing Feedback Pr... perceived barriers to initiating feedback / perceived TA effort
AI-assisted feedback increases feedback length by 39.8 characters.
Randomized field experiment in a 300-level machine learning course; comparison of feedback length between treatment and control. Reported estimate: +39.8 characters, SE=3.45, p<0.001. Student-level random assignment (n=88); 11 TAs.
high positive AI Assistance for Discretionary Work: Increasing Feedback Pr... feedback length (number of characters)
AI-assisted feedback significantly increases feedback provision by 10.8 percentage points.
Randomized field experiment in a 300-level machine learning course. Student submissions (n=88) were randomly assigned to treatment (TAs received AI-assisted feedback drafts) or control. Reported estimate: +10.8 percentage points, SE=1.1, p<0.001. 11 TAs participated and could use, edit, or ignore drafts.
high positive AI Assistance for Discretionary Work: Increasing Feedback Pr... feedback provision (whether feedback was provided)
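The two reported estimates and standard errors can be sanity-checked against the stated p < 0.001. The sketch below assumes a two-sided test with a normal reference distribution. The paper may use a t distribution or clustered inference given only 11 TAs, so this is a rough consistency check rather than a reproduction of the authors' test. The function name is hypothetical.

```python
import math


def z_and_two_sided_p(estimate, standard_error):
    """Return the z statistic and two-sided p-value under a normal approximation."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    z = estimate / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Figures as reported in the evidence notes above.
z_length, p_length = z_and_two_sided_p(39.8, 3.45)      # feedback length, characters
z_provision, p_provision = z_and_two_sided_p(10.8, 1.1)  # feedback provision, pct. points
```

Both ratios come out well above the critical value of roughly 3.29 for p = 0.001, so the reported significance level is consistent with the reported estimates and standard errors under this approximation.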
A tool-augmented agentic AI method (equipped with analytical tools, structured DIKW reasoning agents, and transparent evidence chains) can automatically learn from experimental data to generate new interventions and produce superior interventions compared to Human + Chatbot co-design.
Two-stage field experiments in healthcare prescription messaging comparing Stage 1 (Human + Chatbot: 13 message variants, 444,691 patient visits) to Stage 2 (Tool-Augmented Agentic AI: 17 AI-generated variants, 248,448 patient visits).
high positive Beyond One-shot: AI Agents for Learning in Field Experiments performance of message interventions (measured by CTR and comparative success of...
The best AI-generated message achieved a 69.8% CTR (+6.5 percentage points over baseline).
Stage 2 field experiment in healthcare prescription messaging where AI-generated message variants were tested; result reported directly in paper.
high positive Beyond One-shot: AI Agents for Learning in Field Experiments click-through rate (CTR)
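The claim states the best CTR and its gain in percentage points, which leaves the baseline CTR and the relative lift implicit. A small sketch of that arithmetic follows, assuming "+6.5 percentage points over baseline" is the absolute difference between the best message's CTR and the baseline CTR. The function name is hypothetical.

```python
def baseline_and_relative_lift(ctr_pct, lift_pp):
    """Recover the baseline CTR and the relative lift from an absolute lift.

    ctr_pct: CTR of the treated message, in percent.
    lift_pp: absolute gain over baseline, in percentage points.
    """
    baseline_pct = ctr_pct - lift_pp
    if baseline_pct <= 0:
        raise ValueError("implied baseline CTR must be positive")
    relative_lift_pct = 100.0 * lift_pp / baseline_pct
    return baseline_pct, relative_lift_pct


# Figures as reported in the claim above.
baseline_ctr, relative_lift = baseline_and_relative_lift(69.8, 6.5)
```

Under that reading the implied baseline is about 63.3% and the relative improvement is a little over 10%. Keeping both figures in view helps when comparing this result with studies that report relative rather than absolute lifts.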
We will open-source all evaluation codes, tasks, and data at https://github.com/mrwwk/DeskCraft.
Author statement promising release of code, tasks, and data (stated in abstract).
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... availability of benchmark artifacts (code, tasks, data) as open source
GPT-5.4 reaches 27.6% on interactive tasks.
Author-reported benchmark result for GPT-5.4 on interactive tasks from the evaluation (reported in abstract); presumably measured across the evaluation tasks.
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... task success rate / benchmark score on interactive tasks
GPT-5.4 reaches 31.6% on standard tasks.
Author-reported benchmark result for GPT-5.4 on standard tasks from the evaluation (reported in abstract); presumably measured across the evaluation tasks.
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... task success rate / benchmark score on standard (non-interactive) tasks
We evaluate 18 proprietary and open source agents on 538 tasks.
Author-reported evaluation methodology and scale (number of agents and tasks) as stated in abstract.
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... evaluation sample size (agents and tasks)
Mid-turn interaction captures both agent-initiated clarification under uncertainty and user-initiated interruption during execution, while post-turn interaction accommodates user-driven feedback after the agent signals completion.
Author description of interaction protocol structure (design specification in paper abstract).
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... coverage of interaction types (mid-turn and post-turn) in the protocol
DeskCraft formalizes human-agent collaboration into an interaction protocol covering mid-turn and post-turn exchanges.
Author statement in abstract describing the protocol (design/method contribution).
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... formalization of human-agent interaction protocol
DeskCraft covers professional creative software across design, video, audio, and 3D creation.
Author statement in abstract listing covered software domains.
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... breadth of software domains covered by the benchmark
DeskCraft organizes tasks into a multilevel difficulty taxonomy, with long-horizon tasks requiring over 50 execution steps.
Benchmark design described in abstract (explicit statement that long-horizon tasks require over 50 execution steps).
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... task difficulty / horizon length (number of execution steps)
We introduce DeskCraft, a desktop GUI benchmark targeting long-horizon creative and engineering workflows and proactive human-agent collaboration.
Author statement describing the new benchmark (benchmark design and scope described in paper).
high positive DeskCraft: Benchmarking Desktop Agents on Professional Workf... availability of a benchmark for long-horizon workflows and human-agent collabora...
The paper constructs firm-level indicators of artificial intelligence and new quality productive forces for new energy vehicle firms.
Authors state they constructed firm-level indicators as part of their empirical approach on the Yangtze River Delta panel dataset.
high positive Mechanisms and Effects of Artificial Intelligence on New Qua... construction of measurement indicators (methodological contribution)
Artificial intelligence affects firms' new quality productive forces through improvement of innovation output.
Mechanism tests reported by the authors showing empirical evidence that AI improves innovation output (e.g., measured innovation outcomes) which is linked to higher new quality productive forces.
high positive Mechanisms and Effects of Artificial Intelligence on New Qua... new quality productive forces (mediated by innovation output)
Artificial intelligence affects firms' new quality productive forces through optimization of R&D personnel structure.
Mechanism tests reported by the authors using the constructed indicators and panel data; empirical evidence cited that links AI to changes in R&D personnel structure which in turn link to new quality productive forces.
high positive Mechanisms and Effects of Artificial Intelligence on New Qua... new quality productive forces (mediated by R&D personnel structure)
The promoting effect of artificial intelligence on new quality productive forces is more pronounced among small-sized enterprises.
Heterogeneity tests by firm size in the panel data; authors report stronger positive effects for small-sized firms.
high positive Mechanisms and Effects of Artificial Intelligence on New Qua... new quality productive forces (firm-size heterogeneity)