The Commonplace

Evidence (13870 claims)

Adoption (8467 claims)
Productivity (7558 claims)
Governance (6805 claims)
Human-AI Collaboration (6363 claims)
Org Design (4132 claims)
Innovation (4065 claims)
Labor Markets (3526 claims)
Skills & Training (2945 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 196 98 892 1984
Governance & Regulation 817 394 188 121 1544
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 627 233 123 96 1088
Research Productivity 411 123 56 332 933
Output Quality 467 178 59 47 751
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 167 122 24 496
Task Allocation 207 64 71 32 379
Skill Acquisition 165 59 60 17 301
Innovation Output 203 27 43 18 292
Employment Level 105 52 107 13 279
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 150 48 26 3 227
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 63 20 12 184
Error Rate 69 92 10 2 173
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 93 21 13 19 148
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Creative Output 31 17 7 3 59
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
The paper is primarily discursive and invitational: it opens a dialogue and proposes a research agenda rather than providing definitive empirical answers.
Stated methodological stance and limits: conceptual/philosophical analysis, interdisciplinary literature synthesis, and qualitative/illustrative examples, with an explicit note that no systematic empirical evaluation was conducted.
high | null result | At the table with Wittgenstein: How language shapes taste an... | presence/absence of new empirical datasets or systematic experimental validation...
Operators and regulators should prioritize independent model audits, disclosure of data use, fairness/error rates, and field experiments to quantify causal impacts and heterogeneous effects.
Policy recommendations and research priorities summarized in the review based on identified methodological and governance gaps.
high | null result | Deep technologies and safer gambling: A systematic review. | policy/research actions recommended (qualitative)
Research gaps include the need for robust causal evaluations (RCTs, field experiments), standardized metrics, transparency/interpretability, fairness analysis, and cross‑jurisdictional studies.
Review's recommendations and identified gaps, noting scarcity of RCTs/longitudinal work and calls for standardized outcomes and fairness checks.
high | null result | Deep technologies and safer gambling: A systematic review. | presence of causal evaluations, standardized metrics, transparency and fairness ...
Heterogeneous study designs, outcomes, and measures across the literature hinder quantitative meta‑analysis and synthesis of effectiveness.
Review states heterogeneity of designs and outcome measures as a limitation preventing meta‑analysis.
high | null result | Deep technologies and safer gambling: A systematic review. | heterogeneity of study designs and outcome measures (qualitative / count of disp...
Typical data used in studies are platform behavioural logs (bets, stakes, timestamps, session durations), account metadata, and in some cases limited self‑report measures.
Review summary of data sources across included studies listing platform logs and metadata as primary inputs to algorithms.
high | null result | Deep technologies and safer gambling: A systematic review. | data types employed in models (behavioral log variables, account metadata, self‑...
Evaluation approaches in the reviewed literature varied widely, with many studies using retrospective accuracy metrics (AUC, precision/recall) rather than causal impact measures on harm reduction.
Methods synthesis in review: prevalence of supervised/unsupervised ML with retrospective performance reporting; few RCTs or field experiments reported.
high | null result | Deep technologies and safer gambling: A systematic review. | type of evaluation used (retrospective predictive metrics vs causal designs)
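To make the distinction concrete, the retrospective metrics named in this claim can be computed from labelled historical data alone. The sketch below uses plain Python with made-up labels, risk scores, and a 0.5 flagging threshold (all illustrative assumptions, not data from the review). Scores like these describe how well a model flags or ranks accounts after the fact; they do not show whether acting on the flags reduced harm, which is what a causal design would test.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = at-risk, 0 = not at-risk)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical historical data: true outcomes and model risk scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
flags = [1 if s >= 0.5 else 0 for s in scores]  # assumed flagging threshold

precision, recall = precision_recall(labels, flags)
auc_value = auc(labels, scores)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc_value:.3f}")
```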
Four primary application areas were identified: (1) behavioural monitoring and feedback, (2) predictive risk modelling, (3) decision support and AI classifiers, and (4) limit‑setting and self‑exclusion tools.
Thematic synthesis of included studies categorizing described applications into four main areas (review taxonomy).
high | null result | Deep technologies and safer gambling: A systematic review. | application area classification (categorical counts / thematic presence)
Searches were performed in Web of Science, PubMed, Scopus, EBSCO and IEEE, plus manual searches, following PRISMA guidelines.
Methods section of the review specifying databases searched and PRISMA-guided review process.
high | null result | Deep technologies and safer gambling: A systematic review. | search strategy / databases searched (qualitative)
The review included 68 empirical and methodological studies on deep technologies in online gambling.
Systematic review following PRISMA; searches of Web of Science, PubMed, Scopus, EBSCO, IEEE and manual searching produced 68 included studies (count reported in paper).
high | null result | Deep technologies and safer gambling: A systematic review. | number of included studies (study count = 68)
The collection includes a mix of methodological papers, empirical applications demonstrating ecological insight, and translational work focused on policy or conservation practice.
Study-types categorization provided in the paper (descriptive tally/characterization of the kinds of contributions in the collection).
high | null result | Towards ‘digital ecology’: Advances in integrating artificia... | types of studies present in the collection
Methods in the collection span from automated image and signal processing for routine tasks to integrated modelling that couples ecological theory with data‑driven methods.
Methods-scope summary in the paper describing the range of AI/ML approaches used across the collection (descriptive across studies).
high | null result | Towards ‘digital ecology’: Advances in integrating artificia... | range of methodological approaches used
The collection uses large ecological observational datasets such as camera‑trap imagery, sensor streams, biodiversity surveys, and other high‑volume ecological monitoring data.
Data & methods section listing the data types represented across the reviewed papers (descriptive inventory of dataset types used in the collection).
high | null result | Towards ‘digital ecology’: Advances in integrating artificia... | types of data used in ecological AI research
Recommendation (research): Future research should link AI adoption to objective performance metrics (profitability, default rates, processing times) and use longitudinal or quasi-experimental designs to identify causal effects.
Authors' suggested research directions noted in the summary, motivated by limitations of cross-sectional, self-reported data.
high | null result | From Data to Decisions: Harnessing Artificial Intelligence f... | research design and outcome measurement (recommendation)
The summary omits important reporting details: p-values, standard errors, model control variables, and exact variable operationalizations are not provided.
Explicit reporting gap noted in the paper summary (absence of p-values, SEs, controls, and operationalization details).
high | null result | From Data to Decisions: Harnessing Artificial Intelligence f... | statistical reporting completeness
Because the data are cross-sectional and self-reported, the design limits causal inference about AI adoption causing the observed outcomes.
Study design (cross-sectional survey, self-reported measures) and explicit limitation noted in the paper summary.
high | null result | From Data to Decisions: Harnessing Artificial Intelligence f... | ability to infer causality
Key measures are self-reported Likert scales for AI adoption/usage and the dependent outcomes (financial decision-making efficiency, operational efficiency, financial resilience, and AI-based analytics effectiveness).
Measurement description in Methods: independent and dependent variables reported as self-reported Likert measures collected in the cross-sectional survey.
high | null result | From Data to Decisions: Harnessing Artificial Intelligence f... | measurement type (self-reported Likert scales)
The study is a cross-sectional quantitative survey of 312 professionals in banks, fintechs, and financial service firms.
Study design and sample description reported in Data & Methods; sample size explicitly given as N = 312 and composition described as professionals across financial institutions, fintech organizations, and financial service companies.
The SKILL.md used in the with-skill condition encodes workflow logic, API patterns, and business rules as portable domain guidance for agents.
Paper description of the with-skill intervention specifying the content and intended role of SKILL.md.
high | null result | SKILLS: Structured Knowledge Injection for LLM-Driven Teleco... | presence and content type of injected domain guidance (workflow logic, API patte...
We evaluated open-weight models under two conditions: baseline (generic agent with tool access but no domain guidance) and with-skill (agent augmented with a portable SKILL.md document encoding workflow logic, API patterns, and business rules).
Experimental design in paper describing the two agent conditions; SKILL.md described as the injected domain guidance artifact.
high | null result | SKILLS: Structured Knowledge Injection for LLM-Driven Teleco... | experimental condition (baseline vs with-skill)
Each scenario is grounded in live mock API servers with seeded production-representative data, MCP tool interfaces, and deterministic evaluation rubrics combining response content checks, tool-call verification, and database state assertions.
Methods/benchmark design described in paper specifying environment: live mock APIs, seeded data, MCP tool interfaces, and deterministic evaluation combining content checks, tool-call verification, and DB assertions.
high | null result | SKILLS: Structured Knowledge Injection for LLM-Driven Teleco... | evaluation environment fidelity and evaluation criteria (content checks, tool-ca...
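A minimal sketch of how a deterministic rubric could combine the three kinds of checks named in this claim (response content, tool calls, database state). Every name here (`ScenarioResult`, `Rubric`, `evaluate`, the tool names, the ticket ID) is hypothetical and chosen for illustration; this is not the benchmark's actual harness.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    response_text: str  # final answer returned by the agent
    tool_calls: list    # names of the tools the agent invoked, in order
    db_state: dict      # snapshot of the mock database after the run


@dataclass
class Rubric:
    required_phrases: list     # substrings the response must contain
    required_tool_calls: list  # tools that must have been called
    expected_db_state: dict    # key/value pairs the database must hold


def evaluate(result, rubric):
    """Return per-check outcomes and an overall pass flag (all checks must pass)."""
    text = result.response_text.lower()
    checks = {
        "content": all(p.lower() in text for p in rubric.required_phrases),
        "tool_calls": all(t in result.tool_calls for t in rubric.required_tool_calls),
        "db_state": all(
            result.db_state.get(k) == v for k, v in rubric.expected_db_state.items()
        ),
    }
    return checks, all(checks.values())


# Hypothetical scenario: the agent should open a trouble ticket and acknowledge it.
result = ScenarioResult(
    response_text="Trouble ticket TT-1001 has been created and acknowledged.",
    tool_calls=["lookup_customer", "create_trouble_ticket"],
    db_state={"TT-1001.status": "acknowledged"},
)
rubric = Rubric(
    required_phrases=["TT-1001", "acknowledged"],
    required_tool_calls=["create_trouble_ticket"],
    expected_db_state={"TT-1001.status": "acknowledged"},
)
checks, passed = evaluate(result, rubric)
print(checks, passed)
```

Because each check is a pure function of the recorded run, the same run always gets the same verdict, which is the property the word "deterministic" refers to here.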
SKILLS comprises 37 telecom operations scenarios spanning 8 TM Forum Open API domains (TMF620, TMF621, TMF622, TMF628, TMF629, TMF637, TMF639, TMF724).
Framework specification in the paper; explicit statement of scenario count (37) and list of 8 TMF Open API domains.
high | null result | SKILLS: Structured Knowledge Injection for LLM-Driven Teleco... | coverage: number of scenarios (37) and number of API domains (8) included
We introduce SKILLS (Structured Knowledge Injection for LLM-driven Service Lifecycle operations), a benchmark framework for telecom operations.
Paper describes the design and release of the SKILLS benchmark framework as the contribution; methods section outlines framework components and usage.
high | null result | SKILLS: Structured Knowledge Injection for LLM-Driven Teleco... | existence and definition of the SKILLS benchmark framework
The paper identifies three core mechanisms underlying calibrated trust and complementarity: (1) calibrated trust balancing reliance and oversight, (2) complementarity–trust interaction for optimal performance, and (3) dynamic feedback loops producing reinforcing learning cycles.
Explicit identification of mechanisms claimed in the paper's synthesis; this is a descriptive claim about the paper's content rather than an empirical finding, and no sample or empirical test is reported in the abstract.
high | null result | Optimising Human–AI Decision Performance: A Trust and Cap... | n/a (identification of theoretical mechanisms)
AI-adopting firms do not increase capital expenditures following adoption.
Firm-level capex analysis showing no significant change in capital expenditures for adopters versus nonadopters post-adoption in the paper's empirical framework.
high | null result | AI and Productivity: The Role of Innovation | capital expenditures (capex)
It remains unclear how developers' general programming and security-specific experience, and the type of AI tool used (free vs. paid), affect the security of the resulting software; this gap motivates the study.
Paper's stated research gap/motivation: the authors identify uncertainty in the literature regarding interactions between developer experience, AI tool tier (free vs. paid), and resulting code security.
high | null result | The Impact of AI-Assisted Development on Software Security: ... | the combined effect of developer experience and AI tool type on code security (i...
Participants were assigned a security-related programming task using either no AI tools, the free version, or the paid version of Gemini.
Experimental design described in the paper: random/conditional assignment of participants into three groups (no AI, free Gemini, paid Gemini) performing the same security-related programming task.
high | null result | The Impact of AI-Assisted Development on Software Security: ... | experimental condition (tool used) as it relates to subsequent code security out...
We conducted a quantitative programming study with software developers (n = 159) exploring the impact of Google's AI tool Gemini on code security.
Explicit methodological statement in the paper: a quantitative study with 159 participating software developers assigned to experimental conditions to evaluate Gemini's impact on security-related programming tasks.
high | null result | The Impact of AI-Assisted Development on Software Security: ... | impact of Gemini on code security (security of code produced in the study)
The authors surveyed workers and developers on a representative sample of 171 tasks and used language models (LMs) to scale ratings to 10,131 computer-assisted tasks across all U.S. occupations.
Study methodology reported in the paper: surveys of 'workers and developers' on 171 tasks, plus LM-based scaling to 10,131 tasks (coverage claims across U.S. occupations).
high | null result | Are We Automating the Joy Out of Work? Designing AI to Augme... | coverage and scaling of task-level ratings (number of tasks surveyed and number ...
SWE-Skills-Bench is available at https://github.com/GeniusHTX/SWE-Skills-Bench.
Repository URL provided in the paper for the benchmark's code/data.
high | null result | SWE-Skills-Bench: Do Agent Skills Actually Help in Real-Worl... | public availability (URL) of the benchmark
SWE-Skills-Bench provides a testbed for evaluating the design, selection, and deployment of skills in software engineering agents.
Benchmark design pairs skills, repositories, and deterministic verification tests; intended use stated by authors as a testbed for evaluation of skills.
high | null result | SWE-Skills-Bench: Do Agent Skills Actually Help in Real-Worl... | availability of a benchmarking testbed for evaluating agent skills
39 of 49 skills yield zero pass-rate improvement.
Empirical evaluation over 49 skills and ~565 task instances reporting that 39 skills produced no improvement in test pass rate when injected.
high | null result | SWE-Skills-Bench: Do Agent Skills Actually Help in Real-Worl... | change in task acceptance-test pass rate (zero improvement)
The authors introduce a deterministic verification framework that maps each task's acceptance criteria to execution-based tests, enabling controlled paired evaluation with and without the skill.
Method: creation of a deterministic verification framework that converts acceptance criteria into executable tests; used to perform paired evaluations (with skill vs. without skill).
high | null result | SWE-Skills-Bench: Do Agent Skills Actually Help in Real-Worl... | ability to deterministically verify task acceptance criteria via execution-based...
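A paired evaluation of this kind can be sketched as follows: the same tasks are run with and without a skill, each run is reduced to pass/fail by the execution-based tests, and the per-skill difference in pass rate is compared. The skill names and outcomes below are invented for illustration, and `paired_deltas` is a hypothetical helper, not part of the benchmark.

```python
def pass_rate(results):
    """Fraction of tasks whose acceptance tests passed."""
    return sum(results) / len(results)


def paired_deltas(runs):
    """Map each skill to (with-skill pass rate) minus (baseline pass rate).

    `runs` maps a skill name to a pair of lists of booleans covering the
    same tasks in the same order: (baseline outcomes, with-skill outcomes).
    """
    deltas = {}
    for skill, (baseline, with_skill) in runs.items():
        if len(baseline) != len(with_skill):
            raise ValueError(f"unpaired results for {skill}")
        deltas[skill] = pass_rate(with_skill) - pass_rate(baseline)
    return deltas


# Hypothetical outcomes for three skills on four tasks each.
runs = {
    "skill_a": ([True, False, True, True], [True, False, True, True]),
    "skill_b": ([False, False, True, True], [True, False, True, True]),
    "skill_c": ([True, True, False, False], [True, False, False, False]),
}
deltas = paired_deltas(runs)
zero_gain = [skill for skill, d in deltas.items() if d == 0]
print(deltas, zero_gain)
```

Pinning repositories to fixed commits and using deterministic tests matters for this comparison because any difference between the two conditions can then be attributed to the skill rather than to changes in the task or the grader.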
SWE-Skills-Bench pairs 49 public SWE skills with authentic GitHub repositories pinned at fixed commits and requirement documents with explicit acceptance criteria, yielding approximately 565 task instances across six SWE subdomains.
Benchmark construction: 49 public skills, repositories pinned to fixed commits, requirement documents with acceptance criteria, producing ~565 task instances spanning six SWE subdomains (as reported by the paper).
high | null result | SWE-Skills-Bench: Do Agent Skills Actually Help in Real-Worl... | number of skill-repo-task instances (~565) and coverage across six subdomains
The article introduces a novel Bayesian Item Response Theory framework that quantifies human–AI synergy by separately estimating individual ability, collaborative ability, and AI model capability while controlling for task difficulty.
Methodological contribution described in the paper: development and application of a Bayesian Item Response Theory model that includes separate parameters for individual ability, collaborative ability, AI model capability, and task difficulty (method section of the paper).
high | null result | Quantifying and Optimizing Human-AI Synergy: Evidence-Based ... | estimated parameters for individual ability, collaborative ability, AI model cap...
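As a rough illustration of what estimating these quantities separately can mean, a simple item-response form places each one on a shared logit scale, so that task difficulty is subtracted from whatever abilities apply. The additive form, the parameter names (`theta`, `kappa`, `gamma`, `difficulty`), and the example values are assumptions for illustration only; the paper's actual parameterization and priors are not given in this summary, and the sketch covers only the response probability, not the Bayesian estimation step.

```python
import math


def p_success(theta, difficulty, kappa=0.0, gamma=0.0, with_ai=False):
    """Probability of solving a task under an assumed additive logistic IRT form.

    theta      -- individual (solo) ability of the person
    difficulty -- difficulty of the task
    kappa      -- person's collaborative ability (applies only when working with AI)
    gamma      -- capability of the AI model (applies only when working with AI)
    """
    logit = theta - difficulty
    if with_ai:
        logit += kappa + gamma
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical person of average ability on a task of average difficulty.
solo = p_success(theta=0.0, difficulty=0.0)
assisted = p_success(theta=0.0, difficulty=0.0, kappa=0.5, gamma=0.5, with_ai=True)
print(f"solo={solo:.3f} assisted={assisted:.3f}")
```

Keeping `kappa` distinct from `theta` is what would let such a model represent a person who is strong alone but gains little from AI, or the reverse.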
The Planner is trained via Supervised Fine-Tuning (SFT) to internalize diagnostic capabilities and then aligned with business outcomes (conversion rate) via Reinforcement Learning (RL).
Method description in the paper specifying SFT initialization followed by RL alignment targeting conversion rate (UCVR) as reward signal.
high | null result | Probe-then-Plan: Environment-Aware Planning for Industrial E... | Planner diagnostic behavior and policy alignment with conversion rate (model tra...
EASP's Offline Data Synthesis stage: a Teacher Agent synthesizes diverse, execution-validated plans by diagnosing the probed environment.
Method description in the paper detailing the Teacher Agent's role in synthesizing execution-validated plans during offline data synthesis.
high | null result | Probe-then-Plan: Environment-Aware Planning for Industrial E... | synthesized execution-validated search plans (data generation outcome)
The Probe-then-Plan mechanism uses a lightweight Retrieval Probe to expose the retrieval snapshot, enabling the Planner to diagnose execution gaps and generate grounded search plans.
Methodological description in the paper: design and implementation of Retrieval Probe and Planner; validated through synthesized data and downstream evaluations (offline and online).
high | null result | Probe-then-Plan: Environment-Aware Planning for Industrial E... | retrieval snapshot exposure and Planner diagnostic output (implementation/functi...
Descriptive statistics, reliability tests, regression analysis, and structural equation modelling (SEM) were employed to analyse the relationships between AI adoption and entrepreneurial outcomes.
Methods section reporting use of descriptive statistics, reliability tests, regression analysis, and SEM to evaluate relationships between AI adoption and measured outcomes.
high | null result | Entrepreneurship in the Era of Artificial Intelligence: Rede... | not applicable (methodological detail)
The study used a quantitative research design and collected data from 350 entrepreneurs and managers of small and medium-sized enterprises (SMEs) who had adopted AI in their business operations.
Methods section of the paper specifying a quantitative design and a sample size of 350 AI-adopting SME entrepreneurs/managers.
high | null result | Entrepreneurship in the Era of Artificial Intelligence: Rede... | not applicable (methodological detail)
The study used portfolio-level analysis to compare the financial outcomes of portfolios constructed using AI-driven ESG indicators with those based on conventional ESG ratings.
Methodological statement in the paper: portfolio-level analysis and comparative design. The summary does not specify the number of portfolios, asset universes, time frame, or construction rules.
high | null result | Green Intelligence in Finance: Artificial Intelligence-Drive... | Study methodology (portfolio-level comparative analysis)
A quantitative methodology was employed, utilizing a structured questionnaire administered to 400 small business owners.
Explicit methodological statement in the paper: structured questionnaire survey with sample size N=400 small business owners.
high | null result | The role of artificial intelligence in enhancing financial l... | method / sample (use of structured questionnaire; sample size = 400)
The study uses a game-theoretic model involving a foundation model provider and two competing downstream firms to analyze how policy interventions affect consumer surplus in the AI supply chain.
Methodological description in the paper: a formal game-theoretic model with one upstream provider and two downstream competing firms; equilibrium analysis and comparative statics are performed on model outcomes (prices, qualities, profits, consumer surplus).
high | null result | The Economics of AI Supply Chain Regulation | model equilibrium outcomes (prices, qualities, provider profit, downstream profi...
An SCF-oriented organizational ethnography was carried out, using a structured guide and triangulation of evidence.
Qualitative method disclosed in the abstract: organizational ethnography with a structured guide and triangulation; the abstract does not give the number of organizations, the duration, or the sampling approach.
high | null result | A FRICÇÃO PSICOANTROPOLÓGICA (SCF - Symbolic-Cognitive Frict... | qualitative evidence of the existence and manifestation of psychoanthropological friction...
A psychometric instrument (the SCF-30 scale) was constructed and validated, and a 0–100 index was computed, with structural equation modelling (SEM) and reliability/validity tests.
Explicit methodological description in the abstract: construction and validation of the SCF-30 scale, use of SEM, and reliability and validity tests. The abstract does not detail statistics, sample, or numerical results.
high | null result | A FRICÇÃO PSICOANTROPOLÓGICA (SCF - Symbolic-Cognitive Frict... | SCF score (0–100 index) and psychometric properties of the SCF-30 scale (reliab...
The SCF is operationalized through three central vectors: Perception of Complexity (Percepção de Complexidade, PC), Institutional Risk Aversion (Aversão ao Risco Institucional, AR), and Cultural Inertia (Inércia Cultural, IC).
Conceptual and operational structure presented in the article; explicit specification of the three vectors as components of the SCF construct.
high | null result | A FRICÇÃO PSICOANTROPOLÓGICA (SCF - Symbolic-Cognitive Frict... | constituent components of the SCF construct (PC, AR, IC)
This research conducts a critical analysis of the ethical implications of artificial intelligence in terms of job displacement during the fifth industrial revolution.
Author-declared methodology: a literature-based critical analysis drawing on novel studies and the existing body of literature; no further methodological details (e.g., inclusion criteria, databases searched) provided in the excerpt.
high | null result | A Study on Work-Life Balance of Women Employees in the IT Se... | ethical implications of AI-related job displacement
This study uses panel data on agricultural firms listed on the Shanghai and Shenzhen A-share markets from 2007 to 2023 and applies a multidimensional fixed-effects model to estimate the impact of AI on firms’ total factor productivity (TFP).
Methodological statement in the paper: dataset = panel of listed agricultural firms (Shanghai and Shenzhen A-share markets), time period 2007–2023; empirical approach = multidimensional fixed-effects model.
high | null result | Artificial intelligence and the sustainable development of a... | study design / estimation of AI impact on total factor productivity (TFP)
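For orientation, a generic panel specification of the kind this claim describes can be written as below. This is a sketch under assumptions, not the paper's equation: the log transformation of TFP, the control vector X, and the choice of firm and year effects are illustrative, and the summary does not say which further dimensions (for example industry or region) the "multidimensional" fixed effects cover.

```latex
\ln \mathrm{TFP}_{it} = \alpha + \beta\,\mathrm{AI}_{it} + \gamma^{\top} X_{it} + \mu_i + \lambda_t + \varepsilon_{it}
```

Here i indexes firms and t indexes years (2007 to 2023), AI_{it} is the firm's AI measure, mu_i and lambda_t are firm and year fixed effects, and beta is the coefficient of interest.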
Degree, betweenness, and eigenvector centrality metrics were used to identify structural vulnerabilities and leverage points in the construction supply chain network.
Paper reports calculation of degree, betweenness, and eigenvector centrality to outline vulnerabilities; specific metrics and interpretations are reported (e.g., degree centrality value for brokers).
high | null result | Social-Network Analytics of Construction Supply Chain | network centrality measures (degree, betweenness, eigenvector) as indicators of ...
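The three centrality measures can be illustrated on a toy network. The sketch below uses only the Python standard library; the six actors and their links are invented for illustration (only the presence of brokers is mentioned in the claim above), and the implementations are textbook versions for small unweighted, undirected graphs rather than the paper's tooling.

```python
from collections import deque
import math


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Share of the other nodes each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (BFS per source)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # farthest nodes first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I (the shift avoids oscillation on bipartite graphs)."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in nxt.values()))
        x = {v: val / norm for v, val in nxt.items()}
    return x


# Hypothetical supply chain: a broker links upstream firms to a contractor.
edges = [("supplier", "broker"), ("manufacturer", "broker"), ("broker", "contractor"),
         ("contractor", "subcontractor"), ("contractor", "client")]
adj = build_adjacency(edges)
degree = degree_centrality(adj)
betweenness = betweenness_centrality(adj)
eigen = eigenvector_centrality(adj)
print(degree["broker"], betweenness["broker"], round(eigen["broker"], 3))
```

In this toy graph every path between the upstream firms and the rest of the chain runs through the broker, which is the kind of structural dependence a high betweenness score is meant to flag.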
Thematic coding translated reported interactions into nodes and edges of a complex network and grouped challenges into thematic categories.
Methods described: thematic coding applied to interview data to create network structure and to generate challenge categories (six main categories, 16 open codes reported).
high | null result | Social-Network Analytics of Construction Supply Chain | conversion of qualitative interactions into network structure and thematic categ...
This study combines empirical, semi-structured interviews with social network analytics to map construction supply chain relationships and vulnerabilities.
Methods reported in the paper: use of semi-structured interviews plus social network analysis (thematic coding to create nodes/edges, calculation of network metrics). Sample size not specified in the abstract.
high | null result | Social-Network Analytics of Construction Supply Chain | research method integration (interviews + social network analytics)