The Commonplace

Evidence (13870 claims)

Adoption (8467 claims)
Productivity (7558 claims)
Governance (6805 claims)
Human-AI Collaboration (6363 claims)
Org Design (4132 claims)
Innovation (4065 claims)
Labor Markets (3526 claims)
Skills & Training (2945 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 196 98 892 1984
Governance & Regulation 817 394 188 121 1544
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 627 233 123 96 1088
Research Productivity 411 123 56 332 933
Output Quality 467 178 59 47 751
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 167 122 24 496
Task Allocation 207 64 71 32 379
Skill Acquisition 165 59 60 17 301
Innovation Output 203 27 43 18 292
Employment Level 105 52 107 13 279
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 150 48 26 3 227
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 63 20 12 184
Error Rate 69 92 10 2 173
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 93 21 13 19 148
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Creative Output 31 17 7 3 59
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
Our findings highlight the importance of additional research and progress on economic measurement related to AI.
Authors' concluding statement/recommendation based on their results and measurement challenges discussed in the paper.
high null result Early Estimates of the Impact of AI Within BEA’s Industry Ec... need for improved economic measurement and further research
We use tools to indirectly estimate the impact of AI via the lens of BEA’s industry accounts.
Methodological description in the paper: authors apply indirect estimation methods using BEA industry accounts to infer AI's economic impact.
high null result Early Estimates of the Impact of AI Within BEA’s Industry Ec... economic impact of AI (estimated indirectly)
Currently, there is not a line item in the U.S. national accounts that can be used to identify and measure the economic impact of artificial intelligence (AI).
Statement by authors about the state of U.S. national accounts (BEA) and absence of a specific national-accounts line item for AI.
high null result Early Estimates of the Impact of AI Within BEA’s Industry Ec... presence/absence of a national-accounts line item for AI
We compare multiple state-of-the-art agents (e.g., GPT-4o, Llama 3, Qwen2) on metrics assessing tool selection accuracy, faithfulness, and hallucination.
Paper lists evaluated models (GPT-4o, Llama 3, Qwen2) and reports evaluation on metrics including tool selection accuracy, faithfulness, and hallucination across the benchmark.
high null result Time Series Augmented Generation for Financial Applications tool selection accuracy, faithfulness, hallucination
Our benchmark consists of 100 financial questions.
Paper explicitly states the benchmark contains 100 financial questions.
high null result Time Series Augmented Generation for Financial Applications benchmark size (number of questions)
The outreach casenotes used in the study are fairly short and heavily redacted.
Descriptive statement about the dataset of street outreach casenotes provided by the nonprofit partner used in the audit (direct observation by authors).
high null result Auditing LLMs for Algorithmic Fairness in Casenote-Augmented... casenote length and degree of redaction
LLM zero-shot classification does not introduce additional textual biases beyond the algorithmic biases already present in tabular classification.
Authors' assessment/audit comparing zero-shot LLM classification using casenote text against tabular-only classification, concluding no additional textual bias introduced. (Details and sample size not provided in abstract.)
high null result Auditing LLMs for Algorithmic Fairness in Casenote-Augmented... additional textual bias introduced by LLM zero-shot classification relative to t...
Under three scenarios (optimistic: 2028-2035; base: 2035-2045; pessimistic: 2045-2060), we specify disconfirmation criteria that would weaken the thesis if observed.
Scenario analysis and specification of disconfirmation criteria by the authors; methodological claim about forecasting structure rather than empirical result.
high null result The Instrumental Dissolution of Typing: Why AI Challenges th... timing of transition/adoption scenarios and falsification criteria
Converging evidence from history, philosophy, neuroscience, technology, organizational studies, and cultural analysis supports this thesis.
Authors' multidisciplinary literature review and synthesis across the named fields (method: qualitative review); no single empirical dataset or sample size given.
high null result The Instrumental Dissolution of Typing: Why AI Challenges th... strength of multidisciplinary support for the thesis
We introduce 'instrumental dissolution' -- loss of institutional-default status while persisting in specialist niches.
Conceptual/theoretical contribution defined by the authors and illustrated via cross-disciplinary examples; no empirical validation sample reported.
high null result The Instrumental Dissolution of Typing: Why AI Challenges th... change in institutional-default status of a technology
Typing's dominance was instrumental, not cognitively necessary.
Argumentative/historical analysis presented in the paper; synthesis of historical and philosophical literature (no empirical sample or experiment reported).
high null result The Instrumental Dissolution of Typing: Why AI Challenges th... instrumental status of keyboard in knowledge work
We conducted an in-the-wild evaluation with over 2,200 individuals from heterogeneous organisations and roles in 116 countries, via log analysis, surveys, and 20 interviews.
Reported evaluation methods and sample in the paper's abstract: log analysis, surveys, and 20 interviews with over 2,200 participants across 116 countries.
high null result Learning from AVA: Early Lessons from a Curated and Trustwor... evaluation sample and methods
Participants were retested individually on the programming tasks after a retention interval of one week.
Statement in abstract describing follow-up retest procedure (one-week retention interval, individual retest).
high null result Fast and Forgettable: A Controlled Study of Novices' Perform... retest performance after one-week retention interval
Participants were incentivized by bonus compensation to balance performance with understanding.
Paper description of participant incentives in methods/abstract; compensation scheme used during experiment.
high null result Fast and Forgettable: A Controlled Study of Novices' Perform... incentive structure (bonus compensation)
We conducted a controlled pair programming study with 22 participants who wrote Python code under time pressure in teams of two and individually with GitHub Copilot for 20 minutes each.
Statement of study design in the paper's methods/abstract; controlled pair programming experiment with 22 participants, 20-minute tasks in both conditions (human teammate and Copilot).
high null result Fast and Forgettable: A Controlled Study of Novices' Perform... study design / experimental conditions (teams of two vs individual with Copilot;...
We measure processes of polarization and integration in global AI research over three decades using large-scale scientific publication data.
Methodological claim describing the study: the analysis spans three decades and uses large-scale publication data and network comparisons to randomized baselines.
high null result Polarization and Integration in Global AI Research measurement of polarization and integration processes
A stylized calibration to four providers using April 2026 data treats parameter values as inputs to a comparative risk mapping, not structural estimates.
Paper reports a calibration exercise using data from four providers (April 2026) and emphasizes it is a comparative mapping rather than structural estimation.
high null result The Inference Bottleneck: A Formal Model of Vertical Foreclo... comparative risk mapping across providers
Discrimination (QoS gap) vanishes at a joint boundary rather than at a simple threshold in alpha alone.
Analytical result from the model characterizing the boundary conditions for non-discrimination.
high null result The Inference Bottleneck: A Formal Model of Vertical Foreclo... presence/absence of QoS discrimination
The framework is evaluated against forecast-driven base-stock and greedy fulfillment heuristics, and against a perfect-information oracle; pairwise differences are examined using Wilcoxon signed-rank tests.
Experimental evaluation setup described in the paper: comparisons to two heuristic baselines and an oracle, and use of Wilcoxon signed-rank tests for pairwise comparisons.
high null result Omnichannel Supply Chains Amid Demand Shocks: A Centralized ... evaluation protocol (comparators and statistical test used)
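As a reading aid for the evaluation protocol above, the following is a minimal sketch of how the Wilcoxon signed-rank statistic is formed from paired results (for example, per-scenario costs of two policies). It is not the paper's code: the function name and the sample numbers are hypothetical, zero differences are dropped, tied absolute differences receive average ranks, and no p-value is computed.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (W_plus, W_minus) for paired samples a and b.

    Hypothetical helper for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")

    # Paired differences; pairs with no difference carry no sign and are dropped.
    diffs = [x - y for x, y in zip(a, b) if x != y]

    # Rank the differences by absolute size (1 = smallest).
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    # Sum the ranks separately for positive and negative differences.
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return w_plus, w_minus


# Made-up paired scores for two policies on five scenarios.
policy_a = [10, 12, 9, 15, 11]
policy_b = [8, 13, 9, 11, 8]
print(wilcoxon_signed_rank(policy_a, policy_b))
```

A large imbalance between the two rank sums suggests one method is systematically better across the pairs; a statistics library would then convert the smaller sum into a p-value.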
Demand shocks are modeled using two specifications: a mixed profile (half the products follow a uniform demand process and the rest follow a Merton-type jump-diffusion process) and a fully shock-driven profile.
Modeling choices described in the methods: two demand-shock specification setups for simulation experiments.
high null result Omnichannel Supply Chains Amid Demand Shocks: A Centralized ... demand process specification used in experiments (mixed vs fully shock-driven)
Policies are learned using Proximal Policy Optimization (PPO) in an actor–critic architecture, with bounded stochastic policies to handle constrained action spaces.
Method description in the paper specifying the use of PPO, actor–critic structure, and bounded stochastic policy parameterization.
high null result Omnichannel Supply Chains Amid Demand Shocks: A Centralized ... learning algorithm and policy parameterization (PPO actor–critic with bounded st...
The study develops a centralized Hierarchical Reinforcement Learning (HRL) control framework that makes decision timing explicit: replenishment and allocation are optimized weekly, while fulfillment and lateral inventory rebalancing are controlled daily.
Methodological description in the paper: design of an HRL framework with two-level timing (weekly vs daily) for different control decisions.
high null result Omnichannel Supply Chains Amid Demand Shocks: A Centralized ... decision timing policy (weekly replenishment/allocation; daily fulfillment/rebal...
In both popular and academic press, concerns are often expressed that AI threatens not only people’s livelihoods but also the meaning they derive from their work.
Observational/literature-commentary claim made in the paper's abstract; references to discourse in popular and academic press (no empirical study or sample reported).
high null result Is artificial intelligence a threat to meaningful work and l... concerns about threat to livelihoods and meaning derived from work (public and a...
Buildings account for approximately 40% of global energy consumption.
Statement in paper (background/contextual fact); likely based on cited external data though no sample size reported in excerpt.
high null result Safe Deep Reinforcement Learning for Building Heating Contro... share of global energy consumption accounted for by buildings
The analysis uses causal discovery methods and integrates scenario-based outcomes, communication analysis, and questionnaire measures.
Paper abstract states that causal discovery analysis was used and that it integrates scenario outcomes, communication analysis, and questionnaire measures.
The study examines user Extraversion and Agreeableness alongside AI design characteristics including Adaptability, Expertise, and chain-of-thought Transparency.
Variables listed in the abstract as the human personality traits and AI design characteristics analyzed.
high null result Imperfectly Cooperative Human-AI Interactions: Comparing the... personality and design factors
The study compares two interaction scenario categories: (1) hiring negotiations between human job candidates and AI hiring agents; and (2) human-AI transactions in which AI agents may conceal information to maximize internal goals.
Explicit description of the two scenario categories in the paper abstract; method: experimental / simulation scenarios.
The study includes a parallel human subjects experiment involving 290 human participants.
Statement in paper abstract reporting a human-subjects experiment with 290 participants.
The study uses a purely simulated dataset comprising 2,000 simulations.
Statement in paper abstract describing a simulated dataset of 2,000 simulations; method: simulation experiments.
Algorithmic accuracy alone does not determine value; legitimacy and uptake hinge on the readiness of people and processes.
Thematic conclusion drawn from interviews, Likert surveys, and document analysis across cases indicating non-technical factors strongly influence uptake despite algorithmic performance metrics. (Sample size not reported.)
high null result Overcoming Resistance to Change: Artificial Intelligence in ... value realised / uptake of AI systems
The study utilized 3.87 million consumer comments from 127,846 product listings to build and validate models.
Data description reported in paper: 3.87 million consumer comments and 127,846 product listings used.
high null result Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... dataset size (number of consumer comments and product listings)
The study's measurement model is supported by Composite Reliability (CR), Average Variance Extracted (AVE), and several model-fit indicators.
Paper explicitly states CR, AVE, and model-fit indices were used and supported the construct measurements and SEM.
high null result A Machine Learning Perspective on FinTech-Driven Inclusion: ... measurement model reliability and validity
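For readers unfamiliar with the two statistics named above, the sketch below computes them from standardized factor loadings using the conventional formulas CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = Σλ² / n. The loadings are made-up illustrative values, not figures from the paper, and the function names are hypothetical.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float]) -> float:
    """Composite Reliability from standardized loadings of one construct."""
    total = sum(loadings)
    # Error variance of each indicator is 1 - loading^2 under standardization.
    error = sum(1.0 - lam * lam for lam in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """Average Variance Extracted: mean squared standardized loading."""
    return sum(lam * lam for lam in loadings) / len(loadings)


# Hypothetical loadings for a three-item construct.
example_loadings = [0.8, 0.7, 0.9]
print(round(composite_reliability(example_loadings), 3))
print(round(average_variance_extracted(example_loadings), 3))
```

Commonly cited rules of thumb treat CR of at least 0.7 and AVE of at least 0.5 as acceptable; the excerpt does not state which thresholds the authors applied.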
Principal Component Analysis (PCA) identified the main constructs related to adoption of FinTech and perceived algorithmic trust.
Paper reports using PCA to identify constructs underlying adoption and perceived algorithmic trust prior to CFA/SEM.
high null result A Machine Learning Perspective on FinTech-Driven Inclusion: ... construct validity for adoption and perceived algorithmic trust
Structured questionnaires were administered to 400 respondents in both city and rural areas of developing countries.
Method section statement specifying a quantitative research design and that structured questionnaires were sent to 400 respondents.
high null result A Machine Learning Perspective on FinTech-Driven Inclusion: ... survey responses (study data collection)
Molecular representations discussed include string-based methods, topological models, five key categories of Graph Neural Networks (GNNs), 3D-aware Geometric Deep Learning (GDL), emerging Quantum Machine Learning (QML), and Hybrid Quantum-Classical Neural Networks (HQNNs).
Taxonomy and descriptive enumeration of representation classes provided by the review (no empirical comparison or performance claims quantified in the provided text).
high null result Artificial intelligence in drug discovery from advanced mole... categorization of molecular representation methods
The study combines theoretical analysis with quantitative empirical research using survey data from Bosnia and Herzegovina analyzed by regression.
Paper summary states the methodological approach: theoretical analysis plus a quantitative empirical study based on survey data from Bosnia and Herzegovina, analyzed with regression methods. No further methodological details or sample size provided in the summary.
high null result Application of artificial intelligence in e-commerce methodological approach (theory + survey/regression)
The long-term dynamic effects of AI on resilience remain unverified and require longer-term data.
Authors explicitly state the need for longer time-series data to validate long-term dynamics.
high null result The impact mechanism of artificial intelligence on the resil... long-term dynamic effects of AI on SCR
Enterprise-level indicators used in the study do not directly capture supply chain network structure and node dependencies.
Explicit limitation noted by the authors about measurement and scope.
high null result The impact mechanism of artificial intelligence on the resil... accuracy/completeness of supply chain network characterization
The study's sample is limited to listed manufacturing companies, so conclusions should be applied cautiously to small and medium-sized enterprises (SMEs).
Explicit limitation stated by the authors in the paper.
high null result The impact mechanism of artificial intelligence on the resil... generalizability of findings to SMEs
Mediation and moderation models are leveraged to explore how AI enhances resilience via resource allocation optimization, productivity, and technological innovation, and how conditional factors (e.g., agility) affect these links.
Authors state they used mediation and moderation models on firm-level data to test mechanisms and conditional effects.
high null result The impact mechanism of artificial intelligence on the resil... supply chain resilience (SCR) and mediators/moderators (TFP, technological innov...
The study uses data on A-share listed manufacturing companies from 2011 to 2023 and applies a multi-period difference-in-differences (DID) model to assess AI's impact on SCR.
Methods description provided in the paper summary: sample timeframe and econometric approach explicitly stated.
high null result The impact mechanism of artificial intelligence on the resil... supply chain resilience (SCR) (target of analysis)
A randomly sampled coalition of equal size remains largely ineffective at increasing platform spending / wages.
Theoretical comparison in the model between targeted coalitions and randomly sampled coalitions of the same size; analytical results showing limited impact for random coalitions.
high null result Stochastic wage suppression on gig platforms and how to orga... change in platform spending / worker wages due to coalition action
We contribute junior–senior accounts on their usage of agentic AI through a three-phase mixed-methods study: ACTA combined with a Delphi process with 5 seniors, an AI-assisted debugging task with 10 juniors, and blind reviews of junior prompt histories by 5 more seniors.
Authors' methodological description of the study design and participant counts as reported in the paper.
high null result From Junior to Senior: Allocating Agency and Navigating Prof... Study design / data collection approach (ACTA + Delphi; task experiment; blind r...
The article examines the socioeconomic implications of AI-driven automation through the lens of political economy and labor sociology.
Methodological statement in the paper indicating theoretical framing and disciplinary approaches; no empirical sample reported in the abstract.
In the patent citation network, neither technological diversity nor technological proximity shows a significant impact on main path formation.
Layer-specific ERGM results for the patent citation network reporting non-significant coefficients for variables measuring technological diversity and technological proximity.
high null result Mapping China’s digital transformation: a multilayer network... effect of diversity and proximity on main path formation (patent layer)
We introduce a new benchmark QuantSightBench to assess prediction-interval forecasting capability and evaluate frontier models under multiple settings, assessing both empirical coverage and interval sharpness.
Methodological contribution reported in the paper: creation of QuantSightBench and its use to evaluate models on empirical coverage and sharpness (paper describes benchmark and evaluation procedure; specific task/sample counts not given in excerpt).
high null result QuantSightBench: Evaluating LLM Quantitative Forecasting wit... benchmark introduction / evaluation of empirical coverage and interval sharpness
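The two quantities named above can be illustrated with a minimal sketch, assuming the usual definitions: empirical coverage is the fraction of realized values that fall inside their predicted intervals, and sharpness is the mean interval width (narrower is sharper). The function name and data are hypothetical, and the benchmark's exact scoring rules are not given in the excerpt.

```python
from typing import Sequence, Tuple


def coverage_and_sharpness(
    intervals: Sequence[Tuple[float, float]],
    outcomes: Sequence[float],
) -> Tuple[float, float]:
    """Return (empirical coverage, mean interval width).

    Hypothetical helper; intervals are (lower, upper) pairs, bounds inclusive.
    """
    if not intervals or len(intervals) != len(outcomes):
        raise ValueError("need one non-empty interval list matching the outcomes")

    # Count outcomes that land inside their own interval.
    hits = sum(1 for (lo, hi), y in zip(intervals, outcomes) if lo <= y <= hi)
    coverage = hits / len(outcomes)

    # Average width across all intervals.
    mean_width = sum(hi - lo for lo, hi in intervals) / len(intervals)
    return coverage, mean_width


# Made-up forecasts: four intervals and the values later observed.
predicted = [(0, 10), (5, 6), (2, 4), (1, 3)]
observed = [5, 7, 3, 3]
print(coverage_and_sharpness(predicted, observed))
```

The two numbers pull against each other: very wide intervals reach high coverage trivially, so coverage is typically read against the nominal level (for example 90%) together with the width.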
Technology-driven recruitment encompasses Applicant Tracking Systems (ATS), AI-powered screening, video-based interviews, gamified assessments, and data analytics.
Conceptual description in the paper's introduction/background defining the scope of 'technology-driven recruitment'.
high null result A Study on the Effectiveness of Technology-Driven Recruitmen... definition / scope of recruitment technology
The study employed a mixed-methods research design combining a quantitative survey of 150 HR professionals and recruiters across manufacturing, IT, banking, and education sectors with qualitative case study analysis of four organizations in Chhatrapati Sambhajinagar.
Explicit methodological statement in the paper: quantitative survey (N=150) across specified sectors + qualitative case studies of 4 organizations in Chhatrapati Sambhajinagar.
The review is a focused qualitative evidence synthesis and the proposed governance model is an evidence-informed conceptual framework that warrants future empirical validation.
Authors' explicit framing of the review approach and caveat calling for empirical validation of the proposed model.
high null result Artificial Intelligence in the Labor Market: Evidence on Wor... need for future empirical validation
Given the focused Title/Abstract/Keywords query and the small, heterogeneous corpus, the findings are interpreted as a scoped evidence map rather than an exhaustive census of all AI-and-work research.
Authors' explicit limitation statement referencing the search strategy (title/abstract/keywords focus), small number of included studies (n=19), and heterogeneity of studies.
high null result Artificial Intelligence in the Labor Market: Evidence on Wor... scope and generalizability of the review findings