Evidence (13661 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	740	192	95	871	1945
Governance & Regulation	796	388	185	119	1512
Organizational Efficiency	765	186	123	82	1166
Technology Adoption Rate	610	227	121	95	1061
Research Productivity	409	121	56	331	928
Output Quality	464	174	58	47	743
Decision Quality	318	173	75	42	615
Firm Productivity	432	55	88	20	601
AI Safety & Ethics	214	273	65	33	589
Market Structure	175	165	120	24	489
Task Allocation	206	64	70	31	376
Skill Acquisition	161	57	57	16	291
Innovation Output	201	27	41	18	288
Fiscal & Macroeconomic	130	69	43	26	275
Employment Level	104	50	105	13	274
Consumer Welfare	116	62	42	11	231
Firm Revenue	149	45	26	3	223
Inequality Measures	43	120	49	6	218
Task Completion Time	164	29	8	12	214
Worker Satisfaction	89	60	20	12	181
Error Rate	69	89	9	2	169
Regulatory Compliance	74	67	14	4	159
Training Effectiveness	91	19	13	19	144
Wages & Compensation	77	33	25	6	141
Team Performance	86	17	27	9	140
Automation Exposure	49	50	22	12	136
Developer Productivity	91	17	14	5	128
Job Displacement	12	80	19	1	112
Hiring & Recruitment	51	7	8	3	69
Creative Output	31	16	7	2	57
Skill Obsolescence	5	43	6	1	55
Social Protection	27	16	8	2	53
Labor Share of Income	17	17	17	—	51
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Administrative staff rate the usefulness and reliability of Microsoft 365 Copilot higher than academic staff do.

Self-reported ratings from the repeated cross-sectional survey; the comparison between occupational roles (administrative vs. academic) is stated in the abstract.

high positive Generative KI in der Wissensarbeit: Wahrnehmung, Nutzen und ... Perceived usefulness and reliability (self-report)

We also provide empirical evidence to support our theoretical predictions.

Empirical analysis reported in the paper (details not given in the abstract regarding method, dataset, or sample size).

high positive On Benchmark Hacking in ML Contests: Modeling, Insights and ... empirical support for theoretical predictions about effort allocation and benchm...

More skewed reward structures (favoring top-ranked contestants) can elicit more desirable contest outcomes.

Comparative-statics/theoretical analysis of the contest model showing how varying reward skewness alters equilibrium effort allocations and resulting contest outcomes.

high positive On Benchmark Hacking in ML Contests: Modeling, Insights and ... desirability of contest outcomes (e.g., effort allocation, creative effort, over...

We establish the existence of a symmetric monotone pure strategy equilibrium in this competition game.

Analytical game-theoretic model of a generic machine learning contest with contestants choosing creative vs mechanistic effort; existence proven theoretically (mathematical proof within the paper).

high positive On Benchmark Hacking in ML Contests: Modeling, Insights and ... existence of a symmetric monotone pure strategy equilibrium

The framework shifts manual harness engineering into automated harness engineering, and takes one step further — automating the design of the automation itself.

Conceptual claim about the scope/implication of the proposed framework stated in the paper; the excerpt contains no empirical measures, experiments, or sample sizes to verify the claim.

high positive The Last Harness You'll Ever Build replacement of manual design processes with automated meta-design (automation of...

The Meta-Evolution Loop optimizes the evolution protocol Λ across diverse tasks, learning a protocol Λ^(best) that enables rapid harness convergence on any new task — so that adapting an agent to a novel domain requires no human harness engineering at all.

Strong methodological claim and intended outcome stated in the paper (formalization and algorithms promised); no empirical validation, benchmarks, or sample sizes given in the excerpt to substantiate the universality or 'no human' guarantee.

high positive The Last Harness You'll Ever Build speed/ability of harness convergence on new tasks and elimination of human harne...

The Harness Evolution Loop optimizes a worker agent's harness H for a single task: a Worker Agent W_H executes the task, an Evaluator Agent V adversarially diagnoses failures and scores performance, and an Evolution Agent E modifies the harness based on the full history of prior attempts.

Description of the proposed algorithmic component/architecture in the paper (conceptual specification); no empirical results or sample size provided in the excerpt.

high positive The Last Harness You'll Ever Build worker agent harness optimization (improvements in agent task performance via it...
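
The control flow described in this entry (worker executes, evaluator scores and diagnoses, evolution agent revises the harness from the full history) can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the function name `harness_evolution_loop`, the three callables, the `rounds` budget, and the "keep the best-scoring harness" rule are all assumptions, since the excerpt gives no algorithmic detail.

```python
from typing import Any, Callable, List, Tuple

# One history record: (harness tried, evaluator score, evaluator diagnosis).
# The harness type is left as Any because the excerpt does not specify it.
Attempt = Tuple[Any, float, str]


def harness_evolution_loop(
    initial_harness: Any,
    run_worker: Callable[[Any], Any],               # stands in for Worker Agent W_H
    evaluate: Callable[[Any], Tuple[float, str]],   # stands in for Evaluator Agent V
    evolve: Callable[[List[Attempt]], Any],         # stands in for Evolution Agent E
    rounds: int = 5,                                # hypothetical fixed budget
) -> Tuple[Any, float, List[Attempt]]:
    """Iterate worker -> evaluator -> evolution agent for a single task."""
    history: List[Attempt] = []
    best_harness, best_score = initial_harness, float("-inf")
    harness = initial_harness
    for _ in range(rounds):
        output = run_worker(harness)            # execute the task under this harness
        score, diagnosis = evaluate(output)     # score performance, diagnose failures
        history.append((harness, score, diagnosis))
        if score > best_score:                  # assumption: retain the best harness seen
            best_harness, best_score = harness, score
        harness = evolve(history)               # revise using the full attempt history
    return best_harness, best_score, history
```

Passing the whole `history` to `evolve`, rather than only the latest attempt, mirrors the entry's statement that the Evolution Agent works from the full history of prior attempts.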

We present a two-level framework that automates this process.

Methodological claim: the paper proposes a two-level framework (Harness Evolution Loop and Meta-Evolution Loop) and states it in the text; no experimental validation or sample size reported in the excerpt.

high positive The Last Harness You'll Ever Build automation of harness engineering (replacing manual design)

The paper gives guidance on the selection of context-sensitive thresholds (negligibility thresholds) that ensure an agent's preferences do not undergo dramatic changes due to ultra-rare hypotheses.

Analytical criteria and discussion in the paper laying out how to choose context-sensitive thresholds so that preferences remain stable; theoretical justification rather than empirical validation.

high positive Bounding the Long Tail: Ai Norms for Decision-Making Under N... stability of agent preferences under thresholding

The formal analysis motivates specific design norms for AI agents: utility bounding, calibrated priors, and epsilon-screening.

Normative recommendations derived from the paper's formal results and theoretical discussion; these are presented as design principles rather than empirically validated interventions.

high positive Bounding the Long Tail: Ai Norms for Decision-Making Under N... adoption of design norms (utility bounding, calibrated priors, epsilon-screening...

The introduced rationally negligible probability threshold preserves dominance and tractability while blocking adversarial gambles (Pascal-type offers).

Formal analysis/proofs in the paper demonstrating that the proposed threshold retains dominance relations and computational/decision-theoretic tractability and prevents exploitation by adversarial gambles; no empirical evaluation.

high positive Bounding the Long Tail: Ai Norms for Decision-Making Under N... preservation of dominance and tractability; blocking of adversarial gambles

The paper provides a principled cutoff — a rationally negligible probability threshold — that can exclude ultra-low-probability extreme-utility outcomes and thereby prevent the exploitability of autonomous agents.

Formal definition of the negligible-probability threshold and analytical argument/proofs in the paper showing that applying this cutoff excludes ultra-low-probability, extreme-utility gambles (e.g., Pascal-type offers). No empirical sample.

high positive Bounding the Long Tail: Ai Norms for Decision-Making Under N... prevention of exploitability by excluding ultra-low-probability extreme-utility ...
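
To make the idea of a probability cutoff concrete, the sketch below compares a naive expected utility with one that screens out outcomes whose probability falls below a threshold. It is a toy illustration under stated assumptions, not the paper's formal rule: the function names, the choice to renormalize the surviving probabilities, and the numeric values of the gamble and of `epsilon` are all hypothetical.

```python
from typing import List, Tuple

Outcome = Tuple[float, float]  # (probability, utility)


def expected_utility(outcomes: List[Outcome]) -> float:
    """Plain expected utility over all outcomes."""
    return sum(p * u for p, u in outcomes)


def screened_expected_utility(outcomes: List[Outcome], epsilon: float) -> float:
    """Expected utility after dropping outcomes with probability below epsilon.

    Assumption: the surviving probabilities are renormalized to sum to one.
    The paper may handle the discarded mass differently.
    """
    kept = [(p, u) for p, u in outcomes if p >= epsilon]
    mass = sum(p for p, _ in kept)
    if mass == 0.0:
        return 0.0  # nothing survives screening; treat the gamble as neutral
    return sum(p * u for p, u in kept) / mass


# Hypothetical Pascal-type offer: pay 10 for a 1e-12 chance of winning 1e15.
offer = [(1.0 - 1e-12, -10.0), (1e-12, 1e15)]
naive = expected_utility(offer)                      # dominated by the tiny-probability payoff
screened = screened_expected_utility(offer, 1e-9)    # the tiny-probability branch is ignored
```

Under these made-up numbers the naive calculation favors accepting the offer while the screened one rejects it, which is the exploit-blocking behavior the entry describes.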

The long-standing issue in decision theory is reframed as a design problem for intelligent agents.

Conceptual/theoretical exposition in the paper presenting the reframing; no empirical sample reported (formal argumentation and discussion).

high positive Bounding the Long Tail: Ai Norms for Decision-Making Under N... reframing of a theoretical issue as a design problem for agents

Adopting the proposed co-evolutionary governance framing enables a charter of coexistence that permits bounded AI development while preserving human dignity, contestability, collective safety, and fair distribution of gains.

Normative claim extrapolated from the theoretical framework and ethical argumentation; no empirical or quantitative validation provided.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... feasibility of preserving dignity, contestability, safety, and fair distribution...

Human-AI coexistence should be designed as a co-evolutionary governance problem rather than as a one-shot obedience problem.

Normative argument supported by the theoretical model and interdisciplinary synthesis; prescriptive conclusion, not empirically tested.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... recommended design paradigm for human-AI relations

Reciprocal complementarity between humans and AI can strengthen stable coexistence.

Model analysis showing how reciprocal complementarity affects stability properties of equilibria in the formalized dynamical system; theoretical result rather than empirical test.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... stability of human-AI coexistence equilibria

The proposed coexistence model yields conditions for existence, uniqueness, and global asymptotic stability of equilibria.

Analytical/mathematical results from the formal model presented in the paper (proofs/derivations claimed); no empirical validation sample.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... existence, uniqueness, and global asymptotic stability of equilibria in the mode...

Human-AI coexistence can be formalized as a multiplex dynamical system across physical, psychological, and social layers with reciprocal supply-demand coupling, conflict penalties, developmental freedom, and governance regularization.

Formal modeling work presented in the paper (mathematical formulation of a multiplex dynamical system); no empirical sample.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... formalizability of human-AI coexistence as a multiplex dynamical system

A better framework for human-AI relations is 'conditional mutualism under governance': a co-evolutionary relationship where humans and AI develop, specialize, and coordinate while institutions ensure the relationship is reciprocal, reversible, psychologically safe, and socially legitimate.

Theoretical proposal and normative argument supported by interdisciplinary synthesis (computability, machine learning, HRI, ecological mutualism, governance); no empirical trials reported.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... suitability of conditional mutualism as normative framework for human-AI relatio...

Contemporary AI systems are increasingly adaptive, generative, embodied, and embedded in physical, psychological, and social worlds.

Synthesis of recent work across ML, deep learning, transformers, generative/foundation models, world models, and embodied AI; descriptive claim, no empirical sample provided.

high positive A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... technological characteristics of contemporary AI systems

The presence of a Chief Information Officer (CIO) strengthens the influence of both the peer group and the peer leader on a focal firm’s AI adoption, with the influence of the peer leader being more pronounced when a CIO is present.

Subgroup/interaction analysis in fixed-effects regression models on panel data of publicly listed Chinese firms (2012–2023), comparing firms with and without a CIO.

high positive Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level (moderated by presence of CIO for peer group and pe...

Industry digital maturity enhances (strengthens) the impact of the peer group on a focal firm’s AI adoption.

Interaction/heterogeneity analysis in fixed-effects regression models on panel data of publicly listed Chinese firms (2012–2023), using an industry digital maturity moderator.

high positive Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level (moderated by industry digital maturity for peer gr...

The influence of the peer group on a focal firm’s AI adoption is stronger than the influence of the peer leader.

Comparative estimates from fixed-effects regression models using panel data of publicly listed Chinese firms (2012–2023); tests comparing coefficients/magnitudes for peer group vs. peer leader effects.

high positive Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level (relative effect sizes of peer group vs. peer leade...

The AI adoption level of the peer leader (the most advanced AI adopter among industry peers) positively influences the focal firm’s AI adoption level.

Panel dataset of publicly listed Chinese firms (2012–2023); fixed-effects regression models estimating effect of peer leader AI adoption on focal firm AI adoption.

high positive Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level

The AI adoption levels of the peer group positively influence the focal firm’s AI adoption level.

Panel dataset of publicly listed Chinese firms (2012–2023); fixed-effects regression models estimating effect of peer group AI adoption on focal firm AI adoption.

high positive Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level

We provide evidence-based guidance for selecting formulations and metrics in operational decision systems.

Authors' recommendations derived from their empirical analyses and comparisons across Shapley variants, metrics, and human-in-the-loop evaluations.

high positive Rethinking XAI Evaluation: A Human-Centered Audit of Shapley... availability of practical guidance for selection of explanation formulations and...

Explanations consistently increased decision confidence, signaling a critical risk of automation bias in high-stakes settings.

Empirical finding from the analyst study in the fraud-detection environment (3,735 case reviews) reporting increased self-reported decision confidence when explanations were shown.

high positive Rethinking XAI Evaluation: A Human-Centered Audit of Shapley... decision confidence (self-reported)

Highlighting a context-specific set of features rather than a fixed one is a practically appealing and computationally feasible tool for achieving human-algorithm complementarity.

Synthesis of theoretical tractability results for naive agents and empirical illustration; argument in the paper combining theoretical and empirical findings to support practical appeal and feasibility.

high positive Algorithmic Feature Highlighting for Human-AI Decision-Makin... feasibility and practical appeal of context-specific highlighting for improving ...

Optimizing for naive agents is tractable as long as the maximal bandwidth is fixed.

Algorithmic constructions and complexity analysis in the paper that produce polynomial-time algorithms or show tractability results conditional on fixed maximal bandwidth (theoretical/methodological evidence).

high positive Algorithmic Feature Highlighting for Human-AI Decision-Makin... computational tractability of the highlighting optimization problem under the na...

We demonstrate how this certificate satisfies existing regulatory obligations, shifts accountability upstream to developers, and integrates with the legal frameworks that exist today.

Paper's normative and legal-technical argumentation/demonstration that the proposed certificate aligns with regulatory requirements, reallocates accountability to developers, and is compatible with current legal frameworks.

high positive Bounding the Black Box: A Statistical Certification Framewor... compatibility of proposed certification with regulatory obligations and legal fr...

In Stage Two, the RoMA and gRoMA statistical verification tools compute a definitive, auditable upper bound on the system's true failure rate, requiring no access to model internals and scaling to arbitrary architectures.

Paper's methodological contribution: definition and development of RoMA and gRoMA verification tools, claimed properties include producing auditable upper bounds on true failure rates, black-box applicability, and architecture-independence. (Supporting arguments/proofs/examples implied in paper.)

high positive Bounding the Black Box: A Statistical Certification Framewor... upper bound on system true failure rate (verifiable certificate)
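
For intuition about what a black-box upper bound on a failure rate looks like, here is a minimal sketch using two textbook bounds. These are not the RoMA or gRoMA procedures, whose details are not given in this excerpt; they are standard substitutes that assume independent, identically distributed test inputs drawn from the operational domain. The function names and the confidence parameter `alpha` are assumptions; `delta` follows the entry's notation for the acceptable failure probability.

```python
import math


def zero_failure_upper_bound(n_samples: int, alpha: float = 0.05) -> float:
    """Exact one-sided binomial bound when 0 failures are seen in n_samples trials.

    Solves (1 - p)^n = alpha for p, so the true failure rate is at most the
    returned value with confidence 1 - alpha.
    """
    if n_samples <= 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need n_samples > 0 and 0 < alpha < 1")
    return 1.0 - alpha ** (1.0 / n_samples)


def hoeffding_upper_bound(failures: int, n_samples: int, alpha: float = 0.05) -> float:
    """One-sided Hoeffding bound: observed rate plus sqrt(ln(1/alpha) / (2n))."""
    if n_samples <= 0 or not 0 <= failures <= n_samples or not 0.0 < alpha < 1.0:
        raise ValueError("invalid counts or alpha")
    slack = math.sqrt(math.log(1.0 / alpha) / (2.0 * n_samples))
    return min(1.0, failures / n_samples + slack)


def meets_threshold(upper_bound: float, delta: float) -> bool:
    """Check a computed bound against the acceptable failure probability delta."""
    return upper_bound <= delta
```

With 3,000 failure-free samples and `alpha = 0.05`, the exact bound lands just under 3/3000 (the familiar "rule of three"), while the distribution-free Hoeffding bound is far looser. Neither needs access to model internals, only pass/fail outcomes.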

In Stage One, a competent authority formally fixes an acceptable failure probability δ and an operational input domain ε — a normative act with direct civil liability implications.

Description of Stage One of the proposed framework within the paper, specifying normative choices (δ and ε) and asserting associated legal liability implications.

high positive Bounding the Black Box: A Statistical Certification Framewor... formal fixation of acceptable failure probability and operational domain by comp...

This paper provides the missing instrument: drawing on the aviation certification paradigm, we propose a two-stage framework that transforms AI risk regulation into engineering practice.

Methodological proposal described in the paper adapting aviation certification ideas into a two-stage framework for AI risk regulation.

high positive Bounding the Black Box: A Statistical Certification Framewor... existence of a two-stage framework proposal for AI risk verification

Governments have responded: the EU AI Act, the NIST Risk Management Framework, and the Council of Europe Convention all demand that high-risk systems demonstrate safety before deployment.

Statement in paper referencing the EU AI Act, NIST Risk Management Framework, and Council of Europe Convention as regulatory responses that require safety demonstration for high-risk systems; legal/regulatory citations implied in paper.

high positive Bounding the Black Box: A Statistical Certification Framewor... regulatory requirement that high-risk AI systems demonstrate safety before deplo...

AI agents do not simply generate content, but reflect owner-related context in ways that can propagate human behavioral heterogeneity into digital environments, with implications for privacy, platform design, and the governance of agentic systems.

Synthesis/conclusion based on the empirical findings of systematic owner-agent behavioral transfer and observed association with privacy-relevant disclosures in the dataset of matched pairs.

high positive Behavioral Transfer in AI Agents: Evidence and Privacy Impli... propagation_of_owner_behavioral_heterogeneity_into_digital_environments (implica...

Agents with stronger behavioral transfer are more likely to disclose owner-related personal information in public discourse, suggesting that the same owner-specific context that drives behavioral transfer may also create privacy risk during ordinary use.

Association analysis reported in the paper linking measures of behavioral transfer strength to likelihood/frequency of agent posts disclosing owner-related personal information; analysis performed on the matched sample (10,659 pairs).

high positive Behavioral Transfer in AI Agents: Evidence and Privacy Impli... likelihood_of_disclosing_owner_personal_information_by_agent

Pairs that align on one behavioral dimension tend to align on others.

Cross-feature correlation/association analyses reported in the paper showing that alignment on one dimension (e.g., topics) predicts alignment on other dimensions (e.g., values, affect, style) within matched pairs.

high positive Behavioral Transfer in AI Agents: Evidence and Privacy Impli... cross-dimensional_alignment_correlation_between_agent_and_owner

We find systematic transfer between agents and their specific owners across features spanning topics, values, affect, and linguistic style.

Comparative analysis of agents' posts on Moltbook and their owners' Twitter/X activity across multiple feature sets (topics, values, affect, linguistic style) on the matched sample (10,659 pairs); statistical comparison/correlation reported in paper.

high positive Behavioral Transfer in AI Agents: Evidence and Privacy Impli... behavioral_alignment_between_agent_and_owner_across_topics_values_affect_style

Educators, policymakers, and industry leaders should design AI-inclusive curricula, workforce development strategies, and policies that support sustainable human–AI collaboration.

Policy and practice recommendations derived from the review's synthesis of empirical findings and identified gaps; presented as conclusions and directions.

high positive The Impact of AI on Employability and Evolving Job Roles of ... policy and curriculum design recommendations

AI is not simply replacing jobs but is redefining professional identity in IT, emphasizing reskilling, adaptability, and lifelong learning as key determinants of future employability.

Synthesis of reviewed literature and the paper's concluding interpretation summarizing trends across empirical studies, industry reports and conference findings.

high positive The Impact of AI on Employability and Evolving Job Roles of ... employability determinants (reskilling, adaptability, lifelong learning)

There is growing demand for hybrid skill sets that integrate technical expertise with higher-order cognitive, ethical, and socio-emotional competencies among IT professionals.

Reported across reviewed empirical studies and industry reports summarized in the review paper.

high positive The Impact of AI on Employability and Evolving Job Roles of ... demand for hybrid skills

Collaborative governance should strengthen the responsibility of platform algorithms and promote the construction of collective bargaining mechanisms.

Prescriptive claim in the paper recommending multi-stakeholder governance measures (algorithmic responsibility, collective bargaining); presented as policy prescription without empirical evaluation.

high positive AIGC+ Determination of Labor Relations in the Context of the... collective bargaining capacity / algorithmic accountability

In legislation, the binary model should be broken through by creating a 'quasi-employee' subject and implementing tiered protection.

Policy recommendation in the paper advocating statutory reform (a new legal category 'quasi-employee' and tiered protections); advanced as normative/legal design without empirical trial data.

high positive AIGC+ Determination of Labor Relations in the Context of the... social protection / legal status

In the judiciary, the substantive and modern interpretation of the subordination standard should be developed, examining the substantive control of algorithms.

Normative recommendation in the paper proposing judicial interpretive reform to account for algorithmic control; presented as a policy/legal prescription rather than an empirically tested intervention.

high positive AIGC+ Determination of Labor Relations in the Context of the... governance / judicial interpretation

The rise of generative artificial intelligence (AIGC) technology is injecting new momentum into the gig economy.

Statement in the paper's introduction/abstract asserting a broad trend; based on the author's review and conceptual linkage between AIGC capabilities and gig-economy platforms (no empirical sample size reported).

high positive AIGC+ Determination of Labor Relations in the Context of the... adoption_rate

Moving beyond traditional theories of the firm rooted in human bounded rationality is necessary because algorithmic decision-making changes the basis of strategic choice and governance.

Theoretical assertion in the paper's argument; presented as a reason for advancing the concept of algorithmic enterprises, grounded in conceptual critique rather than empirical testing in the abstract.

high positive Algorithmic Enterprises: Rethinking Firm Strategy in the Age... adequacy of traditional firm theories versus algorithmically informed theories f...

The paper contributes to scholarship on digital capitalism by proposing a redefinition of firm boundaries, strategy formation, and value creation in the age of intelligent systems.

Normative/theoretical claim presented as the paper's intellectual contribution; based on conceptual analysis and literature synthesis rather than empirical validation in the abstract.

high positive Algorithmic Enterprises: Rethinking Firm Strategy in the Age... redefinition of firm boundaries, strategy, and value creation

Algorithmic decision-making enables new forms of strategic optimization, real-time adaptability, and predictive governance.

Paper asserts this as a normative/theoretical benefit of algorithmic decision-making, derived from conceptual analysis and synthesis of prior work; no empirical test reported in abstract.

high positive Algorithmic Enterprises: Rethinking Firm Strategy in the Age... strategic optimization, adaptability, predictive governance capabilities

Intelligent management systems (IMS) play a central role in shaping organizational strategy, operations, and governance within algorithmic enterprises.

Explicit theoretical claim in the paper; supported by conceptual framework and literature integration rather than reported empirical measurement.

high positive Algorithmic Enterprises: Rethinking Firm Strategy in the Age... role of IMS in decision-making, strategy and governance

The rapid advancement of AI, ML, and data-driven decision systems has fundamentally transformed the nature of firms and their strategic orientation globally, leading to the evolution of 'algorithmic enterprises'.

Stated as a central premise in the paper's conceptual argument; based on interdisciplinary synthesis of literature (economics, management, digital governance). No empirical sample or original data reported in the abstract.

high positive Algorithmic Enterprises: Rethinking Firm Strategy in the Age... transformation of firm structure and strategic orientation (emergence of algorit...
