Moral fit matters as much as technical accuracy: AI systems whose decision logic conflicts with stakeholders' moral intuitions risk rejection, mistrust and harmful outcomes in sensitive settings; policymakers and designers must address multi-stakeholder moral alignment alongside technical alignment.

Smart But Not Moral? Moral Alignment In Human-AI Decision-Making

Christiane Ernst, Luis Gutmann, Domenique Zipperling, Kathrin Figl, Niklas Kühl · April 15, 2026

Source: arXiv · Paper type: theoretical · Evidence: n/a · Relevance: 7/10

Moral alignment — the perceived congruence between an AI system's decision logic and stakeholders' moral intuitions — is a distinct and foundational dimension that shapes trust, acceptance, and the meaningful integration of AI in high-stakes decisions.

In high-stakes AI-supported decisions, considerations are not purely technical but involve moral judgments about fairness, responsibility, and harm. While prior research has focused mainly on functional or behavioral alignment, this paper argues that moral alignment may be a more fundamental dimension of human-AI decision-making. Moral alignment is defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders. Building on Moral Foundations Theory, the paper adopts a multi-stakeholder perspective and highlights why moral (mis)alignment matters for the meaningful integration of AI in sensitive contexts.

Summary

Main Finding

Perceived moral alignment between AI systems and human stakeholders, conceptualized as the congruence between the moral values embedded in an AI’s decision logic and a stakeholder’s moral intuitions, is argued to strongly shape perceptions, trust, reliance, and acceptance of AI-supported decisions. Because the paper is conceptual, this is a theoretical claim rather than an empirically tested result. Moral alignment is relational, context-dependent, and multi-stakeholder: alignment with those who hold decision authority (e.g., hiring managers) can disproportionately determine outcomes even when the system is technically accurate.

Key Points

Definition: Moral alignment = perceived congruence between an AI system’s reflected moral values and a stakeholder’s moral intuitions.
Theoretical framing: The paper uses Moral Foundations Theory (MFT) — Care, Loyalty, Authority, Purity, and Fairness (with Fairness split into Equality and Proportionality) — to systematically characterize moral differences that drive alignment or misalignment.
Distinct from functional alignment: Prior work focused on behavioral/accuracy alignment; moral alignment captures value-based acceptability that can override performance considerations (e.g., people may follow a less-accurate AI if it seems morally aligned).
Relational & multi-stakeholder nature: Moral alignment is not an intrinsic property of a system but depends on which stakeholder(s) are considered (developers, decision-makers, affected parties, auditors, regulators). Different stakeholders may have conflicting moral priorities, making universal alignment infeasible.
Power asymmetries matter: Alignment with powerful actors (those who interpret/authorize decisions) may determine whether AI recommendations are adopted, overridden, or contested.
Real-world relevance: Public controversies (e.g., the LinkedIn visibility dispute) show that perceived moral misalignment can turn technical system behavior into ethical and political disputes.
Future research directions specified: (1) examine varying stakeholder constellations, (2) explore types of moral conflict, and (3) probe degrees of alignment and their effects on behavior.

Data & Methods

Nature of the paper: Conceptual/theoretical TREO submission — synthesis and framing rather than new empirical data.
Methods used: literature review of relevant streams (trust/reliance on algorithms, value-alignment research, moral psychology) and application of Moral Foundations Theory to human–AI decision-making; presents a conceptual model (including an illustrative Figure 1 of stakeholder relationships).
What is NOT included: no primary empirical study or quantitative testing in this paper.
Suggested empirical approaches (implicit in the paper’s agenda): vignette/experimental studies manipulating AI moral framings and stakeholder roles; surveys measuring individual moral foundations and perceived alignment; field studies in high‑stakes domains (e.g., hiring) tracking reliance, override behavior, and outcomes; multi-stakeholder deliberation experiments to study distributional alignment effects.
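As a rough illustration of how a survey-based study might operationalize congruence, the sketch below compares a system's value profile with each stakeholder's moral-foundation profile. Everything in it is an assumption made for illustration: the paper defines moral alignment as perceived congruence and does not prescribe a metric, so cosine similarity is only a stand-in proxy, the function name `congruence` is hypothetical, and the profile numbers are invented.

```python
from math import sqrt

# The six foundations as listed in the summary (Fairness split in two).
FOUNDATIONS = ("care", "equality", "proportionality", "loyalty", "authority", "purity")


def congruence(system_profile, stakeholder_profile):
    """Cosine similarity between two foundation-weight profiles.

    Hypothetical proxy only: the paper does not specify how to measure
    moral alignment, and perceived alignment may differ from this score.
    """
    a = [system_profile[f] for f in FOUNDATIONS]
    b = [stakeholder_profile[f] for f in FOUNDATIONS]
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented example weights (e.g., 1-5 survey scores), not data from the paper.
system = {"care": 2, "equality": 1, "proportionality": 5,
          "loyalty": 3, "authority": 4, "purity": 1}
stakeholders = {
    "hiring_manager": {"care": 2, "equality": 2, "proportionality": 5,
                       "loyalty": 3, "authority": 4, "purity": 1},
    "applicant": {"care": 5, "equality": 5, "proportionality": 2,
                  "loyalty": 1, "authority": 1, "purity": 1},
}

# One system, several stakeholders: the score is relational, not intrinsic.
scores = {role: congruence(system, profile) for role, profile in stakeholders.items()}
for role, score in scores.items():
    print(f"{role}: {score:.2f}")
```

With these invented profiles the same system scores higher for the hiring manager than for the applicant, which mirrors the point that alignment depends on which stakeholder is considered.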

Implications for AI Economics

Adoption and diffusion: Moral misalignment can reduce adoption or meaningful use of otherwise high-performing AI systems, lowering realized productivity gains and slowing diffusion in markets where decisions are value-laden (e.g., hiring, lending, content moderation).
Welfare and efficiency: When decision-makers override accurate AI recommendations for moral reasons (or follow immoral but aligned systems), societal welfare and allocative efficiency can be affected. An accurate-but-misaligned AI may produce better technical outcomes but worse perceived legitimacy and compliance.
Strategic design and market segmentation: Firms may design AI tools to align with the moral priors of decision-makers (those with implementation power) rather than affected stakeholders, creating market segmentation and potential externalities for groups whose values are marginalized.
Distributional consequences: Moral alignment choices embed value trade-offs (e.g., equality vs. proportionality) that affect distributional outcomes across workers, consumers, and communities. Economic analyses must account for these normative impacts, not just performance metrics.
Regulation and governance costs: Regulation (e.g., the EU AI Act) sets baseline norms, but perceived alignment beyond regulatory compliance influences acceptance; firms may face higher compliance, monitoring, and reputational costs if moral misalignment triggers public contestation.
Measurement & evaluation: Economic assessments of AI (e.g., cost-benefit analyses) should expand their metrics beyond accuracy to include perceived moral alignment, legitimacy, likelihood of reliance/override, reputational risk, and stakeholder welfare measures.
Policy and managerial recommendations:
- Incorporate multi-stakeholder moral assessments into cost-benefit and rollout decisions.
- Use participatory design and stakeholder mapping to identify whose values the system should prioritize given organizational objectives and externalities.
- Track alignment-sensitive outcomes (override rates, appeals, litigation, user satisfaction) as part of deployment metrics.
- Anticipate strategic behavior: firms might prioritize alignment with in-house decision-makers, so regulators should consider protections for affected parties to avoid systematic value capture.
Research opportunities for AI economics: quantify GDP or sector-level impacts of moral misalignment, estimate welfare losses from trust deficits despite technical accuracy, design incentive mechanisms to internalize multi-stakeholder value externalities, and evaluate regulation that mandates transparency about value trade-offs embedded in AI decision logic.
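One of the alignment-sensitive deployment metrics mentioned above, the override rate, could be tracked along the following lines. This is a minimal sketch under stated assumptions: the log format (decision-maker, AI recommendation, final decision), the function name `override_rates`, and the sample records are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def override_rates(decisions):
    """Share of AI recommendations that each decision-maker overrode.

    `decisions` is an iterable of (decision_maker, ai_recommendation,
    final_decision) tuples; this record layout is assumed for illustration.
    """
    totals = defaultdict(int)
    overrides = defaultdict(int)
    for decision_maker, recommended, final in decisions:
        totals[decision_maker] += 1
        if recommended != final:
            overrides[decision_maker] += 1
    return {dm: overrides[dm] / totals[dm] for dm in totals}


# Invented sample log, not data from the paper.
log = [
    ("manager_a", "hire", "hire"),
    ("manager_a", "reject", "hire"),
    ("manager_b", "hire", "hire"),
    ("manager_b", "reject", "reject"),
]

rates = override_rates(log)
for decision_maker, rate in rates.items():
    print(f"{decision_maker}: {rate:.0%} of recommendations overridden")
```

A persistently high override rate for one group of decision-makers would not by itself show moral misalignment, but it could flag where a follow-up survey or interview is worth running.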

Assessment

Paper Type: theoretical
Evidence Strength: n/a. This is a conceptual/theoretical paper that does not present empirical tests or causal estimates, so there is no direct empirical evidence to rate.
Methods Rigor: n/a. The contribution is a conceptual synthesis and argument built on Moral Foundations Theory and multi-stakeholder reasoning rather than empirical methods; rigor pertains to logical coherence and literature coverage rather than statistical or experimental design.
Sample: No empirical sample; the paper develops a conceptual framework of 'moral alignment' by surveying relevant literature (Moral Foundations Theory, human-AI interaction, ethics, governance) and articulating implications for multi-stakeholder high-stakes decision contexts.
Themes: human_ai_collab, governance
Generalizability:
- No empirical validation: conclusions are theoretical and may not hold in practice without testing.
- Moral Foundations Theory and moral intuitions vary across cultures, so applicability may be culturally specific.
- Stakeholder heterogeneity (patients, judges, hiring managers, citizens) may limit uniform application of the framework.
- Context specificity: focused on 'high-stakes' decisions and may not generalize to low-stakes or routine automation.
- Policy and institutional differences across jurisdictions may affect relevance and implementation.

Claims (5)

Claim	Category	Direction	Confidence	Outcome	Details
In high-stakes AI-supported decisions, considerations are not purely technical but involve moral judgments about fairness, responsibility, and harm.	AI Safety and Ethics	positive	high	presence of moral judgments in decision-making	0.06
Prior research has focused mainly on functional or behavioral alignment rather than moral alignment.	AI Safety and Ethics	negative	high	focus/themes of prior AI alignment research	0.12
Moral alignment may be a more fundamental dimension of human-AI decision-making than functional or behavioral alignment.	Decision Quality	positive	high	relative fundamental status of moral alignment in human-AI decision-making	0.02
Moral alignment is defined as the perceived congruence between the values embedded in an AI system's decision logic and the moral intuitions of stakeholders.	AI Safety and Ethics	positive	high	perceived congruence between AI values and stakeholder moral intuitions (definition of 'moral alignment')	0.06
Drawing on Moral Foundations Theory and a multi-stakeholder perspective, moral (mis)alignment matters for the meaningful integration of AI in sensitive contexts.	Adoption Rate	positive	high	meaningful integration/adoption of AI in sensitive/high-stakes contexts	0.06