Human-AI teams rarely achieve true synergy; closing the 'synergy gap' requires redesigning more than interfaces. A six-part sociotechnical framework shows why context, decision frameworks, participant roles, AI capabilities, interaction design and holistic evaluation must all be aligned to unlock combined performance gains.

Addressing the Synergy Gap: The Six Elements of the Design Space

Tommaso Turchi, Ben Wilson, Matt Roach, Alan Dix, Alessio Malizia · May 20, 2026

arXiv · theoretical · evidence: n/a · relevance: 8/10

The paper diagnoses a persistent 'synergy gap'—human-AI teams rarely exceed the performance of the best solo actor—and proposes a six-element sociotechnical framework (context, decision frameworks, participants, AI capabilities, interaction, evaluation) to guide designs that can close that gap.

AI is now embedded in healthcare, finance, policy, and many other domains, yet genuine human-AI synergy - combined performance that exceeds what either party achieves alone - is uncommon. Meta-analyses show that AI assistance tends to improve human performance compared to working alone, but studies finding true synergy are scarce. We call this persistent shortfall the synergy gap. Most current work treats human-AI combination as an engineering problem and concentrates on interpretability, trust calibration, or interface design. These matter, but they cover only part of what determines whether combination works. Closing the synergy gap, we argue, requires explicit engagement with a wider design space. We map that space through six interconnected elements: sociotechnical context, decision-making frameworks, human decision participants, AI capabilities, interaction, and holistic evaluation. For each element, we describe what it covers, how it shapes the others in practice, and what it implies for design. The result is a shared vocabulary for practitioners building hybrid systems, an analytical lens for researchers studying combination patterns, and a starting point for evaluators interested in the full quality of human-AI decision-making rather than accuracy alone.

Summary

Main Finding

The paper identifies a persistent "synergy gap": although AI assistance typically improves individual human performance, true human-AI synergy — combined performance that exceeds the best individual agent — is rare. Closing this gap requires treating human-AI decision systems as sociotechnical systems composed of six interdependent design elements (sociotechnical context, decision-making frameworks, human decision participants, AI capabilities, interaction, and holistic evaluation). The authors present this six-element design space as a shared vocabulary and analytical lens to guide design, study, and evaluation of hybrid decision systems rather than focusing narrowly on algorithms or interfaces.

Key Points

Definition of the problem
- Synergy gap: improvements from AI are common, but surpassing the best agent (human or AI) via genuine collaboration is uncommon.
- Existing work emphasizes interpretability, trust, and UI; these are necessary but insufficient.
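The distinction drawn above can be made concrete with a minimal sketch. It assumes all three scores share one scale where higher is better. The function names, the "augmentation" label, and the numeric shortfall are illustrative choices rather than definitions from the paper.

```python
def classify_outcome(human_alone: float, ai_alone: float, team: float) -> str:
    """Label a human-AI team result relative to the two solo baselines."""
    best_solo = max(human_alone, ai_alone)
    if team > best_solo:
        return "synergy"        # team beats the best solo agent
    if team > human_alone:
        return "augmentation"   # team beats only the unaided human
    return "no gain"            # team does not beat the unaided human


def shortfall_vs_best_solo(human_alone: float, ai_alone: float, team: float) -> float:
    """How far the team falls short of the best solo agent (0.0 if it does not)."""
    return max(0.0, max(human_alone, ai_alone) - team)


# Hypothetical accuracies: the AI helps the human, but the AI alone is still better.
print(classify_outcome(human_alone=0.70, ai_alone=0.85, team=0.80))
```

Under these made-up numbers the team counts as improved but not synergistic, which is the pattern the meta-analyses are said to report most often.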
The six elements of the design space
- Sociotechnical context: goals, stakeholder roles, organisational power relations, dynamics (time-sensitivity, uncertainty), task cardinality, contested objectives.
- Decision-making frameworks: underlying normative/prescriptive/descriptive theories (rational optimisation vs. heuristics/naturalistic decision-making), cognitive models shaping interaction design.
- Human decision participants: distributed expertise, cognitive capacities, psychological states (trust, risk tolerance), preferences, and skill evolution under automation.
- AI capabilities: prediction/classification/optimization, congruence between problem and formulation, adaptivity, informational scope, process and outcome transparency (uncertainty reporting, domain boundaries).
- Interaction: roles (peer, coach, provocateur), mode of intervention (information, critique, recommendation), control/initiative balance, temporal organisation, modality, and overlap/orthogonality of human vs. machine information.
- Holistic evaluation: beyond accuracy, covering process quality, sociotechnical fit, provenance and accountability, appropriate reliance, long-term human capability and wellbeing, integration with workflows.
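One way a practitioner might use the six elements as a checklist is sketched below. The `DesignSpace` class, its field names, and the free-text notes are hypothetical; the paper offers a vocabulary, not a data model.

```python
from dataclasses import dataclass, fields


@dataclass
class DesignSpace:
    """One free-text design note per element of the six-element space."""
    sociotechnical_context: str = ""
    decision_making_framework: str = ""
    human_participants: str = ""
    ai_capabilities: str = ""
    interaction: str = ""
    holistic_evaluation: str = ""

    def unspecified(self) -> list[str]:
        """Names of elements that have no design note yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# A draft that, like much current work, only covers the AI and the interaction.
draft = DesignSpace(
    ai_capabilities="risk classifier with uncertainty reporting",
    interaction="AI critiques the human's initial judgement",
)
print(draft.unspecified())
```

The returned list names the elements a design has not yet addressed, which mirrors the paper's point that interface and model choices cover only part of the space.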
Design recommendations (high level)
- Start from context and participation dynamics before imposing technical solutions.
- Make theoretical assumptions explicit (how you think decisions are made).
- Support human skill development and maintain agency/traceability.
- Evaluate systems on combination-centered criteria, not only AI metrics.
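As an illustration of a combination-centered criterion, the sketch below scores appropriate reliance from a decision log. It is one possible operationalisation under simple assumptions (binary advice, known ground truth); the log format and category names are hypothetical, and the paper does not prescribe a formula.

```python
from collections import Counter


def reliance_profile(log: list[tuple[bool, bool]]) -> dict[str, float]:
    """Share of decisions in each reliance category.

    Each log entry is (ai_was_correct, human_followed_ai).
    """
    if not log:
        raise ValueError("log must contain at least one decision")
    counts = Counter()
    for ai_correct, followed in log:
        if ai_correct == followed:
            counts["appropriate"] += 1     # followed good advice or rejected bad advice
        elif followed:
            counts["over_reliance"] += 1   # followed incorrect advice
        else:
            counts["under_reliance"] += 1  # rejected correct advice
    categories = ("appropriate", "over_reliance", "under_reliance")
    return {name: counts[name] / len(log) for name in categories}


# Hypothetical log of four decisions.
example_log = [(True, True), (True, False), (False, True), (False, False)]
print(reliance_profile(example_log))
```

A profile like this separates two failure modes that a single accuracy figure hides: following wrong advice and ignoring right advice.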
Evidence base and motivation
- The argument is motivated by workshop synthesis (SYNERGY series), participant feedback, and literature and meta-analyses reporting that synergy is rare even though AI assistance frequently improves on unaided human performance.

Data & Methods

Type of study: conceptual/analytic preprint that maps a design space rather than reporting new empirical experiments.
Sources:
- Workshop discussions (SYNERGY Workshop series) and collaborative inquiry across researchers and practitioners.
- Synthesis and analysis of contemporary research and existing meta-analyses (e.g., Vaccaro et al., 2024; Lai et al., 2023; Jacobs et al., 2021; Berger et al., 2025) showing frequent improvement but scarce true synergy.
- Citations to works on human decision theory, transparency, human factors, and AI evaluation.
Methods: qualitative mapping and synthesis; development of a conceptual framework (six elements) and implications for design and evaluation. No new quantitative datasets or primary field experiments are presented.

Implications for AI Economics

Productivity and returns to AI are context-dependent
- Economic gains from AI depend critically on organizational, social, and interactional design — not just model accuracy. Estimates of productivity gains should incorporate deployment design and human complementarities.
Complementarity vs substitution nuance
- The framework foregrounds that AI may complement human capabilities when designed for synergy (role/interaction choice, training, workflow integration). Conversely, poor design can lead to deskilling and substitution. Models of labor impact should include design quality as an endogenous variable influencing complementarity.
Heterogeneous adoption and inequality
- Sectors and firms with capacity to invest in sociotechnical integration (training, process redesign, provenance systems) are more likely to achieve true synergy, potentially widening productivity and wage gaps across firms, sectors, and regions.
Measurement and evaluation changes for economic assessment
- Standard economic metrics (accuracy, task completion time) are insufficient. Economists and evaluators should incorporate process-quality metrics (appropriate reliance, decision defensibility, human-skill retention), long-run outcomes (skill formation, wellbeing), and transaction costs (coordination, accountability).
- Cost–benefit and ROI analyses should include investments in human capital, organisational change, and monitoring/provenance infrastructure required to achieve synergy.
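The accounting point can be shown with a small arithmetic sketch. The cost categories follow the bullet above; the function name and all figures are illustrative assumptions, not estimates from the paper.

```python
def synergy_roi(benefit: float, model_cost: float, training: float,
                org_change: float, monitoring: float) -> float:
    """Return on investment with integration costs counted alongside the model."""
    total_cost = model_cost + training + org_change + monitoring
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return (benefit - total_cost) / total_cost


# Illustrative numbers only: the same benefit, with and without integration costs.
model_only = synergy_roi(benefit=300.0, model_cost=100.0,
                         training=0.0, org_change=0.0, monitoring=0.0)
full_cost = synergy_roi(benefit=300.0, model_cost=100.0,
                        training=40.0, org_change=40.0, monitoring=20.0)
print(model_only, full_cost)  # 2.0 0.5
```

In this made-up case, ignoring training, organisational change, and monitoring overstates the return fourfold.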
Market structure and firm strategy
- Systems that enable synergy likely require tighter integration with firm processes and training, creating adoption frictions and barriers to entry. This can produce “integration moats” favoring incumbents or large platform providers that bundle models with organizational solutions.
- Pricing and procurement models: buyers should value and pay for system-level synergy (integration, training, evaluation), not just model performance; procurement specifications may need to require process and outcome transparency and traceability.
Policy and regulation implications
- Regulation should move beyond model-level metrics to mandate or incentivize combination-centered evaluation, accountability/provenance mechanisms, and human-skill preservation (e.g., reporting requirements that capture interaction logs, role assignment, and human oversight protocols).
- Policies to subsidize training or support smaller firms’ adoption of synergistic designs can help mitigate inequality.
Modeling recommendations for researchers
- Incorporate interaction terms between AI technical performance and organizational features (training, workflow redesign, multi-stakeholder participation) when estimating productivity effects.
- Use multi-level and longitudinal designs to capture dynamic adaptation (learning-by-using vs deskilling) and path dependence in human-AI capability evolution.
- Consider agent-based or multi-agent models that represent multiple human participants, varying preferences, and adaptive AI roles to study equilibrium outcomes and welfare.
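A minimal specification of the interaction-term recommendation might look as follows. All symbols are assumptions introduced for illustration: $y_{it}$ is a productivity outcome for unit $i$ at time $t$, $A_{it}$ is a measure of AI technical performance, $D_{it}$ is an index of organizational design features (training, workflow redesign, participation), and $X_{it}$ is a vector of controls.

```latex
y_{it} = \beta_0 + \beta_1 A_{it} + \beta_2 D_{it}
       + \beta_3 \,(A_{it} \times D_{it}) + \gamma^{\top} X_{it} + \varepsilon_{it},
\qquad
\frac{\partial y_{it}}{\partial A_{it}} = \beta_1 + \beta_3 D_{it}
```

Under this sketch, the return to a better model depends on design through $\beta_3$; a specification that omits the product term forces that dependence to zero.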
Empirical agenda
- Field experiments and quasi-experiments that randomize not only AI models but also design elements (role, transparency level, training, integration effort) to estimate causal effects on productivity, decision quality, and labor outcomes.
- Development of standardized process-quality metrics that can be incorporated into impact evaluations and regulatory reporting.
- Longitudinal studies on skills, trust calibration, and labor market outcomes arising from different human-AI interaction designs.
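The simplest version of the first item is a 2x2 design that randomizes one AI model choice and one design element. The sketch below estimates effects from arm means; the arm coding, function name, and outcome values are synthetic and purely illustrative.

```python
from statistics import mean


def factorial_effects(cells: dict[tuple[int, int], list[float]]) -> dict[str, float]:
    """Estimate effects from a 2x2 randomized design.

    Keys are (ai_model, design_element) arms coded 0/1; values are observed outcomes.
    """
    m = {arm: mean(outcomes) for arm, outcomes in cells.items()}
    ai_without_design = m[(1, 0)] - m[(0, 0)]  # AI effect when the design element is absent
    ai_with_design = m[(1, 1)] - m[(0, 1)]     # AI effect when the design element is present
    return {
        "ai_effect_without_design": ai_without_design,
        "ai_effect_with_design": ai_with_design,
        "interaction": ai_with_design - ai_without_design,  # how design changes the AI effect
    }


# Synthetic outcomes for the four arms (not real data).
synthetic = {
    (0, 0): [10.0, 12.0],
    (1, 0): [12.0, 14.0],
    (0, 1): [11.0, 13.0],
    (1, 1): [16.0, 18.0],
}
print(factorial_effects(synthetic))
```

A nonzero interaction in such a design would indicate that the payoff from the AI model depends on the design element, which a model-only experiment cannot detect.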

Overall, the paper argues that economists, managers, and policymakers should treat AI adoption as a sociotechnical investment problem: returns depend on design choices across multiple dimensions. Valuation, regulation, and measurement should reflect the complexity of achieving genuine human-AI synergy rather than optimizing standalone model performance.

Assessment

Paper Type: theoretical
Evidence Strength: n/a. The paper is a conceptual/theoretical framework and does not present new causal empirical tests; it synthesizes prior meta-analyses and literature rather than providing primary causal identification or new quantitative estimates.
Methods Rigor: medium. The work systematically maps a comprehensive six-element design space and grounds its arguments in existing meta-analyses and literature, showing conceptual rigor; however, it lacks original empirical validation, pre-registered hypotheses, or formal modeling that would raise methodological rigor to high.
Sample: No original dataset or experiment; the paper synthesizes existing meta-analyses and empirical studies across domains (e.g., healthcare, finance, policy) and constructs a conceptual framework describing factors that affect human-AI combination.
Themes: human_ai_collab, org_design, productivity
Generalizability:
- Not empirically validated: framework applicability to specific domains or tasks is untested.
- Broad conceptual scope may miss domain-specific constraints (regulatory, clinical, financial workflows).
- Cultural and organizational differences across firms/countries could limit transferability.
- Recommendations are high-level and may require operationalization for different user populations and task types.

Claims (10)

Claim	Direction	Confidence	Outcome	Details
AI is now embedded in healthcare, finance, policy, and many other domains. [Adoption Rate]	positive	high	embedding/adoption of AI in multiple domains	0.06
Genuine human-AI synergy (combined performance that exceeds what either party achieves alone) is uncommon. [Decision Quality]	negative	high	frequency/prevalence of human-AI combinations achieving superior combined performance	0.12
Meta-analyses show that AI assistance tends to improve human performance compared to working alone. [Decision Quality]	positive	high	human performance with AI assistance versus human performance alone	0.12
Studies finding true synergy are scarce. [Decision Quality]	negative	high	number/prevalence of studies reporting genuine synergy	0.12
We call this persistent shortfall the 'synergy gap.' [Other]	null_result	high	n/a (terminology defining a phenomenon)	0.02
Most current work treats human-AI combination as an engineering problem and concentrates on interpretability, trust calibration, or interface design. [Other]	null_result	high	research focus/themes in human-AI combination literature	0.06
Interpretability, trust calibration, and interface design matter, but they cover only part of what determines whether human-AI combination works. [Decision Quality]	mixed	high	completeness of current design foci relative to factors determining effective combination	0.02
Closing the synergy gap requires explicit engagement with a wider design space. [Organizational Efficiency]	positive	high	likelihood of closing the synergy gap given broader design engagement	0.02
We map that space through six interconnected elements: sociotechnical context, decision-making frameworks, human decision participants, AI capabilities, interaction, and holistic evaluation. [Other]	null_result	high	n/a (framework description)	0.02
The result is a shared vocabulary for practitioners building hybrid systems, an analytical lens for researchers studying combination patterns, and a starting point for evaluators interested in the full quality of human-AI decision-making rather than accuracy alone. [Decision Quality]	positive	high	utility for practitioners, researchers, and evaluators regarding human-AI combination quality	0.02