A hybrid rules-plus-learning system lets managers tune how work is split between humans and AI, converting autonomous AI actions into supervised collaborations as governance tightens. In benchmarked software and manufacturing scenarios, stronger governance sometimes raises output and lowers fatigue (notably in manufacturing), and moderate governance becomes more competitive as the learner accumulates experience.

HAAS: A Policy-Aware Framework for Adaptive Task Allocation Between Humans and Artificial Intelligence Systems

Vicente Pelechano, Antoni Mestre, Manoli Albert, Miriam Gil · May 04, 2026

arXiv · quasi-experimental · medium evidence · 8/10 relevance

HAAS pairs a rule-based governance layer with a contextual-bandit learner to adaptively allocate tasks between humans and AI. Its benchmark results show that tuning governance predictably shifts autonomy, that stronger governance can sometimes improve performance and reduce worker fatigue (notably in manufacturing), and that moderate governance becomes more competitive as the learner gains experience.

Deciding how to distribute work between humans and AI systems is a central challenge in organisational design. Most approaches treat this as a binary choice, yet the operational reality is richer: humans and AI routinely share tasks or take complementary roles depending on context, fatigue, and the stakes involved. Governing that distribution -- balancing efficiency, oversight, and human capability -- remains an open problem. This paper presents Human-AI Adaptive Symbiosis (HAAS), an implemented framework for adaptive task allocation in software engineering and manufacturing. HAAS combines two coupled components: a rule-based expert system that enforces governance constraints before any learning occurs, and a contextual-bandit learner that selects among feasible collaboration modes from outcome feedback. Task-agent fit is represented through five auditable cognitive dimensions and a five-mode autonomy spectrum -- from human-only to fully autonomous -- embedded in a reproducible benchmark spanning both domains. Three empirical findings emerge. First, governance is not a binary switch but a tunable design variable: tighter constraints predictably convert autonomous AI assignments into supervised collaborations, with domain-specific costs and benefits. Second, in manufacturing, stronger governance can improve operational performance and reduce fatigue simultaneously -- a workload-buffering effect that contradicts the usual framing of governance as pure overhead. Third, no single governance setting dominates across all contexts; moderate governance becomes increasingly competitive as the learner accumulates experience within the governed action space. Together, these findings position HAAS as a pre-deployment workbench for comparing and inspecting human--AI allocation policies before organisational commitment.

Summary

Main Finding

HAAS (Human–AI Adaptive Symbiosis) is a policy-aware, implemented framework and reproducible benchmark for adaptive task allocation between humans and AI. Its main empirical takeaways are:
- Governance is a tunable design variable: stronger governance predictably shifts assignments away from fully autonomous AI toward supervised/shared modes, producing context-specific trade-offs.
- In manufacturing, stronger governance can simultaneously improve operational performance and reduce human fatigue (a workload-buffering effect), contradicting the view of governance as pure overhead.
- No single governance setting dominates across contexts; moderate governance becomes more competitive as a contextual-bandit learner accrues experience within the governed action space.

Key Points

Framework architecture:
- Three layers: (1) cognitive characterisation of subtasks, (2) allocation engine combining a rule-based PolicyEngine and a contextual-bandit learner, (3) execution layer with five collaboration modes and human-state updates.
- Governance (PolicyEngine) is applied before the learner acts, so learning always occurs within organisationally acceptable bounds.
Task representation:
- Each subtask is scored on five auditable cognitive dimensions: repetitiveness (r), technical depth (τ), creativity (c), ambiguity (a), human interaction (h).
- A scalar AI affinity αAI is derived from those dimensions (weighted linear formula using r, τ, and transformations of c, a, h).
Collaboration/autonomy vocabulary:
- Five graded modes: Human-Only, Copilot, Peer, Supervised, Autonomous — covering fully human to fully AI and three intermediate shared modes.
Allocation engine:
- PolicyEngine: forward-chaining rule-based expert system that enforces governance constraints (e.g., autonomy caps, mandatory human validation, safety overrides).
- Learner: contextual bandit that selects among feasible modes and learns from immediate execution feedback; variants include UCB1, LinUCB, and Thompson Sampling, with discounted UCB for nonstationarity.
Human-state dynamics:
- Fatigue, trust, and deskilling (skill erosion from sustained delegation) are modelled explicitly and embedded in the reward signal, so the allocator adapts to evolving human capacity.
Benchmark & domains:
- Reproducible simulation benchmark spanning software engineering and manufacturing subtasks (task catalogue with examples and αAI values).
- Enables comparison of learned, heuristic, and fixed policies under different governance contracts and across domains.
Empirical questions addressed:
- Which allocation strategies perform best?
- How does governance intensity affect performance and mode mix?
- How transferable are governance settings and learned policies across scenarios and time horizons?

Data & Methods

Subtask characterisation:
- Five-dimension rubric with normalized 0–1 scores; AI affinity computed as αAI(s) = w_r·r + w_τ·τ + w_c·(1−c) + w_a·(1−a) + w_h·(1−h), with nonnegative weights summing to 1.
- Example tasks provided across software (e.g., boilerplate generation, API design, debugging) and manufacturing (e.g., precision assembly, visual inspection, AGV route management).
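The affinity formula above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Subtask` class, the equal weights, and the two example score vectors are assumptions made here for demonstration; the source only states that the weights are nonnegative and sum to 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subtask:
    """Scores on the five cognitive dimensions, each normalised to [0, 1]."""
    r: float    # repetitiveness
    tau: float  # technical depth
    c: float    # creativity
    a: float    # ambiguity
    h: float    # human interaction

# Hypothetical equal weights (an assumption; the paper's values are not given here).
EQUAL_WEIGHTS = {"r": 0.2, "tau": 0.2, "c": 0.2, "a": 0.2, "h": 0.2}

def ai_affinity(s, w=EQUAL_WEIGHTS):
    """alpha_AI(s) = w_r*r + w_tau*tau + w_c*(1-c) + w_a*(1-a) + w_h*(1-h)."""
    if any(v < 0 for v in w.values()) or abs(sum(w.values()) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    return (w["r"] * s.r + w["tau"] * s.tau
            + w["c"] * (1 - s.c) + w["a"] * (1 - s.a) + w["h"] * (1 - s.h))

# Illustrative scores only; these are not values from the paper's task catalogue.
boilerplate = Subtask(r=0.9, tau=0.3, c=0.1, a=0.1, h=0.0)
api_design = Subtask(r=0.2, tau=0.8, c=0.8, a=0.7, h=0.6)
```

Under these assumed scores, the repetitive, low-ambiguity subtask gets a higher affinity than the creative, ambiguous one, which is the ordering the formula is designed to produce.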
Governance model:
- Rule-based PolicyEngine (classical forward-chaining knowledge base) that filters feasible collaboration modes before learning; rules implement organisational constraints and safety/oversight requirements.
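A rough sketch of how a forward-chaining filter of this kind might work is shown here. The rule set, the fact names (`safety_critical`, `regulated_domain`, `low_trust`), and the `cap:` convention are hypothetical; the source describes the PolicyEngine only at the level of "rules that filter feasible modes".

```python
# Modes ordered from least to most AI autonomy (names taken from the summary).
MODES = ["Human-Only", "Copilot", "Peer", "Supervised", "Autonomous"]

# Hypothetical rules: when every premise is a known fact, assert the conclusion.
RULES = [
    ({"safety_critical"}, "require_human_validation"),
    ({"regulated_domain"}, "require_human_validation"),
    ({"require_human_validation"}, "cap:Supervised"),
    ({"low_trust"}, "cap:Peer"),
]

def forward_chain(facts, rules=RULES):
    """Fire rules repeatedly until no new fact can be derived (a fixpoint)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def feasible_modes(facts, rules=RULES):
    """Return the modes at or below the tightest autonomy cap that was derived."""
    derived = forward_chain(facts, rules)
    caps = [MODES.index(f.split(":", 1)[1]) for f in derived if f.startswith("cap:")]
    cap = min(caps, default=len(MODES) - 1)  # no cap derived: all modes allowed
    return MODES[: cap + 1]
```

For example, `feasible_modes({"safety_critical"})` derives a validation requirement, then a cap at Supervised, so Autonomous is removed before any learner sees the action set.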
Learning algorithm:
- Contextual-bandit setup (per-subtask decisions with immediate feedback). Algorithms considered include UCB variants, LinUCB (for context), Thompson Sampling; discounted-UCB used for nonstationarity.
- Reward combines efficiency/quality outcomes and human-state penalties/benefits (fatigue, trust, deskilling).
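To illustrate learning inside a governed action space, the sketch below runs plain (non-contextual) UCB1 restricted to a feasible subset of modes. It is a simplification of the setup described above: the quality and fatigue-cost tables, the `fatigue_weight` parameter, and the deterministic reward are all invented for this example, and LinUCB or Thompson Sampling would replace the selection rule in a contextual variant.

```python
import math

MODES = ["Human-Only", "Copilot", "Peer", "Supervised", "Autonomous"]

class MaskedUCB1:
    """UCB1 that only ever chooses among the modes the governance layer allows."""
    def __init__(self, arms):
        self.counts = {a: 0 for a in arms}
        self.means = {a: 0.0 for a in arms}
        self.t = 0  # total number of updates so far

    def select(self, feasible):
        for arm in feasible:  # try each feasible arm once first
            if self.counts[arm] == 0:
                return arm
        def index(arm):  # empirical mean plus the standard UCB1 bonus
            return self.means[arm] + math.sqrt(2.0 * math.log(self.t) / self.counts[arm])
        return max(feasible, key=index)

    def update(self, arm, reward):
        self.t += 1
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

# Hypothetical per-mode outcomes (assumptions, not benchmark values).
QUALITY = {"Human-Only": 0.70, "Copilot": 0.75, "Peer": 0.80,
           "Supervised": 0.85, "Autonomous": 0.90}
FATIGUE_COST = {"Human-Only": 0.6, "Copilot": 0.4, "Peer": 0.3,
                "Supervised": 0.1, "Autonomous": 0.0}

def toy_reward(mode, fatigue_weight=0.5):
    """Quality minus a weighted human-state penalty (deterministic toy signal)."""
    return QUALITY[mode] - fatigue_weight * FATIGUE_COST[mode]

def run(feasible, rounds=1000):
    learner = MaskedUCB1(MODES)
    for _ in range(rounds):
        mode = learner.select(feasible)
        learner.update(mode, toy_reward(mode))
    return learner.counts

governed = run(feasible=MODES[:4])  # Autonomous excluded by governance
ungoverned = run(feasible=MODES)    # no constraint
```

With these toy numbers the ungoverned learner concentrates on Autonomous, while the governed learner never selects it and concentrates on Supervised instead, mirroring the shift from autonomous to supervised assignments that the summary reports.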
Human-state modelling:
- State variables (fatigue, trust, skill) evolve based on executed collaboration mode and outcomes; these feed back into rewards and future allocations.
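One possible shape for such a state update is sketched below. The update rules and every coefficient are assumptions chosen only to show the qualitative behaviour described above (fatigue rising with human load, skill eroding under sustained delegation, trust responding to outcomes); the paper's actual dynamics are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical share of the work carried by the human in each mode.
HUMAN_LOAD = {"Human-Only": 1.0, "Copilot": 0.75, "Peer": 0.5,
              "Supervised": 0.25, "Autonomous": 0.0}

@dataclass
class HumanState:
    fatigue: float = 0.0
    trust: float = 0.5
    skill: float = 1.0

def clip01(x):
    return max(0.0, min(1.0, x))

def step(state, mode, success):
    """Return the next human state after one subtask executed in `mode`."""
    load = HUMAN_LOAD[mode]
    # Fatigue accumulates with human load and recovers when work is delegated.
    fatigue = clip01(state.fatigue + 0.10 * load - 0.05 * (1 - load))
    # Skill is maintained by practice and erodes under delegation (deskilling).
    skill = clip01(state.skill + 0.02 * load - 0.02 * (1 - load))
    # Trust moves only when the AI contributed, up on success and down on failure.
    if load < 1.0:
        trust = clip01(state.trust + (0.05 if success else -0.10))
    else:
        trust = state.trust
    return HumanState(fatigue=fatigue, trust=trust, skill=skill)

delegated = HumanState()
for _ in range(10):
    delegated = step(delegated, "Autonomous", success=True)

manual = HumanState()
for _ in range(5):
    manual = step(manual, "Human-Only", success=True)
```

In this toy model, ten fully delegated successes leave the worker rested and trusting but with lower skill, while five manual subtasks raise fatigue and leave trust unchanged; a reward that penalises both effects pushes the allocator toward intermediate modes.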
Evaluation:
- Simulated experiments within the benchmark compare performance metrics (throughput, quality), human wellbeing (fatigue), and mode distribution under varying governance intensities and policy classes (fixed heuristics, learned policies).
- Transferability analyses examine how policies and governance settings generalise across domains and longer horizons.
Reproducibility:
- Framework packaged as a configurable benchmarking artefact with fixed seeds, parameter tables, and a command-line runner.

Implications for AI Economics

Governance as an organisational instrument:
- Treat governance intensity (e.g., autonomy caps, mandatory human validation) as a design parameter in cost–benefit and adoption models. Firms can trade off short-run productivity gains from autonomy against longer-run oversight, liability, and skill-retention concerns.
Productivity and labour allocation:
- The HAAS simulations suggest that intermediate, governed human–AI collaboration can outperform naive full automation or rigid human-only regimes in some settings, although no single governance setting dominates. Economic models of automation should therefore account for graded allocation and dynamic adaptation rather than binary substitution.
- The workload-buffering effect (stronger governance reducing fatigue while improving performance in some settings) implies that governance can raise effective labor supply/quality by maintaining human capacity—altering estimates of net productivity gains from automation.
Human capital and deskilling:
- Explicit modeling of deskilling suggests firms face a trade-off between short-term efficiency (delegating more to AI) and long-term human capability. Investment in training, rotation policies, and governance can be economically optimal to prevent costly skill erosion.
Regulation, liability, and compliance costs:
- HAAS provides a pre-deployment workbench to quantify how regulatory constraints (e.g., human oversight mandates) change operational outcomes and mode mixes. This allows firms and regulators to evaluate compliance costs and safety–productivity trade-offs before deployment.
Market for governance-aware systems:
- Demand may grow for allocation engines that embed policy constraints and human-state considerations; such systems have value in regulated/high-stakes industries where oversight and human reliability matter.
Wage and labor-market effects:
- With graded allocation, tasks will split into more AI-centric, balanced (complementary), and human-centric categories. This heterogeneity affects task-level demand for skills and may shift wage structures toward roles requiring oversight, interpretation, and creative/ambiguous work.
Measurement and empirical strategy for economics research:
- Empirical studies of AI’s labour market impact should incorporate dynamic human-state variables (fatigue, trust, deskilling) and governance constraints. Static measures of task automatability are likely to misstate welfare and productivity effects.
Policy recommendations:
- Firms and policymakers should treat governance configurations as operational levers: moderate, adaptive governance plus learning can outperform extremes; policies encouraging pre-deployment simulation/benchmarking (as HAAS enables) would improve safe, efficient adoption.
Research directions for AI economics:
- Quantify long-run welfare impacts of deskilling vs. productivity gains; field experiments to validate simulated findings; calibrate human-state models with measured fatigue/productivity data; study heterogeneous worker responses and distributional effects across skill levels and sectors.

Limitations noted by the authors (relevant to economic interpretation):
- Results are simulation-based and parameter-dependent; external validity requires field calibration.
- Longer-term deskilling and labour-market equilibrium effects are not modelled endogenously.
- Single-agent allocation; broader organisational interactions, multi-agent strategic behavior, and adversarial settings remain open.


Assessment

Paper Type: quasi_experimental
Evidence Strength: medium. Findings are supported by implemented experiments in two domains (software engineering and manufacturing) within a reproducible benchmark, providing credible within-benchmark causal contrasts from controlled manipulations; however, evidence is limited to benchmark/lab settings with likely simulated or small-scale human-in-the-loop trials and short-run outcomes, so external validity to real-world firms is uncertain.
Methods Rigor: medium. The paper combines a clear modular architecture (rule-based governance + contextual-bandit learner), uses an auditable representation of task-agent fit, and reports systematic experiments across domains; but it appears to lack large-scale field randomization, detailed robustness checks across heterogeneous real-world teams, and long-run deployment evidence, leaving potential confounders and implementation challenges only partly addressed.
Sample: A custom, reproducible benchmark spanning representative software-engineering tasks (e.g., collaborative coding/review workflows) and manufacturing tasks (e.g., assembly/inspection); evaluated using the HAAS framework with the governance module and a contextual-bandit learner across multiple runs that appear to be simulated (the authors describe their results as simulation-based) rather than large-scale organisational deployments.
Themes: org_design, governance, human_ai_collab, productivity
Identification: Controlled comparisons in a reproducible benchmark: the authors manipulate governance constraints and the learner's action space, then compare operational outcomes (performance, fatigue) across settings using repeated trials; the contextual-bandit learner is evaluated within those manipulated environments. No broad field-randomized assignment in actual organisations is reported.
Generalizability:
- Benchmark and lab setting may not reflect operational complexity of real firms
- Limited task types (software engineering and specific manufacturing tasks) restrict domain coverage
- Likely small or simulated human samples; results may not scale to heterogeneous workforces
- Short-run experiments do not capture long-term adaptation, learning, and organizational change
- Governance rules and cognitive-dimension representations are hand-designed and may not transfer across contexts

Claims (7)

Claim	Topic	Direction	Confidence	Outcome	Details
HAAS combines a rule-based expert system that enforces governance constraints before any learning occurs, and a contextual-bandit learner that selects among feasible collaboration modes from outcome feedback.	Task Allocation	positive	high	mechanism for adaptive task allocation (selected collaboration mode)	0.48
Task–agent fit is represented through five auditable cognitive dimensions and a five-mode autonomy spectrum (from human-only to fully autonomous) embedded in a reproducible benchmark spanning software engineering and manufacturing.	Task Allocation	positive	high	representation of task–agent fit and benchmarking across domains	0.48
Governance is not a binary switch but a tunable design variable: tighter constraints predictably convert autonomous AI assignments into supervised collaborations, with domain-specific costs and benefits.	Task Allocation	mixed	high	distribution of collaboration modes / assignment types (autonomous vs supervised)	0.48
In manufacturing, stronger governance can improve operational performance and reduce fatigue simultaneously (a workload-buffering effect).	Firm Productivity	positive	high	operational performance and worker fatigue	0.48
This workload-buffering effect (governance improving performance while reducing fatigue) contradicts the usual framing of governance as pure overhead.	Organizational Efficiency	mixed	high	relationship between governance and combined measures of performance and fatigue	0.24
No single governance setting dominates across all contexts; moderate governance becomes increasingly competitive as the learner accumulates experience within the governed action space.	Task Allocation	mixed	high	relative performance of governance settings over learning/experience (competitiveness of moderate governance)	0.48
HAAS can serve as a pre-deployment workbench for comparing and inspecting human–AI allocation policies before organisational commitment.	Adoption Rate	positive	high	ability to compare and inspect allocation policies prior to deployment	0.24