AI decision-support modes do not directly boost novice teachers' task performance but shape it indirectly by changing users' confidence and trust; agreement between teachers and the AI substantially increases trust, confidence, and performance.
Artificial intelligence is playing an increasingly important role in supporting decision-making, particularly in educational contexts, where it serves as a critical tool to assist teacher judgment and optimize instructional decisions. However, limited research has examined how different AI-assisted decision-making paradigms influence the performance of human-AI collaboration, as well as the underlying psychological mechanisms and causal pathways. This study therefore investigated 59 pre-service teachers to examine how AI-assisted decision-making paradigms and human-AI consistency influenced their psychological states and task performance. Specifically, it employed a two-factor mixed experimental design, with the AI-assisted decision-making paradigm as the between-subjects factor and human-AI consistency as the within-subjects factor. Data were analyzed using Bayesian cumulative link mixed models and structural equation modeling. The results reveal that AI-assisted decision-making paradigms have no significant direct effect on task performance. However, once the moderating role of human-AI decision consistency is taken into account, the paradigms influence task performance indirectly through a sequential psychological pathway involving users' confidence and their trust in the AI. Consistency between human and AI decisions not only significantly enhances users' trust in AI, confidence, and task performance; the proportion of consistent decisions also significantly moderates the impact of the paradigms on users' confidence levels. Notably, our findings indicate that users maintain a moderate level of trust in AI even when their decisions diverge from the AI's. In summary, this study highlights the mediating mechanism by which AI-assisted decision-making paradigms influence task performance through psychological states and identifies the moderating role of human-AI consistency in this pathway. These findings advance the theoretical understanding of human-AI interaction models in educational contexts and offer mechanistic insights to guide the optimization of instructional AI systems.
Summary
Main Finding
AI-assisted decision-making paradigms (concurrent vs sequential) did not directly change task performance in an instructional student-performance prediction task. Instead, the paradigm influenced performance indirectly through a sequential psychological pathway: the paradigm affects users' confidence, which affects trust in the AI, and these psychological states jointly shape performance. Crucially, the proportion of human–AI decision consistency (alignment) not only improves trust, confidence, and performance but also moderates the paradigm → confidence link. Unexpectedly, higher self-reported confidence was associated with worse performance (larger absolute error).
Key Points
- Participants and task
- N = 59 pre-service teachers (29 concurrent, 30 sequential).
- Task: predict student final grade (25 trials) using 8 selected features from the UCI Student Performance dataset.
- AI support: identical random-forest predictions in both conditions; only the interaction paradigm differed (a training sketch appears after this list).
- Manipulated paradigms
- Concurrent: AI prediction shown before participants make their decision.
- Sequential: participants make an initial decision, then see AI suggestion and may revise it.
- Measures
- Trust in AI and confidence in one’s final decision: single-item, momentary ratings on 7-point Likert scales after each trial.
- Task performance: absolute error between the participant's prediction and the true grade (see the measures sketch after this list).
- Main statistical results (CLMM & SEM)
- CLMM: Human–AI decision consistency significantly reduced decision error (Estimate = 0.26, 95% CI [0.09, 0.42]) and increased trust (large main effect, Estimate ≈ −2.40 on the trust scale; signs reflect the model's contrast coding). The AI paradigm had no main effect on performance (Estimate = −0.04, 95% CI [−0.20, 0.13]).
- Confidence: the sequential paradigm yielded lower confidence than the concurrent paradigm (CLMM Estimate = −0.77, 95% CI [−1.46, −0.12]; SEM β = −0.613, p = 0.015), and inconsistency lowered confidence (CLMM).
- SEM: No direct effect of paradigm on performance (β = −0.239, p = 0.297). Confidence positively predicted decision error (β = 0.537, p < 0.001); that is, higher confidence went with larger errors. The model supports a sequential mediation (paradigm → confidence → trust → performance), with human–AI consistency moderating the paradigm → confidence path.
- Additional behavioral insight
- Participants maintained a moderate level of trust in AI even when their decisions disagreed with the AI, though trust was lower under inconsistency.
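The AI support described above is a random forest trained on the UCI Student Performance dataset. A minimal training sketch follows; the eight features chosen here are hypothetical stand-ins, since the summary does not list which features were selected.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical local copy of the UCI Student Performance file (semicolon-separated).
df = pd.read_csv("student-por.csv", sep=";")

# Hypothetical stand-ins for the paper's eight selected features.
features = ["studytime", "failures", "absences", "G1", "G2",
            "Medu", "Fedu", "traveltime"]
X, y = df[features], df["G3"]  # G3: final grade (0-20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# One model served both conditions; only the presentation paradigm differed.
rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)
ai_predictions = rf.predict(X_test)
```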
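The outcome and alignment measures reduce to per-trial arithmetic, as sketched below with hypothetical column names; treating consistency as exact numerical agreement is an assumption, since the summary does not say whether a tolerance band was used.

```python
import pandas as pd

# Hypothetical per-trial log: one row per (participant, trial).
trials = pd.DataFrame({
    "participant": [1, 1, 2, 2],
    "human_pred":  [12, 15, 9, 18],   # participant's final prediction
    "ai_pred":     [12, 13, 9, 14],   # random-forest suggestion
    "true_grade":  [11, 16, 9, 15],
})

# Task performance: absolute error between prediction and true grade.
trials["abs_error"] = (trials["human_pred"] - trials["true_grade"]).abs()

# Human-AI consistency: whether the decision matched the AI's suggestion.
trials["consistent"] = (trials["human_pred"] == trials["ai_pred"]).astype(int)

# Per-participant proportion of consistent decisions (the moderator).
consistency_rate = trials.groupby("participant")["consistent"].mean()
```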
Data & Methods
- Sample: 59 pre-service teachers, randomized between concurrent vs sequential paradigms.
- Task design: 25 repeated prediction trials per participant; no ground-truth feedback during task.
- AI: Random forest trained on UCI Student Performance data; identical accuracy across conditions.
- Measures: single-item momentary trust and confidence (7-pt Likert), absolute error as performance metric.
- Analyses:
- Bayesian cumulative link mixed models (CLMM) with random intercepts for participants and trials to examine main and interaction effects on ordinal trust/confidence ratings and discrete absolute-error outcomes (a hedged CLMM sketch follows this list).
- Structural equation modeling (SEM; SmartPLS) to test a moderated sequential mediation, paradigm → confidence → trust → performance, with human–AI decision consistency as a moderator on the paradigm → confidence path; bootstrapping (5,000 resamples) was used for inference (an SEM sketch follows this list).
- Limitations to note: modest sample size, a single domain (pre-service teachers), a lab-style task (no ground-truth feedback), single-item measures of psychological states, and fixed AI accuracy (limiting generalization to variable-quality AI).
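A minimal sketch of the CLMM analysis, assuming the Bambi library and its cumulative (ordinal) family as an open-source stand-in (the summary does not name the software used); column names are hypothetical.

```python
import arviz as az
import bambi as bmb

# trials: long-format DataFrame with one row per (participant, trial), an
# ordinal 1-7 "trust" rating, binary "paradigm" and "consistent" codes, and
# "participant"/"trial" identifiers. All column names are hypothetical.
clmm = bmb.Model(
    "trust ~ paradigm * consistent + (1|participant) + (1|trial)",
    trials,
    family="cumulative",  # cumulative-link (ordinal) likelihood
)
idata = clmm.fit(draws=2000, chains=4)  # NUTS sampling via PyMC
az.summary(idata, var_names=["paradigm", "consistent", "paradigm:consistent"])
```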
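The paper's SEM was fit in SmartPLS (PLS-SEM). As an open-source approximation only, a covariance-based SEM in semopy with the moderation encoded as a precomputed interaction term might look like this; column names are hypothetical and estimates would not match PLS-SEM exactly.

```python
from semopy import Model

# Hypothetical participant-level columns: paradigm (0/1), consistency_rate,
# their precomputed product par_x_cons, mean confidence and trust, abs_error.
desc = """
confidence ~ paradigm + consistency_rate + par_x_cons
trust ~ confidence
abs_error ~ trust + confidence + paradigm
"""
sem = Model(desc)
sem.fit(data)          # data: pandas DataFrame with the columns above
print(sem.inspect())   # path coefficients, standard errors, p-values
```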
Implications for AI Economics
- Product design and value capture
- Interaction paradigm matters economically via its psychological effects. Vendors should optimize how AI recommendations are presented (concurrent vs sequential) to shape user confidence and trust rather than assuming a direct productivity gain from any AI interface.
- Features that preserve or calibrate appropriate user confidence (e.g., explanations, uncertainty estimates) can have outsized returns because confidence mediates downstream performance and adoption.
- Adoption, pricing and procurement
- Human–AI consistency (alignment) is a key value-driver: higher alignment increases trust and perceived effectiveness. Procurement decisions and pricing models should account for design elements that increase perceived alignment (calibration, personalization, transparency).
- Buyers (schools, districts) should evaluate not only aggregate accuracy but also how AI outputs interact with human decision processes (i.e., probability of agreement and impact on confidence).
- Market outcomes and policy
- Overconfidence risk: higher confidence can correlate with worse outcomes. Economic evaluations (cost–benefit, ROI) should incorporate behavioral responses (miscalibrated confidence) that can reduce realized productivity gains from AI.
- Regulation and auditing: metrics for education-AI performance should include human-AI agreement rates and measures of user calibration (confidence vs accuracy), not only model accuracy; a sketch of such audit metrics appears after this list.
- Modeling and forecasting impacts
- Macroeconomic or microeconomic models of AI-driven productivity should include behavioral parameters (trust, confidence, agreement probability) and interaction-paradigm effects. Returns to AI investments will depend on these behavioral channels.
- Implementation and training investments
- Complementary investments (teacher training, interface design that surfaces alignment information, human-AI decision-reconciliation workflows) can increase the realized benefits of AI tools, affecting cost-effectiveness and adoption curves.
- Research & evaluation priorities
- Payoffs from different interaction paradigms will vary by context, task difficulty, and user population. Economic evaluations (pilots, RCTs) should be used to estimate heterogeneity in psychological channels before large-scale deployments.
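Following the auditing point above, agreement rates and confidence calibration can be computed from ordinary interaction logs. A minimal sketch under hypothetical column names; the Spearman correlation here is one reasonable calibration proxy, not the study's own metric.

```python
import pandas as pd

def audit_metrics(log: pd.DataFrame) -> dict:
    """Compute human-AI agreement rate and a confidence-calibration proxy.

    Expects per-trial columns: "consistent" (0/1), "confidence" (1-7),
    and "abs_error" (absolute prediction error). Names are hypothetical.
    """
    agreement_rate = log["consistent"].mean()
    # Well-calibrated users should show a negative correlation between
    # confidence and error; a positive value (as this study found)
    # signals overconfidence.
    calibration = log["confidence"].corr(log["abs_error"], method="spearman")
    return {"agreement_rate": agreement_rate,
            "confidence_error_corr": calibration}
```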
Suggestions for Follow-up Empirical Work (relevant to economists)
- Larger, more diverse samples and field deployments to estimate external validity and heterogeneity in behavioral responses.
- Vary AI accuracy and explainability features to quantify marginal returns to alignment/calibration interventions.
- Longitudinal studies to assess learning and calibration dynamics (does confidence bias attenuate with feedback?).
- Cost-effectiveness analyses comparing investment in model accuracy vs interface/UX and training that target trust/confidence alignment.
Assessment
Claims (7)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| AI-assisted decision-making paradigms do not have a significant direct effect on task performance. | Output Quality | null_result | high | task performance | n=59; 0.6 |
| When human-AI decision consistency is taken into account, AI-assisted decision-making paradigms influence task performance indirectly through a sequential psychological pathway involving users' confidence and their trust in the AI. | Output Quality | positive | high | task performance (mediated effect) | n=59; 0.6 |
| Consistency between human and AI decisions significantly enhances users' trust in AI. | AI Safety and Ethics | positive | high | trust in AI | n=59; 0.6 |
| Consistency between human and AI decisions significantly enhances users' confidence. | Decision Quality | positive | high | users' confidence | n=59; 0.6 |
| Consistency between human and AI decisions significantly enhances task performance. | Output Quality | positive | high | task performance | n=59; 0.6 |
| The proportion of consistent decisions significantly moderates the impact of AI-assisted decision-making paradigms on users' confidence levels. | Decision Quality | positive | high | users' confidence (moderation effect) | n=59; 0.6 |
| Users maintain a moderate level of trust in AI even when their decisions diverge from those of AI. | AI Safety and Ethics | positive | high | trust in AI under decision divergence | n=59; 0.6 |