An LLM-powered forecasting tool cut cafeteria forecast errors by roughly 30% for new dishes while keeping staff in control via override features; qualitative sessions show humans still crucial for unusual or high-uncertainty situations.
Cafeteria demand planning requires both algorithmic pattern recognition and human expertise, yet current systems treat these separately, which generates significant food waste. This paper reports on a 9-month action design research (ADR) project at a German financial services firm. Using a practice-driven abductive approach, we developed a collaborative forecasting system that leverages semantic processing using large language models (LLMs) to solve the “cold-start” problem for novel menu items while preserving human agency via override mechanisms. Our evaluation combines algorithmic benchmarking, reducing forecast errors by 30% over naive baselines, with two think-aloud sessions showing that human judgment remains critical for high-uncertainty events. We distill our findings into a meta-design and four design principles (DPs), grounded in kernel theories, for systems where human contextual intelligence and algorithmic recognition must coexist. We contribute to the discourse on human-AI collaboration and sustainable IS by providing a rigorous blueprint for designing synergistic, trustworthy, and diagnostic operational planning tools.
Summary
Main Finding
A nine-month action design research project produced a human–AI cafeteria demand forecasting system that combines tree‑based machine learning (XGBoost) for pattern recognition with semantic processing via large language models (LLMs) to solve cold-starts for novel menu items, while preserving human agency through override and feedback mechanisms. The hybrid system reduced forecast errors by ~30% versus naive baselines and retained critical human judgment for high‑uncertainty events (shown in two think‑aloud sessions). The authors distill a meta‑design and four design principles (DPs) grounded in kernel theories for operational planning systems that seek synergy between algorithmic recognition and human contextual intelligence.
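The override and feedback mechanism described above can be illustrated with a minimal sketch. The summary does not describe the authors' implementation, so the class names (`PlannedDish`, `FeedbackLog`), the fields, and the example figures are all hypothetical; the sketch only shows the idea of keeping the model forecast and the planner's override side by side so both can be compared with realized demand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PlannedDish:
    """One dish on the plan: the model forecast plus an optional human override."""
    dish: str
    model_forecast: int
    override: Optional[int] = None
    override_reason: Optional[str] = None

    def apply_override(self, portions: int, reason: str) -> None:
        """Let the planner replace the forecast and record why."""
        if portions < 0:
            raise ValueError("portions must be non-negative")
        self.override = portions
        self.override_reason = reason

    @property
    def final_quantity(self) -> int:
        """The planner's number wins whenever an override is present."""
        return self.model_forecast if self.override is None else self.override


@dataclass
class FeedbackLog:
    """Collects outcomes so model and human errors can be reviewed later."""
    entries: List[Dict[str, object]] = field(default_factory=list)

    def record(self, plan: PlannedDish, actual_demand: int) -> None:
        self.entries.append({
            "dish": plan.dish,
            "model_error": plan.model_forecast - actual_demand,
            "final_error": plan.final_quantity - actual_demand,
            "overridden": plan.override is not None,
            "reason": plan.override_reason,
        })


# Hypothetical example: the planner knows about a one-off event the model cannot see.
plan = PlannedDish(dish="pasta bake", model_forecast=250)
plan.apply_override(180, reason="construction closes the main entrance")
log = FeedbackLog()
log.record(plan, actual_demand=190)
```

Logging both errors is one possible way to make overrides diagnostic: over time the log would show in which situations the planner's judgment beats the model and vice versa.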
Key Points
- Problem: Cafeteria demand forecasting requires both large‑scale pattern recognition and situated human contextual knowledge; prior approaches treated these separately, causing persistent food waste (~20% in many settings).
- Methodology: Practice‑driven Action Design Research (ADR) over three iterative BIE cycles (build–intervene–evaluate) with practitioners at a German financial services association.
- Core artifact: A collaborative forecasting system combining:
  - XGBoost ensembles for demand prediction (feature engineering: temporal lags, calendar markers, contextual features),
  - LLM‑based semantic processing/embeddings to map novel menu descriptions to historical analogues (cold‑start handling),
  - User interface mechanisms for transparency, calibrated uncertainty, and human overrides/feedback.
- Performance: Algorithmic benchmarking shows a ~30% reduction in forecast errors compared to naive baselines; an initial proof of concept (POC) reached ~65% validation accuracy before refinements.
- Human role: Two think‑aloud sessions indicate human planners remain essential for exceptional/high‑uncertainty events (e.g., one‑offs, construction, meetings). System design preserves agency and facilitates trust.
- Contributions: Meta‑design and four empirically derived DPs (grounded in human–AI collaboration, explainability/trust, and sustainable IS literatures) for designing operational planning tools that integrate human expertise and AI.
- Limitations noted: Single organization case, nine‑month horizon, context specificity (communal catering).
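The cold‑start idea in the key points (mapping a novel menu description to historical analogues) can be sketched as a nearest-neighbour lookup. This is a minimal sketch under stated assumptions: a bag-of-words count vector stands in for the LLM embedding, the similarity-weighted average is one plausible way to combine analogues, and the dish names and portion counts are invented for illustration. None of this is confirmed as the authors' method.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an LLM embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cold_start_forecast(new_dish, history, k=2):
    """Similarity-weighted mean demand of the k most similar past dishes.

    `history` maps a dish description to its mean historical portions.
    Returns None when no past dish is similar at all, which is a natural
    point to hand the decision to the human planner.
    """
    query = embed(new_dish)
    scored = [(cosine(query, embed(desc)), portions)
              for desc, portions in history.items()]
    top = sorted(scored, reverse=True)[:k]
    total_weight = sum(sim for sim, _ in top)
    if total_weight == 0:
        return None
    return sum(sim * portions for sim, portions in top) / total_weight


# Hypothetical history; the figures are illustrative, not from the study.
history = {
    "chicken curry with rice": 120.0,
    "vegetable curry with rice": 90.0,
    "beef goulash with noodles": 70.0,
}
estimate = cold_start_forecast("lentil curry with rice", history)
no_match = cold_start_forecast("sushi", history)
```

In a real system the `embed` function would be replaced by a call to an embedding model, so that dishes with related meaning but no shared words could still be matched.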
Data & Methods
- Case: Large regional financial services association cafeteria (200–400 portions/day; wide demand variability, holidays/bridge days, events).
- Dataset: Operational data covering demand, pricing, and menu information from 2022 onward (historical transaction/consumption records and contextual markers).
- ADR process: Three iterative cycles from October 2024–June 2025:
  - Alpha 1: problem scoping and simple statistical POC (65% validation accuracy).
  - Alpha 2: prototype with XGBoost, richer features, added contextual variables.
  - Final: integrated system with LLM semantic processing for cold‑start items, UI for overrides, and feedback loops.
- Algorithms & techniques:
  - Primary forecasting: Gradient boosting (XGBoost) chosen for nonlinear patterns, mixed data handling, and interpretability (feature importance).
  - Cold‑start: LLM semantic embeddings / natural language processing of menu text to link new dishes to historical analogues.
  - UX elements: Explainability cues, uncertainty/calibration display, and manual override/feedback capture.
- Evaluation:
  - Algorithmic benchmarking against naive baselines (reported ~30% error reduction).
  - Qualitative evaluation: Two think‑aloud user sessions to probe interaction, trust, and when humans override model output.
  - Operational assessment: Early indications of reduced mismatch between production and demand; formal cost/waste accounting left for future work.
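To make the benchmarking step concrete, the sketch below computes a relative error reduction against a naive baseline. The summary does not state which error metric or which naive baseline the authors used, so mean absolute error (MAE) and a "same weekday last week" forecast over a five-day canteen week are assumptions, and the demand figures are invented.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(series, season=5):
    """Forecast each day with the value observed `season` days earlier.

    The result lines up with series[season:], so the first `season`
    observations serve only as history.
    """
    if len(series) <= season:
        raise ValueError("need more than one season of data")
    return series[:-season]


def relative_error_reduction(actual, baseline_pred, model_pred):
    """Share of the baseline's MAE that the model removes (0.3 means 30%)."""
    baseline_error = mae(actual, baseline_pred)
    if baseline_error == 0:
        raise ValueError("baseline is already perfect")
    return 1.0 - mae(actual, model_pred) / baseline_error


# Hypothetical daily portions for two five-day weeks (not data from the study).
demand = [300, 320, 280, 350, 200, 310, 300, 290, 330, 220]
actual = demand[5:]                       # second week is the evaluation window
baseline_pred = seasonal_naive(demand)    # first week reused as the forecast
model_pred = [305, 310, 285, 340, 210]    # stand-in for model output

reduction = relative_error_reduction(actual, baseline_pred, model_pred)
```

With these invented numbers the baseline MAE is 16 portions and the model MAE is 8, so the reduction is 0.5; the figure is purely illustrative and unrelated to the ~30% reported above.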
Implications for AI Economics
- Direct economic gains: Improved forecasting accuracy reduces overproduction and food waste, yielding procurement cost savings and lower disposal costs. A reported ~30% error reduction implies meaningful reductions in variable food costs and waste‑related expenses (exact monetary gains require local costing).
- Value of hybrid investments: The study highlights that combining ML with LLM semantic capabilities and human‑in‑the‑loop interfaces can deliver higher practical value than automation‑only or human‑only approaches. Economic returns depend on balancing spending on model development, LLM/embedding infrastructure, and investment in usable interfaces and governance.
- Adoption and ROI depend on trust & agency: Economic benefits materialize only if frontline planners adopt the system. Design features that preserve control (overrides), provide calibrated uncertainty, and enable feedback accelerate adoption and thus the realization of savings.
- Labour and human capital effects: The approach augments experienced planners rather than replacing them, reducing cognitive load and burnout risk while preserving tacit knowledge. Firms should account for potential reallocation of labor (less time forecasting, more time on exception handling and quality control).
- Scalability & markets: Similar hybrid systems can be productized for broader communal catering, hospitals, universities, and corporate cafeterias. There is potential for subscription/enterprise software markets combining domain ML models with LLM‑based cold‑start modules.
- Measurement recommendations for economic evaluation:
  - Track reduction in meals wasted (units) and convert to procurement and disposal cost savings.
  - Estimate emissions/externality reductions for sustainability valuation (CO2e per kg food avoided).
  - Calculate payback period: compare development/operational costs (models, LLM API or hosting, integration, UX) vs. monthly waste/cost savings.
  - Consider A/B or randomized rollout to estimate causal impact on waste and costs.
- Policy/externalities: Reduced food waste aligns with sustainability goals and may yield regulatory or reputational benefits (and possibly incentives), further improving the economic case.
- Research agenda for AI economics:
  - Formal cost‑benefit and sensitivity analyses across settings (small vs large cafeterias).
  - Comparative studies of investment allocation (model accuracy vs explainability/UI) on adoption and economic returns.
  - Market analysis for LLM‑augmented forecasting tools and pricing strategies that internalize sustainability externalities.
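The payback-period recommendation among the measurement points above reduces to simple arithmetic, sketched here. Every number is a hypothetical placeholder (the study reports no cost figures), and the function names and the 21 operating days per month are assumptions chosen for illustration.

```python
def monthly_waste_saving(meals_avoided_per_day, cost_per_meal, operating_days=21):
    """Convert avoided overproduction (meals/day) into a monthly saving."""
    return meals_avoided_per_day * cost_per_meal * operating_days


def payback_months(upfront_cost, monthly_saving, monthly_running_cost):
    """Months until cumulative net savings cover the upfront investment.

    Returns None when running costs eat up the savings, i.e. the
    investment never pays back under the given assumptions.
    """
    net = monthly_saving - monthly_running_cost
    if net <= 0:
        return None
    return upfront_cost / net


# Hypothetical inputs in euros; none of these values come from the study.
saving = monthly_waste_saving(meals_avoided_per_day=10, cost_per_meal=3.0)
months = payback_months(upfront_cost=12000,
                        monthly_saving=saving,
                        monthly_running_cost=130)
```

Here the assumed saving is 630 per month, the net saving after running costs is 500, and the upfront cost is recovered after 24 months. A sensitivity analysis would rerun the same calculation across ranges of each input.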
Limitations and next steps: The evidence comes from a single‑case ADR project. Broader trials, randomized evaluations, and full economic accounting (waste volumes → € savings; emissions valuation) are needed to quantify generalizable economic impacts and inform deployment decisions.
Assessment
Claims (8)
| Claim | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|
| Cafeteria demand planning requires both algorithmic pattern recognition and human expertise, yet current systems treat these separately, which generates significant food waste. (Organizational Efficiency) | negative | high | food waste | 0.03 |
| This paper reports on a 9-month action design research (ADR) project at a German financial services firm. (Other) | null_result | high | study duration and setting | 0.3 |
| We developed a collaborative forecasting system that leverages semantic processing using large language models (LLMs) to solve the 'cold-start' problem for novel menu items while preserving human agency via override mechanisms. (Task Allocation) | null_result | high | resolution of cold-start forecasting for novel menu items; preservation of human agency via overrides | 0.18 |
| Algorithmic benchmarking reduced forecast errors by 30% over naive baselines. (Error Rate) | positive | high | forecast error | 30% reduction in forecast errors over naive baselines; 0.18 |
| Two think-aloud sessions show that human judgment remains critical for high-uncertainty events. (Decision Quality) | positive | high | importance/role of human judgment in handling high-uncertainty forecasting events | n=2; 0.18 |
| We distill our findings into a meta-design and four design principles (DPs), grounded in kernel theories, for systems where human contextual intelligence and algorithmic recognition must coexist. (Other) | null_result | high | design principles and meta-design artifact | 0.3 |
| The paper provides a rigorous blueprint for designing synergistic, trustworthy, and diagnostic operational planning tools, contributing to the discourse on human-AI collaboration and sustainable information systems (IS). (Organizational Efficiency) | positive | high | guidance/blueprint for operational planning tool design | 0.18 |
| The system preserves human agency via override mechanisms. (Worker Satisfaction) | positive | high | preservation of human agency (ability to override algorithmic forecasts) | 0.18 |