Microsoft 365 Copilot is perceived as user-friendly and reliable, with the clearest perceived efficiency gains in administrative work at a research institute. Researchers' productivity ratings improved between survey waves, and the authors stress role-specific training and governance as conditions for sustained acceptance.

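Generative KI in der Wissensarbeit: Wahrnehmung, Nutzen und Akzeptanz von Microsoft 365 Copilot (English: Generative AI in knowledge work: Perception, usefulness, and acceptance of Microsoft 365 Copilot)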

Carsten F. Schmidt, Sophie Petzolt, Wolfgang Beinhauer, Ingo Weber, Stefan Langer · April 23, 2026 · Zeitschrift für Arbeitswissenschaft

Source: OpenAlex · Paper type: descriptive · Evidence: low · Relevance: 7/10 · Full text: extracted and usable

In a non-university research institute, employees generally judged Microsoft 365 Copilot to be user-friendly and reliable. Perceived productivity and efficiency gains were largest for administrative, structured text tasks, and perceptions among scientific staff improved over time. The authors consider context-specific training and governance important for sustained acceptance.

Abstract: The study examines the introduction of Microsoft 365 Copilot at a non-university research institution using a repeated cross-sectional survey of licensed employees. It captures usefulness, ease of use, output quality, and reliability, as well as the benefit for typical knowledge-work tasks. Administrative staff rate usefulness and reliability higher, while scientific staff develop more positive assessments over time, particularly regarding productivity and reduced workload. Copilot is predominantly perceived as user-friendly and technically reliable, with the greatest added value for clearly structured, text-based tasks. The findings underline the importance of context-specific introduction, role-based qualification, and governance for sustainable acceptance of generative AI.

Practical relevance: The study provides empirical insights into the introduction of generative AI in knowledge-intensive organizations. It shows that Microsoft 365 Copilot enables efficiency gains particularly in administration, while in research, context-specific training and support measures are decisive for success. The results help organizations design AI implementation strategies that are sensitive to the work context, promote acceptance, and use potential in a targeted way, in order to secure productive human-AI interaction and organizational added value.

Summary

Main Finding

In a repeated cross-sectional survey of employees at a large non-university research organization piloting Microsoft 365 Copilot, administrative staff report higher perceived usefulness and reliability overall, while scientific staff show significant positive shifts over time (T01 → T02) in perceived usefulness, ease of use, and perceptions of productivity/workload reduction. Copilot is seen as broadly user‑friendly and technically reliable and delivers the greatest incremental value for clearly structured, text‑based knowledge‑work tasks. The authors emphasize that context‑sensitive rollout, role‑specific training, and governance are critical to realize sustainable acceptance and organizational value.

Key Points

Differential baseline perceptions
- Administrative employees initially rate Copilot’s usefulness and reliability higher than scientific staff.
Learning / routinization effects
- Scientific staff’s mean perceived usefulness rose from 0.42 to 1.09 (scale –3 to +3; p = 0.022).
- Ease of use for scientific staff increased from 0.74 to 1.31 (p = 0.016).
- Output quality perceptions for scientific staff trended upward (0.26 → 0.73; p ≈ 0.067).
- Reliability remained comparatively stable across groups/time.
Task heterogeneity
- Highest added value for structured, text‑centric tasks (e.g., drafting, summarizing, template generation, administrative workflows).
- Less clear added value for open‑ended scientific reasoning, experiments, or tasks requiring domain‑specific validation.
Practical recommendations from authors
- Context‑sensitive implementation strategies, role‑tailored training and support, and clear governance (privacy, IP, compliance) are necessary for sustainable adoption.
Limitations flagged by authors
- Non-random quota sampling of license holders; modest sample sizes.
- Data are repeated cross‑sections (not a longitudinal panel) and based on self‑reports; not representative of entire organization (~32k staff).
- Multiple Copilot updates occurred during the observation window (8 documented updates), complicating attribution.

Data & Methods

Setting and sample
- Pilot rollout of Microsoft 365 Copilot in a large German non‑university research organization (Fraunhofer context).
- 550 employees received licenses and invitations; two survey waves:
  - T01 (Nov–Dec 2024): N = 106 (66 scientific, 40 administrative)
  - T02 (Mar–Apr 2025): N = 90 (51 scientific, 39 administrative)
- Pseudonymized IDs were available but overlap between waves was small, so analyses treated waves as repeated cross‑sections.
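The sample figures above can be turned into rough response rates. The following is a minimal sketch that assumes all 550 license holders were invited to both waves (the summary does not state this separately per wave); the variable and function names are illustrative only.

```python
# Figures taken from the sample description above; names are illustrative.
INVITED = 550  # employees who received a license and an invitation

waves = {
    "T01": {"scientific": 66, "administrative": 40},
    "T02": {"scientific": 51, "administrative": 39},
}


def wave_summary(groups, invited=INVITED):
    """Return total N, response rate, and scientific share for one wave."""
    total = sum(groups.values())
    return {
        "n": total,
        "response_rate": total / invited,
        "scientific_share": groups["scientific"] / total,
    }


summaries = {wave: wave_summary(groups) for wave, groups in waves.items()}

for wave, s in summaries.items():
    print(
        f"{wave}: N={s['n']}, response rate={s['response_rate']:.1%}, "
        f"scientific share={s['scientific_share']:.1%}"
    )
```

Under that assumption, roughly 19% of invitees responded at T01 and roughly 16% at T02, which is consistent with the limitation that the data are not representative of the whole organization.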
Measures
- Core acceptance constructs: perceived usefulness (PU; 5 items), perceived ease of use (2 items), output quality (1 item), reliability (1 item), voluntariness, and task‑specific usefulness (9 items).
- Items on 7‑point Likert scale, coded –3 to +3.
- PU scale reliability: Cronbach’s α = 0.97 (T01), 0.95 (T02). Ease‑of‑use inter‑item r = 0.58 (T01), 0.66 (T02).
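To illustrate how the –3 to +3 coding and the reported scale reliability fit together, here is a minimal sketch using only the Python standard library. The response data are hypothetical and were invented for the example; they are not the study's data, and the raw 1 to 7 input format is an assumption.

```python
from statistics import variance


def recode_likert(raw):
    """Map a raw 1..7 response to the -3..+3 coding described above."""
    if not 1 <= raw <= 7:
        raise ValueError(f"expected a value in 1..7, got {raw}")
    return raw - 4


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents x items matrix (list of lists)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_vars = [variance(col) for col in zip(*rows)]  # variance per item
    total_var = variance([sum(row) for row in rows])   # variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical raw answers (1..7) of six respondents to five PU items.
raw = [
    [5, 5, 6, 5, 5],
    [2, 3, 2, 2, 3],
    [7, 6, 7, 7, 6],
    [4, 4, 4, 5, 4],
    [6, 6, 5, 6, 6],
    [3, 3, 3, 2, 3],
]

coded = [[recode_likert(x) for x in row] for row in raw]
alpha = cronbach_alpha(coded)
scale_scores = [sum(row) / len(row) for row in coded]  # PU score per respondent

print(f"alpha = {alpha:.2f}")
```

Because alpha depends only on variances, subtracting a constant from every response does not change it; the recoding affects the reported means, not the reliability estimate.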
Analysis
- Descriptive statistics by group and wave.
- Welch t‑tests for group/time differences; Cohen’s d reported for effect sizes.
- Task‑level usefulness analyzed descriptively only; no inferential tests were run on the nine task items, to avoid α inflation from multiple testing.
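The group and time comparisons named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the scores are hypothetical, and Cohen's d is computed with a pooled standard deviation, which is one common definition (the summary does not say which variant the paper uses).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df for mean(b) - mean(a)."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(b) - mean(a)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Cohen's d with pooled SD (assumed variant, see lead-in)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var)


# Hypothetical scale scores on the -3..+3 coding for two independent waves.
t01 = [-1.0, 0.0, 0.5, 1.0, 1.5, -0.5, 0.0, 2.0]
t02 = [0.5, 1.0, 1.5, 2.0, 0.0, 2.5, 1.0]

t, df = welch_t(t01, t02)
d = cohens_d(t01, t02)
print(f"t = {t:.2f}, df = {df:.1f}, d = {d:.2f}")
```

Welch's test does not assume equal variances, which suits independent waves of unequal size. The sketch stops at the test statistic because the standard library has no t distribution; if SciPy is available, `scipy.stats.ttest_ind(t01, t02, equal_var=False)` performs the same test and also returns a p-value.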
Contextual factor
- Eight product/model updates to M365 Copilot between Nov 2024 and Apr 2025, including UI, integration, and stability changes.
Key numerical results (selected)
- Perceived usefulness (scientific): 0.42 → 1.09 (d_time ≈ 0.42, p = 0.022)
- Perceived usefulness (administrative): 0.94 → 1.35 (d_time ≈ 0.38, p = 0.092)
- Ease of use (scientific): 0.74 → 1.31 (d_time ≈ 0.45, p = 0.016)
- Output quality (scientific): 0.26 → 0.73 (d_time ≈ 0.34, p ≈ 0.067)
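As a plausibility check on the figures above, the reported mean differences and effect sizes can be combined to back out the standard deviation they imply. This is a rough sketch that assumes d_time is a mean difference divided by a pooled standard deviation; the summary does not confirm that definition, and the reported values are rounded, so the results are approximate.

```python
# (mean at T01, mean at T02, reported d_time), copied from the list above.
reported = {
    "Perceived usefulness (scientific)": (0.42, 1.09, 0.42),
    "Perceived usefulness (administrative)": (0.94, 1.35, 0.38),
    "Ease of use (scientific)": (0.74, 1.31, 0.45),
    "Output quality (scientific)": (0.26, 0.73, 0.34),
}


def implied_sd(mean_t01, mean_t02, d_time):
    """SD implied by d = (mean_t02 - mean_t01) / SD (assumed definition)."""
    return (mean_t02 - mean_t01) / d_time


implied = {name: implied_sd(*values) for name, values in reported.items()}

for name, sd in implied.items():
    print(f"{name}: implied SD ≈ {sd:.2f}")
```

The implied standard deviations fall roughly between 1.1 and 1.6 scale points, which is plausible for single constructs on a seven-point scale and suggests the means and effect sizes are internally consistent.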
Limitations reiterated
- Self‑report measures, potential selection bias toward tech‑interested users, small samples, inability to infer individual longitudinal change or causal productivity effects.

Implications for AI Economics

Heterogeneous productivity effects
- Productivity gains from generative AI are task‑ and role‑dependent: routine, structured, text‑based administrative tasks show clearer, earlier gains than exploratory scientific work. Economic models forecasting AI impacts on labor should incorporate task heterogeneity within occupations and organizations.
Learning and adoption dynamics matter
- Observable positive shifts for scientific staff across a short pilot suggest adoption and learning effects can materially change perceived usefulness over months. Economic assessments (cost‑benefit, ROI) should account for time‑dependent uptake and skill acquisition rather than one‑off productivity multipliers.
Importance of complementary investments
- Realizing AI’s value requires investments in role‑specific training, process redesign, and governance. From an economics perspective, total gains = technology capability + complementary human capital + organizational change; omitting complements will overestimate realized benefits.
Governance and externalities
- Concerns around reliability, IP, data protection, and compliance (especially in research organizations) can constrain deployment or impose mitigation costs. Policy and firm‑level governance choices will influence net economic returns and distributional outcomes.
Procurement and update cycles
- Product/model updates (8 during the study window) affect perceived performance and stability; procurement decisions and cost forecasting should consider ongoing vendor updates, maintenance, and monitoring costs.
Research agenda suggestions (for AI economics)
- Move from perception surveys to causal, within‑subject longitudinal designs and objective productivity metrics (time‑on‑task, output quality validated by peers).
- Estimate heterogeneous treatment effects by task type, skill level, and occupation to inform labor demand forecasts.
- Model the optimal allocation of training and governance spending across organization units to maximize net productivity.
- Incorporate dynamic adoption curves and software update externalities into macro/micro estimates of AI’s economic impact.

Reference: Schmidt et al., "Generative AI in knowledge work: Perception, usefulness, and acceptance of Microsoft 365 Copilot" (Zeitschrift für Arbeitswissenschaft, accepted 25 Feb 2026; DOI: 10.1007/s41449-026-00517-5).

Assessment

Paper Type: descriptive
Evidence Strength: low. Findings are based on repeated cross-sectional self-report survey data from a single organization without experimental variation or objective productivity measures, so causal claims about Copilot's impact are not supported and results may reflect perception biases and selection effects.
Methods Rigor: medium. The study uses repeated cross-sectional surveying and disaggregates responses by role, which captures temporal trends and heterogeneity, but it relies on self-reported outcomes, lacks objective performance metrics, has modest sample sizes, and is limited to one organization.
Sample: Licensed employees at a single non-university research institution (administrative and scientific staff), surveyed in two cross-sectional waves after the introduction of Microsoft 365 Copilot: T01 (Nov–Dec 2024, N = 106) and T02 (Mar–Apr 2025, N = 90), drawn by non-random quota sampling from 550 license holders.
Themes: human_ai_collab, productivity, adoption, skills_training, governance
Generalizability
- Single-organization study limits external validity to other institutions or sectors.
- Findings are driven by a knowledge-work and research context and may not apply to manufacturing or frontline services.
- Specific to Microsoft 365 Copilot and the Microsoft ecosystem; other generative AI tools may perform differently.
- Cultural, regulatory, and organizational factors (e.g., country, governance structures) may limit transferability.
- Self-reported outcomes may not reflect actual productivity gains.
- Short-to-medium-term follow-up may miss longer-run effects.

Claims (8)

Claim	Direction	Outcome	Confidence & Evidence	Details
The study is based on a repeated cross-sectional survey of licensed employees at a non-university research institution. Other	null_result	Study design / data basis (repeated cross-sectional survey)	Reading fidelity high Study strength high	not reported 0.3
Administrative staff rate the usefulness and reliability of Microsoft 365 Copilot higher than scientific staff do. Worker Satisfaction	positive	Perceived usefulness and reliability (self-report)	Reading fidelity high Study strength medium	not reported 0.18
Scientific staff develop more positive assessments over time, particularly regarding productivity and reduced workload through Copilot. Developer Productivity	positive	Perceived productivity and workload reduction (self-assessment over time)	Reading fidelity high Study strength medium	not reported 0.18
Microsoft 365 Copilot is predominantly perceived as user-friendly and technically reliable. Worker Satisfaction	positive	Perceived ease of use and technical reliability	Reading fidelity high Study strength medium	not reported 0.18
The greatest added value of Copilot lies in clearly structured, text-based tasks. Task Allocation	positive	Perceived benefit by task type (text-based, structured tasks)	Reading fidelity high Study strength medium	not reported 0.18
The findings underline the importance of context-specific introduction, role-based qualification, and governance for sustainable acceptance of generative AI in organizations. Governance And Regulation	positive	Recommended implementation measures (context adaptation, training, governance) to promote acceptance	Reading fidelity high Study strength low	not reported 0.09
The study shows that Microsoft 365 Copilot enables efficiency gains particularly in administration. Organizational Efficiency	positive	Perceived efficiency gains in administration	Reading fidelity high Study strength low	not reported 0.09
In the research context, context-specific training and support measures are decisive for the success of the Copilot introduction. Training Effectiveness	positive	Importance of training and support measures for success/adoption	Reading fidelity high Study strength low	not reported 0.09