Generative AI can spot employment-law claims and draft investigatory plans at a level comparable to associates' work, potentially saving firms many hours of research, but its frequently incorrect case citations mean human lawyers must still verify and supervise its outputs.

Robot Wingman: Using AI to Assess an Employment Termination

Perritt, Henry H., Jr. · March 16, 2026 · Insecta mundi

Source: OpenAlex · Paper type: descriptive · Evidence: low · Relevance: 7/10

In a small experiment, four generative AI systems identified major employment-law claims and proposed investigatory steps at a level comparable to experienced associates but frequently produced incorrect case citations, implying strong productivity potential tempered by important reliability limits and the need for attorney oversight.

Four major generative AI engines—DeepSeek, Claude, ChatGPT, and Grok—are useful legal analysis tools for employment law practitioners. This Article presents the results of an experiment in which a transcript of a hypothetical client interview involving potential disability discrimination, retaliation, and wrongful termination claims was submitted to each AI system. The accompanying prompts requested identification and assessment of viable legal theories. The experiment demonstrates that contemporary generative AI performs sophisticated legal analysis comparable to experienced associates, correctly identifying major employment law claims including ADA violations, Title VII discrimination, OSHA retaliation, FMLA interference, and workers’ compensation retaliation. All four engines successfully spotted legal issues, assessed claim strengths and weaknesses, and suggested follow-up investigation—tasks that traditionally required eight to forty hours of junior attorney research time. Significant limitations emerged, however, in case law citations, with most cited cases being non-existent or incorrectly referenced, though statutory and regulatory citations proved generally accurate and useful. The analysis reveals AI’s potential to transform law firm economics by dramatically reducing research time while maintaining analytical quality, though careful attorney oversight remains essential. The technology particularly benefits less experienced practitioners by providing comprehensive starting points for legal research, while experienced attorneys can use it for quality control and initial drafts. Generative AI serves as an effective “wingman” for employment lawyers, capable of replacing substantial junior associate work while requiring continued human expertise for client counseling, supervision, and final legal advice preparation.

Summary

Main Finding

Contemporary generative AI engines (DeepSeek, Claude, ChatGPT, Grok) can perform sophisticated employment-law analysis (issue spotting, claim assessment, and suggested investigatory next steps) at a level comparable to experienced associates. From a hypothetical client interview, they identified ADA, Title VII, OSHA retaliation, FMLA interference, and workers’ compensation retaliation claims. This capability can replace substantial junior research time (tasks that traditionally required 8–40 hours), but the engines' case-law citations are unreliable, whereas their statutory and regulatory citations are generally accurate. Careful attorney oversight remains essential.

Key Points

Engines tested: DeepSeek, Claude, ChatGPT, Grok.
Strengths
- Accurate issue spotting across multiple employment-law theories (disability discrimination, retaliation, wrongful termination, FMLA interference, OSHA and workers’ compensation retaliation).
- Reasoned assessments of claim strengths/weaknesses and useful suggestions for follow‑up investigation.
- Statutory and regulatory citations usually correct and useful.
- Output quality comparable to experienced associates for initial legal analysis and drafting; can produce comprehensive starting points quickly.
Weaknesses / Risks
- Case‑law citations frequently incorrect or non‑existent (hallucinations), requiring verification.
- Single‑transcript experiment; generalizability depends on prompts, engine versions, and inputs.
- AI cannot replace client counseling, supervision, ethical judgment, or final legal advice.
- Malpractice/regulatory risk if outputs used without verification.
Practical role: effective “wingman” for employment lawyers—particularly valuable to less experienced practitioners; experienced attorneys benefit for quality control, faster drafting, and triage.

Data & Methods

Design: A transcript of a hypothetical client interview presenting potential disability discrimination, retaliation, and wrongful‑termination facts was submitted to each of the four generative-AI engines with prompts requesting identification and assessment of viable legal theories.
Evaluated outputs on:
- Issue spotting (which legal theories identified)
- Assessment (strengths/weaknesses of claims)
- Suggested follow‑up investigation steps
- Citation accuracy (case law vs. statutes/regulations)
- Time equivalence to junior associate research (estimated replacement of 8–40 hours)
Benchmark/comparison: outputs judged comparable to experienced associates’ work product for the tasks above.
Limitations of study:
- Single-scenario experiment (one transcript), which limits how far the results generalize.
- Engine versions, prompt phrasing, and evaluation scoring not fully specified.
- Citation failures likely sensitive to prompt engineering and post‑processing.

Implications for AI Economics

Labor substitution and complementarity
- Large component of junior associate research and drafting is substitutable by AI, reducing marginal cost of those tasks.
- Senior attorneys remain complementary—supervision, client counseling, strategy, litigation advocacy.
Firm economics and staffing
- Potential reduction in billed junior associate hours; firms may restructure hiring, delegate fewer routine research tasks to juniors, or redeploy juniors to higher‑value work (client contact, complex drafting).
- Smaller firms/solo practitioners can access high‑quality initial analysis at lower cost, lowering barriers to entry and increasing competition.
Pricing and business models
- Enables efficiency‑based pricing (flat fees, subscription models) as predictable research costs fall.
- Firms might offer cheaper, faster intake/triage services or expand volume of matters handled.
Productivity and output quality
- Large productivity gains if effective verification workflows are implemented (faster turnaround, more matters handled per attorney).
- Quality control costs (verification of case citations, oversight) will remain necessary and should be factored into ROI.
Risk, regulation, and insurance
- Increased malpractice exposure if inaccurate AI outputs are used without verification; expect greater emphasis on audit trails, disclaimers, and insurance adaptation.
- Regulators and bar associations may require transparency about AI use and verification.
Adoption considerations and strategic recommendations
- Implement AI for initial research and drafting but mandate human verification of case law and final advice.
- Invest in prompt engineering, integration, and training to reduce hallucinations and improve consistency.
- Redesign workflows: use AI for triage and first drafts; reallocate human labor to supervision, strategy, and client interaction.
- Track metrics (time saved, error rates, client outcomes) to calibrate staffing and pricing changes.
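The points above about tracking time saved and error rates, and about counting verification as a cost, can be sketched as a small calculation. This is only an illustration and not something taken from the article: the `MatterRecord` fields, the helper names, and every figure below are hypothetical, except that the 8 and 40 hour baselines echo the article's range for traditional junior research time.

```python
from dataclasses import dataclass


@dataclass
class MatterRecord:
    """One AI-assisted matter. All fields are hypothetical tracking metrics."""
    baseline_hours: float      # estimated junior research time without AI
    ai_assisted_hours: float   # attorney time prompting and reviewing AI output
    verification_hours: float  # time spent checking case-law citations
    citations_checked: int     # case citations the attorney looked up
    citations_wrong: int       # citations found non-existent or misreferenced


def net_hours_saved(m: MatterRecord) -> float:
    """Baseline time minus all time actually spent, verification included."""
    return m.baseline_hours - (m.ai_assisted_hours + m.verification_hours)


def citation_error_rate(records: list[MatterRecord]) -> float:
    """Share of checked case citations that turned out to be wrong."""
    checked = sum(r.citations_checked for r in records)
    wrong = sum(r.citations_wrong for r in records)
    return wrong / checked if checked else 0.0


# Made-up example data: one small matter and one large matter.
records = [
    MatterRecord(8.0, 1.5, 2.0, 10, 6),
    MatterRecord(40.0, 4.0, 6.0, 25, 15),
]

total_saved = sum(net_hours_saved(r) for r in records)
error_rate = citation_error_rate(records)
print(f"net hours saved: {total_saved}, citation error rate: {error_rate:.0%}")
```

Treating verification time as a separate field keeps the quality-control cost visible, so that a rising citation error rate shows up as shrinking net savings instead of being hidden inside a single "time spent" number.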

Overall, generative AI promises substantial cost and time efficiencies in employment-law research and early case assessment, shifting the economic balance toward lower marginal research costs and higher emphasis on human oversight and high‑value legal tasks.

Assessment

Paper type: descriptive
Evidence strength: low. Findings are based on a small, purposive experiment (one hypothetical client transcript submitted to four models) without systematic metrics, baseline comparisons, statistical analysis, or real-world outcome data, so claims about productivity and replacement of junior associate work are suggestive but not robustly supported.
Methods rigor: low. The study uses a single-case design with no randomization, no blinded scoring, no inter-rater reliability or quantitative performance metrics, limited transparency about prompts and evaluation criteria, and no replication across varied fact patterns or time, undermining internal validity and reproducibility.
Sample: Outputs from four commercial generative models (DeepSeek, Claude, ChatGPT, Grok) in response to a single hypothetical client interview transcript concerning potential ADA disability discrimination, Title VII discrimination, OSHA retaliation, FMLA interference, and workers' compensation retaliation; prompts requested identification and assessment of legal theories and suggested follow-up investigation, with qualitative comparison to experienced associate performance and notes on citation accuracy.
Themes: productivity, human_ai_collab, skills_training, org_design, adoption
Generalizability:
- Single hypothetical vignette; may not generalize to other fact patterns or complexity levels.
- Only four proprietary models evaluated; results may differ across other models or future versions.
- Findings pertain to the U.S. employment-law categories tested and may not generalize to other legal domains or jurisdictions.
- No real-world client interactions or downstream outcomes (litigation success, billing, client satisfaction) measured.
- Assessment relies on qualitative judgments rather than standardized, replicable metrics.

Claims (9)

Claim	Category	Direction	Confidence	Outcome	Details
Four major generative AI engines (DeepSeek, Claude, ChatGPT, and Grok) are useful legal analysis tools for employment law practitioners.	Output Quality	positive	high	usefulness of AI as legal analysis tools (quality of analysis/output)	n=4 0.18
This Article presents the results of an experiment in which a transcript of a hypothetical client interview involving potential disability discrimination, retaliation, and wrongful termination claims was submitted to each AI system, with prompts requesting identification and assessment of viable legal theories.	Other	null_result	high	experimental procedure (input and prompts)	n=4 0.3
Contemporary generative AI performs sophisticated legal analysis comparable to experienced associates, correctly identifying major employment law claims including ADA violations, Title VII discrimination, OSHA retaliation, FMLA interference, and workers’ compensation retaliation.	Output Quality	positive	high	ability to identify relevant legal claims and assess them	n=4 0.18
All four engines successfully spotted legal issues, assessed claim strengths and weaknesses, and suggested follow-up investigation, tasks that traditionally required eight to forty hours of junior attorney research time.	Output Quality	positive	high	issue-spotting and assessment quality; implied time savings relative to traditional junior attorney research	n=4 eight to forty hours of junior attorney research time 0.09
Significant limitations emerged in case law citations, with most cited cases being non-existent or incorrectly referenced.	Error Rate	negative	high	accuracy of case law citations (error rate / hallucination rate)	n=4 0.18
Statutory and regulatory citations proved generally accurate and useful.	Output Quality	positive	high	accuracy/usability of statutory and regulatory citations	n=4 0.18
The analysis reveals AI’s potential to transform law firm economics by dramatically reducing research time while maintaining analytical quality, though careful attorney oversight remains essential.	Firm Productivity	positive	high	law firm economics (research time reduction and analytical quality)	n=4 0.03
The technology particularly benefits less experienced practitioners by providing comprehensive starting points for legal research, while experienced attorneys can use it for quality control and initial drafts.	Training Effectiveness	positive	high	benefit to practitioners (training/assistance, drafting, quality control)	n=4 0.09
Generative AI serves as an effective 'wingman' for employment lawyers, capable of replacing substantial junior associate work while requiring continued human expertise for client counseling, supervision, and final legal advice preparation.	Job Displacement	mixed	high	potential replacement of junior associate tasks and required human oversight	n=4 0.09