The Commonplace

Evidence that AI helps women’s careers is thin and fragmentary: most studies audit bias or offer short-term skills support, while few measure long-term retention or advancement or report robust governance and accountability practices.

Artificial intelligence applications supporting women’s career development: a scoping review
Sara Portell-Fonolla, Yasmina El Fassi, Augusta Gaspar, Luís Correia, Joana Carneiro Pinto · June 04, 2026 · International Journal for Educational and Vocational Guidance
openalex · review_meta · low evidence · 7/10 relevance
This scoping review of 13 studies finds that AI applications aimed at supporting women’s careers mostly audit or mitigate bias in recruitment and evaluation systems or provide short-term skills support, but provide limited longitudinal evidence on career trajectories, theory use, or governance practices.

Abstract

Artificial intelligence (AI) is increasingly integrated into career guidance and organisational decision systems, yet empirical evidence on applications designed to support women’s career development remains limited. Following the PRISMA extension for scoping reviews (PRISMA-ScR) and a preregistered protocol, we searched seven databases (plus backward and forward citation searching) and synthesised 13 empirical studies published between 2018 and 2025. Using inductive thematic analysis, we identified three functional domains: (1) bias mitigation and representation (e.g. auditing gendered language and platform-level disparities), (2) skills development and empowerment (e.g. AI-supported learning and writing interventions) and (3) career pathways and retention (e.g. matching and attrition-risk modelling). The evidence base was concentrated in system-facing applications that detect or shape inequities within recruitment, evaluation and exposure systems; fewer studies evaluated individual-facing developmental support, and sustained career outcomes were rarely measured. Formal theory use was limited, with only a small minority of studies explicitly drawing on established frameworks; reporting on ethics, transparency and governance was inconsistent. We suggest that research prioritises longitudinal and theory-informed evaluations, including intersectionality-informed analyses, and assess downstream impacts on women’s career trajectories alongside robust governance and accountability practices.

Summary

Main Finding

A scoping review of 13 empirical studies (2018–2025) finds AI applications aimed at supporting women’s career development cluster into three functional domains—(1) bias mitigation & representation, (2) skills development & empowerment, and (3) career pathways & retention. Evidence is concentrated on system-facing tools (e.g., auditing language, ad-delivery, screening, exposure models) with few sustained individual-level outcome evaluations. Formal theory, intersectional analyses and transparent ethics/governance reporting are limited, and long-term career impacts are rarely measured.

Key Points

  • Evidence base: 13 empirical studies identified via a preregistered PRISMA-ScR search across seven databases (search period 2010–Mar 2025; synthesis focused on 2018–2025).
  • Three functional domains:
    • Bias mitigation & representation: NLP and auditing tools that detect gendered language, platform-level exposure disparities, and biased visuals.
    • Skills development & empowerment: GenAI/NLP-based writing and learning interventions that improved short-term skills/confidence in small samples.
    • Career pathways & retention: ML/HR-analytics models for job matching, returnship alignment, and attrition risk forecasting.
  • Modality and focus: Most studies are system-facing (detecting or shaping inequities in recruitment, evaluation, and exposure). Fewer studies evaluate individual-facing developmental supports, and almost none track sustained labor-market outcomes (wages, promotions, long-run retention).
  • Methods & rigor: Heterogeneous methods (field experiments, predictive modeling, qualitative narratives, quasi-experiments); limited use of formal theoretical frameworks (some use Systems Theory Framework and Social Cognitive Career Theory).
  • Ethics & governance: Reporting on transparency, fairness metrics, accountability, and governance is inconsistent across studies.
  • Geographic spread: Studies come from multiple regions (USA, Europe, India, and the Middle East, including Saudi Arabia and the UAE); results are often context-specific (e.g., ad-delivery algorithms favoring men because of cost optimization).
  • Representative findings: algorithmic ad delivery prioritized men for STEM ads; NLP revealed persistent gendered descriptors in evaluations and letters of recommendation; ChatGPT-based training improved writing fluency/confidence among teachers.
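
The exposure disparities mentioned in the key points can be expressed as a simple audit ratio. The sketch below is illustrative only: the function name, the impression counts, and the audience sizes are hypothetical and are not taken from the review or from any included study.

```python
def exposure_ratio(impressions_women, impressions_men, audience_women, audience_men):
    """Per-person ad exposure for women divided by per-person exposure for men.

    A value of 1.0 means equal exposure; values below 1.0 mean the ad
    reached women less often, relative to the size of each audience.
    """
    rate_women = impressions_women / audience_women
    rate_men = impressions_men / audience_men
    return rate_women / rate_men


# Hypothetical audit numbers for a single STEM job ad (not real data).
ratio = exposure_ratio(
    impressions_women=4_000,
    impressions_men=6_000,
    audience_women=50_000,
    audience_men=50_000,
)
print(f"exposure ratio (women/men): {ratio:.2f}")
```

A ratio like this only describes who saw the ad; it says nothing about why delivery was skewed or about downstream career outcomes, which is the gap the review highlights.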

Data & Methods

  • Review design: PRISMA-ScR guided scoping review with preregistered protocol (OSF link provided in paper).
  • Search & selection:
    • Databases searched: PubMed, Scopus, Web of Science, APA PsycInfo, APA PsycArticles, Psychology & Behavioral Sciences Collection, Google Scholar.
    • Initial de-duplication and screening: 702 unique records screened; 36 full-text assessed (24 from databases + 12 via snowballing); 13 studies included.
    • Two-reviewer blinded screening using Rayyan; conflicts adjudicated by a third reviewer when needed.
  • Data extraction: standardized charting (authors, year, country, population, AI type, theoretical framework, career outcomes, ethical/technical challenges). Dual extraction with cross-checking on 15% of entries.
  • Synthesis: Inductive thematic analysis (Braun & Clarke; Saldaña) to generate thematic domains and map studies to system-facing vs individual-facing, proximal vs sustained outcomes, theory use, and ethics reporting.
  • AI methods observed in included studies: supervised ML (decision trees, logistic regression), NLP and text-mining, sentiment/emotion analysis, ML ad-delivery systems, GenAI (LLMs, image-based), KNN classifiers, HR analytics.
  • Outcome measurement scope:
    • Proximal outcomes: detection of bias, short-term skill/confidence gains, model prediction accuracy (e.g., resume-to-role alignment).
    • Sustained outcomes: sparse—some attrition-risk forecasts and retrospective cohort analyses, but limited causal or long-term follow-up.
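
The screening counts reported above can be checked with a few lines of arithmetic. This is a minimal sketch; the function and key names are made up for illustration, and only the four input numbers come from the summary.

```python
def screening_summary(records_screened, fulltext_db, fulltext_snowball, included):
    """Summarise a screening flow from the reported counts."""
    fulltext_total = fulltext_db + fulltext_snowball
    return {
        "records_screened": records_screened,
        "fulltext_total": fulltext_total,
        "included": included,
        "excluded_at_fulltext": fulltext_total - included,
        # Share of full-text articles that ended up in the synthesis.
        "fulltext_inclusion_rate": included / fulltext_total,
    }


# Counts as reported: 702 records screened, 24 + 12 full texts, 13 included.
summary = screening_summary(702, 24, 12, 13)
for key, value in summary.items():
    print(key, value)
```

The full-text total matches the reported 36, so 23 articles were excluded at the full-text stage and roughly 36% of assessed full texts were included.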

Implications for AI Economics

  • Distributional impacts and market efficiency:
    • AI can reduce frictions in access to guidance (scale, personalization) and potentially increase labor-market participation and human capital accumulation among women, improving allocative efficiency.
    • However, platform-level optimization objectives (e.g., profit or engagement maximization) can generate negative distributional externalities—unequal exposure to job ads or recruitment funnels—that exacerbate gender gaps rather than correct them.
  • Labor supply, retention, and human-capital returns:
    • Tools that improve short-term skills/confidence may raise female labor supply or persistence in male-dominated fields, but absent long-run evidence, the effect on wages, promotions, and returns to training is unknown.
    • Attrition-risk models and targeted retention interventions have potential welfare gains if used to inform equitable HR policies; misuse (e.g., surveillance, punitive measures) could reduce worker welfare and increase turnover.
  • Incentive design and objective functions:
    • Economists and platform designers should re-specify algorithmic objectives to internalize equity considerations (e.g., multi-objective optimization combining engagement/revenue with exposure parity or downstream diversity targets).
    • Regulatory or subsidy mechanisms can shift private incentives—e.g., require fairness constraints, mandate transparency, or subsidize equity-oriented features on recruitment platforms.
  • Measurement and causal inference needs:
    • To quantify economic value and distributional effects, randomized controlled trials, quasi-experimental designs, and longitudinal administrative datasets are needed to estimate causal impacts on wages, promotions, labor-force participation, job-match quality, and retention.
    • Key metrics for economists: treatment effects on earnings, promotion hazard rates, job-match surplus, retention probabilities, return-on-investment of AI coaching, externalities on aggregate labor-market sorting, and cost-effectiveness compared to human-delivered interventions.
  • General equilibrium and long-run dynamics:
    • Widespread adoption of AI-mediated career tools could shift occupational composition and network effects (e.g., who gets visibility), altering wage structures and returns to skills—dynamic/GE models should assess feedback loops (e.g., firms’ hiring behavior adjusting to AI-screened pools).
  • Data, governance, and public policy:
    • Poor transparency and limited governance heighten risks of biased outcomes; policy tools (algorithmic audits, disclosure requirements, data-access regimes for independent evaluation) are necessary to ensure accountability and calibrate private incentives.
    • Intersectionality matters: economic impacts differ across race, class, geography—data collection and reporting should enable intersectional stratification to avoid masking heterogeneity in returns.
  • Research & policy priorities for AI economics:
    • Fund longitudinal RCTs and quasi-experimental studies measuring wage and promotion outcomes.
    • Evaluate trade-offs between efficiency gains and equity (cost-benefit analyses incorporating distributional weights).
    • Develop and test incentive-aligned algorithmic objectives and platform-level constraints that balance profitability with exposure parity.
    • Create standardized reporting (transparency, fairness metrics, governance) to enable cross-study meta-analysis and policy evaluation.
    • Encourage public-private data partnerships with privacy safeguards to enable external auditing and robust causal research.
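
The idea of combining a reach or revenue objective with exposure parity can be sketched as a toy optimisation. Everything here is an assumption made for illustration: the objective (total impressions minus a penalty on the impression gap), the costs per impression, and the function name are hypothetical and do not describe any real platform or any study in the review.

```python
def best_budget_share(budget, cost_men, cost_women, parity_weight, steps=100):
    """Grid-search the share of an ad budget spent on reaching women.

    Hypothetical objective: total impressions minus parity_weight times
    the absolute gap between impressions shown to women and to men.
    """
    def objective(share):
        imp_women = share * budget / cost_women
        imp_men = (1 - share) * budget / cost_men
        return (imp_women + imp_men) - parity_weight * abs(imp_women - imp_men)

    shares = [i / steps for i in range(steps + 1)]
    return max(shares, key=objective)


# Assumed costs: reaching women costs 1.5 per impression, men 1.0.
reach_only = best_budget_share(100.0, 1.0, 1.5, parity_weight=0.0)
with_parity = best_budget_share(100.0, 1.0, 1.5, parity_weight=1.0)
print("share of budget on women, reach only:", reach_only)
print("share of budget on women, with parity term:", with_parity)
```

With these made-up costs, a pure reach objective spends the entire budget on the cheaper audience, while a sufficiently large parity weight moves the split to the point where both groups receive the same number of impressions. The parity gain costs some total reach, which is the efficiency-equity trade-off the priorities above ask researchers to quantify.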

In short, AI applications have meaningful potential to lower barriers and scale career supports for women, but economic benefits depend on objective design, governance, and careful measurement of long-run labor-market outcomes; absent those, algorithmic systems risk perpetuating or amplifying existing gendered inequities.

Assessment

Paper Type: review_meta

Evidence Strength: low. The review synthesises only 13 empirical studies (2018–2025) that are concentrated on system-facing audits and short-term interventions; few studies use longitudinal designs, formal causal identification, or measure downstream career outcomes, limiting cumulative causal evidence.

Methods Rigor: high. The review followed a preregistered protocol and the PRISMA-ScR guidelines, searched seven databases plus backward/forward citation searching, and used transparent inductive thematic analysis, indicating a comprehensive and reproducible scoping review approach despite limited primary studies.

Sample: A scoping synthesis of 13 empirical studies published 2018–2025 identified through searches of seven databases and citation chaining; the included studies span system-facing applications (bias audits, recruitment/evaluation platform analyses, exposure disparities), some AI-supported learning/writing interventions, and matching/attrition-risk models, with mixed qualitative and quantitative methods and generally short follow-up.

Themes: inequality, human_ai_collab, skills_training, governance

Generalizability:

  • Small number of included studies (n=13) limits representativeness.
  • Heterogeneous methods, outcomes and intervention types impede cross-study generalization.
  • Geographic and sectoral coverage not clearly comprehensive or representative.
  • Findings concentrated on system-facing recruitment/evaluation tools rather than individual-facing interventions.
  • Few longitudinal or long-term outcome measures, limiting inference about sustained career impacts.
  • Limited use of formal theory and sparse intersectional analyses restrict applicability across demographic subgroups.

Claims (9)

  • Claim: Artificial intelligence (AI) is increasingly integrated into career guidance and organisational decision systems. (Adoption Rate)
    • Direction: positive · Confidence: high
    • Outcome: integration/adoption of AI into career guidance and organisational decision systems
    • Details: 0.24
  • Claim: Empirical evidence on applications designed to support women’s career development remains limited. (Research Productivity)
    • Direction: negative · Confidence: high
    • Outcome: availability/quantity of empirical evidence on AI for women's career development
    • Details: n=13 · 0.24
  • Claim: We searched seven databases (plus backward and forward citation searching) and synthesised 13 empirical studies published between 2018 and 2025. (Research Productivity)
    • Direction: null_result · Confidence: high
    • Outcome: number of empirical studies identified and synthesized
    • Details: n=13 · 13 empirical studies · 0.4
  • Claim: Using inductive thematic analysis, we identified three functional domains: (1) bias mitigation and representation, (2) skills development and empowerment and (3) career pathways and retention. (Innovation Output)
    • Direction: positive · Confidence: high
    • Outcome: categorisation of AI applications into functional domains
    • Details: n=13 · 0.24
  • Claim: The evidence base was concentrated in system-facing applications that detect or shape inequities within recruitment, evaluation and exposure systems. (Adoption Rate)
    • Direction: neutral · Confidence: high
    • Outcome: focus of existing empirical studies (system-facing vs individual-facing applications)
    • Details: n=13 · 0.24
  • Claim: Fewer studies evaluated individual-facing developmental support, and sustained career outcomes were rarely measured. (Employment)
    • Direction: negative · Confidence: high
    • Outcome: number of studies evaluating individual-facing developmental support and measurement of sustained career outcomes
    • Details: n=13 · 0.24
  • Claim: Formal theory use was limited, with only a small minority of studies explicitly drawing on established frameworks. (Research Productivity)
    • Direction: negative · Confidence: high
    • Outcome: use of formal theoretical frameworks in studies
    • Details: n=13 · small minority · 0.24
  • Claim: Reporting on ethics, transparency and governance was inconsistent. (Governance And Regulation)
    • Direction: negative · Confidence: high
    • Outcome: consistency of reporting on ethics, transparency and governance in the literature
    • Details: n=13 · 0.24
  • Claim: Research should prioritise longitudinal and theory-informed evaluations, including intersectionality-informed analyses, and assess downstream impacts on women’s career trajectories alongside robust governance and accountability practices. (Governance And Regulation)
    • Direction: positive · Confidence: high
    • Outcome: recommended research priorities (longitudinal/theory-informed studies, intersectional analyses, governance/accountability assessments)
    • Details: 0.04

Notes