
Agentic AI is already lifting productivity in banking and investment, but the evidence is uneven and governance gaps persist. Insurance lags behind, and standardized evaluation, regulation, and interdisciplinary research are urgently needed.

A Comparative & Systematic Review of Literature on the Impact of Agentic AI on Selected Financial Services: Banking, Insurance & Investment

Krishna Kedia, Dhaval J. Thaker, Dr. Chirag Mehta · March 31, 2026 · International Journal of Creative and Open Research in Engineering and Management

openalex · review_meta · evidence: n/a · relevance: 7/10

A systematic review finds agentic AI delivers notable productivity and operational efficiency gains mainly in banking and investment, while insurance is under-studied and significant governance, workforce, and integration barriers remain.

This review synthesizes research on "Comparative analysis of Agentic AI's impact across different financial services such as banking, insurance, and investment" to address the underexplored differential effects and governance challenges of agentic AI deployment. The review aimed to evaluate agentic AI applications and outcomes across sectors, benchmark architectural frameworks, identify ethical and regulatory challenges, compare productivity and risk management benefits, and analyze implementation barriers. A systematic analysis of multidisciplinary studies published up to mid-2024 was conducted, encompassing qualitative, quantitative, and bibliometric methodologies focused on agentic AI technologies in financial domains. Findings reveal substantial productivity gains and operational efficiencies predominantly in banking and investment, with insurance comparatively underrepresented; diverse architectural models such as multi-agent systems and cloud-based frameworks enable scalable, adaptive deployments; ethical concerns including bias, transparency, and regulatory compliance remain critical, necessitating layered governance and human-AI collaboration; and significant implementation barriers persist, notably workforce transformation, legacy system integration, and trust deficits. These findings collectively underscore the transformative potential of agentic AI while highlighting persistent gaps in empirical validation, standardized evaluation, and sector-specific comparative analyses. The review informs theoretical understanding and practical governance by emphasizing the need for interdisciplinary, longitudinal research and robust frameworks to optimize agentic AI integration and responsible innovation across financial services.

Keywords: Agentic AI, Financial Services, Systematic LR, Banking, Insurance, etc.

Summary

Main Finding

Agentic AI (autonomous, goal-directed multi-agent systems) shows strong potential to raise productivity and improve risk management in financial services, with the largest documented impacts in banking and investment and comparatively limited evidence for insurance. Substantial operational gains (e.g., large task-time reductions, call-center efficiencies, better credit scoring and fraud detection) are reported across studies, but persistent gaps in empirical validation, standardized benchmarks, governance, and sector‑specific comparative evidence limit confident economic conclusions and policy prescriptions.

Key Points

Reported quantitative effects (as synthesized in the review)
- Productivity and task-time: reported gains range widely, from up to ~80% productivity gains on some data tasks (Joshi, 2025), through 20–60% productivity gains (Shukla, 2025), to a 34% task-time reduction with a 7.7% accuracy gain in one quantitative study (Sawant, 2025).
- Customer-service and operations: 40–60% reduction in call-handle time and ~30% boost in resolution rates for voice/agent automation (Bhogawar, 2025).
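The figures above mix two different units: some studies report productivity gains, others report task-time reductions, and the two are not directly comparable. The following is a minimal sketch of a conversion between them, assuming that productivity means tasks completed per unit of time and that output quality is unchanged. Both assumptions, and the function names, are illustrative and do not come from the review.

```python
def throughput_gain_from_time_reduction(time_reduction: float) -> float:
    """Fractional throughput gain implied by a fractional task-time reduction.

    Assumes throughput is inversely proportional to time per task and that
    output quality is constant (simplifying assumptions, not from the review).
    """
    if not 0.0 <= time_reduction < 1.0:
        raise ValueError("time_reduction must be in [0, 1)")
    return 1.0 / (1.0 - time_reduction) - 1.0


def time_reduction_from_throughput_gain(gain: float) -> float:
    """Inverse conversion: fractional task-time reduction implied by a gain."""
    if gain < 0.0:
        raise ValueError("gain must be non-negative")
    return 1.0 - 1.0 / (1.0 + gain)


if __name__ == "__main__":
    # Task-time and call-handle-time reductions quoted in the summary.
    for reduction in (0.34, 0.40, 0.60):
        gain = throughput_gain_from_time_reduction(reduction)
        print(f"{reduction:.0%} less time per task -> {gain:.1%} more throughput")
```

Under these assumptions, a 34% task-time reduction corresponds to roughly a 52% throughput gain, so it sits inside the 20–60% productivity range rather than below it.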
Sectoral differentiation
- Banking and investment: most evidence of productivity, risk-model improvement, and deployment of agentic AI for trading, credit scoring, portfolio optimization, AML, and personalized advisory.
- Insurance: underrepresented in the literature; some work on underwriting and claims automation but fewer comparative/quantitative studies.
Technical and architectural patterns
- Dominant architectures include multi-agent systems, reinforcement-learning agents, modular agent orchestration, Retrieval-Augmented Generation (RAG) for knowledge grounding, and cloud-native deployments (AWS-based architectures, LangGraph, CrewAI, AutoGen, IBM watsonx referenced).
- Hybrid human–AI decision models and orchestrated agent frameworks are common practical patterns.
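A hybrid human–AI decision model can be as simple as a routing rule that lets an agent act alone only on routine, high-confidence cases. The sketch below is one hypothetical way to express that rule. The `Proposal` type, the `route` function, and the 0.9 threshold are illustrative assumptions and are not taken from the review or from any of the frameworks it names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A decision proposed by an agent (hypothetical structure)."""
    action: str        # e.g. "approve" or "decline"
    confidence: float  # agent's self-reported confidence in [0, 1]


def route(proposal: Proposal, *, threshold: float = 0.9,
          high_stakes: bool = False) -> str:
    """Return "auto" to execute the proposal or "human" to escalate it.

    A case is escalated when it is flagged as high stakes or when the
    agent's confidence falls below the (assumed) threshold.
    """
    if not 0.0 <= proposal.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if high_stakes or proposal.confidence < threshold:
        return "human"
    return "auto"


if __name__ == "__main__":
    print(route(Proposal("approve", 0.97)))                    # routine case
    print(route(Proposal("decline", 0.62)))                    # low confidence
    print(route(Proposal("approve", 0.99), high_stakes=True))  # always reviewed
```

Keeping the escalation rule outside the agent makes it auditable on its own, which fits the layered-governance theme of the review.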
Governance, ethics and robustness
- Recurrent concerns: algorithmic bias, lack of transparency/explainability, hallucinations, coordination failures, accountability and legal liability, data privacy, and regulatory uncertainty.
- Mitigation tools discussed: XAI techniques (SHAP, LIME), fairness metrics and testing frameworks, layered governance combining technical controls and institutional oversight.
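The review refers to fairness metrics only in general terms. As an illustration, the sketch below computes one common choice, the demographic parity difference: the gap between the highest and lowest approval rates across groups. The choice of metric, the function names, and the sample data are assumptions made for this example. The data are invented and are not results from any study.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def approval_rates(decisions: Iterable[Tuple[str, bool]]) -> dict:
    """Approval rate per group from (group, approved) pairs."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for group, approved in decisions:
        counts[group][0] += int(approved)
        counts[group][1] += 1
    return {group: ok / total for group, (ok, total) in counts.items()}


def demographic_parity_difference(decisions: Iterable[Tuple[str, bool]]) -> float:
    """Largest gap in approval rate between any two groups (0 means parity)."""
    rates = approval_rates(decisions)
    if not rates:
        raise ValueError("no decisions supplied")
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Hypothetical credit decisions: group A has 8 of 10 approved, group B 6 of 10.
    sample = ([("A", True)] * 8 + [("A", False)] * 2
              + [("B", True)] * 6 + [("B", False)] * 4)
    print(approval_rates(sample))
    print(round(demographic_parity_difference(sample), 3))
```

A single metric like this is a screening signal rather than a verdict, which is why the review pairs such metrics with testing frameworks and institutional oversight.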
Implementation barriers
- Workforce transformation and skill gaps, legacy system integration, trust deficits among customers and staff, orchestration complexity, and data quality/availability.
Evidence gaps and research needs
- Lack of standardized evaluation metrics and benchmarks, few longitudinal or randomized causal studies, limited sector‑specific comparative analyses, and insufficient empirical validation of large reported gains.

Data & Methods

Review scope and corpus
- Time window: studies published primarily between 2022 and 2025, spanning peer‑reviewed articles, industry reports, and case studies. The abstract gives the cutoff as mid‑2024, although the synthesis cites studies dated 2025.
- Search and screening: transformed core query into multiple targeted queries; initial retrieval yielded 379 papers, plus 83 from backward/forward citation chaining (total 462 candidate papers).
- Relevance filtering: 448 deemed relevant; 50 identified as highly relevant and synthesized in depth.
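The screening counts above can be checked with simple arithmetic. The sketch below recomputes the funnel from the reported figures. The function and key names are hypothetical, and the only inputs are the four counts quoted in this section.

```python
def screening_funnel(initial: int, chained: int, relevant: int,
                     in_depth: int) -> dict:
    """Summarize a literature-screening funnel from raw counts."""
    candidates = initial + chained
    if not 0 <= in_depth <= relevant <= candidates:
        raise ValueError("counts must satisfy in_depth <= relevant <= candidates")
    return {
        "candidates": candidates,
        "relevant_share": relevant / candidates,
        "in_depth_share_of_relevant": in_depth / relevant,
    }


# Counts as reported: 379 retrieved, 83 from citation chaining,
# 448 deemed relevant, 50 synthesized in depth.
funnel = screening_funnel(initial=379, chained=83, relevant=448, in_depth=50)

if __name__ == "__main__":
    print(funnel["candidates"])
    print(f"{funnel['relevant_share']:.1%} of candidates deemed relevant")
    print(f"{funnel['in_depth_share_of_relevant']:.1%} of relevant papers synthesized in depth")
```

With the reported counts, about 97% of candidates pass the relevance filter and about 11% of the relevant papers receive in-depth synthesis, so most of the selection appears to happen at the "highly relevant" step rather than at relevance filtering.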
Study types included
- Mixed methods: qualitative thematic and governance analyses, case studies, quantitative performance evaluations (task-time, accuracy, resolution rates), and bibliometric mappings.
- Architectural and technical analyses covering multi-agent designs, RL decision models, RAG/LLM orchestration, cloud architectures, and explainability/fairness tool use.
Synthesis approach
- Thematic comparison across the three sectors (banking, insurance, investment) focusing on: applications/outcomes, architectural frameworks, ethical/regulatory challenges, productivity/risk-management metrics, and implementation barriers.

Implications for AI Economics

Productivity and growth measurement
- Agentic AI can generate large within‑firm productivity improvements in routine and semi-routine cognitive tasks, but the magnitude and persistence of economy-wide total factor productivity (TFP) gains are uncertain without standardized measurement and longitudinal evidence.
- Requirement: develop sector-tailored productivity metrics (e.g., task-time adjusted throughput, risk-adjusted returns in investment, claims-processing cycle time) and standardized benchmarks to enable comparable economic estimates.
Labor markets and distributional effects
- Likely heterogeneous labor impacts: task displacement in back‑office and call‑center roles, but sizable re‑skilling/up‑skilling opportunities and new roles (agent orchestration, governance, data engineering).
- Economics research should quantify net employment effects, wage impacts across skill groups, and reallocation frictions (search, retraining, geographic mismatch).
Market structure and competition
- Agentic AI may shift competitive advantage toward firms that (a) integrate agentic architectures effectively, (b) control high-quality data, and (c) can invest in governance and orchestration—raising concerns about concentration and economies of scale in finance.
- Policy implications: antitrust monitoring, access-to-data rules, and standards to avoid lock-in.
Risk, systemic stability and regulation
- Improved fraud detection and real-time risk monitoring are potential social benefits, but emergent agentic behavior, coordination failures among agents, and model opacity create new systemic risks.
- Regulators need frameworks addressing model auditability, incident reporting, and stress-testing of autonomous agents (analogous to stress tests for banks).
Welfare and consumer outcomes
- Potential consumer gains: lower costs, faster service, expanded financial access and personalization.
- Risks: biased decisioning, unfair denial of services, and opacity that undermines consumer trust—necessitating consumer protections and transparency mandates.
Research and policy agenda for AI economics
- Empirical priorities: randomized controlled trials (where feasible), panel/longitudinal firm-level data, and cross-sector comparative datasets to estimate causal impacts on productivity, employment, and market outcomes.
- Standardization: develop common benchmarks and reporting standards (performance, fairness, explainability, governance) to enable meta-analyses and policy evaluation.
- Interdisciplinary work: economics, computer science, law, and organizational studies to design incentive-compatible governance, retraining programs, and regulatory instruments that balance innovation and social protection.

Suggested short-term policy/research actions
- Fund and coordinate longitudinal data collection linking firm AI adoption to outcomes (productivity, employment, consumer prices).
- Create sector-specific benchmark tasks and datasets for agentic AI evaluation in banking, insurance, and investment.
- Pilot regulatory sandboxes focused on agentic AI to test governance approaches and stress-test systemic risk channels.
- Invest in workforce transition programs tied to measurable upskilling outcomes in finance.


Assessment

Paper Type: review_meta

Evidence Strength: n/a. This is a systematic literature review synthesizing secondary studies rather than producing primary causal estimates; the underlying evidence base is heterogeneous and often lacks rigorous causal identification, so the review itself does not provide primary causal evidence.

Methods Rigor: medium. The paper uses a systematic, multidisciplinary search and combines qualitative, quantitative, and bibliometric methods, which supports thorough coverage; however, heterogeneity in included studies, likely variation in study quality, limited disclosure of inclusion/exclusion criteria (as summarized), and scarce longitudinal or causal studies in the underlying literature constrain rigor.

Sample: A systematic corpus of multidisciplinary studies on agentic AI in financial services published through mid-2024, including qualitative case studies, quantitative analyses, and bibliometric investigations covering applications in banking, investment, and insurance (with insurance underrepresented). There is no single primary dataset; findings are synthesized across published papers and reports.

Themes: productivity, governance, human_ai_collab, adoption

Generalizability
- Sector imbalance: banking and investment are overrepresented while insurance is underrepresented.
- Heterogeneous study designs and definitions of 'agentic AI' reduce comparability across findings.
- Possible geographic bias toward advanced economies and large financial institutions.
- Publication and positive-result bias in literature syntheses.
- Rapid technological change limits applicability of some older studies.
- Lack of longitudinal and causal studies restricts inference about long-term impacts.

Claims (8)

Claim	Category	Direction	Confidence	Outcome	Details
Findings reveal substantial productivity gains and operational efficiencies predominantly in banking and investment.	Firm Productivity	positive	high	productivity gains and operational efficiencies	0.24
Insurance is comparatively underrepresented in the literature and in reported agentic AI deployments compared with banking and investment.	Adoption Rate	negative	high	relative representation/adoption across financial subsectors	0.24
Diverse architectural models such as multi-agent systems and cloud-based frameworks enable scalable, adaptive agentic AI deployments in financial services.	Adoption Rate	positive	high	scalability and adaptivity of deployments	0.24
Ethical concerns, including bias, lack of transparency, and regulatory compliance risks, remain critical for agentic AI in financial services and necessitate layered governance and human-AI collaboration.	Governance And Regulation	negative	high	prevalence/severity of ethical and regulatory risks and governance needs	0.24
Significant implementation barriers persist, notably workforce transformation challenges, legacy system integration difficulties, and trust deficits.	Skill Obsolescence	negative	high	implementation barriers (workforce, legacy systems, trust)	0.24
The literature shows persistent gaps in empirical validation, standardized evaluation methods, and sector-specific comparative analyses of agentic AI in financial services.	Research Productivity	negative	high	availability/quality of empirical validation and evaluation standards	0.24
The review employed a systematic analysis of multidisciplinary studies (qualitative, quantitative, and bibliometric) focused on agentic AI technologies in financial domains, covering literature published up to mid-2024.	Research Productivity	null_result	high	scope and methods of the review itself	0.4
To optimize agentic AI integration and ensure responsible innovation across financial services, interdisciplinary, longitudinal research and robust governance frameworks are needed.	Governance And Regulation	positive	high	recommended research and governance actions	0.04