The Commonplace

When paired with mechanistic priors, synthesis‑aware design, robust external validation and regulatory alignment, AI can cut drug development time and raise early‑phase success rates; absent proper validation, dataset bias and misalignment with regulators can negate gains and create costly setbacks.

Artificial Intelligence in Drug Discovery and Development: Raising Quality per Decision
Shota Furukawa, Hiroyuki Uchida, Gabriela Novak · March 05, 2026 · Pharmacopsychiatry
OpenAlex · review_meta · medium evidence · 7/10 relevance
This narrative review argues that AI can materially speed and improve drug discovery and early development—raising early‑phase success and compound quality—if models are predictive, interpretable, synthesis/physics‑aware, externally validated within defined applicability domains, and governed to meet regulatory and equity requirements; without these controls, AI risks overfitting, biased outcomes, and regulatory friction.

Drug research and development continuously encounters prolonged timelines, escalating costs, and high attrition rates. In this narrative review, we integrated recent advances in artificial intelligence across target identification, drug repurposing, de novo molecular design, structural biology, safety prediction, and artificial intelligence-supported clinical development, aligning these innovations with evolving global regulatory frameworks. Predictive and interpretable artificial intelligence could enhance the quality of decision-making throughout the research and development process when combined with causal or mechanistic priors, synthesis-aware and physics-informed molecular design, external validation with clear applicability domains, and governance systems aligned with multiple regulatory guidelines and qualified digital endpoint applications. Case studies of artificial intelligence-assisted discovery and repurposing demonstrate shorter development timelines, improved compound quality, and higher early-phase success rates, while underscoring challenges such as overfitting, model generalizability, and dataset bias. Establishing a context-of-use-based "credibility plan" and adopting equity-by-design through the inclusion of non-European datasets and subgroup performance evaluation are essential for achieving generalizable impact. Artificial intelligence integration with new approach methodologies and adaptive or covariate-adjusted clinical trials may help reduce development inefficiency without compromising scientific or ethical rigor.

Summary

Main Finding

AI/ML are maturing from proofs of concept into commercially and regulatorily relevant tools across the drug R&D pipeline. When combined with causal priors, physics- and synthesis-aware design, rigorous external validation, and proportional governance, AI can raise "quality per decision": shortening timelines, improving early-phase candidate quality, and materially changing the economics of drug development. These benefits depend on careful validation, applicability-domain controls, and regulatory alignment.

Key Points

  • R&D economics baseline: typical discovery-to-approval timelines remain ~10–15 years and capitalized cost per approved drug is estimated at USD 1.1–2.6 billion (including failures); Phase I → approval success ~7.9% (2011–2020).
  • Where AI shows highest leverage:
    • Target ID & repurposing: KGs, multi‑omics and genetics integration can prioritize genetically validated targets and repurposing candidates faster and with mechanistic explainability.
    • De novo design & structural biology: AlphaFold (and AF3) and diffusion/ML approaches accelerate structure-informed design; examples show orders-of-magnitude faster target→hit cycles (e.g., CDK20 hit in ~30 days).
    • ADMET & safety: multitask GNNs, self‑supervised DL, and integrated ADMET platforms improve early developability triage; explainability (SHAP/LIME) and synthetic‑feasibility checks are critical.
    • Clinical development: AI aids patient selection, trial simulation, early safety signal detection, and digital endpoints; adaptive and covariate‑adjusted designs with prognostic models can raise trial efficiency.
  • Validation & governance are central: regulators (ICH E6(R3), FDA draft AI guidance, EMA reflection paper, EU AI Act, WHO guidance) emphasize context-of-use (COU), data provenance, V&V, lifecycle monitoring, transparency, and human oversight.
  • Common failure modes and limits: overfitting, poor out‑of‑domain generalization, dataset bias (Eurocentric datasets), leakage in KGs, hallucinated generative outputs, and non‑viable synthetic routes.
  • Business activity: major platform–pharma deals (e.g., Sanofi–Exscientia up to US$5.2B headline, Alphabet/Isomorphic Labs with Lilly/Novartis ~US$3B, Roche–Recursion multi‑billion) and compute–biotech collaborations (NVIDIA–Recursion) indicate large capital commitments and a shift toward multi‑program platform alliances.
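
The attrition arithmetic behind the R&D economics baseline can be made concrete with a minimal sketch. Only the ~7.9% Phase I → approval success rate comes from the review; the per-phase costs and advancement probabilities below are illustrative assumptions chosen to roughly reproduce that overall rate, not figures from the paper:

```python
# Illustrative (assumed) out-of-pocket cost per candidate entering each
# phase (USD millions) and probability of advancing to the next stage.
# Only the overall Phase I -> approval rate (~7.9%) is from the review;
# this per-phase decomposition is a made-up example.
phases = [
    ("Phase I",   25,  0.52),
    ("Phase II",  60,  0.29),
    ("Phase III", 250, 0.58),
    ("Review",    5,   0.90),
]

def expected_cost_per_approval(phases):
    """Expected spend across all attempts, divided by approvals."""
    p_reach = 1.0          # probability a Phase I entrant reaches this phase
    expected_spend = 0.0   # expected cost per Phase I entrant
    for _, cost, p_advance in phases:
        expected_spend += p_reach * cost
        p_reach *= p_advance
    return expected_spend / p_reach, p_reach

cost, p_approval = expected_cost_per_approval(phases)
print(f"P(approval | Phase I entry) = {p_approval:.3f}")
print(f"Expected cost per approved drug = ${cost:.0f}M (uncapitalized)")
```

Under these toy numbers the expected spend lands near the low end of the USD 1.1–2.6 billion range before capitalization, and most of it accrues in late phases, which is why the review's point that late-stage attrition dominates per-approval cost falls straight out of the arithmetic.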

Data & Methods

  • Paper type: narrative review synthesizing peer‑reviewed literature, regulatory guidance, industry press releases, technical reports, and selected translational case studies (not a systematic review).
  • Search strategy: PubMed, Google Scholar, forward/back citation tracking, targeted grey literature searches; emphasis on advances through late 2025.
  • AI/ML methods surveyed:
    • Targeting & repurposing: knowledge graphs and link‑prediction, GWAS colocalization, CRISPR functional genomics, transcriptomic/proteomic signature matching, Bayesian colocalization (reporting PP3/PP4, priors), degree‑confounding adjustments for KGs.
    • Structural & design: AlphaFold database (>200M proteins), AF3 for protein–ligand complexes, diffusion architectures, docking with confidence‑guided refinement, generative models for molecules with synthesis‑aware constraints (SELFIES, AiZynthFinder), multi‑objective optimization (BBB, hERG, CYP constraints).
    • ADMET & toxicity: pkCSM, ADMETlab2.0, multitask GNNs, HelixADMET/ADMET‑AI, pretraining/transfer learning, explainability tools (SHAP/LIME/SME), synthetic feasibility checks, scaffold/time‑aware splits, model calibration.
    • Clinical & post‑market: prognostic covariate adjustment (PROCOVA), digital endpoints fit‑for‑purpose validation per FDA DHT guidance, NLP‑based pharmacovigilance signal detection.
  • Validation recommendations: external/independent cohorts, leakage‑robust data splits, applicability domains, prespecified negative controls, orthogonal experimental confirmation for repurposing hits.
  • Limitations of the review: non‑systematic selection, early-stage examples with limited public replication, and heterogeneous reporting of model performance.
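
The leakage-robust, scaffold-aware splitting recommended above can be illustrated with a minimal group-based split: molecules sharing a scaffold stay on the same side of the train/test boundary, so a model is never scored on chemotypes it has effectively already seen. The scaffold IDs here are placeholders for real scaffold keys (e.g., Bemis–Murcko scaffolds computed with RDKit), and the smallest-groups-to-test heuristic is one common choice, not the review's prescribed method:

```python
from collections import defaultdict

def scaffold_split(mols, test_frac=0.2):
    """Leakage-robust split: whole scaffold groups go to train or test.

    `mols` is a list of (mol_id, scaffold_id) pairs; scaffold_id stands
    in for a real scaffold key such as a Bemis-Murcko scaffold SMILES.
    """
    groups = defaultdict(list)
    for mol_id, scaf in mols:
        groups[scaf].append(mol_id)
    # Send the smallest (rarest) scaffold groups to test first, so the
    # held-out set emphasizes chemotypes the model has seen least.
    test, train = [], []
    target_test = test_frac * len(mols)
    for scaf in sorted(groups, key=lambda s: len(groups[s])):
        bucket = test if len(test) < target_test else train
        bucket.extend(groups[scaf])
    return train, test

mols = [("m1", "A"), ("m2", "A"), ("m3", "B"), ("m4", "C"),
        ("m5", "C"), ("m6", "C"), ("m7", "D"), ("m8", "D"),
        ("m9", "E"), ("m10", "E")]
train, test = scaffold_split(mols, test_frac=0.2)
# Check: no scaffold appears on both sides of the split.
scaf_of = dict(mols)
assert not {scaf_of[m] for m in train} & {scaf_of[m] for m in test}
```

A random split over the same ten molecules would routinely place scaffold siblings on both sides, inflating apparent accuracy; the group constraint is what makes the external-validity estimate honest.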

Implications for AI Economics

  • Potential cost and timeline impact
    • Faster early discovery and repurposing can materially reduce time‑to‑hit and preclinical durations (example: repurposed programs often reach market within 3–12 years vs 10–15 for de novo), improving net present value (NPV) and lowering required capital.
    • Improved early candidate quality and predictive ADMET can reduce late‑stage attrition, which is the major driver of the per‑approved‑drug capitalized cost; even modest reductions in late‑stage failure rates amplify ROI.
    • Regulatory acceleration (e.g., priority review vouchers, adaptive reviews) combined with validated AI can shorten approval timelines and decrease time in development, again improving NPV.
  • Investment and business model shifts
    • Large headline deals and strategic investments signal a platformization of drug discovery: techbio/platform vendors partnering with big pharma on multi‑program portfolios, translating fixed‑cost AI platforms into scalable per‑program marginal returns.
    • High upfront compute and data costs (and the importance of proprietary/curated datasets) create winner‑take‑most dynamics; compute‑equipped firms and data holders gain pricing and bargaining power.
    • Partnerships with cloud/compute vendors (e.g., NVIDIA) reflect a verticalization: compute + ML models + biology expertise bundled as a service — altering CAPEX/OPEX profiles for drug companies.
  • Risk, governance and value realization
    • Realized economic benefit depends on rigorous validation and regulatory acceptance. Models that lack external validity or have biased applicability domains risk costly failed programs and reputational loss.
    • Governance, COU‑based credibility plans, and equity‑by‑design (diverse data, subgroup performance reporting) are economic necessities: poor generalizability can constrain market access and create regulatory delays/costs.
    • Operational costs increase: compliance with AI governance, lifecycle monitoring, documentation, and audits adds overhead; these are necessary to convert AI outputs into regulatory‑usable evidence and commercial value.
  • Market and competition effects
    • Shorter development cycles and repurposing may increase product throughput, intensify competition, and compress effective exclusivity windows, putting pressure on price and reimbursement strategies.
    • Data and compute concentration could raise barriers to entry and prompt consolidation or more partnerships between pharma and AI/platform firms.
    • Democratization risks vs rewards: Federated learning and data‑sharing consortia can widen the pool of actionable data and reduce single‑player dominance but require governance and investment.
  • Practical takeaways for investors and managers
    • Invest in validated, COU‑defended AI assets with documented external performance; prioritize integration of synthesis and ADMET constraints to avoid downstream failures.
    • Treat AI as a platform requiring sustained investment (data curation, compute, regulatory evidence generation) rather than a one‑off cost reduction tool.
    • Use staged investments and milestone structures (as seen in industry deals) to align financial exposure with technical/clinical validation inflection points.
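
The discounting logic running through the cost-and-timeline bullets can be sketched numerically. All inputs below (annual development spend, revenue stream, 10% discount rate) are illustrative assumptions, not estimates from the review; the point is only that pulling launch forward a few years compounds strongly through the discount factor:

```python
def npv(cashflows, rate=0.10):
    """Discount a list of (year, cash) pairs back to year 0."""
    return sum(cf / (1 + rate) ** t for t, cf in cashflows)

def program_npv(years_to_launch, rate=0.10):
    """NPV (USD millions) of one successful program: fixed annual dev
    spend of 100 until launch, then a flat 10-year revenue stream of 500.
    Risk adjustment would scale the revenues by a success probability."""
    costs = [(t, -100) for t in range(years_to_launch)]
    revenues = [(t, 500)
                for t in range(years_to_launch, years_to_launch + 10)]
    return npv(costs + revenues, rate)

slow = program_npv(years_to_launch=12)  # de novo-style timeline
fast = program_npv(years_to_launch=9)   # AI-shortened timeline
print(f"12-year program: {slow:.0f}M; 9-year program: {fast:.0f}M")
```

Under these toy numbers, shaving three years off development more than doubles the program's NPV before any risk adjustment; raising the success probability that scales the revenue stream (e.g., via better early-phase triage) multiplies the upside further.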


Assessment

  • Paper Type: review_meta
  • Evidence Strength: medium — The paper synthesizes multiple case studies, recent empirical papers, and regulatory analyses that collectively point to material benefits of AI in drug R&D, but it does not provide pooled effect-size estimates or causal identification; findings rely on heterogeneous examples, published successes, and plausibility arguments rather than systematic, counterfactual evaluation.
  • Methods Rigor: medium — The narrative synthesis draws on a broad set of relevant data types and discusses technical and regulatory validation practices, but it lacks a pre-registered systematic review protocol, explicit search and inclusion criteria, risk-of-bias assessment, or meta-analysis—limiting reproducibility and strength of inference.
  • Sample: Narrative synthesis of recent literature and case examples covering AI applications across target identification, drug repurposing, de novo molecular design, structural biology, safety/toxicity prediction, and AI-supported clinical development; data sources discussed include high-throughput screening and cheminformatics libraries, multi-omics and transcriptomics, structural biology (cryo-EM/X-ray and predicted structures), preclinical safety datasets, clinical trial datasets, real-world data, and digital/sensor endpoint data; no original primary dataset or pooled quantitative sample is analyzed.
  • Themes: productivity, innovation, governance, adoption, inequality
  • Generalizability:
    • Findings are synthesized from heterogeneous case studies and may reflect publication and selection bias toward successful AI examples
    • Evidence is likely concentrated in data‑rich firms and specific therapeutic areas, limiting transferability to smaller biotechs or under-resourced programs
    • Geographic and demographic bias in datasets (overrepresentation of European/US data) constrains global generalizability and equity claims
    • Rapid evolution of AI methods and regulatory standards may outdate specific technical recommendations
    • Absence of randomized or counterfactual estimates means economic impact magnitudes (e.g., cost or time reductions) are uncertain across contexts

Claims (16)

Each claim is annotated with its outcome category, direction, confidence (numeric score in parentheses), and the outcome it bears on.

  • Artificial intelligence (AI) can materially shorten drug development timelines when models are predictive, interpretable, and integrated with causal/mechanistic priors, synthesis- and physics-aware molecular design, rigorous external validation (with defined applicability domains), and governance aligned to regulatory requirements.
    Task Completion Time · positive · medium (0.14) · Outcome: drug development timeline (project duration from discovery to early development milestones)
  • AI can raise early-phase (e.g., Phase I/II) success rates when effectively applied with the technical and governance controls described.
    Research Productivity · positive · medium (0.14) · Outcome: early-phase clinical success rate (probability of progression through Phase I/II)
  • AI-assisted molecular design can improve lead/compound quality (e.g., potency, selectivity, developability) when using synthesis-aware and physics-informed approaches.
    Output Quality · positive · medium (0.14) · Outcome: compound/lead quality metrics (potency, selectivity, developability, synthetic feasibility)
  • Structural prediction tools and structural-biology advances speed target validation and can accelerate target identification/validation workflows.
    Task Completion Time · positive · medium (0.14) · Outcome: time to target validation and throughput of target characterization
  • Absent rigorous controls (validation, applicability-domain reporting, attention to dataset bias), AI models risk overfitting, producing inequitable outcomes and regulatory friction that can undermine economic benefits.
    AI Safety and Ethics · negative · high (0.24) · Outcome: model generalizability (out-of-sample performance), subgroup performance disparities, regulatory approval/clearance outcomes, economic impact (stranded R&D spending)
  • External validation, explicit applicability-domain reporting, and subgroup performance reporting improve model reliability and support regulatory alignment.
    Regulatory Compliance · positive · medium (0.14) · Outcome: model reliability/generalizability metrics and likelihood of regulatory acceptance
  • Synthesis-aware and physics-informed molecular design increases the downstream feasibility (synthetic accessibility and developability) of AI-designed compounds.
    Output Quality · positive · medium (0.14) · Outcome: synthetic success rate, developability indicators (e.g., ADMET proxies), time/cost to synthesize candidate compounds
  • AI-enabled trial innovations—such as integration with new approach methodologies (NAMs), adaptive and covariate-adjusted designs, and digital biomarkers—can reduce trial inefficiency while preserving scientific and ethical standards.
    Research Productivity · positive · medium (0.14) · Outcome: trial efficiency metrics (sample size, duration, cost) and maintenance of scientific/ethical integrity
  • Adopting equity-by-design (including diverse, non‑European datasets and subgroup evaluation) reduces model bias and improves global generalizability of AI models.
    AI Safety and Ethics · positive · medium (0.14) · Outcome: subgroup performance disparities, generalizability across populations/geographies
  • Key failure modes for AI in drug R&D include overfitting, poor generalizability, dataset bias, insufficient external validation, and misalignment with evolving regulatory expectations.
    Research Productivity · negative · high (0.24) · Outcome: failure incidence of AI projects (model performance collapse, regulatory rejection, biased clinical outcomes)
  • Economic value from AI adoption concentrates with data-rich firms and platforms that own large, high-quality datasets and validation pipelines.
    Firm Revenue · positive · medium (0.14) · Outcome: firm returns/competitive advantage attributable to dataset ownership and validation capacity (e.g., ROI, market share)
  • Clear regulatory alignment (e.g., preparation of credibility plans and qualified digital endpoints) reduces regulatory uncertainty, de-risks investment, and raises adoption rates of AI tools.
    Adoption Rate · positive · medium (0.14) · Outcome: regulatory uncertainty (qualitative), investment adoption rates in AI tools, pace of deployment
  • Conversely, lack of standards or failed validation can create regulatory setbacks, reputational risk, and stranded R&D spending.
    Regulatory Compliance · negative · medium (0.14) · Outcome: incidence of regulatory setbacks, reputational damage, amount of stranded/wasted R&D expenditure
  • Qualified digital endpoints and validated in silico markers create new markets and assets (digital biomarkers, validation services, certified datasets) with potential commercial value.
    Firm Revenue · positive · speculative (0.02) · Outcome: emergence and revenue of markets for digital biomarkers, certification/validation services, and standardized datasets
  • AI methods such as transfer learning, active learning, and Bayesian approaches improve data efficiency and uncertainty quantification in drug discovery and preclinical modeling.
    Research Productivity · positive · medium (0.14) · Outcome: data efficiency (number of experiments/samples needed), calibration of uncertainty estimates
  • This paper is a narrative review synthesizing heterogeneous studies and case reports rather than providing meta-analytic estimates of effect sizes.
    Research Productivity · null_result · high (0.24) · Outcome: presence/absence of pooled/meta-analytic effect size estimates

Notes