Pharma’s AI promise won’t be proven by accuracy metrics alone: EFMC2 calls for new impact-oriented KPIs to measure whether AI actually raises R&D productivity and drives top-line value, arguing that strategic indicators are needed to guide adoption and innovation.
With the high cost and failure rates of the pharmaceutical R&D process showing no fundamental improvement over the last decade, pressure remains high to raise the probability of success and thereby the effectiveness of pharmaceutical R&D. The broad introduction of AI into the R&D landscape in recent years holds the promise of lifting pharmaceutical R&D out of its productivity problem, as preliminary analyses suggest that “AI-native” companies may be outpacing traditional peers. However, harnessing this potential requires moving beyond measuring technical model performance (e.g., predictive accuracy) to measuring strategic impact. In this perspective, members of the EFMC2 community—focused on advancing the collaboration between computational and medicinal chemists—discuss the challenges of applying key performance indicators (KPIs) in the idiosyncratic environment of drug discovery. We argue that the shift from expert-driven computer-aided drug design (CADD) to semiautonomous AI necessitates a new framework of impact-oriented KPIs. We provide recommendations for designing these strategic indicators to drive adoption, foster innovation, and objectively assess whether digital tools are delivering top-line impact.
Summary
Main Finding
The EFMC2 community argues that, to realize AI's promise for pharmaceutical R&D, we must move from measuring narrow technical performance (e.g., predictive accuracy) to strategic, impact-oriented KPIs that capture real contributions to discovery productivity (time, cost, and probability of success). Semiautonomous AI changes the decision-making landscape and therefore requires a new KPI framework to drive adoption, align incentives, and objectively assess top-line impact.
Key Points
- Problem context: Pharmaceutical R&D remains costly and failure‑prone; technical improvements alone have not materially reversed productivity decline.
- Limitation of current metrics: Traditional KPIs focus on model-level technical metrics (AUROC, RMSE, etc.) that do not translate reliably into business or clinical impact.
- Shift in paradigm: Moving from expert‑driven CADD to semiautonomous AI amplifies both potential upside and the need for different measurement—AI influences decisions, workflows, and portfolio composition.
- Core challenges for KPI design:
  - Attribution: long, heterogeneous development timelines make causal attribution of outcomes to AI tools difficult.
  - Idiosyncrasy: project-level heterogeneity (targets, modalities, novelty) undermines one-size-fits-all metrics.
  - Timing: meaningful outcomes (e.g., clinical success) occur years later, so both leading and lagging indicators are needed.
  - Data and governance: inconsistent data standards, integration, and reproducibility hamper measurement.
  - Adoption and behavioral factors: tool uptake depends on trust, usability, and incentive alignment.
- Recommendations (high level): adopt outcome‑oriented, portfolio‑level KPIs; include leading indicators that predict downstream value; ensure metrics enable causal inference and are aligned with commercial objectives.
Data & Methods
- Type of paper: Perspective / conceptual piece based on EFMC2 community expertise and domain experience—no new primary empirical dataset reported.
- Suggested empirical approaches for KPI validation and causal assessment:
  - Randomized evaluations and A/B tests at the workflow or project-allocation level where feasible.
  - Matched controls and before/after analyses with careful selection of comparable projects.
  - Counterfactual portfolio simulations and value-of-information modelling to estimate impact on probability of success and expected net present value (eNPV).
  - Time-to-event analyses (e.g., time-to-lead, time-to-IND) and survival analysis for milestone timing.
  - Cost accounting and activity-based costing to measure resource shifts and cost per decision/candidate.
  - Calibration and uncertainty-quantification metrics for probabilistic predictions (e.g., Brier score, calibration plots).
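The calibration metrics listed above are straightforward to compute once probabilistic predictions and realized outcomes are logged together. As a minimal sketch, the Brier score is simply the mean squared error between predicted probabilities and 0/1 outcomes; the data below are purely illustrative, not from the paper.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities (0..1)
    and realized binary outcomes (0 or 1). Lower is better;
    0 is perfect, 0.25 is the score of a constant 0.5 forecast."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Hypothetical predicted success probabilities vs. realized outcomes
# for four discovery projects (toy numbers for illustration only).
predicted = [0.9, 0.2, 0.7, 0.1]
realized = [1, 0, 1, 0]
print(brier_score(predicted, realized))  # → 0.0375
```

In practice one would complement the single score with a calibration plot (binned predicted probability versus observed frequency), since a low Brier score alone can mask systematic over- or under-confidence.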
- Recommended data sources to operationalize KPIs:
  - Internal R&D databases (assay, screening, ADME/Tox results), electronic lab notebooks, project-management timestamps.
  - External outcomes: clinical trial starts, regulatory filings, attrition statistics, licensing deals.
  - Operational data: headcount/time allocations, CRO spend, reagent consumables, cycle times.
  - Adoption telemetry: tool usage logs, decision records, user feedback.
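The counterfactual portfolio simulations mentioned among the empirical approaches can be sketched as a minimal Monte Carlo comparison of portfolio value with and without an assumed AI-driven uplift in probability of success. Everything here is hypothetical: project counts, payoffs, costs, and the uplift are illustrative, and a real eNPV model would add discounting, phase-dependent costs, and multi-stage attrition.

```python
import random

def portfolio_enpv(n_projects, p_success, payoff, cost, n_sims=10_000, seed=0):
    """Average simulated portfolio NPV: each project pays `payoff` on
    success and always incurs `cost`. Discounting is omitted for brevity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        total += sum(
            (payoff if rng.random() < p_success else 0.0) - cost
            for _ in range(n_projects)
        )
    return total / n_sims

# Hypothetical scenario: 20 projects, $500M payoff on success, $40M cost,
# with AI assumed to lift probability of success from 10% to 12%.
baseline = portfolio_enpv(n_projects=20, p_success=0.10, payoff=500.0, cost=40.0)
with_ai = portfolio_enpv(n_projects=20, p_success=0.12, payoff=500.0, cost=40.0)
print(f"estimated eNPV uplift: {with_ai - baseline:.1f}")
```

Because both runs share the same seed, they use common random numbers, which sharply reduces the variance of the estimated uplift (the analytic expectation here is 20 × 0.02 × 500 = 200).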
Implications for AI Economics
- Productivity and returns: If well‑measured and validated, semiautonomous AI could raise effective probability of success and reduce marginal discovery costs, increasing R&D productivity and potentially raising expected returns on R&D investment.
- Valuation and competitive dynamics: Firms that credibly demonstrate impact‑oriented KPIs (not just models) may command valuation premia; AI‑native organizations could outpace incumbents if they embed AI into decision loops and portfolio allocation.
- Investment and resource allocation: Better KPIs enable more efficient capital allocation across programs and clearer cost‑benefit analysis for AI investments.
- Labor and skills: KPI focus on strategic outcomes shifts incentives toward skills in AI‑augmented decision making, causal evaluation, and data governance; roles may reallocate from manual tasks to higher‑level decision oversight.
- Policy and disclosure: Investors and regulators may require standardized, outcome‑oriented KPIs to assess claims about AI impact; transparency and reproducibility become economic signals.
- Measurement risks: Poor or misaligned KPIs can distort behavior (gaming, short‑termism); economic gains require careful metric design that balances leading/lagging measures and adjusts for risk and heterogeneity.
Practical next steps implied by the perspective: define a small set of standardized outcome KPIs (portfolio eNPV uplift, time-to-lead, cost-per-lead, adjusted probability of success); pilot RCTs, A/B tests, or matched-control studies to assess causal impact; and build data infrastructure and governance to support longitudinal tracking and cross-project benchmarking.
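Two of the suggested standardized KPIs, cost-per-lead and time-to-lead, reduce to simple aggregations once per-project records are in place. The sketch below uses hypothetical field names and toy numbers; note that projects without a lead yet are censored observations, which a naive median ignores and proper survival analysis would handle.

```python
from statistics import median

# Hypothetical per-project records (field names illustrative, not from
# the paper); cost in $M, time in days.
projects = [
    {"leads": 3, "cost": 1.2, "days_to_first_lead": 120},
    {"leads": 1, "cost": 0.8, "days_to_first_lead": 210},
    {"leads": 0, "cost": 0.5, "days_to_first_lead": None},  # censored: no lead yet
    {"leads": 2, "cost": 1.0, "days_to_first_lead": 150},
]

total_cost = sum(p["cost"] for p in projects)
total_leads = sum(p["leads"] for p in projects)
cost_per_lead = total_cost / total_leads  # $M per lead across the portfolio

# Naive time-to-lead over observed (uncensored) projects only; survival
# analysis would incorporate the censored project instead of dropping it.
observed = [p["days_to_first_lead"] for p in projects
            if p["days_to_first_lead"] is not None]
median_time_to_lead = median(observed)

print(cost_per_lead, median_time_to_lead)
```

Computing these at the portfolio level rather than per project is what makes them comparable across heterogeneous programs, which is the cross-project benchmarking the next steps call for.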
Assessment
Claims (8)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| Increasing cost and failure rates in the pharmaceutical R&D process have not fundamentally improved over the last decade. | Firm Productivity | negative | high | cost and failure rates in pharmaceutical R&D | 0.06 |
| Pressure remains high to increase the probability of success to improve the effectiveness of pharmaceutical R&D. | Firm Productivity | negative | high | probability of success in pharmaceutical R&D | 0.01 |
| The broad introduction of AI into the R&D landscape over the last years holds the promise to lift pharmaceutical R&D out of its productivity problem. | Firm Productivity | positive | high | potential improvement in pharmaceutical R&D productivity due to AI adoption | 0.01 |
| Preliminary analyses suggest that 'AI-native' companies may be outpacing traditional peers. | Firm Productivity | positive | high | relative performance of AI-native companies versus traditional peers (e.g., productivity/pace) | 0.06 |
| Harnessing AI's potential requires moving beyond measuring technical model performance (e.g., predictive accuracy) to measuring strategic impact. | Adoption Rate | positive | high | usefulness of measurement approaches (technical model metrics versus strategic impact metrics) for capturing AI value | 0.01 |
| The shift from expert-driven computer-aided drug design (CADD) to semiautonomous AI necessitates a new framework of impact-oriented KPIs. | Organizational Efficiency | positive | high | need for new KPI frameworks to assess impact of semiautonomous AI in drug discovery | 0.01 |
| The paper provides recommendations for designing strategic indicators to drive adoption, foster innovation, and objectively assess whether digital tools are delivering top-line impact. | Innovation Output | positive | high | existence of recommended strategic KPIs intended to affect adoption, innovation, and top-line impact assessment | 0.1 |
| Measuring only technical model performance (such as predictive accuracy) is insufficient for assessing the strategic impact of AI in drug discovery. | Organizational Efficiency | negative | high | adequacy of technical model performance metrics for capturing strategic impact | 0.01 |