Commercial / Forecasting · Commercial forecasting analyst

Epidemiology-Based Market Sizing — agent trajectory for expert review

claude-opus-4-8 · 2026-06-06T09:07:33.290492+00:00 · 4 tool calls · 19.5s · 6309 tokens
run id: market_sizing_20260606_020653_02_build-the-eu5-addressable-patient-funnel-for-a-f

Query

Build the EU5 addressable-patient funnel for a first-line EGFR TKI in EGFR-mutant NSCLC, and give a rough peak-revenue sketch.

Trajectory

Agent · turn 1

I'll build this funnel step by step. Let me start by exploring the dataset structure and gathering the relevant data.

⚙ calls list_indications()
↳ list_indications result
{
  "disclaimer": "Teaching snapshot: curated, rounded epidemiology for funnel-building practice. Not a live data pull; do not cite as ground truth.",
  "indications": {
    "NSCLC": {
      "label": "Non-small cell lung cancer (EGFR-mutant subset of interest)",
      "geographies": [
        "US",
        "EU5"
      ],
      "funnel_basis": "incidence"
    },
    "HER2+ Breast": {
      "label": "HER2-positive breast cancer",
      "geographies": [
        "US",
        "EU5"
      ],
      "funnel_basis": "incidence"
    },
    "ES-SCLC": {
      "label": "Extensive-stage small cell lung cancer",
      "geographies": [
        "US",
        "EU5"
      ],
      "funnel_basis": "incidence"
    },
    "Ulcerative Colitis": {
      "label": "Ulcerative colitis (moderate-to-severe)",
      "geographies": [
        "US",
        "EU5"
      ],
      "funnel_basis": "prevalence"
    },
    "ATTR-CM": {
      "label": "Transthyretin amyloid cardiomyopathy",
      "geographies": [
        "US",
        "EU5"
      ],
      "funnel_basis": "prevalence"
    }
  }
}
Agent · turn 2

NSCLC uses an incidence basis. Let me pull the epidemiology, subtype fractions, and pricing for NSCLC in EU5.

⚙ calls get_epidemiology(indication=NSCLC, geography=EU5)
⚙ calls get_subtype_prevalence(indication=NSCLC)
⚙ calls get_pricing(indication=NSCLC)
↳ get_epidemiology result
{
  "disclaimer": "Teaching snapshot: curated, rounded epidemiology for funnel-building practice. Not a live data pull; do not cite as ground truth.",
  "indication": "NSCLC",
  "label": "Non-small cell lung cancer (EGFR-mutant subset of interest)",
  "funnel_basis": "incidence",
  "diagnosis_rate": 0.92,
  "treatment_rate": 0.8,
  "epidemiology": {
    "EU5": {
      "incidence_per_year": 240000,
      "prevalence": 620000
    }
  },
  "note": "funnel_basis tells you whether to start the funnel from incidence (new patients/year) or prevalence (point count of living patients)."
}
↳ get_subtype_prevalence result
{
  "disclaimer": "Teaching snapshot: curated, rounded epidemiology for funnel-building practice. Not a live data pull; do not cite as ground truth.",
  "indication": "NSCLC",
  "subtypes": {
    "advanced_stage_fraction": 0.6,
    "EGFR_mutant_fraction": 0.15,
    "EGFR_exon19del_L858R_fraction": 0.85
  },
  "line_splits": {
    "1L": 0.7,
    "2L": 0.2,
    "3L+": 0.1
  },
  "notes": "EGFR-mutant NSCLC ~15% of NSCLC in Western populations (far higher in East Asia). Funnel for a 1L EGFR TKI gates on advanced stage AND an activating EGFR mutation (exon19del / L858R)."
}
↳ get_pricing result
{
  "disclaimer": "Teaching snapshot: curated, rounded epidemiology for funnel-building practice. Not a live data pull; do not cite as ground truth.",
  "indication": "NSCLC",
  "annual_price": 190000,
  "treated_months": 18,
  "persistence": 0.85,
  "note": "annual_price is the list price for a full year. persistence is the fraction of a year actually paid for (discontinuation/mortality). Effective revenue per addressable patient/year = annual_price x persistence."
}
Agent · turn 3

EU5 Addressable-Patient Funnel — 1L EGFR TKI in EGFR-mutant NSCLC

Funnel (incidence-based)

| Step | Factor | Patients/year |
|---|---|---|
| NSCLC incidence (EU5) | | 240,000 |
| × Diagnosed | 0.92 | 220,800 |
| × Treated | 0.80 | 176,640 |
| × Advanced stage | 0.60 | 105,984 |
| × EGFR-mutant | 0.15 | 15,898 |
| × Activating mutation (exon19del / L858R) | 0.85 | 13,513 |
| × 1L share | 0.70 | 9,459 |

Addressable 1L population ≈ 9,500 patients/year (EU5).

A note on the exon19del/L858R gate: I've applied it because a 1L EGFR TKI is indicated for activating mutations specifically, and the dataset flags this as part of the gating. If you want a more inclusive view (some TKIs cover broader EGFR mutations), drop that step and the addressable pool rises to ~11,100/year.
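
The funnel and the more inclusive variant can be reproduced with a few lines of Python. This is a minimal sketch, not the agent's own code: the `run_funnel` helper and the `skip` parameter are hypothetical names introduced here, and the inputs are the rounded teaching-snapshot values returned by the tools.

```python
# Inputs copied from the tool outputs (teaching snapshot, not live data).
INCIDENCE_EU5 = 240_000
ACTIVATING_GATE = "Activating mutation (exon19del / L858R)"
GATES = [
    ("Diagnosed", 0.92),
    ("Treated", 0.80),
    ("Advanced stage", 0.60),
    ("EGFR-mutant", 0.15),
    (ACTIVATING_GATE, 0.85),
    ("1L share", 0.70),
]


def run_funnel(start, gates, skip=()):
    """Apply each gate in order and return [(step, patients)] rows.

    Gates whose name appears in `skip` are left out of the product.
    """
    patients = float(start)
    rows = [("NSCLC incidence (EU5)", patients)]
    for name, factor in gates:
        if name in skip:
            continue
        patients *= factor
        rows.append((name, patients))
    return rows


strict = run_funnel(INCIDENCE_EU5, GATES)
inclusive = run_funnel(INCIDENCE_EU5, GATES, skip={ACTIVATING_GATE})

for name, patients in strict:
    print(f"{name:<42} {patients:>10,.0f}")
print(f"Without the exon19del/L858R gate: {inclusive[-1][1]:,.0f}")
```

Carrying unrounded values through the chain gives about 9,459 patients/year with the activating-mutation gate and about 11,128 without it, consistent with the rounded figures quoted above.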

Rough Peak-Revenue Sketch

Pricing inputs: annual price $190,000; persistence 0.85; treated duration 18 months.

Method A — annualized effective revenue (treats the funnel as patients on therapy in a steady-state year):
- Effective revenue per patient/year = $190,000 × 0.85 = $161,500
- 9,459 × $161,500 ≈ $1.53B

Method B — adjust for 18-month treated duration (prevalent on-therapy pool > annual incident inflow):

Because each 1L patient stays on drug ~18 months (1.5 years), the steady-state treated population is larger than one year's incident flow:
- On-therapy pool ≈ 9,459 × 1.5 ≈ 14,200 patients
- × $161,500 ≈ $2.3B
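
Methods A and B differ only in whether the 18-month treated duration scales the patient pool. A small sketch of both, using the pricing inputs returned by `get_pricing` and the rounded addressable count from the funnel; the variable names are illustrative, not part of the dataset:

```python
ADDRESSABLE_1L = 9_459   # patients/year, rounded funnel output
ANNUAL_PRICE = 190_000   # USD list price for a full year
PERSISTENCE = 0.85       # fraction of a year actually paid for
TREATED_MONTHS = 18

# Effective revenue per patient-year, as defined in the pricing note.
eff_rev = ANNUAL_PRICE * PERSISTENCE

# Method A: one year's incident patients, each billed for one effective year.
method_a = ADDRESSABLE_1L * eff_rev

# Method B: steady-state pool = annual inflow x duration in years.
on_therapy_pool = ADDRESSABLE_1L * TREATED_MONTHS / 12
method_b = on_therapy_pool * eff_rev

print(f"Method A: ${method_a / 1e9:.2f}B")
print(f"Method B: ${method_b / 1e9:.2f}B (pool of {on_therapy_pool:,.0f} patients)")
```

Both figures assume 100% capture of the addressable population, so they are ceilings rather than forecasts.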

Caveat: Method B double-counts unless you're careful — persistence (0.85) already discounts for discontinuation/mortality within a year. The cleaner framing is: peak revenue ≈ incident patients/year × price × (treated duration in years) × persistence-type adjustment. Using 9,459 × $190,000 × 1.5 × 0.85 ≈ $2.3B.
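
The "cleaner framing" in the caveat can be compared with Method B symbolically. In the sketch below, N (incident 1L patients per year), P (annual list price), d (treated duration in years) and p (persistence) are shorthand introduced here for illustration, not notation from the tool data:

```latex
\begin{align*}
R_B &= (N \cdot d)\,(P \cdot p) = N \cdot P \cdot d \cdot p \\
    &= 9{,}459 \times 190{,}000 \times 1.5 \times 0.85 \approx \$2.29\ \text{billion}
\end{align*}
```

The two expressions are the same product, so the reframing leaves the ≈ $2.3B estimate unchanged and persistence enters exactly once either way. Whether the 18-month duration already nets out discontinuation is a separate modelling question that this identity does not settle.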

Headline

The realistic peak depends heavily on peak market share (no competition assumed above — these figures imply 100% capture). At a more plausible 30–40% peak share for one TKI in a competitive 1L EGFR market, expect roughly $0.5B–$0.9B.
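
One reading that reproduces the quoted range pairs the low share with Method A and the high share with Method B. That pairing is an assumption made here for illustration; the answer above does not state how the bounds were formed. The `share_range` helper is likewise hypothetical.

```python
ADDRESSABLE_1L = 9_459                      # patients/year, rounded funnel output
EFF_REV_PER_PATIENT_YEAR = 190_000 * 0.85   # list price x persistence
TREATED_YEARS = 18 / 12

method_a = ADDRESSABLE_1L * EFF_REV_PER_PATIENT_YEAR   # annual inflow only
method_b = method_a * TREATED_YEARS                    # steady-state pool


def share_range(low_share, high_share):
    """Return (lowest, highest) revenue across both methods for a share band."""
    return low_share * method_a, high_share * method_b


low, high = share_range(0.30, 0.40)
print(f"${low / 1e9:.2f}B to ${high / 1e9:.2f}B")  # $0.46B to $0.92B
```

Under this reading the band spans both the share uncertainty and the choice of revenue method, which is why it is wider than a 30 to 40% share applied to either method alone.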


Source caveat: this is a curated teaching snapshot with rounded figures — not a live data pull. Key swing assumptions are the 15% EGFR-mutant rate (Western; far higher in East Asia), the 85% activating-mutation gate, the 18-month duration, and peak share.

Reference answer key (deterministic) — open after grading

Deterministic funnel for NSCLC in EU5 built from the curated tool data (basis = incidence). Eligibility gating is the product of the subtype fractions that apply to this drug, then the line-of-therapy gate. The peak-revenue sketch uses an illustrative 30% peak share; a strong agent may choose a different, well-justified share. Teaching snapshot: curated, rounded epidemiology for funnel-building practice. Not a live data pull; do not cite as ground truth.

| Funnel step | Patients / value | Calculation |
|---|---|---|
| 1. Incidence (new/yr) | 240,000 | EU5 NSCLC; basis = incidence |
| 2. Diagnosed | 220,800 | x 92% diagnosis rate |
| 3. Drug-treated | 176,640 | x 80% treatment rate |
| 4. Advanced stage (IIIB/IV) | 105,984 | x 60% |
| 5. EGFR-mutant | 15,898 | x 15% |
| 6. Activating mut (exon19del/L858R) | 13,513 | x 85% |
| 7. Line gate: 1L | 9,459 | x 70% (=1.0 if already encoded above) |
| = Addressable patients | 9,459 | eligibility = 8% of treated, x line |
| Peak capture @ 30% share | 2,838 | illustrative peak share assumption |
| Eff. revenue / patient-yr | $161,500 | $190,000 list x 85% persistence |
| ~ Peak revenue (USD/yr) | $458,292,038 | = captured patients x eff. revenue/pt |
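
The key's peak revenue of $458,292,038 is not exactly 2,838 × $161,500. It appears to carry unrounded patient counts through the chain and round only for display. A minimal sketch under that assumption, with illustrative variable names:

```python
INCIDENCE_EU5 = 240_000
FACTORS = [0.92, 0.80, 0.60, 0.15, 0.85, 0.70]  # diagnosis .. 1L line gate
PEAK_SHARE = 0.30                                # illustrative, per the key
EFF_REV = 190_000 * 0.85                         # list price x persistence

# Multiply through without rounding any intermediate step.
addressable = float(INCIDENCE_EU5)
for factor in FACTORS:
    addressable *= factor

captured = addressable * PEAK_SHARE
peak_unrounded = captured * EFF_REV        # matches the key's figure
peak_rounded = round(captured) * EFF_REV   # what 2,838 x $161,500 would give

print(f"Addressable: {addressable:,.3f}")
print(f"Captured:    {captured:,.4f}")
print(f"Peak (unrounded inputs): ${peak_unrounded:,.0f}")
print(f"Peak (rounded 2,838):    ${peak_rounded:,.0f}")
```

The two results differ by about $45,000, which is immaterial for a market-sizing sketch but explains why the key's last digits cannot be reproduced from the displayed row values alone.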

Expert grading

Epidemiology sourcing & funnel construction (scale: 1 = poor, 5 = excellent)
5: Pulls the right epidemiology via the tools, correctly chooses incidence vs prevalence as the funnel basis for the disease (incidence for acute/short-survival oncology, prevalence for chronic disease), and lays out a clean population -> diagnosed -> treated -> eligible -> addressable chain for the queried geography.
1: Builds the funnel from the wrong base (e.g. prevalence for an incidence-driven cancer, or vice versa), skips diagnosis/treatment steps, ignores the requested geography, or reasons from memory instead of the tool data.

Eligibility / biomarker gating correctness (scale: 1 = poor, 5 = excellent)
5: Applies the correct biomarker / subtype / stage gates for THIS drug and only those (e.g. advanced-stage AND activating EGFR mutation for a 1L EGFR TKI, DLL3-expressing AND fit-for-2L for a DLL3 engager, HER2+ AND metastatic for a 2L ADC), and applies the line-of-therapy gate without double-counting.
1: Omits a required gate (e.g. forgets the EGFR-mutant or DLL3 filter), applies an irrelevant or wrong-direction gate, double-counts a line split already implied by another gate, or multiplies fractions that should not stack.

Numerical correctness & internal consistency (scale: 1 = poor, 5 = excellent)
5: Every multiplication checks out against the returned tool numbers; intermediate counts are consistent and monotonically shrinking down the funnel; the final addressable number is reproducible from the stated inputs.
1: Arithmetic errors, mismatched units, numbers that don't follow from the cited fractions, or a funnel step larger than the one above it.

Assumptions & peak-share / revenue reasoning (scale: 1 = poor, 5 = excellent)
5: States and justifies the peak-share and pricing assumptions, applies persistence / treated duration sensibly, and produces a revenue sketch (addressable x share x effective price) whose magnitude is defensible; flags the key sensitivities.
1: Pulls a peak share or price out of thin air with no rationale, ignores persistence/duration, garbles the revenue formula, or presents a point estimate with no acknowledgement of uncertainty.

Evidence faithfulness (scale: 1 = poor, 5 = excellent)
5: Every number traces to a specific tool output (epidemiology counts, subtype fractions, price); no fabricated rates or invented epidemiology; the curated/teaching nature of the data is respected and not overstated as live truth.
1: Invents incidence/prevalence or subtype fractions not returned by the tools, contradicts the tool data, or presents the snapshot numbers as authoritative real-world figures.
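
The numerical-correctness criterion (each step follows from the cited fraction, and no step exceeds the one above it) lends itself to a mechanical check. The sketch below is one possible helper, not part of the grading tooling: `check_funnel` and its `tol` parameter are hypothetical, and the reported counts are the ones from the agent's funnel.

```python
def check_funnel(start, steps, tol=0.5):
    """Return a list of problems found in a reported funnel.

    steps: (label, factor, reported_count) tuples in funnel order.
    tol:   allowed gap between a reported count and the exact running
           product; 0.5 lets whole-patient rounding pass.
    """
    problems = []
    exact = float(start)
    previous = start
    for label, factor, reported in steps:
        exact *= factor
        if abs(exact - reported) > tol:
            problems.append(f"{label}: reported {reported:,}, expected {exact:,.1f}")
        if reported > previous:
            problems.append(f"{label}: larger than the step above it")
        previous = reported
    return problems


reported_funnel = [
    ("Diagnosed", 0.92, 220_800),
    ("Treated", 0.80, 176_640),
    ("Advanced stage", 0.60, 105_984),
    ("EGFR-mutant", 0.15, 15_898),
    ("Activating mutation", 0.85, 13_513),
    ("1L share", 0.70, 9_459),
]

issues = check_funnel(240_000, reported_funnel)
print(issues or "funnel is consistent")
```

The check compares each reported count against the unrounded running product, so it tolerates display rounding but would flag a step computed from an already-rounded predecessor if the drift exceeded half a patient.
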
Overall verdict
Your name / ID