
Epidemiology-Based Market Sizing

Commercial / Forecasting · Commercial forecasting analyst

Given a query naming a disease, a drug or mechanism, and a geography, build the addressable-patient funnel: epidemiology (incidence or prevalence) -> diagnosed -> treated -> eligible (biomarker / subtype / line-of-therapy gating) -> addressable patients, then an optional peak-share and rough peak-revenue sketch. The agent has only read-only epidemiology / subtype / pricing tools that return raw numbers, and must do all gating, multiplication, and assumption-setting itself.
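The chain above can be sketched as a sequence of retained fractions applied to an epidemiology base. This is only an illustrative sketch: the names `FunnelStep` and `build_funnel` are hypothetical, and every number is a placeholder, not a value returned by the environment's tools.

```python
from dataclasses import dataclass


@dataclass
class FunnelStep:
    name: str
    fraction: float  # share of the previous step that is retained, in [0, 1]


def build_funnel(base_population: float, steps: list[FunnelStep]) -> list[tuple[str, float]]:
    """Apply each gate in order and return (step name, patient count) rows."""
    count = float(base_population)
    rows = [("epidemiology base", count)]
    for step in steps:
        if not 0.0 <= step.fraction <= 1.0:
            raise ValueError(f"{step.name}: fraction must be in [0, 1], got {step.fraction}")
        count *= step.fraction
        rows.append((step.name, count))
    return rows


# Placeholder inputs (assumptions for illustration only, not tool outputs).
steps = [
    FunnelStep("diagnosed", 0.9),
    FunnelStep("treated", 0.8),
    FunnelStep("eligible (biomarker / line-of-therapy gate)", 0.5),
]
funnel = build_funnel(10_000, steps)
addressable = funnel[-1][1]

for name, count in funnel:
    print(f"{name}: {count:,.0f}")
```

Keeping each gate as a named fraction makes it easier to audit which gates were applied and to spot a fraction that should not stack with another.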

Why this is fundable

Scarce expert who grades this: Commercial forecasting analyst / epidemiologist (~$150–300/hr loaded); senior diligence-grade forecasters command more.

What one decision is worth: Drives go/no-go and deal valuation: peak-sales forecasts anchor $100M–$5B+ licensing and M&A prices. A funnel off by a biomarker fraction moves an NPV by hundreds of millions.

Real-world data sources: SEER / GLOBOCAN incidence & prevalence, biomarker-prevalence literature, IQVIA-style treatment rates, list pricing. Curated snapshot here; each input maps to a citable real source.

Agent tools

list_indications, get_epidemiology, get_subtype_prevalence, get_pricing

Expert grading rubric

Dimension: Epidemiology sourcing & funnel construction
5 (excellent): Pulls the right epidemiology via the tools, correctly chooses incidence vs prevalence as the funnel basis for the disease (incidence for acute/short-survival oncology, prevalence for chronic disease), and lays out a clean population -> diagnosed -> treated -> eligible -> addressable chain for the queried geography.
1 (poor): Builds the funnel from the wrong base (e.g. prevalence for an incidence-driven cancer, or vice versa), skips diagnosis/treatment steps, ignores the requested geography, or reasons from memory instead of the tool data.

Dimension: Eligibility / biomarker gating correctness
5 (excellent): Applies the correct biomarker / subtype / stage gates for THIS drug and only those (e.g. advanced-stage AND activating EGFR mutation for a 1L EGFR TKI, DLL3-expressing AND fit-for-2L for a DLL3 engager, HER2+ AND metastatic for a 2L ADC), and applies the line-of-therapy gate without double-counting.
1 (poor): Omits a required gate (e.g. forgets the EGFR-mutant or DLL3 filter), applies an irrelevant or wrong-direction gate, double-counts a line split already implied by another gate, or multiplies fractions that should not stack.

Dimension: Numerical correctness & internal consistency
5 (excellent): Every multiplication checks out against the returned tool numbers; intermediate counts are consistent and monotonically shrinking down the funnel; the final addressable number is reproducible from the stated inputs.
1 (poor): Arithmetic errors, mismatched units, numbers that don't follow from the cited fractions, or a funnel step larger than the one above it.

Dimension: Assumptions & peak-share / revenue reasoning
5 (excellent): States and justifies the peak-share and pricing assumptions, applies persistence / treated duration sensibly, and produces a revenue sketch (addressable x share x effective price) whose magnitude is defensible; flags the key sensitivities.
1 (poor): Pulls a peak share or price out of thin air with no rationale, ignores persistence/duration, garbles the revenue formula, or presents a point estimate with no acknowledgement of uncertainty.

Dimension: Evidence faithfulness
5 (excellent): Every number traces to a specific tool output (epidemiology counts, subtype fractions, price); no fabricated rates or invented epidemiology; the curated/teaching nature of the data is respected and not overstated as live truth.
1 (poor): Invents incidence/prevalence or subtype fractions not returned by the tools, contradicts the tool data, or presents the snapshot numbers as authoritative real-world figures.
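Two of the rubric checks lend themselves to a short sketch: the "monotonically shrinking" consistency test and the revenue formula (addressable x share x effective price). The helper names below are hypothetical, all numbers are placeholders, and treating effective price as annual price times a persistence factor is an assumption made for illustration, not something the rubric specifies.

```python
def is_monotonic_funnel(counts: list[float]) -> bool:
    """True if no funnel step is larger than the step above it."""
    return all(later <= earlier for earlier, later in zip(counts, counts[1:]))


def peak_revenue(addressable: float, peak_share: float,
                 annual_price: float, persistence: float = 1.0) -> float:
    """Revenue sketch: addressable x share x effective price.

    Assumption: effective price = annual price x persistence, where
    persistence is the average fraction of a year a patient stays on therapy.
    """
    effective_price = annual_price * persistence
    return addressable * peak_share * effective_price


# Placeholder inputs (illustrative assumptions, not tool outputs).
counts = [10_000.0, 9_000.0, 7_200.0, 3_600.0]
assert is_monotonic_funnel(counts)

base_case = peak_revenue(addressable=counts[-1], peak_share=0.25,
                         annual_price=100_000.0, persistence=0.5)

# Simple one-way sensitivity on peak share, since a point estimate alone
# would not acknowledge uncertainty.
sensitivity = {share: peak_revenue(counts[-1], share, 100_000.0, 0.5)
               for share in (0.15, 0.25, 0.35)}

print(f"base case: ${base_case:,.0f}")
for share, revenue in sensitivity.items():
    print(f"peak share {share:.0%}: ${revenue:,.0f}")
```

Reporting a small range around the base case is one simple way to flag the key sensitivities rather than presenting a single number.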

Example queries

Trajectories

model panel (compare side by side)

Model | Provider | Tier | Judge 1–5 | Verdict
Claude Opus 4.8 | anthropic | frontier | 3.8 | flawed
GPT (frontier) | openai | frontier | 3.6 | acceptable
Claude Haiku 4.5 | anthropic | small | 2.6 |
GPT-4o mini | openai | small | 2.4 | flawed

batch 20260606_021624

Query | Model | Tool calls | Time | Status
Size the US addressable patient population and rough peak revenue for a DLL3 T-cell enga | claude-haiku-4-5-20251001 | 3 | 7.4s | ok
Build the EU5 addressable-patient funnel for a first-line EGFR TKI in EGFR-mutant NSCLC, | claude-haiku-4-5-20251001 | 4 | 7.3s | ok

batch 20260606_020653

Query | Model | Tool calls | Time | Status
Size the US addressable patient population and rough peak revenue for a DLL3 T-cell enga | claude-opus-4-8 | 4 | 20.3s | ok
Build the EU5 addressable-patient funnel for a first-line EGFR TKI in EGFR-mutant NSCLC, | claude-opus-4-8 | 4 | 19.5s | ok
Size the US addressable population and a peak-revenue estimate for a HER2-directed ADC ( | claude-opus-4-8 | 4 | 24.8s | ok
Estimate the US addressable patient funnel and rough peak revenue for an oral advanced t | claude-opus-4-8 | 4 | 20.6s | ok