Adverse-Event Coding & Causality comparison · model: GPT-4o mini
LLM-judge score: 2.8 / 5 · flawed
extraction: 3 · coding: 2 · seriousness: 3 · causality: 3 · faithfulness: 3
The agent correctly identified both clinical syndromes (CRS and neurotoxicity/ICANS) but failed to code the CRS event properly — it coded 'fever' as Pyrexia/General disorders rather than recognizing the narrated CRS as the diagnosis PT 'Cytokine release syndrome' (Immune system disorders), and split confusion+tremor redundantly into two identical ICANS codes rather than recognizing one neurotoxicity diagnosis. Seriousness is broadly right (hospitalization) but the rationale is loose and conflates treatments (it wrongly attributes tocilizumab/steroids to the neurotoxicity event). Causality uses the vague 'Related' rather than the WHO-UMC 'Probable' category and omits dechallenge/rechallenge and alternative-etiology reasoning, leaving the assessment shallow though not wrong.
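The headline score is consistent with an unweighted mean of the five dimension scores. The page does not state how the judge aggregates, so the equal weighting in this short check is an assumption.

```python
from statistics import mean

# Per-dimension judge scores as listed in the summary above.
scores = {
    "extraction": 3,
    "coding": 2,
    "seriousness": 3,
    "causality": 3,
    "faithfulness": 3,
}

# Assumption: the overall score is the plain mean, rounded to one decimal.
overall = round(mean(scores.values()), 1)
print(f"LLM-judge score: {overall} / 5")
```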
Clinical / Pharmacovigilance · Drug-safety physician (PV)

Adverse-Event Coding & Causality — agent trajectory for expert review

gpt-4o-mini · 2026-06-06T10:27:26.246803+00:00 · 5 tool calls · 7.3s · 2662 tokens
run id: ae_causality_20260606_031816_ae-causality__gpt-4o-mini

Query

Code the adverse events to MedDRA PT/SOC and assess seriousness and drug causality for case ICSR-001 (a patient on a DLL3 T-cell engager). Justify each causality category.

Trajectory

Agent · turn 1
⚙ calls get_case(case_id=ICSR-001)
↳ get_case result
{
  "case_id": "ICSR-001",
  "study_drug": "Tarlatamab",
  "drug_class": "DLL3 T-cell engager (BiTE)",
  "patient": {
    "age": 64,
    "sex": "M",
    "comorbidities": [
      "ES-SCLC, 2nd line",
      "former smoker"
    ],
    "concomitant_meds": [
      "dexamethasone premedication",
      "ondansetron"
    ]
  },
  "narrative": "64-year-old man with extensive-stage small cell lung cancer received his first step-up dose of the DLL3 T-cell engager. Approximately 8 hours after dosing he developed fever to 39.2C with rigors and low blood pressure (88/54) requiring IV fluids; this was graded as cytokine release syndrome. On day 2 nursing noted he was confused with a coarse tremor and a drop in his ICE score (neurotoxicity). He was admitted to hospital and treated with tocilizumab and steroids; symptoms resolved over 48 hours. He was re-dosed at the next cycle with premedication and had only mild recurrence. No infection was identified on workup.",
  "events": [
    "fever with low blood pressure (CRS)",
    "confusion with tremor (neurotoxicity)"
  ]
}
Agent · turn 2
⚙ calls meddra_lookup(term=fever)
⚙ calls meddra_lookup(term=confusion)
⚙ calls meddra_lookup(term=tremor)
⚙ calls get_drug_label(drug_class=DLL3 T-cell engager (BiTE))
↳ meddra_lookup result
{
  "query": "fever",
  "count": 2,
  "matches": [
    {
      "pt": "Febrile neutropenia",
      "soc": "Blood and lymphatic system disorders",
      "synonyms": [
        "fever with low white count",
        "neutropenic fever"
      ]
    },
    {
      "pt": "Pyrexia",
      "soc": "General disorders and administration site conditions",
      "synonyms": [
        "fever",
        "high temperature",
        "febrile",
        "raised temperature"
      ]
    }
  ]
}
↳ meddra_lookup result
{
  "query": "confusion",
  "count": 1,
  "matches": [
    {
      "pt": "Immune effector cell-associated neurotoxicity syndrome",
      "soc": "Nervous system disorders",
      "synonyms": [
        "icans",
        "neurotoxicity",
        "neurologic toxicity",
        "confusion with tremor",
        "encephalopathy",
        "ice score drop"
      ]
    }
  ]
}
↳ meddra_lookup result
{
  "query": "tremor",
  "count": 1,
  "matches": [
    {
      "pt": "Immune effector cell-associated neurotoxicity syndrome",
      "soc": "Nervous system disorders",
      "synonyms": [
        "icans",
        "neurotoxicity",
        "neurologic toxicity",
        "confusion with tremor",
        "encephalopathy",
        "ice score drop"
      ]
    }
  ]
}
↳ get_drug_label result
{
  "drug_class": "DLL3 T-cell engager (BiTE)",
  "examples": [
    "Tarlatamab"
  ],
  "labeled": [
    "Cytokine release syndrome",
    "Immune effector cell-associated neurotoxicity syndrome",
    "Pyrexia",
    "Neutropenia",
    "Anaemia",
    "Fatigue",
    "Hypotension"
  ],
  "boxed_warning": "Cytokine release syndrome and neurologic toxicity including ICANS.",
  "note": "CRS and ICANS are expected, very common, on-target effects of T-cell engagers; typically occur within days of the first/step-up doses."
}
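The three lookup results above already show two things a reviewer may want to check mechanically: "confusion" and "tremor" resolve to the same PT, and no query was issued for the narrated diagnosis (CRS). A minimal sketch follows, with the tool results transcribed by hand into a dictionary; the helper name `group_terms_by_pt` is hypothetical and not part of the agent's toolset.

```python
# (PT, SOC) pairs returned by meddra_lookup in turn 2, transcribed from the log.
ICANS = "Immune effector cell-associated neurotoxicity syndrome"
lookup_hits = {
    "fever": [
        ("Febrile neutropenia", "Blood and lymphatic system disorders"),
        ("Pyrexia", "General disorders and administration site conditions"),
    ],
    "confusion": [(ICANS, "Nervous system disorders")],
    "tremor": [(ICANS, "Nervous system disorders")],
}


def group_terms_by_pt(hits):
    """Hypothetical helper: map each (PT, SOC) to the query terms that returned it."""
    grouped = {}
    for term, matches in hits.items():
        for pt_soc in matches:
            grouped.setdefault(pt_soc, []).append(term)
    return grouped


grouped = group_terms_by_pt(lookup_hits)

# PTs reached from more than one symptom query are candidates for a single coded event.
shared = {pt: terms for (pt, _soc), terms in grouped.items() if len(terms) > 1}

# The diagnosis named in the narrative was never sent to the dictionary.
queried_terms = set(lookup_hits)
crs_was_queried = bool({"crs", "cytokine release syndrome"} & queried_terms)

print(shared)
print("CRS queried:", crs_was_queried)
```

Grouping by PT rather than by symptom makes the redundancy visible before the coding step: one neurotoxicity diagnosis, reached from two symptom queries.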
Agent · turn 3

Coding of Adverse Events

  1. Fever:
    - MedDRA PT: Pyrexia
    - MedDRA SOC: General disorders and administration site conditions

  2. Confusion:
    - MedDRA PT: Immune effector cell-associated neurotoxicity syndrome (ICANS)
    - MedDRA SOC: Nervous system disorders

  3. Tremor:
    - MedDRA PT: Immune effector cell-associated neurotoxicity syndrome (ICANS)
    - MedDRA SOC: Nervous system disorders

Assessment of Seriousness and Causality

  1. Fever with low blood pressure (Cytokine Release Syndrome - CRS):
    - Seriousness: Yes (due to hospitalization and treatment with IV fluids; symptomatic of CRS)
    - Causality: Related

    • Justification: Fever is a recognized adverse reaction related to the drug class (DLL3 T-cell engagers). The onset of fever occurred within 8 hours post-dose, aligning with the expected timing for CRS, as highlighted in the drug label stating CRS is a common on-target effect.
  2. Confusion with tremor (Neurotoxicity):
    - Seriousness: Yes (required hospital admission and treatment)
    - Causality: Related

    • Justification: Neurotoxicity is also listed as an expected adverse effect of DLL3 T-cell engagers. The symptoms of confusion and tremor were observed on Day 2 after drug administration, consistent with the timeline for ICANS, as noted in the boxed warning. The treatment with tocilizumab and steroids further supports the causality due to the medication's action in mitigating symptoms associated with neurotoxicity.

Summary

Both adverse events are serious, requiring hospitalization, and are determined to be related to the study drug (Tarlatamab) based on their timing, clinical presentation, and their classification as expected adverse effects in the drug labeling for DLL3 T-cell engagers.

Reference answer key (deterministic) — open after grading

Deterministic reference key for ICSR-001 (Tarlatamab). Causality categories follow WHO-UMC; where the narrative supports more than one defensible category the rationale says so. A strong analyst recovers all events, the correct PT/SOC from the dictionary, the right seriousness criterion, and well-reasoned causality grounded in temporality, dechallenge/rechallenge, alternative etiology, and the drug label.

| Event (as narrated) | Expected PT | Expected SOC | Seriousness (ICH E2A) | Causality (WHO-UMC) + rationale |
| --- | --- | --- | --- | --- |
| fever with low blood pressure (CRS) | Cytokine release syndrome | Immune system disorders | Serious: hospitalization (and medically important) | Probable. Onset hours after the first step-up dose (tight temporal link), CRS is a labeled/boxed on-target effect of T-cell engagers, no infection found (no alternative cause), positive dechallenge with tocilizumab/steroids and only mild recurrence on rechallenge with premedication. Strong case; 'Certain' is defensible if rechallenge is weighted heavily. |
| confusion with tremor (neurotoxicity) | Immune effector cell-associated neurotoxicity syndrome | Nervous system disorders | Serious: hospitalization (and medically important) | Probable. Onset day 2 after dosing, ICANS-type neurotoxicity is a labeled/boxed class effect, resolved with steroids (positive dechallenge), no metabolic or structural alternative identified. |
| OVERALL CASE | | | Serious: hospitalization / initially life-threatening (CRS with hypotension) | Case-level ICH-E2A determination |
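The causality rationale in the reference key weighs the same four factors for each event: timing, alternative etiology, dechallenge, and rechallenge. The sketch below encodes that reasoning as a small decision function. It is a deliberate simplification for illustration, not the full WHO-UMC criteria (it omits the Conditional/Unclassified and Unassessable categories, and real assessments need clinical judgment); the `Evidence` fields and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    plausible_timing: bool      # onset fits the dosing schedule
    alternative_explains: bool  # disease or another drug plausibly explains the event
    positive_dechallenge: bool  # event resolved after withdrawal or treatment
    positive_rechallenge: bool  # event recurred on re-exposure


def who_umc_category(e: Evidence) -> str:
    """Simplified mapping of evidence to a WHO-UMC category (illustrative only)."""
    if not e.plausible_timing:
        return "Unlikely"
    if e.alternative_explains or not e.positive_dechallenge:
        return "Possible"
    return "Probable"


def certain_is_defensible(e: Evidence) -> bool:
    # Assumption: a positive rechallenge is what lifts Probable toward Certain.
    return who_umc_category(e) == "Probable" and e.positive_rechallenge


# CRS: onset about 8 hours post-dose, no infection found, resolved on treatment,
# mild recurrence on re-dosing.
crs = Evidence(True, False, True, True)
# Neurotoxicity: day-2 onset, no alternative identified, resolved on treatment;
# the key cites no rechallenge evidence for this event.
icans = Evidence(True, False, True, False)

for name, ev in (("CRS", crs), ("ICANS", icans)):
    print(name, who_umc_category(ev), "| Certain defensible:", certain_is_defensible(ev))
```

Writing the factors out this way also shows what the agent's bare "Related" leaves unstated: which of the four inputs it actually considered.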

Expert grading

Event extraction completeness (1 = poor, 5 = excellent)
5: Identifies every reportable adverse event in the narrative — including a serious event that is unrelated to the drug and secondary/lab events — without lumping distinct events or inventing ones.
1: Misses reportable events (e.g. overlooks the neutropenia behind a febrile-neutropenia admission, or drops the unrelated fracture), or merges separate events into one.
MedDRA coding accuracy (1 = poor, 5 = excellent)
5: Maps each lay event to the correct Preferred Term and its System Organ Class using the dictionary lookup (e.g. 'low white count' -> Neutropenia / Blood and lymphatic system disorders; 'shortness of breath with infiltrates' -> Pneumonitis / Respiratory). PT and SOC are consistent with the dictionary, not guessed from memory.
1: Wrong PT or mismatched SOC, codes to a symptom when a diagnosis PT exists (or vice versa), or fabricates a code never returned by meddra_lookup.
Seriousness determination (ICH E2A) (1 = poor, 5 = excellent)
5: Correctly classifies each event as serious/non-serious and names the right ICH E2A criterion (death, life-threatening, hospitalization, disability, congenital anomaly, medically important) — e.g. flags the hospitalized CRS/pneumonitis/febrile-neutropenia events as serious and the asymptomatic resolved lab abnormality as non-serious.
1: Calls a clearly serious (hospitalized/life-threatening) event non-serious or vice versa, or cites the wrong/no criterion, or conflates severity grade with seriousness.
Causality assessment quality (WHO-UMC) (1 = poor, 5 = excellent)
5: Assigns a defensible WHO-UMC category with sound reasoning: weighs temporality (onset vs dosing), dechallenge/rechallenge, alternative etiologies (confounding meds, comorbidity, trauma), and whether the event is labeled — e.g. probable/certain for step-up-dose CRS, unlikely for the antibiotic-confounded transaminitis, unrelated for the traumatic fracture.
1: Reflexively blames or exonerates the drug, ignores a strong alternative etiology or the temporal mismatch, conflates seriousness with causality, or misuses the WHO-UMC categories.
Evidence faithfulness (1 = poor, 5 = excellent)
5: Grounds every PT/SOC in an actual meddra_lookup result and every causality factor in the narrative or the drug label; no invented events, codes, lab values, or label claims; states uncertainty where the narrative is genuinely ambiguous.
1: Invents events or MedDRA codes, asserts label content not returned by get_drug_label, or contradicts the narrative (wrong timing, fabricated dechallenge).
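The rubric separates two failure modes that are easy to blur: fabricating a code that `meddra_lookup` never returned (faithfulness) and choosing a returned but wrong code (coding accuracy). A minimal sketch of that check for this run follows, with the PT lists transcribed by hand from the trajectory and the reference key; the variable names are hypothetical.

```python
ICANS = "Immune effector cell-associated neurotoxicity syndrome"

# PTs that actually came back from meddra_lookup in this run.
returned_pts = {"Febrile neutropenia", "Pyrexia", ICANS}

# PTs as written in the agent's answer (events 1-3).
agent_pts_raw = ["Pyrexia", ICANS + " (ICANS)", ICANS + " (ICANS)"]

# PTs expected by the reference key.
expected_pts = {"Cytokine release syndrome", ICANS}

# Strip the trailing "(ICANS)" abbreviation the agent appended to the PT name.
agent_pts = [pt.split(" (")[0] for pt in agent_pts_raw]

fabricated = [pt for pt in agent_pts if pt not in returned_pts]    # faithfulness
missing = expected_pts - set(agent_pts)                            # coding accuracy
duplicates = {pt for pt in agent_pts if agent_pts.count(pt) > 1}   # redundant coding

print("fabricated:", fabricated)
print("missing:", missing)
print("duplicates:", duplicates)
```

For this trajectory the check finds no fabricated codes, one missing diagnosis PT, and one PT coded twice, which is consistent with the judge scoring faithfulness above coding.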