Medicaid obesity, sized honestly

How the public studies differ — and what it means for pharma

Two open, citable studies anchor most state obesity projections: Harvard CHOICES (Ward et al., NEJM 2019), which publishes state-level and income-banded trajectories, and IHME/GBD (Lancet 2024), a general-population forecast to 2050. They are built differently and answer different questions — so they disagree, and neither is a drop-in for a Medicaid market-access view.

[Interactive table: one row per study / series, with columns Study / series, Population, Projection, Base data (vintage), and Key concern for pharma. A state selector drives the CHOICES rows; IHME is national.]

Why these studies differ

Base-year vintage. CHOICES is fit to BRFSS through 2016; IHME ingests sources through 2021. Neither sees the post-2023 GLP-1 era — both are, in effect, pre-intervention baselines.
Population framing. CHOICES reports a low-income band and a general-population figure; IHME is general-population only. Low-income obesity runs ~1.1–1.2× the state average, so the income-banded number sits well above the headline.
Horizon. CHOICES stops at 2030; IHME runs to 2050 (~57% national). Their headline numbers aren't directly comparable — different finish lines.
Method. CHOICES is a microsimulation with a quantile-specific NHANES self-report correction; IHME uses GBD crosswalks across 134 sources. Same definition (BMI ≥ 30, adults) — the gaps are vintage / population / horizon / method, not a definitional mismatch.

What it means for pharma — upside

A credible, peer-reviewed baseline already exists at the state level — no need to defend a from-scratch forecast.
The income-banded view shows the Medicaid-eligible population is materially larger than the state headline implies — a bigger addressable base than the obvious number.
Both point the same direction: a durable, rising trend — the category isn't a near-term fad.

What it means for pharma — downside

Both are stale and pre-GLP-1 — neither anchors near-term demand once therapy uptake is in play.
Neither is Medicaid-framed on current data — the wrong denominator for access/contracting.
The eligible population is a moving target. Medicaid redeterminations (post-unwinding), possible expansion rollbacks, and work-requirement proposals will spike or suppress the covered-lives count state by state. Demand projections built on these studies carry large, policy-driven error bars — and no historical model can resolve that (see Part 2 below).

Which gaps can be plugged, and which can't

The gaps that can't be closed are the ones a pharma team should watch most closely: neither these studies nor any public-data model can hand you a number there, so they have to be planned around — with a modeled scenario and a plan's proprietary data — rather than assumed away. That's where the real demand-planning risk sits.

Closable with public data

A transparent, calibrated estimate — defensible as a proxy

A. Currency. Re-fit to BRFSS through ~2023 instead of a 2016 base.
B. Medicaid framing. Re-base the headline to the low-income proxy + an NHANES-derived self-report correction (×1.16).
C. Cost, decomposed. Build obesity's excess cost bottom-up and split it by driver (~80% diabetes + hypertension).
D. GLP-1 ROI framing. Net published drug prices against cited medical-cost offsets, by comorbid segment.
E. Usability. Make it interactive and reproducible: run any state, inspect the method.

Structurally not closable with public data — where pharma should focus

The gap — and why public data can't close it	Can a plan's proprietary data close it?
Enrollee-level prediction. Public data is aggregate; inferring an individual's risk from state averages is the ecological fallacy.	Yes. Longitudinal member-level claims support individual risk models.
A plan's exact numbers. Prevalence and cost depend on a book's specific age/sex/comorbidity mix; public averages can't reproduce it.	Yes. A plan's proprietary data is exactly what produces its real figures.
GLP-1 demand & behavioral response. Uptake, adherence, and discontinuation in the post-2023 era aren't in pre-trend public data.	Partly. A plan's claims give real adherence/discontinuation curves, but future uptake under a new coverage policy still needs assumptions.
Policy-driven eligibility swings. Redeterminations, expansion changes, and work requirements aren't in any historical series — the denominator's future isn't in the data.	No — for anyone. It must be modeled as a scenario, not predicted; a plan's data only sizes the sensitivity. Part 4 of this page provides that scenario model.
State-level unit cost. The public MEPS file isn't state-identified, so cost weights stay national; only prevalence varies by state (via PLACES).	Partly. Closed for a plan's own footprint; an all-state cost picture needs broader claims (e.g., HCUP).
Causal attribution. The obesity→comorbidity link here is associational.	Partly. Claims support quasi-experimental estimates, not RCT-grade proof.
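
The "plan's exact numbers" gap can be illustrated with a minimal sketch. Every stratum, share, and rate below is a hypothetical placeholder, not a figure from this analysis; the only point is that two books facing identical stratum-level rates can land on different overall prevalence because their member mix differs.

```python
# Hypothetical stratum-level obesity rates (placeholders, not published values).
STRATUM_RATE = {"18-39": 0.30, "40-64": 0.42, "65+": 0.36}


def book_prevalence(mix, rates=STRATUM_RATE):
    """Overall prevalence as the share-weighted average of stratum rates."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(share * rates[stratum] for stratum, share in mix.items())


# Two hypothetical books with different age mixes.
younger_book = {"18-39": 0.60, "40-64": 0.30, "65+": 0.10}
older_book = {"18-39": 0.25, "40-64": 0.55, "65+": 0.20}

print(f"younger book: {book_prevalence(younger_book):.1%}")
print(f"older book:   {book_prevalence(older_book):.1%}")
```

With these placeholder inputs the two books differ by several percentage points even though no stratum rate changed, which is the part a public average cannot reproduce.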

The model I built — against the closable gaps, and why each holds up

One calibrated engine, public data, addressing gaps A–E. Each piece carries its own credibility check — most importantly, the cost rebuilt from the ground up lands within ~6% of the published figure.

[Chart: New Jersey prevalence projection. Series: Calibrated Medicaid-eligible (NJ), CHOICES low-income, CHOICES overall.]

The one state with an independent starting point. For New Jersey I ran a separate calibration — re-basing the published state obesity rate to the Medicaid-eligible population on recent (≈2023) survey data, with a self-report correction validated against measured national data. That gives a current ~40% starting point (the blue dot — what I call the "anchor"). Because it's measured-equivalent rather than borrowed from the older studies, NJ's line moves on its own: it lands below CHOICES's low-income proxy (whose 2016 base runs hot) and below the general-population line — deliberately conservative. Every other state defaults to the CHOICES low-income trajectory until a plan supplies its own measured rate (that needs the plan's proprietary data, not something public data can produce). That's why New Jersey is shown here as the worked example rather than drawing a calibrated line for all 50 states.

Closes A

Current, Medicaid-framed prevalence

The headline state rate re-based to the Medicaid-eligible population on BRFSS ≈2023 data, self-report corrected (×1.16, NHANES-derived), shown by county.

Why it holds up: the calibration is shown step-by-step and the correction factor is data-derived — a transparent proxy, presented as a range, not a point claim.
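
A minimal sketch of the re-basing arithmetic, not the engine's actual code. It assumes the low-income ratio (~1.1 to 1.2×) and the self-report correction (×1.16) apply multiplicatively, which is my reading of the "×" notation rather than a stated formula. The headline input is a hypothetical placeholder, not a BRFSS value, and the function name is illustrative.

```python
SELF_REPORT_CORRECTION = 1.16    # NHANES-derived factor quoted on this page
LOW_INCOME_RATIO = (1.10, 1.20)  # low-income vs state-average range quoted on this page


def calibrated_range(headline_rate):
    """Return (low, high) Medicaid-eligible prevalence.

    Assumes both adjustments are multiplicative; results are capped at 100%.
    """
    if not 0.0 <= headline_rate <= 1.0:
        raise ValueError("headline_rate must be a proportion in [0, 1]")
    low_ratio, high_ratio = LOW_INCOME_RATIO
    low = min(headline_rate * low_ratio * SELF_REPORT_CORRECTION, 1.0)
    high = min(headline_rate * high_ratio * SELF_REPORT_CORRECTION, 1.0)
    return low, high


hypothetical_headline = 0.30  # placeholder self-reported state rate
low, high = calibrated_range(hypothetical_headline)
print(f"calibrated range: {low:.1%} to {high:.1%}")
```

Returning a range rather than a single number mirrors how the estimate is presented here: a transparent proxy, not a point claim.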

Closes C

How much obesity costs — decomposed

Bottom-up from public data: ~$1,980/adult/yr, of which ~80% is diabetes + hypertension. It's a diabetes-and-hypertension budget problem wearing an "obesity" label.

Why it holds up: the bottom-up $1,980 lands within ~6% of the published $1,861 (Cawley 2021), reached independently by a different route.
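
The check above is simple arithmetic and can be reproduced directly from the quoted figures. The variable names are illustrative, and the 80% share is the approximate figure quoted on this page.

```python
BOTTOM_UP_COST = 1980.0    # $/adult/yr, bottom-up estimate quoted above
PUBLISHED_COST = 1861.0    # $/adult/yr, Cawley 2021 as quoted above
DIABETES_HTN_SHARE = 0.80  # approximate share quoted above

# Relative gap of the bottom-up figure against the published one.
relative_gap = (BOTTOM_UP_COST - PUBLISHED_COST) / PUBLISHED_COST

# Split the bottom-up cost into the diabetes + hypertension driver and the rest.
diabetes_htn_cost = BOTTOM_UP_COST * DIABETES_HTN_SHARE
other_cost = BOTTOM_UP_COST - diabetes_htn_cost

print(f"gap vs published: {relative_gap:.1%}")
print(f"diabetes + hypertension: ${diabetes_htn_cost:,.0f} per adult per year")
print(f"all other drivers:       ${other_cost:,.0f} per adult per year")
```

The gap works out to roughly 6.4%, and about $1,584 of the $1,980 sits in diabetes and hypertension.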

Closes D

Do GLP-1s pay off — by segment

Drug cost netted against the medical-cost offset by comorbid segment. Even at the ~$2,940 Medicaid price (BALANCE), GLP-1s stay a net cost — the lever is which members (the diabetes/CVD core), not blanket coverage.

Why it holds up: offsets use SELECT's 24% CVD-event reduction and published net prices — cited inputs, not optimistic assumptions.
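
A minimal sketch of the netting logic, under loud assumptions: the ~$2,940 price is treated as an annual per-member cost (the page does not state the period), and the offset is modeled as the SELECT relative reduction times a segment's expected annual CVD-event cost. The segment names and baseline costs are hypothetical placeholders, not the model's inputs.

```python
DRUG_COST = 2940.0         # $/member, Medicaid price quoted above (assumed annual)
CVD_RISK_REDUCTION = 0.24  # SELECT relative CVD-event reduction quoted above


def net_cost(expected_cvd_cost):
    """Drug cost minus the medical offset; a positive value is a net cost."""
    return DRUG_COST - CVD_RISK_REDUCTION * expected_cvd_cost


# Expected annual CVD cost at which the offset alone would cover the drug.
break_even_cvd_cost = DRUG_COST / CVD_RISK_REDUCTION

# Hypothetical expected annual CVD-event cost per member, by segment.
hypothetical_segments = {
    "obesity only": 500.0,
    "obesity + diabetes": 4000.0,
    "established CVD": 9000.0,
}
net_by_segment = {name: net_cost(cost) for name, cost in hypothetical_segments.items()}

for name, value in net_by_segment.items():
    print(f"{name:20s} net cost ${value:,.0f}")
print(f"break-even expected CVD cost: ${break_even_cvd_cost:,.0f}")
```

Under this simplification the CVD offset alone only breaks even once a member's expected annual CVD cost reaches $12,250, which is why the net cost shrinks toward the diabetes/CVD core but stays positive in the placeholder segments.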

Closes B+E

Where the cost is heading — interactive

The comorbidity-loaded cost projected on the CHOICES slope. For a 1.7M-life book: ~$619M (2026) → ~$654M (2030), ~$3.2B cumulative.

Why it holds up: an honest calibrated extrapolation on a validated slope — labelled as such, not dressed up as a dynamic incidence model.
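
The quoted endpoints can be sanity-checked with a short sketch. It assumes a straight line between the 2026 and 2030 figures and a cumulative window of 2026 through 2030 inclusive; neither assumption is stated on this page, and the engine's own path between the endpoints may differ.

```python
BOOK_LIVES = 1_700_000
COST_2026 = 619e6  # $ quoted above
COST_2030 = 654e6  # $ quoted above

years = list(range(2026, 2031))
step = (COST_2030 - COST_2026) / (len(years) - 1)

# Linear interpolation between the two quoted endpoints (an assumption).
trajectory = {year: COST_2026 + i * step for i, year in enumerate(years)}

cumulative = sum(trajectory.values())
implied_growth = (COST_2030 / COST_2026) ** (1 / (len(years) - 1)) - 1
cost_per_life_2026 = COST_2026 / BOOK_LIVES

for year, cost in trajectory.items():
    print(f"{year}: ${cost / 1e6:,.1f}M")
print(f"cumulative 2026-2030: ${cumulative / 1e9:.2f}B")
print(f"implied annual growth: {implied_growth:.2%}")
print(f"2026 cost per covered life: ${cost_per_life_2026:,.0f}")
```

The five-year sum comes to about $3.18B, consistent with the ~$3.2B cumulative figure, and the endpoints imply growth of a little under 1.4% a year.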

The live engine behind all of this — it isn't one gap, it's the whole chain: pick any state and walk eligibility → prevalence → cost → GLP-1 scenario in one tool.

▶ Open the live app — any state, full chain ↗

Cost decomposition ↗ GLP-1 ROI ↗ Cost trajectory ↗ Prevalence projection — to 2030 ↗ Full model comparison ↗

The moving denominator — mapping the eligibility scenarios

The hardest pharma problem in Part 1 was that policy moves the eligible population. You can't forecast that from history — but you can map the scenarios explicitly: for any state, who qualifies under the current rules, and how the addressable base shifts if expansion, thresholds, or categories change. These two tools do exactly that.

Eligibility calculator — per state, 2026 rules ↗ Coverage-gap map — 51 states ↗

Read the projections in Parts 1 and 3 as "no-intervention" baselines: none reflects the post-2023 GLP-1 trend break. They're the curve a coverage program would bend — and the eligibility scenarios above are the denominator that program is applied to.
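
The denominator sensitivity can be sketched in a few lines. This is not the eligibility calculator's logic: the scenario names and enrollment shifts are hypothetical placeholders, and applying the ~40% New Jersey anchor to the whole 1.7M-life book (children included) is a deliberate simplification for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    enrollment_shift: float  # fractional change in covered lives (hypothetical)


def addressable_lives(covered_lives, prevalence, scenario):
    """Covered lives after the policy shift, times obesity prevalence."""
    return covered_lives * (1.0 + scenario.enrollment_shift) * prevalence


BASE_LIVES = 1_700_000  # book size used in the cost example on this page
PREVALENCE = 0.40       # ~40% NJ anchor; applying it to this book is an assumption

scenarios = [
    Scenario("current rules", 0.0),
    Scenario("expansion rollback", -0.20),
    Scenario("work requirements", -0.08),
]
results = {s.name: addressable_lives(BASE_LIVES, PREVALENCE, s) for s in scenarios}

for name, lives in results.items():
    print(f"{name:20s} {lives:,.0f} addressable lives")
```

Each scenario is an input, not a forecast: the sketch sizes how far the addressable base moves for a given shift, and leaves the likelihood of that shift as a planning judgment.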

What I'm building next

Two extensions are in progress. First, the same engine run on member-level data shaped like a real Medicaid book — a synthetic, no-PHI demo of 50,000 members (already prototyped) that shows the member-level output without touching any real records. Second, an all-states integrated view that layers this cost engine onto live state-level comorbidity data, so the full chain runs for every state, not just the New Jersey worked example.

These are also where the gaps public data can't close (Part 2) would be addressed — on a plan's proprietary data, which is a separate engagement, not a public-data result. The policy-driven eligibility swings remain a modeled scenario in either case (the model in Part 4).