Exploring Humira prescription costs using Medicare Part D data

Earlier this year I joined the many thousands of Americans who regularly take Humira to treat conditions like rheumatoid arthritis, Crohn's, psoriasis, and in my case, ankylosing spondylitis. I'm lucky enough to say that it's done wonders for my health and allowed me to reimagine what my body is capable of, from running to backpacking to simply waking up each morning pain-free.

But Humira comes at a cost. When my doctor first recommended it to me, I was shocked to learn that it's one of the biggest selling drugs in the world, bringing in nearly $20 billion in revenues in 2019 alone. I braced myself for the worst, and looked up the average cost of a one-month supply: $5,800.

Once I got over the sticker shock, I realized that my insurance would cover most of the cost, and I later discovered that with the drugmaker Abbvie's patient assistance program I could lower my co-pay to just $5 per shipment. This makes me, undoubtedly, one of the lucky ones. Less fortunate are the 30 million Americans without health insurance who couldn't dream of paying out-of-pocket for Humira, or the well over 100 million Americans who have non-commercial insurance (e.g. Medicare, Medicaid, VA health benefits) and are therefore ineligible for Abbvie's reduced co-pay.

But I still wondered: Why is Humira so expensive? It's partly explained by the monopoly its maker, Abbvie, has maintained in the US. While five generic alternatives have been approved by the FDA and some are already on the market in Europe, Abbvie has prevented them from reaching the hands of American patients through a controversial patent strategy. According to The Washington Post:

Humira was approved by the FDA in 2002 and its core patents expired in 2016, according to the Biosimilars Council, a division of the Association for Accessible Medicines, which represents generic manufacturers. But AbbVie won dozens of additional patents — what critics call a “patent thicket” — that extended the exclusive market for the drug to as late as 2034, the council said.

What does this monopoly mean for the cost of Humira? And is there any sign that the drug could become more affordable in the future? I decided to use publicly available data to find out.

Loading packages

I'm trying out a few new packages in this post. RSocrata makes it super easy to hit the Data.CMS.gov API, which I used to query provider-level Medicare claim summaries from 2013-2018. reactable and htmltools work together to make beautiful and highly styleable tables, and sparkline allows for easy row-level visualizations. Lastly, widgetframe is a handy package that embeds widgets inside iframes, meaning I can avoid the jQuery conflicts described in this chapter of the blogdown book (this is only relevant if you're loading multiple htmlwidgets on a single page).

library(tidyverse)
library(RSocrata)
library(reactable)
library(htmltools)
library(widgetframe)
library(sparkline)

The data

Medicare Part D is the section of Medicare that provides drug coverage to enrollees. Unlike for prescriptions obtained through private plans, Medicare prescription claims data is made public each year in summarized forms. For my purposes, I decided to look at the provider- and service-level Medicare Provider Utilization and Payment Data: Part D Prescriber data available from the CMS. Data sets are available for each year between 2013 and 2018. Each row describes a medical service performed by a given provider and the number of Medicare claims made that year, as well as their total cost.

Per the CMS: "Although the Part D Prescriber PUF has a wealth of information on payment and utilization for Medicare Part D prescriptions, the dataset has a number of limitations. Of particular importance is the fact that the data may not be representative of a physician’s entire practice or all of Medicare as it only includes information on beneficiaries enrolled in the Medicare Part D prescription drug program (i.e., approximately two-thirds of all Medicare beneficiaries). In addition, the data are not intended to indicate the quality of care provided."

The CMS has a Socrata API set up to make access to the data simple after a few authentication steps. I registered for an account, created an app token, and was able start hitting the endpoint in a matter of minutes using the RSocrata package, which structured the calls I was making. Filtering was just a matter of appending ?[field_name]=[field_value] to the end of the statement. I requested all records where the drug associated with the claim had the generic name Adalimumab, which is Humira's technical name — this allowed me to collect data on Humira's variants (e.g. syringe vs. pen injectors) and was more comprehensive than just searching for "Humira." Lastly, since each year's data set had a different endpoint, I collected those in a tibble and used purrr::map_dfr() to iteratively query and collate the data I wanted. Here's the script I wrote, which should be easily repurposable for exploring other drug costs:

api_endpoints <-
  tribble(
    ~ endpoint, ~ claim_year,
    # 2018 CY
    "https://data.cms.gov/resource/mhdd-npjx.json", 2018,
    # 2017 CY
    "https://data.cms.gov/resource/77gb-8z53.json", 2017,
    # 2016 CY
    "https://data.cms.gov/resource/yvpj-pmj2.json", 2016,
    # 2015 CY
    "https://data.cms.gov/resource/3z4d-vmhm.json", 2015,
    # 2014 CY
    "https://data.cms.gov/resource/465c-49pb.json", 2014,
    # 2013 CY
    "https://data.cms.gov/resource/4uvc-gbfz.json", 2013
  )

### API Docs: https://dev.socrata.com/foundry/data.cms.gov/mhdd-npjx
api_endpoints$endpoint %>%
  map_dfr(
    ~ mutate(
      read.socrata(
        paste0(., "?generic_name=ADALIMUMAB"),
        app_token = Sys.getenv("CMS_APP_TOKEN"),
        email     = Sys.getenv("CMS_EMAIL"),
        password  = Sys.getenv("CMS_PASSWORD")
      ),
      endpoint = .
    ) %>%
      mutate_at(vars(ends_with("_count"), starts_with("total_"), bene_count_ge65), as.double)
  ) %>%
  left_join(api_endpoints, by = "endpoint") %>%
  select(-endpoint) %>%
  mutate(
    specialty_description = if_else(is.na(specialty_description), specialty_desc, specialty_description),
    claim_year = as.character(claim_year)
  ) %>%
  select(-specialty_desc) %>%
  write_rds("data/cms_adalimumab.rds")

I saved the results locally so I wouldn't have to make repeated calls to the API. I ended up with a 67,813 by 25 data frame and, after some cleaning, was ready to start looking for variations in cost.

setwd("~/personal-website/content/post/humira-cost/")

df <-
  read_rds("data/cms_adalimumab.rds")

df_sum <-
  df %>%
  select(claim_year, total_claim_count, total_drug_cost) %>%
  group_by(claim_year) %>%
  summarise_if(is.numeric, ~ sum(., na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(avg_cost = total_drug_cost / total_claim_count) %>%
  arrange(desc(claim_year))

Visualizations

As it turns out, the cost of Humira steadily rose year over year from 2013-2018. I was shocked to see that the average cost of a single month's supply was $2,898.74 in 2013, compared to $5,974.17 just five years later. Notably, the price increases didn't slow down after 2016 even though Abbvie's core patents expired, since their patent thicket strategy prevented any domestic competition from entering the market.

df %>%
  group_by(claim_year, npi) %>% 
  summarise(
    total_claims = sum(total_claim_count, na.rm = TRUE),
    avg_cost = sum(total_drug_cost, na.rm = TRUE) / total_claims
  ) %>%
  ungroup() %>% 
  ggplot(aes(claim_year, avg_cost)) + 
  geom_boxplot(outlier.color = "#b1005e", outlier.size = rel(.75)) +
  scale_y_continuous(labels = scales::dollar) +
  labs(
    x = "Claim year",
    y = "Avg. claim cost per provider",
    title = "The cost of Humira is steadily on the rise..."
  ) +
  theme(text = element_text(family = "Inter")) + 
  theme_minimal()

df %>% 
  group_by(claim_year, npi) %>% 
  summarise(
    total_claims = sum(total_claim_count, na.rm = TRUE),
    avg_cost = sum(total_drug_cost, na.rm = TRUE) / total_claims
  ) %>%
  ungroup() %>% 
  ggplot(aes(claim_year, avg_cost)) + 
  geom_boxplot(outlier.color = "#b1005e", outlier.size = rel(.75)) +
  scale_y_continuous(labels = scales::dollar) +
  coord_cartesian(ylim = c(2000, 7500)) + 
  labs(
    x = "Claim year",
    y = "Avg. claim cost per provider",
    title = "...doubling between 2013 and 2018"
  ) +
  theme(text = element_text(family = "Inter")) + 
  theme_minimal()

Here's a summary table, that shows the number of Medicare claims increasing year-over-year as well, built using reactable.

bar_chart_count <- function(label, width = "100%", height = "16px", fill = "#b1005e", background = NULL) {
  bar <- div(style = list(background = fill, width = width, height = height))
  chart <- div(style = list(flexGrow = 1, marginLeft = "8px", background = background), bar)
  div(style = list(display = "flex", alignItems = "center"), scales::comma(label), chart)
}

bar_chart_dollar <- function(label, width = "100%", height = "16px", fill = "#f0b8b8", background = NULL) {
  bar <- div(style = list(background = fill, width = width, height = height))
  chart <- div(style = list(flexGrow = 1, marginLeft = "8px", background = background), bar)
  div(style = list(display = "flex", alignItems = "center"), scales::dollar(label), chart)
}

df_sum %>%
  select(-total_drug_cost) %>%
  reactable(
    class = "summary-table",
    columns = list(
      claim_year = colDef(name = "Claim Year", headerClass = "header", align = "left"),
      total_claim_count = colDef(
        name = "Total Claims",
        align = "left",
        cell = function(value) {
          width <- paste0(value / max(df_sum$total_claim_count) * 100, "%")
          bar_chart_count(value, width = width)
        }
      ),
      avg_cost = colDef(
        name = "Avg. Cost",
        align = "left",
        format = colFormat(currency = "USD"),
        cell = function(value) {
          width <- paste0(value / max(df_sum$avg_cost) * 100, "%")
          bar_chart_dollar(value, width = width)
        }
      )
    )
  )

I was also curious about who was prescribing Humira, since the data set includes a specialty_description field that gives some insight into the prescribers. As I expected, rheumatologists are prescribing it most often. There was hardly any variation in the costs reported across practices, since the same medication is used to treat so many conditions.

# Colors from http://tsitsul.in/blog/coloropt/
custom_pal <-
  c(
    "#00b25d",
    "#ddb310",
    "#b51d14",
    "#00beff",
    "#fb49b0"
  )
## Prescriber specialties
df %>%
  mutate(
    specialty_description = if_else(
      specialty_description == "Student in an Organized Health Care Education/Training Program",
      "Student in an Organized Health Care\nEducation/Training Program",
      specialty_description
    )
  ) %>%
  group_by(specialty_description) %>%
  summarise(total_claims = sum(total_claim_count, na.rm = TRUE)) %>%
  ungroup() %>%
  top_n(n = 15, wt = total_claims) %>%
  ggplot(aes(reorder(specialty_description, total_claims), total_claims)) +
  geom_point(color = "#b1005e") +
  coord_flip() +
  scale_y_continuous(labels = scales::comma) +
  labs(
    x = NULL,
    y = "Total Claims",
    title = "Top 15 prescriber specialties for Humira claims"
  ) + 
  theme(text = element_text(family = "Inter")) + 
  theme_minimal()

# Total claims across specialties
df %>%
  mutate(claim_year = as.character(claim_year)) %>%
  mutate(specialty_description = as.character(fct_lump(specialty_description, n = 4))) %>%
  group_by(claim_year, specialty_description) %>%
  summarise(total_claims = sum(total_claim_count, na.rm = TRUE)) %>%
  ungroup() %>%
  ggplot(aes(claim_year, total_claims, color = fct_reorder(specialty_description, -total_claims), group = specialty_description)) +
  geom_line() +
  scale_y_continuous(labels = scales::comma) +
  scale_color_manual(values = custom_pal) +
  labs(
    x = "Claim year",
    y = "Total claims",
    color = "Prescriber specialty",
    title = "Total Medicare claims for Humira variants, 2013-2018"
  ) +
  theme(text = element_text(family = "Inter")) + 
  theme_minimal()

# Average cost across specialties
df %>%
  mutate(claim_year = as.character(claim_year)) %>%
  mutate(specialty_description = as.character(fct_lump(specialty_description, n = 4))) %>%
  group_by(claim_year, specialty_description) %>%
  summarise(
    total_claims = sum(total_claim_count, na.rm = TRUE),
    avg_cost = sum(total_drug_cost, na.rm = TRUE) / total_claims
  ) %>%
  ungroup() %>%
  ggplot(aes(claim_year, avg_cost, color = fct_reorder(specialty_description, -total_claims), group = specialty_description)) +
  geom_line() +
  scale_y_continuous(labels = scales::dollar) +
  scale_color_manual(values = custom_pal) +
  labs(
    x = "Claim year",
    y = "Average cost",
    color = "Prescriber specialty",
    title = "Average Medicare claim cost for Humira variants, 2013-2018"
  ) +
  theme(text = element_text(family = "Inter")) + 
  theme_minimal()

Lastly, I wanted to see if the cost varied by geography at all. Surprisingly, there's some fairly significant variation: Delaware has the highest average costs at $7,035.66, whereas in Hawai'i the average cost is only $5,336.44. I'm not sure I needed another reason to not move to Delaware, but duly noted.

Unsurprisingly, though, costs are rising uniformly across the country. And while some states have bumpier adoption curves, the market for Humira is clearly growing steadily, as well. With its anti-competitive market protections in place, it looks like Humira will continue to be a top dollar product for Abbvie, who have little incentive to lower costs under current circumstances — a cause for concern for the many thousands of Humira patients who rely on the drug for pain management.

df_geo <-
  df %>% 
  filter(!nppes_provider_state %in% c("XX", "ZZ", "GU", "VI", "PR")) %>% 
  group_by(state = nppes_provider_state, claim_year) %>% 
  summarise(
    total_claims = sum(total_claim_count, na.rm = TRUE),
    avg_cost = sum(total_drug_cost, na.rm = TRUE) / total_claims
  ) %>%
  ungroup() %>% 
  group_by(state) %>% 
  summarise(
    total_claims_18 = sum(if_else(claim_year == "2018", total_claims, 0)),
    avg_cost_18 = sum(if_else(claim_year == "2018", avg_cost, 0)),
    total_claims_change = (total_claims_18 - sum(if_else(claim_year == "2013", total_claims, 0))) / 
      sum(if_else(claim_year == "2013", total_claims, 0)),
    avg_cost_change = (avg_cost_18 - sum(if_else(claim_year == "2013", avg_cost, 0))) / 
      sum(if_else(claim_year == "2013", avg_cost, 0)),
    total_claims_year = list(total_claims),
    avg_cost_year = list(avg_cost)
  ) %>% 
  ungroup() %>% 
  select(state, total_claims_18, total_claims_change, total_claims_year, avg_cost_18, avg_cost_change, avg_cost_year)

  df_geo %>%
  reactable(
    class = "summary-table",
    pagination = FALSE,
    columns = list(
      state = colDef(name = "State", headerClass = "header", align = "left"),
      total_claims_18 = colDef(
        name = "Total Claims in 2018",
        align = "center"
      ),
      total_claims_change = colDef(
        name = "Change since 2013",
        format = colFormat(percent = TRUE, digits = 1),
        align = "center"
      ),
      total_claims_year = colDef(
        name = "Trend",
        cell = function(value, index) {
          sparkline(df_geo$total_claims_year[[index]], type = "bar", barColor = "#b1005e")
        },
        align = "center"
      ),
      avg_cost_18 = colDef(
        name = "Avg. Cost in 2018",
        align = "center",
        format = colFormat(currency = "USD")
      ),
      avg_cost_change = colDef(
        name = "Change since 2013",
        format = colFormat(percent = TRUE, digits = 1),
        align = "center"
      ),
      avg_cost_year = colDef(
        name = "Trend",
        cell = function(value, index) {
          sparkline(df_geo$avg_cost_year[[index]], lineColor = "#b1005e", fillColor = "#b1005e")
        },
        align = "center"
      )
    )
  ) %>% 
  frameWidget()

If Abbvie's patent thicket strategy continues to prevail (and they've already successfully defended it in court), the domestic cost of Humira is likely to keep rising. Like most conditions treated by Humira, my ankylosing spondylitis is chronic, meaning that I'll need to find ways to keep paying for Humira for the foreseeable future or else forgo care. While I'm in a strong position at the moment, it's not hard to imagine the potential constraints I'll face if anything about my current insurance situation changes:

  • If I switch jobs, I'll need to establish coverage on a new health plan immediately. This summer when I moved to Montana, it took me nearly two months to completely manage the transition to a new provider, insurance plan, and pharmacy, and I had to pause my medication while I sorted everything out.

  • If I lose a job or step away from work for an extended period of time, I'll need to either scramble to find temporary insurance or simply forgo my medication, since the out-of-pocket cost without insurance is already unmanageable.

  • If I were retired today and on Medicare (this is much more of a hypothetical, but bear with me) or needed to enroll in Medicaid, I would be ineligible for Abbvie's co-pay assistance and would likely pay nearly $5,200 out-of-pocket each year.

These are personal reasons, but they echo the concerns of the millions of Americans who are navigating a health landscape rife with anti-competitive market practices, skyrocketing out-of-pocket expenses, and tenuous guarantees of care outside the confines of the commercial insurance industry. I don't mean to discount the way Humira has changed my life for the better, but my experience with the drug has been equally colored by an up-close look at how broken and unjust America's health system can be.

Benjamin Chang Sorensen
Benjamin Chang Sorensen
Seattle, WA • he/him

Related