Is there going to be an avocado shortage or what?
It’s always a nice surprise to be reminded of my time at Thinkful. A friend of mine recently made the following exclamation:

Here’s a link to the story.
Avocado Past
In and on the dataworld, there is an avocado dataset. Within the data science community, there are multiple versions of this dataset floating around, primarily focused on the enticing task of using the data to predict the prices of avocados.
The scenario occasionally references the great avocadopocalypse of 2017.
Avocado Present
Past competitions and postings eventually led me to the Hass Avocado Board (HAB) website. After signing up, I am able to access volume, pricing, and sales data going back to 2019. They also offer some projections to go along with it. The goal here is not to predict avocado prices, however. I just want to know if I’m going to be able to get me some avocados, because frozen is a bad idea.
The Data
The data consists of several CSV files provided per year and broken into two segments:
- Segment 1: Volume and Pricing
  - Per Unit Sales Volumes
  - Per Type (Large/Extra Large/Small) Sales Volumes
  - Average Selling Price (ASP)
  - Total US Amount Sold
  - Breakdown of amounts sold in select US regions
- Segment 2: Production Projections
  - Per Pound Sales Volumes
  - Select regional past production volumes
  - Select regional projected production volumes
The reports are provided on a weekly basis, with each week intended to end on Sunday.
The Process
Given a series of CSV files, I used Excel Power Query to transform the data. There’s not a lot of it, and Power Query is what’s readily available. While exploring the data, I resolved the following:
- One Bad Date: The first date reported is Monday, January 7, 2019. Every later data point is reported on a Sunday.
  - This wouldn’t be a complication, but I want to look at the combined sales totals against the projected and actual production reports, and the only key I can join on is the date.
  - It’s one day out of somewhere in the neighborhood of 163, so I adjusted it from 1/7/2019 to 1/6/2019. I now have a primary key.
- Pound vs. Unit: The pricing data is provided by the unit. Production data is provided by the pound.
  - I would prefer units, but there is no conversion factor easily found.
  - Instead, I converted the number of units to pounds.
    - I pulled the approximate weight by size for avocados from indexfresh.com.
    - The California Avocados website provides size information related to the avocado’s PLU.
    - Sales of bulk or bagged units have to make do with an average weight, calculated as the average of the weights across all PLUs.
- It Didn’t Add Up: When validating the total avocados sold against the HAB website, the numbers didn’t match for 2022.
  - Our results for earlier reports matched.
  - Closer inspection suggests that there is a glitch in the viz for 2022, as the CSV file goes beyond January 9, 2022.
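
For anyone who would rather script these fixes than click through Power Query, here is a minimal Python sketch of the date adjustment and the unit-to-pound conversion described above. It is not what I actually ran. The function names, the size labels, and especially the weights are hypothetical placeholders, not the figures from indexfresh.com or the California Avocados website.

```python
from datetime import date, timedelta

# PLACEHOLDER weights in pounds per avocado. These are made-up values for
# illustration only, not the published size/weight figures.
PLACEHOLDER_LB_PER_UNIT = {"small": 0.25, "large": 0.5, "extra_large": 0.75}


def snap_to_sunday(d):
    """Return the most recent Sunday on or before the given date."""
    # date.weekday() returns Monday=0 ... Sunday=6
    days_since_sunday = (d.weekday() + 1) % 7
    return d - timedelta(days=days_since_sunday)


def units_to_pounds(units_by_type, weights=PLACEHOLDER_LB_PER_UNIT):
    """Estimate total pounds from unit counts keyed by avocado type.

    Types without a known weight (for example bagged or bulk units) fall
    back to the plain average of all known weights.
    """
    average_weight = sum(weights.values()) / len(weights)
    return sum(
        units * weights.get(kind, average_weight)
        for kind, units in units_by_type.items()
    )


# The one bad date: Monday, January 7, 2019 becomes Sunday, January 6, 2019.
fixed = snap_to_sunday(date(2019, 1, 7))

# A toy week of sales: sized avocados plus bagged units of unknown size.
pounds = units_to_pounds({"small": 4, "large": 2, "bagged": 2})
```

Snapping every date to its Sunday, rather than editing the single row by hand, also leaves dates that are already Sundays untouched, so it is safe to run over the whole column.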

The Viz
My weekend was spent researching, procuring, and cleaning data, so it’s only natural that I stop thinking about the future and take a step back to look at what I’ve got.

Using pricing and production information provided by HAB, avocado size requirements based on reports from the California Avocados website, and avocado size/weight requirements reported by indexfresh.com, we learn:
- Mexico is the leading producer of avocados.
- The number of units sold in the US closely matches the number of units produced globally, within the regional production totals available.
- The rough estimated number of pounds sold inside the US is much lower than the weight of avocados produced within the regional production totals available.
Avocado Future
Whether the American demand for avocados can be met without Mexico is indeterminate.
- The data that I have on hand only represents California’s contribution to the supply of avocados in the US.
- According to the Agriculture Marketing Solutions, California is not the only state to contribute to avocado production.
- The data does not entirely represent US production totals.
- There is a lack of consistency with regard to the regional production estimates.
Whether the total units of avocados sold is the equivalent of the metric tons that are produced, imported, and distributed in the US needs further research/analysis.
It might be worth comparing apples to avocados. Avocados are not the top-selling fruit in the US. Is that because of the supply?
I don’t know. It could be the price.