The economy has been big news during the pandemic. In recent months, we’ve had skyrocketing CPI prints, massive supply disruptions, shipping bottlenecks, warnings about consumer sentiment (consumer spending is 70% of US GDP), and many eyes glued to economic indicators. While some indicators (e.g. CPI) are published monthly, others are only available after the end of the quarter (e.g. e-commerce retail sales), sometimes with a sizeable delay.
Can we aggregate multiple sources of data to estimate how something like retail sales might be trending, months before the official numbers are released?
Here, I experiment with using US Google Trends data and some economic readouts to extrapolate the current health of the retail sector. I will be using Google Trends data on e-commerce retailers specifically, because there isn’t a good proxy for trips to brick-and-mortar stores. The biggest general-purpose e-commerce retailers are Amazon, Walmart, and eBay, so I think it’s sufficient to focus on those three.
Market share of leading retail e-commerce companies in the United States as of October 2021
Statista 2021. https://www.statista.com/statistics/274255/market-share-of-the-leading-retailers-in-us-e-commerce/
As to the question of “why not just directly look at the economic indicator for retail sales?”, the RETAILSMNSA indicator for September 2021 was only just released. It’s currently November 21st. While the US Census Bureau is undoubtedly the final authority, we might actually be curious about the current state of affairs.
I will use “hits” to refer to the scaled numbers out of Google Trends, since that’s what they call them. And for brevity and entertainment’s sake, combined numbers across the three retailers Amazon, Walmart, and eBay will be referred to as “AWE”.
There is a laundry list of limitations and assumptions to this analysis, which you can see at the end. Also, note that the more manipulations we do on data, the harder it is to interpret. We will be entering that territory.
This data was downloaded and analyzed on November 21, 2021.
To get an idea of what we’re working with, here’s a basic figure showing the raw hits on our three retailers over the past 11 years (it’s month-by-month data). Note that Google Trends scales the hits such that the month with the highest search activity across all search terms queried is always 100.
We see that Amazon and, to a lesser extent, Walmart, are seeing a growth in search activity, while eBay has been in decline. There’s always a spike in search activity around the end of the year, and a smaller spike around the end of summer (back-to-school shopping?). And of course, we see the COVID spike in the first half of 2020.
We can also create an aggregate measure, as seen below, by adding up the hits for the three retailers in each month we have data on. This will be useful in the next section, when we combine this aggregate measure with e-commerce sales.
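As a minimal sketch of that aggregation, assuming the Google Trends export has already been parsed into a mapping from month to per-retailer hits: the numbers and the `aggregate_hits` helper below are made up for illustration and are not the actual data or code behind the figures.

```python
# Hypothetical monthly hits per retailer (illustrative values only,
# not real Google Trends data).
hits = {
    "2021-07": {"amazon": 70, "walmart": 30, "ebay": 12},
    "2021-08": {"amazon": 72, "walmart": 33, "ebay": 11},
    "2021-09": {"amazon": 68, "walmart": 29, "ebay": 11},
}


def aggregate_hits(hits_by_month):
    """Sum hits across retailers for each month to get the combined "AWE" series."""
    return {
        month: sum(by_retailer.values())
        for month, by_retailer in hits_by_month.items()
    }


awe = aggregate_hits(hits)
for month, total in awe.items():
    print(month, total)
```

Because all three search terms come from the same Google Trends query, their hits share one scale, which is what makes a plain sum a reasonable aggregate here.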
Non-seasonally adjusted e-commerce retail sales (in millions of $) are reported quarterly by the US Census Bureau as ECOMNSA. Here is what the past 11 years of data look like. There’s quite a stable trend and seasonal effect up until 2020.
E-commerce retail sales as a percentage of total sales are also reported quarterly by the US Census Bureau as ECOMPCTNSA. There’s also a stable trend and seasonal effect (more e-commerce in Q4), up until 2020.
Now for the fun part – combining the three data sets.
We can do this by first allocating each quarter’s e-commerce sales across the three months in that quarter, in proportion to each month’s aggregate hits from the previous section. When we do that, we get the following estimated e-commerce sales per month:
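The proportional allocation can be sketched roughly as follows. This is an illustration under my own assumptions (the `allocate_quarter` helper and the example numbers are hypothetical), not the exact code used to produce the figure.

```python
def allocate_quarter(quarter_sales, monthly_hits):
    """Split one quarter's sales across its months in proportion to aggregate hits.

    quarter_sales: total e-commerce sales for the quarter (millions of $).
    monthly_hits:  mapping of month label -> aggregate (AWE) hits for that month.
    """
    total_hits = sum(monthly_hits.values())
    if total_hits == 0:
        raise ValueError("cannot allocate sales when all months have zero hits")
    return {
        month: quarter_sales * month_hits / total_hits
        for month, month_hits in monthly_hits.items()
    }


# Hypothetical example: a quarter with 400 (million $) in sales, where the
# third month had twice the search activity of each of the first two.
example = allocate_quarter(400.0, {"m1": 100, "m2": 100, "m3": 200})
print(example)
```

By construction, the monthly estimates always add back up to the reported quarterly total, so this step only redistributes known sales and does not forecast anything.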
Note that the red points (October and November 2021) are even more of an estimate than the other points, because we don’t have retail sales for Q4 yet. How did we guesstimate where the red points should be?
The trick was to do an intermediate estimation of how many dollars of sales each Google Trends hit corresponded to. We take quarterly sales and divide by the aggregate hits across the retailers for that quarter (“sales per hit”), producing the figure below. It suggests that over the past 11 years, the amount of e-commerce spending associated with each Google Trends hit for these retailers has increased substantially. I’m not sure if that’s simply indicative of the growing reliance on e-commerce, or an artifact of the particulars of this data. Regardless, it looks like sales per hit isn’t super noisy, so we can assume the Q3 value will be roughly valid for Q4, enabling us to estimate current Q4 sales by month. We will actually use a slightly higher estimate ($550 million per hit), indicated by the blue dot, since numbers have been steadily trending upwards.
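Here is a rough sketch of the sales-per-hit step and the Q4 extrapolation. The function names and the monthly hit counts are hypothetical; only the $550 million per hit figure comes from the text above.

```python
def sales_per_hit(quarter_sales, monthly_hits):
    """Quarterly sales (millions of $) divided by total aggregate hits in that quarter."""
    total_hits = sum(monthly_hits)
    if total_hits == 0:
        raise ValueError("no hits recorded for this quarter")
    return quarter_sales / total_hits


def estimate_month_sales(month_hits, per_hit):
    """Estimate one month's sales as its aggregate hits times an assumed sales-per-hit."""
    return month_hits * per_hit


# Hypothetical quarter: 1,000 (million $) in sales over 500 aggregate hits.
ratio = sales_per_hit(1000.0, [100, 150, 250])

# Assumed Q4 value from the text: 550 (million $) per hit.
ASSUMED_Q4_SALES_PER_HIT = 550

# Hypothetical aggregate hits for a Q4 month (not real data).
october_estimate = estimate_month_sales(120, ASSUMED_Q4_SALES_PER_HIT)
print(ratio, october_estimate)
```

For quarters with reported sales, multiplying a month’s hits by that quarter’s sales per hit gives the same number as the proportional allocation; the only difference for Q4 is that the ratio is assumed rather than measured.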