Arnold Kling  

Open Source Forecasting


Google's Hal Varian and Hyunyoung Choi write,

We find that Google Trends data can help improve forecasts of the current level of activity for a number of different economic time series, including automobile sales, home sales, retail sales, and travel behavior...

we expect that there are several other interesting ideas out there. So we suggest that forecasting wannabes download some Google Trends data and try to relate it to other economic time series.
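For readers who want to try that suggestion, here is a rough sketch of what "relating" a search index to an economic series could look like. It is not the authors' code. The data are synthetic stand-ins, and the function names (`make_synthetic`, `rolling_mae`, `compare`), the one-lag baseline, and the 40-observation warm-up are all assumptions made for illustration. The idea is to fit a simple autoregression with and without the search index and compare one-step-ahead errors on observations the model has not yet seen.

```python
import numpy as np


def make_synthetic(n=120, seed=0):
    """Build fake data: a 'trends' index and a series that partly depends on it."""
    rng = np.random.default_rng(seed)
    trends = rng.normal(size=n)  # hypothetical stand-in for a search index
    y = np.zeros(n)
    for t in range(1, n):
        # The series depends on its own lag, the current index value, and noise.
        y[t] = 0.6 * y[t - 1] + 0.8 * trends[t] + rng.normal(scale=0.5)
    return y, trends


def rolling_mae(target, features, start):
    """Mean absolute one-step-ahead error, refitting OLS on past data only."""
    errors = []
    for t in range(start, len(target)):
        # Fit on observations 0..t-1, then predict observation t.
        coef, *_ = np.linalg.lstsq(features[:t], target[:t], rcond=None)
        errors.append(abs(target[t] - features[t] @ coef))
    return float(np.mean(errors))


def compare(y, trends, start=40):
    """Return (baseline MAE, augmented MAE) for a one-lag autoregression."""
    target = y[1:]
    ones = np.ones(len(target))
    baseline = np.column_stack([ones, y[:-1]])               # lag only
    augmented = np.column_stack([ones, y[:-1], trends[1:]])  # lag + index
    return rolling_mae(target, baseline, start), rolling_mae(target, augmented, start)


if __name__ == "__main__":
    base_mae, aug_mae = compare(*make_synthetic())
    print(f"baseline MAE:  {base_mae:.3f}")
    print(f"augmented MAE: {aug_mae:.3f}")
```

The synthetic series is built so that the index matters, so the augmented model wins here by construction. With real data the gap may shrink or disappear, which is why the comparison is made out of sample and against a simple baseline.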


CATEGORIES: Economic Methods

COMMENTS (3 to date)
rpl writes:

I was pretty skeptical of the notion of "forecasting" anything using Google data, but the Google article is not really talking about forecasting as such, but rather getting data on current happenings more quickly than you could get it by waiting on the official statistics. That seems a lot more plausible.

My first thought was to wonder how well Google searches really correlate with actual behavior. I've only skimmed the paper, but it looks like their model contains some adjustable parameters, and the phrase "cross-validation" doesn't appear anywhere in the text. That isn't a good sign, but one would have to read the procedure more carefully to untangle the subtleties.

My second thought was, how easily could these statistics be manipulated? If people come to rely on Google Trends and its models, then the operators of, say, a large botnet could generate a bunch of bogus searches to create the appearance of a fake recovery in, say, retail sales. Stocks of retailers would presumably surge, creating an opportunity for the perpetrators to profit using short sales or judicious options purchases.

Google is justifiably proud of its mammoth data set, but I don't think they've given much thought to quality-controlling the data. QC can be a real headache even in cases where the data source is well understood. For example, in meteorology, bad ASOS and radiosonde observations sometimes make it into models, and good ones are sometimes erroneously rejected. Both types of error have been known to compromise forecasts, and I would aver that human users are even less predictable than weather instruments. Therefore, I would conjecture that the QC problem will be a deal-breaker for using Google Trends data as a significant economic indicator.

Bman writes:

Dear Dr. Kling,

Thanks for the excellent blog. Related to this post, you might want to have a look here:

Many of the series that correlate with Google Trends data can often be forecast just as well or better using standard data and simple techniques.

Thank you.

Patri Friedman writes:

My old team at Google :). (Although I worked on other stuff - auctions, competitiveness of the search market)

rpl - you are totally wrong about the QC. Keep in mind that Google searches => Google ads => Google revenue, and that Google bills its advertisers on that basis. False searches mean false billing for ads. As a company that cares about providing long-term value, Google goes to enormous effort to identify many kinds of fraudulent searches ("search spam", "ad spam") and eliminate them from its records.

Sure, it is imperfect, but QC is *not* ignored. Lots of effort goes into QCing that data because advertisers are billed based on it.
