Bryan Caplan  

Cross-Country Regressions and Population Weighting

Whenever someone presents growth regressions, a robustness check pops into my head: "What happens if you weight by population?"  Treating China and Grenada as equally probative data points just seems crazy to me.  If you want to understand how human societies function, one gigantic society has far more to teach us than two tiny islands.

Is my intuition econometrically sound?  Bill Dickens, my Econ 1 teacher, has a paper subtitled "Is It Ever Worth Weighting?" (Review of Economics and Statistics, 1990) - and originally subtitled "Why It's Never Worth Weighting."  This apparent absolutism prompted me to send Bill the following hypothetical:

Someone is doing a cross-country regression, counting each member of the European Union as a single data point. A critic says, "Bill Dickens showed that it is never worth weighting. We should just treat the whole EU as one data point despite its huge population."

In your framework, how is the critic wrong? Why then wouldn't he be arguably wrong for any large, diverse country?

In conversation, Bill's response was modest: His paper only addressed a very different rationale for population weighting.  For cross-country regressions, population weighting might be entirely suitable.

Last year, Solon et al. published a more comprehensive piece on the topic, entitled "What Are We Weighting For?" (Journal of Human Resources, 2015).  While they raise multiple technical issues, their advice is straightforward: When population-weighting matters, researchers should alert their readers and reflect on the source of the contrast.

[W]e recommend reporting both the weighted and unweighted estimates because the contrast serves as a useful joint test against model misspecification and/or misunderstanding of the sampling process.
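
To make that recommendation concrete, here is a minimal sketch of reporting both estimates side by side. Everything in it is an assumption for illustration: the six "countries" and their numbers are made up, the helper name least_squares is hypothetical, and the data are rigged so that two giants follow one slope while four small states follow another. It is not the procedure from either paper, just plain weighted least squares in NumPy.

```python
import numpy as np

def least_squares(X, y, weights=None):
    """Return least-squares coefficients; with weights, solve weighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones(len(y)) if weights is None else np.asarray(weights, dtype=float)
    root_w = np.sqrt(w)
    # Scaling each row by sqrt(weight) turns WLS into an ordinary least-squares problem.
    beta, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return beta

# Made-up numbers, not real countries: two giants and four small states.
population = np.array([1000.0, 800.0, 5.0, 3.0, 2.0, 1.0])  # hypothetical, in millions
x = np.array([1.0, 2.0, 1.0, 2.0, 3.0, 4.0])                # hypothetical regressor
y = np.array([2.0, 4.0, 0.5, 1.0, 1.5, 2.0])                # giants: y = 2x; small states: y = 0.5x

X = np.column_stack([np.ones_like(x), x])  # intercept column plus the regressor

beta_unweighted = least_squares(X, y)
beta_weighted = least_squares(X, y, weights=population)

print("unweighted slope:", round(beta_unweighted[1], 3))
print("population-weighted slope:", round(beta_weighted[1], 3))
```

In this toy setup the unweighted slope is driven mostly by the four small states, while the weighted slope sits close to the giants' slope. A gap like that is the kind of contrast worth flagging: it suggests that a single slope does not describe every observation, which is a misspecification signal rather than proof that either estimate is the right one.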

Who cares?  Early this year, I spent weeks reviewing research on ancestry and economic performance.  In this research, the three most populous countries on Earth - China, India, and the United States - are major outliers. 

This led me to wonder, "What would happen if we weighted the results by population?" but I felt the need to review the econometrics first.  Since it now looks like population-weighting is technically acceptable as well as intuitively plausible, I'm ready to share the results.  What's the bottom line?  Stay tuned.


COMMENTS (3 to date)
Nathan W writes:

If you're trying to evaluate the effectiveness of different institutional approaches, legal systems, that sort of stuff, it might make sense to treat each polity as a single data point. You might find a lot of smaller countries bunched into one corner, and then you can observe this and question whether appropriate institutional setups depend on size ... or any number of other conclusions. The point is that each institutional setting constitutes a data point for many types of analysis regardless of the size of the population or the economy.

But in matters where the global significance of the data point is important for extracting the relevance of the data, I think PPP-adjusted size of the economy makes sense. (While the perspective bothers me somewhat, from a realpolitik perspective, I'm not sure what methodological advantage there would be to correcting for population instead of economic clout.)

Doug C writes:

As regards your review of the research on ancestry and economic performance, you might find this interesting: http://andrewgelman.com/2015/12/19/a-replication-in-economics-does-genetic-distance-to-the-us-predict-development/

Doug C writes:

I had some further thoughts on this.

Just as regions within countries are likely to have data that are very highly correlated, and so weighting by population would not be so advisable, so it is that some countries within the same region that share cultural traits are also going to have incomes that are highly correlated. Europe west of the former Iron Curtain, for example, is basically all rich. Each of these countries is not really like a random observation, however. You certainly don't have 15 independent observations. This is probably the main problem with cross-country regressions: you essentially have far fewer independent observations than it first appears. Regional fixed effects will help with this. This is why regional fixed effects tend to wreak all kinds of havoc on these types of cross-country regressions.
