Arnold Kling  

Macroeconometrics and Science


Menzie Chinn writes,

the forecasts are generated using old-fashioned models in the spirit of the neoclassical synthesis (demand determined in short run, supply determined in the long run) with (as I understand it) backwards looking expectations rather than model-consistent expectations. I leave it to the readers whether these characteristics are the biggest sins of macro modelers in the run up to the latest crisis, and the ensuing Great Recession. (After all, one could reasonably argue that assuming perfect capital markets, or a unitary bond market, might be more problematic assumptions than adaptive expectations.)

He is discussing the macroeconometric models. These are models with hundreds of equations that are simulated with and without a fiscal stimulus in order to measure the effect of the stimulus. I want to explain why the models lack scientific merit.

Some of this criticism was spelled out in historical perspective here. I am currently revising that paper.

Chinn is correct that the economics profession turned away from macroeconometric models for reasons having to do with their theoretical specification, in particular their inconsistency with rational expectations. If that were their only problem, I would still be a model jockey, as I was early in my professional career.

Instead, my criticism of macroeconometric models is that the degrees of freedom belong to the modeler, not to the data. In Bayesian terms, the weight of the modeler's priors is very, very high, and the weight of the data is close to zero. The data are essentially there just to calibrate the model to the modeler's priors.
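To make that Bayesian point concrete, here is a minimal sketch using a textbook conjugate-normal update; the numbers (and the wink at "1.6 million") are invented for illustration. When the modeler's prior precision dwarfs the precision of the data, the "estimate" that comes out is essentially the prior that went in.

```python
import numpy as np

def posterior_mean(prior_mean, prior_prec, data, noise_prec=1.0):
    """Conjugate normal update: the posterior mean is a precision-weighted
    average of the prior mean and the sample mean."""
    w_prior = prior_prec
    w_data = len(data) * noise_prec
    return (w_prior * prior_mean + w_data * np.mean(data)) / (w_prior + w_data)

data = [0.2, 0.1, 0.3, 0.0, 0.4]   # sample mean 0.2: "the data say 0.2"

# Modeler with a weak prior: the data dominate the answer.
print(posterior_mean(prior_mean=1.6, prior_prec=0.01, data=data))   # ~0.2

# Modeler with an overwhelming prior: the data are decoration.
print(posterior_mean(prior_mean=1.6, prior_prec=1e6, data=data))    # ~1.6
```

The second call is the situation described above: the data "calibrate" nothing; the output is the prior.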

In nontechnical terms, this issue can be stated as follows. Consider two ways of getting a computer to print out that the stimulus created 1.6 million jobs. Method one is to set up an elaborate computer simulation that produces such a result. Method two is to type "the stimulus created 1.6 million jobs" into a word processor and hit the print key. The only difference between those two ways is the amount of computer processing time involved.

The scientific method is based on controlled experiments. In a controlled experiment, the experimenter creates an environment in which one factor changes and all other factors are held constant.

Economists cannot construct controlled experiments to test all of our interesting hypotheses. We have abundant data, but we did not create the circumstances that produced the data. In statistical jargon, we are making observational studies.

An observational study can be of scientific use if the conditions are right. One condition is that there are many observations relative to the number of factors that must be controlled for. In statistical jargon, this is known as the degrees of freedom.

In macroeconomics, there are more factors to be controlled for than there are observations. There are negative degrees of freedom, which should cause your statistical software to give you an error message.

Instead, the modeler limits the way that factors enter the model. For example, the modeler probably will not control for changes in the educational attainment of the labor force over time. That is not because the educational attainment over time does not matter. It is because the modeler does not want to put in so many factors that the computer spits out an error message.

There are thousands of ways to specify the "consumption function," which is the equation that predicts consumer spending. Should durable goods spending be separated from spending on nondurable goods and services? Should previous periods' income be used in addition to current income, and with what weight? Should a measure of anticipated future income be used? How should wealth enter the equation? Is there a way to account for the role of credit market conditions? How do tax considerations enter? Are there different propensities to consume out of wage income and out of transfer payments? How do consumers respond to changes in oil prices? How do they form expectations for oil prices in the future? What factors that are trending over time, such as population changes and shifts in the mix of consumption, need to be controlled for? Which time periods are affected by special factors, such as the recent snowstorms along the east coast?

If you have about 80 quarters of data to work with, and you have thousands of factors to control for, there is no conceivable way for the model's specification to reflect the data. Instead, the specification depends on the opinion of the modeler.
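A toy version of this degrees-of-freedom problem, with invented dimensions and pure noise: when the candidate factors outnumber the observations, least squares will fit the sample perfectly and predict nothing.

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 80, 200                     # 80 "quarters", 200 candidate "factors"
X = rng.normal(size=(n, k))        # factors that are pure noise
y = rng.normal(size=n)             # an outcome with no relation to X

# With more coefficients than observations, the minimum-norm least-squares
# solution reproduces the sample exactly.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
in_sample_err = np.max(np.abs(y - X @ beta))
print(in_sample_err)               # effectively zero: a "perfect" model

# On fresh data, the same "model" explains nothing.
X_new = rng.normal(size=(n, k))
y_new = rng.normal(size=n)
oos_r2 = 1 - np.sum((y_new - X_new @ beta) ** 2) / np.sum((y_new - y_new.mean()) ** 2)
print(oos_r2)                      # zero or negative out-of-sample R-squared
```

With thousands of specifications to choose from, the modeler can always find one that fits history; the fit itself carries no evidence.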

The conditions under which statistical techniques are scientifically valid are not satisfied with macroeconomic data. There is no reason to take model results as reflecting anything other than the opinion of the modeler.

What if the models performed well in out-of-sample forecasts? If that were the case, then I would have to concede that there might be some scientific validity to the models. However, that has never been the case. When I was a model jockey, the models were forever being tweaked with what were called "add factors" or "constant adjustments" in order to keep them on track with the most recent data. Formal studies of out-of-sample forecasts, by Stephen McNees of the Boston Fed and others, showed dismal performance. Even today, the models that are telling us how many jobs the stimulus saved are the same models that predicted that unemployment today would be close to 7 percent with the stimulus, when in reality it is 9.7 percent. So out-of-sample performance fails to boost one's confidence in the scientific status of these models.

Macroeconometric models satisfy a deep need to create the illusion that government can exercise precise control over output and employment. As long as people are determined to believe that such control is possible, the models will have a constituency. For better or worse.


COMMENTS (24 to date)
Jeff Hallman writes:

We've known for a long time that the big macro models don't predict as well out of sample as atheoretic Bayesian VARs, and for many series they don't even predict as well as univariate time-series models of the Box-Jenkins type.

A serious approach to macro modeling would start with the best-predicting atheoretic statistical model (e.g., a BVAR) and then ask, "What restrictions does my macro theory place on the coefficients of that model?" You then impose those restrictions and reestimate. If the out-of-sample forecasts of the theory-restricted model are no better than the forecasts from the atheoretic model, your theory is essentially worthless. Such is the state of most macro today.
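A stripped-down sketch of the procedure Hallman describes, using a plain VAR(1) on simulated data rather than a full BVAR; the data-generating process and the exclusion restriction standing in for "theory" are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated two-variable system standing in for, say, output and inflation.
T = 120
A_true = np.array([[0.6, 0.2],
                   [0.0, 0.5]])    # in truth, variable 1 does not drive variable 2
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + rng.normal(scale=0.5, size=2)

train, hold = Y[:100], Y[100:]     # estimation sample vs. out-of-sample holdout

def fit_var1(Y, allowed=None):
    """Equation-by-equation OLS for Y_t = A Y_{t-1} + e.
    `allowed[i]` lists the lagged variables permitted in equation i,
    a crude stand-in for theory-imposed exclusion restrictions."""
    k = Y.shape[1]
    X, Z = Y[:-1], Y[1:]
    A = np.zeros((k, k))
    for i in range(k):
        cols = allowed[i] if allowed is not None else list(range(k))
        A[i, cols] = np.linalg.lstsq(X[:, cols], Z[:, i], rcond=None)[0]
    return A

def oos_rmse(A, Y):
    """Root-mean-square error of one-step forecasts on held-out data."""
    return np.sqrt(np.mean((Y[1:] - Y[:-1] @ A.T) ** 2))

A_free = fit_var1(train)                               # atheoretic model
A_restricted = fit_var1(train, allowed=[[0, 1], [1]])  # "theory": y1 does not drive y2

print(oos_rmse(A_free, hold), oos_rmse(A_restricted, hold))
```

The test is the comparison on the last line: if the restricted model forecasts no better than the unrestricted one, the restrictions (the "theory") added nothing.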

Macro models are story-telling devices, nothing more.

rhhardin writes:

It's a curve fitter with economic variable names, akin to climate models with thermodynamic variable names.

They always match the data, and have zero predictive capability.

You can formalize the curve fitting equivalence with Kalman filtering; whatever parameters you put in, you can solve for their values using the data.
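A minimal sketch of that equivalence, on simulated series with invented noise settings: a Kalman filter with random-walk coefficients and generous state noise will track essentially any target series in sample, whatever the "explanatory" variables are named.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary data: the economic names attached to these series are irrelevant.
T = 80
y = np.cumsum(rng.normal(size=T))                           # "target" series
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])  # intercept + 2 arbitrary "factors"

# Kalman filter for y_t = x_t' beta_t + eps_t, beta_t = beta_{t-1} + eta_t.
# Large state noise Q and small observation noise R let the coefficients
# chase the data point by point.
k = X.shape[1]
beta = np.zeros(k)        # coefficient estimate (the filter's state)
P = 10.0 * np.eye(k)      # state covariance
Q = 1.0 * np.eye(k)       # coefficient ("state") noise: large => very flexible
R = 0.01                  # observation noise: small => trust each data point

fitted = np.empty(T)
for t in range(T):
    P = P + Q                             # predict: random-walk coefficients
    x = X[t]
    S = x @ P @ x + R                     # innovation variance
    K = P @ x / S                         # Kalman gain
    beta = beta + K * (y[t] - x @ beta)   # update toward the new observation
    P = P - np.outer(K, x @ P)
    fitted[t] = x @ beta                  # in-sample "fit" after seeing y_t

r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 3))                       # near 1 regardless of what X "means"
```

The near-perfect in-sample R-squared is a property of the filter's flexibility, not of any relationship between the series.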

david writes:

This is an argument for agnosticism, not markets-fail-but-government-fails-worse. That would obviously require a more precise characterization of markets and governments. A socialist could take exactly the same position and end with the charge that "[macroeconometric] models satisfy a deep need to"... think that markets work at all. Better the certainty of the command economy, aye?*

Besides that. There are other ways to introduce restrictions - obsessing over achieving actually-true microfoundations, for instance. That gives you a lot more data to fit. The drawback is that the model becomes rapidly more complicated. This may be something that economists are just going to have to accept, short of some miraculous advance in mathematically describing complex systems.

* this is sarcasm

This analysis of the problems with current modeling seems spot on.

But the analysis doesn't suggest that all modeling is useless. I think it can be read as an argument for simpler models. There must be some minimal macro variables that can be built up from micro observations, at least over short periods in limited markets (geographically or otherwise).

It's worth asking: "What are the simplest mathematical models that can be used to produce some of the macro phenomena we observe?"

A big problem is that we still don't really understand price theory at the micro level.

david writes:

Haha, I like that a comment with an exactly opposing stance was posted right after mine ;)

I doubt simpler modeling will get anywhere; "old" Keynesian hydraulic theory is pretty simple, and it worked for a while, but everyone thinks it useless because of stagflation. So it doesn't matter if it applied over short periods earlier; it must apply everywhere and all the time.

Charlie Deist writes:

How accurately can we model political demand for macroeconomic snake oil?

Ben Lambert writes:

Two points:

1) Posts like this make me glad I switched from econ to engineering as an undergrad, for lots of reasons. I still love econ, but not as much as I love being able to prove/disprove the effectiveness of my models.

2) I think the link to the essay in progress is broken.

Milton Recht writes:

There is also the "Lucas Critique". Lucas said that economic models derived from historical data could not be used to recommend effective changes to government policies, because the past relationships in the data are dependent on the effects of policies in place at the time of the data. Predictions based on past data will miss the effects of new policies and produce incorrect predictions unless the relationships in the model are calibrated for the new policy. One needs to build a model where the internal relationships (coefficients) vary depending upon policy recommendations. That requires an understanding (or at least an assumption) of how policy affects economic outcomes.

It can become a tautology: the model predicts what it is set to predict. If a model is calibrated to predict that policies A, B and C will move the economy to long term trend, then a recommendation to use policies A, B and C will show in the model that the economy is moving towards its long term trend. The model becomes the basis for a recommendation that was assumed as part of the model in building the model.

Additionally, when models are not validated against out-of-sample data, there is the problem of data mining and spurious results. When 100 economists each run 100 different, independent models on historical data looking for economic and theoretical explanatory relationships, then even at a 90 percent statistical significance level, roughly 1,000 (100 x 100 x 0.1) models will meet the acceptance criteria by chance.

Out-of-sample tests would drastically reduce the number of acceptable models from 1,000 to a much smaller number. Those that survive may again be spurious: at the 90 percent significance level, roughly 100 of the 1,000 will survive and look meaningful. These 100 models exist by luck, without any need for their internal relationships to have economic meaning. They can even be inconsistent with each other and with known economic theories.
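Recht's arithmetic is easy to check by simulation. A minimal sketch, using pure-noise regressions and a normal approximation to the t-test (sample sizes and seeds are invented):

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(3)

def slope_pvalues(nobs, nmodels, rng):
    """Two-sided p-values for the slope in `nmodels` independent pure-noise
    regressions of y on x (normal approximation to the t-test)."""
    x = rng.normal(size=(nobs, nmodels))
    y = rng.normal(size=(nobs, nmodels))
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    r = (x * y).sum(axis=0) / np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum(axis=0))
    t = np.abs(r) * np.sqrt(nobs - 2) / np.sqrt(1 - r ** 2)
    return np.vectorize(erfc)(t / sqrt(2))

M = 100 * 100                          # 100 economists x 100 specifications
p_in = slope_pvalues(100, M, rng)
survivors = int((p_in < 0.10).sum())
print(survivors)                       # ~1,000 "significant" models from pure noise

# Re-test each surviving model on fresh (out-of-sample) data:
# ~10% survive again purely by luck.
p_out = slope_pvalues(100, survivors, rng)
final = int((p_out < 0.10).sum())
print(final)                           # ~100 doubly lucky models
```

Even the doubly lucky survivors contain no economic information; they are the roulette players still on a streak.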

Economists then derive explanations to match the model results instead of vice versa. Data mining and spurious models can lead to inconsistent policy recommendations among economists.

It is like a gambler's hot streak at the roulette table, where the bettor develops superstitions about why he is winning, such as color of his shirt, etc. Economic policies based on the surviving models are equivalent to a gambler's idiosyncratic behavior that he thinks changes the odds at the roulette table and enables him to win. Each gambler has a different reason for the winning streak. Like different schools of economists.

The scientific method is based on the concepts of hypothesis testing against data and reproducible results. Model building is the reverse: the data are used to derive the hypothesis, and it is not tested against new (out-of-sample) data to see if it is reproducible.

Both the Ptolemaic (Earth-centered) and Copernican (Sun-centered) views of planetary motion were internally consistent with the data on planetary motion known at the time. Both were accurate in predicting future planetary positions (in fact, Ptolemy's method, with everything revolving around the Earth, was initially more accurate than the Copernican method).

However, it is extremely unlikely that Sir Isaac Newton could have developed his theory of gravity under the Ptolemaic system. Newton's gravity requires revolution around the larger-mass body, the Sun, not the Earth. While an equivalent gravitational system could probably have been built mathematically in a Ptolemaic system, it would not be as simple to comprehend or visualize as the Newtonian system.

Expectation theory is in many ways equivalent to Copernican theory, but that is another long and controversial discussion. Suffice it to say, many economic models are inconsistent with expectation theory.

ThomasL writes:

david wrote:

"This is an argument for agnosticism, not markets-fail-but-government-fails-worse."

Yep, that is the whole point. The models aren't arguments for anything, and pointing out that the models aren't reliable isn't an argument for anything.

Both schools of thought start off from a biased preference. One view is that markets are usually the best answer. The other is that government is usually the best answer.

The fundamental difference enters when one camp--usually the pro-government side, but it needn't necessarily be--adds the "evidence" of their model to the argument. "Seeeeee, this model proves that the {government|market} solution is the correct one."

In reality, since the model is governed more by the _assumptions_ used to create its structure than by the data that is input, that model has proved nothing more than the modeler's ability to create a model that supports his bias.

Boonton writes:

I agree with you on the potential problems with trying to make a real forecast. I suspect, though, that the models are a bit better at comparing policies: Policy A says it will produce 1M jobs; Policy B says 3M. Policy B is probably better, although you're never going to be able to point to 3M specific jobs and say, "there, those are the exact paychecks produced by this policy!"

JPIrving writes:

The terrible thing is that even before all the stimulus talk, there were plenty of federal and state programs that allocated funding and tax breaks based on retail "input-output" models like those from REDYN. I saw it first hand in an internship; it is backdoor central planning.

How do we stop this? It seems to me we should be able to convince people that these models are just complicated toys.

Craig Howard writes:

Why do climate scientists and econometricians insist on putting so much faith in computer programs that attempt to predict human behavior?

If models worked, we'd know who would win the Super Bowl and could tell NBC how to pump up its ratings, too. Come to think of it, elections would be a thing of the past -- we'd already know who won.

It's supremely arrogant (or ignorant) to believe that economics can be reduced to mathematics.

Greg Ransom writes:

I guess Milton has never heard of Darwinian biology and a dozen other sciences which work _nothing_ like this:

"The scientific method is based on the concepts of hypothesis testing against data and reproducible results."

Greg Ransom writes:

This is Popper, not science:

"The scientific method is based on the concepts of hypothesis testing against data and reproducible results"

Michael M writes:

Popper IS science. Anyone who pretends differently is a charlatan.

Jim Vernon writes:

Thanks for your Bayesian explanation. I'm weak -- and quite rusty -- in that area, but your insight helps.

Mark JS writes:

Umm, this has to be wrong. After all, models can predict the risk of swaps, and asset-backed debt obligations, and the risk of a financial... there is no systemic risk from the interrelationships created by derivatives... what's that... oh... never mind.

Actually, this is perfectly consistent with the first rule of empirical analysis I used to teach my students: The more manipulations one performs within a model or on a set of data, the more likely one is to find exactly what one is looking for.

George X writes:

Greg Ransom wrote:

I guess Milton has never heard of Darwinian biology and a dozen other sciences which work _nothing_ like this:

"The scientific method is based on the concepts of hypothesis testing against data and reproducible results."

This is Popper, not science...

...which led Michael M to write:

Popper IS science. Anyone who pretends differently is a charlatan.

Yeah, uh, no. Popper was a smart guy, an interesting guy, a good philosopher of science, and a friend of freedom, but (contra many practicing scientists) his views of science weren't perfect. For one thing, strict falsificationism has a lot of the same problems that verificationism has; the Bayesians have a way out by relaxing the requirement from "demonstrate to be false" to "provide major revision to prior probabilities".

Greg's not entirely right, either: plenty of sciences work something like this, even if you can't run experiments in, say, geology or astrophysics. To use his example, it was a big shot in the arm for evolutionary biologists to be able to sequence genes and measure rates of drift and diffusion — rates which different evolutionary theories predict different values for. Sorry if this leads people to dismiss all of economics as non-science; my impression is that some of it is, but large swaths aren't.


Mark JS writes:


Economics is a discipline, not a science -- it is really a perspective on the world that says: "incentives matter." You are basically right: however you look at it, economics is not a science. In search of "sciencehood," a number of economists now call economics a set of tools, by which they mean, basically, the ability to use statistics and mathematics. Theoretical consistency and the ability to "specify a model" -- by which they mean create an equation -- are all that matters. Some folks would call that science. But if it bears only a casual relation to the outside world, it's hard to call it science (although the string theorists seem to be doing pretty well, I guess).

As to your view of Popper, I think you are just wrong. Popper was an objectivist, even though he realized perfect objectivity was impossible. Your point that his views are not science (coming from scientists) reflects the fact that most scientists accepted a Kuhnian view of science many years ago -- that science is a human activity with its rules defined socially by scientists, so whatever they say is "science" is science. That view is obviously correct from a sociological standpoint, but it is no less tautological or self-referential than objectivism. You are right that Popper was a philosopher, but his view of science remains influential and, in my view, better than the alternatives as a goal.

By the by, Popper would view the idea that the social sciences (including economics for the most part) are sciences as poppycock. Everyone who thinks they are should read The Poverty of Historicism and The Open Society and Its Enemies.

Rajan Lukose writes:

Thank you for a well-written post.

Every time this issue comes up (the efficacy of macro models), the following question pops into my head: Suppose we had *fantastic* macro models with nearly zero out of sample forecast errors...Is that sustainable or even an equilibrium?

For example, would such a scenario be consistent with an efficient markets hypothesis for asset prices?

Greg writes:

There are some good economic models out there. Ones that do not try to imply too many things but simply describe reality. A good example is the circuitist model of the macroeconomy which is used heavily by the MMT/Chartalist crowd.

Study their models and I think you'll find them explanatory.

They start with state-created money. They also have models for endogenous, bank-created money.

Next the relationship between central bank and treasury is explained on an operational level.

The purpose and effects of govt debt creation are also explained without any appeal to ideology. Just the facts.

How taxation functions, NOT as a source of govt funding but as an inflation regulation mechanism, is brought in next.

Understanding the idea that loans CREATE deposits allows you to dismiss notions of money multipliers and other fantasies of so many standard macroeconomic models.

Fractional reserve lending is then exposed as a complete myth when you realize that reserve positions are sought AFTER loans have been made.

Knowing that a loan is a DEMAND driven function helps you to dismiss doctrines like the loanable funds or crowding out.

Understanding that there are no "market" forces, only political forces, which determine the central bank rate you realize that a 0% interest policy is applicable ad infinitum. There are no economic reasons to raise central bank rates only ideological ones.

Basing your models on a floating exchange rate currency, like the one we ACTUALLY operate with, and forgetting old gold-standard talk, helps to make SOME macroeconomic models more coherent than others. Are any perfect? That goes without saying, but clearly, some have the advantage of starting by explaining things as they ACTUALLY ARE.

Bob Layson writes:

It is more important to have an economy flexible enough to respond, without state assistance or direction, to changes in preferences and expectations than it is for economists to have a model good enough to predict future results of present preferences and expectations.

Economics should not be the intelligence arm of the welfare/warfare state striving to advise the Generalissimos Monetary and Fiscal.

Strat writes:

A model is a tool, not an oracle. That's why the geeks and suits should never be completely separated. As AK says, the model tells you what the effects would be under the modeler's assumptions. But, the difference between using a model and using a word processor is that the model makes those assumptions explicit. Making assumptions explicit moves the ball downfield because it allows us to understand why we disagree, and to understand the implications of those disagreements in detail.

I agree with Jeff Hallman that macro models are story-telling devices, nothing more. But they are nothing less, too. Telling stories can be useful. What else can we economists really aspire to, beyond telling coherent and precise stories about the world?

I believe that the market rewards models that tell useful stories. That's why anyone who believes in the efficacy of markets will give some credibility to the results of commercial models such as IHS, Moody's, or Macroeconomic Advisors when they tell us that the stimulus has had positive effects. It's kind of bizarre for free-market types to treat commercially successful models with disdain. And it's intellectually dishonest to treat them with disdain because they give answers you don't want to hear.

I would also point out that it's wrong to equate poor predictions with poor modeling of policy effects. It's easier to create a model that is successful at showing the effect of policy changes than one that is successful at predicting the future.
