I try to use the 2000 election in Florida and the question of Iraq's weapons of mass destruction to illustrate statistical concepts.

In statistics, a parameter (not to be confused, as it often is by laymen, with perimeter) is an unknown quantity. In this case, the unknown quantity might be described as "the proportion of voters in Florida who intended to vote for Gore when they cast their ballots." This parameter is unknown for a variety of reasons. Partially-punched ballots made voters' intentions less than clear. Moreover, even if there had been no "hanging chad" issues with the vote counting, the "butterfly ballot" raised the question of whether voters intending to vote for Gore wound up accidentally voting for Buchanan.

My article also describes Type I and Type II error.

On another statistical topic, Stephen T. Ziliak and Deirdre N. McCloskey contrast material significance with statistical significance.

a merely statistical significance cannot substitute for the judgment of a scientist and her community about the largeness or smallness of a coefficient by standards of scientific or policy oomph...
Of the 137 relevant papers [published in the *American Economic Review*] in the 1990s, 82% mistook statistically significant coefficients for economically significant coefficients.

When I teach Advanced Placement statistics in high school, I like to give an exam question in which students are asked "as a statistician" to recommend a diet pill to a relative. One pill reduces weight on average by 20 pounds with a standard deviation of 10 pounds, and the other pill reduces weight on average by 4 pounds with a standard deviation of 1 pound. The second pill achieves results that are more significant statistically. However, the first pill achieves results that are more significant materially.
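As a rough sketch of why the second pill looks better statistically, one can compare one-sample t statistics. The sample size below (n = 25) is my own hypothetical assumption; the example does not specify one.

```python
import math

def t_statistic(mean_loss, sd, n):
    """One-sample t statistic for average weight loss against a null of zero."""
    return mean_loss / (sd / math.sqrt(n))

n = 25  # hypothetical sample size for both trials; not given in the example
pill_1 = t_statistic(20, 10, n)  # big effect, noisy
pill_2 = t_statistic(4, 1, n)    # small effect, precise

print(f"Pill 1: t = {pill_1:.0f}, average loss 20 lb")  # t = 10
print(f"Pill 2: t = {pill_2:.0f}, average loss 4 lb")   # t = 20
```

The second pill's larger t statistic is what "more significant statistically" means here, while the first pill's 20-pound average loss is what makes it more significant materially.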

*For Discussion*. In the diet pill example, how high would the standard deviation have to be on the first pill in order to lead you to recommend the second pill?

Arnold,

I personally advocate simple dieting as the most basic preference, because drug-induced weight loss brings weight gain upon cessation of the drug. Now to answer your question: the second pill must compete within the same weight-loss range as the first pill. This requires taking the second pill 2.5 times as long to achieve the statistical results of the first pill. In neither case was a time frame stipulated, nor was it stipulated how much weight was to be lost.

The statistical significance will not translate to medical significance until these two factors are known. The constraint of time is the primary factor, with total weight loss secondary. The debate is apples and oranges without the constraints. lgl

Hmmm..

Your post on probability seems to have missed that it's not enough to compare worst possible outcomes; you also need to look at the probability of each outcome.

Arnold,

May I differ with your definition? My understanding is that a parameter describes some aspect of a population, while a statistic describes some aspect of a sample. Though parameters are often unknown, that is not what makes them parameters. We often use statistics to try to estimate unknown parameters, but some parameters are known. For example, suppose we want to know how many people in Idaho own computers. We survey 1000 people and find that 680, or 68%, own computers. (I made that number up.) If we know the population of Idaho - a parameter - we can multiply it by .68, a statistic, to estimate how many computer owners there are in the state - another parameter.
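The Idaho arithmetic can be written out directly. The population figure below is purely illustrative (the commenter made his number up, and so have I):

```python
idaho_population = 1_300_000  # a known parameter (illustrative figure, not census data)
sample_size = 1000
owners_in_sample = 680

p_hat = owners_in_sample / sample_size              # a statistic: 0.68
estimated_owners = round(idaho_population * p_hat)  # estimate of an unknown parameter

print(p_hat, estimated_owners)  # 0.68 884000
```

The division of roles is the point: `p_hat` is computed from the sample (a statistic), the state population is a known parameter, and the number of computer owners statewide is the unknown parameter being estimated.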

My view on parameters vs. statistics is that if we knew what the parameters were, we would not need to take statistical estimates. The whole point of doing statistics is to try to estimate unknown parameters.

In class, it is very important to emphasize the difference between what is known and what is unknown. If students confuse the two, they will never understand what a confidence interval really means or what a hypothesis test really does.
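One way to make the known/unknown distinction concrete is a coverage simulation: the simulator knows the true mean, each simulated study does not, and roughly 95% of the resulting intervals capture it. A minimal sketch, with all numbers illustrative and the population standard deviation treated as known for simplicity:

```python
import random

def coverage(true_mean=50.0, sd=10.0, n=100, trials=2000, z=1.96):
    """Fraction of nominal 95% confidence intervals that capture the true mean."""
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(true_mean, sd) for _ in range(n)]
        xbar = sum(sample) / n           # the statistic (known to the "student")
        half_width = z * sd / n ** 0.5   # known-sigma z interval
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials

random.seed(0)
print(coverage())  # close to 0.95
```

Each simulated interval is built only from the sample, yet the long-run capture rate matches the nominal 95% — which is what a confidence interval actually promises.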

"My view on parameters vs. statistics is that if we knew what the parameters were, we would not need to take statistical estimates. "

Of course that's right.

But I still think being more careful about terminology would be a good idea. In my opinion one of the things that makes statistics a difficult subject is that it's often taught in a sort of cookbook fashion - lots of recipes, but no broad picture of what's going on. For me, at least, being able to fit things into an overall structure helps a lot.

>>In my opinion one of the things that makes statistics a difficult subject is that it's often taught in a sort of cookbook fashion - lots of recipes, but no broad picture of what's going on.

That is my experience as well (in the field of statistical process control). I wonder how many people really understand the underlying theory of statistics, vs. just mechanically going through the motions and, perhaps, making incorrect assumptions and bad conclusions.

It is interesting how Arnold has set up the test. The null hypothesis is that Iraq has weapons of mass destruction (WMD), and the alternative hypothesis then is that Iraq does not have WMD. In a criminal context, this would say the defendant is guilty until proven innocent. To my mind, he should have switched the null and the alternative hypotheses.

With the null hypothesis being what it is, data is gathered in the form of observations on the existence or non-existence of WMD. In this context, the accused is guilty until proven innocent. The defendant (Iraq) must present enough evidence in contradiction of the null hypothesis. That means that Iraq has to present evidence that the WMD do not exist. As you might see, I think it would be difficult for Iraq to prove that something does not exist.

In this case, a Type I error is that the guilty go free: Iraq has WMD but there is no invasion. A Type II error is that the innocent is convicted: Iraq does not have WMD but is invaded.

Given the null hypothesis, Iraq has a difficult time contradicting it, since that requires proving WMD do not exist. Guilt would be found, assuming Iraq couldn't prove WMD do not exist, and Iraq would be invaded.

The test is set up by Arnold to assure an invasion, and a Type II error if no WMD exist.
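The commenter's framing can be laid out as the standard two-by-two table of test outcomes (the courtroom labels follow his analogy):

```python
# H0: Iraq has WMD; Ha: Iraq does not.
# "Reject H0" means concluding there are no WMD, hence no invasion.
outcomes = {
    ("has WMD", "reject H0"):      "Type I error: guilty set free (WMD exist, no invasion)",
    ("has WMD", "fail to reject"): "correct decision: WMD exist, invasion",
    ("no WMD",  "reject H0"):      "correct decision: no WMD, no invasion",
    ("no WMD",  "fail to reject"): "Type II error: innocent convicted (no WMD, but invaded)",
}

for (truth, decision), verdict in outcomes.items():
    print(f"{truth:8} | {decision:14} | {verdict}")
```

Because the burden of proof falls on rejecting H0, a defendant who cannot produce evidence stays in the "fail to reject" column - which, when the null is false, is exactly the Type II cell.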

>>As you might see, I think it would be difficult for Iraq to prove that something does not exist.

That is not exactly what was being asked of Iraq. What was being asked of Iraq is what Libya is now doing in reaction to the Iraq invasion. Come clean.

Of course, it is a little early to say that the Libyan experience is a success. But it looks promising.

Also, I think the Kurds and the Iranians would take issue with the statement that Iraq never had WMDs. Saddam had used poison gas in the past.

WMDs are so dangerous, and dictators like Saddam are so reckless and unreliable, that assuming guilt until innocence is proven is the more rational choice.

There is a deeper methodological issue here: Is it appropriate for economists to test predictions (using statistical methods) in order to confirm theories that seek to explain? This issue concerns both Friedman's fifty year old proposition that economists ought to limit themselves to theories that predict and Arrow's forty year old observation that if economists allow learning through experience the marginalist world-view falls apart.

Decision science tells us that we use predictions to help us evaluate alternatives. In decision tree terms, we use predictions to assign probabilities to uncertain events. In contrast, we use explanations to help us formulate alternatives. In decision tree terms, we use explanations to help us draw the tree. Better predictions help us become more efficient and better explanations help us become more effective.
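A toy decision tree makes this division of labor concrete (all alternatives and numbers below are hypothetical): predictions supply the probabilities at the chance nodes, while an explanation is what told us which alternatives belong in the tree at all.

```python
# Each alternative is a chance node: a list of (probability, payoff) branches.
# Predictions assign the probabilities; explanations determine which
# alternatives appear in the tree in the first place.
alternatives = {
    "launch product": [(0.6, 100), (0.4, -50)],
    "do nothing":     [(1.0, 0)],
}

def expected_value(branches):
    """Probability-weighted payoff of a chance node."""
    return sum(p * payoff for p, payoff in branches)

best = max(alternatives, key=lambda a: expected_value(alternatives[a]))
print(best, expected_value(alternatives[best]))
```

Better predictions sharpen the probabilities (efficiency); a better explanation might add a third branch nobody had thought to draw (effectiveness).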

The rub is that we ought to base models that explain not on what we expect other economic agents to do but rather on what we would do if we had complete knowledge of what we ought to seek: We ought to predict based on how we expect others to act. We ought to explain based on how we would act if we had greater knowledge. How can we know what we don’t know? We can learn by doing.

Hayek understood the dangers of Friedman’s solution to the problems raised by the evolutionary nature of economies. In his intellectual autobiography, *Hayek on Hayek*, he stated: “one of the things I most regret is not having returned to a criticism of Keynes’s treatise, but it is as much true of not having criticized Milton Friedman’s *[Essays in] Positive Economics*, which in a way is quite as dangerous a book.”

Lakatos provides a way in his idea of a scientific research program. Whether we recognize it or not, we humans are compelled by our internal programming to learn to live good lives. We are compelled to be both researchers and research subjects in an ongoing human program to discover and test knowledge useful in living good lives. Until all of us act wisely, managing this program well requires distinguishing between theories designed to help us predict, which help us evaluate alternatives, and theories designed to help us explain, which help us formulate alternatives. It’s just good common sense.

For more on this line of thinking, visit http://www.recursionist.org.

PS:

My comments address the issues raised by the Ziliak / McCloskey paper.

What I stated in my opening paragraph was challenging enough without a missing word and phrase. The paragraph should read:

There is a deeper methodological issue here: Is it appropriate for economists to test predictions (using statistical methods) in order to confirm theories that seek to explain? This issue concerns both Friedman's fifty year old proposition that economists ought to limit themselves to theories that predict and Arrow's forty year old observation that if economists allow learning through experience the marginalist world-view falls apart.