Bryan Caplan  

Crime, Education, and the NLSY: The Role of Ability Bias

I doubt The Case Against Education will spend more than two pages on the effect of education on crime.  But I've already spent a month getting ready to write those two pages.  Why so long?  Because (a) so much has been written on the topic, yet (b) researchers rarely report the precise numbers I want, so I'm also supplementing some of their statistical work to get a better handle on what's going on.

What am I looking for?  Estimates of the effect of education on crime that take both ability bias and signaling seriously.  So first, I want to see regressions of criminality on education and a wide variety of control variables - cognitive, attitudinal, behavioral, and social.  Second, I want to see "sheepskin" regressions that estimate criminality as a function of both discrete credentials and continuous educational attainment measures.
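To make the two targets concrete, here is one way the specifications might be written down. The notation is an assumption for illustration, not something taken from the NLSY documentation or from any particular paper: $C_i$ is a criminality measure, $S_i$ is years of schooling, $X_i$ collects the cognitive, attitudinal, behavioral, and social controls, and $D_{ij}$ is a dummy for holding credential $j$ (say, a high school diploma or a bachelor's degree).

```latex
\begin{align}
C_i &= \alpha + \beta S_i + X_i'\gamma + \varepsilon_i \\
C_i &= \alpha + \beta S_i + \sum_j \delta_j D_{ij} + X_i'\gamma + \varepsilon_i
\end{align}
```

In the first equation, ability bias is whatever part of the raw education coefficient disappears once $X_i$ is included. In the second, the sheepskin effects are the $\delta_j$: any change in criminality tied to the credential itself, over and above the years spent earning it.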

Let me share some of the main ability bias results I've been extracting from the NLSY.  Whenever this famous, long-running data set re-interviews respondents - typically every one or two years - it notes their current place of residence.  One of those residential options is "Jail."  If you regress the total number of times the respondent was interviewed in jail on his years of education, you get:


Naively interpreted, every extra year of education leads you to spend .07 fewer interviews in jail.  Adding in demographics, plus a measure of observations with missing residential information, makes little difference:


What about controlling for measured intelligence in the form of the AFQT?


So far, the effect of education on criminality looks pretty robust - two-thirds of the initial effect remains.  But what if we add a bunch of "non-cognitive" controls?  In particular, what if we adjust for scores on the Pearlin Mastery Scale, as well as suspensions from school, drinking, marijuana use, sex, and running away from home? 

Aside: For many of these variables, the NLSY measures not just what you did, but how early and/or how often you did it.  SUSPENDED is 1 if you were ever suspended; SUSPENDNUM is the number of times you were suspended.  SEXAGE is the age you first had sex; VIRGIN is whether you had never had sex at the time of the survey.  You get the idea.
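As a rough sketch of how indicator/intensity pairs like these can be built, assuming made-up raw fields (`suspension_count` and `age_first_sex` are hypothetical names, not actual NLSY variable names) and assuming VIRGIN is coded 1 for respondents who had not yet had sex:

```python
# Hypothetical raw survey answers; field names and values are invented for illustration.
respondents = [
    {"id": 1, "suspension_count": 0, "age_first_sex": None},
    {"id": 2, "suspension_count": 3, "age_first_sex": 15},
    {"id": 3, "suspension_count": 1, "age_first_sex": 19},
]


def build_controls(r):
    """Derive "did you ever" and "how much / how early" variables for one respondent."""
    return {
        "SUSPENDED": 1 if r["suspension_count"] > 0 else 0,  # ever suspended
        "SUSPENDNUM": r["suspension_count"],                 # number of suspensions
        "VIRGIN": 1 if r["age_first_sex"] is None else 0,    # assumed coding: 1 = never had sex
        "SEXAGE": r["age_first_sex"],                        # stays None when VIRGIN == 1
    }


controls = [build_controls(r) for r in respondents]
for row in controls:
    print(row)
```

Keeping both the dummy and the count (or age) lets a regression pick up a jump at "ever" separately from any dose-response pattern. How to handle SEXAGE for respondents with VIRGIN equal to 1 is a modeling choice that this sketch leaves open.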


Behold.  With a little statistical elbow grease, the estimated effect of education on incarceration falls by over two-thirds.  Critics of The Bell Curve will eagerly point out that, adjusting for everything else, measured intelligence (AFQTREV) is only a marginal issue.  But this doesn't mean that education is all-important, or that ability bias can be safely ignored.  Instead, there are a bunch of high-risk teen behaviors that simultaneously lead to educational failure and the slammer - most notably suspension, running away from home, and having sex.  If you're the kind of kid who defies adult expectations, whether you actually stay in school is much less important than it looks.
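The mechanism is easy to reproduce on synthetic data. The sketch below is not the NLSY and not the regressions reported here: every number in it is invented, the single `defiance` trait is a hypothetical stand-in for the whole set of behavioral controls, and the jail outcome is treated as continuous for simplicity. It only shows how one trait that lowers schooling and raises jail time inflates the naive education coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical trait that both cuts schooling short and raises jail time.
defiance = rng.normal(size=n)
education = 12 - 1.5 * defiance + rng.normal(size=n)

# Assumed "true" effect of one year of education on the jail outcome.
true_effect = -0.02
jail = 0.5 + true_effect * education + 0.10 * defiance + rng.normal(scale=0.5, size=n)


def ols(y, *regressors):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


naive = ols(jail, education)[1]                 # education only
controlled = ols(jail, education, defiance)[1]  # education plus the confounder

print(f"naive education coefficient:      {naive:.3f}")
print(f"controlled education coefficient: {controlled:.3f}")
```

With these made-up parameters the naive coefficient comes out several times larger in magnitude than the controlled one, even though the assumed true effect never changes. The sketch takes for granted that the trait is a genuine confounder; if a control is itself a consequence of schooling, adjusting for it answers a different question.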

Coming soon: Crime and the sheepskin effect in the NLSY

COMMENTS (10 to date)
LemmusLemmus writes:

Um, you're explaining time in jail by controlling for underage drinking and marijuana smoking? Congrats! You've just demonstrated that past crime predicts jailtime.

Gorgasal writes:

@LemmusLemmus: you got it wrong. Of course past crime predicts jail. The question here is whether education predicts jail after controlling for all these effects.

That said, I'm not too sure I trust all of these fancy regressions. (I'm a statistician, I have earned the right to be skeptical of statistics.) Too many of these variables influence each other. You can't really "control for" anything in a setting with this much confounding. Then again, you won't be able to actually run experiments for this kind of question, so regressions on survey data, together with a giant dose of healthy skepticism, are the best we can do.

Diz writes:

[Comment removed for supplying false email address. Email the to request restoring this comment. We'd be happy to publish your comment. Nevertheless, a valid email address is required to post comments on EconLog and EconTalk.--Econlib Ed.]

David R. Henderson writes:

Very nice work, Bryan.

Nathan W writes:

You can't use intelligence or ability as an explanatory factor if you don't know what causes the observed outcome, natural ability or cumulative summed interplays between opportunity, training, and probably only to a minor degree actual differences in "natural" abilities.

Nathan W writes:

Since opportunity is at least as relevant as ability in many outcomes, I hope you will also account for some measure of the opportunity effect. Otherwise you will conflate lack of opportunity with average stupidity in a group.

Consider the cases of 100 hoodlums, ten of whom had parents who read to their children daily up to the age of five using the Queen's English, and the remainder of whom were raised by slightly negligent single mothers who rarely read to their children and for similar historical reasons used English in a way more readily associated with Ebonics than the Queen's English.

If 10 of these 100 hoodlums break out of their quintile, my guess is that eight or nine of them would have been regularly read to as a child in the Queen's English (or equivalent).

I.e., I would read lack of education more so as an indicator of lack of opportunity and less so as an indicator of ability.

PJ writes:

I'm struggling to understand the economic significance of 0.07 versus 0.02. What is the sample average of the dependent variable? Does controlling for ability reduce the point estimate from "too good to be true" to "believable and economically significant"? Or did we go from "marginally relevant" to "too small to care"?

Can't wait to read this book!

gwern writes:

Indeed, Gorgasal. In particular, "Critics of The Bell Curve will eagerly point out that, adjusting for everything else, measured intelligence (AFQTREV) is only a marginal issue"

Well, yes, after you have indulged in overadjustment bias/'controlling for intermediate variables' multiple times (education, marijuana use, suspensions, self-control, hispanic/black, virginity - all of these are going to correlate with intelligence), it's unsurprising that the coefficients no longer have much at all to do with the underlying causal story!

Bostonian writes:

If I were employing a high school graduate to work in my store, I'd rather hire someone who had never been suspended in high school. But "ban the box" laws prohibit employers from asking even about criminal convictions on a job application. This encourages employers to locate in areas where the crime rate is low, as an indirect way of screening out employees with criminal records.

Will writes:

It seems pretty clear that when you threw the kitchen sink at it, you started controlling for intermediate variables, which means your regression coefficients aren't telling you anything about the causal story.
