A Whole Lot of (Very Expensive) Chopping; Chips? Not so Much

So seems to be the conclusion suggested by the charts produced by the Cato Institute here.

I find these charts tantalizing on several levels.  One thing I’d like to know is whether the SAT scores were normed to account for the test having been (famously) dumbed down some years ago.  If they have not been, would the trend lines be even as level as they are?  Would they not much more likely have a downward lurch to them?  I of course took the “old” test more years ago than I care to admit any more, but from what little I’ve seen, heard, and read of the “new” test, the raw scores are not equivalent.

Secondly, ought “scores” to be rising at all?  I mean, you can only teach a given person so much in the setting of a classroom.  The fact that you might be able to get Little Johnny over the hump if you exposed him to the full force of one-on-one or tiny-group tutoring really doesn’t tell you a whole lot about what you can reasonably expect from him in a classroom, from a teacher with a room full of other kids to attend to as well.  SAT scores are also far too susceptible to being gamed.  Back when I took the test, if you wanted a prep course and you didn’t live in a larger city, you were pretty much out of luck.  Nowadays, when an SAT prep course is only a few clicks away on the internet, and huge amounts of money and energy are spent reading the tests’ tea leaves, to say that SAT scores are flat may only be telling us that we’ve run up against the upper limit of what can be done with gamesmanship.  Subject to the self-selection bias noted in the article, every test cohort will have a statistical distribution of ability, and that distribution will be reflected in the test scores.

The national whatever-it-is scores are the ones which suggest the more important questions.  Why are the scores largely flat?  Are the scores flat because the complexity of the test has been adjusted upward?  Is this national level of aggregation even useful?  How about breaking the scores down by some other benchmarks: class size, median teacher salary (expressed relative to local median income), school size, gross population of the local school district, local median income as a proportion of national median income, number of post-bachelor’s hours of course work per teacher, classroom teachers as a percentage of total school system non-maintenance employment, amount of school funds spent on marquee sports (football, baseball, softball, basketball, soccer) as a proportion of overall school system spending?

I’d also like to figure out a way to put a reliable metric on some of the intangibles, like average minutes per week spent learning about global warming, or “diversity,” or “inclusiveness,” or “community service.”  It seems to this lay person that all the other nonsense teachers are being required to teach these days, instead of their field, might bear investigating.  I mean, what the hell does “diversity” have to do with chemistry?  Either a student can calculate the amount of heat that will be released in a particular reaction or he can’t.  Subject/verb agreement is not inclusive or exclusive, and in any event the solution of an integral has zero to do with whether this year’s arctic ice cap is changing faster or slower than the antarctic ice cap.  Picking up litter, or scrubbing graffiti out of housing project stairwells, or raking some little old lady’s front yard are all laudable tasks, but how do they teach a student to distinguish between correlation and causation?

What these charts do convey, however, is the negative conclusion that net marginal return on additional gross spending per pupil is a number soberingly close to zero.  Reasonable minds can and will differ as to why the two trend lines diverge so dramatically, but what we can’t dispute is that shovelling additional dollars willy-nilly into the system is simply not moving measurable outcomes.

On a related note, these charts do jibe with the results of a study that was done in Germany some years ago (I saw a report of it in the FAZ, but didn’t print it off).  What they found was that all the traditional nostrums, all the way from per-pupil spending to teacher salary to class size, had no measurable effect on student outcomes.  What they did end up recommending was delaying the triage of the German education system for a year or two, and keeping all three strata of students together longer.