Friday, September 25, 2009

Determining Current Level of Intellectual Functioning: The Courts Dropped the Ball (guest post by K. Foley)

One month ago the US 5th Circuit Court of Appeals rendered a psychometrically puzzling (and troubling) decision in an Atkins mental retardation death penalty case, in favor of the defendant, Eric Lynn Moore. It was brought to my attention by Kevin Foley, who wanted to share his observations regarding the decision (in a guest blog post), particularly since the ruling hinged on the unusual procedure of mathematically averaging three different IQ scores obtained across decades. According to the records, the courts averaged a group IQ score obtained when the defendant was 7 years old, a 1991 WAIS-R, and a 2004 WAIS-III. Although the scores were very consistent, I have never heard of such a simple mathematical method being applied to scores from different tests across such a long period of time. The measurement questions raised by this simplistic approach are beyond the scope of a single blog post.

My initial amazement is echoed in a 32-page dissenting opinion by Appeals Judge Jerry Smith. As per an AP story:
  • Appeals Judge Jerry Smith, in a 32-page dissent that was twice as long as the court's decision, called the majority ruling "intellectually sluggish" and chastised his colleagues on the court for using "haphazardly-applied standards of review, casually-read caselaw, and superficially-scrutinized evidence." 
 Judge Smith's dissenting opinion starts on page 16 of the final court ruling.

Kevin Foley was similarly struck by the manner in which the court invoked the simplistic approach to establishing mental retardation.  As a result, he wrote the following guest blog post which I'm posting on his behalf "as is."  In addition, I've located copies of the final ruling (click here), and three prior appeals (click here, here, and here). 

I've only skimmed the final ruling but would urge all psychologists involved in Atkins cases, or psychologists who do intellectual testing of any kind, to read it. The ruling is an interesting window into a court's logic re: how intelligence testing and scores can be viewed...and how measurement and psychometric principles can be ignored. Aside from the simple arithmetic averaging of three different IQ scores, other interesting comments center on the WAIS being the "standard" IQ test for Atkins cases and the argument for determining an MR diagnosis based on IQ scores only (dismissal of adaptive behavior evidence and testimony). I will be reading it more closely and may offer additional comments upon greater reflection.

Below is Kevin Foley's unedited guest post:

Moore v. Quarterman, Case No. 05-70038, 5th Cir., Aug. 21, 2009 (unpublished).

Eric Lynn Moore, an African American inmate under a sentence of death, sought to escape the death penalty by invoking Atkins. The federal district court ruled in Moore’s favor, exempting him from the death penalty. The state appealed. According to the Fifth Circuit Court of Appeals, Moore obtained the following IQ scores: a 76 on the WAIS-R; a 66 on the WAIS-III; and a 74 on the Primary Mental Abilities test. Moore also scored in the bottom eight-tenths of a percentile on the TONI-2. The dissenting opinion in Moore states that the PMA was given to Moore when he was in the first grade; the WAIS-R was given to Moore, in prison, in 1991; and the WAIS-III was given to Moore in 2004. The district court resolved the issue of Moore’s current intellectual functioning by averaging the three scores to come up with an average score of 72, to which it applied the standard error of measurement. The appeals court approved this approach, stating, “In averaging the test scores and relying on the five-point margin of error . . . the district court attempted to find a way to reconcile all three test scores.”
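
For readers who want to check the numbers, the district court's arithmetic can be reproduced in a few lines of Python. This is only an illustrative sketch: the function name `average_with_band` is hypothetical, and treating the "five-point margin of error" as a flat plus-or-minus band around the average is an assumption about how the court used it, not a recommended psychometric procedure.

```python
from statistics import mean

# IQ scores as reported in the Fifth Circuit opinion.
scores = {"PMA": 74, "WAIS-R": 76, "WAIS-III": 66}


def average_with_band(values, margin=5):
    """Return (average, low, high) using a flat +/- margin.

    The flat margin is an assumed simplification of the court's
    "five-point margin of error"; it is not a true standard error
    for an average of scores from different tests.
    """
    avg = mean(values)
    return avg, avg - margin, avg + margin


avg, low, high = average_with_band(list(scores.values()))
print(avg, low, high)
```

With the three reported scores this gives an average of 72 and a band running from 67 to 77. The sketch gives every score equal weight regardless of test, age at testing, or scoring disputes, which is exactly the simplification the dissent objects to.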

The lone dissenting judge issued a scathing dissent, including biting comments accusing the majority of using “[h]aphazardly-applied standards of review, casually-read case law, and superficially-scrutinized evidence [which] make for an unfortunate combination; here, they result in shallow analysis and the wrong result. The only mitigation is that the majority opinion is unpublished, so it is not binding on anyone or any court.” Ouch!

The dissent correctly, in my opinion, took the trial and appeals courts to task on the issue of using an average of IQ scores from as far back as 1973 to determine current level of functioning. First, the trial court neglected to address issues surrounding the accuracy of the three scores. According to the dissent,

“All three of those test results were called into question at the evidentiary hearing. The PMA score, for example, is only a number; there is no evidence that it was properly scored or whether it was administered individually, as the test protocol requires, or to an entire school class. The vocabulary section of the WAIS-R, according to [defense expert] Llorente, was improperly scored, and in a way that may have slightly inflated the score. Llorente also testified concerning the ‘Flynn Effect,’ the apparent increase in the average IQ scores in populations over time, as measured by a given IQ test. Because the WAIS-R was an older test when it was administered to Moore, Llorente suggested adjusting Moore’s score of 76 downward by about four points.”

In addition, the state’s expert asserted that Moore’s expert improperly scored the WAIS-III test. The dissent correctly complained that the trial court took the easy way out by averaging the scores.

“Instead of grappling with those conflicting upward and downward adjustments, the district court gave the three scores equal weight, averaged them, reached an IQ of 72, applied the ‘five-point standard error of measurement,’ and therefore concluded that Moore had borne his burden of proof. It is that finding, and the district court’s actual reasoning in making it, that the panel must consider, and yet the majority refuses to address it at anything resembling an acceptable level of detail. . . .

“For one thing, there is no legal or record support for taking an average of Moore’s IQ scores. Averaging IQ scores is, to say the least, a creative approach to their analysis and comparison and is highly unusual. Neither expert suggested, employed, or endorsed it. The district court assumed, without any evident backing, that averaging is a meaningful way to compare scores from different IQ testing protocols administered years apart and that the margin of error was the same for all three and was the same after the averaging as before. All of those assumptions are facially implausible, and the district court had no apparent reason to think any of them is correct.”

Although the dissent erred in other respects (some of which may make for other interesting blog entries), it hit the nail on the head on this issue. Although these issues – assertions of invalid results, mis-scoring of tests, application of the Flynn Effect, and determining what weight to give to certain evidence – might be complicated and hard to resolve, resolving them is what courts do. They should educate themselves and use the experts to provide the information necessary to properly decide the case. Moreover, federal judges can appoint an independent expert to offset the parties' experts, if the judge feels the need for some impartial testimony to help guide the court.
