Sunday, November 18, 2012

AP 101 Brief # 17: Misunderstanding and misuse of achievement test scores in Atkins MR/ID death penalty cases: Part 1--Range of expected standard scores

Kevin S. McGrew, PhD.
Institute for Applied Psychometrics (IAP)

Individually administered comprehensive intelligence tests (IQ) demonstrate strong and significant correlations with individually administered achievement tests (ACH).  However, the magnitude of the IQ/ACH correlation is not high enough to allow for precise prediction of expected achievement for individuals.  Unfortunately, many educators, lay persons, and psychologists have a false understanding of the IQ/ACH relationship, which I call the IQ-ACH fallacy.  This fallacy can lead to the misunderstanding and misuse of ACH scores in the diagnosis of MR/ID.  The goal of this IAP Applied Psychometrics Brief report (which will be a 2 or 3 part series) is to educate professionals and non-professionals on the scientific evidence regarding IQ/ACH relations.  The focus is on Atkins MR/ID contexts, but the information is relevant to all situations where IQ and ACH test scores are compared.  I have previously written about this topic at the ICDP blog; that prior post ("Can a mild MR/ID person fail to be formally diagnosed before the age of 18? Do Forrest Gump's exist?") may be worth reading before the rest of this brief report.

What is the typical correlation between IQ and achievement test scores?

First, what is the typical correlation between measured IQ and ACH?  I have frequently seen a value of .50 referenced for adult populations and values from .60 to .65 for school-age populations.  I decided, given that most IQ tests have been revised as per contemporary neurocognitive and psychometric (CHC theory) research during the past 25 years, that these correlations needed to be re-verified or revised. 
Given the current focus on adult forensic settings (Atkins cases), I turned to the WAIS-IV technical manual.  Table 5.13 (page 87) reports correlations between the WAIS-IV scales and achievement scales from the WIAT-II in a sample of 93 subjects.  The WAIS-IV FS IQ correlated .76, .84, and .65 with the WIAT-II Reading, Mathematics, and Written Language Composites, respectively.  Because the sample for this validity study was small (n = 93), I also used the norm data of the WJ III Battery (NU norms), which I was able to access as a coauthor of the WJ III.  I calculated the correlations between the WJ III NU General Intellectual Ability-Standard (GIA-Std) cluster and the WJ III NU Broad Reading, Math, and Written Language clusters in adults from ages 20 through 45 (sample sizes ranged from 733 to 751 subjects).  These correlations were .74, .68, and .71, respectively.  The WJ III IQ/ACH correlations were similar to the WAIS-IV/WIAT-II correlations.  Taking all six correlations together, I calculated the average (median) value, which was approximately .75.  This .75 value is much higher than the typical .50 to .65 values often cited in the literature.  It is my conclusion that contemporary IQ batteries (e.g., WAIS-IV, WISC-IV, WJ III, Stanford-Binet IV) are better predictors of ACH than their earlier counterparts, and that the typical IQ/ACH correlation used in adult (Atkins) settings should be approximately .70 to .75.

When predicting ACH scores from IQ scores, how much error in prediction is present?

Given that the most common IQ/ACH analysis is to determine if a person’s measured ACH scores are within the expected range for that person’s measured IQ, I next calculated the Standard Error of Estimate (SEest) for each of the six correlations reported above (the values ranged from 8.2 to 11.4).[1]  In simple terms, when using a specific correlation between two variables to predict one variable from the other, there will be error in the prediction.  More importantly, this error, just like the SEM around a single score, takes the form of a normal distribution with a mean (zero is the average or mean expected IQ/ACH difference) and a standard deviation of predicted/expected scores.  This SD of expected or predicted scores is the SEest.  I then calculated the median of these six SEest values and obtained a value of 9.15 points, which I rounded to 9 points (for ease of computation and discussion).

An SEest of 9 means that for any specific IQ score there is an expected/predicted score that has a 68% confidence band of prediction of ± 9 points (from 9 points lower to 9 points higher than the expected/predicted score).  The 95% prediction confidence band is twice the 68% value, or ± 18 points.  Thus, if one wants to use a 95% confidence band, which has become the accepted standard of precision in life or death Atkins cases, then a person’s expected/predicted achievement score needs to be bounded by a range of scores from 18 points lower to 18 points higher, a span of 36 standard score points!

What is the range of expected/predicted achievement scores (in standard scores) for a person with an IQ of 70?

For illustrative purposes, I took an IQ score of 70 as a hypothetical person’s measured IQ.  Using the IQ/ACH correlation of .75 (and assuming both the IQ and ACH scores are on a standard score scale with M = 100, SD = 15), I then calculated an expected/predicted ACH score, a calculation that must take into account the phenomenon of regression to the mean (see Cahan et al., 2012 for a detailed discussion of the history and a critical analysis of IQ-ACH regression procedures).  This calculation is of the form [(IQ - 100) x r] + 100, where r is the IQ/ACH correlation.  Thus, [(70 - 100) x .75] + 100 = 77.5 (rounded to 78 for discussion purposes).  For a person with an IQ of 70, then, the best single point estimate of their expected achievement is a standard score of 78.
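The regression-to-the-mean calculation described above can be sketched in a few lines of Python.  This is only an illustration of the formula given in the text; the function name `expected_ach` and its parameter names are my own labels, not part of any test publisher's software.

```python
def expected_ach(iq, r, mean=100.0):
    """Point estimate of the expected ACH standard score for a given IQ.

    Implements [(IQ - 100) x r] + 100, assuming IQ and ACH are both on a
    standard score scale with the same mean and the same SD.
    The function and parameter names are hypothetical, for illustration only.
    """
    return (iq - mean) * r + mean


# Hypothetical person from the text: IQ = 70, IQ/ACH correlation = .75
print(expected_ach(70, 0.75))  # 77.5
```

Note that the predicted score moves toward the mean as the correlation drops: with r = 1.0 the expected ACH score would equal the IQ score, and with r = 0 it would be 100 regardless of IQ.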

This expected/predicted score of 78 must now be bracketed with the 95% SEest band (± 18).  This produces a 95% confidence band of expected/predicted ACH standard scores from 60 to 96!  This means that individuals with mild MR/ID (in this case IQ = 70) can obtain achievement standard scores well below their measured IQ score (with 95% confidence, down to as low as 60).  More importantly, and often misunderstood and misused in MR/ID determination, individuals with mild MR/ID (defined here as an IQ of 70) can, when using a 95% prediction confidence band and accounting for regression to the mean effects, obtain ACH standard scores up in the normal range (low average).  (Of course, most expected/predicted achievement scores will bunch around the point-specific predicted score of 78 in the same manner that scores bunch around the mean in a normal curve.)
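The bracketing step can be sketched the same way.  The sketch below takes the rounded values used in this brief as defaults (SEest = 9, and a 95% band of twice the SEest); the function name `prediction_band` and its parameters are hypothetical labels chosen for illustration.

```python
def prediction_band(predicted, seest=9.0, multiplier=2.0):
    """Return (low, high) bounds around a predicted ACH standard score.

    Defaults follow the rounded values used in this brief:
    SEest = 9 points, and a 95% band taken as twice the 68% (1 SEest) band.
    The names here are hypothetical, for illustration only.
    """
    half_width = seest * multiplier
    return predicted - half_width, predicted + half_width


# Predicted ACH score of 78 for the hypothetical person with IQ = 70
low, high = prediction_band(78)
print(low, high)  # 60.0 96.0

# The 68% band uses a multiplier of 1
print(prediction_band(78, multiplier=1.0))  # (69.0, 87.0)
```

Passing a different `seest` or `multiplier` shows how sensitive the band is to those choices; the 60 to 96 range depends on the rounded values adopted here.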

Thus, ACH standard scores (based on a psychometrically sound, individually administered ACH test) can be significantly higher than an individual’s IQ score, due to the less than perfect correlation between IQ and ACH.  For further explanation and a detailed review see McGrew and Evans (2004).  The presence of ACH test standard scores above a person’s measured IQ score, when that IQ score is in the mild MR/ID range, should not be used as clear and reliable evidence that the person is not MR/ID.  Drawing such a conclusion is an instance of the IQ-ACH fallacy.  The only scientifically sound interpretation is this: assuming that IQ and ACH tests correlate at approximately .75, once the best regression-to-the-mean expected/predicted score is calculated, that score must be bounded by a range of standard scores from 18 points lower to 18 points higher (the 95% confidence band of prediction/estimation).


The essence of the above information is summarized in the figure below.

What about the range of expected grade equivalents?

In the next installment of this series, the above hypothetical scenario will be presented in the form of the range of expected (95% confidence) grade equivalent (GE) scores.  Given the unequal interval characteristics of GEs and the non-linear growth curves of cognitive and achievement abilities, the results may shock many professionals and non-professionals.  Stay tuned.

[1] A brief description and the relatively simple formula for calculating the SEest is available in Anastasi and Urbina (1997).