
Tuesday, November 20, 2012

AP 101 Brief #18: Misunderstanding and misuse of achievement test scores in Atkins MR/ID death penalty cases: Part 2--Range of expected grade equivalents




Kevin S. McGrew, PhD.
Institute for Applied Psychometrics (IAP)





            In Part 1 of this AP 101 Brief Report (AP 101 Brief #17:  Misunderstanding and misuse of achievement test scores in Atkins MR/ID death penalty cases:  Part 1 -- Range of expected standard scores), the 95% confidence band around the expected/predicted achievement standard score for an individual with an IQ of 70 was calculated to span 36 points (±18), assuming a correlation between IQ and achievement tests of .75.[1]  The point-specific expected/predicted achievement standard score (which accounted for regression to the mean) was 78, with a 95% confidence band for expected/predicted scores of 60 to 96.  If you have not read the first installment in this series, I strongly recommend that you stop and read it before continuing, because Part 1 provides considerable background information upon which this second part is based.  Below is the visual-graphic summary of Part 1 of this series.

            In Part 2 of this series, the expected/predicted achievement score of 78, as well as the expected range (at 95% confidence) of standard scores from 60 to 96, are converted to grade equivalents (GE) for Broad Reading, Math, and Written Language at ages 25, 35, and 45 in the WJ III NU norm data.  The following general procedure was carried out using the WJ III NU norm tables.  These tables were used because they provide data-based values for the expected standard scores (and GEs), rather than values derived from prediction equations or statistical simulations that are not based on real data.
·         For each of the three WJ III achievement clusters, the specific WJ III W-score[2] associated with a standard score (based on age norms) of 78 was identified.  This W-score was then entered in the WJ III NU grade norm tables to identify the GE associated with that W-score.  This step was repeated for the lower (60) and upper (96) standard scores of the 95% confidence band, yielding GE values for both of those standard scores for each of the three achievement clusters.  This resulted in three GE values at each of the three selected ages (GE for achievement SS = 78; GE for achievement SS = 60; GE for achievement SS = 96).  These three sets of values were then plotted on graphs, with lines connecting the corresponding GE/SS values across ages.
The three resulting figures are presented below.  Only the Broad Reading figure is discussed in detail.  The general interpretation is the same for Broad Math and Written Language, although the specific GE values in each of those figures should be substituted for the Broad Reading values discussed here.



            The Broad Reading GE values associated with the expected/predicted SS of 78 are 5.5 (25 years of age), 4.1 (35 years of age), and 4.4 (45 years of age), ranging from the beginning of 4th grade to the middle of 5th grade.  This is the bold middle line.  The Broad Reading GE values associated with the lower-bound SS of 60 are 3.0 (25 years of age), 2.0 (35 years of age), and 2.7 (45 years of age).  This is the bottom line in each figure.  The Broad Reading GE values associated with the upper-bound SS of 96 are 10.9 (25 years of age), 11.2 (35 years of age), and 11.4 (45 years of age).  This is the top line in each figure.
            Thus, for a person with an IQ score of 70, the expected WJ III Broad Reading achievement GE’s range between 4.1 and 5.5, depending on age.  However, given the large range of standard scores associated with the 95% prediction confidence band (a span of 36 points), it is not surprising that this range, when converted to GE’s, can extend from as low as 2.0 to 3.0 up to the end of 10th grade and into 11th grade (10.9 to 11.4).
            A quick review of the figures most likely raises many questions.  For example, why is the distance between the bottom line (GE’s associated with SS = 60) and the middle line (GE’s associated with the expected/predicted score of 78) much narrower than the distance between that middle line and the top line (GE’s associated with SS = 96)?  Also, why are the three lines not consistently linear?  Complete answers to these questions would require excessive detail, statistical explanations, more graphs, etc., that would likely confuse readers.  In brief, the answer lies in the fact that (a) standard scores are equal interval metrics and GE’s are not, (b) standard scores are partially derived from the standard deviation (SD) of the W-scores at each age within each achievement domain, and these values are not the same across achievement domains or across ages, and (c) W-score growth curves show rapid growth during the early ages/grades, then a plateau, and then a much slower rate of decline.  Enough said.
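The asymmetry raised in the first question can be read directly off the Broad Reading values reported above. The following is a minimal sketch, not part of the original analysis; the variable and function names are illustrative. It computes the GE distance from the middle line (SS = 78) to the bottom line (SS = 60) and to the top line (SS = 96) at each age, even though both distances are the same 18 points in standard score units.

```python
# Broad Reading grade equivalents (GE) quoted in the text, keyed by age.
# Each tuple is (GE at SS = 60, GE at SS = 78, GE at SS = 96).
broad_reading_ge = {
    25: (3.0, 5.5, 10.9),
    35: (2.0, 4.1, 11.2),
    45: (2.7, 4.4, 11.4),
}


def ge_gaps(lower, expected, upper):
    """Return (gap below, gap above) the expected GE, rounded to one decimal."""
    return round(expected - lower, 1), round(upper - expected, 1)


gaps_by_age = {age: ge_gaps(*values) for age, values in broad_reading_ge.items()}

for age, (below, above) in gaps_by_age.items():
    print(f"age {age}: {below} GE below the expected score, {above} GE above it")
```

At every age the upper gap is more than twice the lower gap, which is what the unequal-interval nature of GE’s would lead one to expect.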
Summary
            Similar to the conclusion from Part 1 of this report, which dealt with expected standard scores, the expected range of GE’s for adults (ages 25 to 45) with an IQ of 70 can, for some individuals, vary tremendously.  The presence of some achievement scores significantly above expectations for an IQ associated with mild MR/ID (70), possibly into the junior and senior high grade levels, is possible when the less-than-perfect correlation between IQ and achievement scores is acknowledged.  One must recognize that although correlations in the .70’s are high and statistically significant, they indicate that IQ scores can only account for approximately half (about 50%) of the variance in tested achievement scores.[3] 
Too many lay persons and, unfortunately, many educators and psychologists have fallen prey to the IQ-ACH fallacy, which is the non-science-based assumption or belief that individuals can only achieve at or below their measured IQ.  The appropriate scientific fact is that for any IQ score there is a symmetrical range of possible expected achievement scores which, whether reported in terms of standard scores or GE’s, can be large.  Achievement scores that are above predicted levels based on measured IQ scores will occur with some degree of regularity for individuals with mild MR/ID and should not be incorrectly interpreted as a knee-jerk indication that a person may not be considered for diagnosis as MR/ID, assuming they meet all relevant criteria or prongs.
Finally, all the calculations in Parts 1 and 2 of this series are based on the WJ III NU norm data.  The extent to which the results, especially the GE results, generalize to other achievement tests is unknown.  However, although the specific GE's obtained by applying the same methods to the norm data from different achievement tests (e.g., WIAT series) might vary slightly, I am reasonably confident that the overarching conclusion would generalize:  "achievement scores that are above predicted levels based on measured IQ scores will occur with some degree of regularity for individuals with mild MR/ID and should not be incorrectly interpreted as a knee-jerk indication that a person may not be considered for diagnosis as MR/ID, assuming they meet all relevant criteria or prongs." 

[1] If a lower level of IQ/ACH correlation is assumed then the range of expected standard scores (Part 1 report) or grade equivalents (GE) will be larger.
[2] The WJ III scales are based on Rasch Item Response Theory (IRT) scaling methods that result in raw scores being converted to the equal interval W-score growth metric, which is then used to calculate all derived scores (AE, GE, SS, etc.).
[3] This percent figure represents the coefficient of determination which is calculated by squaring a correlation (e.g., r = .70 squared is .49) and then multiplying the value by 100% (thus, 49%).
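
The calculation described in footnote 3 can be sketched as follows. This is a minimal illustration rather than part of the original report; the function name is hypothetical, and the two correlations shown are the .70 used in the footnote and the .75 assumed throughout the series.

```python
def percent_variance_explained(r: float) -> float:
    """Square the correlation and express the result as a percentage."""
    return round(r ** 2 * 100, 2)


for r in (0.70, 0.75):
    print(f"r = {r:.2f} -> {percent_variance_explained(r)}% of achievement variance")
```

Under these assumptions, a correlation of .70 corresponds to 49% and a correlation of .75 to 56.25%, so a substantial share of achievement variance is left unaccounted for by IQ in either case.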

Sunday, November 18, 2012

AP 101 Brief # 17: Misunderstanding and misuse of achievement test scores in Atkins MR/ID death penalty cases: Part 1--Range of expected standard scores




Kevin S. McGrew, PhD.
Institute for Applied Psychometrics (IAP)


Individually administered comprehensive intelligence tests (IQ) demonstrate strong and significant correlations with individually administered achievement tests (ACH).  However, the magnitude of the IQ/ACH correlation is not at a level that allows for precise prediction of expected achievement for individuals.  Unfortunately, many educators, lay persons, and psychologists have a false understanding of the IQ/ACH relationship, which I call the IQ-ACH fallacy.  The IQ-ACH fallacy can lead to the misunderstanding and misuse of achievement scores in the diagnosis of MR/ID.  The goal of this IAP Applied Psychometrics Brief report (which will be a 2 or 3 part series) is to educate professionals and non-professionals on the scientific evidence regarding IQ/ACH relations.  The focus is on Atkins MR/ID contexts, but the information is relevant to all situations where IQ and ACH test scores are compared.  I have previously written about this topic at the ICDP blog in a post titled "Can a mild MR/ID person fail to be formally diagnosed before the age of 18? Do Forrest Gump's exist?"  That prior post may be worth reading before the rest of the current brief report.

What is the typical correlation between IQ and achievement test scores?

First, what is the typical correlation between measured IQ and ACH?  I have frequently seen a value of .50 referenced for adult populations and values from .60 to .65 for school-age populations.  I decided, given that most IQ tests have been revised as per contemporary neurocognitive and psychometric (CHC theory) research during the past 25 years, that these correlations needed to be re-verified or revised. 
  
Given the current focus on adult forensic settings (Atkins cases), I turned to the WAIS-IV technical manual.  Table 5.13 (page 87) reports correlations between the WAIS-IV scales and achievement scales from the WIAT-II in a sample of 93 subjects.  The WAIS-IV FS IQ correlated .76, .84, and .65 with the WIAT-II Reading, Mathematics, and Written Language Composites.  Given the small sample size of this validity study (n = 93), I sought corroborating data.  As a coauthor of the WJ III, I was able to access the norm data of the WJ III Battery (NU norms) and calculate the correlations between the WJ III NU General Intellectual Ability—Standard (GIA-Std) and the WJ III NU Broad Reading, Math, and Written Language clusters in adults from ages 20 through 45 (sample sizes ranged from 733 to 751 subjects).  Correlations were .74, .68, and .71 between the WJ III GIA and the WJ III Broad Reading, Math, and Written Language clusters.  These WJ III IQ/ACH correlations were similar to the WAIS-IV/WIAT-II correlations.  Taking all six correlations together, I calculated the average (median) value, which was approximately .75.  This .75 value is much higher than the typical .50 to .65 values often cited in the literature.  It is my conclusion that contemporary IQ batteries (e.g., WAIS-IV, WISC-IV, WJ III, Stanford-Binet IV) are better predictors of ACH than their earlier counterparts, and that the typical IQ/ACH correlation used in adult (Atkins) settings should be approximately .70 to .75.
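
For readers who want to check the summary value, the six correlations reported above can be combined as in the following minimal sketch. The dictionary keys are illustrative labels, not names taken from either test manual.

```python
from statistics import mean, median

# The six IQ/ACH correlations reported in the paragraph above.
correlations = {
    "WAIS-IV FS IQ / WIAT-II Reading": 0.76,
    "WAIS-IV FS IQ / WIAT-II Mathematics": 0.84,
    "WAIS-IV FS IQ / WIAT-II Written Language": 0.65,
    "WJ III GIA-Std / Broad Reading": 0.74,
    "WJ III GIA-Std / Broad Math": 0.68,
    "WJ III GIA-Std / Broad Written Language": 0.71,
}

median_r = round(median(correlations.values()), 3)
mean_r = round(mean(correlations.values()), 3)

print(f"median r = {median_r}, mean r = {mean_r}")
```

Computed this way, the median is .725 and the mean is .73, both of which fall within the .70 to .75 range recommended above.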

When predicting ACH scores from IQ scores, how much error in prediction is present?

Given that the most common IQ/ACH analysis is to determine if a person’s measured ACH scores are within the expected range for that person’s measured IQ, I next calculated the Standard Error of Estimate (SEest) for each of the six correlations reported above (the values ranged from 8.2 to 11.4).[1]  In simple terms, when using a specific correlation between two variables to predict one variable from the other, there will be error in the prediction.  More importantly, this error, just like the SEM around a single score, takes the form of a normal distribution with a mean (zero is the mean of the expected IQ/ACH differences) and a standard deviation of predicted/expected scores.  This SD of expected or predicted scores is the SEest.  I then calculated the median of these six SEest values and obtained a value of 9.15 points, which I rounded to 9 points (for ease of computation and discussion). 
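
The brief points to Anastasi and Urbina (1997) for the SEest formula without printing it. A minimal sketch of the standard textbook form, SEest = SD × sqrt(1 − r²), is given below. It assumes an achievement scale with SD = 15, and the function name is illustrative. Values computed this way need not match the reported figures exactly, since those were obtained by the author from the underlying data.

```python
import math


def standard_error_of_estimate(r: float, sd: float = 15.0) -> float:
    """Textbook SEest: the SD of the criterion scaled by sqrt(1 - r^2).

    sd = 15 is an assumption (standard score scale with M = 100, SD = 15).
    """
    return sd * math.sqrt(1.0 - r ** 2)


# The six correlations reported earlier in the brief.
for r in (0.76, 0.84, 0.65, 0.74, 0.68, 0.71):
    print(f"r = {r:.2f} -> SEest = {standard_error_of_estimate(r):.2f}")
```

The formula makes the direction of the relationship explicit: the lower the IQ/ACH correlation, the larger the SEest and the wider the band of expected scores.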

An SEest of 9 means that for any specific IQ score there is an expected/predicted score that has a 68% confidence band of prediction of ±9 points (from 9 points lower to 9 points higher than the expected/predicted score).  The 95% prediction confidence band is approximately twice the 68% value, or ±18 points.  Thus, if one wants to use a 95% confidence band, which has become the accepted standard of precision in life or death Atkins cases, then a person’s expected/predicted achievement score needs to be bounded by a range of scores from 18 points lower to 18 points higher, a span of 36 standard score points!

What is the range of expected/predicted achievement scores (in standard scores) for a person with an IQ of 70?

For illustrative purposes, I took an IQ score of 70 as a hypothetical person’s measured IQ.  Using the IQ/ACH correlation of .75 (and assuming both the IQ and ACH scores are on a standard score scale with M = 100, SD = 15), I then calculated an expected/predicted ACH score, a calculation that must take into account the phenomenon of regression to the mean (see Cahan et al., 2012 for a detailed discussion of the history and a critical analysis of IQ-ACH regression procedures).  This calculation is of the form [(IQ - 100) x r] + 100, where r is the IQ/ACH correlation.  Thus, [(70 - 100) x .75] + 100 = 77.5 (rounded to 78 for discussion purposes).  For a person with an IQ of 70, then, the best single point estimate of their expected achievement is a standard score of 78.

This expected/predicted score of 78 must now be bracketed with the 95% SEest band (±18).  This produces a 95% confidence band of expected/predicted ACH standard scores from 60 to 96!  This means that individuals with mild MR/ID (in this case IQ = 70) can obtain achievement standard scores well below their measured IQ score (with 95% confidence, down to as low as 60).  More importantly, and often misunderstood and misused in MR/ID determination, individuals with mild MR/ID (defined here as an IQ of 70) can, when using a 95% prediction confidence band and accounting for regression to the mean effects, obtain ACH standard scores up in the normal range (low average).  Of course, most expected/predicted achievement scores will bunch around the point-specific predicted score of 78 in the same manner that scores bunch around the mean in a normal curve.
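
The calculation described above can be sketched in a few lines. This is a minimal illustration under the brief's stated assumptions (M = 100, r = .75, SEest rounded to 9, and a 95% band taken as twice the SEest rather than the 1.96 multiplier of the normal curve). The function names are illustrative.

```python
def expected_achievement(iq: float, r: float, mean: float = 100.0) -> float:
    """Regression-to-the-mean prediction: [(IQ - mean) x r] + mean."""
    return (iq - mean) * r + mean


def prediction_band(expected: float, se_est: float, multiplier: float = 2.0):
    """Return (low, high) bounds; multiplier = 2 follows the brief's 95% band."""
    half_width = multiplier * se_est
    return expected - half_width, expected + half_width


point_estimate = expected_achievement(iq=70, r=0.75)   # 77.5
rounded_estimate = int(point_estimate + 0.5)           # 78, rounded half up
low, high = prediction_band(rounded_estimate, se_est=9)

print(f"expected ACH = {point_estimate} (rounded to {rounded_estimate})")
print(f"95% band = {low:.0f} to {high:.0f}")
```

Changing `r` or `se_est` shows how sensitive the band is to the assumed correlation: a lower correlation pulls the point estimate closer to 100 and, through a larger SEest, widens the band.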

Thus, ACH standard scores (based on a psychometrically sound, individually administered ACH test) can be significantly higher than an individual’s IQ score, due to the less than perfect correlation between IQ and ACH.  For further explanation and a detailed review, see McGrew and Evans (2004).  The presence of ACH test standard scores above a person’s measured IQ score, when that IQ score is in the mild MR/ID range, should not be used as clear and reliable evidence that the person is not MR/ID.  Recognizing the IQ-ACH fallacy rules out such conclusions.  The only scientifically sound interpretation is that, assuming that IQ and ACH tests correlate at approximately .75, once the best regression-to-the-mean expected/predicted score is calculated, this score must be bounded by a range of standard scores 18 points lower and 18 points higher (95% confidence band of prediction/estimation).

Summary

            The essence of the above information is summarized in the figure below.



What about the range of expected grade equivalents?

            In the next installment of this series, the above hypothetical scenario will be presented in the form of the range of expected (95% confidence) grade equivalent (GE) scores.  Given the unequal interval characteristics of GE’s and the non-linear growth curves of cognitive and achievement abilities, the results may shock many professionals and non-professionals.  Stay tuned.





[1] A brief description and the relatively simple formula for calculating the SEest is available in Anastasi and Urbina (1997).