### IAP AP101 Brief #5: The Wechsler-like IQ subtest scaled score metric: The potential for misuse, misinterpretation and impact on critical life decisions

This is a revised version of an earlier post (which has now been deleted). The earlier post indicated that the report brief described below was in draft form and that I was seeking feedback and comments. A number of individuals provided constructive feedback. As a result, I revised the report (only slightly) and have posted the final version at the link mentioned below. Thanks for the feedback. The brief is now listed under the IAP AP101 Brief section of the blog sidebar.

Below are the introductory paragraphs to IAP AP101 Brief #5. The complete report is available for online viewing or downloading by clicking here. Enjoy.

I've recently been skimming James Flynn's new book (What Is Intelligence? Beyond the Flynn Effect) to better understand the methodology and interpretation of the Flynn effect. Of particular interest to me (as an applied measurement person) is his analysis of the individual subtest scores from the various Wechsler scales across time. As most psychologists know, Wechsler subtest scaled scores (ss) are on a scale with a mean (M) = 10 and a standard deviation (SD) = 3. The subtest ss range from 1 to 19. In Appendix 1 of his book, Flynn states "it is customary to score subtests on a scale in which the SD is 3, as opposed to IQ scores which are scaled with SD set at 15. To convert to IQ, just multiply subtest gains by five, as was done to get the IQ gains in the last column." At first glance, this statement makes it sound as if the transformation of subtest ss to IQ standard scores (SS) is an easy (“just multiply….”; emphasis added by me) and mathematically acceptable procedure without problems. However, on close inspection this transformation has the potential to introduce unknown sources of error that reduce the precision of the transformed SS scores. The goal of this brief technical post is to explain the issues involved when making this ss-to-IQ SS conversion.
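To make the arithmetic behind the "multiply by five" rule concrete, here is a minimal Python sketch of the linear rescaling implied by the two metrics described above (ss: M=10, SD=3; IQ SS: M=100, SD=15). The function names `ss_to_iq_ss` and `ss_gain_to_iq_gain` are my own illustrative choices, not anything taken from Flynn's book or a test manual, and the sketch shows only the straight linear conversion, not how any publisher actually builds norm tables.

```python
# Scale parameters as stated in the text.
SS_MEAN, SS_SD = 10, 3      # subtest scaled score (ss) metric
IQ_MEAN, IQ_SD = 100, 15    # IQ standard score (SS) metric


def ss_to_iq_ss(ss):
    """Linearly rescale a subtest ss (1-19) to the IQ SS metric."""
    if not 1 <= ss <= 19:
        raise ValueError("subtest scaled scores range from 1 to 19")
    # Same z-score on both scales; multiply before dividing to keep it exact.
    return IQ_MEAN + (ss - SS_MEAN) * IQ_SD / SS_SD


def ss_gain_to_iq_gain(ss_gain):
    """Convert a gain in ss units to IQ SS units (the 'multiply by five' rule)."""
    return ss_gain * IQ_SD / SS_SD


if __name__ == "__main__":
    for ss in range(1, 20):
        print(f"ss {ss:2d} -> SS {ss_to_iq_ss(ss):5.1f}")
```

Note what the loop shows: the 19 possible ss values land on only 19 SS values, spaced five points apart. A one-point change in ss becomes a five-point jump once expressed in IQ SS units.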

The ss 1-19 scale has a long history in the Wechsler batteries. For example, in Appendix 1 of Measurement of Adult Intelligence (Wechsler, 1944), Wechsler described the steps used to translate subtest raw scores to the new ss metric. The Wechsler batteries have continued this tradition in each new revision. Although the methodology and procedures used to calculate the ss 1-19 values have become more sophisticated over time, the resultant underlying scale for each subtest has not: scores still range from 1-19 (M=10; SD=3). Also, the most recent Stanford-Binet, 5th Edition (SB5; Roid, 2003) and Kaufman Assessment Battery for Children, 2nd Edition (KABC-II) have both adopted the same ss 1-19 scale for their respective individual subtests.

Why is this relatively crude (to be defined below) scale metric still used in some intelligence batteries when other contemporary intelligence batteries provide subtest scale metrics with finer measurement resolution? For example, the DAS-II (Elliott, 2007) places individual test scores on the T-scale (M=50; SD=10), with scores that range from 10-90. The WJ III (McGrew & Woodcock, 2001) places all test and composite scores on the standard score (SS) metric associated with full scale and composite scores (M=100; SD=15). The critical question to be asked is “are there advantages or disadvantages to retaining the historical ss 1-19 scale, or are there real advantages to having individual test scales with finer measurement resolution (DAS-II; WJ III)?”
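One simple way to put numbers on "measurement resolution" is to ask how large a one-point step is, in SD units, on each of the three metrics just mentioned. The sketch below does only that, using the SD values given in the text. The dictionary name `SCALE_SD` and the helper `sd_per_point` are hypothetical names introduced for illustration, and this is not a full treatment of score precision.

```python
from fractions import Fraction

# SD of each subtest score metric, as given in the text.
SCALE_SD = {
    "ss (M=10, SD=3)": 3,        # Wechsler, SB5, KABC-II subtests
    "T (M=50, SD=10)": 10,       # DAS-II individual tests
    "SS (M=100, SD=15)": 15,     # WJ III tests and composites
}


def sd_per_point(sd):
    """Size of a one-point score step, expressed as a fraction of an SD."""
    return Fraction(1, sd)


if __name__ == "__main__":
    for name, sd in SCALE_SD.items():
        step = sd_per_point(sd)
        print(f"{name}: one point = {step} SD ({float(step):.3f})")
```

On this view, a single ss point covers one third of an SD, whereas a single point covers one tenth of an SD on the T-scale and one fifteenth of an SD on the SS metric. In other words, each ss step is five times as coarse as each SS step.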

