Monday, July 16, 2018

What is an applied psychometrician?

I wear a number of hats within the broad field of educational psychology.  One is that of an applied psychometrician.  Whenever anyone asks what I do, I receive strange looks when that title rolls out of my mouth.  I then always need to provide a general explanation.

I've decided to take a little time and generate a brief explanation.  I hope this helps.

The online American Psychological Association (APA) Dictionary of Psychology defines psychometrics as: n. the branch of psychology concerned with the quantification and measurement of mental attributes, behavior, performance, and the like, as well as with the design, analysis, and improvement of the tests, questionnaires, and other instruments used in such measurement. Also called psychometric psychology; psychometry.

The definition can be understood from the two components of the word. Psycho refers to “psyche” or the human mind. Metrics refers to “measurement.” Thus, in simple terms, psychometrics means psychological measurement--it is the math and science behind psychological testing.  Applied psychometrics is concerned with the application of psychological theory, techniques, statistical methods, and psychological measurement to applied psychological test development, evaluation, and test interpretation. This compares to more pure or theoretical psychometrics which focuses on developing new measurement theories, methods, statistical procedures, etc. An applied psychometrician uses the various theories, tools and techniques developed by more theoretical psychometricians in the actual development, evaluation, and interpretation of psychological tests. By way of analogy, applied psychometrics is to theoretical psychometrics, as applied research is to pure research.

The principles of psychometric testing are very broad in their potential application, and have been applied to such areas as intelligence, personality, interest, attitudes, neuropsychological functioning, and diagnostic measures (Irwing & Hughes, 2018). As noted recently by Irwing and Hughes (2018), the reach of psychometrics is broad: “It applies to many more fields than psychology, indeed biomedical science, education, economics, communications theory, marketing, sociology, politics, business, and epidemiology amongst other disciplines, not only employ psychometric testing, but have also made important contributions to the subject” (p. 3).

Although there are many publications of relevance to the topic of test development and psychometrics, the most useful and important single source is the “Standards for Educational and Psychological Testing” (a.k.a. the Joint Test Standards; American Educational Research Association [AERA], American Psychological Association [APA], National Council on Measurement in Education [NCME], 2014). The Joint Test Standards outline standards and guidelines for test developers, publishers, and users (psychologists) of tests. Given that the principles and theories of psychometrics are generic (they cut across all subdisciplines of psychology that use psychological tests), and there is a professionally accepted set of standards (the Joint Test Standards), an expert in applied psychometrics has the skills and expertise to evaluate the fundamental, universal or core measurement integrity (i.e., quality of norms, reliability, validity, etc.) of various psychological tests and measures (e.g., surveys, IQ tests, neuropsychological tests, personality tests), although sub-disciplinary expertise and training would be required to engage in expert interpretation within each sub-discipline. For example, expertise in brain development, functioning and brain-behavior relations would be necessary to use neuropsychological tests to make clinical judgements regarding brain dysfunction, type of brain disorders, etc. However, the basic psychometric characteristics of almost all psychological and educational tests (e.g., neuropsychological, IQ, achievement, personality, interest) can be evaluated by professionals with expertise in applied psychometrics.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (2014). Standards for Educational and Psychological Testing. Washington, DC: Author.

Irwing, P. & Hughes, D. J. (2018). Test development. In P. Irwing, T. Booth, & D. J. Hughes (Eds.), The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development (pp. 3-49). Hoboken, NJ: John Wiley & Sons.

Thursday, July 12, 2018

Great psychometric resource: The Wiley Handbook of Psychometric Testing

I just received my two volume set of this excellent resource on psychometric testing.  There are not many good books that cover such a broad array of psychometric measurement issues.  This is not what I would call "easy reading."  This is more like a "must have" resource book to have "at the ready" when seeking to understand contemporary psychometric test development issues.

National Academies Press: Neuroforensics: Exploring the legal implications of emerging technologies

This new publication is now available from the National Academies Press.

Court decision: Moore v Texas (2018) after SCOTUS vacated decision based on Briseno AB standards

I'm a bit behind in posting information regarding recent Atkins-related court decisions.

Despite SCOTUS recently vacating Moore v Texas based on Texas's Briseno standards not being consistent with prevailing medical and professional standards, Moore was still found to not be ID in the latest decision from Texas.  The majority opinion can be found here.  The dissenting opinion can be found here.

Wednesday, July 11, 2018

"Intellectual Disability, The Death Penalty, and Jurors"

"Intellectual Disability, The Death Penalty, and Jurors"
// Sentencing Law and Policy

The title of this post is the title of this new paper on SSRN authored by Emily Shaw, Nicholas Scurich and David Faigman. Here is its abstract:

In Atkins v. Virginia (2002), the United States Supreme Court held that intellectually disabled defendants cannot be sentenced to death; but since then, the Court has continued to grapple with how intellectual disability should be legally defined. Typically, however, it is jurors who determine whether a defendant is intellectually disabled and therefore categorically ineligible for the death penalty. Very little is known empirically about how jurors reason about and make these decisions.

This Article presents the results of a novel experiment in which venire jurors participated in an intellectual disability hearing and a capital sentencing hearing. The diagnosis of a court-appointed expert was experimentally manipulated (defendant is or is not intellectually disabled), as was the provision of information about the crime (present or absent). Jurors were considerably more likely to find the defendant not disabled when the expert opined that the defendant was not disabled.  They were also more likely to find the defendant not disabled when they learned about the details of the crime. Similarly, jurors were more likely to sentence the defendant to death after learning about the details of the crime, which increased perceptions of both the defendant's blameworthiness and his mental ability.  These findings highlight the reality that jurors' assessments of intellectual disability are influenced by crime information, contrary to pronouncements made by the United States Supreme Court, and they support the use of bifurcated disability proceedings, as some states have recently adopted.


Sunday, July 8, 2018

Practice or retest effects in measures of working memory capacity (Gwm): A meta-analysis

Retest effects in working memory capacity tests: A meta-analysis
Jana Scharfen, Katrin Jansen, Heinz Holling. Article link

© Psychonomic Society, Inc. 2018


The repeated administration of working memory capacity tests is common in clinical and research settings. For cognitive ability tests and different neuropsychological tests, meta-analyses have shown that they are prone to retest effects, which have to be accounted for when interpreting retest scores. Using a multilevel approach, this meta-analysis aims at showing the reproducibility of retest effects in working memory capacity tests for up to seven test administrations, and examines the impact of the length of the test-retest interval, test modality, equivalence of test forms and participant age on the size of retest effects. Furthermore, it is assessed whether the size of retest effects depends on the test paradigm. An extensive literature search revealed 234 effect sizes from 95 samples and 68 studies, in which healthy participants between 12 and 70 years repeatedly performed a working memory capacity test. Results yield a weighted average of g = 0.28 for retest effects from the first to the second test administration, and a significant increase in effect sizes was observed up to the fourth test administration. The length of the test-retest interval and publication year were found to moderate the size of retest effects. Retest effects differed between the paradigms of working memory capacity tests. These findings call for the development and use of appropriate experimental or statistical methods to address retest effects in working memory capacity tests.

Keywords Meta-analysis · Retest effect · Practice effect · Working memory
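
To give a sense of scale for the average retest gain reported in the abstract above (g = 0.28), here is a minimal Python sketch that converts it into standard-score points and a percentile shift. The mean-100/SD-15 metric and the normality assumption are illustrative choices made here, not something the abstract specifies, and the function names are hypothetical.

```python
from statistics import NormalDist

# Weighted average retest effect reported in the abstract
# (first to second test administration).
RETEST_G = 0.28


def gain_in_points(g, sd=15.0):
    """Express a standardized mean difference in score points.

    The SD of 15 is an assumed standard-score metric, used only for illustration.
    """
    return g * sd


def percentile_after_gain(g, start_percentile=50.0):
    """Percentile reached after a gain of g SD units, assuming normal scores."""
    z_start = NormalDist().inv_cdf(start_percentile / 100.0)
    return 100.0 * NormalDist().cdf(z_start + g)


if __name__ == "__main__":
    print(round(gain_in_points(RETEST_G), 1))         # 4.2
    print(round(percentile_after_gain(RETEST_G), 1))  # 61.0
```

Under these assumptions, a gain of this size would move a score from the 50th to roughly the 61st percentile on the second administration, which is why retest scores should not be read as if they were first-time scores.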

Tuesday, June 12, 2018

Researchers find IQ scores dropping since the 1970s (Scandinavian countries)

This reverse Flynn effect has been found primarily in Scandinavian countries and not the US.  

Researchers find IQ scores dropping since the 1970s

Saturday, June 2, 2018

Can emotional intelligence (Gei) be trained: A meta-analysis

Can emotional intelligence be trained? A meta-analysis

Please cite this article as: Mattingly, V., Human Resource Management Review (2018),

Victoria Mattingly, Kurt Kraiger

Keywords: Emotional intelligence, Training, Meta-analysis


Human resource practitioners place value on selecting and training a more emotionally intelligent workforce. Despite this, research has yet to systematically investigate whether emotional intelligence can in fact be trained. This study addresses this question by conducting a meta-analysis to assess the effect of training on emotional intelligence, and whether effects are moderated by substantive and methodological moderators. We identified a total of 58 published and unpublished studies that included an emotional intelligence training program using either a pre-post or treatment-control design. We calculated Cohen's d to estimate the effect of formal training on emotional intelligence scores. The results showed a moderate positive effect for training, regardless of design. Effect sizes were larger for published studies than dissertations. Effect sizes were relatively robust over gender of participants, and type of EI measure (ability v. mixed model). Further, our effect sizes are in line with other meta-analytic studies of competency-based training programs. Implications for practice and future research on EI training are discussed.

See prior Gei posts here and here.

Evidence of a Flynn Effect in Children's Human Figure Drawings (1902-1968).

Saturday, May 19, 2018

The Relation between Intelligence and Adaptive Behavior: A Meta-Analysis 

Very important meta-analysis of the AB-IQ relation. The primary finding is on target with the prior informal synthesis by McGrew (2015).

The Relation between Intelligence and Adaptive Behavior: A Meta-Analysis   
Ryan M. Alexander 
Intelligence tests and adaptive behavior scales measure vital aspects of the multidimensional nature of human functioning. Assessment of each is a required component in the diagnosis or identification of intellectual disability, and both are frequently used conjointly in the assessment and identification of other developmental disabilities. The present study investigated the population correlation between intelligence and adaptive behavior using psychometric meta-analysis. The main analysis included 148 samples with 16,468 participants overall. Following correction for sampling error, measurement error, and range departure, analysis resulted in an estimated population correlation of ρ = .51. Moderator analyses indicated that the relation between intelligence and adaptive behavior tended to decrease as IQ increased, was strongest for very young children, and varied by disability type, adaptive measure respondent, and IQ measure used. Additionally, curvilinear regression analysis of adaptive behavior composite scores onto full scale IQ scores from datasets used to report the correlation between the Wechsler Intelligence Scale for Children-Fifth Edition and Vineland-II scores in the WISC-V manuals indicated a curvilinear relation: adaptive behavior scores had little relation with IQ scores below 50 (WISC-V scores do not go below 45), from which there was a positive relation up until an IQ of approximately 100, at which point and beyond the relation flattened out. Practical implications of varying correlation magnitudes between intelligence and adaptive behavior are discussed (viz., how the size of the correlation affects eligibility rates for intellectual disability).
Other Key Findings Reported
McGrew (2012) augmented Harrison's data-set and conducted an informal analysis including a total of 60 correlations, describing the distributional characteristics observed in the literature regarding the relation. He concluded that a reasonable estimate of the correlation is approximately .50, but made no attempt to explore factors potentially influencing the strength of the relation.
Results from the present study corroborate the conclusions of Harrison (1987) and McGrew (2012) that the IQ/adaptive behavior relation is moderate, indicating distinct yet related constructs. The results showed indeed that the correlation is likely to be stronger at lower IQ levels, a trend that spans the entire ID range, not just the severe range. The estimated true mean population correlation is .51, and study artifacts such as sampling error, measurement error, and range departure resulted in somewhat attenuated findings in individual studies (a difference of about .05 between observed and estimated true correlations overall).
The present study found the estimated true population mean correlation to be .51, meaning that adaptive behavior and intelligence share 26% common variance. In practical terms, this magnitude of relation suggests that an individual's IQ score and adaptive behavior composite score will not always be commensurate and will frequently diverge, and not by a trivial amount. Using the formula Ŷ = Ȳ + ρ(X - X̄), where Ŷ is the predicted adaptive behavior composite score, Ȳ is the mean adaptive behavior score in the population, ρ is the correlation between adaptive behavior and intelligence, X is the observed IQ score for an individual, and X̄ is the mean IQ score, and accounting for regression to the mean, the predicted adaptive behavior composite score corresponding to an IQ score of 70, given a correlation of .51, would be 85, a score that is a full standard deviation above an adaptive behavior composite score of 70, the cut score recommended by some entities to meet ID eligibility requirements. With a correlation of .51, and accounting for regression to the mean, an IQ score of 41 would be needed in order to have a predicted adaptive behavior composite score of 70. Considering that approximately 85% of individuals with ID have reported IQ scores between 55 and 70±5 (Heflinger et al., 1987; Reschly, 1981), the eligibility implications, especially for those with less severe intellectual impairment, are alarming. In fact, derived from calculations by Lohman and Korb (2006), only 17% of individuals obtaining an IQ score of 70 or below would be expected to also obtain an adaptive behavior composite score of 70 or below when the correlation between the two is .50.
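
The regression-to-the-mean arithmetic in the passage above can be checked with a few lines of Python. This is a minimal sketch, assuming both IQ and adaptive behavior composites are standard scores with a mean of 100 and the same standard deviation (so the SD ratio in the regression slope is 1); those means and the function names are assumptions introduced here, not values stated in the excerpt.

```python
def predicted_score(x, r, mean_x=100.0, mean_y=100.0):
    """Predicted Y for an observed X: Y_hat = mean_y + r * (x - mean_x).

    Assumes X and Y share the same standard deviation.
    """
    return mean_y + r * (x - mean_x)


def x_needed_for(y_target, r, mean_x=100.0, mean_y=100.0):
    """Invert the prediction: the X whose predicted Y equals y_target."""
    return mean_x + (y_target - mean_y) / r


def shared_variance_pct(r):
    """Percentage of variance two measures share, given their correlation."""
    return 100.0 * r ** 2


if __name__ == "__main__":
    r = 0.51  # estimated population correlation from the meta-analysis
    print(round(predicted_score(70, r)))   # 85
    print(round(x_needed_for(70, r)))      # 41
    print(round(shared_variance_pct(r)))   # 26
```

The same functions make it easy to see how sensitive these figures are to the size of the correlation: the weaker the correlation, the closer the predicted adaptive behavior score sits to the population mean.
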
The purpose of this study was to investigate the relation between IQ and adaptive behavior and variables moderating the relation using psychometric meta-analysis. The findings contributed in several ways to the current literature with regard to IQ and adaptive behavior. First, the estimated true mean population correlation between intelligence and adaptive behavior following correction for sampling error, measurement error, and range departure is moderate, indicating that intelligence and adaptive behavior are distinct, yet related, constructs. Second, IQ level has a moderating effect on the relation between IQ and adaptive behavior. The correlation is likely to be stronger at lower IQ levels, and weaker as IQ increases. Third, while not linear, age has an effect on the IQ/adaptive behavior relation. The population correlation is highest for very young children, and lowest for children between the ages of five and 12. Fourth, the magnitude of IQ/adaptive behavior correlations varies by disability type. The correlation is weakest for those without disability, and strongest for very young children with developmental delays. IQ/adaptive behavior correlations for those with ID are comparable to those with autism when not matched on IQ level. Fifth, the IQ/adaptive correlation when parents/caregivers serve as adaptive behavior respondents is comparable to when teachers act as respondents, but direct assessment of adaptive behavior results in a stronger correlation. Sixth, an individual's race does not significantly alter the correlation between IQ and adaptive behavior, but future research should evaluate the influence of race of the rater on adaptive behavior ratings. Seventh, the correlation between IQ and adaptive behavior varies depending on IQ measure used—the population correlation when Stanford-Binet scales are employed is significantly higher than when Wechsler scales are employed. 
And eighth, the correlation between IQ and adaptive behavior is not significantly different between adaptive behavior composite scores obtained from the Vineland, SIB, and ABAS families of adaptive behavior measures, which are among those that have been deemed appropriate for disability identification. Limitations of this study notwithstanding, it is the first to employ meta-analysis procedures and techniques to examine the correlation between intelligence and adaptive behavior and how moderators alter this relation. The results of this study provide information that can help guide practitioners, researchers, and policy makers with regard to the diagnosis or identification of intellectual and developmental disabilities.

Tuesday, May 8, 2018

NEUROSCIENCE & SOCIETY: Ethics, Law, and Technology Conference - Neuroethics & Law Blog

NEUROSCIENCE & SOCIETY: Ethics, Law, and Technology
24-25 August 2018
Sydney, NSW, Australia

Advances in brain scanning and intervention technologies are transforming our ability to observe, explain, and influence human thought and behaviour. Potential applications of such technologies (e.g. brain-based pain detection in civil lawsuits, medications to help criminal offenders become less impulsive, prediction of future behaviour through neuroimaging) and their ethical, clinical, legal, and societal implications fuel important debates in neuroethics. However, many factors beyond the brain – factors targeted by different emerging technologies – also influence human thought and behaviour. Sequencing the human genome and gene-editing technologies like CRISPR-Cas9 offer novel ways to explain and influence human thought and behaviour. Analysis of data about our offline and online lives (e.g. from fitness trackers, how we interact with our smartphone apps, and our social media posts and profiles) also provides striking insights into our psychology. Such intimate information can be used to predict and influence our behaviour, including through bespoke advertising for goods and services that more effectively exploits our psychology and political campaigns that sway election results. Although such methods often border on manipulation, they are both difficult to detect and potentially impossible to resist. The use of such information to guide the design of online environments, artifacts, and smart cities lies at the less nefarious – and potentially even socially useful and morally praiseworthy – end of the spectrum vis à vis the potential applications of such emerging "moral technologies".

At this year's Neuroscience & Society conference we will investigate the ethical, clinical, legal, and societal implications of a wide range of moral technologies that target factors beyond, as well as within, the brain, in order to observe, explain, and influence human thought and behaviour. Topics will include, but are not limited to:

  • cognitive and moral enhancement
  • neurolaw and neuro-evidence
  • brain-computer interfaces
  • neuro-advertising
  • neuromorphic engineering and computing
  • mental privacy and surveillance
  • social media and behaviour prediction/influence
  • implicit bias and priming
  • technological influences on human behaviour
  • nudging, environment and technology design, and human behaviour
  • artificial intelligence and machine learning
  • technology and the self
  • (neuro)technology and society

We invite abstracts from scholars, scientists, technology designers, policy-makers, practitioners, clinicians and graduate students, interested in presenting talks or posters on any of the above or related topics.

Abstracts of 300 words should be emailed to Cynthia Forlini in Microsoft Word format by Thursday, 31 May 2018. Submissions will be peer reviewed, and authors of successful submissions will be notified via email by Friday, 15 June 2018.

In addition to keynote presentations (to be announced shortly), contributed talks, and a poster session, the conference program will also include three sessions on the following topics:

  • highlights from, and information about enhancements to, the Australian Neurolaw Database
  • book symposium on Neuro-Interventions and The Law: Regulating Human Mental Capacity
  • panel on the topic of remorse
For enquiries about matters other than abstract submission, please email Adrian Carter or Jeanette Kennett.
Neuroscience & Society is supported by the ARC Centre of Excellence for Integrative Brain Function Neuroethics Program, and the Centre for Agency Values and Ethics at Macquarie University.

Saturday, April 28, 2018

Stability of intelligence from infancy through adolescence

Stability of intelligence from infancy through adolescence: An autoregressive latent variable model (article link)

Huihui Yu (a), D. Betsy McCoach (b), Allen W. Gottfried (c), Adele Eskeles Gottfried (d)

(a) Yale University, United States; (b) University of Connecticut, United States; (c) Fullerton Longitudinal Study, California State University, Fullerton, United States; (d) California State University, Northridge, United States


This study examined the stability of the latent construct of intelligence from infancy through adolescence, using latent variable modeling to account for measurement error. Based on the Fullerton Longitudinal Study data, the present study modeled general intelligence across four developmental periods from infancy through adolescence. The Fullerton Longitudinal Study included twelve assessments of intellectual performance over a sixteen-year interval. Three assessments of intellectual performance at each of four developmental periods served as indicators of latent intelligence during infancy (1, 1.5, and 2 years old), preschool (2.5, 3, and 3.5 years old), childhood (6, 7, and 8 years old), and adolescence (12, 15, and 17 years old). Intelligence exhibited a high degree of stability across the four developmental periods. For instance, infant intelligence revealed a strong cross-time correlation with preschool intelligence (r = 0.91) and moderate correlations with childhood and adolescent intelligence (r = 0.69 and 0.57, respectively). Intelligence followed a stage-autoregressive pattern whereby correlations between IQ scores decreased as the timespan between assessment waves increased. Further, from infancy to adolescence, the effect of intelligence during earlier periods was completely mediated by intelligence during the adjacent developmental period. In contrast to much prior research, this study demonstrated the stability of general intelligence, beginning in infancy.
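
The "completely mediated" autoregressive pattern described in the abstract has a simple arithmetic consequence, sketched below in Python. If each period influences only the next one, the correlation between infancy and any later period is the product of the adjacent standardized paths along the chain. The adjacent-period values computed here are implied by that simplifying assumption and by the three correlations quoted in the abstract; they are not estimates reported in the abstract, and the function names are hypothetical.

```python
# Latent correlations with infancy, as quoted in the abstract.
R_WITH_INFANCY = {"preschool": 0.91, "childhood": 0.69, "adolescence": 0.57}


def implied_adjacent_paths(r_with_first):
    """Back out adjacent-period paths under a pure first-order chain.

    In such a chain r(first, k) is the product of the paths up to k,
    so each path is the ratio of successive correlations.
    """
    paths = []
    previous = 1.0
    for r in r_with_first:
        paths.append(r / previous)
        previous = r
    return paths


def chain_correlation(paths):
    """Correlation between the first and last period: product of the paths."""
    product = 1.0
    for p in paths:
        product *= p
    return product


if __name__ == "__main__":
    paths = implied_adjacent_paths(list(R_WITH_INFANCY.values()))
    print([round(p, 2) for p in paths])         # [0.91, 0.76, 0.83]
    print(round(chain_correlation(paths), 2))   # 0.57
```

Read this way, the declining correlations with infancy (0.91, 0.69, 0.57) are what a chain of fairly strong adjacent-period links would produce as the time span grows.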

Thursday, April 26, 2018

Practice effects and progressive error practice effects on speeded tests

Journal of Intelligence

Response Time Reduction Due to Retesting in Mental Speed Tests: A Meta-Analysis (article link)

Jana Scharfen, Diego Blum and Heinz Holling


As retest effects in cognitive ability tests have been investigated by various primary and meta-analytic studies, most studies from this area focus on score gains as a result of retesting. To the best of our knowledge, no meta-analytic study has been reported that provides sizable estimates of response time (RT) reductions due to retesting. This multilevel meta-analysis focuses on mental speed tasks, for which outcome measures often consist of RTs. The size of RT reduction due to retesting in mental speed tasks for up to four test administrations was analyzed based on 36 studies including 49 samples and 212 outcomes for a total sample size of 21,810. Significant RT reductions were found, which increased with the number of test administrations, without reaching a plateau. Larger RT reductions were observed in more complex mental speed tasks compared to simple ones, whereas age and test-retest interval mostly did not moderate the size of the effect. Although a high heterogeneity of effects exists, retest effects were shown to occur for mental speed tasks regarding RT outcomes and should thus be more thoroughly accounted for in applied and research settings.

Keywords: meta-analysis; mental speed; processing speed; retest effect; practice effect; response time; reaction time; automatization

Thursday, April 19, 2018

The Flynn Effect and IQ Disparities Among Races, Ethnicities, and Nations: Are There Common Links? | Psychology Today

The Flynn Effect and IQ Disparities Among Races, Ethnicities, and Nations: Are There Common Links?

Connecting the Flynn Effect to racial, ethnic, and national disparities in IQ

The 20th century witnessed a dramatic increase in IQ, as much as 3 points per decade (see Are you smarter than Aristotle? Part I). The fact that IQ scores increased so much in such a short amount of time has raised many issues about the nature of intelligence, and what intelligence tests are measuring. For instance, while an individual's IQ test performance within a particular generation tends to be relatively stable and is determined by a complex mix of nature and nurture, such dramatic increases across generations demonstrate the potent influence of the environment on the development of cognitive abilities.

Multiple researchers have proposed theories to explain the Flynn effect. One of the most elaborate is Dickens and Flynn's 'social multiplier effect'. Their proposed effect takes into account the importance of culture in influencing what particular forms of intelligence it educates, spotlights, and nurtures.


I like to use breakdancing as an example (see IQ Bashing, The Flynn Effect, and Genes). Within a particular generation, really athletic individuals will tend to score higher on a wide variety of tests that require athleticism (a trait that is influenced both by genetic and environmental factors). Athletic individuals will tend to run faster, lift heavier weights, swim faster, and probably even look better breakdancing. But imagine that breakdancing suddenly became an Olympic sport (I can only dream). In this imaginary world, society suddenly shifts interest in basketball to breakdancing. We drop more money into educating everyone in the fine art of the baby-freeze, the windmill, and the headstand. Breakdancing becomes a craze, appearing in grade school classrooms, on streets, and on all sorts of job applications. What would come about as a result?

This sort of situation would up the ante on breakdancing skills. Sure, those naturally inclined toward athleticism would still have a breakdancing advantage, but the average standard of breakdancing performance would be greatly increased. In order to remain competitive, aspiring breakdancers would have to step their game up and learn increasingly complex moves. Given enough generations with such high levels of breakdancing training, you would start to see a rise in mean scores on tests of breakdancing ability.

This breakdancing example also applies to the rise seen in IQ scores across generations. Within each generation, people who tend to do well on one test of cognitive ability will tend to do well on other tests that tap to some extent complex reasoning ability. But across generations, the particular types of tests that show the most dramatic increases indicate to a considerable degree our cultural priorities. The Flynn Effect serves as a reminder that when we give people more opportunities to prosper, more people do prosper. We've come quite a long way since the pre-industrial revolution in terms of our cultural emphasis on reading, writing, abstract reasoning, and scientific thinking. The Flynn Effect is a partial indicator of this progress (see Are you smarter than Aristotle?: On the Flynn Effect and the Aristotle Paradox).

Over the years, various 'social multipliers' (Dickens & Flynn, 2006) have been proposed to account for the Flynn Effect, including increased nutrition, increased test familiarity, heterosis, increased scientific education, video games, TV show complexity, modernization, and more. Surely a combination of factors contributed to the rise. In this post, though, I want to focus on a few changes over the course of the past 100 years that have particular implications for understanding racial, ethnic, and national disparities in IQ. First let's look at literacy.

Literacy involves the ability to write, read, and comprehend information of varying levels of complexity. It is estimated that there are 774 million illiterate adults in the world, 65% of whom are women (UNESCO Institute for Statistics, 2007). In the United States alone, 5% of the adult population is completely nonliterate (Kirsch, Jungeblut, Jenkins, & Kolstad, 1993). Self-reported literacy skills of both White and Black populations of the U.S. have been increasing steadily since 1870, however (National Center for Education Statistics, 1993). One study showed that the IQ and literacy scores of Blacks increased in parallel from 1980 to 2000 (Dickens & Flynn, 2006).

The importance of being able to read for performance on an IQ test cannot be overstated. Instead of measuring 'intelligence' in an illiterate test-taker, the test is measuring that person's inability to read. While 'intelligence' may certainly influence an individual's ability to read, society has a lot of influence on how many inhabitants even get the chance to read in the first place regardless of the intelligence level of any single individual. Therefore, reading skills may exert important effects on particular races and nationalities that have historically undergone much discrimination and as a result, limited opportunity for literacy development.

An enormous body of evidence collected over the past 50 years shows that different ethnicities and races within a country tend to show substantial differences in their average level of IQ. Some researchers argue that this gap is narrowing (Dickens & Flynn, 2006) whereas others argue that the IQ gap has remained stable (Murray, 2006). IQ test score discrepancies are also found between nations. For instance, sub-Saharan African countries have demonstrated statistically significantly lower IQs than other nations (Lynn, 2006, 2008). These findings have led some researchers to propose that such IQ gaps found across ethnicities, races, and nationalities suggest a difference in innate brain capacity (see Lynn & Vanhanen, 2006).

Until recently, the Flynn Effect and the IQ gaps found between different ethnicities, races, and nationalities had not been tied together. For the first time, psychologist David F. Marks systematically analyzed the association between literacy skills and IQ across time, nationality, and race (Marks, 2010).

If increasing literacy were really explaining a number of seemingly different IQ trends, then you would expect to see a few things. First, within a population you should expect increased education of literacy skills to be associated with an increase in the average IQ of that population. Second, IQ gains should be most pronounced in the lower half of the IQ bell curve since this is the section of the population that prior to the education would have obtained relatively lower scores due to their inability to comprehend the intelligence test's instructions. With increased literacy, you should expect to see a change in the skewness of the IQ distribution from positive to negative as a result of higher rates of literacy in the lower half of the IQ distribution (but very little change in the top half of the distribution). You should also expect to see differences on the particular intelligence test subscales, with increased literacy showing the strongest effects on verbal tests of intelligence and minimal differences on other tests of intelligence. If all these predictions hold up, there would be support for the notion that secular IQ gains and race differences are not different phenomena but have a common origin in literacy.
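To make the skewness prediction concrete, here is a minimal sketch of how the sign of skewness can be computed for a set of scores. The function name and the two score lists are hypothetical illustrations, not data from Marks (2010), and the moment-based formula used here is only one of several common skewness estimators.

```python
def skewness(scores):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n
    m3 = sum((s - mean) ** 3 for s in scores) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all scores are equal")
    return m3 / m2 ** 1.5

# Hypothetical score lists, invented purely for illustration.
right_tailed = [85, 88, 90, 92, 95, 130]   # long tail toward high scores
left_tailed = [70, 105, 108, 110, 112, 115]  # long tail toward low scores

print(skewness(right_tailed) > 0)  # positive skew
print(skewness(left_tailed) < 0)   # negative skew
```

A positive value means the longer tail lies toward high scores and a negative value means it lies toward low scores, so comparing the sign before and after a change in literacy rates is one simple way to check this prediction.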

To test these predictions, Marks looked at samples representative of whole populations (rather than individuals), and used ecological methods to calculate statistical associations between IQ and literacy rates across different countries. Were Marks' findings consistent with the predictions?

Strikingly, yes. He found that the higher the literacy rate of a population, the higher that population's mean IQ, and the higher that population's mean IQ, the higher the literacy rate of that population. When literacy rates declined, mean IQ also declined. Marks also found evidence for unequal improvements across the entire IQ spectrum: the greatest effects of increased literacy rates were on those in the lower half of the IQ distribution. Interestingly, he also found that both the Flynn Effect and racial/national IQ differences showed the largest effects of literacy on verbal tests of intelligence, with the perceptual tests of intelligence showing no consistent pattern.

It must be noted that literacy wasn't the only factor responsible for the Flynn effect. Adopting the Cattell-Horn-Carroll (C-H-C) framework (McGrew, 2005, 2009), Marks found that Visual Processing (Gv) and Processing Speed (Gs) also made important contributions.

It should also be noted that Marks' findings only speak to populations (not individuals) and do not say much about causation. The findings can only definitively say that some not-yet-identified variable is causing both literacy and IQ scores to change. To really test for causation, future experimental studies should look at the effect of literacy intervention on IQ scores in comparison with a control group not receiving literacy intervention, and should also investigate intervening variables that affect both literacy and IQ. Still, the result that population-level literacy changes with population IQ is suggestive that increased literacy is causing increased IQ.

Even though there is still much work to be done, these findings have some very strong implications for our understanding of the Flynn effect, the nature of intelligence, and the origin of race and secular differences in intelligence.


In Herrnstein & Murray's 1994 book The bell curve: Intelligence and class structure in American life, most of their controversial claims about IQ differences, ethnicity, and social issues came from the United States Department of Labor's National Longitudinal Survey of Youth. This survey includes the Armed Forces Qualifications Test, which was developed by the Department of Defense and measures the ability of potential recruits to learn how to perform military duties. Since many of Herrnstein & Murray's conclusions were based on this test, it's important to really examine what that test measures.

Marks did just that by scanning the literature for datasets containing test estimates for populations of groups taking both the Armed Forces Qualifications Test and tests of literacy. One study on nine groups of soldiers differing in job and reading ability found a correlation of .96 between the Armed Forces Qualifications Test and reading achievement (Sticht, Caylor, Kern, & Fox, 1972). Another study showed significant improvements among Black and Hispanic populations in their Armed Forces Qualifications Test scores between 1980 and 1992 while Whites only showed a slight decrement (Kilburn, Hanser, & Klerman, 1998). Another study obtained reading scores for 17-year-olds for those same ethnic groups and dates (Campbell et al., 2000) and found a correlation of .997 between reading scores and Armed Forces Qualifications Test scores. This nearly perfect correlation was based on six pairs of data points from six independent population samples evaluated by two separate groups of investigators. As Marks notes,

"On the basis of the studies summarized here, there can be little doubt that the Armed Forces Qualifications Test is a measure of literacy."
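For readers who want to see how a group-level correlation like the ones reported above is computed, here is a minimal sketch. The six pairs of group means are hypothetical numbers invented for illustration; they are not the values from Campbell et al. (2000), Kilburn et al. (1998), or Marks (2010), and the function and variable names are my own.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical group means: one pair per population sample (six pairs in total).
reading_means = [240.0, 250.0, 265.0, 270.0, 290.0, 295.0]
aptitude_means = [28.0, 33.0, 39.0, 42.0, 50.0, 53.0]

r = pearson_r(reading_means, aptitude_means)
print(round(r, 3))
```

Because each data point here is a group mean rather than an individual score, a correlation of this kind describes how populations line up, not how any one person's reading score relates to that person's test score.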

The Flynn Effect was intriguing all by itself. Now that researchers have shown common linkages between the Flynn Effect and racial, ethnic, and national disparities, there are even more questions to be answered and potential research avenues to be explored. The Marks study suggests that a crucial environmental factor is literacy. If this is so, then interventions that increase literacy should also narrow the IQ gap found between different races and nationalities.

Literacy intervention can take many forms though, both direct and indirect. Researchers should consider not just improved access to schooling but also the many other conditions that may affect literacy rates. For instance, recent research shows the important effects of parasites and pathogens on a nation's intelligence (see the recent article in The Economist called Mens sana in corpore sano). Christopher Eppig and colleagues argue in their recent article in Proceedings of the Royal Society that the Flynn effect may be caused in part by the decrease in the intensity of infectious diseases as nations develop. Looking at data from 192 countries and 28 infectious diseases in those countries, they found that the higher the disease burden of a population, the lower that population's mean IQ level, with robust correlations ranging from -0.76 to -0.82. The chance that this correlation came about at random is reported by The Economist to be less than one in 10,000. Interestingly, when Eppig and colleagues controlled for other contributing variables to national differences in IQ (temperature, distance from Africa, gross domestic product per capita, and various measures of education), infectious disease remained the most powerful predictor of average national IQ.

These results suggest that infections and parasites such as intestinal worms, malaria, and perhaps most importantly (according to Eppig and colleagues) bugs that cause diarrhea, can all have important effects on both literacy rates and IQ scores. The good news is that disease interventions such as vaccinations, clean water and proper sewage can have quite outstanding effects on multiple areas of cognition.


This latest research on the effects of nutrition (Colom et al., 2005, but see Flynn, 2009), disease, literacy, and other environmental factors on both the rise in IQ and ethnic, racial, and national disparities in IQ points to the importance of the environment for developing intelligence. It also shows how careful researchers need to be when they use intelligence test performance (especially on verbal tests) to make inferences about hereditary differences between different ethnic groups and nationalities.

© 2010 by Scott Barry Kaufman

Note: Portions of this post originally appeared as a guest post on the blog Intelligent Insights on Intelligence Theories and Tests (see original post here), which is run by legendary IQ test maker, theorist, and researcher Kevin McGrew. I'm a long-time follower of his blog and am honored to guest post for him.

Acknowledgments: Thanks to Louisa Egan for bringing The Economist article to my attention.

***Update*** Over at Kevin McGrew's blog, Bob Williams wrote an extensive reply to my post. You can read his very different perspective here.


Campbell, J. R., Hombo, C. M., & Mazzeo, J. (2000) Trends in academic progress: three decades of student performance, NCES 2000-469. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement, National Center for Education Statistics, NAEP 1999.

Colom, R., Lluis-Font, J. M., & Andrés-Pueyo, A. (2005) The generational intelligence gains are caused by decreasing variance in the lower half of the distribution: supporting evidence for the nutrition hypothesis. Intelligence, 33, 83-91.

Dickens, W. T., & Flynn, J. R. (2006) Black Americans reduce the racial IQ gap: evidence from standardization samples. Psychological Science, 17, 913-920.

Eppig, C., Fincher, C.L., & Thornhill, R. (2010). Parasite prevalence and the worldwide distribution of cognitive ability. Proceedings of the Royal Society B, doi: 10.1098/rspb.2010.0973.

Flynn, J. R. (2009) Requiem for nutrition as the cause of IQ gains: Raven's gains in Britain 1938 to 2008. Economics and Human Biology, 7, 18-27.

Herrnstein, R. J., & Murray, C. (1994) The bell curve: Intelligence and class structure in American life. New York: Free Press.

Kilburn, M. R., Hanser, L. M., & Klerman, J. A. (1998) Estimating AFQT scores for National Educational Longitudinal Study (NELS) respondents. Santa Monica, CA: RAND Distribution Services.

Kirsch, I. S., Jungeblut, A., Jenkins, L., & Kolstad, A. (1993) Adult literacy in America: A first look at the results of the National Adult Literacy Survey. Princeton, NJ: Educational Testing Service.

Lynn, R. (2006) Race differences in intelligence: an evolutionary analysis. Augusta, GA: Washington Summit.

Lynn, R. (2008) The global bell curve. Augusta, GA: Washington Summit.

Lynn, R., & Vanhanen, T. (2002) IQ and the wealth of nations. Westport, CT: Praeger.

Marks, D.F. (2010). IQ variations across time, race, and nationality: An artifact of differences in literacy skills. Psychological Reports, 106, 3, 643-664.

McGrew, K. S. (2005) The Cattell-Horn-Carroll theory of cognitive abilities: past, present, and future. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: theories, tests, and issues. (2nd ed.) New York: Guilford. Pp. 136-182.

McGrew, K. (2009). Editorial. CHC theory and the human cognitive abilities project. Standing on the shoulders of the giants of psychometric intelligence research, Intelligence, 37, 1-10.

Murray, C. (2006) Changes over time in the Black-White difference on mental tests: evidence from the children of the 1979 cohort of the National Longitudinal Survey of Youth. Intelligence, 34, 527-540.

National Center for Education Statistics. (1993) 120 years of American education: a statistical portrait. (T. Snyder, Ed.) Washington, DC: U.S. Department of Education, Institute of Education Sciences, NCES 1993.

Sticht, T. G., Caylor, J. S., Kern, R. P., & Fox, L. C. (1972) Project REALISTIC: determination of adult functional literacy skill levels. Reading Research Quarterly, 7, 424-465.

Tuesday, April 17, 2018

The Flynn Effect Reference Project document has been updated

The Flynn Effect Reference Project document has just been updated.  It now includes 296 references.  Access can be found at this prior post (click here).

Court Decision: Russell v Mississippi (2018).

Russell v Mississippi (2018) is now available. This is a revision of an April 5:4 denial of a rehearing from an original loss in December.  This case has been remanded to the state for evaluation by the state's expert.

Saturday, April 14, 2018

Speed and the Flynn effect research study

Speed and the Flynn Effect (article link)

Olev Must and Aasa Must

Keywords: Flynn Effect, NIT, Speed, Tork, Estonia


We investigated the role of test-taking speed on the Flynn Effect (FE). Our study compared two cohorts of Estonian students (1933/36, n = 888; 2006, n = 912) using 9 subtests from the Estonian adaptation of the National Intelligence Tests (NIT). The speededness of the items and the subtests was found by determining the proportion of unreached items from among the total number of errors (Stafford, 1971). The test-taking speed of the younger cohort was higher in all 9 of the subtests. This suggests that the younger cohort is able to solve more items than the older one. The lack of measurement invariance at the item and subtest level was quantitatively estimated using a method proposed by Dimitrov (2017). The test-taking speed and the non-invariance of the items was strongly, yet inversely correlated (up to - 0.89). The subtests versions that consisted of only invariant items showed no, or a small positive, FE. The subtest versions consisting of only speeded items showed a large positive FE, with cohort differences of up to 50%. If the requirement of measurement invariance is ignored then this effect becomes apparent. The rise in test-taking speed between cohorts can be attributed to an increase in automated responses, which is an outgrowth of modern education (differences in the mandatory age of school attendance, and in the student's readiness to solve abstract items also affected the test-taking speed of the cohorts). We were able to conclude that the younger cohort is faster than the older one.
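The abstract describes speededness as the proportion of unreached items among the total number of errors. A minimal sketch of that ratio follows. It assumes that "errors" means wrong, omitted, and unreached items added together, which is my reading of the abstract and not something the authors spell out here. The function name and the counts are hypothetical.

```python
def speededness(wrong: int, omitted: int, unreached: int) -> float:
    """Proportion of unreached items among all errors.

    Assumption: total errors = wrong + omitted + unreached.
    Returns 0.0 when there are no errors at all.
    """
    if min(wrong, omitted, unreached) < 0:
        raise ValueError("counts must be non-negative")
    total_errors = wrong + omitted + unreached
    if total_errors == 0:
        return 0.0
    return unreached / total_errors

# Hypothetical counts for one subtest, invented for illustration.
index = speededness(wrong=10, omitted=0, unreached=30)
print(index)  # 0.75: most errors come from running out of time
```

A value near 1 would suggest that most errors come from items the test-taker never reached (a speeded test), while a value near 0 would suggest that errors come mainly from items that were attempted or skipped.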

- Posted using BlogPress from my iPad

Friday, April 6, 2018

AJT CHC Intelligence Test launch in Jakarta - a measure of 9 broad CHC abilities

Yesterday’s AJT CHC cognitive test launch in Jakarta was a big success. I was taken aback by the special “event” flavor. Extremely professional. As I’ve stated before, the AJT is based on an Indonesian norm sample of 4,800 and will be one of the most comprehensive intelligence tests in the world (on par with the WJ IV COG). It measures 9 broad CHC domains (Gf, Gc, Gwm, Ga, Gv, Gs, Gl, Gr, and some of Gp, separate from cognitive). This has been the most personally rewarding and important project I have worked on in my 40+ years in psychology and education. It is bringing the core concept of individual differences to the education system of the fourth largest country in the world.

George and Laurel Tahija (see picture below), and their YDB foundation, are the visionaries behind this project and other projects focused on trying to help unique learners in their country. In my five years on this project I can say that I’ve never worked with so many nice people. It was a grand effort by many. I am very impressed by how together we built such a comprehensive and technically sound battery of tests from scratch. I have developed a fondness for Indonesia and the people of this wonderful country. The genuine warmth and enthusiasm of the participants was personally moving.

For more information check out these two links (one; two)

Tuesday, April 3, 2018

NEW RESOURCES: University of Virginia Interactive Database Maps the Modern Death Penalty

NEW RESOURCES: University of Virginia Interactive Database Maps the Modern Death Penalty
// Death Penalty Information Center

The University of Virginia School of Law has created a new interactive web resource that allows researchers and the public to visually explore death-sentencing practices in the United States from 1991 through 2017. The interactive map provides county-level data on death sentences imposed across the United States, drawing from a new database created by University of Virginia Law Professor Brandon Garrett for his recent book, End of Its Rope: How Killing the Death Penalty Can Revive Criminal Justice. The interactive map, which is a web supplement to the book, permits users to view where and how many death sentences were imposed in the U.S. each year, and to contrast and compare sentencing patterns over time in states, counties, and the U.S. as a whole. Using a slider to view chronological shifts in sentencing patterns, the map illustrates how death sentences have declined nationwide and become increasingly isolated to a few outlier counties. "This is the first resource to map out modern death sentencing in the United States," Garrett said. "The mapping vividly shows how geographically isolated death sentencing has become." The data forms the backbone of End of Its Rope, in which Garrett analyzes the dramatic decline in the use of the death penalty over the last 25 years. The publicly available database contains information on more than 5,000 death sentences, allowing researchers and lawyers to analyze patterns and trends. "Several researchers, in addition to those of us at UVA, have already made use of the data, and we hope that more do so in the future," Garrett said. Garrett worked with a UVA Law librarian, law students, and undergraduates to compile the data from government records, court rulings, and other sources.
(Eric Williamson, Mapping the Modern Death Sentence, University of Virginia School of Law, March 27, 2018.) Explore the interactive map here. See Books and Sentencing.


Kevin McGrew, PhD
Educational Psychologist 
Institute for Applied Psychometrics

Monday, March 12, 2018

CHC intelligence theory update: Live chat (this Sunday evening) or later viewing on YouTube

I am looking forward to talking about the Cattell-Horn-Carroll (CHC) model of intelligence on the #psychedpodcast this Sunday evening.

I will present material largely based on the forthcoming CHC chapter coauthored with Dr. Joel Schneider.  Tune in. It shall be fun. Or, watch the discussion later on YouTube, and eventually as an audio podcast on iTunes.

Friday, March 2, 2018

BB (blatant brag): McGrew CHC 2009 article in Intelligence #1 (2008-2015) and top 10 all time

This was a pleasant surprise. I knew my 2009 Intelligence article was cited frequently but I never knew it was number one from 2008-2015 and it made the top 10 all time list for the journal Intelligence. I believe this is a reflection of the impact the CHC taxonomy has had. This should make my mom proud. Here is a link to the original article.

Bibliometric analysis across eight years 2008–2015 of Intelligence articles: An updating of Wicherts (2009). Article link.

Bryan J. Pesta


I update and expand upon Wicherts' (2009) editorial in Intelligence. He reported citation counts of papers published in this journal from 1977 to 2007. All these papers are now at least a decade old, and many more new articles have been published since Wicherts' analysis. An updated study is needed to help (1) quantify the journal's more recent impact on the scientific study of intelligence, and (2) alert researchers and educators to highly-cited articles; especially newer ones. Thus, I conducted a bibliometric analysis of all articles published here from 2008 to 2015. Data sources included both the Web of Science (WOS), and Google Scholar (GS). The eight-year set comprised 619 articles, published by 1897 authors. The average article had 17.0 (WOS), and 32.9 (GS) citations overall (2.75, and 5.33 citations per year, respectively). These metrics compare favorably with those from other psychology journals. In addition, a list of the most prolific authors is provided. Also reported is a list showing many articles in this set with counts greater than one hundred, and an updated top 25 list for the history of this journal.

“Also noteworthy is that nine of the articles in the old list (not shown here) dropped off the new list. Of their replacements, only three of the nine were published within the last decade: Deary, Strand, Smith, and Fernandes (2007); McGrew (2009), and Strenze (2007). The McGrew (2009) paper is again notable. It is the only article in my newer set (2008–2015) to make the all-time list. The paper ranks ninth on the all-time list with 281 citations, just eight years after being published.”

More recent Google Scholar citation info indicates that the article is still going strong from 2016-2017.

Monday, February 19, 2018

Does the rot start at the top? On a current Flynn effect study argument

Does the rot start at the top?

As readers of this blog will know, it is usually Woodley of Menie who darkens these pages with talk of genetic ruin, while James Flynn is the plucky New Zealander…

Sunday, January 28, 2018

Research Byte: Psychological and Cognitive Aspects of Borderline Intellectual Functioning : A Systematic Review

Psychological and Cognitive Aspects of Borderline Intellectual Functioning: A Systematic Review

Contena, B., & Taddei, S. (2017). Psychological and Cognitive Aspects of Borderline Intellectual Functioning. European Psychologist. Article link.
Bastianina Contena and Stefano Taddei


Borderline Intellectual Functioning (BIF) refers to a global IQ ranging from 71 to 84, and it represents a condition of clinical attention for its association with other disorders and its influence on the outcomes of treatments and, in general, quality of life and adaptation. Furthermore, its definition has changed over time causing a relevant clinical impact. For this reason, a systematic review of the literature on this topic can promote an understanding of what has been studied, and can differentiate what is currently attributable to BIF from that which cannot be associated with this kind of intellectual functioning. Using Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) criteria, we have conducted a review of the literature about BIF. The results suggest that this condition is still associated with mental retardation, and only a few studies have focused specifically on this condition.
Keywords: borderline intellectual functioning, borderline mental retardation, intellectual disability, systematic review

Wednesday, January 17, 2018

Validity, Interrater Reliability, and Measures of Adaptive Behavior: Concerns Regarding the Probative Versus Prejudicial Value

Psychology, Public Policy, and Law. Article link.

Karen L. Salekin, The University of Alabama
Tess M. S. Neal, Arizona State University
Krystal A. Hedge, Federal Medical Center, Devens, Massachusetts

The question as to whether the assessment of adaptive behavior (AB) for evaluations of intellectual disability (ID) in the community meets the level of rigor necessary for admissibility in legal cases is addressed. AB measures have made their way into the forensic domain, in which scientific evidence is put under great scrutiny. Assessment of ID in capital murder proceedings has garnered a lot of attention, but assessments of ID in adult populations also occur with some frequency in the context of other criminal proceedings (e.g., competence to stand trial, competence to waive Miranda rights), as well as eligibility for social security disability, social security insurance, Medicaid/Medicare, government housing, and postsecondary transition services. As will be demonstrated, markedly disparate findings between raters can occur on measures of AB even when the assessment is conducted in accordance with standard procedures (i.e., the person was assessed in a community setting, in real time, with multiple appropriate raters, when the person was younger than 18 years of age), and similar disparities can be found in the context of the unorthodox and untested retrospective assessment used in capital proceedings. With full recognition that some level of disparity is to be expected, the level of disparity that can arise when these measures are administered retrospectively calls into question the validity of the results and, consequently, their probative value.

Keywords: adaptive behavior measures, Atkins, forensic evaluations, validity, interrater reliability

Friday, January 12, 2018

Five Factor Model personality disorder scales: An introduction to a special section on assessment of maladaptive variants of the five factor model.

Five Factor Model personality disorder scales: An introduction to a special section on assessment of maladaptive variants of the five factor model.
// Psychological Assessment - Vol 22, Iss 2

The Five-Factor Model (FFM) is a dimensional model of general personality structure, consisting of the domains of neuroticism (or emotional instability), extraversion versus introversion, openness (or unconventionality), agreeableness versus antagonism, and conscientiousness (or constraint). The FFM is arguably the most commonly researched dimensional model of general personality structure. However, a notable limitation of existing measures of the FFM has been a lack of coverage of its maladaptive variants. A series of self-report inventories has been developed to assess for the maladaptive personality traits that define Diagnostic and Statistical Manual of Mental Disorders (fifth edition; DSM–5) Section II personality disorders (American Psychiatric Association [APA], 2013) from the perspective of the FFM. In this paper, we provide an introduction to this Special Section, presenting the rationale and empirical support for these measures and placing them in the historical context of the recent revision to the APA diagnostic manual. This introduction is followed by 5 papers that provide further empirical support for these measures and address current issues within the personality assessment literature. (PsycINFO Database Record (c) 2018 APA, all rights reserved)

Sunday, January 7, 2018

Research Byte: False Confessions: How Can Psychology So Basic Be So Counterintuitive?

False Confessions: How Can Psychology So Basic Be So Counterintuitive?

American Psychologist, 2017, Vol. 72, No. 9, 951–964. Article link.

Saul M. Kassin, John Jay College of Criminal Justice of CUNY

Recent advances in DNA technology have shined a spotlight on thousands of innocent people wrongfully convicted for crimes they did not commit, many of whom had been induced to confess. The scientific study of false confessions, which helps to explain this phenomenon, has proved highly paradoxical. On the one hand, it is rooted in reliable core principles of psychology (e.g., research on reinforcement and decision-making, obedience to authority, and confirmation biases). On the other hand, false confessions are highly counterintuitive if not inconceivable to most people (e.g., as seen in actual trial outcomes as well as studies of jury decision making). This article describes both the psychology underlying false confessions and the psychology that predicts the counterintuitive nature of this same phenomenon. It then notes that precisely because they are so counterintuitive, false confessions are often “invisible,” resulting in a form of inattentional blindness, and are slow to change in the face of contradiction, illustrating belief perseverance. This article concludes by suggesting ways in which psychologists can help to prevent future miscarriages of justice by advocating for reforms to policy and practice and helping to raise public awareness.

Keywords: interrogation, false confessions, confirmation bias, social influence, wrongful convictions

Monday, December 18, 2017

Atkins court decision: Farad Roland v USA (NJ; 2018)

Today the opinion regarding the Atkins ID decision for Farad Roland was issued.  As per my policy, having served as an expert witness in this particular case, I offer no comments.  The opinion can be found here.

Friday, December 15, 2017

Does the rot start at the top? New different Flynn effect research

Does the rot start at the top?

From Twitter, a Flipboard magazine by James Thompson

As readers of this blog will know, it is usually Woodley of Menie who darkens these pages with talk of genetic ruin, while James Flynn is the plucky New Zealander…

