Week 37: Russell Gray et al. (2003) Evolutionary Psychology and the Challenge of Adaptive Explanation

Russell Gray, Megan Heaney, and Scott Fairhall note that most biologists would be happy to agree that humans are animals, and that humans are the products of evolution. But they note that many are uncomfortable with the implications of this and would answer “no” to the following question: “if humans are the products of evolution, then can we explain human behaviour in evolutionary terms?” So, they ponder, what is going on here? Do biologists suddenly become creationists when it comes to the human mind?

Gray and colleagues argue in this paper that whilst there is nothing wrong in principle with taking an evolutionary approach to human behaviour and cognition, the current field of Evolutionary Psychology – as espoused by the Santa Barbara school – has not been done very well. So much so that, as Gray and colleagues state, the empirical research is “embarrassing” (see the conclusion of this review for a particularly galling example).

Here Gray and colleagues are drawing a distinction between evolutionary psychology in a broad sense, meaning simply any view that takes an evolutionary approach to cognition, and Evolutionary Psychology (or EP). EP is a research program associated with “the Santa Barbara church of evolutionary psychology” (whose most famous proponents are John Tooby, Leda Cosmides, and Steven Pinker). EP is a nativist position that proposes that the mind is composed of a large collection of modules that evolved through natural selection in the Pleistocene to tackle the everyday problems of our ancestors.

This modular aspect is referred to as “massive modularity” (Carruthers 2006) and marks an important departure from Jerry Fodor’s original statement of the modular mind thesis. It is worth starting with Fodor (1983) because, as Max Coltheart (1999) has argued, there is much misconception about what he meant by modules. Fodor originally proposed that the mind was a partially modular system: modularity applied to the sensory modalities and what are, according to this view, peripheral systems (i.e. not central processing). And he proposed a set of defining features to identify modules: they operate very fast and automatically (i.e. they don’t require conscious effort); they are domain-specific (e.g. vision, memory, etc.); they are informationally encapsulated (i.e. higher-level beliefs and desires do not impact on how they operate); and they are innate (in that they are not acquired from a cultural niche through ontogeny). However, as Coltheart stresses, these were not to be seen as necessary and sufficient conditions, but rather as typical characteristics.

There has since been much discussion about whether any cognitive domains have these features (e.g. see Prinz 2006 for a critical discussion). But for our purposes here, the key point is that Fodor thought that modules were only a partial description of the functional properties of the mind. In contrast, EP proponents took the idea of innate modules and extended its application to all aspects of the human mind. Hence the name massive modularity. Importantly, whereas for Fodor the defining aspect of a module was informational encapsulation, for EP proponents the most important aspect is that modules are domain-specific. Their claim is that it would be easier for evolution to produce a wide variety of differing subsystems that each achieve one goal than to build an expensive domain-general system. The tagline for the resulting position is: ‘the human mind is like a Swiss army knife’.

[Image: modules as Swiss army knives]

Having established the philosophical background of the debate, we can now return to Gray and colleagues’ paper. They note that the critique they are putting forth sits alongside a wide range of other attacks on EP, which target, amongst other things:

  • Its monomorphic view of the mind (i.e. the mind having only one form in the population)
  • Massive modularity (see week 9 for a critique of this in regards to neural reuse and the relationship of functions and evolutionary history)
  • Its “cartoon” view of the Pleistocene (see work by Richard Potts for a nice discussion of the volatility of our ancestors’ past)

Their aim here is to supplement this discussion by examining how EP handles adaptive explanation. EP takes the view that in order to understand a particular cognitive trait one does not try to explain its function in its current context but rather engages in what Pinker (1997) labels “reverse engineering”: i.e. taking the current features of human cognition and then speculating on how they have been shaped by natural selection. This involves considering how these traits evolved in what Tooby & Cosmides (1990) label the “environment of evolutionary adaptedness” (EEA) – a smorgasbord amalgamation of the average conditions of the Pleistocene two million years ago. The problems our ancestors faced in the EEA, so the EP proponent claims, are the selection pressures that shaped our current cognitive profile.

Gray and colleagues note that adaptationist explanations of this kind could only work if three criteria were met:

  1. All traits were adaptations
  2. The traits to be given an adaptive explanation could be easily characterised
  3. Plausible adaptive explanations were difficult to come by

They go on to note that if all three of these criteria were met, then a plausible adaptive explanation of a particular trait would have a high probability of being correct. The problem is that all three are frequently not true.

Firstly, evolutionary biology is not strictly adaptationist. The strictly adaptationist position came to be known as “Panglossian” (Gould & Lewontin 1979), and most biologists now accept a wide variety of evolutionary processes besides natural selection. Gray and colleagues give the examples of genetic drift, pleiotropy, epistasis, spandrels, exaptations, developmental constraints, and phenotypic integration. They add that the debate within evolutionary biology is about the importance of these other processes relative to natural selection.

Secondly, traits are very hard to define, for at least two reasons: [1] it can be hard to pick out a discrete phenomenon; and [2] it can be hard to identify the problem or environmental issue that the trait is a response to (instead there is an “ongoing mosaic” of problems).

Thirdly, adaptive explanations are ten-a-penny and are only really constrained by the inventiveness of the author. To demonstrate this, Gray and colleagues give the examples of a stegosaurus’ bony plates (they could be adaptations for avoiding predators, cooling the body, sexual display, etc.) and large hominin brains (they could be adaptations for ecological problem solving, throwing objects, or thermoregulation, or be explained by the social brain hypothesis, the aquatic ape hypothesis, etc.). Just conceiving of an adaptive explanation does not mean it is correct!

To deal with this heterogeneity, evolutionary biologists employ a combination of:

  • Developmental studies and phylogenetic analyses: these can help identify appropriate traits, and they identify the range of phenotypic variation in the trait on which selection can then act.
  • Engineering-style optimality models: these can help identify aspects of “good design” in a trait in a quantitative manner.
  • Comparative tests: tests based on explicit phylogenetic methods, used to discriminate between competing adaptive as well as non-adaptive explanations.

They then demonstrate how these methods work in relation to some examples from evolutionary biology and draw out several key points, including that adaptation is a special concept that should only be used when necessary (G. C. Williams). In contrast, they think that most EP accounts are just folk storytelling that has not properly considered the complexity of the multiple issues here. To avoid this they propose two tests:

  1. The Grandparent test: a filter for folk wisdom and plausible post hoc stories – i.e. “Does this work give us any insight into human behaviour and cognition beyond popular knowledge?”
  2. The Lesser Spotted Gerbil test: “would this research be publishable in major international journals if the species was a small non-charismatic mammal rather than our own?”

They assert that most EP claims fail these two tests. But to show that they are not just attacking a straw man, they then discuss what they see as two of the strongest examples in EP: waist-to-hip ratio (WHR) and cheater detection. They go through these examples in extreme detail, comparing various versions of the experiments and the data.


WHR: according to Devendra Singh and colleagues, young men favour a particular WHR (0.7) when judging the attractiveness of women. Singh and colleagues go on to claim that this is a cross-cultural finding after looking at just a few non-WEIRD (Western, Educated, Industrialised, Rich, and Democratic) populations. Gray and colleagues note that WHR passes the lesser spotted gerbil test. But they also note that there are some peculiarities with the experimental findings. Singh and colleagues got participants to rate stick and line drawings (see picture above), which are not very ecologically salient. Furthermore, there is a methodological flaw with these drawings: when the WHR in the drawings is altered, the apparent Body Mass Index (BMI) is also altered. And Gray and colleagues note that a further study, which tried to identify whether BMI or WHR is the crucial factor here, found that BMI is more important.
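The confound can be illustrated with a toy calculation. The widths and the `mass_proxy` measure below are hypothetical and are not taken from Singh's stimuli or from the studies Gray and colleagues discuss; the sketch only assumes that WHR is changed by narrowing the waist while the hips stay fixed.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical line drawings: hips fixed, waist varied (arbitrary units).
HIP_WIDTH = 100.0
WAIST_WIDTHS = [70.0, 80.0, 90.0, 100.0]

# Waist-to-hip ratio of each drawing.
whr = [waist / HIP_WIDTH for waist in WAIST_WIDTHS]

# Crude stand-in for apparent body mass: the mean torso width (an assumption).
mass_proxy = [(waist + HIP_WIDTH) / 2 for waist in WAIST_WIDTHS]

# If the two measures move in lockstep, ratings cannot tell them apart.
r = pearson(whr, mass_proxy)
print(f"correlation between WHR and mass proxy: {r:.3f}")
```

In a stimulus set built this way the two measures are perfectly correlated, so a preference for "low WHR" and a preference for "lower apparent body mass" would produce the same ratings. Separating them requires stimuli in which the two vary independently.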

Another study found that men actually prefer women with a WHR of less than 0.7, which is not physically possible. But since there are examples of ‘super normal’ behaviour in other species (e.g. bees will follow random yellow-orange circles rather than the sun; baby seagulls will preferentially peck random large red circles rather than the smaller red spot on their parent’s beak, etc.), this is not too much of an issue. A bigger issue for whether WHR preference is a legitimate modular trait is whether a domain-general mechanism could achieve the same result. As Gould and Lewontin (1979) note, “not all useful outcomes are produced by specific adaptations”. So, in this case, there is no need to posit a specific psychological mechanism aimed at monitoring WHR. Instead, a generic mechanism can do the job. On this view, there is just a domain-general mechanism and no module.

Lastly, recent evidence has shown that, despite Singh and colleagues’ claims, the WHR preference is not actually culturally invariant. Importantly, Gray and colleagues are not saying that the WHR theory is sloppy science. Indeed, they assert that it is much better than the majority of EP research, which they accuse of being post hoc. What is poor about the WHR hypothesis is the lack of critical evaluation of the evidence. Gray and colleagues see this as an endemic problem, because the EP literature consistently holds WHR up as an exemplar of EP research and explanation in action despite the fact that a properly thorough analysis shows it to be flawed.

Cheater detection: EP proponents claim that the human mind is massively modular – i.e. that the mind is composed of myriad mini-organs that have specific functions. When it comes to social situations, Cosmides has used a variant of the Wason Selection Task to make the claim that there is a cheater detection module.

The Wason selection task tests how well people handle abstract “if P then Q” logic tasks, and it turns out they handle them badly. Mercier and Sperber have recently noted that pass rates for individuals are around 10% (2011, p. 63). But pass rates for groups are at around 80% (see weeks 14 and 27 for more discussion of this). Cosmides altered the test to replace the abstract formal tokens with social placeholders. For instance, the abstract version asks: which two cards do you need to turn to test the rule “if a card shows an even number on one face, then the opposite face is red”?

[Image: Wason selection task]

The social version asks: which two cards must you turn over to check the rule “if you are drinking alcohol, then you must be 18”?

[Image: Wason selection task, second version]
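The logic shared by both versions can be sketched in a few lines of Python. This is a minimal illustration rather than anything from Gray and colleagues' paper or Cosmides' experiments: the function names and the specific card faces (8, 3, red, brown; beer, cola, 25, 16) are assumptions chosen for the example. A card needs turning only if some hidden face could produce the one combination that breaks “if P then Q”, namely P together with not-Q.

```python
def can_falsify(shows_antecedent: bool, value: bool) -> bool:
    """Return True if some hidden face could make this card violate 'if P then Q'.

    shows_antecedent: True if the visible face tells us about P, False if about Q.
    value: the truth value of P (or Q) on the visible face.
    """
    for hidden in (True, False):
        # Combine the visible face with each possible hidden face.
        p, q = (value, hidden) if shows_antecedent else (hidden, value)
        if p and not q:  # the only combination that breaks a conditional
            return True
    return False


def cards_to_turn(cards):
    """Return the labels of the cards that must be turned to test the rule."""
    return [label for label, shows_p, value in cards if can_falsify(shows_p, value)]


# Abstract rule: "if a card shows an even number, then the opposite face is red".
# Card faces are hypothetical: (label, shows_antecedent, value).
ABSTRACT = [("8", True, True), ("3", True, False),
            ("red", False, True), ("brown", False, False)]

# Social rule: "if you are drinking alcohol, then you must be 18".
# Ages on the cards are hypothetical; 16 is treated as failing the age condition.
SOCIAL = [("beer", True, True), ("cola", True, False),
          ("25", False, True), ("16", False, False)]

print(cards_to_turn(ABSTRACT))  # ['8', 'brown']
print(cards_to_turn(SOCIAL))    # ['beer', '16']
```

Both versions have the same logical structure and the same answer pattern (the P card and the not-Q card), which is why the difference in pass rates between them calls for an explanation.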

Whereas pass rates for the abstract version are only around 10%, a majority of participants can easily pass the social version of the Wason selection task. Cosmides claims that humans face persistent pressure from cheaters and so have evolved streamlined mechanisms to meet this challenge. She claims that the cheater detection module will only operate in social exchange situations and that it can detect cheaters even in novel social exchange scenarios. Gray and colleagues compliment this work by Cosmides because it brings together and synthesises a wide range of material and offers novel insights into human psychology. But although it passes both the grandparent and lesser spotted gerbil tests, it has a number of issues:

  • Using the Wason selection task in terms of cheater detection alters the nature of the task from its original setting of logical reasoning.
  • It could be that social exchange scenarios have contexts that make the logical underpinnings of the set-up more accessible than in other settings. This does not support a cheater detection module; it just shows that context impacts on domain-general logical reasoning.
  • There are various ways to enhance P and not-Q selection which are not specific to cheater detection (see Sperber, Cara, & Girotto 1995).
  • Relevance theory: p-and-not-q is easier than p-and-q.

Gray and colleagues conclude that these results and theorising show that one can think about and explain cheater detection experiments without postulating a cheater detection module. So, why is it, they ask, that this theory is often seen as the jewel in EP’s crown? To answer this query, they turn to an examination of “the cassava root problem”, which reveals the confounds. There are several differences between cheating and non-cheating scenarios showing that they are not equivalent:

  1. In non-cheating scenarios there is no specific violating case
  2. Rules for cheating are absolute whereas rules for non-cheating are more like heuristics
  3. Non-cheating rules are more easily interpreted as bidirectional – this contrasts with the unidirectional nature of cheating relations

When these confounds were reversed, the results flipped and people suddenly did better in the non-cheating scenarios. And when rigorous experiments are conducted that make sure the cheating and non-cheating scenarios are properly equivalent, the improved performance in the cheating scenario disappears.


Gray and colleagues have not merely attacked a straw man version of EP (such as this stupendously ignorant 2007 paper which claims that women are genetically predisposed to liking the colour pink – for a brutal critical commentary see Ben Goldacre’s post over at Bad Science). Instead, they have struck EP at one of its “crown jewels” with a thorough analysis that demonstrates the level of rigour required to actually make claims about how human psychology is sculpted by processes of genetic and cultural evolution. Indeed, I think the discussion of the various ways in which one has to properly test a claim is instructive about the lengths to which researchers interested in this topic should go.

On a final related note, I would like to finish by pointing out that recent evidence indicates that our species has genetically evolved significantly just in the last 10,000 years (Laland et al 2010). And there is even evidence of accelerating genetic evolution over the last 7,000 years (Hawks et al 2007). This strongly suggests that the notion of ‘stone age minds in a modern world’ is far from the correct picture. Instead, we need to pay attention to the way human cognition is transformed and shaped by ongoing cultural and genetic evolution.




Carruthers, P. (2006) The Architecture of the Mind: Massive Modularity and the Flexibility of Thought. Oxford: Oxford University Press.

Coltheart, M. (1999) Modularity and cognition. Trends in Cognitive Sciences 3 (3), 115-120.

Fodor, J. (1983) The Modularity of Mind. Cambridge, MA: The MIT Press.

Gould, S. J. & Lewontin, R. C. (1979) The Spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist programme. Proceedings of the Royal Society B 205, 581-598.

Gray, R. D., Heaney, M. & Fairhall, S. (2003) Evolutionary Psychology and the Challenge of Adaptive Explanation. (pp. 247-268) in K. Sterelny & J. Fitness (eds.) From Mating to Mentality: Evaluating Evolutionary Psychology. New York: Psychology Press.

Hawks, J., Wang, E. T., Cochran, G. M., Harpending, H. C., & Moyzis, R. K. (2007) Recent acceleration of human adaptive evolution. PNAS 104 (52), 20753-20758.

Laland, K. N., Odling-Smee, J., & Myles, S. (2010) How culture shaped the human genome: bringing genetics and the human sciences together. Nature Reviews Genetics 11, 137-148.

Mercier, H. & Sperber, D. (2011) Why do humans reason? Arguments for an argumentative theory. Behavioral and Brain Sciences 34, 57-111.

Pinker, S. (1997) How the Mind Works. New York: W. W. Norton & Company.

Prinz, J. J. (2006) Is the mind really modular? (pp. 22-36) in R. Stainton (ed.) Contemporary Debates in Cognitive Science. Oxford: Blackwell.

Sperber, D., Cara, F., & Girotto, V. (1995) Relevance theory explains the selection task. Cognition 57, 31-95.

Tooby, J. & Cosmides, L. (1990) The Past Explains the Present: Emotional Adaptations and the Structure of Ancestral Environments. Ethology and Sociobiology 11, 375-424.
