An Investigation Exploring Whether the Implicit Association Test is a Valid and Reliable Measurement of Socially Sensitive Attitudes


This study aimed to test the reliability and validity of the Implicit Association Test (IAT), using already established themes (cuteness preference and food preference), in order to see whether it is appropriate to use for socially sensitive topics. Sixty participants were asked to describe characteristics of a good food and of something cute; the number of characteristics stated was the dependent variable. The independent variable was the stimulus that accompanied each task: half the participants were given excitatory stimuli (e.g. photos of a cute dog or tasty-looking food), and half were given inhibitory stimuli (e.g. photos of a scary dog or unappealing-looking food). The number of characteristics stated was significantly higher for excitatory stimuli than for inhibitory stimuli in the food condition. In contrast, there was no significant difference in the cuteness condition. This indicates that implicit association is topic specific, and that IATs are only accurate in certain contexts. It therefore suggests that IATs are not necessarily a valid and reliable measurement of socially sensitive attitudes. It has been suggested that a new tool (the Simple Implicit Procedure: SIP) may be a more valid and reliable method of measuring socially sensitive attitudes, so future research should investigate this further.


As first evidenced in LaPiere’s (1934) study on the inconsistency between attitudes and behaviour, a major problem in psychological research is the assessment of attitudes. Research has relied on explicit self-report techniques (e.g. questionnaires) in order to measure attitudes. Such techniques can be heavily scrutinised, particularly given that they have been used to measure attitudes on socially sensitive topics. For example, research has found that people lie about their sexuality (Nicholas, Durrheim & Tredoux, 1994), that people from ethnic minority backgrounds are sentenced more severely in a world where the law claims to be unprejudiced (Steffensmeier & Demuth, 2006), and that people who suffer from severe mood disorders claim they are stable, yet attempt suicide soon after discharge (Goldacre & Seagroatt, 1993). This demonstrates the extent to which explicit self-report techniques, particularly on socially sensitive issues, are vulnerable to social desirability bias. For these reasons, the use of self-report methods in modern research has generally declined, as humans can be very unreliable sources of information. Ultimately, the measurement of attitudes needs to rely on tests that do not suffer from these biases.

As a result, more modern research has turned its focus to the measurement of implicit attitudes. Currently, the most accepted measure is the Implicit Association Test (IAT), which aims to measure implicit attitudes by measuring underlying automatic evaluation (Rothermund & Wentura, 2004). The idea behind the IAT is to measure the extent to which a participant associates two concepts (Greenwald, McGhee, & Schwartz, 1998). IATs measure an individual’s implicit memory, allowing researchers to understand attitudes that cannot be measured using explicit methods (Greenwald & Banaji, 1995). It is believed that the IAT is an effective way to measure an individual’s true attitudes, which they would not be willing to reveal publicly.

While research using IATs does appear to have important applications, its use has been the subject of much controversy. First of all, research has found that the IAT is not as reliable and valid as originally claimed. For example, Steffens (2004) discovered that participants can fake their answers in the IAT and give biased results. Furthermore, Schnabel, Banse & Asendorpf (2006) claimed the IAT demonstrates partially low reliability and is susceptible to context effects. Moreover, several studies have shown the IAT suffers from a variety of biases, including biases at a pre-attentive level, cultural biases, and cognitive biases (Elbers, 2005; Arkes & Tetlock, 2004; Messner & Vosgerau, 2010).

Despite these limitations, the IAT has been used for a number of sensitive topics, including sexuality (Snowden & Gray, 2013; Snowden, Wichter & Gray, 2008; Banse, Seise & Zerbes, 2001), racism (Levinson, Cai & Young, 2009; McConnell & Leibold, 2001), child sexual abuse (Nunes, Firestone & Baldwin, 2007), and suicide rates (Nock, Park, Finn, Deliberto, Dour & Banaji, 2010). An additional issue is that the IAT is often compared to explicit tests (Greenwald, Poehlman, Uhlmann & Banaji, 2009). The problem with this is that the results provide only concurrent validity with explicit measures, which could themselves be incorrect. In the aforementioned fields (e.g. sexuality) it is not possible to know a concrete answer. By comparing these measures, therefore, we can only assume that measures like the IAT are valid because they agree with an explicit test. Adding to this is the claim that implicit measures are only weakly predictive of behaviour, and are no better than explicit measures (Oswald, Mitchell, Blanton & Tetlock, 2004). This demonstrates the extent to which using the IAT specifically for socially sensitive topics could lack validity.

Demonstrating the validity and reliability of the IAT for socially sensitive topics is therefore crucial. A possible way of achieving this is to test the IAT’s validity on non-controversial topics that have a concrete answer. Two possible examples are food preference and cuteness preference. Research has demonstrated that colourful food is the most appealing (Lee, Lee, Lee, & Song, 2013), whereas blue-coloured food is considered to be unappealing (Paakki, Sandell, & Hopia, 2016). In addition, research has shown that Kindchenschema (physical characteristics such as big eyes, round cheeks, etc., often exhibited in infant animals) elicits positive feelings (Kringelbach, Stark, Alexander, Bornstein, & Stein, 2016; Nenkov & Scott, 2014). Measuring implicit association in food preference should therefore result in a greater preference for colourful foods compared to blue-coloured foods. Likewise, measuring implicit association for cuteness preference should result in a greater preference for stimuli exhibiting the aforementioned Kindchenschema, compared to stimuli not associated with high cuteness (e.g. frightening animals).

This study therefore aims to establish whether the IAT is a valid and reliable measure of socially sensitive topics (e.g. sexuality). The hypothesis is that participants presented with excitatory stimuli for cuteness and food preference will perform better on the IAT than participants presented with inhibitory stimuli for cuteness and food preference. If the hypothesis is supported by a significant result, then it can be shown that the IAT is an effective measure of implicit attitudes. However, if the opposite is found, then this would demonstrate that implicit association is inaccurate, and that such measures are an invalid and unreliable way of measuring socially sensitive attitudes.



Participants were selected using convenience sampling on a voluntary basis, and were offered no physical incentives for taking part. Participants were required to be over the age of 18 and to be at least fluent in English. This study was conducted with two other first-year Psychology undergraduates, so each member of the group was asked to collect data from 20 participants (n = 60). Participants were of mixed age and gender. Of the 60 participants, 26 (43.3%) were male and 34 (56.7%) were female. In addition, the age range of participants was 63, with a mean of 25.3 and a standard deviation of 15.4.


A between-subjects design was used in this study. There were two different conditions: an excitatory condition (which featured photos of an excitatory stimulus, such as a cute kitten), and an inhibitory condition (which featured photos of an inhibitory stimulus, such as a scary dog). The primary independent variable in this study was whether participants were presented with the excitatory stimulus or the inhibitory stimulus. The secondary independent variable was the topic presented (either food or cuteness). The dependent variable was the number of characteristics stated by the participant in each individual condition. Two different analyses were used on the data obtained. The primary analysis was whether people stated more characteristics when presented with an excitatory stimulus or an inhibitory stimulus. The secondary analysis was whether the pattern obtained (e.g. more characteristics in the excitatory condition) is topic specific (e.g. more prevalent in the cute condition), or whether it generalises to both topics.


Participants were allocated by convenience to three different lab conditions (dependent on the experimenter). Each lab condition consisted of one participant engaging in a one-to-one session in a silent room with a computer-based presentation. The participant’s response was recorded, and the presentation began when the participant chose to start it. Each slide of the presentation had a 30 second timer. One slide asked participants to describe characteristics of something cute, featuring photos of a human, a cat, a dog, and an otter. The other slide asked participants to describe characteristics of a good food, featuring photos of a cake, potatoes, a salad, and a chicken and rice-based dish.


1. Participants were given a consent form to read, which explained that they would be asked to say as many characteristics of a topic as possible in 30 seconds. It further stated that there would be two rounds, and that each 30 second round would be recorded. The consent form also asked the participants for their age, sex, and English proficiency. If giving fully informed consent, participants were asked to fill out the form and sign it.

2. Participants were then exposed to one of two conditions: either excitatory stimuli accompanying the cute topic and inhibitory stimuli accompanying the food topic, or excitatory stimuli accompanying the food topic and inhibitory stimuli accompanying the cute topic. Participants were recorded from when they signed the consent form, and they had control over when the presentation started. Once started, participants were given 30 seconds (controlled by a timer on the presentation) for each topic to say as many characteristics about it as they could.

3. Once the study was completed, the audio recording was ended and assigned a participant number (e.g. 203) for anonymity purposes. Participants were then given a full debrief, with the opportunity to withdraw themselves and their data from the study.

4. Each recording was then listened to, with the number of characteristics counted.


Table 1 summarises the descriptive statistics for the differences in the number of characteristics stated by participants between the two conditions. An independent samples t-test found that participants in the excitatory food condition stated significantly more characteristics than those in the inhibitory food condition, t(58) = 2.54, p = .014. However, a further independent samples t-test found no significant difference between the excitatory cute condition and the inhibitory cute condition, t(58) = -1.90, p > .05. This demonstrates that excitatory stimuli have a significant positive effect on the number of characteristics of food people can produce compared with inhibitory stimuli, whereas type of stimulus has no significant effect on the number of characteristics of something cute people can produce.
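The p-values for these tests can be checked from the t statistics and degrees of freedom alone. The following is a minimal sketch in Python, assuming two-tailed tests and Student's pooled-variance form of the independent samples t-test (the report does not state which variant was used). The function names are illustrative, and the p-value is obtained by numerically integrating the t density rather than by any particular statistics package.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on the density from 0 to |t|.

    steps must be even for Simpson's rule.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * area


def independent_t(a, b):
    """Independent samples t-test with pooled variance (assumed variant).

    Returns (t, df, two-tailed p) for two lists of scores.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)
```

Under these assumptions, `two_tailed_p(2.54, 58)` gives roughly .014 and `two_tailed_p(-1.90, 58)` gives roughly .06, so only the food comparison falls below the conventional .05 threshold. `independent_t` could be applied to the two lists of per-participant characteristic counts for each topic if the raw data were available.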

Table 1: Descriptive Statistics for each Condition

Std. Deviation


The results of this study demonstrated two key findings. First, excitatory stimuli in the food condition led to a higher number of characteristics than inhibitory stimuli. This supports the hypothesis that excitatory stimulation leads to greater implicit association. However, the second key finding was that the significant difference found in the food condition was not found in the cute condition. This has important implications, as it indicates that while the IAT is accurate in some cases, it is not accurate in all cases. The results of this study therefore appear to show that the type of stimulus plays a role in implicit association, but that this pattern is topic specific and may not be generalisable to all contexts.

The results present findings that both agree with and contradict previous literature. For example, the finding that excitatory food stimuli lead to a greater implicit association than inhibitory food stimuli supports Lee et al. (2013) and Paakki et al. (2016). However, this study found no significant difference between excitatory cute stimuli and inhibitory cute stimuli, which contradicts Kringelbach et al.’s (2016) finding that implicit association would be higher in the excitatory condition. One potential explanation for this is that cuteness as a concept is perhaps more subjective than preference for food. For example, Lorenz (1971) demonstrated that ‘cuteness’ is a very subjective term, and that while it is commonly associated with Kindchenschema (e.g. big eyes), it is not always. Another possible explanation for these contradicting findings, in support of the Steffens (2004) study, is that participants can fake their answers on the IAT. As the stimuli for the cuteness condition featured photographs of humans, participants may have been more careful about what words they used for characteristics. For example, several studies have shown that some people may need to think more carefully and process more information about others before they can effectively describe them, particularly those individuals who have a high Need for Cognition (Cacioppo & Petty, 1982). Therefore, it may be that individual differences in participants, and the tendency to be more careful when describing people, are the reasons why the IAT for the cute condition is potentially biased.

The findings from this study have a real-world implication: if the IAT is only effective in determining implicit attitudes for certain topics, this reinforces the uncertainty surrounding its controversial use in socially sensitive subjects. For example, if the IAT is not valid, then in situations such as the psychiatric evaluation of patients at risk of suicide before discharge (e.g. Goldacre & Seagroatt, 1993), is it possible to be certain through implicit association that they are not lying about their psychological condition (e.g. Steffens, 2004)? For something as important as the threat of loss of human life, there needs to be a concrete method to evaluate someone’s mental condition, which the IAT may not necessarily provide.

In terms of further research, a study by O’Shea, Watson, & Brown (2015) looked at how implicit attitudes can best be measured. These researchers found that neither the IAT nor the IRAP (Implicit Relational Assessment Procedure) was the most effective method of measuring implicit attitudes. They proposed a new method, the SIP (Simple Implicit Procedure), which claims to measure absolute, rather than relative, implicit attitudes. While this tool is in its infancy and has had little research support, it demonstrates initial improved validity and accuracy compared to other measures, such as the IAT. Therefore, with the aim of gaining a concrete answer as to the most valid and reliable way of measuring implicit attitudes on socially sensitive issues, future research should test implicit association using the SIP. This should establish whether the SIP is a valid and reliable way of measuring attitudes to socially sensitive issues.

In conclusion, this study found that while there is evidence for the effectiveness of the IAT, it is not generalisable. As a result, this makes it very difficult to determine whether the IAT is a valid and reliable way to measure attitudes to socially sensitive topics. Therefore, further research should be conducted using the SIP (a proposed, more valid measure of implicit association) in order to fully establish a measure of attitudes to socially sensitive issues.


