One of the most frequently-asked questions about the BayesFactor package is how to do multiple comparisons; that is, given that some effect exists across factor levels or means, how can we test whether two specific effects are unequal. In the next two posts, I'll explain how this can be done in two cases: in Part 1, I'll cover tests for equality, and in Part 2 I'll cover tests for specific order-restrictions.
Before we start, I will note that these methods are only meant to be used for pre-planned comparisons. They should not be used for post hoc comparisons.
An ExampleSuppose we are interested in the basis for feelings of moral disgust. One prominent theory, from the embodied cognition point of view, holds that feelings of moral disgust are extensions of more basic feelings of disgust: disgust for physical things, such as rotting meat, excrement, etc (Schnall et al, 2008; but see also Johnson et al., 2014 and Landy & Goodwin, in press). Under this theory, moral disgust is not only metaphorically related to physical disgust, but may share physiological responses with physical disgust.
Suppose we wish to experimentally test this theory, which predicts that feelings of physical disgust can be “transferred” to possible objects of moral disgust. We ask 150 participants to fill out a questionnaire that measures the harshness of their judgments of undocumented migrants. Participants are randomly assigned to one of three conditions, differing by the odor present in the room: a pleasant scent associated with cleanliness (lemon), a disgusting scent (sulfur), and a control condition in which no unusual odor is present. The dependent variable is the score on the questionnaire, which ranges from 0 to 50 with higher scores representing harsher moral judgment.
Hypothetical data, simulated for the sake of example, can be read into R using the
# Read in the data from the learnbayes.org disgust_data = read.table(url('http://www.learnbayes.org/disgust_example.txt'),header=TRUE)
A boxplot and means/standard errors reveal that the effects appear to be in the predicted direction:
(note that the axes are different in the two plots, so that the standard errors can be seen)
And we can perform a classical ANOVA on these data:
# ANOVA summary(aov(score ~ condition, data = disgust_data))
## Df Sum Sq Mean Sq F value Pr(>F) ## condition 2 263 131.4 2.91 0.058 . ## Residuals 147 6635 45.1 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The classical test of the null hypothesis that all means are equal just fails to reach significance at \(\alpha=0.05\).
A Bayes factor analysisWe can easily perform a Bayes factor test of the null hypothesis using the
BayesFactorpackage. This assumes that the prior settings are acceptable; because this post is about multiple comparisons, we will not explore prior settings here. See
?anovaBFfor more information.
anovaBFis a convenience function to perform Bayes factor ANOVA-like analyses. The code for the Bayes factor analysis is almost identical to the code for the classical test:
library(BayesFactor) bf1 = anovaBF(score ~ condition, data = disgust_data)
The Bayes factor in favor of a condition effect is about
## Bayes factor analysis ## -------------- ##  condition : 0.7738 ±0.01% ## ## Against denominator: ## Intercept only ## --- ## Bayes factor type: BFlinearModel, JZS
1/0.774 = 1.3in favor of the null (the “Intercept only” model). This is not strong evidence for either the null or the alternative, which given the moderate p value is perhaps not surprising. It should be noted here that even if the p value had just crept in under 0.05, the Bayes factor would not be appreciably different, which shows the inherent arbitrariness of significance testing.
Many possible hypotheses?This analysis is not the end of the story, however. The hypothesis tested above — that all means are different, but with no further specificity — was not really the hypothesis of interest. The hypothesis of interest was more specific. We might consider an entire spectrum of hypotheses, listed in increasing order of constraint:
- (most constrained) The null hypothesis (control = lemon = sulfur)
- (somewhat constrained) Unexpected scents cause the same effect, regardless of type (lemon = sulfur ≠ control; this might occur, for instance, both “clean” and “disgusting” scents prime the same underlying concepts)
- (somewhat constrained) Only disgusting scents have an effect (control = lemon ≠ sulfur)
- (somewhat constrained) Only pleasant scents have an effect (control = sulfur ≠ lemon)
- (unconstrained) All scents have unique effects (control ≠ sulfur ≠ lemon)
Testing equality constraintsTo test equality constraints, we must first consider what an equality constraint means. Claiming that an equality constraint holds is the same as saying that your predictions for data would not change if the two conditions are supposed to be the same had exactly the same label. If want to to impose the constraint that lemon = sulfur ≠ control, we merely have to give lemon and sulfur the same label.
In practice, this means making a new column in the data frame with the required change:
# Copy the condition column that we will change # We use 'as.character' to avoid using the same factor levels disgust_data$lemon.eq.sulfur = as.character(disgust_data$condition) # Change all 'lemon' to 'lemon/sulfur' disgust_data$lemon.eq.sulfur[ disgust_data$condition == "lemon" ] = 'lemon/sulfur' # Change all 'sulfur' to 'lemon/sulfur' disgust_data$lemon.eq.sulfur[ disgust_data$condition == "sulfur" ] = 'lemon/sulfur' # finally, make the column a factor disgust_data$lemon.eq.sulfur = factor(disgust_data$lemon.eq.sulfur)We now have a data column, called
lemon.eq.sulfur, that labels the data so that lemon and sulfur have the same labels. We can use this in Bayes factor test:
bf2 = anovaBF(score ~ lemon.eq.sulfur, data = disgust_data)
The null hypothesis is now preferred by a factor of
## Bayes factor analysis ## -------------- ##  lemon.eq.sulfur : 0.1921 ±0% ## ## Against denominator: ## Intercept only ## --- ## Bayes factor type: BFlinearModel, JZS
1/0.192 = 5.2, which is expected given that lemon and sulfur were the least similar pair of three means. The null hypothesis accounts for the data better than this constraint.
One of the conveniences of using Bayes factors is if we have two hypotheses that are both tested against the same third hypothesis, we can test the two hypotheses against one another. The
BayesFactorpackage makes this easy; any two
BayesFactorobjects compared against the same denominator — in this case, the intercept-only null hypothesis — can be combined together:
bf_both_tests = c(bf1, bf2) bf_both_tests
## Bayes factor analysis ## -------------- ##  condition : 0.7738 ±0.01% ##  lemon.eq.sulfur : 0.1921 ±0% ## ## Against denominator: ## Intercept only ## --- ## Bayes factor type: BFlinearModel, JZS
We could, for instance, put all equality-constraint tests into the same object, and then compare them like so:
bf_both_tests / bf_both_tests
The fully unconstrained hypothesis, represented by
## Bayes factor analysis ## -------------- ##  condition : 4.029 ±0.01% ## ## Against denominator: ## score ~ lemon.eq.sulfur ## --- ## Bayes factor type: BFlinearModel, JZS
condition, is preferred to the lemon = sulfur ≠ control hypothesis by a factor of about 4.
In the next post, we will use the
posterior()function to draw from the posterior of the unconstrained model, which will allow us to test ordering constraints.