## Saturday, March 28, 2015

### Two things to stop saying about null hypotheses

There is a currently fashionable way of describing Bayes factors that resonates with experimental psychologists. I hear it often, particularly as a way to describe a particular use of Bayes factors. For example, one might say, “I needed to prove the null, so I used a Bayes factor,” or “Bayes factors are great because with them, you can prove the null.” I understand the motivation behind this sort of language but please: stop saying one can “prove the null” with Bayes factors.

I also often hear other people say “but the null is never true.” I'd like to explain why we should avoid saying both of these things.

[Image: Null hypotheses are tired of your jibber jabber]

### Why you shouldn't say “prove the null”

Statistics is complicated. People often come up with colloquial ways of describing what a particular method is doing: for instance, one might say that a significance test gives us “evidence against the null”; one might say that a “confidence interval tells us the 95% most plausible values”; or one might say that a Bayes factor helps us “prove the null.” Bayesians are often quick to correct misconceptions that people use to justify their use of classical or frequentist methods. It is just as important to correct misconceptions about Bayesian methods.

In order to understand why we shouldn't say “prove the null”, consider the following situation: You have a friend who claims that they can affect the moon with their mind. You, of course, think this is preposterous. Your friend looks up at the moon and says “See, I'm using my abilities right now!” You check the time.

You then decide to head to the local lunar seismologist, who has good records of subtle moon tremors. You ask her about what happened at the time your friend was looking at the moon, and she reports back to you that lunar activity at that time was stronger than it typically is 95% of the time (and thus passes the bar for “statistical significance”).

Does this mean that there is evidence for your friend's assertion? The answer is “no.” Your friend made no statement about what one would expect from the seismic data. In fact, your friend's statement is completely unfalsifiable (as is the case with the typical “alternative” in a significance test, $\mu\neq0$).

But consider the following alternative statements your friend could have made: “I will destroy the moon with my mind”; “I will make very large tremors (with magnitude $Y$)”; “I will make small tremors (with magnitude $X$).” How do we now regard your friend's claims in light of what happened?

• “I will destroy the moon with my mind” is clearly inconsistent with the data. You (the null) are supported by an infinite amount, because you have completely falsified their statement that they would destroy the moon (the alternative).
• “I will make very large tremors (with magnitude $Y$)” is also inconsistent with the data, but if we allow a range of uncertainty around their claim, it may not be completely falsified. Thus you (the null) are supported, but not by as much as in the first situation.
• “I will make small tremors (with magnitude $X$)” may support you (the null) or your friend (the alternative), depending on how the predicted and observed magnitudes compare.

Here we can see that the support for the null depends on the alternative at hand. This is, of course, as it must be. Scientific evidence is relative. We can never “prove the null”: we can only “find evidence for a specified null hypothesis against a reasonable, well-specified alternative”. That's quite a mouthful, it's true, but “prove the null” creates misunderstandings about Bayesian statistics, and makes it appear that it is doing something it cannot do.
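To make the dependence on the alternative concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration and not part of the moon story itself: the observation is treated as normally distributed around the true effect, each of the friend's claims is encoded as a normal distribution over the predicted effect, and the numbers (`observed`, `se`, and the means and standard deviations of the three claims) are invented. The function names are hypothetical. The Bayes factor for the null against each alternative is the ratio of how well each model predicted the observation, computed on the log scale to avoid numerical underflow.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_bf01(observed, se, alt_mean, alt_sd):
    """Log Bayes factor for the null (effect = 0) against one alternative.

    The alternative predicts an effect drawn from Normal(alt_mean, alt_sd^2).
    With a normal likelihood, its marginal prediction for the observation
    is Normal(alt_mean, se^2 + alt_sd^2).
    """
    log_m0 = normal_logpdf(observed, 0.0, se)
    log_m1 = normal_logpdf(observed, alt_mean, math.sqrt(se**2 + alt_sd**2))
    return log_m0 - log_m1


# Invented numbers: a tremor reading of 2.0 with standard error 1.0.
observed, se = 2.0, 1.0

# Hypothetical encodings of the three claims as (predicted mean, uncertainty).
alternatives = {
    "destroy the moon": (1e6, 1.0),
    "very large tremors (Y)": (50.0, 5.0),
    "small tremors (X)": (2.0, 1.0),
}

log_bfs = {
    claim: log_bf01(observed, se, mean, sd)
    for claim, (mean, sd) in alternatives.items()
}

for claim, value in log_bfs.items():
    favored = "null" if value > 0 else "alternative"
    print(f"{claim}: log BF01 = {value:.3g} (favors the {favored})")
```

With these made-up numbers, the same observation supports the null overwhelmingly against the first claim, strongly against the second, and not at all against the third. Nothing about the data changed; only the alternative did.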

In a Bayesian setup, the null and alternative are both models and the relative evidence between them will change based on how we specify them. If we specify them in a reasonable manner, such that the null and alternative correspond to relevant theoretical viewpoints or encode information about the question at hand, the relative statistical evidence will be informative for our research ends. If we don't specify reasonable models, then the relative evidence between the models may be correct, but useless.

We never “prove the null” or “compute the probability of the null hypothesis”. We can only compare a null model to an alternative model, and determine the relative evidence.

[See also Gelman and Shalizi (2013) and Morey, Romeijn and Rouder (2013)]

### Why you shouldn't say “the null is never true”

A common retort to tests including a point null (often called a 'null' hypothesis) is that “the null is never true.” This is backed up by four sorts of “evidence”:

• A quote from an authority: “Tukey or Cohen said so!” (Tukey was smart, but this is not an argument.)
• Common knowledge / “experience”: “We all know the null is impossible.” (This was Tukey's “argument.”)
• Circular: “The area under a point in a density curve is 0.” (Of course if your model doesn't have a point null, the point null will be impossible.)
• All models are “false.” (Even if this were true, and I think it is actually a category error, it would apply equally to all alternatives as well.)

The most attractive seems to be the second, but it should be noted that people almost never use techniques that allow finding evidence for null hypotheses. Under these conditions, how is one determining that the null is never true? If a null were ever true, we would not be able to accumulate evidence for it, so the second argument definitely has a hint of circularity as well.

When someone says “The null hypothesis is impossible/implausible/irrelevant”, what they are saying in reality is “I don't believe the null hypothesis can possibly be true.” This is a totally fine statement, as long as we recognize it for what it is: an a priori commitment. We should not pretend that it is anything else; I cannot see any way that one can find universal evidence for the statement “the null is impossible”.
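The role of the a priori commitment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended analysis: it considers only a null and a single alternative, and the function name `posterior_null_weight` and the example Bayes factor are hypothetical. The weight it returns is relative to that one alternative only. The point is that if the null is given zero prior weight, which is what a purely continuous prior does, no amount of relative evidence can change that.

```python
import math


def posterior_null_weight(prior_null, log_bf01):
    """Weight on the null, relative to one alternative, after seeing data.

    prior_null is the prior weight on the null (between 0 and 1) and
    log_bf01 is the log Bayes factor for the null over the alternative.
    """
    if prior_null <= 0.0:
        return 0.0  # a null ruled out in advance stays ruled out
    if prior_null >= 1.0:
        return 1.0  # a null assumed with certainty stays certain
    log_odds = math.log(prior_null) - math.log1p(-prior_null) + log_bf01
    # Numerically stable logistic transform of the posterior log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Hypothetical data that favor the null by a factor of 1000.
log_bf = math.log(1000.0)

for prior in (0.0, 0.01, 0.5):
    weight = posterior_null_weight(prior, log_bf)
    print(f"prior weight {prior}: posterior weight {weight:.4f}")
```

An analyst who starts at exactly zero ends at exactly zero, so the conclusion “the null is impossible” is read off the prior, not learned from the data.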

If you find the null hypothesis implausible, that's OK. Others might not find it implausible. It is ultimately up to substantive experts to decide what hypotheses they want to consider in their data analysis, and not up to methodologists or statisticians to tell experts what to think.

Any automatic behavior, whether automatically rejecting all null hypotheses or automatically testing null hypotheses, is bad. Hypothesis testing and estimation should be considered and deliberate. Luckily, Bayesian statistics allows both to be done in a principled, coherent manner, so informed choices can be made by the analyst and not by the restrictions of the method.

1. Great post and I appreciate that "the null is never true" is not an appropriate statement. But researchers should be pushed to reflect whether the point null is actually an interesting thing to reject, and this is the spirit in which I think about the statement "the null is never true". For example, one could say in a priming experiment "Will you really have accomplished something if you reject the null, if the difference in population means is actually very tiny, such as 0.000001? Would such a difference really have consequences for psychological theory?" I can imagine someone usefully saying "The null is never true" in such a discussion, e.g. in a priming experiment there might be some subjects who consciously see the prime, ruminate on it, and then it affects their behavior. So here, contamination means the null is never true. But that way in which the null isn't true, leading to a tiny effect, isn't what the researcher was interested in.

This is part of the broader problem that psychologists so often only attempt to reject the null rather than build a more cumulative science of effect sizes and theories that actually predict them. But of course that's a great thing about Bayesianism! You have to specify an alternative hypothesis, which often leads one to concentrate on coming up with a predicted effect size.

1. Hi Alex,

I broadly agree with you (which is why I said that data analysis should be a deliberate process), but I'd like to point out that your statement "I can imagine...contamination means the null is never true" is itself a theoretical statement. It makes theoretical assumptions about how the system works that may be completely reasonable, or may be wrong. Biological systems have low- and high-pass filters; it is not the case that just because something has an effect on one part of a biological system, it must have an effect on another. If we knew so much about a biological system that we could confidently say that the null is never true in a particular circumstance, we'd be far further along in our understanding of how things work (in psychology, in particular).

It is important that we recognise that these sorts of arguments are scientific, and they must be *at least* testable, if not actually tested. And if they are to be testable, we must have the statistical machinery to test them. The best we can do is ask people to be deliberate. If there is not good reason to think the null is false and someone is doing parameter estimation, we should ask "Do you really know enough to assume that the null is false?"; if there is good reason to think the null is false, and they're doing point-null hypothesis testing, we should ask "Do you really want to assume that the null hypothesis can be true?"
