Thursday, November 12, 2015
Neyman does science, part 2
In part one of this series, we discussed the different philosophical viewpoints of Neyman and Fisher on the purposes of statistics. Neyman had a behavioral, decision-based view: the purpose of statistical inference is to select one of several possible decisions, enumerated before the data have been collected. To Fisher, and to Bayesians, the purpose of statistical inference is related to the quantification of evidence and rational belief. I agree with Fisher on this issue, and I was curious how Neyman -- with his pre-data inferential philosophy -- would actually tackle a problem with real data. In this second part of the series, we examine Neyman's team's analysis of the data from the Whitetop weather modification experiment in the 1960s.
Tuesday, November 10, 2015
Neyman does science, part 1
On reading Neyman's statistical and scientific philosophy (e.g., Neyman, 1957), one of the things that strikes a scientist is its extreme rejection of post-data reasoning. Neyman adopts the view that once the data are obtained, statistical inference is not about reasoning, but rather about the automatic adoption of one of several decisions. Given the importance of post-data reasoning to scientists -- which can be confirmed by reading any scientific manuscript -- I wondered how Neyman would think and write about an actual, applied problem. This series of blog posts explores Neyman's work on the analysis of weather modification experiments. The (perhaps unsurprising) take-home message from this series of posts is this: not even Neyman applied Neyman's philosophy when he was confronted with real data.
Monday, April 20, 2015
The fallacy of placing confidence in confidence intervals (version 2)
My coauthors and I have submitted a new draft of our paper "The fallacy of placing confidence in confidence intervals". This paper is substantially modified from its previous incarnation. Here is the main argument:
"[C]onfidence intervals may not be used as suggested by modern proponents because this usage is not justified by confidence interval theory. If used in the way CI proponents suggest, some CIs will provide severely misleading inferences for the given data; other CIs will not. Because such considerations are outside of CI theory, developers of CIs do not test them, and it is therefore often not known whether a given CI yields a reasonable inference or not. For this reason, we believe that appeal to CI theory is redundant in the best cases, when inferences can be justified outside CI theory, and unwise in the worst cases, when they cannot."
The document, source code, and all supplementary material are available here on github.
Friday, April 10, 2015
All about that "bias, bias, bias" (it's no trouble)
At some point, everyone who fiddles around with Bayes factors with point nulls notices something that, at first blush, seems strange: small effect sizes seem “biased” toward the null hypothesis. In null hypothesis significance testing, power simply increases with sample size for any nonzero true effect size. With Bayes factors, there is a non-monotonicity: for a small effect size, increasing the sample size will at first slightly increase the degree to which the data favor the null, and only then does the small effect size become evidence for the alternative. I recall puzzling over this with Jeff Rouder years ago when drafting our 2009 paper on Bayesian t tests.
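To see the pattern concretely, here is a toy construction of my own (not the analysis from the paper): hold the observed effect size fixed at d = 0.1 and recompute the default Bayes factor t test as the sample size grows.

```r
# Toy illustration of the non-monotonicity: fix the *observed* effect
# size at d = 0.1 and recompute the default BayesFactor t test as n grows.
library(BayesFactor)

d  <- 0.1
ns <- c(10, 50, 100, 500, 1000, 5000)

bfs <- sapply(ns, function(n) {
  x <- as.vector(scale(rnorm(n))) + d  # sample with mean exactly d, sd exactly 1
  extractBF(ttestBF(x))$bf             # BF for the alternative vs. the point null
})

round(cbind(n = ns, bf = bfs), 3)
# The BF first sinks further toward the null as n grows, then turns
# around and eventually favors the alternative.
```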
Saturday, March 28, 2015
Two things to stop saying about null hypotheses
There is a currently fashionable way of describing Bayes factors that resonates with experimental psychologists. I hear it often, particularly as a way to describe a particular use of Bayes factors. For example, one might say, “I needed to prove the null, so I used a Bayes factor,” or “Bayes factors are great because with them, you can prove the null.” I understand the motivation behind this sort of language, but please: stop saying one can “prove the null” with Bayes factors.
I also often hear other people say “but the null is never true.” I'd like to explain why we should avoid saying both of these things.
Monday, March 23, 2015
Statistical alchemy and the "test for excess significance"
[This post is based largely on my 2013 article for Journal of Mathematical Psychology; see the other articles in that special issue as well for more critiques.]
When I tell people that my primary area of research is statistical methods, one of the reactions I often encounter from people untrained in statistics is that “you can prove anything with statistics.” Of course, this rankles, first because it isn't true (unless you use a very strange definition of prove) and second because I've spent years learning the limitations of statistics, and there are many limitations. These limitations exist, however, in the context of enormous successes. In the sciences, the field of statistics rightly has a place of honor.
This success is evidenced by the great number of scientific arguments that are supported by statistical methods. Not all statistical arguments are created equal, of course. But the respect with which statistics is viewed has the unfortunate downside that a statistical argument can apparently turn a leaden hunch into a golden “truth”. This post is about such statistical alchemy.
Sunday, March 1, 2015
To Beware or To Embrace The Prior
In this guest post, Jeff Rouder reacts to two recent comments skeptical of Bayesian statistics, and describes the importance of the prior in Bayesian statistics. In short: the prior gives a Bayesian model the power to predict data, and prediction is what allows the evaluation of evidence. Far from being a liability, Bayesian priors are what make Bayesian statistics useful to science.
Tuesday, February 10, 2015
BayesFactorExtras: a sneak preview
Felix Schönbrodt and I have been working on an R package called BayesFactorExtras. This package is designed to work with the BayesFactor package, providing features beyond the core BayesFactor functionality. Currently in the package are:
1. Sequential Bayes factor plots for visualization of how the Bayes factor changes as data come in: seqBFplot()
2. Ability to embed R objects directly into HTML reports for reproducible, sharable science: createDownloadURI()
3. Interactive BayesFactor objects in HTML reports; just print the object in a knitr document.
4. Interactive MCMC objects in HTML reports; just print the object in a knitr document.
All of these are pretty neat, but I thought I'd give a sneak preview of #4. To see how it works, click here to play with the document on Rpubs!
I anticipate releasing this to CRAN soon.
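In the meantime, here is a rough sketch of the kind of computation a sequential Bayes factor plot (item 1) displays, using only core BayesFactor functions; the data and plotting details are made up for illustration, and seqBFplot() itself may do things differently.

```r
# Sketch of a sequential Bayes factor: recompute the default Bayesian
# t test as each new observation arrives, then plot the trajectory.
library(BayesFactor)

set.seed(123)
x <- rnorm(50, mean = 0.3)   # simulated data with a true effect

ns  <- 5:length(x)           # start once there are a few observations
bfs <- sapply(ns, function(n) extractBF(ttestBF(x[1:n]))$bf)

plot(ns, log(bfs), type = "l",
     xlab = "Sample size", ylab = "log BF (alternative vs. null)")
abline(h = 0, lty = 2)       # log BF = 0: data favor neither model
```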
Saturday, February 7, 2015
On making a Bayesian omelet
My colleagues Eric-Jan Wagenmakers and Jeff Rouder and I have a new manuscript in which we respond to Hoijtink, van Kooten, and Hulsker's in-press manuscript Why Bayesian Psychologists Should Change the Way they Use the Bayes Factor. They suggest a method for "calibrating" Bayes factors using error rates. We show that this method is fatally flawed, but along the way we also describe how we think about the subjective properties of the priors we use in our Bayes factors:
"...a particular researcher's subjective prior is of limited use in the context of a public scientific discussion. Statistical analysis is often used as part of an argument. Wielding a fully personal, subjective prior and concluding 'If you were me, you would believe this' might be useful in some contexts, but in others it is less useful. In the context of a scientific argument, it is much more useful to have priors that approximate what a reasonable, but somewhat-removed researcher would have in the situation. One could call this a 'consensus prior' approach. The need for broadly applicable arguments is not a unique property of statistics; it applies to all scientific arguments. We do not argue to convince ourselves; we should therefore make use of statistical arguments that are not pegged to our own beliefs...
It should now be obvious how we make our 'Bayesian omelet'; we break the eggs and cook the omelet for others in the hopes that it is something like what they would choose for themselves. With the right choice of ingredients, we think our Bayesian omelet can satisfy most people; others are free to make their own, and we would be happy to help them if we can. "
Our completely open, reproducible manuscript --- “Calibrated” Bayes factors should not be used: a reply to Hoijtink, van Kooten, and Hulsker --- along with a supplement and R code, is available on github (with DOI!).
Friday, January 30, 2015
On verbal categories for the interpretation of Bayes factors
As Bayesian analysis is becoming more popular, adopters of Bayesian statistics have had to consider new issues that they did not before. What makes a “good” prior? How do I interpret a posterior? What Bayes factor is “big enough”? Although the theoretical arguments for the use of Bayesian statistics are very strong, new and unfamiliar ideas can cause uncertainty in new adopters. Compared to the cozy certainty of \(p<.05\), Bayesian statistics requires more care and attention. In theory, this is no problem at all. But as Yogi Berra said, "In theory there is no difference between theory and practice. In practice there is."
In this post, I discuss the use of verbal labels for magnitudes of Bayes factors. In short, I don't like them, and think they are unnecessary.
Sunday, January 18, 2015
Multiple Comparisons with BayesFactor, Part 2 - order restrictions
In my previous post, I described how to do multiple comparisons using the BayesFactor package. Part 1 concentrated on testing equality constraints among effects: for instance, that the effects of two factor levels are equal, while leaving the third free to be different. In this second part, I will describe how to test order restrictions on factor level effects. This post will be a little more involved than the previous one, because BayesFactor does not currently handle order restrictions automatically.
Again, I will note that these methods are only meant to be used for pre-planned comparisons. They should not be used for post hoc comparisons.
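To give a flavor of what the post works through, here is a hedged sketch of the encompassing-model trick for order restrictions (my own toy data; the post's actual example may differ): the Bayes factor for an ordering against the unrestricted model is the posterior probability of that ordering divided by its prior probability.

```r
# Sketch: Bayes factor for the order restriction A < B < C against the
# unrestricted model, via posterior sampling with the BayesFactor package.
library(BayesFactor)

set.seed(1)
dat <- data.frame(
  y = c(rnorm(20, 0), rnorm(20, 0.5), rnorm(20, 1)),
  g = factor(rep(c("A", "B", "C"), each = 20))
)

bf_full <- lmBF(y ~ g, data = dat)
post    <- posterior(bf_full, iterations = 10000)

# Effect columns are named after the factor levels; check colnames(post)
in_order <- mean(post[, "g-A"] < post[, "g-B"] &
                 post[, "g-B"] < post[, "g-C"])

prior_prob <- 1 / factorial(3)  # all 6 orderings equally probable a priori
in_order / prior_prob           # BF: restriction vs. unrestricted model
```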
Saturday, January 17, 2015
Multiple Comparisons with BayesFactor, Part 1
One of the most frequently asked questions about the BayesFactor package is how to do multiple comparisons: that is, given that some effect exists across factor levels or means, how can we test whether two specific effects are unequal? In the next two posts, I'll explain how this can be done in two cases: in Part 1, I'll cover tests for equality, and in Part 2 I'll cover tests for specific order restrictions.
Before we start, I will note that these methods are only meant to be used for pre-planned comparisons. They should not be used for post hoc comparisons.
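As a preview of the idea (a minimal sketch with made-up data, not the post's own example): one way to test an equality constraint is to recode the factor so the constrained levels share a value, then compare the two models' Bayes factors.

```r
# Sketch: test the equality constraint A = B by comparing a full model
# against one in which levels A and B are collapsed into a single level.
library(BayesFactor)

set.seed(1)
dat <- data.frame(
  y = c(rnorm(20, 0), rnorm(20, 0), rnorm(20, 1)),
  g = factor(rep(c("A", "B", "C"), each = 20))
)

dat$g2 <- factor(ifelse(dat$g == "C", "C", "AB"))  # impose A = B

bf_full       <- lmBF(y ~ g,  data = dat)
bf_restricted <- lmBF(y ~ g2, data = dat)

bf_restricted / bf_full  # BF > 1 favors the constraint A = B
```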
Friday, December 5, 2014
A parable on confidence intervals: why "confidence" is misleading
Null hypothesis significance testing (NHST) is increasingly falling out of style with methodologically minded behavioral and social scientists. Many diverse critiques have been leveled against significance testing; the debate is now increasingly about what should replace it. Building on work with my colleagues (see here and here), I discuss and critique one replacement option that has been persistently suggested over the years: confidence procedures. We begin with a parable.
Sunday, February 23, 2014
Bayes factor t tests, part 2: Two-sample tests
In the previous post, I introduced the logic of Bayes factors for one-sample designs by means of a simple example. In this post, I will give more detail about the models and assumptions used by the BayesFactor package, and also show how to do simple analyses of two-sample designs.
See the previous posts for background.
This article will cover two-sample t tests.
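For readers who want to try it immediately, here is a minimal two-sample sketch with simulated data (the post itself goes into the underlying models and priors):

```r
# Minimal sketch of a two-sample Bayes factor t test with default priors.
library(BayesFactor)

set.seed(1)
x <- rnorm(30, mean = 0.5)  # group 1
y <- rnorm(30, mean = 0.0)  # group 2

ttestBF(x = x, y = y)       # BF: a difference vs. the point null of none
```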
Labels: Bayes, Bayes factor, BayesFactor, R, t test, theory
Wednesday, February 12, 2014
Bayes factor t tests, part 1
In my first post, I described the general logic of Bayes factors. In this post, I will continue discussing the general logic of Bayes factors while introducing some of the basic functionality of the BayesFactor package.
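The post's labels suggest it uses Student's sleep data, which ships with R; here is a minimal one-sample sketch along those lines (not necessarily the post's exact analysis):

```r
# One-sample (paired) Bayes factor t test on the classic sleep data:
# test whether the mean within-person difference in extra sleep is zero.
library(BayesFactor)

diffs <- with(sleep, extra[group == "2"] - extra[group == "1"])
ttestBF(x = diffs)  # BF: nonzero mean difference vs. the point null
```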
Labels: Bayes, Bayes factor, BayesFactor, one-sample, R, sleep, Student, t test, theory
Sunday, February 9, 2014
What is a Bayes factor?
The BayesFactor package
This blog is a companion to the BayesFactor package in R (website), which supports inference by Bayes factors in common research designs. Bayes factors have been proposed as more principled replacements for common classical statistical procedures such as \(p\) values; this blog will offer tutorials in using the package for data analysis. In this first post, I describe the general logic of Bayes factors using a very simple research example. In the coming posts, I will show how to do a more complete Bayesian data analysis using the R package.
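As a self-contained taste of that logic (a toy example of my own, not the post's): a Bayes factor is a ratio of marginal likelihoods, that is, how well each hypothesis predicted the data, averaged over its prior.

```r
# Toy Bayes factor: 7 successes in 10 trials; point null p = 0.5 versus
# an alternative that places a uniform prior on p.
k <- 7; n <- 10

marg_null <- dbinom(k, n, prob = 0.5)               # H0's prediction of the data
marg_alt  <- integrate(function(p) dbinom(k, n, p), # H1 averages over its prior
                       lower = 0, upper = 1)$value

marg_alt / marg_null  # BF for the alternative vs. the null (about 0.78 here)
```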
Labels: Bayes, Bayes factor, BayesFactor, R, theory