To consider the POWER of your statistical analysis, we need to take a step back and talk briefly about Hypothesis tests and their relationship with POWER.
Remember how you start your research? With a hypothesis. For our little example we will have a null hypothesis that says the mean height of cats is equal to the mean height of dogs. The alternate hypothesis would then say that the mean height of cats is not equal to the mean height of dogs.
Ho: µcats = µdogs
Ha: µcats ≠ µdogs
We are using an alpha value of 5%, so we will reject Ho if the p-value from our test is below 0.05. We went out and measured 4 cats and 4 dogs, and their heights (inches) are:
Cats: 11, 13, 11, 14
Dogs: 24, 21, 18, 28
The mean height for cats is 12.25 with a standard deviation of 1.5
The mean height for dogs is 22.8 with a standard deviation of 4.3
I can conduct a t-test, and a pooled two-sample t-test on these data gives a p-value of about 0.004, well below 0.05. With data such as this I can also calculate the variation around the mean (mean ± 1 standard deviation), such that I have 10.75-13.75 (12.25 ± 1.5) for the cats and 18.5-27.1 (22.8 ± 4.3) for the dogs. Do the ranges overlap? No.
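If you would like to check these numbers yourself, here is a minimal Python sketch. It assumes a pooled-variance (equal-variance) two-sample t-test, since the post does not say which variant was run, and it compares the t statistic against 2.447, the two-sided critical value for alpha = 0.05 with 6 degrees of freedom taken from a standard t table. The function name `pooled_t` is just an illustrative choice.

```python
from math import sqrt
from statistics import mean, stdev

cats = [11, 13, 11, 14]
dogs = [24, 21, 18, 28]


def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(b) - mean(a)) / standard_error


# Two-sided critical value for alpha = 0.05 and df = 4 + 4 - 2 = 6 (from a t table)
T_CRIT = 2.447

t_stat = pooled_t(cats, dogs)
print(f"cats: mean = {mean(cats):.2f}, sd = {stdev(cats):.2f}")
print(f"dogs: mean = {mean(dogs):.2f}, sd = {stdev(dogs):.2f}")
print(f"t = {t_stat:.2f}, reject Ho: {abs(t_stat) > T_CRIT}")
```

Comparing |t| with the critical value gives the same reject / do-not-reject decision as comparing the p-value with alpha, without needing a statistics package.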
What conclusion do we draw?
That we reject the Null hypothesis and state that dogs are significantly taller than cats, by an average of about 10″.
Sounds great, right? We did expect that the dogs would be taller than cats. So right from the beginning, in this example, our experience and knowledge of cats and dogs told us that the Null hypothesis was false, and our little sample agreed!
Let’s review this table – in our case we were working with a Ho that we knew to be false and we rejected the Ho – so we have NO ERROR.
| | Ho is TRUE | Ho is FALSE |
|---|---|---|
| REJECT the NULL Hypothesis | Type I error (ALPHA) | No error (POWER = 1-BETA) |
| ACCEPT the NULL Hypothesis | No error (1-ALPHA) | Type II error (BETA) |
We’re going to repeat this experiment and measure another 8 animals – 4 cats and 4 dogs.
Ho: µcats = µdogs
Ha: µcats ≠ µdogs
We are again using an alpha value of 5%, so we will reject Ho if the p-value is below 0.05. We have height measurements (inches) of 4 cats and 4 dogs:
Cats: 21, 13, 11, 14
Dogs: 23, 21, 18, 14
The mean height for cats is 14.8 with a standard deviation of 4.3
The mean height for dogs is 19.0 with a standard deviation of 3.9
I can conduct a t-test and it provides me with a p-value of 0.19. With data such as this I can calculate the variation around the mean, such that I have 10.5-19.1 (14.8 ± 4.3) for the cats and 15.1-22.9 (19.0 ± 3.9) for the dogs. Do the ranges overlap? Yes.
What conclusion do we draw?
That we will NOT reject the Null hypothesis: with this sample we cannot conclude that the average heights of cats and dogs differ.
Are we comfortable with this? If you review the table presented above, we still have a FALSE Ho, and this time around we did NOT reject the Null hypothesis, which means we have committed a Type II or Beta error.
A Type II error is directly related to the POWER of the test. By definition, the power of a statistical test is the probability that the test will correctly reject the null hypothesis when it is false. In other words, POWER = 1-BETA.
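One way to make this definition concrete is to simulate it: pretend we know the true populations, repeat the 4-cats-and-4-dogs experiment many times, and count how often the test rejects Ho. The sketch below does that in Python. The "true" means (12 and 20 inches) and the common standard deviation (4 inches) are made-up values for illustration only, not estimates from this post, and the test is again assumed to be a pooled two-sample t-test with the table critical value 2.447 for 6 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, stdev


def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))


def simulated_power(mu_a, mu_b, sigma, n, t_crit, n_sims=4000, seed=1):
    """Fraction of simulated experiments in which Ho is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(n_sims):
        a = [rng.gauss(mu_a, sigma) for _ in range(n)]
        b = [rng.gauss(mu_b, sigma) for _ in range(n)]
        if abs(pooled_t(a, b)) > t_crit:
            rejections += 1
    return rejections / n_sims


# Hypothetical "true" population values, chosen only for illustration
power = simulated_power(mu_a=12.0, mu_b=20.0, sigma=4.0, n=4, t_crit=2.447)
print(f"estimated POWER = {power:.2f}, estimated BETA = {1 - power:.2f}")
```

If you set `mu_a` equal to `mu_b`, Ho is true and the same function estimates the Type I error rate, which should land close to alpha.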
POWER is related to a number of factors:
- sample size
- effect size – or the size of the difference between treatment groups
- variation of our outcome variable
- level of significance – alpha
Consider our example above: what factors could we change to increase the POWER of our test and make it less likely that we see results like the second set of data we collected?
- Sample size
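To see how much sample size matters, here is a rough Python sketch using the normal approximation to the power of a two-sided, two-sample comparison with equal group sizes. The 4-inch true difference and 4-inch standard deviation are assumptions picked to be in the neighbourhood of our second sample, and `approx_power` is an illustrative name. Because it uses the normal rather than the t distribution, this approximation is optimistic for very small samples, so treat the numbers as a guide only.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n, alpha=0.05):
    """Normal-approximation power for a two-sided test with n animals per group."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standardized true difference: delta divided by the standard error of the difference
    shift = abs(delta) / (sigma * sqrt(2 / n))
    # Probability of landing in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Assumed true difference of 4 inches and standard deviation of 4 inches
for n in (4, 8, 16, 32):
    print(f"n per group = {n:2d}: approximate POWER = {approx_power(4.0, 4.0, n):.2f}")
```

The same function also lets you try a larger difference (effect size), a smaller standard deviation, or a different alpha to see how each of the factors listed above moves the POWER.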
There are several ways to calculate the POWER of a statistical test. SAS has 2 PROCs: Proc POWER and Proc GLMPOWER. Review the SASsy Fridays post on these. There are also many online calculators. Please choose one whose method you can defend.