Introduction to Inference
Null Hypothesis
- a statistical statement that assumes there is no effect, no difference, or no relationship between variables being studied
- the default position that any observed difference is due to random chance rather than a real effect
- the question is whether there is enough evidence to reject the null hypothesis
Hypothesis Testing
- goal: decide whether the data provide enough evidence to reject the null hypothesis in favor of an alternative hypothesis that suggests there is a real effect
- We only reject the null hypothesis if we find strong statistical evidence against it
Think of it like a criminal trial - the defendant is presumed innocent (null hypothesis) until proven guilty beyond a reasonable doubt. Similarly, we assume no effect exists until we have strong statistical evidence suggesting otherwise.
Example: Null and Alternative Hypotheses
\(H_0\) (Null Hypothesis): The variables region and happiness score are independent. Any difference in scores across regions is due to natural variability inherent in the population.
\(H_1\) (Alternative Hypothesis): The variables region and happiness score are not independent. The difference in scores across regions is not due to natural variability alone.
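One way to put these two hypotheses to work is a permutation test, sketched below. The notes do not name a specific test, so this choice is an assumption, and the happiness scores for the two regions are made up purely for illustration. The idea: if region and score are independent, shuffling the region labels should often produce a difference in means as large as the observed one. The fraction of shuffles that do so is this test's p-value.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the fraction of label shuffles whose absolute difference
    in means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling breaks any real link between region and score,
        # which is exactly what H0 claims.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Made-up happiness scores for two hypothetical regions (not real data).
region_a = [7.2, 7.5, 6.9, 7.8, 7.1, 7.4, 7.6, 7.0]
region_b = [5.1, 5.9, 6.2, 5.4, 5.7, 6.0, 5.3, 5.8]

p_value = permutation_test(region_a, region_b)
print(f"Fraction of shuffles at least as extreme: {p_value:.4f}")
```

Because the two made-up groups do not overlap at all, very few shuffles match the observed gap, so the data would count as strong evidence against \(H_0\).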
Type I and Type II Errors
Type I Error (False Positive):
- Rejecting the null hypothesis when it is actually true
Type II Error (False Negative):
- Failing to reject the null hypothesis when it is actually false
A courtroom analogy:
- Type I Error: Convicting an innocent person
- Type II Error: Letting a guilty person go free
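Both error types can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not a method from these notes: it uses a two-sided z-test on normally distributed data with a known standard deviation of 1, a 5% false-positive cutoff, a sample size of 30, and a hypothetical true effect of 0.5. The function name `rejection_rate` is likewise invented for the example.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mean, n=30, alpha=0.05, trials=5_000, seed=1):
    """Fraction of simulated experiments that reject H0: population mean == 0.

    Uses a two-sided z-test with a known standard deviation of 1.
    """
    rng = random.Random(seed)
    # When H0 is true, |z| exceeds this cutoff with probability alpha.
    z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        # z = (sample mean - 0) / (standard deviation / sqrt(n))
        z = (sum(sample) / n) / (1.0 / n ** 0.5)
        if abs(z) > z_cutoff:
            rejections += 1
    return rejections / trials

# H0 is true (true mean is 0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false (true mean is 0.5): every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type I error rate:  {type_1_rate:.3f}")
print(f"Type II error rate: {type_2_rate:.3f}")
```

The Type I rate should land close to the chosen 5% cutoff, while the Type II rate depends on the effect size and sample size: a smaller effect or a smaller sample lets more real effects go undetected.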
p-value
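- the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed; it is not the probability of a Type I error (that probability is set by the alpha level α)
- the lower the p-value, the less likely a result this extreme would be if the null hypothesis were true, and so the stronger the evidence against the null hypothesis
- the alpha level α is the cutoff for rejecting the null hypothesis (reject when the p-value falls below α) and is usually 0.05; it is often adjusted to a stricter value if more than one statistical test is run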
“p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone” – American Statistical Association (ASA)
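As a worked example of computing a p-value and comparing it with α, the sketch below runs an exact two-sided binomial test on a hypothetical coin that lands heads 61 times in 100 flips, with the null hypothesis that the coin is fair. The coin data, the function name `binomial_p_value`, and the count of three tests are all assumptions for illustration. The Bonferroni correction (dividing α by the number of tests) is one common way to adjust α; the notes do not say which adjustment they have in mind.

```python
from math import comb

def binomial_p_value(successes, trials, p_null=0.5):
    """Exact two-sided p-value for a binomial test.

    Sums the probability, under H0, of every outcome that is no more
    likely than the observed one (that is, at least as extreme).
    """
    probs = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))

alpha = 0.05
n_tests = 3  # hypothetical number of tests run in the same study

p_value = binomial_p_value(61, 100)
reject_single = p_value < alpha                # one test, unadjusted alpha
reject_bonferroni = p_value < alpha / n_tests  # Bonferroni-adjusted alpha

print(f"p-value: {p_value:.4f}")
print(f"Reject H0 at alpha = {alpha}: {reject_single}")
print(f"Reject H0 at alpha = {alpha / n_tests:.4f}: {reject_bonferroni}")
```

In this example the p-value is below 0.05 but above the adjusted cutoff, so the same data lead to rejecting the null hypothesis for a single test but not after correcting for three tests.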