Introduction to Inference

Null Hypothesis

a statistical statement that assumes there is no effect, no difference, or no relationship between variables being studied
the default position that any observed difference is due to random chance rather than a real effect
the question is whether there is enough evidence to reject the null hypothesis

Hypothesis Testing

goal: decide whether the data provide enough evidence to reject the null hypothesis in favor of an alternative hypothesis that suggests there is a real effect
We only reject the null hypothesis if we find strong statistical evidence against it

Think of it like a criminal trial - the defendant is presumed innocent (null hypothesis) until proven guilty beyond a reasonable doubt. Similarly, we assume no effect exists until we have strong statistical evidence suggesting otherwise.

Example: Null and Alternative Hypotheses

\(H_0\) (Null Hypothesis): The variables region and happiness score are independent. Any difference in scores across regions is due to natural variability inherent in the population.

\(H_1\) (Alternative Hypothesis): The variables region and happiness score are not independent. The difference in scores across regions is not due to natural variability alone.
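
One way to put these two hypotheses to the test is a permutation test. The sketch below is only an illustration, not the method these notes prescribe. The scores are made up, the region names "A", "B", "C" are placeholders, and the helper names `spread_of_group_means` and `permutation_p_value` are hypothetical. The idea: if region and happiness score are independent, shuffling the region labels should produce differences between regions about as large as the one observed.

```python
import random
from statistics import mean

def spread_of_group_means(scores, regions):
    """Test statistic: range (max - min) of the per-region mean scores."""
    groups = {}
    for score, region in zip(scores, regions):
        groups.setdefault(region, []).append(score)
    means = [mean(values) for values in groups.values()]
    return max(means) - min(means)

def permutation_p_value(scores, regions, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels give a spread at least as large as observed."""
    rng = random.Random(seed)
    observed = spread_of_group_means(scores, regions)
    shuffled = list(regions)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # under H0, region labels are exchangeable
        if spread_of_group_means(scores, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up scores for illustration only (not real survey data).
scores = [7.4, 7.6, 7.2, 7.5, 5.1, 4.8, 5.3, 5.0, 6.0, 6.2, 5.9, 6.1]
regions = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

p = permutation_p_value(scores, regions)
print(f"estimated p-value: {p:.4f}")
```

A small estimated p-value here means that shuffled labels rarely reproduce a gap between regions as large as the observed one, which counts as evidence against \(H_0\).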

Type I and Type II errors

Type I Error (False Positive):

Rejecting the null hypothesis when it is actually true

Type II Error (False Negative):

Failing to reject the null hypothesis when it is actually false

A courtroom analogy:

Type I Error: Convicting an innocent person
Type II Error: Letting a guilty person go free
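
Both error types can be made concrete with a small simulation. This is a rough sketch under assumptions that are not in the notes: normally distributed data with a known standard deviation of 1, a two-sided z-test, a cutoff of 0.05, and a true mean of 0.5 for the case where the null hypothesis is false. The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu0, with known standard deviation sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, n=30, trials=5000, alpha=0.05, seed=1):
    """Fraction of simulated studies in which H0: mean == 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials

# H0 is true (mean really is 0): every rejection is a false positive.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false (mean is 0.5): every failure to reject is a false negative.
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

When the null hypothesis is true, the rejection rate should land near the chosen cutoff. The Type II rate depends on the sample size and on how large the real effect is.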

p-value

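the probability of getting a result at least as extreme as the one observed, assuming the null hypothesis is true (it is not the probability of a Type I error)
the lower the p-value, the less likely such a result would be if the null hypothesis were true
the alpha level α is the cutoff for rejecting the null hypothesis (reject when p-value < α) and sets the Type I error rate we accept; it is usually 0.05 (often the alpha level is adjusted if more than one statistical test is run)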
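
The notes do not say which adjustment to use when several tests are run. The sketch below assumes the Bonferroni correction, one common choice, which divides α by the number of tests. The p-values are hypothetical, and the family-wise error formula assumes the tests are independent.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test cutoff under the Bonferroni correction."""
    return alpha / n_tests

def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests when every H0 is true."""
    return 1 - (1 - alpha) ** n_tests

def decide(p_values, alpha=0.05):
    """Return True for each test whose p-value falls below the adjusted cutoff."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]

# Hypothetical p-values from four separate tests.
p_values = [0.003, 0.020, 0.040, 0.300]

decisions = decide(p_values)
for p, reject in zip(p_values, decisions):
    print(f"p = {p:.3f} -> {'reject H0' if reject else 'fail to reject H0'}")

print(f"unadjusted family-wise error rate: {family_wise_error_rate(0.05, 4):.3f}")
```

With four tests the adjusted cutoff is 0.05 / 4 = 0.0125, so only the first p-value leads to rejection, even though three of the four fall below the unadjusted 0.05.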

“p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone” – American Statistical Association (ASA)