Why Sample Size Matters
An underpowered study — one with too few participants — risks failing to detect a real effect (Type II error). Conversely, an unnecessarily large sample wastes resources and may expose more participants to research risks than necessary. Proper sample size calculation balances statistical power, precision, and feasibility.
Key Concepts
- Significance level (alpha): The probability of a Type I error (false positive). Conventionally set at 0.05 (5%).
- Statistical power (1 - beta): The probability of detecting a true effect. Typically set at 0.80 (80%) or 0.90 (90%).
- Effect size: The magnitude of difference you expect to detect. Smaller effects require larger samples.
- Variability: More variable outcomes require larger samples.
Formulas by Study Design
Cross-sectional (prevalence study): n = Z² × p(1-p) / d², where Z is the Z-value for the desired confidence level (1.96 for 95%), p is the expected prevalence, and d is the desired precision (margin of error).
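As a worked example, the prevalence formula above can be computed with Python's standard library (the function name and the illustrative figures of 20% prevalence and a ±5% margin are ours, not from the source):

```python
from math import ceil
from statistics import NormalDist

def prevalence_sample_size(p, d, confidence=0.95):
    """n = Z^2 * p(1-p) / d^2 for estimating a single prevalence."""
    # Z-value for a two-sided confidence interval (1.96 when confidence=0.95)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / d**2)

# Expected prevalence 20%, margin of error +/-5%, 95% confidence:
print(prevalence_sample_size(0.20, 0.05))  # 246
```

Note that the result is rounded up, since a sample size must be a whole number and rounding down would fall short of the desired precision.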
Comparing two proportions: Used in case-control and cohort studies. Requires the expected proportion in each group, desired power, and significance level.
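A common closed form for this case is the unpooled normal approximation, n per group = (Z_alpha/2 + Z_beta)² × [p1(1-p1) + p2(1-p2)] / (p1 - p2)². A minimal sketch (the 30% vs 15% exposure figures are an invented illustration):

```python
from math import ceil
from statistics import NormalDist

def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for comparing two proportions (unpooled normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta)**2 * variance / (p1 - p2)**2)

# Expecting 30% exposure among cases vs 15% among controls:
print(two_proportion_sample_size(0.30, 0.15))  # 118 per group
```

Pooled-variance variants and continuity corrections give slightly larger numbers; tools such as OpenEpi report both.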
Comparing two means: Requires the expected difference in means, standard deviation, power, and significance level.
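For two means with a common standard deviation, the standard approximation is n per group = 2σ²(Z_alpha/2 + Z_beta)² / Δ², where Δ is the expected difference. A sketch (the blood-pressure figures are an invented illustration):

```python
from math import ceil
from statistics import NormalDist

def two_mean_sample_size(delta, sd, alpha=0.05, power=0.80):
    """Per-group n for comparing two means (equal SDs, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Smaller delta relative to sd -> larger required sample
    return ceil(2 * (sd / delta)**2 * (z_alpha + z_beta)**2)

# Detect a 5 mmHg difference in blood pressure, SD 10 mmHg:
print(two_mean_sample_size(delta=5, sd=10))  # 63 per group
```

This illustrates the effect-size point from Key Concepts: halving Δ quadruples the required n, since Δ appears squared in the denominator.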
Free Software Tools
- G*Power: Free desktop application with a graphical interface. Supports t-tests, F-tests, chi-square, regression, and more.
- OpenEpi: Free web-based tool (openepi.com) for epidemiological calculations including sample size for various study designs.
- PS (Power and Sample Size): Free software from Vanderbilt University for survival analysis and other designs.
- R packages: pwr, samplesize, and TrialSize for programmatic calculation.
Common Mistakes
- Using arbitrary sample sizes ("we recruited 100 patients because that seemed enough")
- Not accounting for expected dropouts/non-response (inflate the calculated sample, typically by 10-20%; dividing by 1 minus the expected dropout rate gives the exact adjustment)
- Using unrealistic effect size estimates (too optimistic)
- Confusing sample size for the total study with sample size per group
- Not adjusting for clustering in cluster-randomized designs
- Calculating sample size after data collection (post-hoc power analysis is meaningless)
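The dropout adjustment in the list above is worth making precise: dividing the calculated n by (1 - expected dropout rate) guarantees enough complete cases, whereas simply adding 10-20% slightly undershoots (a sketch; the function name and figures are ours):

```python
from math import ceil

def inflate_for_dropout(n, dropout_rate):
    """Inflate a calculated sample size so ~n complete cases remain after dropout."""
    # Dividing by (1 - rate) is exact; adding rate*n would leave (1 - rate^2)*n < n cases
    return ceil(n / (1 - dropout_rate))

# 246 participants required, expecting 15% dropout:
print(inflate_for_dropout(246, 0.15))  # recruit 290
```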