Multiplicity Issues:
Multiplicity issues arise in a number of contexts, but they generally boil down to the same thing: repeated looks at a data set in different ways, until something “statistically significant” emerges. See multiple comparisons for how to handle multiple pairwise testing in conjunction with ANOVA. In observational studies, problems arise when many different models are applied to the same data, particularly when a highly specific thesis-to-be-tested is not stated in advance. Stanley Young draws readers’ attention to a study (Mostofsky et al.) claiming an association between certain constituents of air pollution and ischemic stroke (American Journal of Epidemiology, December 27, 2012, letter to the editor). Young points out that, given the number of predictors and adjustors in the data set, 537 million different models could be constructed. How many models were tried before a statistically significant association was found? Where multiple testing occurs and is properly disclosed, procedures that control the false discovery rate (the expected proportion of false positives among all results declared “significant”) can be used to keep the inflation of Type I errors in check.
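
A quick numerical illustration of why repeated looks are dangerous: if m independent tests are each run at level α = 0.05 and every null hypothesis is true, the chance of at least one false positive is 1 − (1 − α)^m. (This is a sketch under an independence assumption; candidate models in a search like the one Young describes are correlated, but the direction of the inflation is the same.)

```python
# Chance of at least one false positive across m independent tests,
# each run at significance level alpha, when ALL null hypotheses are true:
#   P(at least one false positive) = 1 - (1 - alpha)**m
alpha = 0.05
for m in (1, 10, 100, 1000):
    print(f"{m:>5} tests: P(>=1 false positive) = {1 - (1 - alpha) ** m:.3f}")
```

Already at 100 tests a spurious “significant” result is close to guaranteed; a search over millions of candidate models makes one a near-certainty.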
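For controlling the false discovery rate in practice, the Benjamini-Hochberg step-up procedure is a standard tool. Below is a minimal sketch (the function name and the NumPy dependency are illustrative choices, not anything prescribed by the sources above): sort the p-values, find the largest k with p_(k) ≤ (k/m)·q, and reject the hypotheses with the k smallest p-values.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns a boolean array marking which tests are declared significant
    while controlling the false discovery rate at level q.
    """
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)                   # indices, smallest p first
    ranked = pvals[order]
    thresholds = np.arange(1, m + 1) / m * q    # (k/m) * q for k = 1..m
    below = ranked <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest k meeting the bound
        rejected[order[: k + 1]] = True         # reject the k+1 smallest p-values
    return rejected

# Example: 20 p-values from true nulls plus 3 from genuine effects.
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(size=20), [1e-4, 4e-4, 2e-3]])
print(benjamini_hochberg(pvals, q=0.05))
```

Unlike a Bonferroni correction, which controls the probability of any false positive at all, this procedure limits only the expected proportion of false positives among the results declared significant, so it retains more power when many tests are run.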