Sample size computation for multiple comparisons

Traditional sample size computation based on "power" does not apply directly to multiple comparisons, because the power of a test of homogeneity includes the probability of an incorrect decision. For example, the F-test may reject because the sample mean of treatment 2 is much larger than the sample mean of treatment 3, when in fact the population mean of treatment 2 is smaller than the population mean of treatment 3. Thus, the power of a test of homogeneity includes some probability of incorrect multiple comparison inference, which is undesirable.

The sample size computation implemented here computes the joint probability of "correct" and "useful" inference, where

correct inference = all separations are in the right direction

useful inference = all treatments sufficiently far apart are separated

See Appendix C of Multiple Comparisons: Theory and Methods for a discussion of this concept and details of the computation.

In a paper titled "On an Approach to Sample Size Determination for Confidence Intervals Proposed by Hsu," which appeared in the JSM97 proceedings of the Biopharmaceutical Section, Olivier Guilbaud of Astra gave a technique to easily and accurately approximate the desired sample size. His idea is as follows. If one lets A be the event {correct inference} and B the event {useful inference}, then computing P{A and B} by

  1. pretending the complements of A and B are disjoint is conservative, based on the Bonferroni inequality;
  2. pretending A and B are independent is liberal, based on Kimball's inequality.
In the range of alphas with which one is typically concerned, 1. and 2. give probabilities that are very close to each other. His paper includes sample SAS code. So, in lieu of the computation as described in Appendix D, which is for one-way designs, I suggest Dr. Guilbaud's technique, which is applicable to more general designs.
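The bracketing idea above can be sketched in a few lines. The functions `p_correct` and `p_useful` below are hypothetical placeholders for the marginal probabilities of correct and useful inference at sample size n (in practice these come from the multiple-comparison procedure in use); the code only illustrates how the Bonferroni lower bound and the independence approximation bracket P{A and B}, and how the conservative bound can drive a sample size search.

```python
def bonferroni_lower(p_a, p_b):
    """Conservative bound: pretend the complements of A and B are
    disjoint, so P{A and B} >= P{A} + P{B} - 1 (Bonferroni)."""
    return max(0.0, p_a + p_b - 1.0)


def independence_approx(p_a, p_b):
    """Liberal approximation: pretend A and B are independent, so
    P{A and B} <= P{A} * P{B} (by Kimball's inequality)."""
    return p_a * p_b


def smallest_n(p_correct, p_useful, target, n_max=10_000):
    """Smallest n whose conservative (Bonferroni) bound on
    P{correct and useful inference} meets the target.

    p_correct, p_useful: hypothetical functions mapping a sample size n
    to the marginal probabilities P{A} and P{B}; they are assumed to be
    nondecreasing in n for this simple linear search to make sense.
    """
    for n in range(2, n_max + 1):
        if bonferroni_lower(p_correct(n), p_useful(n)) >= target:
            return n
    return None


if __name__ == "__main__":
    # Toy monotone marginals, purely for illustration.
    p_correct = lambda n: 1.0 - 1.0 / n
    p_useful = lambda n: 1.0 - 0.5 / n
    n = smallest_n(p_correct, p_useful, target=0.93)
    print(n)
```

Because the two approximations are close for typical alphas, the gap between `bonferroni_lower` and `independence_approx` at the returned n gives a quick check on how much the conservative answer could overshoot.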