What happens when a sample size is not big enough?

A good statistical study is one that is well designed and leads to valid conclusions. This, however, is not always the case, even in published studies. In his seminal power analysis of the Journal of Abnormal and Social Psychology, Cohen (1962) concluded that over half of the published studies were insufficiently powered to result in statistical significance for the main hypothesis.

What is Sample Size? 

The power of a statistical test is the probability that a test will reject the null hypothesis when the null hypothesis is false. That is, power reflects the probability of not committing a type II error. The two major factors affecting the power of a study are the sample size and the effect size.
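To make the definition concrete, here is a minimal sketch of a power calculation in Python. It assumes a two-sided, one-sample z-test with a known standard deviation, which the text above does not specify, and the function name `z_test_power` is purely illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (illustrative assumption).

    effect_size is the true shift of the mean in standard-deviation units.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection threshold
    shift = effect_size * sqrt(n)               # mean of the test statistic
    # Probability that the statistic falls in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Power grows with the sample size for a fixed effect size
for n in (10, 50, 200):
    print(n, round(z_test_power(0.3, n), 3))
```

When the effect size is zero the function returns alpha, the Type I error rate, which is a quick sanity check on the calculation.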

The larger the sample size, the smaller the effect size that can be detected. The reverse is also true: a small sample can detect only large effects. While researchers generally have a good idea of the effect size to expect in their planned study, errors in determining an appropriate sample size often lead to an underpowered study. This poses both scientific and ethical issues for researchers.
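The trade-off between sample size and detectable effect size can be sketched as follows. This assumes a two-sided comparison of two group means under the usual normal approximation, with the effect size expressed in standard-deviation units; the function name `n_per_group` and the default power of 0.80 are illustrative choices, not something stated in the article.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate sample size per group for comparing two means.

    Normal-approximation formula: n = 2 * (z_alpha + z_power)^2 / d^2,
    where d is the standardized effect size (an assumption of this sketch).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


print(n_per_group(0.8))  # large effect  -> 25 per group
print(n_per_group(0.5))  # medium effect -> 63 per group
print(n_per_group(0.2))  # small effect  -> 393 per group
```

Halving the effect size roughly quadruples the required sample size, because the effect size enters the formula squared.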

A study that has a sample size which is too small may produce inconclusive results and could also be considered unethical, because exposing human subjects or lab animals to the possible risks associated with research is only justifiable if there is a realistic chance that the study will yield useful information.

Similarly, a study that has a sample size which is too large will waste scarce resources and could expose more participants than necessary to any related risk. Thus an appropriate determination of the sample size used in a study is a crucial step in the design of a study.

More recent studies analysing the power of published papers have shown that large numbers of papers are still being published with insufficient power. With sample size software such as nQuery available to calculate the appropriate sample size for any given power, such issues should not arise so often today.

To summarize why sample size is important:

  • The two major factors affecting the power of a study are the sample size and the effect size
  • A study should only be undertaken once there is a realistic chance that the study will yield useful information
  • A study that has a sample size which is too small may produce inconclusive results and could also be considered unethical by exposing human subjects or lab animals to needless risk
  • A study that is too large will waste scarce resources and could expose more participants than necessary to any related risk
  • Thus an appropriate determination of the sample size used in a study is a crucial step in the design of a study

What happens when a sample size is not big enough?

Updated March 13, 2018

By Chris Deziel

Determining the veracity of a parameter or hypothesis as it applies to a large population can be impractical or impossible for a number of reasons, so it's common to determine it for a smaller group, called a sample. A sample size that is too small reduces the power of the study and increases the margin of error, which can render the study meaningless. Researchers may be compelled to limit the sample size for economic and other reasons. To ensure meaningful results, they usually adjust the sample size based on the required confidence level and margin of error, as well as on the expected variability among individual results.

The power of a study is its ability to detect an effect when there is one to be detected. This depends on the size of the effect because large effects are easier to notice and increase the power of the study.

The power of the study is also a gauge of its ability to avoid Type II errors. A Type II error occurs when the study fails to reject the null hypothesis when, in fact, an alternative hypothesis is true. A sample size that is too small increases the likelihood of a Type II error, which is the same as saying it decreases the power of the study.
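A small simulation can illustrate how often a real effect is missed at different sample sizes. The setup is an assumption made for illustration only: observations are drawn from a normal distribution with a true mean of 0.5 and a known standard deviation of 1, and a two-sided z-test is run against a null mean of 0. The name `type_ii_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_ii_rate(n, true_mean=0.5, alpha=0.05, trials=5000, seed=1):
    """Estimate how often a z-test of H0: mean = 0 misses a real effect."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = fmean(sample) * sqrt(n)  # known SD = 1, null mean = 0
        if abs(z) < z_crit:          # failed to reject a false null
            misses += 1
    return misses / trials


for n in (10, 50):
    print(n, type_ii_rate(n))
```

Under these assumptions the smaller sample misses the effect in well over half of the simulated studies, while the larger sample misses it only occasionally.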

To determine a sample size that will provide the most meaningful results, researchers first choose the preferred margin of error (ME), the maximum amount by which they are willing to let the sample result deviate from the true population value. It's usually expressed as a percentage, as in plus or minus 5 percent (0.05 as a decimal). Researchers also need a confidence level, which they determine before beginning the study. This number corresponds to a Z-score, which can be obtained from tables. Common confidence levels are 90 percent, 95 percent and 99 percent, corresponding to Z-scores of 1.645, 1.96 and 2.576 respectively. Finally, researchers state the expected standard deviation (SD) of the results. For a new study, it's common to choose 0.5, the most conservative value because it yields the largest sample size.

Having determined the margin of error, Z-score and standard of deviation, researchers can calculate the ideal sample size by using the following formula:

$$\text{Sample Size} = \frac{(\text{Z-score})^2 \times \text{SD} \times (1 - \text{SD})}{\text{ME}^2}$$
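A minimal sketch of the formula above in Python, assuming the Z-score and the margin of error are both squared and that the margin of error is entered as a decimal (0.05 for 5 percent). The function name `survey_sample_size` is illustrative, and the Z-score is computed from the confidence level rather than read from a table.

```python
from math import ceil
from statistics import NormalDist


def survey_sample_size(confidence, margin_of_error, sd=0.5):
    """Sample size from Z-score, SD and margin of error, rounded up."""
    # Two-sided Z-score for the chosen confidence level (0.95 -> about 1.96)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * sd * (1 - sd) / margin_of_error ** 2)


print(survey_sample_size(0.95, 0.05))  # 385
```

The result is rounded up because a fractional respondent is not possible and rounding down would slightly miss the target margin of error.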

In the formula, the sample size is directly proportional to the square of the Z-score and inversely proportional to the square of the margin of error. Consequently, for a fixed margin of error, reducing the sample size reduces the confidence level of the study, which is related to the Z-score. For a fixed confidence level, decreasing the sample size increases the margin of error.
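Rearranging the same formula for the margin of error shows how quickly precision is lost as the sample shrinks. This is a sketch under the same assumptions as the formula (SD of 0.5, margin of error as a decimal); `margin_of_error` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, sd=0.5):
    """Margin of error implied by a sample of size n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(sd * (1 - sd) / n)


# Cutting the sample to a quarter of its size doubles the margin of error
for n in (1600, 400, 100):
    print(n, round(margin_of_error(n), 4))
```

With 100 respondents at 95 percent confidence, the margin of error is close to plus or minus 10 percent, compared with about 5 percent for 400 respondents.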

In short, when researchers are constrained to a small sample size for economic or logistical reasons, they may have to settle for less conclusive results. Whether or not this is an important issue depends ultimately on the size of the effect they are studying. For example, a small sample size would give more meaningful results in a poll of people living near an airport who are affected negatively by air traffic than it would in a poll of their education levels.