1.4 Designed Experiments

Adapted by John Morgan Russell; from Barbara Illowsky and Susan Dean, David Diez, Mine Cetinkaya-Rundel and Christopher D. Barr; Julie Vu and David Harrington

Observational Studies vs. Experiments

Ignoring anecdotal evidence, there are two primary types of data collection: observational studies and controlled (designed) experiments. Remember, we typically cannot make claims of causality from observational studies because of the potential presence of confounding factors. However, making causal conclusions based on experiments is often reasonable if we control for those factors.

Suppose you want to investigate the effectiveness of vitamin D in preventing disease. You recruit a group of subjects and ask them if they regularly take vitamin D. You notice that the subjects who take vitamin D exhibit better health on average than those who do not. Does this prove that vitamin D is effective in disease prevention? It does not. There are many differences between the two groups beyond just vitamin D consumption. People who take vitamin D regularly often take other steps to improve their health: exercise, diet, other vitamin supplements, choosing not to smoke. Any one of these factors could influence health. As described, this study does not necessarily prove that vitamin D is the key to disease prevention.

Experiments ultimately aim to provide evidence for use in decision-making, so how could we narrow our focus and make claims of causality? In this section, you will learn important aspects of experimental design.

Designed Experiments

The purpose of an experiment is to investigate the relationship between two variables. When one variable causes change in another, we call the first variable the explanatory variable. The affected variable is called the response variable. In a randomized experiment, the researcher manipulates values of the explanatory variable and measures the resulting changes in the response variable. The different values of the explanatory variable may be called treatments. An experimental unit is a single object or individual being measured.

The main principles to follow in experimental design are:

Randomization
Replication
Control

Randomization

To provide evidence that the explanatory variable is indeed causing the changes in the response variable, it is necessary to isolate the explanatory variable. The researcher must design the experiment so that there is only one systematic difference between the groups being compared: the planned treatments. This is accomplished by randomly assigning experimental units to treatment groups. When subjects are assigned treatments randomly, potential lurking variables tend to be spread evenly among the groups. At this point, the only systematic difference between groups is the one imposed by the researcher. As a result, different outcomes measured in the response variable can reasonably be attributed to the different treatments. In this way, an experiment can provide evidence of a cause-and-effect connection between the explanatory and response variables.

Recall our previous example of investigating the effectiveness of vitamin D in preventing disease. Individuals in our trial could be randomly assigned, perhaps by flipping a coin, into one of two groups: the control group (no treatment) and the experimental group (extra doses of vitamin D).
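The random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name assign_groups, the 20 subject IDs, and the seed are hypothetical and not part of the original study. Note also that the sketch shuffles the subjects and splits the list in half, which guarantees equal group sizes; flipping a coin for each subject is also random assignment but usually gives groups of slightly different sizes.

```python
import random

def assign_groups(subject_ids, seed=None):
    """Randomly split subjects into a control group and a treatment group.

    Hypothetical helper for illustration: shuffles the IDs, then cuts
    the shuffled list in half so both groups have (nearly) equal size.
    """
    rng = random.Random(seed)      # seeded so the example is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "control": sorted(shuffled[:half]),      # no treatment
        "treatment": sorted(shuffled[half:]),    # extra doses of vitamin D
    }

# Assumed example: 20 subjects labeled 1 through 20.
groups = assign_groups(range(1, 21), seed=42)
print("Control:  ", groups["control"])
print("Treatment:", groups["treatment"])
```

Because chance alone decides each subject's group, habits such as exercise or smoking have no systematic way to end up concentrated in one group.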

Replication

The more cases researchers observe, the more accurately they can estimate the effect of the explanatory variable on the response. In a single study, we replicate by collecting a sufficiently large sample. Additionally, a group of scientists may replicate an entire study to verify an earlier finding. It is also helpful to subject individuals to the same treatment more than once, which is known as repeated measures.
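A small simulation can make the benefit of larger samples concrete. The sketch below is not drawn from any real study: it assumes a made-up population in which the response is normally distributed with mean 0.5 and standard deviation 1, and the function name spread_of_estimates is hypothetical. It repeats an imaginary study many times and reports how much the estimated mean varies from study to study.

```python
import random
import statistics

def spread_of_estimates(sample_size, n_studies=500, seed=0):
    """Standard deviation of the sample mean across many simulated studies.

    Assumption for illustration only: responses come from a normal
    distribution with mean 0.5 and standard deviation 1.0.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_studies):
        sample = [rng.gauss(0.5, 1.0) for _ in range(sample_size)]
        estimates.append(statistics.mean(sample))
    return statistics.stdev(estimates)

small = spread_of_estimates(10)      # studies with 10 cases each
large = spread_of_estimates(1000)    # studies with 1,000 cases each
print(f"Spread of estimates with n = 10:   {small:.3f}")
print(f"Spread of estimates with n = 1000: {large:.3f}")
```

Under these assumptions, the estimates from the larger studies cluster much more tightly around the true value, which is the sense in which more cases give a more accurate estimate.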

Control

The power of suggestion can have an important influence on the outcome of an experiment. Studies have shown that the expectations of the study participant can be as important as the actual medication. In one study of performance-enhancing drugs, researchers noted, “Results showed that believing one had taken the substance resulted in [performance] times almost as fast as those associated with consuming the drug itself. In contrast, taking the drug without knowledge yielded no significant performance increment.” ^[1]

It is often difficult to isolate the effects of the explanatory variable. To counter the power of suggestion, researchers set aside one treatment group as a control group. This group is given a placebo treatment, that is, a treatment that cannot influence the response variable. The control group helps researchers balance the effects of being in an experiment with the effects of the active treatments. Of course, if you are participating in a study and you know that you are receiving a pill that contains no actual medication, then the power of suggestion is no longer a factor. Blinding in a randomized experiment preserves the power of suggestion. When people involved in a research study are blinded, they do not know who is receiving the active treatment(s) and who is receiving the placebo treatment. A double-blind experiment is one in which both the subjects and the researchers who interact with the subjects are blinded.

Randomized experiments are an essential tool in research. The U.S. Food and Drug Administration typically requires that a new drug can only be marketed after two independently conducted randomized trials confirm its safety and efficacy; the European Medicines Agency has a similar policy. Large randomized experiments in medicine have provided the basis for major public health initiatives. In 1954, approximately 750,000 children participated in a randomized study comparing the polio vaccine with a placebo. In the United States, the results of the study quickly led to the widespread and successful use of the vaccine for polio prevention.

Example

How does sleep deprivation affect your ability to drive? A recent study measured its effects on 19 professional drivers. Each driver participated in two experimental sessions: one after normal sleep and one after 27 hours of total sleep deprivation. The treatments were assigned in random order. In each session, performance was measured on a variety of tasks including a driving simulation.

Your Turn!

The Smell & Taste Treatment and Research Foundation conducted a study to investigate whether smell can affect learning. Subjects completed pencil-and-paper mazes multiple times while wearing masks. They completed the mazes three times wearing floral-scented masks and three times with unscented masks. Participants were assigned at random to wear the floral mask during either the first three or last three trials. For each trial, researchers recorded the time it took to complete the maze and whether the subject’s impression of the mask’s scent was positive, negative, or neutral.

Describe the explanatory and response variables in this study.
What are the treatments?
Identify any lurking variables that could interfere with this study.
Is it possible to use blinding in this study?

Solution

The explanatory variable is scent and the response variable is the time it takes to complete the maze.
There are two treatments: a floral-scented mask and an unscented mask.
All subjects experienced both treatments, and the order of treatments was randomly assigned, so there were no systematic differences between the treatment groups. Random assignment addresses the problem of lurking variables.
Subjects will clearly know whether they can smell flowers or not, so subjects cannot be blinded in this study. Researchers timing the mazes can be blinded. The researcher who is observing a subject will not know which mask is being worn.

More Experimental Design

There are many different experimental designs, from the most basic (a single treatment group and a control group) to some very complicated designs. When an experiment involves more than one explanatory variable, these variables are often called factors, especially if they are categorical. The values of factors are often called levels. When there are multiple factors, the combinations of their levels are called treatment combinations. Some basic types of designs you may see are:

Completely randomized
Block design
Matched pairs design

Completely Randomized

This is the simplest design. The researcher decides how many treatments will be administered and then randomly assigns each participant to one of the treatment groups.

Block Design

Researchers sometimes know or suspect that variables outside of the treatment influence the response. Based on this, they may first group individuals into blocks and then randomly draw cases from each block for the treatment groups. This strategy is often referred to as blocking. For instance, if we are looking at the effect of a drug on heart attacks, we might first split patients in the study into low-risk and high-risk blocks, then randomly assign half the patients from each block to the control group and the other half to the treatment group, as shown in the figure below. This strategy ensures each treatment group has an equal number of low-risk and high-risk patients.

Figure 1.5: Block design. Figure description available at the end of the section.
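The blocking procedure described above can be sketched as follows. The patient IDs, the 8 low-risk and 12 high-risk counts, and the function name block_randomize are all hypothetical values chosen for illustration; they do not come from the figure or from a real trial.

```python
import random

def block_randomize(patients, seed=None):
    """Randomly split each block in half between control and treatment.

    patients: dict mapping a patient ID to its block label
              (for example "low-risk" or "high-risk").
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Step 1: create blocks by grouping patients with the same label.
    blocks = {}
    for patient_id, label in patients.items():
        blocks.setdefault(label, []).append(patient_id)

    # Step 2: within each block, shuffle and split in half.
    assignment = {"control": [], "treatment": []}
    for members in blocks.values():
        rng.shuffle(members)
        half = len(members) // 2
        assignment["control"].extend(members[:half])
        assignment["treatment"].extend(members[half:])
    return assignment

# Assumed example: patients 1-8 are low-risk, patients 9-20 are high-risk.
patients = {i: ("low-risk" if i <= 8 else "high-risk") for i in range(1, 21)}
result = block_randomize(patients, seed=7)
for group, ids in result.items():
    low = sum(1 for i in ids if patients[i] == "low-risk")
    print(f"{group}: {low} low-risk, {len(ids) - low} high-risk")
```

Because the random split happens separately inside each block, both groups receive the same number of low-risk patients and the same number of high-risk patients, which a single randomization over all patients would not guarantee.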

Matched Pairs

A matched pairs design is one where very similar individuals (or even the same individual) receive two different treatments (or treatment vs. control) and the results are compared. Though this design is very effective, it can be hard to find many suitably similar individuals. Some common forms of a matched pairs design are twin studies, before-and-after measurements, pre- and post-test situations, and crossover studies.

Was the use of a new wetsuit design responsible for an observed increase in swim velocities at the 2000 Summer Olympics? In a matched pairs study designed to investigate this question, twelve competitive swimmers swam 1,500 meters at maximal speed, once wearing a wetsuit and once wearing a regular swimsuit. The order of wetsuit and swimsuit trials was randomized for each of the 12 swimmers. Figure 1.6 shows the average velocity recorded for each swimmer, measured in meters per second (m/s).

swimmer.number	wet.suit.velocity	swim.suit.velocity	velocity.diff
1	1.57	1.49	0.08
2	1.47	1.37	0.10
3	1.42	1.35	0.07
4	1.35	1.27	0.08
5	1.22	1.12	0.10
6	1.75	1.64	0.11
7	1.64	1.59	0.05
8	1.57	1.52	0.05
9	1.56	1.50	0.06
10	1.53	1.45	0.08
11	1.49	1.44	0.05
12	1.51	1.41	0.10

Figure 1.6: Average velocity of swimmers

In this data, two sets of observations are uniquely paired so that an observation in one set matches an observation in the other; in this case, each swimmer has two measured velocities, one with a wetsuit and one with a swimsuit. A natural measure of the effect of the wetsuit on swim velocity is the difference between the two measured velocities (velocity.diff = wet.suit.velocity – swim.suit.velocity). Even though there are two measurements per individual, using the difference in observations as the variable of interest reduces the data to a single value per swimmer, which can then be analyzed like any other single variable.
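The paired differences can be reproduced with a short Python sketch. The velocities are copied from Figure 1.6; the variable names are illustrative, and summarizing the differences by their mean is one reasonable choice rather than the only possible analysis.

```python
import statistics

# Average velocities (m/s) for swimmers 1 through 12, from Figure 1.6.
wet_suit = [1.57, 1.47, 1.42, 1.35, 1.22, 1.75,
            1.64, 1.57, 1.56, 1.53, 1.49, 1.51]
swim_suit = [1.49, 1.37, 1.35, 1.27, 1.12, 1.64,
             1.59, 1.52, 1.50, 1.45, 1.44, 1.41]

# One difference per swimmer: wetsuit velocity minus swimsuit velocity.
# Rounding to 2 decimals removes floating-point noise in the subtraction.
velocity_diff = [round(w - s, 2) for w, s in zip(wet_suit, swim_suit)]
mean_diff = statistics.mean(velocity_diff)

print("Differences (m/s):", velocity_diff)
print(f"Mean difference: {mean_diff:.4f} m/s")
```

Every difference is positive, meaning each of the 12 swimmers was faster in the wetsuit, and the mean difference works out to about 0.0775 m/s.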

Example

A new windshield treatment claims to repel water more effectively. Ten windshields are tested by simulating rain without the new treatment. The same windshields are then treated, and the experiment is run again. What experiment design is being implemented here?

Solution

Matched pairs

Your Turn!

A new medicine is said to help improve sleep. Eight subjects are picked at random and given the medicine. The mean hours slept for each person were recorded before and after starting the medication. What experiment design is being implemented here?

Solution

Matched pairs

Figure References

Figure 1.5: Kindred Grey (2020). Block design. CC BY-SA 4.0.

Figure Descriptions

Figure 1.5: Box labeled ‘numbered patients’ that has 54 blue or orange circles numbered from one through 54. Two arrows point from this box to two boxes below it with the caption ‘create blocks’. The left box is all of the orange circles grouped together labeled ‘low-risk patients’. The right box is all of the blue circles grouped together labeled ‘high-risk patients’. An arrow points down from the left box and the right box with the caption ‘randomly split in half’. The arrows point to a ‘Control’ box and a ‘Treatment’ box. Both of these boxes have half orange circles and half blue circles.

[1] McClung, Mary, and Dave Collins, "'Because I know it will!': Placebo Effects of an Ergogenic Aid on Athletic Performance," Journal of Sport & Exercise Psychology, 29, no. 3 (2007): 382-394.

License

Significant Statistics: An Introduction to Statistics Copyright © 2025 by John Morgan Russell is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.