3.1 Introduction to Probability and Terminology

Learning Objectives

By the end of this chapter, the student should be able to:

  • Understand and use the terminology of probability
  • Determine whether two events are mutually exclusive
  • Determine whether two events are independent
  • Construct and interpret contingency tables
  • Construct and interpret Venn diagrams
  • Construct and interpret tree diagrams
  • Calculate probabilities using the addition rules
  • Calculate probabilities using the multiplication rules


This is a photo taken of the night sky. A meteor and its tail are shown entering the earth's atmosphere.
Figure 3.1: Meteor Shower. Meteor showers are rare, but the probability of them occurring can be calculated.

You have, more than likely, used probability. In fact, you probably have an intuitive sense of probability. Probability deals with the chance of an event occurring. Whenever you weigh the odds of whether or not to do your homework or to study for an exam, you are using probability. In this chapter, you will learn how to solve probability problems using a systematic approach.


Probability is a measure that is associated with how certain we are of outcomes of a particular experiment or activity. An experiment is a planned operation carried out under controlled conditions. If the result is not predetermined, then the experiment is said to be a probability experiment. Flipping one fair coin twice is an example of an experiment.

A result of an experiment is called an outcome. The sample space of an experiment is the set of all possible outcomes. Three ways to represent a sample space are: to list the possible outcomes, to create a tree diagram, or to create a Venn diagram. The uppercase letter S is used to denote the sample space. For example, if you flip one fair coin, S = {H, T} where H = heads and T = tails are the outcomes.

An event is any combination of outcomes. Upper case letters like A and B represent events. For example, if the experiment is to flip one fair coin, event A might be getting at most one head. The probability of an event A is written P(A).

The probability of any outcome is the long-term relative frequency of that outcome. Probabilities are between zero and one, inclusive (that is, zero and one and all numbers between these values). P(A) = 0 means the event A can never happen. P(A) = 1 means the event A always happens. P(A) = 0.5 means the event A is equally likely to occur or not to occur. For example, if you flip one fair coin repeatedly (from 20 to 2,000 to 20,000 times) the relative frequency of heads approaches 0.5 (the probability of heads).

A probability model is a mathematical representation of a random process that lists all possible outcomes and assigns a probability to each of them. This type of model is ultimately our goal as we move forward in our study of statistics.

The Law of Large Numbers

An important characteristic of probability experiments, known as the law of large numbers, states that as the number of repetitions of an experiment is increased, the relative frequency obtained in the experiment tends to become closer and closer to the theoretical probability. Even though the outcomes do not happen according to any set pattern or order, overall, the long-term observed relative frequency will approach the theoretical probability. (The word empirical is often used instead of the word observed.)

You toss a coin and record the result. What is the probability that the result is heads? If you flip a coin two times, does probability tell you that these flips will result in one head and one tail? You might toss a fair coin ten times and record nine heads. Probability does not describe the short-term results of an experiment; rather, it gives information about what can be expected in the long term. To demonstrate this, Karl Pearson once tossed a fair coin 24,000 times! He recorded the results of each toss, obtaining heads 12,012 times, a relative frequency of \frac{12,012}{24,000} ≈ 0.5005. In his experiment, Pearson illustrated the Law of Large Numbers.
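The long-run behavior described above can be explored with a small simulation. The sketch below is only an illustration: the function name, the fixed seed, and the particular numbers of flips are our own choices, and a pseudorandom generator stands in for a physical coin.

```python
import random


def relative_frequency_of_heads(num_flips, seed=0):
    """Simulate num_flips tosses of a fair coin; return the fraction that are heads."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips


# The relative frequency tends to settle near 0.5 as the number of flips grows.
for n in (20, 2_000, 20_000, 200_000):
    print(n, relative_frequency_of_heads(n))
```

With only 20 flips the relative frequency can be far from 0.5, while with many flips it should sit close to 0.5, which is the pattern the law of large numbers describes.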

The Classical Approach to Probability

Equally likely means that each outcome of an experiment occurs with equal probability. For example, if you toss a fair, six-sided die, each face (1, 2, 3, 4, 5, or 6) is as likely to occur as any other face. If you toss a fair coin, a Head (H) and a Tail (T) are equally likely to occur. If you randomly guess the answer to a true/false question on an exam, you are equally likely to select a correct answer or an incorrect answer.

To calculate the probability of an event A when all outcomes in the sample space are equally likely, count the number of outcomes for event A and divide by the total number of outcomes in the sample space.  This is often called the Classical Approach to probability.

Suppose you roll one fair six-sided die, with the numbers {1, 2, 3, 4, 5, 6} on its faces. Let event E = rolling a number that is at least five.
There are two outcomes in E, {5, 6}, so P(E) = \frac{2}{6}. If you were to roll the die only a few times, you would not be surprised if your observed results did not match the probability. If you were to roll the die a very large number of times, you would expect that, overall, about \frac{2}{6} of the rolls would result in an outcome of “at least five”. You would not expect exactly \frac{2}{6}. The long-term relative frequency of obtaining this result would approach the theoretical probability of \frac{2}{6} as the number of repetitions grows larger and larger.


If you toss a fair dime and a fair nickel, the sample space is {HH, TH, HT, TT} where T = tails and H = heads. The sample space has four outcomes. Let A = getting exactly one head. There are two outcomes that meet this condition, {HT, TH}, so P(A) = \frac{2}{4} = 0.5.
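The counting procedure in the two examples above can be written as a short program. This is a minimal sketch that assumes all outcomes are equally likely; the helper name classical_probability is ours and is not standard terminology.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, sample_space):
    """Count the outcomes in the event and divide by the size of the sample space.

    Only valid when every outcome in sample_space is equally likely.
    """
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))


# One fair six-sided die: E = rolling a number that is at least five.
die = [1, 2, 3, 4, 5, 6]
p_at_least_five = classical_probability(lambda face: face >= 5, die)

# A dime and a nickel: A = getting exactly one head.
coins = ["".join(pair) for pair in product("HT", repeat=2)]  # HH, HT, TH, TT
p_one_head = classical_probability(lambda flips: flips.count("H") == 1, coins)

print(p_at_least_five, p_one_head)
```

Fraction keeps the answers exact and reduces them automatically, so \frac{2}{6} is reported as 1/3 and \frac{2}{4} as 1/2.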

The Axioms of Probability

Finding probabilities in more complicated situations starts with the three Axioms of Probability.

  1. P(S) = 1
  2. 0 ≤ P(E) ≤ 1
  3. For any two events E1 and E2 with E1 ∩ E2 = ∅, P(E1 ∪ E2) = P(E1) + P(E2)

The first two axioms should be fairly intuitive. Axiom 1 says that the probabilities of all outcomes in a sample space always add up to 1. Axiom 2 says that the probability of any event must be between 0 and 1, inclusive. Axiom 3 is called the disjoint addition rule, which we will expand on in the future.
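To make the axioms concrete, the sketch below checks them for one small probability model, a single fair die. The model, the helper prob, and the two disjoint events E1 and E2 are illustrative choices of ours, not part of the axioms themselves.

```python
from fractions import Fraction
from itertools import combinations

# A probability model for one fair die: each outcome gets probability 1/6.
model = {face: Fraction(1, 6) for face in range(1, 7)}
S = set(model)


def prob(event):
    """Probability of an event (a set of outcomes) under the model above."""
    return sum((model[outcome] for outcome in event), Fraction(0))


# Every possible event, that is, every subset of S (64 in total).
all_events = [set(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

E1, E2 = {1, 2}, {5, 6}  # two events with no outcomes in common

axiom_1 = prob(S) == 1
axiom_2 = all(0 <= prob(event) <= 1 for event in all_events)
axiom_3 = not (E1 & E2) and prob(E1 | E2) == prob(E1) + prob(E2)

print(axiom_1, axiom_2, axiom_3)
```

A check like this does not prove the axioms; it only confirms that this particular model is consistent with them.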

Relationships Between Events

Often we are not just interested in a single event, but in multiple events happening at the same time. In order to find probabilities relating to multiple events, we first have to know about the relationship (or lack thereof) between them. The two main relationships we will look for are independence and mutual exclusivity. Remember, these two terms do not mean the same thing, nor are they opposites.

Consider two events, A and B. If it is not known whether they are mutually exclusive, assume they are not until you can show otherwise. Likewise, if it is not known whether A and B are independent, assume they are dependent until you can show otherwise. This “default” starting point is illustrated by position 4 in the following table:

Figure 3.2: Relationships Between Events

                               Independent?
                               Yes     No
  Mutually Exclusive?   Yes    1*      2
                        No     3       4

Depending on the information given in the problem and the assumptions you are able to make, you will move around the above grid and apply slightly different versions of each important probability rule.

*Note: You will rarely, if ever, find yourself in this case (position 1).

Mutually exclusive

A and B are mutually exclusive (or disjoint) events if they cannot occur at the same time. This means that A and B do not share any outcomes and P(A AND B) = 0.


Suppose the sample space S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7, 8}, and C = {7, 9}. A AND B = {4, 5}. P(A AND B) = \frac{2}{10} and is not equal to zero. Therefore, A and B are not mutually exclusive. A and C do not have any numbers in common so P(A AND C) = 0. Therefore, A and C are mutually exclusive.
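The example above maps directly onto set operations. The sketch below assumes the ten outcomes in S are equally likely, which is what the value \frac{2}{10} in the example implies; the helper names are ours.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {7, 9}


def prob(event):
    """P(event), assuming every outcome in S is equally likely."""
    return Fraction(len(event & S), len(S))


def mutually_exclusive(x, y):
    """Two events are mutually exclusive when they share no outcomes."""
    return len(x & y) == 0


print(A & B, prob(A & B), mutually_exclusive(A, B))
print(A & C, prob(A & C), mutually_exclusive(A, C))
```

Here `&` is Python's set intersection, playing the role of AND in the text.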

Independent events

Two events A and B are independent if the knowledge that one occurred does not affect the chance the other occurs. For example, the outcomes of two rolls of a fair die are independent events. The outcome of the first roll does not change the probability for the outcome of the second roll. If two events are NOT independent, then we say that they are dependent. To show two events are independent, you only need to show one of the equivalent conditions below:

  • P(A|B) = P(A)
  • P(B|A) = P(B)
  • P(A AND B) = P(A)P(B)
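As an illustration of the conditions listed above, the sketch below enumerates the 36 equally likely results of rolling a fair die twice. The two events (first roll even, second roll at least five) are hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
rolls = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability that a predicate on a (first, second) pair holds."""
    return Fraction(sum(1 for roll in rolls if event(roll)), len(rolls))


def event_a(roll):
    """A = the first roll is even (a hypothetical event for illustration)."""
    return roll[0] % 2 == 0


def event_b(roll):
    """B = the second roll is at least five (also hypothetical)."""
    return roll[1] >= 5


p_a = prob(event_a)
p_b = prob(event_b)
p_a_and_b = prob(lambda roll: event_a(roll) and event_b(roll))
p_a_given_b = p_a_and_b / p_b  # P(A|B) = P(A AND B) / P(B)

print(p_a_and_b == p_a * p_b)  # product condition
print(p_a_given_b == p_a)      # conditional condition
```

Because the conditions are equivalent, checking any one of them is enough; the sketch checks two only to show that they agree.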

How you sample can have implications for independence. Sampling may be done with or without replacement:

  • With replacement: If each member of a population is replaced after it is picked, then that member has the possibility of being chosen more than once. When sampling is done with replacement, then events are considered to be independent, meaning the result of the first pick will not change the probabilities for the second pick.
  • Without replacement: When sampling is done without replacement, each member of a population may be chosen only once. In this case, the probabilities for the second pick are affected by the result of the first pick. The events are considered to be dependent or not independent.


You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts, and spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, J (jack), Q (queen), K (king) of that suit. Consider the following scenarios:

  1. Suppose you pick three cards with replacement. The first card you pick out of the 52 cards is the Q of spades. You put this card back, reshuffle the cards and pick a second card from the 52-card deck. It is the ten of clubs. You put this card back, reshuffle the cards and pick a third card from the 52-card deck. This time, the card is the Q of spades again. Your picks are {Q of spades, ten of clubs, Q of spades}. You have picked the Q of spades twice. You pick each card from the 52-card deck.
  2. Suppose you pick three cards without replacement. The first card you pick out of the 52 cards is the K of hearts. You put this card aside and pick the second card from the 51 cards remaining in the deck. It is the three of diamonds. You put this card aside and pick the third card from the remaining 50 cards in the deck. The third card is the J of spades. Your picks are {K of hearts, three of diamonds, J of spades}. Because you have picked the cards without replacement, you cannot pick the same card twice.
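The effect of replacement on later picks can be seen by counting. The sketch below builds the 52-card deck described above and asks a question of our own choosing: if the first card picked is the Q of spades, what is the probability that the second card is a queen? The function name and the question are assumptions made for illustration.

```python
from fractions import Fraction

suits = ["clubs", "diamonds", "hearts", "spades"]
ranks = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for suit in suits for rank in ranks]


def p_second_is_queen(first_card, replace):
    """P(second card is a queen) after first_card has been picked."""
    remaining = list(deck)
    if not replace:
        remaining.remove(first_card)  # the first card is set aside
    queens = sum(1 for rank, _ in remaining if rank == "Q")
    return Fraction(queens, len(remaining))


first = ("Q", "spades")
with_replacement = p_second_is_queen(first, replace=True)      # 4 queens among 52 cards
without_replacement = p_second_is_queen(first, replace=False)  # 3 queens among 51 cards

print(with_replacement, without_replacement)
```

With replacement the second pick has the same probabilities as the first, so the picks are independent. Without replacement the probability changes once the first card is known, so the picks are dependent.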

Compound Events

Often we are not just interested in a single event, but multiple events happening at the same time.  There are several types of relationships and situations we may be interested in. The relationship between events can tell us a lot about the probability of compound events.  Depending on the compound event we are looking for we will apply different rules.


The complement of event A is denoted A′ (read “A prime”).  A′ consists of all outcomes in the sample space, S, that are NOT in A.

There are several useful forms of the complement rule:

  • P(A) + P(A′) = 1
  • 1 - P(A) = P(A′)
  • 1 - P(A′) = P(A)


Let S = {1, 2, 3, 4, 5, 6} and let A = {1, 2, 3, 4}. Then, A′ = {5, 6}. P(A) = \frac{4}{6}, P(A′) = \frac{2}{6}, and P(A) + P(A′) = \frac{4}{6}+\frac{2}{6} = 1
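The complement in the example above is a set difference. This short sketch assumes the six outcomes are equally likely; the variable names are ours.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3, 4}

A_prime = S - A  # every outcome in S that is NOT in A

p_a = Fraction(len(A), len(S))
p_a_prime = Fraction(len(A_prime), len(S))

print(A_prime, p_a, p_a_prime, p_a + p_a_prime)
```

In practice the rule is most useful in the form P(A′) = 1 - P(A), when one of the two probabilities is much easier to count than the other.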


The union of two events, denoted A ∪ B, is the set of outcomes that are in event A OR event B (or both).


Let A = {1, 2, 3, 4, 5} and B = {4, 5, 6, 7, 8}. A OR B = {1, 2, 3, 4, 5, 6, 7, 8}. Notice that 4 and 5 are NOT listed twice.


The intersection of two events, denoted A ∩ B, is the set of outcomes that are in both event A AND event B.


Let A and B be {1, 2, 3, 4, 5} and {4, 5, 6, 7, 8}, respectively. Then A AND B = {4, 5}.
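Union and intersection correspond to the `|` and `&` operators on Python sets. The sketch below reuses the sets A and B from the examples above; it is only meant to show the notation side by side.

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

a_or_b = A | B   # union: outcomes in A OR B (or both)
a_and_b = A & B  # intersection: outcomes in both A AND B

print(sorted(a_or_b))
print(sorted(a_and_b))
```

A set stores each element once, which matches the observation that 4 and 5 are not listed twice in A OR B.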

Conditional Probabilities

Sometimes knowing that one event has already happened changes the probability of another event occurring. A conditional probability updates our probabilities based on what we already know, which reduces the sample space. The conditional probability of A GIVEN B is written P(A|B). P(A|B) is the probability that event A will occur given that event B has already occurred. We calculate the probability of A from the reduced sample space B. The formula to calculate P(A|B) is P(A|B) = \frac{P\left(A\text{ AND }B\right)}{P\left(B\right)} where P(B) is greater than zero.


Suppose we toss one fair, six-sided die. The sample space S = {1, 2, 3, 4, 5, 6}. Let A = face is 2 or 3 and B = face is even (2, 4, 6). To calculate P(A|B), we count the number of outcomes 2 or 3 in the sample space B = {2, 4, 6}. Then we divide that by the number of outcomes in B (rather than S). Only one of those outcomes, 2, is in B, so P(A|B) = \frac{1}{3}.

We get the same result by using the formula. Remember that S has six outcomes.

P(A|B) = \frac{P\left(A\text{ AND }B\right)}{P\left(B\right)} = \frac{\frac{\left(\text{the number of outcomes that are 2 or 3 and even in }S\right)}{6}}{\frac{\left(\text{the number of outcomes that are even in }S\right)}{6}} = \frac{\frac{1}{6}}{\frac{3}{6}} = \frac{1}{3}
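Both routes to P(A|B) in the die example above can be compared directly. The sketch assumes the six faces are equally likely; the variable names are ours.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 3}     # face is 2 or 3
B = {2, 4, 6}  # face is even

# Route 1: count inside the reduced sample space B.
p_reduced = Fraction(len(A & B), len(B))

# Route 2: the formula P(A|B) = P(A AND B) / P(B), with both terms computed in S.
p_a_and_b = Fraction(len(A & B), len(S))
p_b = Fraction(len(B), len(S))
p_formula = p_a_and_b / p_b

print(p_reduced, p_formula)
```

The two routes agree because dividing both counts by the size of S does not change their ratio.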


Your turn!

Let event C = taking an English class. Let event D = taking a speech class.

Suppose P(C) = 0.75, P(D) = 0.3, P(C|D) = 0.75 and P(C AND D) = 0.225.

Justify your answers to the following questions numerically.

Are C and D independent?
Are C and D mutually exclusive?
What is P(D|C)?

Solving Probability Problems

The key to probability problems is sorting through and understanding the important terminology and symbols. First, read each problem carefully and think about what you are looking for. Look for key words (and, or, not, with or without replacement, etc.) to identify the event(s) of interest and the relationships between them. Determine whether there is a condition stated in the wording that would indicate that the probability is conditional. Visualize the events, if possible, using the tools we’ll talk about in the future. Then apply the appropriate rules to get a numerical answer. Finally, based on your understanding of the situation, make sure the probability you came up with makes intuitive sense.


Consider flipping two fair coins.

The sample space is {HH, HT, TH, TT} where T = tails and H = heads. The outcomes are HH, HT, TH, and TT.  Notice the outcomes HT and TH are different. HT means that the first coin showed heads and the second coin showed tails. TH means that the first coin showed tails and the second coin showed heads.

Find the probabilities of the events.

  1. Let A = the event of getting at most one tail.
  2. Let B = the event of getting all tails.
  3. Let C = the event of getting all heads.
  4. Let D = the event of getting more than one tail.
  5. Let E = the event of getting at least one tail in two flips.
  6. Let F = the event of getting two faces that are the same.
  7. Let G = the event of getting a head on the first flip followed by a head or tail on the second flip.
  8. Let H = the event of getting all tails.
  9. Are A and F mutually exclusive?
  10. Are G and H mutually exclusive?
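One way to check your answers to the items above is to enumerate the sample space and build each event as a set. This is a sketch under the assumption that the four outcomes are equally likely; the way each event is encoded is our reading of the wording in the list.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins: HH, HT, TH, TT.
S = {"".join(pair) for pair in product("HT", repeat=2)}

events = {
    "A": {o for o in S if o.count("T") <= 1},  # at most one tail
    "B": {o for o in S if o.count("T") == 2},  # all tails
    "C": {o for o in S if o.count("H") == 2},  # all heads
    "D": {o for o in S if o.count("T") > 1},   # more than one tail
    "E": {o for o in S if o.count("T") >= 1},  # at least one tail
    "F": {o for o in S if o[0] == o[1]},       # two faces the same
    "G": {o for o in S if o[0] == "H"},        # head first, then anything
    "H": {o for o in S if o.count("T") == 2},  # all tails
}

probabilities = {name: Fraction(len(ev), len(S)) for name, ev in events.items()}

# Events are mutually exclusive when their intersection is empty.
a_f_exclusive = not (events["A"] & events["F"])
g_h_exclusive = not (events["G"] & events["H"])

for name, p in probabilities.items():
    print(name, sorted(events[name]), p)
print(a_f_exclusive, g_h_exclusive)
```

Try the problems by hand first, then compare. Notice that B, D, and H describe the same set of outcomes even though they are worded differently.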

Your turn!

A box has two balls, one white and one red. We select one ball, put it back in the box, and select a second ball (sampling with replacement). Find the probability of the following events:

  1. Let F = the event of getting the white ball twice.
  2. Let G = the event of getting two balls of different colors.
  3. Let H = the event of getting white on the first pick.
  4. Are F and G mutually exclusive?
  5. Are G and H mutually exclusive?

Image Credits

Figure 3.1: Ed Sweeney (2009). “2009 Leonid Meteor.” CC BY 2.0. Retrieved from https://flic.kr/p/7girE8




Significant Statistics Copyright © 2020 by John Morgan Russell, OpenStaxCollege, OpenIntro is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
