Lecture 12
University of Arizona
INFO 511 - Fall 2024
The probability an event will occur given that another event has already occurred is a conditional probability. The conditional probability of event \(A\) given event \(B\) is:
\[P(A | B) = \frac{P(A \text{ and } B)}{P(B)}\]
Examples come up all the time in the real world:
| | Did not die | Died |
|---|---|---|
| Does not drink coffee | 5438 | 1039 |
| Drinks coffee occasionally | 29712 | 4440 |
| Drinks coffee regularly | 24934 | 3601 |
Define events \(A\) = died and \(B\) = non-coffee drinker. Calculate the following for a randomly selected person in the cohort:
Marginal probability: \(P(A)\), \(P(B)\)
Joint probability: \(P(A \text{ and } B)\)
Conditional probability: \(P(A | B)\), \(P(B | A)\)
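One way to carry out these calculations is sketched below. It uses only the counts in the coffee table; the variable names are illustrative, and the probabilities are sample proportions for this cohort.

```python
# Counts from the coffee table: (did not die, died) for each row.
counts = {
    "Does not drink coffee": (5438, 1039),
    "Drinks coffee occasionally": (29712, 4440),
    "Drinks coffee regularly": (24934, 3601),
}

total = sum(alive + died for alive, died in counts.values())
died_total = sum(died for _, died in counts.values())
non_drinker_total = sum(counts["Does not drink coffee"])
died_and_non_drinker = counts["Does not drink coffee"][1]

# Marginal probabilities: one event at a time.
p_a = died_total / total              # P(A), died
p_b = non_drinker_total / total       # P(B), non-coffee drinker

# Joint probability: both events together.
p_a_and_b = died_and_non_drinker / total

# Conditional probabilities: joint divided by the conditioning marginal.
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

print(f"P(A) = {p_a:.4f}, P(B) = {p_b:.4f}")
print(f"P(A and B) = {p_a_and_b:.4f}")
print(f"P(A | B) = {p_a_given_b:.4f}, P(B | A) = {p_b_given_a:.4f}")
```

Note that \(P(A | B)\) and \(P(B | A)\) share a numerator but divide by different marginals, which is why they differ.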
We can write the definition of conditional probability:
\[P(A | B) = \frac{P(A \text{ and } B)}{P(B)}\]
Multiplying both sides of the equation above by \(P(B)\), we get the multiplicative rule:
\[P(B) \times P(A | B) = P(A \text{ and } B)\]
What does the multiplicative rule mean in plain English?
Events \(A\) and \(B\) are said to be independent when
\[P(A | B) = P(A) \hspace{10mm} \textbf{OR} \hspace{10mm} P(B | A) = P(B)\]
In other words, knowing that one event has occurred doesn’t cause us to “adjust” the probability we assign to another event.
We can use the multiplicative rule to see if two events are independent.
If events \(A\) and \(B\) are independent, then
\[P(A \text{ and } B) = P(A) \times P(B)\]
Since for two independent events \(P(A|B) = P(A)\) and \(P(B|A) = P(B)\), knowing that one event has occurred tells us nothing more about the probability of the other occurring.
For two disjoint events \(A\) and \(B\), knowing that one has occurred tells us that the other definitely has not occurred: \(P(A \text{ and } B) = 0\).
Disjoint events are not independent!
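A short check of why, assuming both events have nonzero probability (an assumption the statement above leaves implicit):

\[P(A \text{ and } B) = 0 \neq P(A) \times P(B) \hspace{10mm} \text{whenever } P(A) > 0 \text{ and } P(B) > 0\]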
| | Did not die | Died |
|---|---|---|
| Does not drink coffee | 5438 | 1039 |
| Drinks coffee occasionally | 29712 | 4440 |
| Drinks coffee regularly | 24934 | 3601 |
Are dying and abstaining from coffee independent events? How might we check?
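One possible check, sketched with the table counts: compare \(P(A \text{ and } B)\) with \(P(A) \times P(B)\), or equivalently \(P(A | B)\) with \(P(A)\). The variable names are illustrative. Sample proportions almost never match exactly, so this shows the size of the gap rather than settling whether it is more than sampling noise.

```python
# Counts from the coffee table: (did not die, died) for each row.
counts = {
    "Does not drink coffee": (5438, 1039),
    "Drinks coffee occasionally": (29712, 4440),
    "Drinks coffee regularly": (24934, 3601),
}

total = sum(alive + died for alive, died in counts.values())
non_drinker_alive, non_drinker_died = counts["Does not drink coffee"]

p_died = sum(died for _, died in counts.values()) / total
p_non_drinker = (non_drinker_alive + non_drinker_died) / total
p_joint = non_drinker_died / total

# Under independence the joint probability would equal this product.
product = p_died * p_non_drinker

# Equivalent check: does conditioning on abstaining change P(died)?
p_died_given_non_drinker = p_joint / p_non_drinker

print(f"P(A and B)  = {p_joint:.4f}")
print(f"P(A) x P(B) = {product:.4f}")
print(f"P(A | B) = {p_died_given_non_drinker:.4f} vs P(A) = {p_died:.4f}")
```

In this cohort the joint probability is larger than the product, so the two events do not look independent in the sample.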
In an introductory statistics course, 50% of students were first years, 30% were sophomores, and 20% were upperclassmen.
80% of the first years didn’t get enough sleep, 40% of the sophomores didn’t get enough sleep, and 10% of the upperclassmen didn’t get enough sleep.
What is the probability that a randomly selected student in this class didn’t get enough sleep?
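A minimal sketch of one way to answer this: weight each class year's sleep-deprivation rate by that year's share of the class, then add the pieces. The dictionary names are illustrative; the percentages are the ones given above.

```python
# Share of the class in each year: P(year).
class_share = {"first year": 0.50, "sophomore": 0.30, "upperclassman": 0.20}

# P(didn't get enough sleep | year).
p_no_sleep_given_year = {
    "first year": 0.80,
    "sophomore": 0.40,
    "upperclassman": 0.10,
}

# Multiplicative rule per year, then sum over the disjoint years.
p_no_sleep = sum(
    class_share[year] * p_no_sleep_given_year[year] for year in class_share
)

print(f"P(didn't get enough sleep) = {p_no_sleep:.2f}")
```

The class years are disjoint and cover every student, which is what allows the three joint probabilities to be added.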
As we saw before, the two conditional probabilities \(P(A | B)\) and \(P(B | A)\) are not the same. But are they related in some way?
Yes, they are! They are related through Bayes’ rule:
\[\begin{align}P(A | B) &= \frac{P(A \text{ and } B)}{P(B)}\\[10pt] &= \frac{P(B | A)P(A)}{P(B)} \end{align}\]
Putting together a few rules of probability…
\[\begin{align}P(A | B) &= \frac{P(A \text{ and } B)}{P(B)}\\[10pt] &= \frac{P(B | A)P(A)}{P(B)}\\[15pt] &= \frac{P(B | A)P(A)}{P(B | A)P(A) + P(B | A^c)P(A^c)}\end{align}\]
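As a sanity check, the sketch below applies the last form of Bayes’ rule to the coffee table from earlier in these slides (\(A\) = died, \(B\) = non-coffee drinker) and compares it with the direct calculation of \(P(A | B)\). The variable names are illustrative.

```python
# Counts taken from the coffee table.
died_non_drinker = 1039
alive_non_drinker = 5438
died_total = 1039 + 4440 + 3601
alive_total = 5438 + 29712 + 24934
total = died_total + alive_total

# Direct route: restrict attention to non-coffee drinkers.
direct = died_non_drinker / (died_non_drinker + alive_non_drinker)

# Bayes' rule route: build P(A | B) from the "reverse" conditionals.
p_a = died_total / total                          # P(A)
p_not_a = 1 - p_a                                 # P(A^c)
p_b_given_a = died_non_drinker / died_total       # P(B | A)
p_b_given_not_a = alive_non_drinker / alive_total # P(B | A^c)

bayes = (p_b_given_a * p_a) / (
    p_b_given_a * p_a + p_b_given_not_a * p_not_a
)

print(f"direct: {direct:.4f}, via Bayes' rule: {bayes:.4f}")
```

The denominator is \(P(B)\) split into two disjoint pieces: non-coffee drinkers who died and non-coffee drinkers who did not.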
Let’s look at an example to see how this works.
Suppose we’re interested in the performance of a diagnostic test. Let \(D\) be the event that a patient has the disease, and let \(T\) be the event that the test is positive for that disease.
What do these probabilities mean in plain English?
For the Abbott BinaxNOW COVID-19 rapid antigen test,
Sensitivity, \(P(T | D)\), is 64.2% in symptomatic individuals
Specificity, \(P(T^c | D^c)\), is 99.8%
Prevalence, \(P(D)\), is 8.7%, based on 2021 CDC statistics for persons aged ≥10 years in Pima County, Arizona
Suppose a randomly selected American aged 13+ has a positive test result. What is the probability they have COVID-19?
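A hedged sketch of the calculation using Bayes’ rule with the expanded denominator. The function name is hypothetical. It assumes the 8.7% prevalence and the 64.2% sensitivity (reported for symptomatic individuals) both apply to the person being tested, which may not hold for a randomly selected person.

```python
def posterior_given_positive(sensitivity, specificity, prevalence):
    """Return P(D | T) from P(T | D), P(T^c | D^c), and P(D)."""
    false_positive_rate = 1 - specificity  # P(T | D^c)
    # P(T): true positives plus false positives.
    p_positive = (
        sensitivity * prevalence
        + false_positive_rate * (1 - prevalence)
    )
    return sensitivity * prevalence / p_positive


# Values quoted above (assumed to apply to the same population).
p_d_given_t = posterior_given_positive(
    sensitivity=0.642, specificity=0.998, prevalence=0.087
)

print(f"P(D | T) = {p_d_given_t:.3f}")
```

Trying a much lower prevalence in the same function shows how strongly \(P(D | T)\) depends on \(P(D)\), even with the test’s sensitivity and specificity held fixed.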
What does all of this mean? Let’s take a look!
ae-08
Given:
Work through ae-08, then move on to the discussion questions.
Think about the following questions: