Lecture Note: §2 Conditional Probability
Last Update: 2025-10-18
§2.1 The Definition of Conditional Probability
A major use of probability in statistical inference is the updating of probabilities when certain events are observed. The updated probability of event \(A\) after we learn that event \(B\) has occurred is the conditional probability of \(A\) given \(B\).
- Conditional Probability \(\Pr (A \mid B)\)
The probability of \(A\) given \(B\)
\(\Pr(A \mid B) = \dfrac{\Pr (A \cap B)}{\Pr(B)}\), provided \(\Pr(B) > 0\).
- Multiplication Rule
\(\Pr(A \cap B)=\Pr (B \mid A) \Pr (A) = \Pr (A \mid B) \Pr (B)\)
\(\Pr\left(A_1 \cap A_2 \cap \cdots \cap A_n\right) =\Pr\left(A_1\right) \Pr\left(A_2 \mid A_1\right) \Pr \left(A_3 \mid A_1 \cap A_2\right) \cdots \Pr \left(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1}\right)\).
- Law of Total Probability
Partition: events \(B_1, \ldots, B_k\) form a partition of the sample space if they are disjoint and \(\bigcup_{i=1}^k B_i = S\).
\(\Pr(A)=\sum_{j=1}^k \Pr(B_j) \Pr (A\mid B_j)\)
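The three formulas above can be checked numerically. The sketch below (not part of the original notes) uses a fair six-sided die as an illustrative sample space and verifies the definition, the multiplication rule, and the law of total probability with exact `Fraction` arithmetic:

```python
from fractions import Fraction

# Illustrative assumption: one roll of a fair six-sided die.
S = set(range(1, 7))

def pr(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event & S), len(S))

def pr_given(A, B):
    """Conditional probability Pr(A | B) = Pr(A ∩ B) / Pr(B), requires Pr(B) > 0."""
    return pr(A & B) / pr(B)

A = {2, 4, 6}   # "roll is even"
B = {4, 5, 6}   # "roll is greater than 3"

# Definition: Pr(A | B) = Pr(A ∩ B) / Pr(B) = (1/3) / (1/2) = 2/3
assert pr_given(A, B) == Fraction(2, 3)

# Multiplication rule: Pr(A ∩ B) = Pr(A | B) Pr(B) = Pr(B | A) Pr(A)
assert pr(A & B) == pr_given(A, B) * pr(B) == pr_given(B, A) * pr(A)

# Law of total probability over the partition {1,2}, {3,4}, {5,6}:
partition = [{1, 2}, {3, 4}, {5, 6}]
assert pr(A) == sum(pr(Bj) * pr_given(A, Bj) for Bj in partition)
```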
§2.2 Bayes’ Theorem
Bayes’s theorem is to the theory of probability what Pythagoras’s theorem is to geometry. — Sir Harold Jeffreys, 1973
Suppose that we are interested in which of several disjoint events \(B_1,\ldots, B_k\) will occur and that we will get to observe some other event \(A\). If the probabilities \(\Pr(B_i)\) and \(\Pr(A \mid B_i)\) are available, then Bayes' theorem is a useful formula for computing the conditional probabilities \(\Pr(B_i \mid A)\).
- Bayes’ Theorem
\(\Pr (B_i \mid A)=\dfrac{\Pr (B_i) \Pr (A \mid B_i)}{\sum_{j=1}^k \Pr (B_j) \Pr (A \mid B_j)}\)
Conditional Version: \(\Pr (B_i \mid A\cap C)=\dfrac{\Pr(B_i \mid C) \Pr (A \mid B_i\cap C)}{\sum_{j=1}^k \Pr(B_j\mid C) \Pr(A \mid B_j\cap C)}\)
- Remarks:
Law of Total Probability: given the “reasons” \(B_1, \ldots, B_k\), find out the “result” \(A\).
Bayes’ Theorem: given the “result” \(A\), figure out the reason among \(B_1, \ldots, B_k\).
When you have eliminated the impossible, whatever remains, however improbable, must be the truth. — A. Conan Doyle
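Reversing from "result" to "reason" can be sketched in code. The numbers below are hypothetical diagnostic-test figures chosen only for illustration; the denominator is exactly the law of total probability from §2.1:

```python
from fractions import Fraction

# Hypothetical numbers (illustrative, not from the notes):
# B1 = "has the condition", B2 = "does not"; A = "test comes back positive".
prior = {"B1": Fraction(1, 100), "B2": Fraction(99, 100)}       # Pr(B_i)
likelihood = {"B1": Fraction(95, 100), "B2": Fraction(5, 100)}  # Pr(A | B_i)

def posterior(i):
    """Pr(B_i | A) via Bayes' theorem; the denominator is Pr(A) by total probability."""
    pr_A = sum(prior[j] * likelihood[j] for j in prior)
    return prior[i] * likelihood[i] / pr_A

# Given the "result" A, the "reason" B1 is still unlikely despite the
# accurate test, because its prior probability is small.
assert posterior("B1") == Fraction(19, 118)

# The posteriors form a probability distribution over the partition:
assert sum(posterior(i) for i in prior) == 1
```

Note how a likelihood of 0.95 still yields a posterior of only 19/118 ≈ 0.16: the small prior \(\Pr(B_1)\) dominates, which is why the prior weights \(\Pr(B_j)\) in the numerator and denominator matter.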
§2.3 Independent Events
If learning that \(B\) has occurred does not change the probability of \(A\), then we say that \(A\) and \(B\) are independent. There are many cases in which events \(A\) and \(B\) are not independent, but they would be independent if we learned that some other event \(C\) had occurred. In this case, \(A\) and \(B\) are conditionally independent given \(C\).
- Independent Events
Two events \(A\) and \(B\) are independent if \(\Pr(A\cap B) = \Pr(A)\Pr(B)\).
Equivalently (when the conditioning event has positive probability), \(\Pr(A \mid B) = \Pr(A)\) and \(\Pr(B \mid A) = \Pr(B)\).
- Independence of Complements: \(A\) and \(B^{c}\) are also independent.
- \(\Pr(A \mid B^{c}) = \Pr(A)\), \(\Pr (B \mid A^{c}) = \Pr(B)\)
- (Mutually) Independent Events: \(\Pr (A_{i_1} \cap \cdots \cap A_{i_j}) = \Pr(A_{i_1}) \cdots \Pr(A_{i_j})\) for every subcollection \(A_{i_1}, \ldots, A_{i_j}\) with \(j \ge 2\).
Mutual independence implies pairwise independence, but the converse does not hold.
\(\Pr\left(A_{i_1} \cap \cdots \cap A_{i_{m}} \mid A_{j_1} \cap \cdots \cap A_{j_{\ell}}\right)=\Pr \left(A_{i_1} \cap \cdots \cap A_{i_m}\right)\) whenever the index sets \(\{i_1, \ldots, i_m\}\) and \(\{j_1, \ldots, j_{\ell}\}\) are disjoint.
Mutually Exclusive \(\neq\) Mutually Independent.
- Conditionally Independent Events
\(\Pr \left(A_{i_1} \cap \cdots \cap A_{i_j} \mid B\right)=\Pr\left(A_{i_1} \mid B\right) \cdots \Pr \left(A_{i_j} \mid B\right)\)
\(A_1\) and \(A_2\) are conditionally independent given \(B\) if and only if \(\Pr(A_2\mid A_1\cap B) = \Pr(A_2\mid B)\), assuming \(\Pr(A_1 \cap B) > 0\).
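The independence definitions above can also be verified numerically. This sketch (an illustrative assumption, not from the notes) uses two independent rolls of a fair die and checks the product rule, independence of complements, and the warning that mutually exclusive events are not independent:

```python
from fractions import Fraction

# Illustrative assumption: two rolls of a fair die, 36 equally likely outcomes.
S = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

A = lambda s: s[0] % 2 == 0    # first roll is even
B = lambda s: s[1] > 3         # second roll is greater than 3

# Independence: Pr(A ∩ B) = Pr(A) Pr(B)
assert pr(lambda s: A(s) and B(s)) == pr(A) * pr(B)

# Independence of complements: A and B^c are independent as well.
Bc = lambda s: not B(s)
assert pr(lambda s: A(s) and Bc(s)) == pr(A) * pr(Bc)

# Mutually exclusive ≠ independent: D is disjoint from A, and
# Pr(A ∩ D) = 0 while Pr(A) Pr(D) > 0.
D = lambda s: s[0] % 2 == 1    # first roll is odd
assert pr(lambda s: A(s) and D(s)) == 0 != pr(A) * pr(D)
```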
