Stat 200A Fall 2018

A. Adhikari

UC Berkeley

Lecture 1, Thursday 8/23

These notes contain only the terminology and main calculations in the lectures. For the associated discussions, please come to lecture.

Terminology

Random processes are often called chance experiments or just experiments.

The set of all possible outcomes of an experiment is called the outcome space and will be denoted $\Omega$.

An event is a subset of the outcome space. Typically, events are denoted by upper case letters early in the alphabet: $A$, $B$ etc.

Frequently occurring combinations of events:

  • at least one of $A$ and $B$ occurs: $A \cup B$
  • $A$ and $B$ both occur: $A \cap B$, usually written $AB$

Axioms

Probability is a function on events, with the three properties below. If $\Omega$ is uncountable, some mathematical care is needed to check that such a function exists on a sufficiently rich class of subsets of $\Omega$, but that is outside the scope of this course.

(i) $P(A) \ge 0$ for all $A$

(ii) $P(\Omega ) = 1$

(iii) Addition Rule. If $A_1, A_2, \ldots$ are mutually exclusive, that is, $A_iA_j = \emptyset$ when $i \ne j$, then

$$ P(\bigcup_{i=1}^\infty A_i) ~ = ~ \sum_{i=1}^\infty P(A_i) $$

More Terminology

Drawn at random from a finite set $\Omega$ will mean that all outcomes in $\Omega$ are equally likely, that is, each outcome has chance $1/\#\Omega$.

Drawn at random without replacement from a finite set means that each draw is made at random, and the element drawn is not returned to the set before the next draw.
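The two sampling schemes can be illustrated in a short sketch (this code is not part of the notes; the set $\{1, \ldots, 10\}$ and the sample size 5 are just illustrative choices):

```python
import random

omega = list(range(1, 11))   # a finite outcome space: {1, ..., 10}

# With replacement: each draw is made from the full set every time,
# so repeats are possible.
with_repl = [random.choice(omega) for _ in range(5)]

# Without replacement: each element drawn is not returned before
# the next draw, so the 5 draws are necessarily distinct.
without_repl = random.sample(omega, 5)

assert len(set(without_repl)) == 5
```

`random.sample` implements exactly the "without replacement" scheme: every size-5 subset (in order) is equally likely.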

The chance of event $B$, given that event $A$ has occurred, is denoted $P(B \mid A)$ and is called the conditional probability of $B$ given $A$.

Intersections and Conditioning

  • Multiplication Rule. $P(AB) = P(A)P(B \mid A)$

Definition of conditional probability: $$ P(B \mid A) ~ = ~ \frac{P(AB)}{P(A)} $$
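The multiplication rule can be checked by direct enumeration in a small hypothetical example (not from the notes): two draws at random without replacement from $\{1, 2, 3, 4, 5\}$, with $A$ = "first draw is even" and $B$ = "second draw is even".

```python
from fractions import Fraction
from itertools import permutations

# All ordered pairs of distinct elements are equally likely.
outcomes = list(permutations(range(1, 6), 2))

p_A = Fraction(sum(1 for a, b in outcomes if a % 2 == 0), len(outcomes))
p_AB = Fraction(sum(1 for a, b in outcomes if a % 2 == 0 and b % 2 == 0),
                len(outcomes))

# Definition of conditional probability: P(B | A) = P(AB) / P(A)
p_B_given_A = p_AB / p_A

# Multiplication rule: P(AB) = P(A) P(B | A)
assert p_AB == p_A * p_B_given_A
```

Here $P(A) = 2/5$, $P(B \mid A) = 1/4$ (one even element remains among the four not yet drawn), and $P(AB) = 1/10$, consistent with the rule.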

Collision Probability

Sample $n$ times at random with replacement from $1, 2, 3, \ldots N$. A collision occurs if the same element appears more than once in the sample.

The chance of a collision is 1 if $n > N$, so we will assume $n \le N$.

\begin{align*} P(\text{collision}) ~ &=~ 1 - P(\text{all different}) \\ &=~ 1 - \frac{N}{N} \cdot \frac{N-1}{N} \cdot \frac{N-2}{N} \cdots \frac{N-(n-1)}{N} \\ &=~ 1 - \prod_{i=0}^{n-1} \frac{N-i}{N} \end{align*}
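The product formula translates directly into code. As a sketch (the function name and the birthday-problem instance below are illustrative, not from the notes):

```python
from math import prod

def p_collision(n, N):
    """Chance of a collision when sampling n times at random
    with replacement from {1, ..., N}. Assumes n <= N."""
    return 1 - prod((N - i) / N for i in range(n))

# The classical birthday problem: N = 365, n = 23 gives a chance
# just over 1/2 that two people share a birthday.
print(p_collision(23, 365))   # ≈ 0.507
```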

Exponential Approximation

The course will make repeated use of the fact that $\log(1 + x) \sim x$ when $x$ is small. Here the symbol $\sim$ means that the ratio of the two sides goes to 1 in the limit as $x$ goes to 0.

For large $N$,

\begin{align*} \log \Big( \prod_{i=0}^{n-1} \frac{N-i}{N} \Big) ~ &=~ \sum_{i=0}^{n-1} \log \Big( \frac{N-i}{N} \Big) \\ &= ~ \sum_{i=0}^{n-1} \log \Big( 1 - \frac{i}{N} \Big) \\ &\sim ~ \sum_{i=0}^{n-1} \Big( -\frac{i}{N} \Big) \\ &= ~ -\frac{1}{N} \sum_{i=0}^{n-1} i \\ &= ~ -\frac{1}{N} \cdot \frac{(n-1)n}{2} \end{align*}

Therefore, $$ P(\text{collision}) ~ \sim ~ 1 - e^{-n(n-1)/(2N)} $$
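A quick numerical comparison (illustrative, not from the notes) shows how close the exponential approximation is to the exact product, again using $N = 365$ and $n = 23$:

```python
from math import exp, prod

N, n = 365, 23

# Exact collision chance: 1 minus the product of (N - i)/N
exact = 1 - prod((N - i) / N for i in range(n))

# Exponential approximation: 1 - e^{-n(n-1)/(2N)}
approx = 1 - exp(-n * (n - 1) / (2 * N))

print(exact, approx)   # ≈ 0.507 vs ≈ 0.500
```

Even at these moderate values of $N$ and $n$, the two agree to within about 0.01; the agreement improves as $N$ grows with $n^2/N$ held moderate.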