STAT 134, Spring 2017
A. Adhikari
FAQ PAGE
Frequently Asked Questions about important general issues
1. What are they looking for when the
question asks for the distribution of something?
Answer: "They" =
me, and "something" = a random variable. I am asking you to
specify, in any reasonable way, all the possible values of the variable
as well as a rule for how the total probability of 100% is distributed
over those
values. Important steps, to be taken always:
Figure
out the list of possible values
before you calculate any probabilities.
This has a wonderful way
of focusing the mind when you do start finding chances. It stops you
making gross errors like saying continuous variables are discrete, or
exponential variables are normal.
After you've found the probabilities, check that they sum (or
integrate) to 1.
Ways to specify a distribution:
a) If you recognize
the random variable as having one of the well-known distributions then
you can specify it simply by its name and
parameters. E.g. "uniform on the integers 1 through 100", or
"gamma with shape parameter r=5 and rate lambda=1.9", etc. But your answer is
incomplete if you just give the name of the distribution without values
for the parameters. Be aware that in order to do this you have to
thoroughly understand how all the different well-known distributions
arise and interact with each other.
b) If the variable has finitely many values,
make a distribution table
showing all the possible values and their probabilities. Equivalently
you can specify the probabilities in a formula which
includes a clear statement about when (i.e. for which possible values)
the formula is valid.
c) The c.d.f. of any random variable specifies its distribution
completely. Equivalently, the survival function specifies the
distribution completely.
d) Distributions of continuous random variables can be specified in a
variety of ways:
* if it's a well-known distribution: its name and parameters
* the possible values and the density
* the possible values and the c.d.f.
* the possible values and the survival function
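To make the checklist concrete, here is a minimal Python sketch with a made-up example (not from the course): X = the number of heads in three tosses of a fair coin. It follows the steps above: possible values first, then the probabilities, then the check that they sum to 1, and finally the shortcut of recognizing a well-known distribution.

```python
from fractions import Fraction
from itertools import product
from math import comb

# X = number of heads in three tosses of a fair coin.
# Step 1: list the possible values BEFORE computing any probabilities.
possible_values = [0, 1, 2, 3]

# Step 2: distribute the total probability of 1 over those values,
# here by enumerating the 8 equally likely toss sequences.
dist = {v: Fraction(0) for v in possible_values}
for tosses in product("HT", repeat=3):
    dist[tosses.count("H")] += Fraction(1, 8)

# Step 3: check that the probabilities sum to 1.
assert sum(dist.values()) == 1

# Way (a): recognize this as binomial with n = 3, p = 1/2,
# so P(X = k) = C(3, k) / 8 -- name plus parameters is a complete answer.
for k in possible_values:
    assert dist[k] == Fraction(comb(3, k), 8)
```

The point of the enumeration is only to verify the shortcut; on an exam, "binomial (3, 1/2)" with the parameters stated is the complete answer.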
2. What's the difference between
distribution, c.d.f., and density?
Answer: The distribution of a random variable is, always, the
set of all possible values of the random variable and a
clear specification of how the probabilities are distributed over those
values (see Question 1). The c.d.f. is one way of specifying the
distribution (see Part c of the answer to Question 1). If the random
variable is continuous, the density is another way of specifying the
distribution (see Part d of the answer to Question 1).
3. What are
the different ways of finding a density?
Answer: Here are some of the
most common ways of finding the density of a continuous random variable
X.
In many problems, one of these ways will turn out to be much more
convenient than the others. Which one is most convenient will depend on
the problem. No matter what method you use, identify
the possible values first!
a) See if you can recognize it as something well known. To do
this you have to thoroughly understand how all the different well-known
densities arise and interact with each other. If you recognize it, give its name and say what
its parameters are in the context of the problem.
b) If you can write X as a
nice function of a random variable whose density you know, use the
change of variable formula for densities.
c) Find P(X in dx) and write it as (a function of x) times dx. That function is the density.
d) Find the c.d.f of X and differentiate.
e) If you know the joint density of X and Y, you can find the density of X by integrating the joint density over all the
possible values of Y.
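Here is a small Python sketch with a made-up example showing methods (b) and (d) agreeing: X = U^2 where U is uniform on (0, 1). The possible values of X are (0, 1); the c.d.f. is F(x) = P(U^2 <= x) = sqrt(x), and differentiating gives the density.

```python
import math

# X = U^2 where U is uniform on (0, 1); possible values of X: (0, 1).
# Method (d): F(x) = P(U^2 <= x) = P(U <= sqrt(x)) = sqrt(x),
# so differentiating gives f(x) = 1 / (2 sqrt(x)) for 0 < x < 1.
def density(x):
    return 1.0 / (2.0 * math.sqrt(x))

# Method (b), change of variable with g(u) = u^2, g'(u) = 2u, u = sqrt(x):
# f(x) = f_U(sqrt(x)) / |g'(sqrt(x))| = 1 / (2 sqrt(x)) -- the same answer.

# Sanity check: the density integrates to 1 over the possible values
# (midpoint rule on a fine grid; the tolerance is loose because the
# density blows up near 0).
n = 100_000
total = sum(density((i + 0.5) / n) / n for i in range(n))
assert abs(total - 1.0) < 5e-3
```

Note how identifying the possible values first pays off: the density formula is only valid on (0, 1), and the integral check uses exactly that interval.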
4. How do you find a joint
distribution?
Answer: If the two variables are discrete you'll be finding a
joint probability function. In the continuous case you'll be finding a
joint density function. The main techniques:
a) Are the two
variables independent? Then you can simply state the marginals and say
the variables are independent. If you are required to find the joint
distribution or the joint density, then multiply the marginals.
b) Joint distributions of discrete random
variables can be specified in a table or by a formula for the joint
probability function P(X=x, Y=y)
for all possible pairs (x,y).
You may be able to find the joint probability function directly
from first principles or by conditioning and using the multiplication
rule.
c) There are two cases which you should simply
recognize:
Is the joint distribution uniform over a region? In that case the joint
density or joint probability function is constant over the space of possible values.
Is the joint distribution bivariate normal? In that case you can simply
fill in the blanks in, "bivariate normal, E(X) = , E(Y) = , Var(X) =
, Var(Y) = , Corr(X,Y) = ,". Yes, you can specify SD instead of variance if you prefer; just say clearly which one you're listing.
d) If none of the above has worked, you can find a joint density from
first principles. Specify the possible pairs (x,y). Find P(X in dx, Y in dy). The joint
density is everything in that answer except the dxdy.
e) If even (d) hasn't worked, then find the density of one of the
variables, say X, and
multiply by the conditional density of Y given X.
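As an illustration of method (b), here is a Python sketch with a made-up example: roll a fair die twice, X = the smaller number, Y = the larger. The joint probability function comes from first principles by enumerating the 36 equally likely outcomes, and the marginal of X falls out by summing over y.

```python
from fractions import Fraction
from itertools import product

# Two rolls of a fair die; X = smaller number, Y = larger number.
# Build the joint probability function P(X = x, Y = y) from first principles.
joint = {}
for a, b in product(range(1, 7), repeat=2):
    x, y = min(a, b), max(a, b)
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + Fraction(1, 36)

# The probabilities over all possible pairs (x, y) sum to 1.
assert sum(joint.values()) == 1

# P(X = x, Y = y) is 2/36 when x < y (two orderings) and 1/36 when x = y.
assert joint[(2, 5)] == Fraction(2, 36)
assert joint[(3, 3)] == Fraction(1, 36)

# The marginal of X comes from summing the joint over the values of y.
marginal_x = {x: sum(p for (a, b), p in joint.items() if a == x)
              for x in range(1, 7)}
assert marginal_x[1] == Fraction(11, 36)
```

Notice that X and Y are not independent here, so method (a) would not apply; the enumeration is the first-principles route of (b).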
5. What are the different ways of
finding expectations?
Answer: Treat the definition of the expectation of X (multiply
the possible values by their probabilities and then add; or
multiply the possible values by the density and then integrate) as you
treat the general definition of a derivative. Don't use it unless the
random variable is really simple (e.g. an indicator or something with a
tiny distribution table) or you have no other choice. Rather,
start as follows:
a) Do you recognize the distribution of X? If it's one of the famous
distributions, you can read E(X)
off the properties of the distribution.
b) Can you write X as a sum?
This one is crucial, because if you can write X as a sum (or difference, or other linear
combination) of simpler variables whose expectations you know, then you
can use additivity to find E(X).
The method of indicators is an important special case; see Question 6
below.
c) Is X a nice function of
some variable Y which has a
known distribution? In that case you can find E(X) by using what you know about Y. If your function g is linear, use linearity of
expectation. If it's quadratic, use the
appropriate combination of E(Y^2)
and E(Y). For other functions
g, find E(X) = E(g(Y)) by the function rule for expectations: multiply the
values g(y) by the density of
Y and then integrate with
respect to y. Or, in the
discrete case, multiply g(y)
by P(Y=y) and then add over
all y.
d) Can you condition on something useful? That is, does X depend on a variable Y in such a way that E(X|Y) is a nice function of Y? If so, you can calculate E(X) as E(E(X|Y)). Usually you can identify
Y like this: if you find
yourself saying, "If I knew the value of this other random variable
then I'd know the expectation of X,"
then "this other random variable" may be a good candidate for Y.
e) Can you use the tail sum formula? This
method is not nearly as common as the others, but it is useful for
positive integer-valued variables whose tail probabilities have a
simple form. Typical examples are minima, maxima, and waiting times.
Analogously, for positive continuous variables you can integrate the
survival function to find the expectation.
f) If all else fails, or if the distribution of X is really very simple, then plug
in to the definition of expectation.
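To see method (d) at work, here is a Python sketch with a made-up two-stage example: roll a fair die to get N, then toss a fair coin N times and let X be the number of heads. Conditioning gives the answer in one line; the brute-force check below plugs into the definition.

```python
from fractions import Fraction
from math import comb

# Roll a fair die to get N, then toss a fair coin N times; X = number of heads.
# Method (d): E(X | N) = N/2 is a nice function of N, so
# E(X) = E(E(X | N)) = E(N)/2 = (7/2)/2 = 7/4.
e_by_conditioning = Fraction(7, 4)

# Check against the definition: sum of value times probability over the
# full distribution of X, built by conditioning on N.
e_direct = Fraction(0)
for n in range(1, 7):                       # P(N = n) = 1/6
    for k in range(n + 1):                  # X | N = n is binomial (n, 1/2)
        p = Fraction(1, 6) * Fraction(comb(n, k), 2 ** n)
        e_direct += k * p

assert e_direct == e_by_conditioning
```

"If I knew N, I'd know the expectation of X" is exactly the cue described in (d), and it turns a double sum into a one-line calculation.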
6. How do I know when to use indicators?
Answer: The method is entitled "Method of indicators for expected
counts". That's what it's for: to find the expected number of
events that occur. It is also handy for finding variances of counts.
See e.g. the derivation of the variance of the binomial. Sometimes these variances involve
covariances as well. See e.g. the derivation of the variance in
the matching problem, done in lectures on Sec 6.4; and see Q9 below.
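Here is the method of indicators in a Python sketch of the matching problem with n = 4 (a small case so the brute-force check is feasible): X = the number of fixed points of a random permutation, written as a sum of indicators.

```python
from fractions import Fraction
from itertools import permutations

# Matching problem, n = 4: a uniformly random permutation,
# X = number of fixed points (matches).
# X = I_1 + ... + I_n where I_j indicates a match at place j, and
# E(I_j) = P(match at j) = 1/n, so E(X) = n * (1/n) = 1 by additivity.
n = 4
e_by_indicators = n * Fraction(1, n)        # = 1, for every n

# Check by brute force over all n! equally likely permutations.
perms = list(permutations(range(n)))
e_direct = Fraction(sum(sum(p[j] == j for j in range(n)) for p in perms),
                    len(perms))
assert e_direct == e_by_indicators == 1
```

The indicators here are dependent, but additivity of expectation does not care: that is exactly what makes the method powerful for expected counts.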
7.
What are the different ways of finding SDs?
a) As always, start by seeing if the random variable has a well-known
distribution. Then you can read the SD off known facts for that
distribution.
b) See if the random variable is a linear function of another variable
whose SD you know, and use the linear change of variable rule for SDs.
c) If neither (a) nor (b) applies, then you almost invariably have to find
the variance somehow and then take its square root. Which leads to the
next question.
8. So then what are the different ways
of finding variances?
Answer: As far as this course is concerned, the
answers are essentially the same as those in parts (a), (b), (d), and
(f) of
the question on expectations, Question 5 above.
a) Do you recognize the distribution? If so, you can read off the
variance. (Of course if you can recognize the distribution then you'll
already have read off its SD directly, but I'm keeping the method here
for completeness.)
b) Can you write the variable as the sum of simpler variables? If so,
the variance can be found as "the sum of all the variances as well as
all the covariances" (see Question 9 below). An important special case
is writing a count as a
sum of indicators; see Question 6 above. Finally, if the variables in
your sum are independent, then all the covariance terms are 0 and so
the variance of the sum is just the sum of the variances.
d) Can you condition on something useful? In that case, use Var(X) = E(Var(X|Y)) + Var(E(X|Y)). Or first find E(X) by conditioning, then find E(X^2) by conditioning, using the
same method as for E(X). Now
use the computational formula Var(X) = E(X^2) - (E(X))^2.
f) If (a), (b) and (d) fail, you have to use
the computational formula and the distribution of X. Pray that the distribution is
simple enough so that you can carry out the calculations.
The most common method is writing the variable as a sum of
simpler variables. These may be dependent, in which case the variance
of their sum will also involve their covariances.
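Here is method (b) with dependent indicators, in a Python sketch of the matching problem with n = 4: the variance of the number of matches is the sum of all the variances and all the covariances, and the computational formula provides the brute-force check.

```python
from fractions import Fraction
from itertools import permutations

# Variance in the matching problem (n = 4) by method (b):
# X = I_1 + ... + I_n, so
# Var(X) = sum of the n Var(I_j) + sum of the n(n-1) Cov(I_j, I_k).
n = 4
var_I = Fraction(1, n) - Fraction(1, n) ** 2                 # p(1 - p) for p = 1/n
cov_IJ = Fraction(1, n * (n - 1)) - Fraction(1, n) ** 2      # P(both match) - p^2
var_by_sum = n * var_I + n * (n - 1) * cov_IJ
assert var_by_sum == 1

# Check with the computational formula Var(X) = E(X^2) - (E(X))^2,
# by brute force over all n! equally likely permutations.
perms = list(permutations(range(n)))
counts = [sum(p[j] == j for j in range(n)) for p in perms]
e_x = Fraction(sum(counts), len(perms))
e_x2 = Fraction(sum(c * c for c in counts), len(perms))
assert e_x2 - e_x ** 2 == var_by_sum
```

The indicators are dependent, so the covariance terms are not 0; dropping them would give the wrong answer, which is why the "all the variances as well as all the covariances" phrasing matters.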
9. So then what are the different ways
of finding covariances?
a) Check for independence. If the two variables are independent
then the covariance is 0.
b) See if you can
write at least one of X or Y
as a linear combination of simpler variables, then use bilinearity of
covariance. The most common linear combinations are sums and
differences.
c) Check to see if you have Corr(X,Y), SD(X), and SD(Y). Then you can use Cov(X,Y) = Corr(X,Y)SD(X)SD(Y).
d) Check to see if you already know Var(X), Var(Y), and Var(X+Y).
Then you can find the covariance by solving
Var(X+Y) = Var(X) + Var(Y) +
2Cov(X,Y).
e) If all else fails use the computational formula for covariance: Cov(X, Y) = E(XY) - E(X)E(Y). A
crucial special case is the covariance between two indicators: Cov(I(A), I(B)) = P(AB) - P(A)P(B).
A general warning: If you find
yourself trying to compute E(XY)
and X or Y is not an indicator or something very simple with just a few possible values, Prof.
Adhikari is willing to bet that you've failed to notice that you can
use one of (a)-(d) above.
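Here is the special case in (e) in a Python sketch with a made-up example: two draws at random without replacement from a box containing tickets 1, 1, 0, 0, with I_A indicating a 1 on the first draw and I_B a 1 on the second.

```python
from fractions import Fraction
from itertools import permutations

# Two draws without replacement from a box containing tickets 1, 1, 0, 0.
# A = "first draw is 1", B = "second draw is 1".
# Cov(I_A, I_B) = P(AB) - P(A)P(B) = (2/4)(1/3) - (1/2)(1/2) = -1/12.
p_A = Fraction(1, 2)
p_B = Fraction(1, 2)
p_AB = Fraction(2, 4) * Fraction(1, 3)      # multiplication rule
cov_formula = p_AB - p_A * p_B
assert cov_formula == Fraction(-1, 12)

# Check by enumerating the equally likely orderings of the four tickets.
box = [1, 1, 0, 0]
orders = list(permutations(box))
e_A = Fraction(sum(o[0] for o in orders), len(orders))
e_B = Fraction(sum(o[1] for o in orders), len(orders))
e_AB = Fraction(sum(o[0] * o[1] for o in orders), len(orders))
assert e_AB - e_A * e_B == cov_formula
```

The covariance is negative, as it should be for sampling without replacement: a 1 on the first draw makes a 1 on the second less likely.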