*
I try to get the students to think about how to "measure" how close
the distributions are to the Normal. Over the course of the homework,
I suggest that they think about Q-Q plots and then compute the
correlation between the empirical quantiles and the theoretical
quantiles. I then lead them towards qqnorm to compute these without
plotting them (plot.it = FALSE) and then to use cor.
*
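
The computation described in the note above can be sketched in R as follows. This is only an illustration, assuming a numeric vector of sample values; qqCor is a hypothetical helper name and the sample sizes are arbitrary.

```r
# Correlation between the empirical quantiles and the theoretical
# Normal quantiles, computed without drawing the Q-Q plot.
# qqCor is a hypothetical name used only for this sketch.
qqCor <- function(x) {
  q <- qqnorm(x, plot.it = FALSE)  # list: x = theoretical, y = sample quantiles
  cor(q$x, q$y)
}

set.seed(1)
normalCor <- qqCor(rnorm(1000))  # Normal data: correlation close to 1
skewedCor <- qqCor(rexp(1000))   # skewed data: correlation is lower
```

A value near 1 indicates that the points in the Q-Q plot fall close to a straight line.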

Make this efficient!
<2>Timing Algorithms
We usually think of random variables and
associated density functions as being neat mathematical functions
expressed as a function of the value and some parameters. Mixture
distributions are slightly more general versions of random variables
and densities, but require more complex parameters. The idea of a
mixture is relatively simple. To sample from one, we use a two-step
random process. We have k simpler, regular densities, and we first
select which one of these k densities to sample from. Having
selected the component of the mixture, we generate a value from that
density in the usual manner for that density, and then we have a
value from the mixture distribution. To specify a mixture we
therefore have to specify the probabilities of selecting components
1, 2, ..., k -- (p_{1}, p_{2}, ..., p_{k}) -- and the
form and parameters for each of the k component densities.
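
As a concrete illustration of the two-step process, here is a minimal sketch that draws a single value from a mixture of three Normals. The probabilities, means and SDs are made-up values chosen only for the example.

```r
# Illustrative parameters for k = 3 components (not from the assignment).
p     <- c(0.2, 0.5, 0.3)   # probability of selecting each component
mu    <- c(-2, 0, 3)        # component means
sigma <- c(1, 0.5, 2)       # component standard deviations

set.seed(42)
i <- sample(seq_along(p), 1, prob = p)       # step 1: select a component
x <- rnorm(1, mean = mu[i], sd = sigma[i])   # step 2: draw from that component
```
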
In this question, we will focus on a mixture of k Normals. Your
job is to write a function that returns n sample values from a
mixture of Normals. It should accept three parameters: the number
of sample values (n), the probabilities for sampling from each
component (p above), and an object representing the densities of
the components. For the latter, one might choose to use a k x 2
matrix to represent these parameters, or a list of length k with each
element a named vector giving the mean and SD. Alternatively, one
might use a list of functions to represent the densities, with each
having its own parameters stored within the function (e.g. using
lexical scoping). This last approach is the most general and allows
one to have different types of distributions within the mixture. One
can also create a new class to represent these and provide methods
for the class and for transforming from one form to another. You can
choose for yourselves.
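
To make the choices concrete, the following sketch shows the three representations for a mixture of k = 2 Normals. All names (paramMatrix, paramList, makeNormal, generators, fromMatrix) and parameter values are hypothetical and chosen only for illustration.

```r
# 1. A k x 2 matrix, one row per component.
paramMatrix <- cbind(mean = c(0, 5), sd = c(1, 2))

# 2. A list of length k, each element a named vector.
paramList <- list(c(mean = 0, sd = 1), c(mean = 5, sd = 2))

# 3. A list of generator functions. Each closure keeps its own
#    parameters via lexical scoping; force() evaluates them at creation.
makeNormal <- function(mean, sd) {
  force(mean)
  force(sd)
  function(n) rnorm(n, mean, sd)
}
generators <- list(makeNormal(0, 1), makeNormal(5, 2))

# Transforming from the matrix form to the list-of-functions form.
fromMatrix <- lapply(seq_len(nrow(paramMatrix)), function(i)
  makeNormal(paramMatrix[i, "mean"], paramMatrix[i, "sd"]))
```

With the third form, a component could equally be built from rexp or runif, which is what makes it the most flexible of the three.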
You will write two versions of the function
in order to compare the running times of two algorithms.
These are as follows.

- The simplest way is to select which of the k distributions to use for each of the n points, then iterate over these n "component identifiers" and, for each one, take a sample of size 1 from the corresponding density.
- An alternative approach is to generate the component identifiers for the n deviates but then to determine how many observations are to be taken from each of the k components. Then iterate over the k components and generate the appropriate number of random values from each of these.
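
One possible shape for the two versions, and for timing them with system.time, is sketched below. It is not the only way to do it: the function names rmixLoop and rmixGrouped are hypothetical, and the sketch assumes the component parameters are given as a k x 2 matrix with the mean in column 1 and the SD in column 2.

```r
# Version 1: one call to rnorm per observation.
rmixLoop <- function(n, p, params) {
  ids <- sample(nrow(params), n, replace = TRUE, prob = p)
  vapply(ids, function(i) rnorm(1, params[i, 1], params[i, 2]), numeric(1))
}

# Version 2: one call to rnorm per component.
rmixGrouped <- function(n, p, params) {
  k <- nrow(params)
  ids <- sample(k, n, replace = TRUE, prob = p)
  counts <- tabulate(ids, nbins = k)   # observations needed per component
  x <- numeric(n)
  for (j in seq_len(k))
    x[ids == j] <- rnorm(counts[j], params[j, 1], params[j, 2])
  x
}

# Illustrative parameters and a simple timing comparison.
params <- cbind(mean = c(-2, 0, 3), sd = c(1, 0.5, 2))
p <- c(0.2, 0.5, 0.3)
n <- 1e5
timeLoop    <- system.time(rmixLoop(n, p, params))
timeGrouped <- system.time(rmixGrouped(n, p, params))
```

Filling x at the positions where ids == j keeps the values in the order in which the components were selected, so the result is not sorted by component. The timings depend on the machine and on n, so it is worth repeating the comparison for several values of n and k.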

Duncan Temple Lang <duncan@wald.ucdavis.edu> Last modified: Tue Jul 15 12:38:14 PDT 2008