For a review of how to start R, see the beginning of Lab 1.
The following examples demonstrate how to calculate the value of the cumulative distribution function at (or the probability to the left of) a given number.
> x <- c(-2,-1,0,1,2)
> x
[1] -2 -1  0  1  2
> pnorm(x)
[1] 0.02275013 0.15865525 0.50000000 0.84134475 0.97724987
> x <- c(0,1,2,5,8,10,15,20)
> pbinom(x,size=20,prob=.2)
[1] 0.01152922 0.06917529 0.20608472 0.80420779 0.99001821 0.99943659 0.99999999
[8] 1.00000000
> x <- c(0,1,2,5,8,10,15,20)
> ppois(x,6)
[1] 0.002478752 0.017351265 0.061968804 0.445679641 0.847237494 0.957379076
[7] 0.999490902 0.999998545
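The cumulative probabilities above can be combined to obtain interval and upper-tail probabilities. The following is a minimal sketch; the variable names are chosen here for illustration only.

```r
# P(-1 < Z <= 1) for a standard normal: difference of two CDF values
p_interval <- pnorm(1) - pnorm(-1)

# P(X > 5) for Bin(20, 0.2): complement of the CDF,
# or equivalently the upper tail via lower.tail = FALSE
p_upper_a <- 1 - pbinom(5, size = 20, prob = 0.2)
p_upper_b <- pbinom(5, size = 20, prob = 0.2, lower.tail = FALSE)

print(p_interval)
print(c(p_upper_a, p_upper_b))
```

For probabilities very close to 0 or 1, the lower.tail = FALSE form is usually the more accurate of the two.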
Exercise : Calculate the following probabilities :
The following examples show how to compute the quantiles of some common distributions for a given probability (that is, a number between 0 and 1).
> y <- c(.01,.05,.1,.2,.5,.8,.95,.99)
> qnorm(y,mean=0,sd=1)
[1] -2.3263479 -1.6448536 -1.2815516 -0.8416212  0.0000000  0.8416212  1.6448536
[8]  2.3263479
> y <- c(.01,.05,.1,.2,.5,.8,.95,.99)
> qbinom(y,size=30,prob=.2)
[1]  1  3  3  4  6  8 10 11
> y <- c(.01,.05,.1,.2,.5,.8,.95,.99)
> qpois(y,6)
[1]  1  2  3  4  6  8 10 12
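The quantile functions are the inverses of the cumulative distribution functions shown earlier. The sketch below checks this numerically; the names roundtrip, at_q and below_q are illustrative and not part of the lab.

```r
y <- c(.01, .05, .1, .2, .5, .8, .95, .99)

# Continuous case: pnorm undoes qnorm (up to floating-point error)
roundtrip <- pnorm(qnorm(y))
print(all.equal(roundtrip, y))

# Discrete case: qbinom(p) is the smallest k with P(X <= k) >= p,
# so the CDF at k is at least p while the CDF at k - 1 is below p
q       <- qbinom(y, size = 30, prob = .2)
at_q    <- pbinom(q,     size = 30, prob = .2)
below_q <- pbinom(q - 1, size = 30, prob = .2)
print(rbind(y, q, below_q, at_q))
```

This explains why the discrete quantiles above are whole numbers: the CDF jumps past the requested probability at an integer rather than hitting it exactly.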
The following examples illustrate how to generate random samples from some of the well-known probability distributions.
The first sample is from the N(0,1) distribution and the next one from the N(5,1) distribution.
> z <- rnorm(10)
> z
 [1] -0.90361592 -1.96522764 -1.35107949 -0.10846423  0.29756634  1.40831606
 [7] -0.07844737  1.40575257 -0.97511415 -0.33418299
If you would like to see what the distribution of the sample points looks like:
> w <- rnorm(1000,mean=5,sd=1)
> hist(w)
> k <- rbinom(20,size=5,prob=.2)
> k
 [1] 1 2 0 1 0 0 0 2 0 1 0 0 0 0 0 2 4 1 1 1
> x <- rpois(20,6)
> x
 [1]  2  8  7  5  5  5  3  8  5  5  1  8  5  5  5  4 10  7  3  4
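Because these samples are random, your output will differ from the values printed above. If you want results you can reproduce, one option is to fix the random seed with set.seed before sampling. The sketch below assumes an arbitrary seed value of 1.

```r
set.seed(1)            # the seed value 1 is arbitrary
a <- rnorm(5)
set.seed(1)            # resetting the seed replays the same draws
b <- rnorm(5)
print(identical(a, b))

# With a large sample, the sample mean of Poisson(6) draws
# should be close to the true mean 6
big <- rpois(10000, 6)
print(mean(big))
```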
Exercise (Advanced) : Generate 500 samples from Student's t distribution
with 5 degrees of freedom and plot the histogram. (Note: the t
distribution is going to be covered in class). The corresponding
function is rt .
> x11()
> x <- seq(-4.5,4.5,.1)
> normdensity <- dnorm(x,mean=0,sd=1)
> plot(x,normdensity,type="l")
> par(mfrow=c(2,1))
> k <- c(1:30)
> plot(k,dbinom(k,size=30,prob=.15),type="h")
> plot(k,dbinom(k,size=30,prob=.4),type="h")
> par(mfrow=c(1,1))
> dbinom(3,size=10,prob=0.5)
[1] 0.1171875
** Note the distinction between the continuous (Normal) and the discrete (Binomial) distributions.
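The distinction can also be checked numerically. The sketch below (variable names are illustrative) shows that dbinom returns actual probabilities that sum to 1 and accumulate to pbinom, whereas dnorm returns the height of a density curve, so probabilities come from areas under it.

```r
k <- 0:10
pmf <- dbinom(k, size = 10, prob = 0.5)

# A probability mass function sums to 1 over its support
print(sum(pmf))

# Running sums of the pmf reproduce the CDF returned by pbinom
print(all.equal(cumsum(pmf), pbinom(k, size = 10, prob = 0.5)))

# For the normal, the area under dnorm between -1 and 1
# matches the difference of CDF values
area <- integrate(dnorm, lower = -1, upper = 1)$value
print(all.equal(area, pnorm(1) - pnorm(-1), tolerance = 1e-6))
```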
Exercise : Plot the probability mass functions for the Poisson distribution with mean 4.5 and 12 respectively. Do you see any similarity of these plots to any of the plots above? If so, can you guess why ?
Exercise : Recreate the probabilities that Professor Holmes did in class (Bin(5,.4)) [You can do it in 1 command!] How would you get the expected counts?
R has two different functions that can be used for generating a Q-Q plot. Use the function qqnorm to plot sample quantiles against the theoretical (population) quantiles of a standard normal random variable.
Example :
> stdnormsamp <- rnorm(100,mean=0,sd=1)
> normsamp <- rnorm(100,mean=5,sd=1)
> binomsamp <- rbinom(100,size=20,prob=.25)
> poissamp <- rpois(100,5)
> par(mfrow=c(2,2))
> qqnorm(stdnormsamp,main="Normal Q-Q plot : N(0,1) samples")
> qqline(stdnormsamp,col=2)
> qqnorm(normsamp,main="Normal Q-Q plot : N(5,1) samples")
> qqline(normsamp,col=2)
> qqnorm(binomsamp,main="Normal Q-Q plot : Bin(20,.25) samples")
> qqline(binomsamp,col=2)
> qqnorm(poissamp,main="Normal Q-Q plot : Poisson(5) samples")
> qqline(poissamp,col=2)
Note: Systematic departure of points from the Q-Q line (the red straight line in the plots) would indicate some type of departure from normality for the sample points.
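To see what qqnorm is doing, it can help to build the same pairs of points by hand. The sketch below assumes an arbitrary seed and uses illustrative names (samp, theo); it relies on ppoints, which supplies the plotting positions that qqnorm uses by default.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
samp <- rnorm(100, mean = 5, sd = 1)

# Theoretical N(0,1) quantiles at the default plotting positions
theo <- qnorm(ppoints(length(samp)))

# qqnorm pairs the sorted sample with these theoretical quantiles;
# plot.it = FALSE returns the coordinates without drawing
qq <- qqnorm(samp, plot.it = FALSE)
print(all.equal(sort(qq$x), theo))
print(all.equal(sort(qq$y), sort(samp)))
```

Calling plot(theo, sort(samp)) would therefore draw the same points as qqnorm(samp).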
Use the function qqplot to plot the sample quantiles of one sample against the sample quantiles of another sample.
Example :
> par(mfrow=c(2,1))
> qqplot(stdnormsamp,normsamp,xlab = "Sample quantiles : N(0,1) samples",
+ ylab = "Sample quantiles : N(5,1) samples")
> qqplot(stdnormsamp,binomsamp,xlab = "Sample quantiles : N(0,1) samples",
+ ylab = "Sample quantiles : Bin(20,.25) samples")
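When the two samples have the same size, qqplot simply plots one sorted sample against the other. The sketch below checks this with freshly generated samples; the names a and b and the seed are assumptions made for illustration.

```r
set.seed(7)                       # arbitrary seed, for reproducibility only
a <- rnorm(100)                   # N(0,1) sample
b <- rnorm(100, mean = 5, sd = 1) # N(5,1) sample

# plot.it = FALSE returns the plotted coordinates without drawing
qq <- qqplot(a, b, plot.it = FALSE)
print(all.equal(qq$x, sort(a)))
print(all.equal(qq$y, sort(b)))

# A shift in location appears as a vertical offset between
# matched quantiles; here the offset should be near 5
print(median(qq$y - qq$x))
```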
Exercise : Generate 100 samples from Student's t distribution
with 4 degrees of freedom and generate the Q-Q plot for this
sample. Generate another sample of the same size, but now from a t
distribution with 30 degrees of freedom, and generate the Q-Q plot.
Do you see any difference ?