Suppose you want to calculate the mean within each group, where the
group is defined by a column (field) in your dataframe. First use
the `split()` function to split the data by group, with the
output being a list. Then use `lapply()` to perform the calculation
on each element of the list. You can also create discrete groups from
a continuous variable using the `cut()` function.
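
As a minimal sketch of the split-then-apply pattern on a data frame, assume a made-up data frame `df` with a grouping column `region` and a numeric column `price` (the names and values are hypothetical, chosen only for illustration):

```r
# Toy data frame; column names and values are invented for illustration.
df <- data.frame(
  region = c("north", "south", "north", "south", "east"),
  price  = c(200, 150, 220, 170, 300)
)

# split() returns a list with one numeric vector per distinct value of region.
priceByRegion <- split(df$price, df$region)

# lapply() applies mean() to each list element; unlist() flattens the
# result into a named numeric vector.
meanByRegion <- unlist(lapply(priceByRegion, mean))
print(meanByRegion)
```

If you prefer a single step, `sapply(priceByRegion, mean)` should give the same named vector.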

Here's an example. Suppose I have a vector `housePrice` and
a vector `income` where the observations are the house price
and income for a number of households. To calculate the mean house
price for people with similar incomes, I can do the following:
`categories=cut(income,breaks=c(seq(0,100000,by=10000),500000))`
`groupedPrices=split(housePrice,categories)`
`meanPrice=unlist(lapply(groupedPrices,mean))`
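
To see what those three steps produce, here is a small runnable sketch. The `income` and `housePrice` values are invented toy numbers, not real data:

```r
# Toy data; these values are invented purely for illustration.
income     <- c(5000, 12000, 18000, 95000, 250000)
housePrice <- c(100, 150, 170, 400, 900)

# cut() builds right-closed bins by default: 0 < income <= 10000, ...,
# 90000 < income <= 100000, and a final bin 100000 < income <= 500000.
categories <- cut(income, breaks = c(seq(0, 100000, by = 10000), 500000))

# One list element per bin, including bins that contain no households.
groupedPrices <- split(housePrice, categories)
meanPrice <- unlist(lapply(groupedPrices, mean))

# Empty bins show up as NaN, because mean(numeric(0)) is NaN.
print(meanPrice)
```

Note that with the default arguments an income of exactly 0 falls outside the first bin and becomes `NA`; passing `include.lowest=TRUE` to `cut()` would include it.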

You can do something similar using the `aggregate()` function:
`categories=cut(income,breaks=c(seq(0,100000,by=10000),500000))`
`meanPrice=aggregate(housePrice,by=list(categories),FUN='mean')`
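
A small sketch of the `aggregate()` version, again on invented toy values. The list element name `incomeGroup` is a hypothetical label added here so the grouping column in the result has a readable name:

```r
# Toy data; these values are invented purely for illustration.
income     <- c(5000, 12000, 18000, 95000, 250000)
housePrice <- c(100, 150, 170, 400, 900)
categories <- cut(income, breaks = c(seq(0, 100000, by = 10000), 500000))

# aggregate() returns a data frame with one row per non-empty group.
# Naming the list element sets the name of the grouping column.
meanPrice <- aggregate(housePrice, by = list(incomeGroup = categories), FUN = mean)
print(meanPrice)
```

Unlike the `split()` and `lapply()` approach, the result here is a data frame rather than a named vector, and income bins with no observations are dropped by default instead of appearing as `NaN`.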

I haven't used it, but the `reshape` package looks to be useful for manipulating the dimensions of datasets.

Last modified: 12/13/08.

Chris Paciorek 2012-01-21