Suppose you want to calculate the mean within each group, where the group is defined by a column (field) in your dataframe. First use the split() function to split the data by group, with the output being a list. Then use lapply() to perform the calculation on each element of the list. You can also create discrete groups from a continuous variable using the cut() function.
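As a minimal sketch of the split() and lapply() steps, assume a small made-up data frame df whose column group defines the groups and whose column value holds the numbers to average (all of these names are hypothetical):

```r
# Hypothetical data frame: 'group' defines the groups, 'value' is the measurement
df <- data.frame(
  group = c("a", "b", "a", "b", "a"),
  value = c(1, 2, 3, 4, 5)
)

# split() returns a list with one element per group
byGroup <- split(df$value, df$group)

# lapply() applies mean() to each element of the list and returns a list
groupMeans <- lapply(byGroup, mean)

# sapply() does the same but simplifies the result to a named vector
groupMeansVec <- sapply(byGroup, mean)
```

Use sapply() rather than lapply() when a named vector is more convenient than a list.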
Here's an example. Suppose I have a vector housePrice and
a vector income where the observations are the house price
and income for a number of households. To calculate the median house
price for people with similar incomes, I can do the following:
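One way this might look, as a sketch only: the numbers below are made up for illustration, and the income break points passed to cut() are arbitrary assumptions.

```r
# Made-up data: income and house price for six households
income     <- c(25000, 40000, 55000, 70000, 85000, 120000)
housePrice <- c(90000, 150000, 210000, 260000, 330000, 500000)

# cut() turns the continuous income into a factor of income brackets;
# the break points are arbitrary and chosen only for this example
incomeGroup <- cut(income, breaks = c(0, 50000, 100000, Inf))

# Split the house prices by income bracket, then take the median in each
priceByIncome <- split(housePrice, incomeGroup)
medianPrice <- lapply(priceByIncome, median)
```

By default cut() builds intervals that are open on the left and closed on the right, so an income of exactly 50000 falls in the first bracket here.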
You can do something similar using the aggregate() function:
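A sketch of the aggregate() version, again with made-up data and arbitrary income break points; the column name medianPrice is a label chosen for this example:

```r
# Made-up data: income and house price for six households
income     <- c(25000, 40000, 55000, 70000, 85000, 120000)
housePrice <- c(90000, 150000, 210000, 260000, 330000, 500000)

# Income brackets, with arbitrary break points
incomeGroup <- cut(income, breaks = c(0, 50000, 100000, Inf))

# aggregate() takes the data, a list of grouping variables, and the
# function to apply within each group; it returns a data frame
result <- aggregate(
  x   = data.frame(medianPrice = housePrice),
  by  = list(incomeGroup = incomeGroup),
  FUN = median
)
```

Unlike split() followed by lapply(), which returns a list, aggregate() returns a data frame with one row per group.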
I haven't used it, but the reshape package looks useful for manipulating the dimensions of datasets.
Last modified: 12/13/08.