- by(), table()
- boxplot(), legend(), lines(), points(), axis(), etc.

- How many people are included in the database?
- Are all of these players? How many are players, how many are managers, and how many are both?
- What is the earliest season recorded in this database, and what is the most recent?
- Which college produced the largest number of major league baseball players? How many colleges are there in total?
- Can we tell who won the "World" Series in a given year?
- Who lost the "World" Series in each year?
- Look at the relationship between the number of games won in a season and winning the "World" Series. Similarly, relate both of these to payroll.
- For 1999, compute the payrolls of the different teams. Can we do this for all years in a single SQL statement?
- Plot the payrolls over years for the different teams. What plot types are good for showing this data? Contrast different graphical techniques. Superimpose the payroll of the two teams that made it to the "World Series" on this plot. Is there a relationship? How about for the teams that made it into the playoffs in a year?
- Show the distributions of the payrolls over years, for example with one boxplot per year. Again, we can superimpose additional attributes, and even lines connecting the different statistics for particular teams if they are not very noisy.

      boxplot(Payroll ~ Year, data = d)
      # Standardize by dividing by the maximum for each year.
      # maxPayroll: per-year maximum payroll, indexed by year
      # (computed here in case it is not already defined)
      maxPayroll = tapply(d$Payroll, d$Year, max)
      d[,3] = d[,3]/maxPayroll[as.character(d[,1])]
      boxplot(Payroll ~ Year, data = d)
- Look at the payrolls for the teams that are in the same league, and then in the same division. Are there any interesting characteristics?
- Is the payroll related to the age of the players? One might expect an old team to be paying veteran players a lot near the end of their careers. Teams with a large number of older players would therefore have a large payroll. Is there any evidence supporting this?
- Look at the distribution of salaries of individual players over time for different teams.
- Look at players and see whether the distribution of home runs has shifted upward over the years.
- Are Hall of Fame players, in general, inducted because of a few rare, excellent performances or seasons, or are they rewarded for consistency over many years?
- Are certain baseball parks better for hitting home runs? Can we tell from this data? Can we make inferences about this question?
- Do teams with a few good players and many mediocre players tend to do better than teams made up of more homogeneous talent?
- Look at the distribution of how well batters do. Does this vary over the years? Do the same players excel each year? Is there clustering, or a bimodal distribution?
- Do pitchers get better with age? Is there an improvement and then a fall-off in performance? Is this related to how old they are, the number of years they have been pitching, or which league they are in and the designated hitter rule? Do we have information about each of these factors, and how can we combine them to present information about the general question?
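
As a starting point for the payroll questions above, here is a minimal sketch in R. It is illustrative only: the table name `Salaries` and the columns `yearID`, `teamID`, and `salary` are assumptions about the database schema (check them against your copy), and the toy data frame holds made-up numbers so that the code runs without a database connection. It shows one way to get team payrolls for all years with a single `GROUP BY`, the equivalent computation in base R, and the per-year standardization used for the boxplots.

```r
# Toy stand-in for a player-salary table. All values are invented for
# illustration; the column names are assumptions about the real schema.
salaries <- data.frame(
  yearID = c(1998, 1998, 1998, 1998, 1999, 1999, 1999, 1999),
  teamID = c("AAA", "AAA", "BBB", "BBB", "AAA", "AAA", "BBB", "BBB"),
  salary = c(1, 2, 3, 5, 2, 2, 6, 4) * 1e6
)

# A single SQL statement that would give payrolls for every team and year,
# assuming a table named Salaries with these columns. It is only stored as
# a string here and is not sent to any database.
payroll_sql <- "SELECT yearID, teamID, SUM(salary) AS payroll
                FROM Salaries
                GROUP BY yearID, teamID"

# Base-R equivalent of the GROUP BY: total salary per (year, team).
d <- aggregate(salary ~ yearID + teamID, data = salaries, FUN = sum)
names(d) <- c("Year", "Team", "Payroll")

# Payrolls for one season are then just a subset.
payroll1999 <- d[d$Year == 1999, ]

# Standardize by each year's maximum so that years are comparable.
# ave() returns the per-year maximum aligned with the rows of d.
d$RelPayroll <- d$Payroll / ave(d$Payroll, d$Year, FUN = max)

# One box per year, before and after standardization.
boxplot(Payroll ~ Year, data = d, main = "Team payroll by year")
boxplot(RelPayroll ~ Year, data = d,
        main = "Payroll relative to the yearly maximum")
```

Keeping the standardized values in a separate column (here the hypothetical name `RelPayroll`) rather than overwriting `Payroll` makes it easy to compare the two plots side by side. With a real connection, the same data frame `d` could come from running the query instead of `aggregate()`.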