------------------------------------------------------------------------
Baseball

The BaseballDatabank database on statdocs.berkeley.edu contains 21 relations (tables). These provide information about major league baseball teams and players for the years 1884 through 2003. The database comes from http://baseball1.info and was created and licensed by Sean Lahman. We may use it for research but cannot distribute it further, and we are grateful for his efforts in compiling and managing this database. The attributes in the different relations are described in the Baseball Archive documentation.

Other sites for baseball data include

* http://asp.usatoday.com/sports/baseball/salaries/default.aspx
* http://www.baseball-databank.org/

The goal of this homework assignment is to illustrate aspects of using a relational database, and to understand how and when to perform some commands in the database and others in R.

To help you gain familiarity with using a database, answer each of the questions below. Provide a history of the R session you used, the requisite plots, and your answers to the questions.

1. Use the table that contains salaries and compute the payroll for each team in 1999. Which teams had the highest payrolls? Were their payrolls much higher than those of other teams?

2. Now modify the above SQL statement to find the team payrolls for all years.

3. Study the change in salary over time. Use the annual inflation rates below to control for inflation. Have salaries kept up with inflation, fallen behind, or grown faster? Use the bwplot function in the lattice package to compare payrolls over the years.

   Year      85    86    87    88    89    90    91    92    93    94    95    96    97    98    99    00    01    02    03
   Rate (%)  ---   1.91  3.66  4.08  4.83  5.39  4.25  3.03  2.98  2.61  2.81  2.93  2.34  1.55  2.19  3.38  2.83  1.59  2.27

4. Have certain teams always had top payrolls over the years? Show this graphically.

5. Augment the SQL statement above to include the following team statistics for each season: the number of games in the season, the number of games won in the season, and whether the team won the division, wild card, league, or world series. To do this you will need to join the salaries table with the team table.

6. Use the xyplot function in the lattice package to explore the relationship between payroll and performance. To do this, control for inflation by cutting the data into six groups according to 3-year intervals, and lay out six plots for these intervals. Construct a factor that indicates whether the team won their division (or wild card), league, or world series. The options are: no title; won division or wild card but not league; won league but not the world series; won the world series.
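As a rough illustration of the database-side and analysis-side steps in questions 1 through 6, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the course database and for R. Everything specific in it is an assumption: the table names (Salaries, Teams), the column names (yearID, teamID, salary, G, W, DivWin, WCWin, LgWin, WSWin), and the 'Y'/'N' flag values should all be checked against the Baseball Archive documentation, and the rows are invented toy data. It also assumes that the rate listed under a year is the percent change in prices from the previous year to that year.

```python
import sqlite3

# Toy stand-in for the real database; every row below is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Salaries (yearID INT, teamID TEXT, playerID TEXT, salary REAL);
CREATE TABLE Teams (yearID INT, teamID TEXT, G INT, W INT,
                    DivWin TEXT, WCWin TEXT, LgWin TEXT, WSWin TEXT);
INSERT INTO Salaries VALUES
  (1999, 'AAA', 'p1', 100.0), (1999, 'AAA', 'p2', 50.0),
  (1999, 'BBB', 'p3', 80.0),  (2000, 'AAA', 'p1', 120.0);
INSERT INTO Teams VALUES
  (1999, 'AAA', 162, 98, 'Y', 'N', 'Y', 'Y'),
  (1999, 'BBB', 162, 70, 'N', 'N', 'N', 'N'),
  (2000, 'AAA', 162, 90, 'N', 'Y', 'N', 'N');
""")

# Sum salaries per team and year in the database, then join the
# per-season team statistics onto each payroll row.
QUERY = """
SELECT p.yearID, p.teamID, p.payroll, t.G, t.W,
       t.DivWin, t.WCWin, t.LgWin, t.WSWin
FROM (SELECT yearID, teamID, SUM(salary) AS payroll
      FROM Salaries GROUP BY yearID, teamID) AS p
JOIN Teams AS t ON t.yearID = p.yearID AND t.teamID = p.teamID
ORDER BY p.yearID, p.teamID;
"""
rows = conn.execute(QUERY).fetchall()

# Annual inflation rates (percent) copied from the assignment's table.
RATES = {1986: 1.91, 1987: 3.66, 1988: 4.08, 1989: 4.83, 1990: 5.39,
         1991: 4.25, 1992: 3.03, 1993: 2.98, 1994: 2.61, 1995: 2.81,
         1996: 2.93, 1997: 2.34, 1998: 1.55, 1999: 2.19, 2000: 3.38,
         2001: 2.83, 2002: 1.59, 2003: 2.27}

def to_2003_dollars(amount, year):
    """Compound the rates for year+1..2003 to restate amount in 2003 dollars."""
    factor = 1.0
    for y in range(year + 1, 2004):
        factor *= 1 + RATES[y] / 100
    return amount * factor

def title(div_win, wc_win, lg_win, ws_win):
    """Return the highest title reached, assuming 'Y'/'N' flags."""
    if ws_win == "Y":
        return "world series"
    if lg_win == "Y":
        return "league"
    if div_win == "Y" or wc_win == "Y":
        return "division or wild card"
    return "no title"

for year, team, payroll, games, wins, dw, wc, lg, ws in rows:
    print(year, team, round(to_2003_dollars(payroll, year), 2),
          games, wins, title(dw, wc, lg, ws))
```

The split follows the spirit of the assignment: aggregation and joining happen in SQL, where the data live, while the inflation adjustment and the four-level title factor are computed afterwards on the much smaller result. The same query text could be sent from R, with the title labels turned into a factor for use with xyplot.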