Connecting to the SQL database using Windows

1  RMySQL

If you check on CRAN, you'll see that the RMySQL package is not available for Windows. (If you use Mac OSX or Linux, you can install RMySQL in the usual way.) To get around this problem, I've created a version of the RMySQL library that you should be able to get to work under Windows. You can download it from http://www.stat.berkeley.edu/classes/s133/RMySQL_0.7-5.zip. To install it, download the file to your computer, choose Packages->Install from Local Zip File , and navigate to wherever you stored the file. To use the package, you'll also need to install the DBI package in the usual way.
Unfortunately, there's one extra detail that you have to take care of before you can load the package. You need to set up an environmental variable that points to the directory where the RMySQL package was installed. This will be a folder called RMySql inside the library folder in the folder in which R was installed. We'll create an .Rprofile file in the working directory that R uses when you start it so that you won't have to enter the command every time you want to use the library. At the R prompt, type
edit()

When the editor opens, type this line into the editor window:
Sys.setenv(MYSQL_HOME=paste(Sys.getenv('R_HOME'),'library','RMySQL',sep='\\'))

Close the file, and save it under the name .Rprofile - notice the leading period and the capital R in the file name. Now exit R. When you run R in the future, you can simply type
library{RMySQL}

to use the package as we've seen in class.

2  RODBC

As an alternative to the above procedure, you can use the RODBC package, which uses Windows' Open Database Connection facility. To use this facility with a particular database, you need to have a piece of software known as a connector installed and configured on your computer. You can download a MySQL ODBC connector at http://dev.mysql.com/downloads/connector/odbc/5.1.html. (I recommend that you use the MSI version of the installer for your version of Windows. When you click the Download button, it will look like you need a password to access the download, but if you look carefully you'll see a link labeled "No thanks, just take me to the downloads!".) Download and install the connector. You'll see no indication that anything has happened, but you can now configure the connector as follows:
Go to Control Panel->Administrative Tools, and double click on "Data Sources (ODBC)". Click "Add", and choose "MySQL ODBC 5.1 Driver" from the dropdown menu. (As a side note, notice the wide range of database connectors that are already available for use through ODBC. After configuring any one of them, you can use them in R in a similar way to the way we'll access MySQL through RODBC.) After clicking "Finish", enter the following:
Data Source Name:     mysql (or some other name if you'd like)
Description:          whatever you want
TCP/IP Server:        localhost
User:                 stat133
Password:             T0pSecr3t

Notice that we've specified localhost as the server. That's because you'll need to access the MySQL server that runs on my office computer through an SSH tunnel, as described at http://www.stat.berkeley.edu/classes/s133/mysqlt.html. If you've established the tunnel, you can click the "Test" button. You should see a window with the message "Connection Successful". At that point, you can choose Baseball from the "Database" dropdown menu. (If you haven't yet configured the tunnel, don't worry - we can specify the database when we connect to the server.)
Once you've entered the information, Click "OK" as necessary to close the ODBC control panel.
Now, in R, install the RODBC package in the usual way.
If you've already set the database to Baseball in the previous step, and you've got the SSH tunnel running, you can connect to the SQL server by entering
con = odbcConnect('mysql')

If you haven't specified the database when configuring the connector, use
con = odbcConnect('mysql;db=baseball')

(Of course if you used a name different than "msyql" when you configured the connector, you would use that name when you connect.)
When you connect through RODBC, the function to make queries is sqlQuery; other useful functions are sqlTables and sqlColumns.
If you're using Mac OSX and would like to use RODBC to connect to the MySQL server, download the appropriate connector from the MySQL website, and go to Applications->Utilities->ODBC Administrator . The process of configuring the connector is very similar to the method described above.

3  A Final Note

I'll do everything I can to help you get one of these methods working on your computer, but if you encounter problems, remember that you can always connect to the SQL server through any of the SCF computers by specifying a hostname of springer , i.e you don't need an SSH tunnel when you use the SCF computers. To use the SCF computers, you can go to Evans 342, or you can access them remotely from anywhere with an internet connection by following the instructions at http://www.stat.berkeley.edu/classes/s133/remote.html.


File translated from TEX by TTH, version 3.67.
On 7 Mar 2011, 15:55.