Stat 133: Concepts in Computing with Data

This course is NOT:

This course will address:

Various aspects of computing for statistics,
NOT the computational aspects of statistics.

DATA Technologies - You will need to work to get data, work with data, and work with the data "owners"

Learn how to think about the data process

An Example: SPAM

Return-Path: whisper@oz.net
Delivery-Date: Fri Sep  6 20:53:36 2002
From: whisper@oz.net (David LeBlanc)
Date: Fri, 6 Sep 2002 12:53:36 -0700
Subject: [Spambayes] Deployment
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJABCAB.tim.one@comcast.net>
Message-ID: <GCEDKONBLEFPPADDJCOECEHJENAA.whisper@oz.net>

You missed the part that said that spam is kept in the "eThunk" and was
viewable by a simple viewer for final disposition?

Of course, with Outbloat, you could fire up PythonWin and stuff the spam
into the Junk Email folder... but then you loose the ability to retrain on
the user classified ham/spam.

David LeBlanc
Seattle, WA USA

> -----Original Message-----
> From: spambayes-bounces+whisper=oz.net@python.org
> [mailto:spambayes-bounces+whisper=oz.net@python.org]On Behalf Of Tim
> Peters
> Sent: Friday, September 06, 2002 12:24
> To: spambayes@python.org
> Subject: RE: [Spambayes] Deployment
>
> [Guido]
> > ...
> > - A program that acts both as a pop client and a pop server.  You
> >   configure it by telling it about your real pop servers.  You then
> >   point your mail reader to the pop server at localhost.  When it
> >   receives a connection, it connects to the remote pop servers, reads
> >   your mail, and gives you only the non-spam.
>
> FYI, I'll never trust such a scheme:  I have no tolerance for false
> positives, and indeed do nothing to try to block spam on any of my email
> accounts now for that reason.  Deliver all suspected spam to a Spam folder
> instead and I'd love it.
> _______________________________________________
> Spambayes mailing list
> Spambayes@python.org
> http://mail.python.org/mailman-21/listinfo/spambayes

Statistical problem:

Goal is to use statistical methodology to filter our mail.
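Before any statistical method can be applied, a raw message like the one above has to be turned into variables. The following is a minimal sketch, assuming Python and its standard `email` module; the function name `message_features` and the particular features it derives are illustrative choices, not something specified in the course materials. The sample text is an abbreviated copy of the message shown above.

```python
from email import message_from_string

# Abbreviated copy of the example message: a few headers, a blank line, then the body.
RAW = """\
Return-Path: whisper@oz.net
From: whisper@oz.net (David LeBlanc)
Date: Fri, 6 Sep 2002 12:53:36 -0700
Subject: [Spambayes] Deployment
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJABCAB.tim.one@comcast.net>

You missed the part that said that spam is kept in the "eThunk" and was
viewable by a simple viewer for final disposition?

> FYI, I'll never trust such a scheme:  I have no tolerance for false
> positives, and indeed do nothing to try to block spam on any of my email
"""


def message_features(raw):
    """Split a raw message into headers and body and derive a few simple variables.

    The chosen variables are hypothetical examples of what a filter might use.
    """
    msg = message_from_string(raw)
    body = msg.get_payload()          # plain (non-multipart) message: body is a string
    lines = body.splitlines()
    quoted = [ln for ln in lines if ln.startswith(">")]
    return {
        "subject": msg["Subject"],
        "is_reply": msg["In-Reply-To"] is not None,
        "n_lines": len(lines),
        "n_quoted": len(quoted),
    }


features = message_features(RAW)
print(features)
```

Each message becomes one row of variables; collecting such rows for many messages known to be spam or not is one plausible way to build the data set that a statistical filter would be fit to.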

Another Example: Web Logs

DATA
nchass03.telenet-ops.be - - [28/Dec/2003:06:33:55 -0600] "GET / HTTP/1.1" 200 718
nchass03.telenet-ops.be - - [28/Dec/2003:06:34:03 -0600] "GET /R.css HTTP/1.1" 200 658
nchass03.telenet-ops.be - - [28/Dec/2003:06:34:13 -0600] "GET /logo.html HTTP/1.1" 200 244
nchass03.telenet-ops.be - - [28/Dec/2003:06:34:23 -0600] "GET /navbar.html HTTP/1.1" 200 1418
nchass03.telenet-ops.be - - [28/Dec/2003:06:34:23 -0600] "GET /banner.shtml HTTP/1.1" 200 3185
nchass03.telenet-ops.be - - [28/Dec/2003:06:34:33 -0600] "GET /Rlogo.jpg HTTP/1.1" 200 8793
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /contrib/extra/dcom HTTP/1.1" 301 342
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /contrib/extra/dcom/ HTTP/1.1" 200 1404
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /icons/blank.gif HTTP/1.1" 200 148
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /contrib/extra/dcom/ HTTP/1.1" 200 1404
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /icons/blank.gif HTTP/1.1" 200 148
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /icons/binary.gif HTTP/1.1" 200 246
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:06 -0600] "GET /icons/back.gif HTTP/1.1" 200 216
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:07 -0600] "GET /icons/compressed.gif HTTP/1.1" 200 1038
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:07 -0600] "GET /icons/text.gif HTTP/1.1" 200 229
pcp07845748pcs.wilmsc01.tn.comcast.net - - [28/Dec/2003:06:36:15 -0600] "GET /contrib/extra/dcom/ReadMe.txt HTTP/1.1" 200 12247
nchass03.telenet-ops.be - - [28/Dec/2003:06:37:15 -0600] "GET /bin/windows HTTP/1.1" 301 335
nchass03.telenet-ops.be - - [28/Dec/2003:06:37:32 -0600] "GET /bin/windows/ HTTP/1.1" 200 1353
pr3-ts.telepac.pt - - [28/Dec/2003:06:38:53 -0600] "GET /bin/linux/debian/dists/sarge/main/binary-i386/Packages.gz HTTP/1.1" 200 2994
pr3-ts.telepac.pt - - [28/Dec/2003:06:39:02 -0600] "GET /bin/linux/debian/dists/sarge/main/binary-i386/Packages.gz HTTP/1.1" 200 2994
pr3-ts.telepac.pt - - [28/Dec/2003:06:39:17 -0600] "GET /bin/linux/debian/dists/sarge/main/binary-i386/Packages.gz HTTP/1.1" 200 2994
nchass03.telenet-ops.be - - [28/Dec/2003:06:39:28 -0600] "GET /bin/windows/base HTTP/1.1" 301 340
nchass03.telenet-ops.be - - [28/Dec/2003:06:39:47 -0600] "GET /bin/windows/base/ HTTP/1.1" 200 2167
nchass03.telenet-ops.be - - [28/Dec/2003:06:41:28 -0600] "GET /bin/windows/base/README.rw1081 HTTP/1.1" 200 11036
ip142177159155.ns.aliant.net - - [28/Dec/2003:06:42:22 -0600] "GET /robots.txt HTTP/1.1" 404 294
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:40 -0600] "GET / HTTP/1.1" 200 718
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:41 -0600] "GET /R.css HTTP/1.1" 200 658
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:41 -0600] "GET /logo.html HTTP/1.1" 200 244
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:41 -0600] "GET /navbar.html HTTP/1.1" 200 1418
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:42 -0600] "GET /banner.shtml HTTP/1.1" 200 3185
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:41 -0600] "GET /navbar.html HTTP/1.1" 200 1418
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:42 -0600] "GET /banner.shtml HTTP/1.1" 200 3185
p38.nas2.is4.u-net.net - - [28/Dec/2003:06:43:43 -0600] "GET /Rlogo.jpg HTTP/1.1" 200 8793
lj1229.inktomisearch.com - - [28/Dec/2003:06:45:34 -0600] "GET /robots.txt HTTP/1.0" 404 283
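Each log line above has the same fields: host, two unused fields, a timestamp, the request, a status code, and a byte count. The following is a minimal sketch of reading such lines into records, assuming Python and its standard `re`, `datetime`, and `collections` modules; the names `LOG_PATTERN` and `parse_line` are illustrative, and the pattern is written only for lines shaped like the ones shown here.

```python
import re
from collections import Counter
from datetime import datetime

# One regular expression per line: host, ident, user, [time], "request", status, size.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if the line does not match."""
    m = LOG_PATTERN.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


# Four lines copied from the data above.
SAMPLE = [
    'nchass03.telenet-ops.be - - [28/Dec/2003:06:33:55 -0600] "GET / HTTP/1.1" 200 718',
    'nchass03.telenet-ops.be - - [28/Dec/2003:06:37:15 -0600] "GET /bin/windows HTTP/1.1" 301 335',
    'ip142177159155.ns.aliant.net - - [28/Dec/2003:06:42:22 -0600] "GET /robots.txt HTTP/1.1" 404 294',
    'lj1229.inktomisearch.com - - [28/Dec/2003:06:45:34 -0600] "GET /robots.txt HTTP/1.0" 404 283',
]

records = [r for r in map(parse_line, SAMPLE) if r is not None]
hits_by_host = Counter(r["host"] for r in records)
status_counts = Counter(r["status"] for r in records)
print(hits_by_host)
print(status_counts)
```

Returning None for a line that does not match, rather than raising an error, lets malformed lines (for example two entries run together on one line) be counted and inspected separately instead of stopping the whole read.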

Steps

Goals

Goals of the Course

General Information

Course Materials

Computing Resources

Grading

Participation  10%  Class mailing list and in class

Homeworks      30%  5 short computing assignments

Projects       60%  4 parts, each worth 15%
                    May be done in groups of 2
                    SPAM, Web site access, Network intrusion detection,
                    E-mail connectivity and social networks

Expectations: Although there is no computing, probability, or statistics prerequisite for this course, there is an expectation that you have the

to Explore on your own and Learn as needed.




2004-01-21