XML Assignments
Zillow
Sign up with Zillow.com to get a free "Zillow Web Service Identifier" (ZWS-ID).
Possible tasks
- Use the code from the example in the lecture to make queries to get estimates for different houses.
- For a given house (identified by its Zillow Property ID, an integer), get information about the comparable houses and explore the distribution of values.
Then explore the joint distribution of estimated value and
- number of square feet,
- lot size,
- number of rooms,
- number of bathrooms,
- year built,
- last sold price and date
Some sample code
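library(RCurl)   # provides getForm()
library(XML)     # provides xmlInternalTreeParse(), xmlRoot(), saveXML()
# query the GetDeepComps service; replace the placeholder with your own ZWS-ID
reply = getForm("http://www.zillow.com/webservice/GetDeepComps.htm",
                'zws-id' = "xxxxxxxxxx",
                zpid = 24842792,
                count = 10)
# then parse the XML
doc = xmlInternalTreeParse(reply, asText = TRUE)
top = xmlRoot(doc)
# names of the elements under <comparables>
names(top[["response"]][[1]][["comparables"]])
# print the first comparable house as XML text
cat(saveXML(top[["response"]][[1]][["comparables"]][[1]]))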
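Once the reply is parsed, the comparables can be collected into a data frame for the exploratory tasks listed above. The following is a minimal sketch that works on a small inline fragment rather than a live reply. The fragment, its element names (`comp`, `finishedSqFt`, `zestimate/amount`) and all values are assumptions made for illustration, and `num()` is a hypothetical helper, so check the names against what the service actually returns.

```r
library(XML)

# Hypothetical, heavily simplified fragment with made-up values; a real
# reply has more fields and may name or nest them differently.
txt <- '<comparables>
  <comp score="9.0">
    <zpid>1001</zpid>
    <finishedSqFt>1500</finishedSqFt>
    <zestimate><amount currency="USD">450000</amount></zestimate>
  </comp>
  <comp score="7.5">
    <zpid>1002</zpid>
    <finishedSqFt>2100</finishedSqFt>
    <zestimate><amount currency="USD">610000</amount></zestimate>
  </comp>
</comparables>'

doc <- xmlParse(txt, asText = TRUE)

# Helper (our own, not part of the XML package): evaluate an XPath
# expression and return the text of the matching nodes as numbers.
num <- function(path) as.numeric(xpathSApply(doc, path, xmlValue))

comps <- data.frame(zpid  = num("//comp/zpid"),
                    sqft  = num("//comp/finishedSqFt"),
                    value = num("//comp/zestimate/amount"))

summary(comps$value)   # distribution of the estimated values
```

From here, something like `plot(value ~ sqft, data = comps)` shows the joint distribution of estimated value and floor area. Note that collecting each column with a separate XPath query assumes every comparable has every field; if some are missing, the columns will not line up and the houses should be processed one node at a time instead.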
Statistical Journal Bibliographies
Full description & Links to Data
Read the data detailing papers in the XML bibliographies of various statistical journals and do some exploratory data analysis to
- find counts of papers by journal, year, etc.,
- find the number of papers by author,
- identify collaborators/joint authors,
- track the change in "popularity" of topics based on word analysis,
- etc.
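As a starting point for the counts listed above, here is a minimal sketch using XPath on a small inline document. The record layout (`paper` elements with `journal` and `year` attributes and `author` children) is purely hypothetical; the real bibliography files will use their own element names, so the XPath expressions need to be adapted.

```r
library(XML)

# Hypothetical record layout with made-up entries; the actual bibliography
# files may use different element and attribute names.
txt <- '<bibliography>
  <paper journal="Journal A" year="2001">
    <author>Smith</author><author>Jones</author>
  </paper>
  <paper journal="Journal A" year="2002">
    <author>Smith</author>
  </paper>
  <paper journal="Journal B" year="2002">
    <author>Lee</author><author>Smith</author>
  </paper>
</bibliography>'

doc    <- xmlParse(txt, asText = TRUE)
papers <- getNodeSet(doc, "//paper")

journal <- sapply(papers, xmlGetAttr, "journal")
year    <- as.integer(sapply(papers, xmlGetAttr, "year"))

# Counts of papers by journal and year.
by_journal_year <- table(journal, year)

# Number of papers per author, most prolific first.
authors   <- xpathSApply(doc, "//paper/author", xmlValue)
by_author <- sort(table(authors), decreasing = TRUE)

# Number of authors on each paper, a first step toward studying collaboration.
n_authors <- sapply(papers, function(p) length(getNodeSet(p, "./author")))
```

Real author names usually need cleaning (initials, accents, inconsistent ordering) before counts like `by_author` can be trusted.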
2004 National Election
Scrape the 2004 presidential election results by state (not county) from the USA Today HTML pages.
- Compare the presidential race to the Senate and House races.
For each state, get the information (name, party, number of votes) about the winners of the Senate and House seats and the total number of votes in each race, and "compare" with the presidential outcome across the 50 states.
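The scraping step can be sketched as follows, using an inline HTML table in place of a downloaded page. The table layout, candidate names and vote counts are all made up for illustration; the real pages have their own structure and typically contain several tables, so the XPath expressions are assumptions that will need adjusting.

```r
library(XML)

# Hypothetical results table with made-up figures; the real pages have
# their own layout.
html <- '<html><body><table>
  <tr><th>Candidate</th><th>Party</th><th>Votes</th></tr>
  <tr><td>Candidate A</td><td>Rep.</td><td>1,234,567</td></tr>
  <tr><td>Candidate B</td><td>Dem.</td><td>1,111,111</td></tr>
</table></body></html>'

doc <- htmlParse(html, asText = TRUE)

# Pull each column out of the data rows; the header row uses <th>, so
# selecting <td> cells skips it.
votes_txt <- xpathSApply(doc, "//table//tr/td[3]", xmlValue)

res <- data.frame(
  candidate = xpathSApply(doc, "//table//tr/td[1]", xmlValue),
  party     = xpathSApply(doc, "//table//tr/td[2]", xmlValue),
  # Vote counts arrive as text with thousands separators; strip the commas.
  votes     = as.numeric(gsub(",", "", votes_txt)),
  stringsAsFactors = FALSE)

winner <- res[which.max(res$votes), ]   # row for the candidate with most votes
total  <- sum(res$votes)                # total number of votes in the race
```

The XML package also provides `readHTMLTable()`, which can convert a whole table to a data frame in one call and may be more convenient when the page's tables are regular.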
RSS Feeds
RSS stands for Really Simple Syndication, and it is.
Parse and explore the content of an RSS feed of interest to you, e.g.
- BBC Headlines at http://en-us.fxfeeds.mozilla.com/en-US/firefox/headlines.xml,
- Google News at http://news.google.com/news?ned=us&topic=h&output=rss
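A minimal sketch of parsing a feed, assuming it follows the usual RSS 2.0 layout (`rss` > `channel` > `item`, with `title`, `link` and `pubDate` in each item). The inline feed below is a made-up stand-in so the example runs without a network connection; for a real feed, fetch the URL first and pass the text to `xmlParse()` in the same way.

```r
library(XML)

# A tiny stand-in for a real feed, following the RSS 2.0 layout.
# The items are made up.
txt <- '<rss version="2.0"><channel>
  <title>Example feed</title>
  <item>
    <title>First headline</title>
    <link>http://example.com/1</link>
    <pubDate>Tue, 15 Jul 2008 10:00:00 GMT</pubDate>
  </item>
  <item>
    <title>Second headline</title>
    <link>http://example.com/2</link>
    <pubDate>Wed, 16 Jul 2008 08:30:00 GMT</pubDate>
  </item>
</channel></rss>'

doc <- xmlParse(txt, asText = TRUE)

# One row per <item>, one column per field of interest.
items <- data.frame(
  title = xpathSApply(doc, "//item/title", xmlValue),
  link  = xpathSApply(doc, "//item/link", xmlValue),
  date  = xpathSApply(doc, "//item/pubDate", xmlValue),
  stringsAsFactors = FALSE)

# Drop the weekday prefix and parse the rest of the date. Month
# abbreviations are matched in the current locale, so this assumes an
# English (or C) locale.
items$when <- as.POSIXct(sub("^[A-Za-z]+, ", "", items$date),
                         format = "%d %b %Y %H:%M:%S", tz = "GMT")
```

With the items in a data frame, exploring the feed (headline lengths, word frequencies in titles, posting times) becomes ordinary data frame work.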
European Central Bank Exchange Rates
The exchange rates for different currencies from 1999 to the present are available from the European Central Bank at http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist.xml
Read this data into R and explore it graphically.
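The reading step might look like the sketch below. It assumes the file uses nested `Cube` elements (one per day with a `time` attribute, each containing one per currency with `currency` and `rate` attributes) in a default namespace; both the layout and the namespace URI are our understanding of the format and should be checked against the actual file. The inline document and its rates are made up so that the example runs offline.

```r
library(XML)

# Small stand-in mimicking the nested <Cube> layout as we understand it;
# the rates are made up for illustration.
txt <- '<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
                 xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
  <Cube>
    <Cube time="2008-07-15">
      <Cube currency="USD" rate="1.59"/>
      <Cube currency="GBP" rate="0.79"/>
    </Cube>
    <Cube time="2008-07-14">
      <Cube currency="USD" rate="1.58"/>
      <Cube currency="GBP" rate="0.80"/>
    </Cube>
  </Cube>
</gesmes:Envelope>'

doc <- xmlParse(txt, asText = TRUE)

# The Cube elements live in a default namespace, so the XPath expression
# needs a prefix bound to that namespace (the prefix "e" is our choice).
ns    <- c(e = "http://www.ecb.int/vocabulary/2002-08-01/eurofxref")
cubes <- getNodeSet(doc, "//e:Cube[@currency]", ns)

# Each currency node takes its date from the enclosing per-day node.
rates <- data.frame(
  date     = as.Date(sapply(cubes, function(n) xmlGetAttr(xmlParent(n), "time"))),
  currency = sapply(cubes, xmlGetAttr, "currency"),
  rate     = as.numeric(sapply(cubes, xmlGetAttr, "rate")),
  stringsAsFactors = FALSE)

# Time series for one currency, in chronological order.
usd <- rates[rates$currency == "USD", ]
usd <- usd[order(usd$date), ]
```

A first plot could then be `plot(rate ~ date, data = usd, type = "l")`. Keeping the data in this long form (one row per date and currency) makes it straightforward to subset or to draw several currencies on one plot.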
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Wed Jul 16 06:29:50 PDT 2008