Undergraduate Research Projects

These are undergraduate projects in the sense that they do not require research-level knowledge of mathematical probability (see Open Problems for ones that do). On the other hand, they are intended to be serious, in that successful results will often become a small part of some future published scholarly paper. For less serious undergraduate "course projects", click here.

Current students (Summer 2012)

Students | Project | Description
Karthik Ganesan | Math models of road networks | Graphics and simulation data for the "binary hierarchy" model in this paper.
Bowen Huang | Simulations of a model for city growth | Graphics and simulation data for a short publication version of this paper.
Jian Li | Dynamic random Gabriel networks | Computer simulations and graphics.
Morgan Thompson [graduate volunteer] | Data on dust-to-dust models | Producing data used in Chapter 11 of Draft write-up of 13 lectures.
Xiaoyu (Lily) Wang | Design of simulation of efficient road networks | Continuing the theme of heuristic algorithms in this paper to study models with junctions.

General styles of possible future projects

Data collection

Simulation - drawing static pictures

In fact, all of these would be even nicer as moving pictures (see below).

Simulation - drawing moving pictures

The processes below are very easy to simulate, but I want "moving pictures", in Java or whatever, to illustrate their dynamics.

Inventing good heuristic algorithms and writing code

Statistical analysis

Math theory

I don't encourage undergrads to attempt new math research, but a good project can be on some topic you have learned from a course or a textbook.

Previous projects and students

A link under "project" indicates a write-up of the student's own work. Where the work features as part of a broader write-up, the link is under "discussion". Some projects were under the URAP and VIGRE programs. Projects directly relevant to the "Probability in the Real World" project are marked (RW).

Students | Project | Description
Fayd Shelley (Summer 2006); Sunny Zhao (Fall 2008) | Coincidences in Wikipedia (RW) | Data appears in this unfinished draft paper.
Dennis Moy (Fall 2006) | A regression model using common baseball statistics to project offensive and defensive efficiency | Undergraduate honors thesis.
Yanjiao Cheng, Jesse Friedman, Yu-Jay Huoh, Wayne Lee, Harrison Liu (Spring 2007) | Statistics of road networks | Data collection for Figure 1 of this paper. URAP and VIGRE.
Tamar Lando (Spring 2008) | Efficient Networks and Enumerations on Forests | Master's thesis. Part will appear as section xxx of xxx.
Julian Shun (Spring 2008) | Optimal spatial networks | Simulations, forming a substantial part of our joint paper.
Robert Huang (Spring 2008) | Exploratory data analysis of Amazon.com book review data | VIGRE.
Eric Chao and Regina Wu (Spring 2009) | | This and the next are continuations of the same project. URAP.
Timothy Wong (Spring 2009) | Exploratory Data Analysis of Amazon.com Book Reviews | Undergraduate honors thesis.
Amy Huang and Irvin Liu (Spring 2009) | References to chance in blogs (RW) | What types of things do "ordinary people" attribute to chance? One way to study this is to search through blogs. URAP.
Amy Huang and Irvin Liu (Spring 2009) | The 1.4 trillion dollar project (RW) | A Google search on "1.4 trillion dollars" gets a surprisingly large number of hits, which can be traced back to a smaller number of different appearances of "1.4 trillion dollars" in some authoritative data. The project was to count this "number of different appearances" for a variety of dollar amounts (2.8 trillion, 1.8 billion, etc.) to see whether they follow a particular "informationless" distribution. URAP.
Tung Phan (Spring 2009) | Benford's law (RW) | Data collection, forming a substantial part of our short joint paper "When Can One Test an Explanation? Compare and Contrast Benford's Law and the Fuzzy CLT", exhibiting a typical undergrad project.
Priscilla Ku and Janet Larwood (Spring 2009) | 40,000 coin tosses yield ambiguous evidence for dynamical bias (RW) | Testing a prediction of Persi Diaconis et al. that in coin-tossing there is a small bias (maybe 1/100) towards the coin landing the same way as it started. URAP.
Alan Choi (Spring 2009) | Statistics of road networks | Data collection, forming a substantial part of our short joint paper "A Route-Length Efficiency Statistic for Road Networks".
Wei Zhou and Jonathan Ong (Spring 2009) | Empires and percolation | Simulations and pictures, used to complement theory in our joint paper "Empires and percolation: stochastic merging of adjacent regions".
Bowei Zheng (2008-2009) | Java simulations for a "parking process" | The process was studied analytically in this old paper.
Tung Phan (Fall 2009) | What can you predict about a team's performance next season? (RW) | Quantifies the regression effect for sports teams.
Karthik Ganesan [VIGRE] (Spring 2012) | Empirical Study on Route-Length Efficiency of Road Networks | Data collection for route-length efficiency of road networks. Graphics used in this talk.
Hyerim Hong [Independent study] (Spring 2012) | Perception on role of chance in different aspects of life | Via a survey.
Bowen Huang [VIGRE] (Spring 2012) | City Growth Model Simulation | Here is a slightly complicated model for city growth in which cities have positions, populations and spheres of influence. It's not hard to simulate the process, but I want some pretty pictures of the spheres of influence.
Willy Lai [VIGRE] (Spring 2012) | Fitting power-law distributions to data | Testing data for fit to power-law distributions, e.g. this data on family names.
Russell Mays [volunteer] (Spring 2012) | Road route networks linking 4 addresses | Take 4 street addresses, whose positions form roughly a square, of side length roughly 5 miles or 50 miles or 500 miles. Use e.g. Google Maps to find the routes between each of the 6 pairs of addresses, and draw a map showing these 6 routes together. Mathematically there are about 45 topologically different possibilities for the map; presumably some come up often and others rarely, but which? And does it vary with distance (side length of the square)?
Max Moacanin [volunteer] (Spring 2012) | Lucky vs Unlucky teams | Assuming gambling odds give true probabilities, one can classify a team as having been lucky or unlucky so far. Do results of matches between lucky and unlucky teams fit the gambling odds?
Selene Xu [Independent study] (Spring 2012) | Study of Auction Theory in eBay Data | Collecting and studying data about auction prices.
Amy Zhang [honors thesis] (Spring 2012) | Pairs trading | A simulation study exploring possible relationships between profit and the variables associated with stock selection in pairs trading.
Yiming Zhou [Independent study] (Spring 2012) | Spatial Poisson processes | Draft of possible Wikipedia article.
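
To give a feel for the coin-tossing project in the table above, here is a minimal sketch of the kind of calculation involved. It assumes that "a bias of 1/100" means the coin lands the same way it started with probability about 0.51, and it tests against a fair-coin null using a normal approximation. The function name and the example count are hypothetical illustrations, not the students' data or their method.

```python
import math


def same_side_z_score(same_side_count: int, n_tosses: int, p_null: float = 0.5) -> float:
    """z-score of the observed same-side proportion under the null value p_null."""
    p_hat = same_side_count / n_tosses
    # Standard error of a sample proportion when the null hypothesis is true.
    standard_error = math.sqrt(p_null * (1 - p_null) / n_tosses)
    return (p_hat - p_null) / standard_error


n_tosses = 40_000
# Standard error of the proportion for a fair coin with this many tosses.
standard_error = math.sqrt(0.25 / n_tosses)

# Hypothetical count: exactly 51% of tosses land the same way as they started.
z = same_side_z_score(20_400, n_tosses)
print(f"standard error = {standard_error:.4f}, z = {z:.2f}")
```

With 40,000 tosses the standard error is 0.0025, so a true bias of 0.01 would sit about four standard errors away from 0.5. That is why this many tosses is roughly the scale needed to detect an effect that small.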
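
For the Benford's law project in the table above, the standard statement of the law is that the leading digit d of a number occurs with probability log10(1 + 1/d). The sketch below computes those probabilities and the observed first-digit frequencies of a data set so that the two can be compared. The helper names are hypothetical, and the sample values are made up purely to show the calls; this is not the data or code from the joint paper.

```python
import math
from collections import Counter


def benford_probabilities() -> dict:
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x: float) -> int:
    """Leading nonzero digit of a nonzero number."""
    # Scientific notation with 17 significant digits, e.g. '1.4000000000000000e+12';
    # the first character is then the leading digit.
    return int(f"{abs(x):.16e}"[0])


def first_digit_frequencies(values) -> dict:
    """Observed relative frequency of each leading digit 1..9 (zeros are skipped)."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Made-up sample values, for illustration only.
sample = [1.4e12, 2.8e12, 1.8e9, 0.00123, 35, 1700, 120, 9.5]
observed = first_digit_frequencies(sample)
expected = benford_probabilities()
for d in range(1, 10):
    print(d, f"observed {observed[d]:.3f}", f"Benford {expected[d]:.3f}")
```

A real comparison would need far more data than this toy sample, followed by a goodness-of-fit test of observed against expected frequencies.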