Student Projects

Supervisor: Jochen Voss — J.Voss@leeds.ac.uk

General Remarks

I am happy to supervise student projects and assignments. If you want to do a project with me, here is some advice:

Example Topics

Estimating the Intensity Function of a Poisson Process

[Poisson process]

(This is a 30 credit project, but could be extended to 40 credits.)

A Poisson process is a stochastic process which consists of a collection of random points in time. An example could be the times at which customers arrive in a shop. The concept can be generalised to processes with points in arbitrary sets (instead of points in time). Different time intervals may have different expected numbers of points; this is described by the intensity function $\lambda$: the expected number of points in the time interval $[a,b]$ is given by $$ E\bigl( N[a,b] \bigr) = \int_a^b \lambda(t) \,dt. $$

The task of this project is to study and compare different methods to estimate the intensity function $\lambda$ for a given instance of a Poisson process. A possible extension would be to consider methods for statistically testing the hypothesis that a sample of a Poisson process corresponds to a given intensity function. The project combines theoretical and computational aspects.
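
To give a feeling for the kind of computation involved, here is a minimal sketch in Python (using NumPy). It simulates a Poisson process by thinning and then estimates $\lambda$ with a Gaussian kernel estimator. The function names, the example intensity $50(1+\sin t)$, the bound `lam_max` and the bandwidth are all assumptions made for illustration. The kernel estimator is only one of many possible methods and no edge correction is applied.

```python
import numpy as np

def simulate_poisson_process(intensity, t_max, lam_max, rng):
    """Sample the points of a Poisson process on [0, t_max] by thinning.

    Assumes intensity(t) <= lam_max for all t in [0, t_max].
    """
    # Step 1: points of a homogeneous process with constant rate lam_max.
    n = rng.poisson(lam_max * t_max)
    candidates = np.sort(rng.uniform(0.0, t_max, size=n))
    # Step 2: keep a candidate at time t with probability intensity(t) / lam_max.
    keep = rng.uniform(size=n) < intensity(candidates) / lam_max
    return candidates[keep]

def kernel_intensity_estimate(points, grid, bandwidth):
    """Gaussian kernel estimate of the intensity at each grid location."""
    z = (grid[:, None] - points[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (bandwidth * np.sqrt(2.0 * np.pi))

def true_intensity(t):
    # Hypothetical intensity, chosen only for this illustration.
    return 50.0 * (1.0 + np.sin(t))

rng = np.random.default_rng(1)
points = simulate_poisson_process(true_intensity, t_max=10.0, lam_max=100.0, rng=rng)
grid = np.linspace(0.0, 10.0, 201)
estimate = kernel_intensity_estimate(points, grid, bandwidth=0.5)
```

Plotting `estimate` against `true_intensity(grid)` shows how the choice of bandwidth trades off bias against variance; near the ends of the interval the estimate is biased downwards because part of the kernel mass falls outside $[0, 10]$.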

Selected references:

Multilevel Monte Carlo Path Simulation

[multi-level BM]

(This project can be taken as either a 40 credit or a 30 credit project.)

Monte Carlo methods allow us to estimate an expectation $E(X)$ numerically by making use of the law of large numbers: $$ E(X) \approx \frac1n \sum_{i=1}^n X_i $$ where the $X_i$ are independent copies of $X$. The result is only approximate: the error decays like $1/\sqrt{n}$ as $n\to\infty$, so the computational cost of approximating the expectation up to a precision of $\varepsilon$ grows like $1/\varepsilon^2$ as $\varepsilon\to 0$. In practical applications, $X$ and the $X_i$ can sometimes be constructed only approximately. In these cases, an additional error is introduced by using approximations $X^{(m)}_i$ instead of $X_i$; increasing $m$ reduces this error but also increases the computational cost.
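
As a small illustration of the basic estimator, the following Python sketch (using NumPy) estimates $E(X)$ for the toy example $X = U^2$ with $U$ uniform on $[0,1]$, where the exact value is $1/3$. The choice of $X$ and the function name `monte_carlo_mean` are assumptions made for this example only.

```python
import numpy as np

def monte_carlo_mean(sample, n, rng):
    """Return the Monte Carlo estimate of E(X) and its estimated standard error."""
    x = sample(n, rng)
    return x.mean(), x.std(ddof=1) / np.sqrt(n)

def sample_x(n, rng):
    # Hypothetical example: X = U^2 with U uniform on [0, 1], so E(X) = 1/3.
    return rng.uniform(size=n) ** 2

rng = np.random.default_rng(0)
for n in (100, 10_000, 1_000_000):
    estimate, std_err = monte_carlo_mean(sample_x, n, rng)
    print(n, estimate, std_err)
```

Each hundredfold increase in $n$ should reduce the reported standard error by roughly a factor of ten, which is the $1/\varepsilon^2$ cost growth seen from the other side.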

In the presence of the two kinds of error discussed above, multilevel Monte Carlo methods minimise the computational cost for estimating $E(X)$ by balancing $m$ and $n$ in a clever way. It turns out that the optimal strategy involves generating many samples for small $m$ (where the construction is cheaper) and only a few samples for big $m$ (where the results are more accurate).
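
The idea can be sketched in Python (using NumPy) as follows. The sketch assumes, purely for illustration, that $X = S_T$ is the final value of a geometric Brownian motion $dS = \mu S\,dt + \sigma S\,dW$, approximated by the Euler method with $2^m$ time steps on level $m$. The parameter values, the function names and the hand-picked sample sizes are assumptions; in particular the sample sizes are not the optimal ones from the theory.

```python
import numpy as np

def level_difference(level, n, rng, s0=1.0, mu=0.05, sigma=0.2, t_end=1.0):
    """Return n samples of P_level - P_(level-1), or of P_0 on level 0.

    P_m is the Euler approximation of S_T using 2**m time steps.
    """
    steps = 2 ** level
    dt = t_end / steps
    dw = rng.normal(0.0, np.sqrt(dt), size=(n, steps))
    fine = np.full(n, s0)
    for k in range(steps):
        fine = fine + mu * fine * dt + sigma * fine * dw[:, k]
    if level == 0:
        return fine
    # Coarse path: half as many steps, driven by the same Brownian increments,
    # so that fine and coarse values are strongly correlated.
    dw_coarse = dw[:, 0::2] + dw[:, 1::2]
    coarse = np.full(n, s0)
    for k in range(steps // 2):
        coarse = coarse + mu * coarse * (2.0 * dt) + sigma * coarse * dw_coarse[:, k]
    return fine - coarse

def mlmc_estimate(samples_per_level, rng):
    """Telescoping sum: E(P_L) = E(P_0) + sum over levels of E(P_m - P_(m-1))."""
    return sum(level_difference(level, n, rng).mean()
               for level, n in enumerate(samples_per_level))

rng = np.random.default_rng(0)
samples_per_level = [100_000, 25_000, 6_000, 1_500, 400]  # many cheap, few expensive
estimate = mlmc_estimate(samples_per_level, rng)
print(estimate, np.exp(0.05))  # for this model, E(S_T) = s0 * exp(mu * t_end)
```

Because the fine and coarse paths share their random increments, the differences have small variance and few samples suffice on the expensive levels.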

The objective of this project is to understand and summarise the underlying theory and to implement the resulting methods. The main focus of this project is theoretical, but obviously programming is also required.

Selected references:

Random Number Generation

[Ziggurat]

(This is a 15 credit project but can be extended to any of the available project types)

Many algorithms in statistics, for example Monte Carlo methods, make use of random numbers. Since computer programs are inherently deterministic (two independent runs of the same program will result in the same output), generating random numbers with a computer is a non-trivial task. This problem is normally solved using pseudo-random number generators (PRNGs).

Modern PRNGs consist of two components:

  1. one algorithm to create (pseudo) random numbers uniformly on the interval $[0,1]$, and
  2. a family of algorithms to convert these random numbers to a given distribution (like normal, binomial, Poisson, etc.).

The aim of this project is to study methods for use in the second of these steps.
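
Two classical examples of such conversion methods are sketched below in Python (using NumPy): the inverse transform method for the exponential distribution and the Box-Muller transform for the standard normal distribution. The function names and the rate parameter are assumptions made for illustration. NumPy's own uniform generator stands in for the first component.

```python
import numpy as np

def exponential_from_uniform(u, rate):
    """Inverse transform method: apply F^{-1}(u) = -log(1 - u) / rate."""
    return -np.log1p(-u) / rate

def normal_from_uniform(u1, u2):
    """Box-Muller transform: two independent uniforms give one N(0,1) sample."""
    # 1 - u1 is used inside the logarithm because u1 may be exactly 0.
    return np.sqrt(-2.0 * np.log1p(-u1)) * np.cos(2.0 * np.pi * u2)

rng = np.random.default_rng(0)
u = rng.uniform(size=(3, 100_000))      # uniform samples on [0, 1)
x_exp = exponential_from_uniform(u[0], rate=2.0)
x_norm = normal_from_uniform(u[1], u[2])
print(x_exp.mean(), x_norm.mean(), x_norm.std())
```

Both methods are easy to justify theoretically but are not necessarily the fastest; comparing the efficiency of different methods is part of the computational side of the project.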

The project has a theoretical component (understanding why and how the methods work) and a computational component (implementing the methods and determining their efficiency). For this topic, the MATH5835 (statistical computing) module is desirable, but not required.

Selected references:

Introduction to the Theory of Large Deviations

[tail probability]

(This project can be taken as a 15 credit project or extended to a 30 credit project.)

Large deviations theory studies the asymptotic behaviour of small probabilities. This provides a refinement of results obtained by the law of large numbers: for example, where the law of large numbers tells us that the average of a series of throws of a fair die converges to 3.5, large deviations theory allows us to study how fast the probability of staying away from this value goes to 0 as the number of throws increases.
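
Although the project itself involves no programming, the dice example can be checked numerically. The Python sketch below (using NumPy) computes the exact probability that the average of $n$ throws is at least 4 and compares the decay rate $-\frac1n \log P$ with the rate function from Cramér's theorem, $I(a) = \sup_\theta \bigl( \theta a - \log M(\theta) \bigr)$, where $M(\theta) = \frac16 \sum_{k=1}^6 e^{\theta k}$. The threshold 4, the one-sided event, the function names and the grid search for the supremum are assumptions made for illustration.

```python
import numpy as np

faces = np.arange(1, 7)

def tail_probability(n, threshold):
    """Exact P(average of n fair dice >= threshold), via repeated convolution."""
    pmf = np.array([1.0])                    # distribution of the empty sum
    for _ in range(n):
        pmf = np.convolve(pmf, np.full(6, 1.0 / 6.0))
    # After n throws, pmf[k] = P(sum = n + k) for k = 0, ..., 5n.
    sums = n + np.arange(pmf.size)
    return pmf[sums >= threshold * n].sum()

def rate_function(a):
    """Approximate I(a) = sup_theta (theta * a - log M(theta)) by grid search."""
    theta = np.linspace(-5.0, 5.0, 100_001)
    log_mgf = np.log(np.exp(np.outer(theta, faces)).mean(axis=1))
    return np.max(theta * a - log_mgf)

for n in (10, 100, 1000):
    p = tail_probability(n, 4.0)
    print(n, p, -np.log(p) / n)
print("rate function at 4:", rate_function(4.0))
```

The printed decay rates should approach the value of the rate function as $n$ grows, although the convergence is slow because of sub-exponential correction terms.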

The aim of this project is to understand and summarise the basics of large deviations theory. This is a theoretical project without a programming component, so a solid understanding of basic probability will be helpful.

Selected references:

Past Projects

The following list contains projects which I have supervised in the past or which I am currently supervising.

| Student | Title | Type | Year |
|---|---|---|---|
| Chris Thompson | Numerical Simulation of SDEs | MSc thesis | 2010/11 |
| Yu-Wei Twu | Statistical Analysis of Web-Server Log Files | MSc thesis | 2010/11 |
| James Gardner | Modelling the Results of Sports Events | 4th year | 2010/11 |
| Mark Webster | Discrete Time Martingales and the Kalman Filter | 4th year | 2010/11 |
| Andreas Tsiatinis | Tools For the Statistical Analysis of Protein Loop Geometry | MSc thesis | 2009/10 |
| Chrystalla Loizou | Numerical Simulation of Stochastic Differential Equations | MSc thesis | 2009/10 |
| Julia Bowman | Random Number Generation | 3rd year | 2009/10 |
| Amy Palmer | Mathematics of Juggling | 3rd year | 2009/10 |