Supervisor: Jochen Voss — J.Voss@leeds.ac.uk
I am happy to supervise student projects and assignments. If you want to do a project with me, here is some advice:
(This is a 30 credit project, but could be extended to 40 credits.)
A Poisson process is a stochastic process which consists of a collection of (random) points in time. An example of a Poisson process could be the points in time at which customers arrive in a shop. The concept of a Poisson process can be generalised to processes with points in arbitrary sets (instead of points in time). Different time intervals may have different expected numbers of points; this is described by the intensity function $\lambda$: the expected number of points in the time interval $[a,b]$ is given by $$ E\bigl( N[a,b] \bigr) = \int_a^b \lambda(t) \,dt. $$
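To make the role of the intensity function concrete, the following minimal sketch simulates a Poisson process on $[0,10]$ and compares the average number of points with $\int_a^b \lambda(t)\,dt$. The thinning method used here, the function names and the sinusoidal intensity are illustrative assumptions, not part of the project description.

```python
import math
import random

def simulate_poisson_process(intensity, a, b, lambda_max, rng):
    """Simulate the points of a Poisson process on [a, b] by thinning.

    Assumes 0 <= intensity(t) <= lambda_max for all t in [a, b].
    """
    points = []
    t = a
    while True:
        # Candidates come from a homogeneous process with rate lambda_max.
        t += rng.expovariate(lambda_max)
        if t > b:
            break
        # Keep each candidate with probability intensity(t) / lambda_max.
        if rng.random() < intensity(t) / lambda_max:
            points.append(t)
    return points

def example_intensity(t):
    # Hypothetical intensity: arrivals peak in the middle of [0, 10].
    return 5.0 + 4.0 * math.sin(math.pi * t / 10.0)

rng = random.Random(1)
counts = [
    len(simulate_poisson_process(example_intensity, 0.0, 10.0, 9.0, rng))
    for _ in range(2000)
]
mean_count = sum(counts) / len(counts)

# Integral of the example intensity over [0, 10], computed by hand.
expected_count = 50.0 + 80.0 / math.pi
print(f"average count: {mean_count:.2f}, integral of intensity: {expected_count:.2f}")
```

With 2000 simulated paths the average count should agree with the integral up to Monte Carlo error.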
The task of this project is to study and compare different methods to estimate the intensity function $\lambda$ for a given instance of a Poisson process. A possible extension would be to consider methods for statistically testing the hypothesis that a sample of a Poisson process corresponds to a given intensity function. The project combines theoretical and computational aspects.
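One of the simplest estimators that such a comparison could include is a histogram estimator, which divides the number of points in each bin by the bin width. The sketch below is only an illustration under assumed settings: the test data come from a homogeneous process with rate 20, and the function name and bin count are hypothetical choices.

```python
import random

def histogram_intensity(points, a, b, num_bins):
    """Piecewise-constant intensity estimate: points per bin / bin width."""
    width = (b - a) / num_bins
    counts = [0] * num_bins
    for t in points:
        if a <= t <= b:
            # Clamp so that t == b falls into the last bin.
            k = min(int((t - a) / width), num_bins - 1)
            counts[k] += 1
    return [c / width for c in counts]

# Hypothetical test data: a homogeneous Poisson process with rate 20 on [0, 10),
# built from independent exponential waiting times.
rng = random.Random(0)
points = []
t = rng.expovariate(20.0)
while t < 10.0:
    points.append(t)
    t += rng.expovariate(20.0)

bin_width = 10.0 / 5
estimate = histogram_intensity(points, 0.0, 10.0, num_bins=5)
print([round(e, 1) for e in estimate])
```

The choice of bin width trades variance against resolution; kernel estimators replace the bins by a smooth weight function and face the same trade-off in the bandwidth.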
Selected references:
(This project can be taken as either a 30 or a 40 credit project.)
Monte Carlo methods estimate an expectation $E(X)$ numerically by making use of the law of large numbers: $$ E(X) \approx \frac1n \sum_{i=1}^n X_i $$ where the $X_i$ are independent copies of $X$. The result is only approximate: the error decays as $n\to\infty$, and the computational cost of approximating the expectation up to a precision of $\varepsilon$ grows like $1/\varepsilon^2$ as $\varepsilon\to 0$. In practical applications, $X$ and the $X_i$ can sometimes be constructed only approximately. In these cases, an additional error is introduced by using approximations $X^{(m)}_i$ instead of $X_i$; increasing $m$ reduces this error but also increases the computational cost.
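The $1/\varepsilon^2$ cost can be seen from the root mean squared error of the estimate, which equals $\sigma/\sqrt{n}$ when $X$ has standard deviation $\sigma$, so a precision of $\varepsilon$ needs roughly $n \approx \sigma^2/\varepsilon^2$ samples. The sketch below illustrates this for the assumed toy example $X = U^2$ with $U$ uniform on $[0,1]$, where $E(X) = 1/3$ is known exactly; the function names are hypothetical.

```python
import random

def monte_carlo_mean(sample, n, rng):
    """Average n independent draws produced by sample(rng)."""
    total = 0.0
    for _ in range(n):
        total += sample(rng)
    return total / n

def sample_x(rng):
    # Toy example: X = U^2 with U uniform on [0, 1], so E(X) = 1/3.
    return rng.random() ** 2

rng = random.Random(42)
exact = 1.0 / 3.0
errors = {
    n: abs(monte_carlo_mean(sample_x, n, rng) - exact)
    for n in (100, 10_000, 100_000)
}
for n, err in errors.items():
    print(f"n = {n:>7}: absolute error {err:.5f}")
```

A single run is random, so the observed errors need not shrink monotonically, but on average a hundredfold increase in $n$ reduces the error by a factor of ten.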
In the presence of the two kinds of error discussed above, multilevel Monte Carlo methods minimise the computational cost for estimating $E(X)$ by balancing $m$ and $n$ in a clever way. It turns out that the optimal strategy involves generating many samples for small $m$ (where the construction is cheaper) and only a few samples for big $m$ (where the results are more accurate).
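A minimal sketch of this idea, under assumptions that are not part of the project description: $X$ is the value at time 1 of a geometric Brownian motion, $X^{(m)}$ is its Euler approximation with $2^m$ time steps, and the estimate is the telescoping sum of $E(X^{(0)})$ and the corrections $E(X^{(m)} - X^{(m-1)})$. The sample counts per level are illustrative rather than optimised, and all names and parameter values are hypothetical.

```python
import math
import random

def coupled_euler_pair(level, rng, mu=0.05, sigma=0.2, s0=1.0, T=1.0):
    """Return (fine, coarse) Euler approximations driven by the same noise.

    The fine path uses 2**level steps and the coarse path half as many.
    At level 0 there is no coarse path and the second value is 0.
    """
    n_fine = 2 ** level
    h = T / n_fine
    fine = s0
    coarse = s0
    dw_pair = 0.0
    for step in range(n_fine):
        dw = rng.gauss(0.0, math.sqrt(h))
        fine += mu * fine * h + sigma * fine * dw
        dw_pair += dw
        if level > 0 and step % 2 == 1:
            # One coarse step of size 2h uses the sum of two fine increments.
            coarse += mu * coarse * 2 * h + sigma * coarse * dw_pair
            dw_pair = 0.0
    return fine, (coarse if level > 0 else 0.0)

def mlmc_estimate(samples_per_level, rng):
    """Sum the Monte Carlo estimates of the level corrections."""
    estimate = 0.0
    for level, n in enumerate(samples_per_level):
        total = 0.0
        for _ in range(n):
            fine, coarse = coupled_euler_pair(level, rng)
            total += fine - coarse
        estimate += total / n
    return estimate

rng = random.Random(7)
# Many cheap samples on coarse levels, few expensive samples on fine levels.
estimate = mlmc_estimate([40_000, 10_000, 5_000, 2_500, 1_250], rng)
exact = math.exp(0.05)  # E(X) for the assumed model with mu = 0.05, T = 1
print(f"MLMC estimate {estimate:.4f}, exact value {exact:.4f}")
```

Because the fine and coarse paths share their random increments, the corrections have small variance, which is why few samples suffice on the expensive levels.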
The objective of this project is to understand and summarise the underlying theory and to implement the resulting methods. The main focus of this project is theoretical, but obviously programming is also required.
Selected references:
(This is a 15 credit project but can be extended to any of the available project types)
Many algorithms in statistics, for example Monte Carlo methods, make use of random numbers. Since computer programs are inherently deterministic (two independent runs of the same program will result in the same output), generating random numbers with a computer is a non-trivial task. This problem is normally solved using pseudo random number generators (PRNGs).
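The determinism mentioned above is easy to demonstrate with a toy generator. The sketch below uses a linear congruential generator, one of the oldest and simplest PRNG designs; the class name and constants are illustrative assumptions, and generators of this kind are not suitable for serious statistical work.

```python
class LinearCongruentialGenerator:
    """Toy PRNG: x_{k+1} = (a * x_k + c) mod m, with output x_{k+1} / m in [0, 1)."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        # The default constants are one commonly quoted choice (an assumption here).
        self.state = seed % m
        self.a, self.c, self.m = a, c, m

    def next_uniform(self):
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m

# Two generators started from the same seed produce identical "random" numbers.
gen1 = LinearCongruentialGenerator(seed=2024)
gen2 = LinearCongruentialGenerator(seed=2024)
run1 = [gen1.next_uniform() for _ in range(5)]
run2 = [gen2.next_uniform() for _ in range(5)]
print(run1 == run2)
```

The output looks irregular, yet it is completely determined by the seed, which is what "pseudo" refers to.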
Modern PRNGs consist of two components:
The aim of this project is to study methods for use in the second of these steps.
The project has a theoretical component (understanding why/how the methods work) and a computational component (implementing the methods and determining their efficiency). For this topic, the MATH5835 (statistical computing) module is desirable, but not required.
Selected references:
(This is a 15 credit project, but could be extended to 30 credits.)
Large deviation theory is the study of the asymptotic behaviour of small probabilities. This provides a refinement of results obtained by the law of large numbers: for example, where the law of large numbers tells us that the average of a series of throws of a fair die converges to 3.5, large deviation theory allows us to study how fast the probability of staying away from this value goes to 0 as the number of throws increases.
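The dice example can be checked numerically. The sketch below computes the exact probability $p_n$ that the average of $n$ throws is at least 4 and compares the decay rate $-\frac1n \log p_n$ with the limit predicted by Cramér's theorem, $I(4) = \sup_\theta \bigl( 4\theta - \log E(e^{\theta X}) \bigr)$. The threshold 4, the function names and the bisection bounds are illustrative assumptions.

```python
import math

def tail_rates(threshold, ns):
    """Return {n: -(1/n) log P(average of n dice >= threshold)}, computed exactly."""
    rates = {}
    counts = [1]  # counts[s] = number of outcomes with total s (initially 0 dice)
    for n in range(1, max(ns) + 1):
        new_counts = [0] * (len(counts) + 6)
        for s, c in enumerate(counts):
            if c:
                for face in range(1, 7):
                    new_counts[s + face] += c
        counts = new_counts
        if n in ns:
            favourable = sum(counts[math.ceil(threshold * n):])
            # log P = log(favourable) - n log 6; Python integers do not overflow.
            rates[n] = -(math.log(favourable) - n * math.log(6)) / n
    return rates

def cramer_rate(threshold):
    """I(x) = sup_theta (theta * x - log M(theta)) for one fair die, 3.5 < x < 5.9."""
    def tilted_mean(theta):
        weights = [math.exp(theta * k) for k in range(1, 7)]
        return sum(k * w for k, w in zip(range(1, 7), weights)) / sum(weights)

    lo, hi = 0.0, 5.0  # the maximiser solves tilted_mean(theta) = threshold
    for _ in range(100):
        mid = (lo + hi) / 2
        if tilted_mean(mid) < threshold:
            lo = mid
        else:
            hi = mid
    theta = (lo + hi) / 2
    log_mgf = math.log(sum(math.exp(theta * k) for k in range(1, 7)) / 6)
    return theta * threshold - log_mgf

rates = tail_rates(4, (10, 50, 200))
limit = cramer_rate(4)
for n, rate in rates.items():
    print(f"n = {n:>3}: -(1/n) log p_n = {rate:.4f}")
print(f"Cramer rate I(4) = {limit:.4f}")
```

The computed rates decrease towards $I(4)$ as $n$ grows, which means that $p_n$ decays exponentially fast, roughly like $e^{-n I(4)}$.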
The aim of this project is to understand and summarise the basics of large deviation theory. This is a theoretical project without a programming component, so a solid understanding of basic probability will be helpful.
Selected references:
The following list contains projects which I have supervised in the past or which I am currently supervising.
| student | title | type | year |
|---|---|---|---|
| Chris Thompson | Numerical Simulation of SDEs | MSc thesis | 2010/11 |
| Yu-Wei Twu | Statistical Analysis of Web-Server Log Files | MSc thesis | 2010/11 |
| James Gardner | Modelling the Results of Sports Events | 4th year | 2010/11 |
| Mark Webster | Discrete Time Martingales and the Kalman Filter | 4th year | 2010/11 |
| Andreas Tsiatinis | Tools For the Statistical Analysis of Protein Loop Geometry | MSc thesis | 2009/10 |
| Chrystalla Loizou | Numerical Simulation of Stochastic Differential Equations | MSc thesis | 2009/10 |
| Julia Bowman | Random Number Generation | 3rd year | 2009/10 |
| Amy Palmer | Mathematics of Juggling | 3rd year | 2009/10 |