MATH3714 — Linear Regression and Robustness

In this module we will study how linear regression can be used to describe and analyse the relationship between explanatory variables $x_1, \ldots, x_p$ (inputs) and a response variable $y$ (output). The models we will consider are of the form

$y = \beta_0 + x_1 \beta_1 + \cdots + x_p \beta_p + \varepsilon$,
where the coefficients $\beta_i$ describe how strongly the response depends on the feature $x_i$, and the residual $\varepsilon$ represents the noise, i.e. the component of the data not explicitly described by the model. We will consider the following questions:

Time Table

There will be 27 lectures (L1 to L27) and 6 example classes (E1 to E6). The schedule is given in the following table.

Week   Mon 2-3pm    Tue 11-12noon   Wed 9-10am
w1     25.09. L1    26.09. L2       27.09. E1
w2     02.10. L3    03.10. L4       04.10. L5
w3     09.10. L6    10.10. L7       11.10. E2
w4     16.10. L8    17.10. L9       18.10. L10
w5     23.10. L11   24.10. L12      25.10. E3
w6     30.10. L13   31.10. L14      01.11. L15
w7     06.11. L16   07.11. L17      08.11. E4
w8     13.11. L18   14.11. L19      15.11. L20
w9     20.11. L21   21.11. L22      22.11. E5
w10    27.11. L23   28.11. L24      29.11. L25
w11    04.12. L26   05.12. L27      06.12. E6

Handouts

The following links lead to PDF copies of the handouts from the lectures.

Paper copies of the handouts are usually available from the blue drawers in front of the taught students office on level 8 of the maths building.

Software

For the module we will use the statistical computing package R. This program is free software, and you can find the program and documentation at the R project homepage.

My recommendation would be to install the RStudio environment, which includes R, on your own computer and use this for the project. (Choose the open source version, "RStudio Desktop", on the download page.) Alternatively you can use RStudio or plain R on the university computers.
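
Once R is installed, one quick way to check that everything works is to simulate data from a model of the form given at the top of this page and fit it with `lm()`. The following is only a minimal sketch: the choice of two explanatory variables, the coefficient values, the noise level and all variable names are made up for illustration and are not taken from the lectures.

```r
# Simulate n observations from y = beta_0 + x_1 beta_1 + x_2 beta_2 + eps.
# All numbers here are illustrative assumptions, not module data.
set.seed(1)                           # make the simulation reproducible
n <- 100
x1 <- runif(n)                        # first explanatory variable
x2 <- runif(n)                        # second explanatory variable
beta <- c(1, 2, -0.5)                 # assumed (beta_0, beta_1, beta_2)
eps <- rnorm(n, mean = 0, sd = 0.1)   # noise term
y <- beta[1] + beta[2] * x1 + beta[3] * x2 + eps

# Fit the linear model by least squares and look at the estimates.
m <- lm(y ~ x1 + x2)
print(coef(m))                        # estimates of beta_0, beta_1, beta_2
```

If the installation works, the printed estimates should lie close to the assumed values in `beta`, with small differences caused by the simulated noise.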

Below you can find the RStudio notebook files from the tutorials. I recommend downloading the "RStudio Notebook" version to your own computer and experimenting with it in RStudio yourself (right-click on the link and choose "Save link as …"); there is also a non-interactive "HTML version" which you can look at.

Useful resources for learning R include the following:

Data

The following data sets were used in the module.

  1. A toy data set for use in homework 9: ex02-q09.csv
  2. The stackloss data set built into R.
  3. A toy data set for use in homework 20: ex05-q20.csv
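
As a rough illustration of how these data sets can be used in R, the sketch below fits a linear model to the built-in stackloss data and shows one way to read a downloaded CSV file. Using all three remaining columns as explanatory variables is just one possible choice. The CSV part assumes that the file has been saved in the current working directory and has a header row; this is an assumption, since the file contents are not described here.

```r
# stackloss is built into R, so no download is needed.
print(head(stackloss))      # first few rows of the data frame

# Regress stack.loss on all other columns of the data frame.
m <- lm(stack.loss ~ ., data = stackloss)
print(summary(m))           # coefficient estimates, standard errors, etc.

# Reading one of the toy data sets. The guard skips this step when
# the file has not been downloaded to the working directory.
if (file.exists("ex02-q09.csv")) {
  d <- read.csv("ex02-q09.csv")   # assumes a header row
  str(d)                          # inspect column names and types
}
```

The same `read.csv()` call should work for ex05-q20.csv under the same assumptions.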

Links