In this module we will study how linear regression can be used to describe and analyse the relationship between explanatory variables $x_1, \ldots, x_p$ (input) and a response variable $y$ (output). The models we will consider are of the form

$y = \beta_0 + x_1 \beta_1 + \cdots + x_p \beta_p + \varepsilon$,

where $\varepsilon$ is a random noise term. We will consider questions such as the following:

- How to estimate the coefficients $\beta_0, \ldots, \beta_p$ from data?
- How much of the variance in $y$ is described by the $x_i$? How much by the noise $\varepsilon$?
- Is a linear model appropriate for the data?
- What happens if there are outliers in the data?
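
The first two questions can be illustrated with a minimal sketch in R, the statistical computing package used for this module. The data here are simulated: the choice of $p = 2$, the "true" coefficients, the noise level and all variable names are assumptions made purely for illustration, not values taken from the lectures.

```r
# Simulated data: the coefficients (1, 2, -0.5) and the noise standard
# deviation 0.5 are arbitrary illustrative choices, not from the module.
set.seed(1)
n <- 100
x1 <- runif(n)
x2 <- runif(n)
eps <- rnorm(n, mean = 0, sd = 0.5)
y <- 1 + 2 * x1 - 0.5 * x2 + eps

# Least squares estimates of beta_0, beta_1 and beta_2
fit <- lm(y ~ x1 + x2)
beta_hat <- coef(fit)
print(beta_hat)

# Proportion of the sample variance of y explained by x1 and x2;
# the remainder is attributed to the noise term
r_squared <- summary(fit)$r.squared
print(r_squared)
```

Because the data are simulated, the estimates can be compared with the coefficients used to generate them. With real data the true coefficients are unknown.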

There will be 27 lectures (L1 to L27) and 6 example classes (E1 to E6). The schedule is given in the following table.

| | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mon 2-3pm | 25.09. L1 | 02.10. L3 | 09.10. L6 | 16.10. L8 | 23.10. L11 | 30.10. L13 | 06.11. L16 | 13.11. L18 | 20.11. L21 | 27.11. L23 | 04.12. L26 |
| Tue 11-12noon | 26.09. L2 | 03.10. L4 | 10.10. L7 | 17.10. L9 | 24.10. L12 | 31.10. L14 | 07.11. L17 | 14.11. L19 | 21.11. L22 | 28.11. L24 | 05.12. L27 |
| Wed 9-10am | 27.09. E1 | 04.10. L5 | 11.10. E2 | 18.10. L10 | 25.10. E3 | 01.11. L15 | 08.11. E4 | 15.11. L20 | 22.11. E5 | 29.11. L25 | 06.12. E6 |

The following links contain pdf copies of the handouts from the lectures.

Paper copies of the handouts are usually available from the blue drawers in front of the taught students' office on level 8 of the maths building.

For the module we will use the statistical computing package R. This program is free software, and you can find the program and documentation at the R project homepage.

My recommendation would be to install R together with the RStudio environment on your own computer and to use this for the project. (RStudio does not bundle R, so R itself needs to be installed as well. Choose the open source version, "RStudio Desktop", on the download page.) Alternatively you can use RStudio or plain R on the university computers.

Below you can find the RStudio notebook files from the tutorials. I recommend downloading the "RStudio Notebook" to your own computer (right click on the link and choose "Save link as …") and experimenting with it in RStudio yourself; there is also a non-interactive "HTML version" which you can look at.

- Tutorial 1: RStudio Notebook, HTML version
- Tutorial 2: RStudio Notebook, HTML version
- Tutorial 3: RStudio Notebook, HTML version
- Tutorial 4: RStudio Notebook, HTML version
- Tutorial 5: RStudio Notebook, HTML version

Useful resources for learning R include the following:

- Some introductory notes I wrote for the 2015/16 version of the MATH1712 module.
- The R manual.
- The R online help, accessed by typing help() or help.start() in R.
- The departmental web page has a list with some R tutorials.

The following data sets were used in the module.

- A toy data set for use in homework 9: ex02-q09.csv
- The stackloss data set built into R.
- A toy data set for use in homework 20: ex05-q20.csv
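
As a rough sketch of how these data sets can be loaded and used in R: the stackloss data ship with R, so no download is needed. Treating `stack.loss` as the response and the other three columns as explanatory variables is an illustrative choice, not necessarily the analysis done in the lectures. The CSV part assumes the file has been saved to the current working directory and has a header row. Its column names are not assumed here.

```r
# The stackloss data set is part of R's built-in "datasets" package.
data(stackloss)
str(stackloss)

# Illustrative model: stack.loss as response, all remaining columns
# (Air.Flow, Water.Temp, Acid.Conc.) as explanatory variables.
stack_fit <- lm(stack.loss ~ ., data = stackloss)
stack_coef <- coef(stack_fit)
print(stack_coef)

# Loading one of the toy data sets, assuming it was downloaded into the
# working directory and that its first line contains column names.
csv_file <- "ex02-q09.csv"
if (file.exists(csv_file)) {
  toy <- read.csv(csv_file)
  str(toy)
}
```

The call `summary(stack_fit)` prints standard errors and the coefficient of determination for the fitted model.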

- MATH3714 module catalog entry
- The university timetable/room plan