The least squares regression line


The least squares regression line is the line that minimises the sum of the squares of the residuals. A residual is the vertical distance from a point on a scatter diagram to the line. In this sense, the least squares regression line is the best possible line of best fit.
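To make "sum of the squares of the residuals" concrete, here is a minimal Python sketch. The four data points, the helper name `sum_squared_residuals` and the three candidate lines are assumptions made up for illustration; they are not part of the example further down.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances from each point to the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical illustrative points (not the data used in the worked example).
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

ssr_a = sum_squared_residuals(xs, ys, a=0.0, b=2.0)   # candidate line y = 2x
ssr_b = sum_squared_residuals(xs, ys, a=1.0, b=1.5)   # candidate line y = 1 + 1.5x
ssr_ls = sum_squared_residuals(xs, ys, a=0.0, b=1.9)  # least squares line for these points

print(ssr_a, ssr_b, ssr_ls)
```

The line y = 1.9x gives the smallest total of the three, which is what makes it the least squares line for these points: no other straight line gives a smaller sum.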

The equation of the least squares regression line of y on x is:

$$y - \bar{y} = b(x - \bar{x})$$

Where b is:

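$$b = \frac{S_{xy}}{S_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$$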

As you can see, there are three equivalent ways to calculate the value of b. Which one you use depends on the information you are given in the question.
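As a rough check that the different routes to b agree, here is a small Python sketch. It assumes the three forms are: S_xy divided by S_xx, the sums of deviations about the means, and the raw-sums version using the totals of x, y, xy and x². The function names and the four data points are hypothetical, chosen only for illustration.

```python
def slope_from_summary(s_xy, s_xx):
    """b = S_xy / S_xx, for when both summary statistics are given."""
    return s_xy / s_xx


def slope_from_deviations(xs, ys):
    """b from the deviations of each point about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


def slope_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """b from the raw totals, as used when you draw up a table of x^2 and xy."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    return (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)


# Hypothetical illustrative points.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

b1 = slope_from_summary(9.5, 5.0)          # S_xy = 9.5 and S_xx = 5 for these points
b2 = slope_from_deviations(xs, ys)
b3 = slope_from_sums(4, 10, 19, 57, 30)    # n, sum x, sum y, sum xy, sum x^2

print(b1, b2, b3)
```

All three give the same slope (up to floating-point rounding), so the choice really is just about which quantities the question hands you.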

Example

Calculate the least squares regression line of y on x from the following data:

x 20 30 40 50 60 70
y 2.49 2.41 2.38 2.14 1.97 2.03

Firstly, draw up a table of the values you need and fill it out. In this example we'll use the third equation for b, so we need all the values of x² and xy:

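| x | y | x² | xy |
|---|---|---|---|
| 20 | 2.49 | 400 | 49.8 |
| 30 | 2.41 | 900 | 72.3 |
| 40 | 2.38 | 1600 | 95.2 |
| 50 | 2.14 | 2500 | 107 |
| 60 | 1.97 | 3600 | 118.2 |
| 70 | 2.03 | 4900 | 142.1 |
| Sum: 270 | 13.42 | 13900 | 584.6 |

The next step is to calculate the means of the x and y values (there are n = 6 data points):

$$\bar{x} = \frac{\sum x_i}{n} = \frac{270}{6} = 45$$

$$\bar{y} = \frac{\sum y_i}{n} = \frac{13.42}{6} \approx 2.2367$$

Now the value of b can be calculated:

$$b = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{584.6 - 603.9}{13900 - 12150} = \frac{-19.3}{1750} \approx -0.0110$$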
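The table, the means and b can be checked with a few lines of Python. This is only a sketch of the arithmetic above, using the raw-sums form of b; the variable names are my own.

```python
# Data from the example.
xs = [20, 30, 40, 50, 60, 70]
ys = [2.49, 2.41, 2.38, 2.14, 1.97, 2.03]
n = len(xs)

# Column totals from the table.
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x ** 2 for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Means of x and y.
mean_x = sum_x / n
mean_y = sum_y / n

# Slope from the raw sums: (sum xy - n * mean_x * mean_y) / (sum x^2 - n * mean_x^2).
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)

print(sum_x, round(sum_y, 2), sum_x2, round(sum_xy, 1))
print(mean_x, round(mean_y, 4), round(b, 4))
```

Note that b is negative: in this data set y tends to fall as x rises.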

Hence the equation is:

$$y - 2.2367 = -0.0110(x - 45)$$

Cleaning it up:

$$y = 2.73 - 0.0110x$$
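The final rearrangement can be sketched in Python as well. This assumes the cleaned-up line is written as y = a + bx, so that the intercept is a = ȳ - b·x̄; the name `a` and this way of writing the line are assumptions for illustration.

```python
xs = [20, 30, 40, 50, 60, 70]
ys = [2.49, 2.41, 2.38, 2.14, 1.97, 2.03]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope from the raw sums, as in the worked example.
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x ** 2 for x in xs)
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)

# Rearranging y - mean_y = b * (x - mean_x) into y = a + b * x gives this intercept.
a = mean_y - b * mean_x

# The regression line always passes through the point (mean_x, mean_y).
value_at_mean_x = a + b * mean_x

print(round(a, 2), round(b, 4))
```

Rounded to three significant figures, the intercept and slope match the coefficients in the cleaned-up equation.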