What is “Least Squares” Regression?

Web resource used: http://www.keypress.com/sketchpad/java_gsp/squares.html

Objective: To gain a conceptual understanding of the principle of least squares regression; to understand the meaning of a “line of best fit.”

Created by: Felice Shore (Baltimore City Community College) for Project Synergy

This site illustrates the principle of least squares regression. You see six points, labeled P(1), P(2), …, P(6), with a line drawn through them (not necessarily the line of best fit!). But what are those squares, and what makes a line a line of “best fit”?

On the graph, you see a line that is “going through the neighborhood” of six points. Ideally, we’d have a line that goes through all six points, but that is impossible if the points are not collinear. So each point has an error relative to the line: the vertical distance from that point to the line. This is called the residual, or error. That error is depicted visually as the vertical side of the square attached to each point. If we square that error, we get a squared number, depicted visually as the area of that square. The SSE is the sum of squared errors for all points in a data set. That is, imagine finding the squared residual for every data point in a set, and then adding up those squared residuals to get the SSE. In the graph, the area of the large red square equals the total area of all six small squares; it is a visual depiction of the SSE.
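The residual and SSE calculation described above can be sketched in a few lines of Python. The six coordinates below are hypothetical stand-ins for P(1) through P(6), and the function names `residuals` and `sse` are chosen here for illustration; neither comes from the applet.

```python
# Hypothetical data standing in for the six draggable points P(1)..P(6).
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 7)]


def residuals(points, slope, intercept):
    """Vertical error of each point: its y minus the line's y at the same x."""
    return [y - (slope * x + intercept) for x, y in points]


def sse(points, slope, intercept):
    """Sum of squared errors: the total area of the small squares."""
    return sum(r ** 2 for r in residuals(points, slope, intercept))


# A candidate line y = 1x + 1 (not necessarily the line of best fit).
print(residuals(points, 1, 1))  # [0, 0, 1, -1, 0, 0]
print(sse(points, 1, 1))        # 2
```

Each residual is the side of one small square (its sign only says whether the point lies above or below the line), and squaring it gives that square's area.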

A line of “best fit” is constructed in such a way as to minimize the SSE. That is, the line that fits the data best is the line associated with the smallest red square you can possibly make. Notice that the total area of the red square is displayed for you.
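For comparison with a line placed by hand, the SSE-minimizing line can also be computed directly. The sketch below uses the standard closed-form formulas for simple linear regression (slope equals the sum of products of deviations divided by the sum of squared x deviations); the data points and the name `best_fit_line` are assumptions made for illustration, not part of the applet.

```python
# Hypothetical data standing in for the six points P(1)..P(6).
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 7)]


def sse(points, slope, intercept):
    """Sum of squared vertical errors between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


def best_fit_line(points):
    """Slope and intercept that minimize the SSE.

    Assumes the points do not all share the same x value.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the mean point
    return slope, intercept


slope, intercept = best_fit_line(points)
print(round(slope, 4), round(intercept, 4))     # 0.9429 1.2
print(round(sse(points, slope, intercept), 4))  # 1.9429
```

For this made-up data set, no other line can produce a red square smaller than about 1.94 square units; the hand-picked line y = x + 1 comes close with an SSE of 2.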

First, be aware of the six data points in this graph. They are labeled P(1) through P(6). Two control points mark the line itself: a “y-intercept” point and a “slope” point. You can change the location and orientation of the line by clicking and dragging either of those two control points. Those are the only two points you should move.

If you click and drag the “y-intercept” point up and down the y-axis, you will move your line up or down without changing the slope. If you click and drag the “slope” point on the line, you will rotate the line, thereby changing only its slope and not the y-intercept. Remember, the large red square in the bottom right shows you the total area of all six smaller squares.

  1. First, click and drag the y-intercept point either up or down beyond all six data points. Notice how large the red square gets. Clearly, your new line is not “fitting the data best” in this location.
  2. Move the line back among the points, and now use the slope control point along with the y-intercept point to try to get the smallest sum of squares you can. When you think you have the best line, write down the minimum SSE you were able to get: __________. See if anyone else in the class got a smaller SSE. If they did, then their line is a better line of fit.
  3. Change the scale on the grid to show a larger area of the plane by moving the red point that marks the “x=1” scale to the left a bit. Now, you and a partner should each change the data set by moving the six points around. Make sure that both of you have roughly the same new data set. Then move the slope and y-intercept points to try to find the line of best fit. Who can find the minimum SSE?
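The first two steps of the activity can be imitated numerically. The sketch below first slides a line upward past all of the points and prints the growing SSE, then tries many slope and y-intercept combinations on a coarse grid, much as a student might by dragging the two control points. The data, the 0.05 step size, and the search ranges are all assumptions made for illustration.

```python
# Hypothetical data standing in for the six points P(1)..P(6).
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 7)]


def sse(points, slope, intercept):
    """Sum of squared vertical errors between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Step 1: keep the slope at 1 and drag the y-intercept upward.
for intercept in (1, 5, 10):
    print(intercept, sse(points, 1, intercept))
# 1 2
# 5 98
# 10 488

# Step 2: "drag" both control points over a grid of positions and keep
# the combination with the smallest SSE.
step = 0.05
candidates = [
    (sse(points, i * step, j * step), i * step, j * step)
    for i in range(41)  # slopes 0.00 through 2.00
    for j in range(61)  # y-intercepts 0.00 through 3.00
]
best_sse, best_slope, best_intercept = min(candidates)
print(round(best_sse, 4), round(best_slope, 2), round(best_intercept, 2))
```

A grid search like this only gets as close to the true minimum as its step size allows, which mirrors the classroom contest: a finer adjustment of the two control points can always shave a little more off the SSE until the line of best fit is reached.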