The Regression Line
Interactive computer-based tools provide students with the opportunity to easily investigate the relationship between a set of data points and a curve used to fit the data points. As students work with bivariate data in grades 9-12, they will be able to investigate relationships between the variables using linear, exponential, power, logarithmic, and other functions for curve fitting. Using interactive tools like the one below, students can investigate the properties of regression lines and correlation.
In analyzing the relationship between two variables in an experiment, one may try to fit a straight line or another simple curve to a plot of the data points. For example, the weight of a person often depends on their height. Both weight and height are variables. We would like to find a formula for weight as a function of height, one that we can use to predict a person's weight given only their height. To find such a formula, we take a sample of, say, 40 people and measure both the height and weight of each. For each person, we end up with a pair of numbers (x, y), where x is the height and y is the weight. We plot the 40 height-weight pairs as points in the xy-plane to make what is called a scatterplot. Height, the "input" (independent variable), goes on the horizontal axis, and weight, the "output" (dependent variable), goes on the vertical axis.
We then try to fit a curve to these points that somehow represents the overall shape of the scatterplot and find the equation of that curve. The equation is then used to represent the relationship between height and weight in general and therefore to predict any person's weight if we know only their height.
There are many different kinds of curves one could fit to data. The graphs of linear, exponential, logarithmic, and power functions are all useful curves. In this i-Math, you will investigate the simplest one, the straight line, which is the graph of a linear function.
Plot points using the regression tool below. The tool will automatically find a straight line for you that "fits" the points. The line is called the "least-squares regression line" of y on x. The tool will also calculate the equation of the line for you and its Pearson correlation coefficient r, which you will study in part 3. The equation and the correlation coefficient are displayed in the top left corner of the tool; n is the number of points.
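For readers who want to see the kind of calculation such a tool performs, here is a minimal Python sketch. It uses the standard textbook formulas for the least-squares line of y on x and for Pearson's r. The function name `regression_line` and the sample points are made up for illustration, and this is not necessarily how the applet itself is implemented.

```python
from math import sqrt

def regression_line(points):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squared deviations and of cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x-values are equal; no line of y on x exists")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r is undefined when every y-value is the same
    r = sxy / sqrt(sxx * syy) if syy > 0 else float("nan")
    return slope, intercept, r

# Made-up illustrative points, not real measurements
points = [(1, 2), (2, 3), (3, 5), (4, 4)]
slope, intercept, r = regression_line(points)
print(f"y = {slope:.2f}x + {intercept:.2f}, r = {r:.2f}, n = {len(points)}")
# prints: y = 0.80x + 1.50, r = 0.80, n = 4
```

The sketch refuses to fit a line to fewer than two points or to points that all share one x-value, since the slope formula would divide by zero in those cases.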
Linear Regression I Applet
Getting to Know the Regression Line
1. Plot one point and then click SHOW LINE. Why do you think a line is not graphed?
2. CLEAR the graph and plot two points that have whole number coordinates.
• On your own paper, find an equation for the line through these two points.
• Click SHOW LINE. Compare the equation for the line drawn to the equation you calculated. Explain and resolve any differences.
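If you would like to check your paper-and-pencil work for step 2, the following Python sketch computes the line through two points in two ways: directly from the slope formula, and with the least-squares formulas. The function names and the sample points are hypothetical. Exact fractions are used so that rounding does not hide or create differences; the tool may display rounded decimals, which is one possible source of small discrepancies.

```python
from fractions import Fraction

def line_through(p, q):
    """Slope and intercept of the line through two points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    slope = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

def least_squares(points):
    """Slope and intercept of the least-squares line of y on x."""
    n = len(points)
    mean_x = Fraction(sum(x for x, _ in points), n)
    mean_y = Fraction(sum(y for _, y in points), n)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x-values are equal; slope is undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Two made-up points with whole number coordinates
p, q = (1, 2), (4, 8)
print(line_through(p, q))      # slope and intercept from the two-point formula
print(least_squares([p, q]))   # slope and intercept from least squares
```

For these two points both calculations give slope 2 and intercept 0, that is, y = 2x.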
3. CLEAR the graph and plot 3 points. Think about a line that "fits" these three points as closely as possible.
• Is it possible for a single straight line to contain all three of the points you plotted?
• On your own paper, sketch a line that you think best fits the three points.
• Click SHOW LINE. Do you think that the line graphed fits the points well? How does it compare to the line you drew?
4. CLEAR the graph and plot several points. Think about a line that best fits these points.
• Click SHOW LINE to see the "least-squares regression line" that fits these points.
• What do you think will happen to the regression line if you plot a new point? Try it and find out.
(NOTE: When you plot a new point without clearing the graph, the new regression line is drawn automatically.)
• Plot some more points and see what happens. Describe any patterns or trends that you see.
5. The line that the computer draws is called the least-squares regression line. It "fits" the data points according to criteria that you will learn about later. Roughly, the least-squares regression line is the line that minimizes the sum of the squared "errors", where each error is the vertical distance between an actual point and the point on the line with the same x-value. In this sense it is the line that fits the points most closely. Just to get a better feel for the regression line, try the following tasks.
a) Plot 4 points so that the regression line is horizontal. Do this in several different ways.
b) Plot 3 points (not all on a line) so that the regression line is horizontal.
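The "squared errors" idea described in step 5 can be sketched in Python as follows. The sketch computes the sum of squared vertical errors for the least-squares line and for a few slightly nudged lines, to illustrate that the nudged lines do worse. The function names, the sample points, and the nudge sizes are all made up for illustration.

```python
def sum_squared_errors(points, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)

def least_squares(points):
    """Slope and intercept of the least-squares line of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x-values are equal; slope is undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up illustrative points
points = [(1, 2), (2, 3), (3, 5), (4, 4)]
best_slope, best_intercept = least_squares(points)
best_sse = sum_squared_errors(points, best_slope, best_intercept)
print(f"regression line: SSE = {best_sse:.3f}")

# Nudge the slope or the intercept a little and recompute the error
nudged_sse = []
for d_slope, d_intercept in [(0.1, 0), (-0.1, 0), (0, 0.2), (0, -0.2)]:
    sse = sum_squared_errors(points, best_slope + d_slope,
                             best_intercept + d_intercept)
    nudged_sse.append(sse)
    print(f"slope {d_slope:+.1f}, intercept {d_intercept:+.1f}: SSE = {sse:.3f}")
```

You can also pass your own points from tasks a) and b) to `least_squares` and check whether the slope it returns is 0.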