Least Squares Fit to a Straight Line

Although it is relatively easy to draw a straight line with ruler and pencil through a series of points if they all fall on or near the line, it becomes more and more a matter of judgment if the data are scattered. The least-squares line of best fit minimizes the sum of the squares of the y deviations of individual points from the line. This statistical technique is called regression analysis. Regression analysis in the simplest form assumes that all deviations from the line are the result of error in the measurement of the dependent variable y.

Regression analysis uses the quantities defined below, where there are N measurements of (x_i, y_i) data pairs.

S_xy = Σx_i·y_i − (Σx_i)(Σy_i)/N (13-5)

For a straight line y = mx + b, the least-squares slope and intercept are given by equations (13-6) and (13-7).

The correlation coefficient, R, is a measure of the correlation between x and y. If x and y are perfectly correlated (i.e., a perfect straight line), then R = 1. An R value of zero means that there is no correlation between x and y, and an R value of −1 means that there is a perfect negative correlation.

More commonly, R2, the square of the correlation coefficient, given by equation (13-8), is used as the measure of correlation; it ranges from 0 (no correlation) to 1 (perfect correlation).

R2 can be used as a measure of the goodness of fit of data to (in this case) a straight line. A value of R2 of less than 0.9 corresponds to a rather poor fit of data to a straight line.

Excel provides worksheet functions (SLOPE, INTERCEPT, and RSQ) to calculate the least-squares slope, intercept, and R^2 of the straight line y = mx + b.
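The same quantities can be computed outside a spreadsheet directly from the sums described above. The following is a minimal Python sketch (the function name `linear_fit` and the sample data are illustrative, not from the text) that evaluates S_xx, S_yy, and S_xy of equation (13-5) and from them the least-squares slope, intercept, and R^2:

```python
# Least-squares fit of a straight line y = mx + b, using the sum quantities
# defined in the text. Equivalent in purpose to Excel's SLOPE, INTERCEPT
# and RSQ worksheet functions.

def linear_fit(x, y):
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Sum-of-squares quantities; S_xy is equation (13-5)
    s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    s_yy = sum(yi * yi for yi in y) - sum_y ** 2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    m = s_xy / s_xx                   # least-squares slope
    b = (sum_y - m * sum_x) / n       # intercept (line passes through the mean point)
    r2 = s_xy ** 2 / (s_xx * s_yy)    # square of the correlation coefficient
    return m, b, r2

# Perfectly correlated data (y = 2x + 1) should give m = 2, b = 1, R^2 = 1
m, b, r2 = linear_fit([1, 2, 3], [3, 5, 7])
```

For perfectly linear data like the sample above, R^2 comes out exactly 1; scattered data pulls R^2 toward 0, matching the interpretation given for equation (13-8).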
