May 3, 2012

## Linear Regression with Excel 2010

Linear regression is a way to determine how close two number series of data: x (independent) and y (potentially dependent), fit a linear function of the form: y = a*x + b.

This is the first of a series of planned posts that will cover how to set up linear regression a variety of different languages. This week, we will discuss the easiest method of performing Linear regression analysis; and that is with Excel 2010.

First, we will need to enable the Analysis ToolPak for Excel:

1)      Click File and select options

3)      Select Excel Add-ins in the drop down list named Manage at the bottom of the pop up

4)      Click the ‘Go’ button

5)      Tick the checkbox for Analysis ToolPak if it is empty

6)      Click the ‘Ok’ button

Once the Add-in is installed, create a table of data similar to the following:

 x y 7 216 12 302 14 431 6 151 22 677 38 958 14 431 9 227 10 308 11 277

To perform the Regression analysis, select a cell and then click the ‘Data’ ribbon tab. Then double-click the Data Analysis section of the ribbon. This will present a popup similar to the following:

Select the Regression option and click the ‘Ok’ button. This will prompt another popup to enter the cell ranges for the data which will be analyzed: For the Input Y Range, select all cells in the Y column; including the header. Do the same for the Input X Range using the X column. Be sure to select the Labels checkbox and then click the ‘Ok’ button. Doing so will create the regression analysis within your spreadsheet:

Next week, we will talk a bit about what all of this means, but for right now, we will focus on the R Square value, and the Coefficients of the rows labeled ‘Intercept’ and ‘x’. The best way to demonstrate the significance of these values is to present them in a scatter chart with the linear trend line being rendered. This can be done in the following steps:

1)      Click an empty cell

2)      Select the ‘Insert’ ribbon tab

3)      Select the Scatter Plot and choose ‘Scatter with only Markers’

4)      Right Click in the center of the Chart and choose ‘Select Data’

6)      Click the ‘Ok’ button

Your chart should be displayed similar to the following:

The following steps can be used to display the linear trend line in the chart:

1)      Left Click any of the plot markers

2)      Right Click and select ‘Add Trendline’

3)      Select Linear in the Trendline Options pop up

4)      Enable the checkbox for ‘Display Equation on chart’

5)      Enable the checkbox for ‘Display R-Squared value on chart’

6)      Click the close button And there you have it; Linear Regression done simply in Excel 2010. You should notice that the formula given uses the values obtained in the Summary Output of the Data Analysis step. Next week, we will discuss how to perform Regression analysis programmatically in C#.