# Define sum of squares

Regression analysis helps us determine how well a data series can be fitted to a function, which may help explain how the data series was generated. The sum of squares is used to find the function that best fits the data. It is a statistical measure used in regression analysis to quantify the dispersion of data points.

The meaning of the sum of squares is easiest to understand through what it tells us: it measures deviation from the mean. In statistics, the mean is the average of a set of numbers and is the most commonly used measure of central tendency. The arithmetic mean is calculated by adding the values in the data set and dividing by the number of values. However, knowing the mean alone may not be of much help. The objective is to know how much variation there is in a set of data. How far the individual values lie from the mean may give us some insight into how well the observations fit the regression model that is created.
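To see why the mean alone can be uninformative, here is a minimal Python sketch. The two data sets are made-up values chosen purely for illustration (they do not come from any real series), and the helper name `deviations` is an assumption of this example. Both sets share the same mean, yet their values sit at very different distances from it.

```python
from statistics import mean

# Two hypothetical data sets, invented for illustration only.
tight = [9, 10, 11]
wide = [0, 10, 20]

def deviations(values):
    """Return the difference between each value and the arithmetic mean."""
    m = mean(values)
    return [x - m for x in values]

print(mean(tight), mean(wide))  # both means equal 10
print(deviations(tight))        # small deviations: -1, 0, 1
print(deviations(wide))         # large deviations: -10, 0, 10
```

The deviations, not the mean, are what distinguish the two sets, and they are the quantities the sum of squares is built from.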

## Sum of squares formula

The sum of squares is also commonly known as variation. To find the sum of squares, we can use the following formula.

For a set X of n items:

$$SS = \sum_{i=1}^{n} (X_i - \bar{X})^2$$

where:

$X_i$ is the ith item in the set

$\bar{X}$ is the mean of all items in the set

Hence, $(X_i - \bar{X})$ is the deviation of each item from the mean

Now let’s look at how to calculate sum of squares in detail:

Identify the sample size: ‘n’ represents the sample size, that is, the number of measurements.

Calculate the mean: calculate the arithmetic mean by adding all the measurements and dividing by the sample size, ‘n’.

Subtract each measurement from the mean: this will give you a series of ‘n’ individual deviations from the mean – some of which might be negative due to the number being larger than the mean.

Square each difference: the square of a nonzero number is always positive, so the negative deviations in the series become positive, giving you a series of ‘n’ non-negative numbers.

Add the squares together: this gives you the sum of squares. Dividing the sum of squares by (n-1) then gives the sample variance.
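The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `sum_of_squares` and the sample measurements are assumptions made for the example. The function returns the undivided total of squared deviations, and the final check shows that dividing this total by (n-1) matches the sample variance computed by Python's standard `statistics` module.

```python
import math
from statistics import variance

def sum_of_squares(measurements):
    """Total of squared deviations from the arithmetic mean."""
    n = len(measurements)                          # step 1: sample size
    mean = sum(measurements) / n                   # step 2: arithmetic mean
    deviations = [mean - x for x in measurements]  # step 3: deviations from the mean
    squares = [d ** 2 for d in deviations]         # step 4: square each deviation
    return sum(squares)                            # step 5: add the squares

# Hypothetical measurements, chosen only for illustration.
data = [2.0, 4.0, 4.0, 6.0]
ss = sum_of_squares(data)
print(ss)  # 8.0

# Dividing the sum of squares by (n - 1) gives the sample variance.
assert math.isclose(ss / (len(data) - 1), variance(data))
```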

Let’s look at an example to see how to use the sum of squares:

The closing prices for XYZ stock for 5 days were 25.1, 26, 28.2, 25.8 and 27.3

Count, or ‘n’ = 5

Sum of total price = 132.4

Mean = 132.4 / 5 = 26.48

Subtract each measurement from the mean.

$$SS = (26.48 - 25.1)^2 + (26.48 - 26)^2 + (26.48 - 28.2)^2 + (26.48 - 25.8)^2 + (26.48 - 27.3)^2$$

Square the series and add to get the sum of squares.

$$SS = (1.38)^2 + (0.48)^2 + (-1.72)^2 + (0.68)^2 + (-0.82)^2$$

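$$SS = 1.90 + 0.23 + 2.96 + 0.46 + 0.67$$

$$SS \approx 6.22$$

Each square is rounded to two decimal places before adding; the unrounded total is 6.228.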
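As a quick check of the arithmetic above, this short Python sketch recomputes the sum of squares for the five closing prices. The variable names are illustrative. It prints the unrounded squares and total, since the figures above are rounded to two decimal places before being added.

```python
prices = [25.1, 26, 28.2, 25.8, 27.3]  # XYZ closing prices from the example

n = len(prices)
mean = sum(prices) / n
squares = [(mean - p) ** 2 for p in prices]  # squared deviation of each price
ss = sum(squares)

print(round(mean, 2))                  # 26.48
print([round(s, 4) for s in squares])  # [1.9044, 0.2304, 2.9584, 0.4624, 0.6724]
print(round(ss, 3))                    # 6.228
```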

The example shows that the sum of squares for XYZ's closing prices over the last five days is 6.22. This gives investors a gauge of how much the price has varied, and therefore of the stock's stability and volatility, which they can factor into their investment decisions. Investors often look for stocks with stable prices and low volatility.
