IB Class Notes

Calculating Correlation

Aim: Here we will learn how to calculate coefficient of correlation (\(r\)) by hand with the help of EXCEL.

As a reminder, coefficient of correlation (\(r\)) tells us whether two sets of data, say rate of chemical reaction and temperature, are linearly related or not.

Click here to open the EXCEL spreadsheet to learn the calculation with guided steps. Please remember to enable Macro to activate the spreadsheet. Some exercises are available within the spreadsheet at "More About This Spreadsheet."

What should my value of r be?

  1. If your data points rise from left to the right of the graph (\(x\) and \(y\) increases together) then your \(r\) will be positive.

  2. Your \(r\) should be negative if the data points slope downward from left to the right of the graph. That is, as \(x\) increases then value of \(y\) decreases and vice versa.

  3. Perfect positive linear correlation is +1 and perfect negative linear correlation is -1.

  4. No linear correlation is when \(r = 0\) .

  5. Thus, the value of \(r\) must always lie between -1 and +1, i.e. -1 ≤ \( r < 1\)
    If your value of calculated \(r\) is not in this range then you probably have made a mistake and should re-check your workings.

  6. Coefficient of correlation (\(r\)) is calculated as
    \[ r = \large \frac{\sum{xy} -\frac{1}{n}\sum{x} \sum{y} }{ \sqrt{ \left( \sum{x^2} -\frac{1}{n}(\sum{x})^2 \right) \left(\sum{y^2}-\frac{1}{n}(\sum{y} )^2 \right) } } \] Product-moment correlation coefficient is another name for \(r\). Another way to express \(r\) is
    \( r = \large \frac{S_{xy}}{S_xS_y} \)
    \( S_{xy} = \sum{xy} - \frac{1}{n}\sum{x}\sum{y} \) ,
    \( S_x = \sqrt{ \left( \sum{x^2} -\frac{1}{n}(\sum{x})^2 \right) } \), and
    \( S_x = \sqrt{ \left( \sum{x^2} -\frac{1}{n}(\sum{y})^2 \right) }\)

Note that
\( S_{xy} = ns_{xy} \) or \( n \)Covariance of \(x\) and \(y\). Distinguish the use of capital \(S\) from small \(s\).
The covariance of \(x\) and \(y\), \(s_{xy} = \frac{\sum{xy} - \large \frac{\sum{x}\sum{y}}{n} }{n}\)
Similarly \( S_x = ns_x \) or \( n \)standard deviation of \(x\).
Standard deviation of \(x\), \( s_x = \sqrt{ \frac{\left[ \sum{x^2} - \large \frac{(\sum{x})^2}{n} \right]}{n} } \) and
\( S_y = ns_y \) or \( n \)standard deviation of \(y\).
Standard deviation of \(y\), \( s_y = \sqrt{ \frac{\left[ \sum{y^2} - \large \frac{(\sum{y})^2}{n} \right]}{n} } \)


Email KokMing Lee
Li Po Chun United World College of Hong Kong,
10 Lok Wo Sha Lane, Sai Sha Road,
Shatin, New Territories, Hong Kong SAR.