IB Class Notes

Normal Distribution

Aims: By the end of this note, you will be able to

  1. calculate \( P ( a < X < b ) \) as the area under the Normal curve using a GDC [extension: with a statistical table];
  2. obtain the inverse of the Normal distribution using a GDC [extension: with a statistical table]; and
  3. standardize a normal variable and use the normal distribution as a model to solve problems.

Properties of a Normal Distribution Curve

  1. So far we have considered discrete data and their distributions. In this section, we look at the distribution of continuous data. One of the most important continuous distributions in statistics is the Normal distribution. Many measured quantities in the natural sciences can be modeled using a normal distribution. Furthermore, when the number of trials (for the Binomial distribution) or the mean (for the Poisson distribution) is sufficiently large, the normal distribution can be used to approximate these distributions. The Normal distribution is often a good model of real-world data when the data cluster about a mean and vary from this mean as a result of random factors. The variation from the mean is measured by the standard deviation of the data. For example, the birth weights of babies, the heights of adults in a population and the IQ scores of adults in a population can be modeled fairly well by a normal distribution.
    A continuous random variable X that is normally distributed with mean μ and standard deviation σ has probability density function \[ f(x) = \frac{1}{\sigma \sqrt{2 \pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}} \qquad (1) \]
  2. X is called a normal variable, and X ∼ N(μ, σ²) is read as "X is normally distributed with mean μ and variance σ²."
    For example, X ∼ N(10, 2) is read as "X is normally distributed with mean 10 and variance 2," so its standard deviation is √2.
  3.  

  4. [Figure: normal distribution curve] All normal distribution curves have the following features:
    1. bell-shaped,
    2. symmetric about mean μ,
    3. the domain extends from negative infinity to positive infinity,
    4. the maximum value of \( f(x) \) is \( \frac{1}{ \sigma \sqrt{2 \pi}} \), attained at x = μ,
    5. the mean = mode = median,
    6. total area under the curve is 1,
    7. approximately 95% of the distribution lies within two standard deviations of the mean, and
      approximately 99.7% of the distribution lies within three standard deviations of the mean.
    8.  
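The features listed above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard library; the values μ = 10 and σ = 2 are arbitrary choices for illustration and are not part of these notes.

```python
from math import pi, sqrt
from statistics import NormalDist

mu, sigma = 10.0, 2.0          # illustrative values only
X = NormalDist(mu, sigma)      # NormalDist takes the standard deviation, not the variance

# Feature 4: the maximum of f(x) occurs at x = mu and equals 1 / (sigma * sqrt(2*pi)).
peak = X.pdf(mu)
expected_peak = 1 / (sigma * sqrt(2 * pi))

# Feature 2: the curve is symmetric about the mean.
left, right = X.pdf(mu - 1.5), X.pdf(mu + 1.5)

# Feature 7: probability of falling within k standard deviations of the mean.
def within(k):
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)

print(round(peak, 4), round(expected_peak, 4))
print(round(left, 4), round(right, 4))
print(round(within(2), 4))     # roughly 0.95
print(round(within(3), 4))     # roughly 0.997
```

Changing `mu` and `sigma` moves and stretches the curve, but the two proportions printed at the end stay the same for every normal distribution.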

  5. [Figure: Normal curve with shading from a to b] The area under the Normal curve between a and b is P(a ≤ X ≤ b).
    What is the value of P(X = a) ?
    Why is P(X ≤ a ) = P(X < a) ?

    The area under the curve between a and b can be calculated by integrating \( f(x) \) in equation (1) above, but this integral cannot be written in terms of elementary functions. Thus, we use a statistical table and/or a GDC for this task. Students are not required to memorize equation (1) for the IB examination.
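To see what the table or GDC is doing behind the scenes, the area can be approximated numerically. This is a rough sketch, assuming the trapezoidal rule as the integration method and illustrative values μ = 10, σ = 2, a = 9, b = 13; the helper names `f` and `area` are made up for this example.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def f(x, mu, sigma):
    """Probability density function of N(mu, sigma^2), as in equation (1)."""
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def area(a, b, mu, sigma, n=10_000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a, mu, sigma) + f(b, mu, sigma))
    for i in range(1, n):
        total += f(a + i * h, mu, sigma)
    return total * h

mu, sigma = 10.0, 2.0      # illustrative values only
a, b = 9.0, 13.0

approx = area(a, b, mu, sigma)                # numerical integration of equation (1)
dist = NormalDist(mu, sigma)
exact = dist.cdf(b) - dist.cdf(a)             # library value of P(a <= X <= b)
print(round(approx, 4), round(exact, 4))
```

The two printed values should agree to four decimal places, which is the accuracy typically quoted from a table.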
  6.  

  7. [Figure: normal curve with shading up to z at one end] To facilitate the study of all possible normal distributions, we can standardize any normal variable so that its mean μ becomes 0 and its variance σ² becomes 1.
    This new distribution is called the Standard Normal Distribution and the normal variable is denoted by Z,
    i.e. Z ∼ N(0,1),
    where \( \large \boldsymbol Z = \large \frac{X - \mu}{\sigma} \)
    Z is called "standardized X."
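Standardizing does not change the probability: P(X ≤ x) for X ∼ N(μ, σ²) equals P(Z ≤ z) with z = (x − μ)/σ. A small sketch of this, assuming illustrative values μ = 172, σ = 6 and x = 181 that are not taken from these notes:

```python
from statistics import NormalDist

mu, sigma = 172.0, 6.0     # illustrative values only
x = 181.0

z = (x - mu) / sigma                       # standardized value of x
p_direct = NormalDist(mu, sigma).cdf(x)    # P(X <= x) for X ~ N(mu, sigma^2)
p_standard = NormalDist().cdf(z)           # P(Z <= z) for Z ~ N(0, 1)

print(z)
print(round(p_direct, 4), round(p_standard, 4))
```

This is why a single table for Z is enough to handle every normal distribution.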
  8. The table of areas under the standard normal curve gives \( p \) = P(Z ≤ z) (see the shaded area in the above diagram), where \( p \) is the probability that the standard normal variable Z is less than or equal to the value z. Note that P(Z ≤ z) is also denoted by \( \Phi (z) \).

    Table 1: Normal Probabilities [Extension]

    Selected Normal Probability Table
    Example 1
    Use Normal table to find the following:
    (a) P(Z < 0.70)
    (b) P(Z ≤ 0.82)
    (c) P(Z ≥ 0.82)
    (d) P(Z < -1.13)
    (e) P(Z ≥ -0.96)
    (f) P(0.70 < Z < 0.82)
    (g) P(-1.13≤ Z≤ 0.70)
    (h) P(-1.19 < Z < -0.75)
    (i) P( |Z| < 1.15)
    (j) P( |Z| ≥ 0.75)

    Solution
    (a) We move down the first column entitled z to 0.7 and across to the second column (as indicated by the appropriate line). P(Z < 0.70) = 0.7580.
    Note that P(Z = a)=0 thus P(Z ≤ a) = P(Z < a)

    (b) [Figure: Normal curve with both tails shaded] P(Z ≤ 0.82) = 0.7939 as above.
    (c) P(Z ≥ 0.82) = 1 - P(Z ≤ 0.82) [Refer to the diagram with b = 0.82; we need the colored region on the right.]
    P(Z ≥ 0.82) = 1 - 0.7939
          = 0.2061

    (d) For this question refer to this diagram. Let a = -1.13. Since the normal curve is symmetric about its mean, the area to the left of a is the same as the area to the right of b, where b = -a. Thus, P(Z < -1.13) = P(Z > 1.13) = 1 - P(Z < 1.13)
    P(Z < -1.13) = 1 - 0.8708
          = 0.1292 [Figure: diagram 5]

    (e) Again, we use the symmetry property of the normal curve: the area to the right of -0.96 equals the area to the left of 0.96. So P(Z ≥ -0.96) = P(Z ≤ 0.96) = 0.8315

    (f) P(0.70 < Z < 0.82) = P (Z<0.82) - P(Z<0.70)
    P(0.70 < Z < 0.82) = 0.7939-0.7580
          = 0.0359

    (g) P(-1.13≤ Z≤ 0.70) = P(-1.13 < Z < 0.70)
          = P(Z < 0.70) - P(Z<-1.13)
    The first part can be obtained from the table directly and is 0.7580 but the second part will need us to apply the symmetry as in (d) above. Thus,
    P(-1.13≤ Z≤ 0.70) = 0.7580 - 0.1292
          = 0.6288

    (h) P(-1.19 < Z < -0.75) = P(Z<-0.75) - P(Z<-1.19)
    Here again, we will use the symmetry as in (d).
    Notice that, P(Z<-0.75) = 1 - P(Z<0.75) and P(Z<-1.19)=1-P(Z<1.19). Thus,
    P(-1.19 < Z < -0.75) =1 - P(Z<0.75) - [1-P(Z<1.19)]
          = P(Z<1.19) - P(Z<0.75)
          = 0.8830-0.7734
          = 0.1096 [Figure: diagram 6]

    (i) P( |Z| < 1.15) = P( -1.15 < Z < 1.15 )
    From the diagram here, it is clear that we can rewrite the above as
    P( |Z| < 1.15) = P( Z < 1.15 ) - P(Z < -1.15)
          = P(Z < 1.15) - [1 - P(Z < 1.15)]
          = 2P(Z < 1.15) - 1
          = 2(0.8749) - 1
          = 0.7498 [Figure: diagram 7]

    (j) We will first look at the diagram before we start calculating for P( |Z| ≥ 0.75). It is perhaps clear that the total shaded area is 2P(Z<-0.75) or 2[1 - P(Z<0.75)]. Thus,
    P( |Z | ≥ 0.75) = 2[1 - P(Z<0.75)]
          = 2[1 - 0.7734]
          = 0.4532
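The table answers in Example 1 can be cross-checked in software. A minimal sketch, assuming `NormalDist` from Python's standard library as a stand-in for the table; the name `phi` is just a local shorthand for Φ.

```python
from statistics import NormalDist

phi = NormalDist().cdf   # phi(z) = P(Z <= z) for the standard normal variable

answers = {
    "a": phi(0.70),
    "b": phi(0.82),
    "c": 1 - phi(0.82),
    "d": phi(-1.13),
    "e": 1 - phi(-0.96),
    "f": phi(0.82) - phi(0.70),
    "g": phi(0.70) - phi(-1.13),
    "h": phi(-0.75) - phi(-1.19),
    "i": 2 * phi(1.15) - 1,
    "j": 2 * (1 - phi(0.75)),
}
for part, p in answers.items():
    print(f"({part}) {p:.4f}")
```

Because table entries are already rounded to 4 decimal places, answers built from two table lookups, such as (i) and (j), may differ from the software value by one unit in the fourth decimal place.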


    Use of TI-84.
    To calculate the area underneath a normal curve, we press
    [2nd][VARS] for DISTR, select 2:normalcdf(
    The function normalcdf stands for normal cumulative distribution function and takes the following parameters:
    normalcdf( lowerbound, upperbound [, μ, σ] ). If μ and σ are omitted, the default values are 0 and 1 respectively, which describe the standard normal curve.

    Examples:
    (i) P(Z ≤ 0.637). Notice that 0.637 cannot be read directly from the normal probabilities table. We enter
    normalcdf(-9E99,0.637) ≈ 0.7379 (4 d.p.)
    where -9E99 is used to approximate negative infinity. E is obtained by pressing [2nd][ , ].

    (ii) If X ∼ N(57, 8) and you would like to find P(X ≥ 51), note that the variance is 8, so σ = \( \sqrt{8} \), and enter
    normalcdf(51,9E99,57, \( \sqrt{8} \) ) ≈ 0.9831 (4 d.p.)

    (iii) If X ∼ N(-5, 6) and you would like to find P(-6 < X < 0), then enter
    normalcdf(-6,0,-5,\( \sqrt{6} \) ) ≈ 0.6378 (4 d.p.)
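For readers without a GDC at hand, the same three calculations can be sketched in Python. Here `normalcdf` is a hypothetical helper named after the calculator function (it is not a library call), built on `statistics.NormalDist`. As on the calculator, the last argument is the standard deviation, so the variance must be square-rooted first.

```python
from math import inf, sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """P(lower < X < upper) for X ~ N(mu, sigma^2); sigma is the standard deviation."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# (i) P(Z <= 0.637); Python can use infinity directly instead of -9E99.
p1 = normalcdf(-inf, 0.637)
# (ii) X ~ N(57, 8): the variance is 8, so sigma = sqrt(8).
p2 = normalcdf(51, inf, 57, sqrt(8))
# (iii) X ~ N(-5, 6): the variance is 6, so sigma = sqrt(6).
p3 = normalcdf(-6, 0, -5, sqrt(6))

print(round(p1, 4), round(p2, 4), round(p3, 4))
```

Passing the variance instead of the standard deviation is the most common slip with both the calculator and this helper.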

    We can also use the GDC to shade a normal curve.
    Press [2nd][VARS] for DISTR, select DRAW at the top of the screen, then 1:ShadeNorm( [ENTER]. This function takes the same parameters as normalcdf( . Here we look at example (iii).
    ShadeNorm( -6,0,-5, \( \sqrt{6} \))
    To obtain the plot as above, make sure there are no equations in [Y=] and all STAT PLOTs are off, and clear all drawings first with [2nd][PRGM] and 1:ClrDraw [ENTER]. Don't forget to change your window to reflect your distribution. An example is given above.

    With the Nspire, the process is as follows. Go to Scratchpad. Press the [MENU] button and select the following:
    → 6:Statistics → 5:Distributions → 2:Normal Cdf and press [ENTER]. A screen will then pop up. The entries are exactly the same as explained for the TI-84 above. The "E" is obtained from the [EE] button to the left of the [A] button. [Figure: Nspire screen for Normal Cdf]
  9. EXERCISE A. Use GDC to repeat questions in example 1 above.
    (a) P(Z < 0.70)
    (b) P(Z ≤ 0.82)
    (c) P(Z > 0.82)
    (d) P(Z < -1.13)
    (e) P(Z ≥ -0.96)
    (f) P(0.70 < Z < 0.82)
    (g) P(-1.13 ≤ Z ≤ 0.70)
    (h) P(-1.19 < Z < -0.75)
    (i) P( |Z| < 1.15)
    (j) P( |Z| ≥ 0.75)

    EXERCISE B
    For these questions report your answers accurate to 3 significant figures.
    (a) If X ∼ N(82.1, 11.7), find P(X ≥ 90) and P(70 < X < 85).


    (b) The height of adults in a town (H), measured in centimeters, can be modeled using H ∼ N(172, 36.7).
    (i) What is the probability that a randomly selected adult from this town has a height between 165 cm and 185 cm?
    (ii) What is the probability that a randomly selected adult from this town has a height of at most 155 cm?


    Answers: (a) 0.0105, 0.802; (b)(i) 0.860, (ii) 0.00251.

  10. Inverse Normal

  11. In many cases, we are given a probability P(X < a) and want to find the value \( x = a \) in the distribution that gives rise to this probability. Here we are dealing with Inverse Normal probabilities. Although the method with the Inverse Normal table is presented here first, it is an extension to your syllabus. I recommend that you work through this section with a good understanding of all corresponding graphs.

  12. Table 2: Inverse Normal Probabilities.
    Notice that the first column is entitled "p" for probabilities in this table.
    Example 2: Find the value of a for the following:
    (a) P(Z ≤ a) = 0.765
    (b) P(Z ≤a) = 0.23
    (c) P(Z ≥ a) = 0.802
    (d) P(Z ≥ a) = 0.212
    (e) P(-a < Z < a) = 0.60
    (f)
    [Figure: diagram 7] Here is a standard normal curve where the shaded area on the left is given by P(Z < -a) and the shaded area on the right is symmetrical to that on the left. Each shaded area is 0.215. Find the value of a.

    Solutions:
    (a) Read directly from the Inverse Normal table. a = 0.7225.
    (b)
    [Figure: diagram 4] P(Z ≤ a) = 0.23 represents the shaded area on the left. This cannot be read from the table because p starts from 0.50, but we can exploit the symmetry of the normal curve. Let b = -a; then P(Z ≥ b) = 0.23. Thus, P(Z ≤ b) = 1 - 0.23 = 0.77, and from the Inverse Normal table above we obtain
    b = 0.7389 and a = -0.7389

    (c)
    [Figure: diagram 9] P(Z ≥ a) = 0.802 represents the shaded area in the diagram on the left. This cannot be read from the table directly, but we can exploit the symmetry of the normal curve. Let b = -a; then P(Z ≤ b) = P(Z ≥ a) = 0.802. Thus, from the Inverse Normal table above, b = 0.8488 and
    a = -0.8488

    (d)
    [Figure: diagram 8] P(Z ≥ a) = 0.212 represents the shaded area. This cannot be read from the table because p starts from 0.50, but we can exploit the fact that the total area underneath the curve is 1. So P(Z ≤ a) = 1 - 0.212 = 0.788, and from the Inverse Normal table above we obtain
    a = 0.7995.

    (e)
    [Figure: diagram 6] P(-a < Z < a) = 0.60 represents the shaded area. From the graph, we know that each unshaded area in the tails is 0.20. Then P(Z < a) = 0.80, and from the Inverse Normal table we obtain a = 0.8416.

    (f)
    [Figure: diagram 7] The curve is standard normal and each shaded area is 0.215. Thus, P(Z < b) = 1 - 0.215 = 0.785, where b is the upper boundary. From the Inverse Normal table above, we obtain b = 0.7892. Since the curve is symmetric about the mean, the lower boundary is -b; comparing with P(Z < -a) gives a = b = 0.7892.
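The answers to Example 2 can be cross-checked in software. A minimal sketch, assuming `NormalDist().inv_cdf` from Python's standard library as a stand-in for the Inverse Normal table; the name `inv` is just a local shorthand. Unlike the table, `inv_cdf` accepts probabilities below 0.50, so each part only needs to be rewritten in the form P(Z ≤ a) = p.

```python
from statistics import NormalDist

inv = NormalDist().inv_cdf   # inv(p) returns z such that P(Z <= z) = p

answers = {
    "a": inv(0.765),            # P(Z <= a) = 0.765
    "b": inv(0.23),             # P(Z <= a) = 0.23
    "c": inv(1 - 0.802),        # P(Z >= a) = 0.802, so P(Z <= a) = 0.198
    "d": inv(1 - 0.212),        # P(Z >= a) = 0.212, so P(Z <= a) = 0.788
    "e": inv(0.5 + 0.60 / 2),   # central area 0.60, so P(Z <= a) = 0.80
    "f": inv(1 - 0.215),        # each tail is 0.215, so P(Z <= a) = 0.785
}
for part, a in answers.items():
    print(f"({part}) a = {a:.4f}")
```

Values read from a printed table may differ from these in the fourth decimal place.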

    Use of TI-84
    To calculate the value x = a that is associated with an area underneath a normal curve, we press
    [2nd][VARS] for DISTR, select 3:invNorm(
    The function invNorm means "inverse of the normal distribution" and takes the following parameters:
    invNorm( area [, μ, σ] ), where area is the probability to the left of a. If μ and σ are omitted, the default values are 0 and 1 respectively, which describe the standard normal curve.

    Examples:
    (i) P(Z ≤ a) = 0.7653. Notice that 0.7653 cannot be read directly from the Inverse Normal table. We enter
    invNorm(0.7653) ≈ 0.7235 (4 d.p.)

    (ii) P(Z ≤ a) = 0.2113
    invNorm(0.2113) ≈ -0.8019 (4 d.p.)

    (iii) If X ∼ N(25, 6) and you would like to find a such that P(X ≤ a) = 0.123, then
    invNorm(0.123,25, \(\sqrt{6}\) ) ≈ 22.1583 (4 d.p.)
    With the TI-Nspire, the process is as follows. Go to Scratchpad. Press the [MENU] button and select the following:
    → 6:Statistics → 5:Distributions → 3:Inverse Normal and press [ENTER]. A screen will then pop up. The rest of the entry is similar to that for the TI-84. [Figure: Nspire Inverse Normal screen]
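The invNorm examples above can also be reproduced in Python. In this sketch `inv_norm` is a hypothetical helper named after the calculator function (it is not a library call), built on `statistics.NormalDist`. Its `area` argument is the probability to the left of a, and `sigma` is the standard deviation.

```python
from math import sqrt
from statistics import NormalDist

def inv_norm(area, mu=0.0, sigma=1.0):
    """Return a such that P(X <= a) = area for X ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).inv_cdf(area)

a1 = inv_norm(0.7653)                 # standard normal, P(Z <= a) = 0.7653
a2 = inv_norm(0.2113)                 # standard normal, P(Z <= a) = 0.2113
a3 = inv_norm(0.123, 25, sqrt(6))     # X ~ N(25, 6): variance 6, so sigma = sqrt(6)

print(round(a1, 4), round(a2, 4), round(a3, 4))
```

For a right-tail probability P(X ≥ a) = p, pass `1 - p` as the area.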
  13. EXERCISE C
    Confirm the answers in example 2 above with GDC.

    EXERCISE D
    Report answers accurate to 3 significant figures.
    Use GDC to find the value of a in the following questions:

    (a) If X ∼ N(0.5, 0.3), find a such that P(X < a) = 0.2.

    (b) If X ∼ N(65, 15), find a such that P(X < a) = 0.053.


    Answers: (a) 0.0390, (b) 58.7.

Solving problems using Normal Distribution.

Example 3
The weight (W) of a population of students in JazzSchool is normally distributed with a mean of 75.6 kg and a standard deviation of 20.1 kg.
(a) Find the probability that a randomly selected student from JazzSchool has a weight of at least 60 kg.
(b) Find the probability that a randomly selected student from JazzSchool has a weight between 60 and 90 kg.

Solution

Method 1: By hand.
(a) First calculate the standardized value of W with \[ \large \boldsymbol z = \frac{w - \mu}{\sigma} \] \( z_1 = \frac{60-75.6}{20.1} \approx \) -0.7761 (to 4 d.p.)
P(Z ≥ -0.7761) = 1 - P(Z < -0.7761)
    ≈ 0.781 (3 s.f.)

(b) \( z_2 = \frac{90-75.6}{20.1} \approx \) 0.7164 (to 4 d.p.)
\( z_1 = \frac{60-75.6}{20.1} \approx \) -0.7761 (to 4 d.p.)
P(60 < W < 90) = P(W < 90) - P(W < 60)
    = P(Z < 0.7164) - P(Z < -0.7761)
    ≈ 0.544 (3 s.f.) with GDC.

Method 2: By GDC

(a) P(W ≥ 60) = normalcdf(60, 9E99, 75.6, 20.1) ≈ 0.781 (3 s.f.)
I suggest you sketch a normal curve with the relevant area shaded as part of your working. Alternatively, you can show the working in Method 1 and report the answer from the GDC.

(b) P(60 < W < 90) = normalcdf(60, 90, 75.6, 20.1) ≈ 0.544 (3 s.f.)
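Both methods in Example 3 can be compared side by side. A minimal sketch, assuming `NormalDist` from Python's standard library in place of the table and GDC; the variable names are chosen for this illustration only.

```python
from statistics import NormalDist

mu, sigma = 75.6, 20.1            # mean and standard deviation of W, in kg
W = NormalDist(mu, sigma)
Z = NormalDist()                  # standard normal, N(0, 1)

# Method 1: standardize first, then use the standard normal distribution.
z1 = (60 - mu) / sigma
z2 = (90 - mu) / sigma
p_a_standard = 1 - Z.cdf(z1)              # P(W >= 60)
p_b_standard = Z.cdf(z2) - Z.cdf(z1)      # P(60 < W < 90)

# Method 2: work with W directly, as the GDC does.
p_a_direct = 1 - W.cdf(60)
p_b_direct = W.cdf(90) - W.cdf(60)

print(round(z1, 4), round(z2, 4))
print(round(p_a_standard, 3), round(p_a_direct, 3))
print(round(p_b_standard, 3), round(p_b_direct, 3))
```

The two methods should give the same probabilities; only the intermediate working differs.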

Example 4
Given that X ∼ N(μ, 21²) and P(X < 29.8) = 0.4525, find the value of μ.

Solution
First consider P(Z < a) =0.4525 and from GDC we know that a ≈ -0.1193 (to 4 d.p.).
Thus, applying \[ \large \boldsymbol Z = \frac{X - \mu}{\sigma} \] we have
-0.1193 = \( \frac{29.8-\mu}{21} \)
μ = 29.8 + 21(0.1193)
μ ≈ 32.3 (3 s.f.)

In this question, you will have to use the standardization formula to obtain the correct answer.
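The steps of Example 4 can be sketched in Python: find the z-value for the given probability, then rearrange the standardization formula to μ = x − zσ. The final line substitutes the answer back in as a check; that check is an addition for illustration and is not part of the original solution.

```python
from statistics import NormalDist

sigma = 21.0              # standard deviation used in the solution above
x, p = 29.8, 0.4525       # given: P(X < 29.8) = 0.4525

z = NormalDist().inv_cdf(p)       # z such that P(Z < z) = p
mu = x - z * sigma                # rearranged from z = (x - mu) / sigma

print(round(z, 4), round(mu, 1))
# Check: with this mu, P(X < 29.8) should return the given probability.
print(round(NormalDist(mu, sigma).cdf(x), 4))
```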

Example 5
Given that X ∼ N(μ, 4) and the 75th percentile of the distribution is 5.349, find the value of μ.

Solution
First consider P(Z < a) =0.75 and from GDC we know that a ≈ 0.6745 (to 4 d.p.).
Thus, applying \[ \large \boldsymbol Z = \frac{X - \mu}{\sigma} \] we have
0.6745 = \( \frac{5.349-\mu}{2} \)
μ = 5.349 - 2(0.6745)
μ ≈ 4.00 ( 3 s.f.)


EXERCISE E
(a) Given that X ∼ N(μ, (3.9)²) and P(X ≥ 150) = 0.9976, find the value of μ. [Answer: 161 (3 s.f.)]

(b) Given that X ∼ N(87, σ²) and P(X ≥ 107) = 0.0455, find the value of σ. [Answer: 11.8 (3 s.f.)]
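Problems with an unknown mean or an unknown standard deviation both come from rearranging z = (x − μ)/σ. The sketch below wraps the two rearrangements in hypothetical helpers, `solve_mean` and `solve_sigma` (names invented for this illustration), and applies them to Example 5 and Exercise E. Both helpers expect a left-tail probability, so P(X ≥ x) must first be converted to 1 − P(X ≥ x).

```python
from statistics import NormalDist

def solve_mean(x, p, sigma):
    """Return mu such that P(X <= x) = p when X ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(p)
    return x - z * sigma

def solve_sigma(x, p, mu):
    """Return sigma such that P(X <= x) = p when X ~ N(mu, sigma^2); requires p != 0.5."""
    z = NormalDist().inv_cdf(p)
    return (x - mu) / z

# Example 5: the 75th percentile is 5.349 and sigma = 2.
mu_example5 = solve_mean(5.349, 0.75, 2)
# Exercise E (a): P(X >= 150) = 0.9976, so P(X <= 150) = 1 - 0.9976.
mu_exercise = solve_mean(150, 1 - 0.9976, 3.9)
# Exercise E (b): P(X >= 107) = 0.0455, so P(X <= 107) = 1 - 0.0455.
sigma_exercise = solve_sigma(107, 1 - 0.0455, 87)

print(round(mu_example5, 2), round(mu_exercise, 1), round(sigma_exercise, 1))
```

When p = 0.5 the z-value is 0, so x equals the mean and σ cannot be determined from that single fact.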


Summary: Write out the appropriate expression for the calculation of the following shaded areas.
Use a and b for boundaries.

(i)

Example: The shaded area is P(X ≤ z).

(ii) [Figure: diagram 8]

(iii) [Figure: diagram 9]

Write an expression for each of the shaded areas shown here.

(iv) [Figure: diagram 2] (v) [Figure: diagram 7] Assume that the shaded regions are symmetrical.

 

Answers: (ii) 1 - P(X < a). (iii) P(X < -a); since a < 0, -a > 0. P(X < b). (iv) P(X < b) - P(X < a). (v) P( |X| > a) where a > 0.


Email KokMing Lee