Excel Tip of the Week #162 - Normal distribution functions

Hello all and welcome back to the Excel Tip of the Week!  This week, we have a Creator post in which we are looking at the functions for calculating various statistics related to the normal distribution - namely NORM.DIST, NORM. INV, NORM.S.DIST, NORM.S.INV, and STANDARDIZE.

Note that this post is intended to talk about how to use these functions in Excel - it is not a probability primer!

Also note that the functions we discuss here are those present in Excel 2010 and later - very similar functions (without the full stops) are present in Excel 2007 and earlier, and are available for compatibility reasons only in later versions.

Throughout this Tip, I will be using a well-known normal distribution as an example to illustrate each of the points - namely women's heights in inches.  I am using a mean of 63.6 inches and a standard deviation of 2.5 inches.

How to write these functions

=NORM.DIST(x, mean, standard deviation, cumulative)

=NORM.INV(probability, mean, standard deviation)

=NORM.S.DIST(z, cumulative)


=STANDARDIZE(x, mean, standard deviation)

x - The value being tested
mean - The mean of the normal distribution in question (assumed to be 0 for the NORM.S functions)
standard deviation - The standard deviation of the normal distribution in question (assumed to be 1 for the NORM.S functions)
cumulative - An option setting - TRUE or 1 will return the cumulative distribution function (i.e. the probability that the value will be less than or equal to x), and FALSE or 0 will return the probability density function (the probability that the value will be exactly x).
probability - The probability that is being worked back from for the inverse functions
z - The z-value for the standard normal distribution - essentially the same as the x-value for NORM.DIST

NORM.DIST tells us the probability for a result at a particular value - either the spot probability of hitting that exact value, or the cumulative probability of seeing that value or lower.  For example, we could calculate the probability that a woman is 5'7" (67 inches) as:
     =NORM.DIST(67, 63.6, 2.5, FALSE) = 6.3%
And the probability that a woman is 5'7" or shorter would be:
     =NORM.DIST(67, 63.6, 2.5, TRUE) = 91.3%
The probability that a woman is between 5' and 6' is the difference between these two functions:
     =NORM.DIST(72, 63.6, 2.5, TRUE) - NORM.DIST(60, 63.6, 2.5, TRUE) = 99.96% - 7.49% = 92.47%

NORM.INV does the reverse - that is, if we want a given likelihood of a certain event happening, what threshold should we set?  For example, if we want to find the lower quartile of women's heights:
     =NORM.INV(25%, 63.6, 2.5) = 61.9 inches
We can invert this by considering symmetry - for example, if we want a threshold that 60% of women are taller than, we can find the one that 40% of women are shorter than.

NORM.S.DIST and NORM.S.INV are essentially the same functions if the mean is 0 and the standard deviation is 1; these functions are used more in raw statistics.  STANDARDIZE counts how many standard deviations from the mean a certain value is.

Generating random numbers that conform to a known distribution

If you want to generate random numbers that follow a given mean and standard deviation, you can do so by using a RAND() function as the input to a NORM.INV function.  This is far fairer for creating dummy data than creating truly random data.  Check out the example in the attached file.

Previous post - Revisiting text extraction functions
TOTW index 
Next post - Tables redux, part 1

This blog is brought to you by the Excel Community where you can find additional blogs, extended articles and webinar recordings on a variety of Excel related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Filtered. There is also an online forum where you can ask questions and share ideas with other community members.