Hello all and welcome back to the Excel Tip of the Week! This week, we have a Creator post in which we are looking at the functions for calculating various statistics related to the normal distribution - namely NORM.DIST, NORM. INV, NORM.S.DIST, NORM.S.INV, and STANDARDIZE.
Note that this post is intended to talk about how to use these functions in Excel - it is not a probability primer!
Also note that the functions we discuss here are those present in Excel 2010 and later - very similar functions (without the full stops) are present in Excel 2007 and earlier, and are available for compatibility reasons only in later versions.
Throughout this Tip, I will be using a well-known normal distribution as an example to illustrate each of the points - namely women's heights in inches. I am using a mean of 63.6 inches and a standard deviation of 2.5 inches.
How to write these functions
=NORM.DIST(x, mean, standard deviation, cumulative)
=NORM.INV(probability, mean, standard deviation)
=STANDARDIZE(x, mean, standard deviation)
x - The value being testedmean - The mean of the normal distribution in question (assumed to be 0 for the NORM.S functions)standard deviation - The standard deviation of the normal distribution in question (assumed to be 1 for the NORM.S functions)cumulative - An option setting - TRUE or 1 will return the cumulative distribution function (i.e. the probability that the value will be less than or equal to x), and FALSE or 0 will return the probability density function (the probability that the value will be exactly x).probability - The probability that is being worked back from for the inverse functionsz - The z-value for the standard normal distribution - essentially the same as the x-value for NORM.DIST
NORM.DIST tells us the probability for a result at a particular value - either the spot probability of hitting that exact value, or the cumulative probability of seeing that value or lower. For example, we could calculate the probability that a woman is 5'7" (67 inches) as: =NORM.DIST(67, 63.6, 2.5, FALSE) = 6.3%And the probability that a woman is 5'7" or shorter would be: =NORM.DIST(67, 63.6, 2.5, TRUE) = 91.3%The probability that a woman is between 5' and 6' is the difference between these two functions: =NORM.DIST(72, 63.6, 2.5, TRUE) - NORM.DIST(60, 63.6, 2.5, TRUE) = 99.96% - 7.49% = 92.47%
NORM.INV does the reverse - that is, if we want a given likelihood of a certain event happening, what threshold should we set? For example, if we want to find the lower quartile of women's heights: =NORM.INV(25%, 63.6, 2.5) = 61.9 inchesWe can invert this by considering symmetry - for example, if we want a threshold that 60% of women are taller than, we can find the one that 40% of women are shorter than.
NORM.S.DIST and NORM.S.INV are essentially the same functions if the mean is 0 and the standard deviation is 1; these functions are used more in raw statistics. STANDARDIZE counts how many standard deviations from the mean a certain value is.
Generating random numbers that conform to a known distribution
If you want to generate random numbers that follow a given mean and standard deviation, you can do so by using a RAND() function as the input to a NORM.INV function. This is far fairer for creating dummy data than creating truly random data. Check out the example in the attached file.
Previous post - Revisiting text extraction functionsTOTW index Next post - Tables redux, part 1
This blog is brought to you by the Excel Community where you can find additional blogs, extended articles and webinar recordings on a variety of Excel related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Filtered. There is also an online forum where you can ask questions and share ideas with other community members.