Hello and welcome back to the Excel Tip of the Week. This week we have a General User post in which we look at two random-number-generating functions, RAND and RANDBETWEEN. What RAND does: RAND generates a random number with 15 decimal places, somewhere between 0 and 1. Each RAND formula will make a different random number from each other RAND, and every time something changes in the workbook, every RAND will be changed to a new random number. How to write a RAND formula: =RAND() That’s it! The function is completely defined by its name and doesn’t require any inputs from you at all. You do still need to write the () part though. How to use a RAND formula for random sampling: Because each RAND is different, it can be used to pick a random sample from a list of data by shuffling that list into a random order. Check out the demonstration in the attached file. What RANDBETWEEN does: RANDBETWEEN generates a random integer number between two posts you specify. How to write a RANDBETWEEN formula: =RANDBETWEEN(lower bound, upper bound) Lower bound – the lowest possible number you want the formula to make. Must be a whole number, but can be positive or negative. Upper bound – the highest possible number you want the formula to make. Must be a whole number, but can be positive or negative. Must be larger than lower bound. Unlike RAND, with RANDBETWEEN you are not guaranteed to avoid duplicates; depending on the width of the interval you specify, duplicated values might be quite common if you have multiple RANDBETWEENs. For almost all cases, RAND is a better formula to use. Previous post - SUMPRODUCT TOTW index Next post - General efficiency tips This blog is brought to you by the Excel Community where you can find additional blogs, extended articles and webinar recordings on a variety of Excel related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Excel with Business. There is also an online forum where you can ask questions and share ideas with other community members.
Since I've been quoted...
Rob: I think the article you mention would suggest exactly the opposite
but certainly worth clarifying the issue about RAND() and duplicates.
Haha Richard, excellent point - rewrote the main post to make it clear we're Silver. The comments thread may be headed towards Platinum at this point!
Remind me again, was this a bronze, silver or gold post?!
I've actually got a background in mathematics myself - which is why I used "guaranteed". With 1 quadrillion different possible outputs calculating independently, even if you filled an entire column of a workbook with RAND() formulas, there's still only a 1/1800 chance of even a single duplicate. Duplicates are possible by the operation of the formula, but statistically guaranteed not to happen in real use. The mechanism you suggest for random sampling is considerably more time consuming and will not be necessary for any set of data that can be actually contained within an Excel workbook.
Or, in other words, I guess we're both freaks :)
RAND is not guaranteed to avoid duplicates. Because it returns a "real" number (actually not - strictly it's a rational number) instead of an integer, it returns one of a large number of possible answers (10^15 if it's to 15 decimal places) with an even distribution so that any number is equally likely. If RAND was guaranteed unique each time, then if you ran it 10^15 times it would be easy to predict the last number, so it wouldn't be random.
If the lottery uses a function like this, count me in about 100 years from now :)
The choice of which to use should be based on whether you want an integer result or not.
So to simulate rolling a die, use RANDBETWEEN(1,6).
To get a random sample from a list with no duplicates you should run it each time on a new list with the previously selected values removed.