All dashboards need one thing to work – data. However, while building dashboards we very often do not have access to the original data and this compels us to create random data sets. While most of us know how to create numeric data sets (by using the RAND function), I often get a lot of queries on generating random date values. So, I thought to post something on my approach of generating them.
It used to be a bit tricky in excel 2003, especially if one wanted dates to be within a range – which is mostly the case as dashboards usually have data for the relevant 5-6 years and it would not make a lot of sense to get dates that are 100 years old. However, with the Excel 2007 & 2010 RANDBETWEEN function it has become really easy. How?
Excel dates are nothing but numbers – Don’t trust me? Test it yourself – write a date in a cell in Excel, now go to the ‘number’ sub-tab on the home ribbon and change the format to number, surprised!! Well basically this is how Excel stores dates and hence it makes it fairly easy to compared dates.
Now, let’s come back to the problem. First of all, let’s explore the format of the RANDBETWEEN function:
RANDBETWEEN ( bottom, top)
This function requires two inputs - ‘bottom’ and ‘top’. Both the inputs require numeric integer values. The value returned by the function is a random integer number value between the bottom (or the lower) and the top (or the higher) value (both inclusive). E.g. RANDBETWEEN (2, 10) will return a random number between 2 and 10 (both inclusive).
So, how do we get the random dates? Simple, in two cells write you bottom (or the start date) and the top (end date) dates
cell 1 = 01 January 2012 - to be used as bottom value
cell 2 = 09 December 2012 – to be used a s top value
Now use them as the ‘bottom’ and ‘top’ values in the RANDBETWEEN formula to get random dates i.e.
RANDBETWEEN( cell 1, cell 2)
If you are wondering how this is processed – Excel converts ‘01 January 2012’ to its numeric equivalent ‘40909’ and similarly ‘09 December 2012’ to ‘41252’. Now it randomly selects values between 40909 and 41252 to generate random dates (Don’t panic if you see the newly generated random values as number and not dates. It may have happened because you were not using Date format in relevant cells. Changing the format to date will solve this issue).
Similarly RANDBETWEEN can be used for several other randomization exercises. One such useful exercise can be to randomize company/product names. This can be done using a combination of CHOOSE and RANDBETWEEN. It might be worthwhile to discuss the CHOOSE function a bit, here. In its simplest form, CHOOSE allows you to select a value from a list based on an index number. The general format is:
CHOOSE (index number, value 1, value 2, …, value n)
E.g. CHOOSE (2, “capsule”, ”tablet”, “injection”) will return “tablet” i.e. the second item in list.
Now that you are familiar with the choose function, lets come to the original problem – generating random company names. Suppose I want to randomize 4 company names – how do I do this?
CHOOSE (RANDBETWEEN(1,4),”Novartis”, “Pfizer”, “Merck”, “Sanofi”)
So, RANDBETWEEN replaces the ‘index number’ – thereby generating random index numbers between 1 and 4, based on which choose returns a random company for you. Similarly, you can also use this for products or any other text values that you wish to randomize.
Hope you find this tip useful. Leave us a comment if you are facing difficulties in randomizing some other values and we would be happy to help.