I often need random data. For example to use in an example file on this blog, to deliver a customer demo or just to test out something new. As I am probably not the only one with this need, here’s a short and sweet post on how to generate random data for use in QlikView.
I usually create random data by using variables that generate different types of random data. Some of these are shown in the image above. For dimension labels I use mapping tables containing, for example, first and last names or company names. Then I put together some load statements like the one shown below and voila, random data.
While this approach works well it can be quite a hassle, especially if you want to create dimension values that look plausible. You can imagine that I was pleasantly surprised when I caught a tweet by Qlik’s Michael Tarallo yesterday in which he mentioned an online tool for generating test data: Mockaroo.
This tool lets you generate dozens of different data types, from random dates, string and numbers to cities, names and even shirt sizes. You can generate up to 100,000 rows of random data, which can be exported to many different formats, CSV being the most useful for QlikView. You can even save your schemas and retrieve them directly from Mockaroo in QlikView:
While the 100,000 row limit is on the low side when you want to test the performance implications of various approaches, this does seem to be a very useful tool for customer demo’s. Alternatively, you can use this tool to generate your dimension tables while using the ‘old’ approach to generate big fact tables. In any case, if you ever need some realistic test data then Mockaroo is definitely worth a look.
Some additional tips for nice, random data in QlikView:
- You can use the NormInv() function in QlikView to generate random values that follow a normal distribution, in my opinion this makes for much nicer, more realistic charts.
- You can use the Sin() or Cos() functions to create (relatively) realistic seasonal patterns in your data.
8 Comments
Good catch. Creating a random sales amount is easy, but the hard part is creating random costs, payments, discounts that are related to the original sales amount and make enough sense to look like real data.
Karl
Hi Karl,
I would generate the random sales amount and sales date in the first load and then use preceding load to generate random costs and payments that are in line with the sales amount. You can generate all of this in a big, flat table and split it out later if necessary.
It’s true that getting everything to look real can be quite a challenge. One of the things I like to use is the NormInv() function instead of Rand(). This makes the data much more plausible. For example, many salespeople/products/employees/etc will be near average performers but there’s always a few that are stand-out performers (and underperformers). Using NormInv() makes your charts look much nicer too, instead of the flat, linear distribution that Rand() gives you.
You can also use a sine or cosine function to generate a pattern that looks seasonal. First generate a table of 365 days that follow the sin/cos wave. Next, loop through that table and autogenerate the amount of records that was calculated for each day.
Additionally, if this is a sales demo, I will add a few ‘discoveries’ to the data manually. Discovering these nuggets during a demo really makes for a much more compelling demo and better illustrates the value of QlikView.
Kind regards,
Barry
Nice tip Barry. I have used QlikView hot key + Q + Q to generate random data in the past, but this tool seems worth looking at to generate random data for demo apps even for blog articles.
Thanks for sharing!
Shilpan
thank you for great post. mackaroo is so awesome tool
Thanks !
Found another tool: http://www.generatedata.com/
You can use it on your own server …
I use often http://www.yanDATAellan.com tool to generate my test databases; it is simple, rich of functionalities, and user-friendly. You could generate up to 10.000 rows of mock data in several file formats (CSV, Excel, SQL, JSON, HTML, and XML). Take a look!