Visualizing customer profitability with a whale curve

By Barry Harmsen

Do you know if your customers are profitable? All of them? Performing Customer Profitability Analysis can answer these questions and give you some amazing, and sometimes counter-intuitive, insights into your customers’ contribution to your bottom line.

This post describes one of the visualizations that you can create once you possess accurate data* on the profitability of your customers: the whale curve.

In a whale curve, customers are ranked by profitability, from highest to lowest, on the X-axis while their accumulated profit is plotted on the Y-axis. The curve that results can, with some imagination, be said to look like a whale coming out of the water. An example of a whale curve chart is shown below.

Whale curve example

When you look at this chart, you may notice that the top 200 customers generate the bulk of the profit.  You may also notice that the you are losing serious money on the bottom 100 customers and that the customers in the middle are more or less break-even.

Read on to learn how to create a whale curve in QlikView. Even if you’re not interested in creating a whale curve, you might still want to read on to learn more about the rank function and the continuous x-axis.

First off, you can download the complete example qvw by clicking here. The source data for the example can be downloaded by clicking here.

Assuming we have a data model that contains the profit per customer, we can create a whale curve in QlikView using a line chart. In this line chart we first need to create a calculated dimension that ranks each customer according to their profitability.

Whale curve dimension

The expression to use for the calculated dimension is:

=aggr(num(rank(sum(Profit), 4)), Customer)

The way that this expression works is that the total amount of profit is summed and ranked per customer using the sum, rank and aggr functions. The parameter value 4 tells the rank function to increment the rank even when customers have the same amount of profit. The table below illustrates the use of the rank function with and without the parameter value 4.

Value Default rank Parameter 4 rank
5 1 1
4 2 2
4 2 3
3 4 4
2 5 5

As the rank function returns a dual value (both a numeric and string value), the num function is used to force the result to be interpreted numerically.

Once the calculated dimension is created, we can create the expression that plots the cumulative profits. This expression is a simple sum statement. Because we want to see the accumulated profit, we select the “Full accumulation” radio button.

Whale curve expression

The customers need to be sorted from the most profitable to the least profitable customer. Since the calculated dimension that we created returns already ranks the customers in this order, we choose an ascending sort by numeric value.

Whale curve sort

The example chart shows two reference lines, one showing the actual amount of profit and one showing the maximum amount of profit, i.e. the amount of profit you could have made if you exclude all the customers that you are currently losing money on. We can add these reference lines on the presentation tab.

(please keep in mind that identifying the customers you are currently losing money on is only a start, you will want to find out why you are losing money on these customers, simply getting rid of all of them is not the answer)

Whale curve reference lines

The expressions to use for these reference lines are:

Actual profit: =sum(Profit)

Maximum profit: =sum ({$ <Profit={“>0”}>} Profit)

The actual profit is a straightforward sum of the profits. The maximum profit is the same sum, but it adds a set modifier that excludes any negative (or zero) profit amounts.

Next, we need to let QlikView know that the values on the x-axis need to be treated as a continuous range of numbers, not as separate categories. To do this we need to tick the “Continuous” checkbox on the “Axes” tab.

Whale curve axes

The finishing touch is to properly format the continuous x-axis as an integer on the “Number” tab.

Whale curve number format

Once this has been done we can click “OK” to close the properties dialog, the whale curve is ready.

You can download the complete example qvw by clicking here. The source data for the example can be downloaded by clicking here.

Should you be interested in more ways to visualize customer profitability then you may want to take a look at my post on decile analysis.

As always, if you have any questions and/or comments, feel free to drop them in the comments section.

* Unfortunately, implementing a costing model such as Activity Based Costing to underpin this type of analysis can be notoriously hard. But that is a subject falls outside of the realm of this QlikView-themed blog, maybe I will cover it on my other blog sometime.

About The Author

Barry Harmsen

Hi there, I'm Barry and I'm a Business Intelligence Consultant at Bitmetric and based in the Netherlands. Originally from a background of 'traditional' Data Warehousing, Business Intelligence and Performance Management, for the past 10 years I have been specializing in Qlik and a more user-centric form of BI. I have done numerous QlikView and Qlik Sense implementations in many different roles and industries. In 2012 I co-authored the book QlikView 11 for Developers. You can follow me on Twitter at @meneerharmsen.

10 Comments

  • 1
    November 10, 2010 - 02:50 | Permalink

    Great post. Thanks.

  • 2
    paul yeo
    November 13, 2010 - 04:42 | Permalink

    http://community.qlikview.com/forums/p/36835/144443.aspx#144443
    Hi sir , Nice post , i like the chart , i try and fail to make it , i post a question on above link , hope you can help me.

    • 3
      November 13, 2010 - 09:32 | Permalink

      Paul,

      This happens because the “company” field in your application contains duplicates, there are 3778 values in there from only 230 distinct companies. The solution is to change the “count(total company)” part to “count(total distinct company)”.

      Kind regards,

      Barry

  • 4
    John Witherspoon
    June 29, 2011 - 00:19 | Permalink

    Nice post! I had fun with this by seeing what the graph meant, and then trying to figure out how to create it on my own. I ended up with a small difference.

    In a typical live application, you wouldn’t have a single profit number associated with a customer. Rather, the profit would be on some other object, like order items or invoices, where there can be many per customer. In that case, a simple =sum({$=0″]>} Profit) produces the wrong result for the peak of the graph. The below appears to work, however, by “selecting” customers out of the currently-possible customers that are profitable, instead of merely summing positive profit numbers.

    Well, another difference was that I didn’t notice that I needed the 4 parameter on my rank(). Oops!

  • 5
    Mike
    August 11, 2011 - 09:55 | Permalink

    Very nice post.

    I was wondering if it would be possible to add a reference line to the x-axis positioned on the rank where the curve crosses the y-axis reference line of the actual profit? So, in your example, draw the reference line on rank = 134.

  • 6
    John Witherspoon
    August 11, 2011 - 18:56 | Permalink

    Certainly. The comments don’t support a full character set, but hopefully the explanation will be clear. Take the expression I gave above, change sum() to count() and Profit to distinct Customer. In other words, just count the profitable customers.

  • 7
    John Witherspoon
    August 11, 2011 - 19:27 | Permalink

    Ah, sorry. I misunderstood and gave you the reference line for where the curve crosses maximum profit, not actual profit. I haven’t worked out an expression for where we cross the actual profit reference line.

  • 8
    Mike
    August 12, 2011 - 08:52 | Permalink

    John,

    Thanks. It’s indeed that obvious; was looking too far.

    In the mean time I have been playing around with this in a live app. I am having an issue with the aggr-function however. Created a post on the qv community (http://community.qlikview.com/message/140464), but no real solution yet. Maybe you have any ideas?

    Regards,

    Mike

  • 9
    Satya
    May 25, 2016 - 07:40 | Permalink

    Hi,

    I have a problem to solve with fractile function. we have set of values across months and want to plot a line chart with 90th percentile(Fractile) with month as Dimension.

    Here comes the issue. Trend shows wrong values and correct values are visible ONLY if I select on a month. I am unable to use aggr function with Fractile or getting error when I created resident table with Group by…

    Ex: 90th percentile

    Jan Feb Mar Apr
    1 4 2 6

    But the value for March shows as 3 in the trend graph and shows as 2 if i click on March month..

    Can anyone advise me how to arrest this behavior..

    Trial 1: Tried creating a Resident table

    Load

    month,

    Fractile (Total Tat, 0.9) as Fra90

    Resident Table1 Group by month;

    Getting error while reloading the script.. How to aggregate the Fractile values at month so that this can be arrested..

  • Leave a Reply

    Your email address will not be published. Required fields are marked *

    Read previous post:
    Portable QlikView: run QlikView Desktop from your USB drive

    As a consultant, one thing that I have really come to rely upon is my collection of Portable Applications. Until...

    Close