Merging tables: the JOIN prefix

By Barry Harmsen

Merging tables: the JOIN prefixThis is the second post in a series on merging tables in QlikView. In the previous post I wrote about appending rows to tables by using the CONCATENATE prefix, this time I will show you how to add  columns to a previously loaded table by using the JOIN prefix.

Table of contents

The story so far

Concatenated tablesIn the previous post we discussed the example of a paper company that wanted to merge the employee list of their Sales Department of one of their branches (the Stamford branch) into the employee list of another branch office (the Scranton branch). We merged the two tables by using the CONCATENATE prefix, demonstrating the difference between implicit and explicit concatenation along the way. The resulting table is shown on the right.

The new requirements

The corporate dataNow that we have a single, merged Employee table, the manager of the Scranton branch comes up with an additional requirement; wouldn’t it be a cool idea if we could link corporate’s HR data with the merged employee table so we also have everyone’s job description and pay scale available in our reports and dashboards? While I personally reserve the “cool idea” qualification for ideas that are really, well, cool (say, shark cage diving or designing your own heart valve implant) for the sake of this post lets assume that this is a cool idea and that we really want to implement it. So how do we proceed?

Just load the table

Tables loaded separately

The first option we could consider is just loading the table as-is. The resulting data model would look like the picture on the right. Technically, there is nothing wrong with using this approach but I prefer to merge the tables for two reasons:

  • It makes for a simpler, less-cluttered data model;
  • Logically, it makes more sense for employee-related data to be grouped together into a single employee dimension table.

To merge the columns from two (or more) tables we can use the JOIN prefix.

The JOIN prefix

When the JOIN prefix is put in front of a LOAD statement, it tells QlikView not to load the data into a new table, but instead join the data to a previously loaded table.

Join script example

The join is a natural join made over all the common fields. This means that QlikView compares all the common fields between the two tables and joins the rows from both tables when the values in the shared columns match.

Optionally, the name of the table to which the data should be joined can be specified between parentheses. If no table name is specified, QlikView assumes that the data should be joined with the last loaded table. Those who read the first post know that I am not a big fan of making assumptions in the load script, it makes the code harder to read and maintain. Therefore I always specify a table name.

Join types

When you look closely at the two tables that we want to join, you notice that not all Person ID’s correspond between the tables. The Employees tables contain an employee, Michael Scott, who is not present in the HR Data table. Conversely, the HR Data table contain data for a Receptionist who is not present in the Employees table.

Not all values between tables match

When we do a standard join, the tables are merged so that the corresponding records from both tables are shown on the same row. Records without a corresponding record in the other table get null (empty) values for those columns. The result is the following table:

Standard join

This is almost the merged table we want, but it still contains data for a Receptionist who is not an employee at the Scranton branch. Fortunately there is a way to solve this, by specifying the join type.

By using a join type we can specify how QlikView joins the tables, we can use the following options:

  • INNER: Only records which are present in both tables are included in the merged table. In our example this would mean that both Person ID 1 (Michael Scott) and 10 (Receptionist) would be excluded from the merged table.
  • OUTER: All records from both tables are included in the merged table, regardless if they have corresponding records in the other table. This returns the same result as the standard join.
  • LEFT: All records from the first table are included in the merged table, and only those records from the second table which have a corresponding value in the first table. In our example this would mean that Person ID 1 (Michael Scott) would be included and that Person ID 10 (Receptionist) would be excluded, as it has no corresponding value in the first table.
  • RIGHT: All records from the second table are included in the merged table, and only those records from the first table which have a corresponding value in the second table. In our example this would mean that Person ID 10 (Receptionist) would be included and that Person ID 1 (Michael Scott) would be excluded, as it has no corresponding value in the second table.

As we want to keep the entire Employee table and only add the corresponding values from the HR Data table to it, we decide to use the LEFT JOIN prefix.

Left join

This results in the following merged table:

Left joined table

And that’s how it’s done! The example QVW for this post shows the completed script and also shows what result you get when you use another join type. As always, if you have any questions or comments, feel free to drop them in the comments section.

Downloads

Download the example QVW and source data