Aliases, to indent or not?

By Barry Harmsen

“Code is read more often than it is written”

  – common programming adage

As someone who often has to look at/modify QlikView script written by others, I have come to appreciate the value of readable script. Rather than taking an “if it works, it works” attitude, I always try to write my script with readability and consistency in mind. Somewhere down the line someone else (or worse, you) might have to decipher what a specific script is doing in order to modify or extend it.

Today I want to ask your opinion on the readability of a specific part of the QlikView script, indentation of aliases within LOAD statements. To this end, I’ve taken a piece of script from the QlikView System Monitor and formatted the aliases in three different ways. (note: I don’t agree with the absence of white space in functions and between operators, but have left that as is)

Have a look at the three options, and then please answer the two multiple choice questions below.


Option 1

QlikView LOAD script - no indentation of aliases

 


Option 2

QlikView LOAD script - indenting aliased fields

 


Option 3

QlikView LOAD script - indenting all fields

 


You might wonder what the difference between option 2 and 3 is, but I’ll let you discover that for yourself. If you don’t see it, just pick whatever you found most readable.

I’d be very interested in hearing your motivation for picking one option over the other, especially if you answered Something else to the second question. If you’d like to elaborate, or have any other thoughts around this subject, I welcome you to share them in the comments section below.

 

About The Author

Barry Harmsen

Hi there, I'm Barry and I'm a Business Intelligence Consultant at Bitmetric and based in the Netherlands. Originally from a background of 'traditional' Data Warehousing, Business Intelligence and Performance Management, for the past 10 years I have been specializing in Qlik and a more user-centric form of BI. I have done numerous QlikView and Qlik Sense implementations in many different roles and industries. In 2012 I co-authored the book QlikView 11 for Developers. You can follow me on Twitter at @meneerharmsen.

22 Comments

  • 1
    Stefan
    June 16, 2014 - 16:41 | Permalink

    Another thing I do is adding the commas in front i.e.

    LOAD
    @1 AS Field1
    ,@2 AS Field2
    ,@3 AS Field3

    IMHO this makes it easier to find errors and to comment in / out some fields without having to bother about the commas.

    • 2
      Borys Tyukin
      June 16, 2014 - 18:12 | Permalink

      i like commas first as well – my old habbit from SQL world 🙂 that way if you delete a column, you script will still work fine unless it is a very first column

      • 3
        June 16, 2014 - 21:14 | Permalink

        Yeah, I am aware of the comma at the front of the line thing. Personally, I don’t really find it aesthetically pleasing (but that shouldn’t be a reason in itself), don’t really have a problem with missing comma’s and find the overhead of moving all the comma’s too much hassle (the code generated by a table wizard puts them at the end of the line).

        To me, a comma means the end of a ‘statement’, so it should be on the same line as that statement. Also, it just shifts the problem from the last column to the first column, although it’s more likely that you’ll be adding extra fields at the end instead of extra fields at the start.

        Most of this probably sounds arbitrary, my opinion is you should come up with some sort of convention for yourself and stick with it. And it works best if this convention is something that’s commonly used, as mixing two different conventions (for example, when you inherit work from another developer) is probably worse than no convention.

  • 4
    June 16, 2014 - 16:58 | Permalink

    Hi Barry,

    A lot of the code I work with I inherit, so mine is a right mix. Probably a hybrid of one and two. If I wanted to enforce the clean look of 3 I would probably add the (perhaps superfluous) line:

    Type AS Type,

    The other hot battle with code (particularly SQL code) is commas to the left or to the right of each line. Personally, mine go to the right.

    Steve

  • 5
    Borys Tyukin
    June 16, 2014 - 18:18 | Permalink

    i really like option 2 (3rd one looks confusing to me) and I try to do this formatting whenever I can. However it is easy to get this out of control in QV and it does not help that QV editor does not have any beautifier / formatter which are very common in SQL world. So it is normally a mix of option 1 and 2 – but i would prefer 2.

    For the first column, I align it with the comma in front of the second column:

    LOAD
    c1
    ,c2
    ;

    not :

    LOAD
    c1
    ,c2
    ;

    I also like to align parentheses and type a closing one before i start writing the expressions inside them:

    Also like to align statements and put ; on a separate line like that:

    LOAD
    c1
    ,c2
    ;

    • 6
      June 16, 2014 - 21:16 | Permalink

      Yes, I also put the semicolon on a separate line, it makes it easier to modify/add a where-clause later and creates a natural white-space between statements (I use two additional returns between LOAD statements, with the exception of statements that are directly related to each other, such as JOINs or CONCATENATEs)

  • 7
    June 16, 2014 - 21:18 | Permalink

    It’s interesting to see by the way that (so far) roughly 80% prefer option 2 or 3, but more than 50% say they use option 1 in their scripts.

  • 8
    MarkusG
    June 17, 2014 - 08:30 | Permalink

    I use Option3, but with commas and semikolons as first char, like Borys.
    in SQL i also do
    SELECT [TAB] @1
    ,[TAB] [TAB] @2
    FROM [TAB] my.table
    WHERE [TAB] @1 [TAB] [TAB] ‘myvalue’
    AND [TAB] [TAB] @2 [TAB] ….
    ;
    .. and so on that i have straight columns

    i like it when code is readable without comments.

    happy coding!
    plex

  • 9
    MarkusG
    June 17, 2014 - 08:32 | Permalink

    oh on my comment the brackets went away… (between the Tabs there sould be greater/smaller things.. and so on.. )

  • 10
    Jochem
    June 17, 2014 - 08:53 | Permalink

    I Use option 2
    I Also Use the same statement as Steve for aligning fields without a alias, example type as type

  • 11
    Nenad
    June 17, 2014 - 15:06 | Permalink

    Hi Barry,

    Being a meticulous person by nature I think about that as well and find it hard not to adjust inherited scripts.

    I am not a fan of having so much space between fields and aliases as it takes too much time to move your eyes from one point to another on the same line. The same applies for screens in various software packages (ERP and bespoke) where 90% uses your options 2/3. for example it is very easy to mix up a field and alias in lines 2 and 3 as they are too much separated. It is also hard to align it when there are long functions, etc.

    In complex function and mathematical expressions I like to put a blank space between parameters/operator to increase readability.

    I use a mix like in the following excerpt:
    load
    upper(@5) as AuthenticatedUser,
    @1 as AuditserverStarted,
    @2 as ActionTimeStamp,
    SomeField,
    date(@2) &’_’& time(@2, ‘hh’) &’_’& time(@2, ‘mm’) as DateHourKey,
    Qty * Price as Amount,
    if(index(@6, ‘\’) > 0, subfield(@6, ‘\’, 1), ‘Selection’) as Action,
    textbetween(mid(filename(), 1, index(filename(), ‘.’) – 1) &’_’, ‘_’, ‘_’) as QVSClusterNodeA

    Cheers

  • 12
    Nenad
    June 17, 2014 - 15:09 | Permalink

    I see now it is not seen correctly as I also like indenting in nested functions/loops and load statmemtns like here:
    load
    …Field1 as Description1,
    …Field2 as Description2

  • 13
    Brian Garland
    June 17, 2014 - 20:24 | Permalink

    I want the fields on their own lines so I always put the LOAD statement on a line by itself by hitting enter before that first field. Then I highlight the entire field list and hit Shift-Tab 5 times, followed by a single tab. That gets all the fields lined up and indented to the first tab stop. And while the fields are still highlighted I right-click and comment them all out. I then just uncomment the ones I want to use (at least during initial development). It’s a little bit of a pain to do this every time but it’s second nature for me now. Too bad their isn’t a way to customize the layout the wizards use.

  • 14
    Bob Dawes
    June 17, 2014 - 20:24 | Permalink

    Back in the days of COBOL, we were taught to put the PIC(X) phrase in column 44, always.

    Rebelling from that formality, I go with a hybrid most similar to your option 2, but rather than put the AS far enough to the right to allow for the longest input syntax, I’ll group them in sections so that the AS is not more than an inch or two away from the statement that it is modifying.

  • 15
    Erik de ruiter
    June 18, 2014 - 15:16 | Permalink

    I do it like this;

    Load
    A,
    B,
    C,
    Capitalize(subfield(c,’;’,1) as C,
    Capitalize(subfield(c,’;’,1) as D,
    Capitalize(subfield(c,’;’,1) as E
    Capitalize(subfield(c,’;’,1) as F,
    G,
    G as H,
    G as J,
    G as K,
    L,
    M
    From ……..

    • 16
      Erik de ruiter
      June 18, 2014 - 15:22 | Permalink

      By posting my comment the tabs where deleted. Before every alias i use a tab, for the first alias 1 tab, by the second alias 2 tabs, thirt alias 3 tabs etc etc.
      ……So the
      ………..Script look
      ………………….like this

      (Whitout the dots)

      This is not visible in my comment, sorry.

  • 17
    Lior A
    June 18, 2014 - 19:19 | Permalink

    I agree that option 3 is neater. I personally prefer to group related variables together (if order is not important), add new lines between expressions, and wrap my named variables in brackets.

    Here’s an example:

    LOAD
    Autonumber(@1) as [%some ID],
    Autonumber(@2) as [%someother ID],

    @3 as [Something 1],
    @4 as [Something 2],
    Upper(@5) as [Something Else],

    // Format date as following “MM-DD-YY_hh_mm”:
    Date(@5, ‘MM-DD-YY’) & ‘_’ & Time(@6, ‘hh’) & ‘_’ & Time(@6, ‘mm’) as [Some Date],
    Date(@4) as [Some Other Date],

    // Format version information in a meaningful way, i.e. “Product Version #”:
    Pick(
    WildMatch(variable, ‘abc1’, ‘abc2’, ‘abc3’),
    ‘ABC version 1’,
    ‘ABC Version 2’,
    ‘ABC Version 3’
    ) as [Version]

  • 18
    Werner
    June 20, 2014 - 21:22 | Permalink

    Hi Barry,

    Good topic!
    I also put a comma first and semi colon on a new line
    I use tabs after a comma.

    Example:
    LOAD
    Field1
    , Field2
    , IF(Field1 = 1, ‘ABC’, ‘DEF’) AS Field3
    RESIDENT table
    WHERE
    Field2 = 2014
    ;

  • 19
    PingPing
    July 26, 2014 - 18:58 | Permalink

    I use Option 3 as I find it the easiest to follow. Unfortunately QlikView does not allow prefixing a field definition with its alias, eg. [Alias] = ColumnName which is what I like to do in SQL Server because I don’t code in Arabic ( read left-to-right) but, hey, Option 3 is the next best thing.

  • 20
    Mike
    August 8, 2014 - 16:14 | Permalink

    Howdy,

    I use Option 1 although I did dabble for a while moving all the aliases to the right. However as I always emphasis white space between blocks of code I found it was making it more difficult to quickly identify an expression as there was too much white space (probably moving them too far).

    For me personally indenting blocks of code, white space between code and commenting a lot makes code more readable.

    Also comma at the end of the line obviously :O).

  • 21
    Marcus Sommer
    August 31, 2014 - 11:18 | Permalink

    Hi Barry,

    I’ll prefer a mixed code-structure depending on the kind of load-statement and the amount of fields and transformations. Often I try to write very compact code with several fields in one line – if the amount of fields are larger and/or the transformations more complex I use indentations and/or cluster the fields. The readability from option 2/3 isn’t good enough to make such more efforts.

    For me it’s more important to have a better view on the chain of load-statements.

    – Marcus

  • 22
    Tom Weissmann
    October 20, 2014 - 03:43 | Permalink

    I’m in favour of aligning the aliases (option 2) and using leading commas. Both require a proper editor that can automatically align and that provides a column selection. Currently Notepad++ is the only half-capable editor out there with a Qlikview mode.

    Option 3 I find very confusing because there’s nothing on the left to correspond to the unaliased fieldnames on the right, but I automatically look for it and get lost. In any case it seems unnecessary to mix aliased and unaliased fields like this (and certainly not positional and named fields…)

    I like to use preceding loads to assign meaningful names to everything before doing any logic. It can be a good way to separate blocks of logic as well.

    Load
    AuthenticatedUser
    , AuditServerStarted
    , AuditMessage
    , SomeField
    , AnotherField

    , Object
    , Action
    , SheetID
    , SelValues
    , QVSClustNodeAudit

    , Timestamp
    , Timestamp As ActionTimestamp
    , DateHourKey
    , Date2

    , SessionDocName
    , DocFolder
    , DocFullPath
    ;

    // QVS Cluster Node comes from the file base name
    Load *
    , SubField(_FileBaseName & '_', '_', 2) As QVSClustNodeAudit
    ;

    // Parse AuditMessage
    Load *
    , SubField(AuditMessage, '\', -1) As Object
    , If( Match(AuditMessage, '\')
    , SubField(AuditMessage, '\', 1)
    , 'Selection') As Action
    , If( Match(AuditMessage, 'Activated sheet Document')
    , SubField(AuditMessage, '\', 2)
    ) As SheetID
    ;

    // Split DocFullPath into folder and document
    // There should be a function for this...
    Load *
    , Capitalize(
    SubField(DocFullPath, '\', -1)) As SessionDocName
    , Left(DocFullPath
    , Index(DocFullPath, '\', -1) // last path separator
    - 1) As DocFolder
    ;

    // Cast / convert / format
    Load *
    , Upper(_Username) As AuthenticatedUser
    , Date(Floor(Timestamp)) As Date2
    , Date(Timestamp, 'dd_MM_yyyy_hh_mm') As DateHourKey
    ;

    Load
    @1 As AuditServerStarted
    , @2 As Timestamp
    , @3 As DocFullPath
    // @4 is not used?
    , @5 As _Username
    , @6 As AuditMessage
    , @7 As SelValues
    // ? mixing positional and named fields ?
    , FileBaseName() As _FileBaseName
    , Type
    , SomeField
    , AnotherField

  • Leave a Reply

    Your email address will not be published. Required fields are marked *

    Read previous post:
    ABC: Always Be Closing

    A very short tip on writing QlikView expressions that might save you a lot of headaches. In my last online...

    Close