Greg Beech's Website

Information redundancy in language syntax

Jeff Atwood has quite a number of posts comparing VB and C#, and the general theme is the claim that the verbosity of VB makes it easier to read. One aspect he doesn't appear to have considered, however, is the information redundancy caused by Visual Basic's verbose syntax.

Consider the following equivalent property declarations.

VB:

Public Overridable ReadOnly Property FirstName() As String
    Get
        Return Me.firstNameField
    End Get
End Property

C#:

public virtual string FirstName
{
    get
    {
        return this.firstName;
    }
}

In the VB code there are the following pieces of information:

  1. It is public (from the Public keyword)
  2. It can be overridden (from the Overridable keyword)
  3. It is read-only (from the ReadOnly keyword)
  4. It is a property (from the Property keyword)
  5. It is called FirstName (from the FirstName token)
  6. Properties are really methods underneath (from the presence of parentheses)
  7. It has the type of string (from the String keyword)
  8. It is a read-only property (from the presence of a Get construct and the lack of a Set construct)
  9. It has a backing field called firstNameField (from the Return line)

Compare this with the C# code:

  1. It is public (from the public keyword)
  2. It can be overridden (from the virtual keyword)
  3. It has the type of string (from the string keyword)
  4. It is called FirstName (from the FirstName token)
  5. It is a read-only property (from the presence of a get construct and the lack of a set construct)
  6. It has a backing field (from the return line)

When looking at the C# code you can see from the shape of the code alone, without reading anything, that it's a read-only property. You then scan the top line and pick up, in sequence, that it's a public, overridable string called FirstName. The fact that it has a backing field can be inferred because it returns something, and we don't even need to think about its name because we know the field will be the property name with a lower-case first letter. So that's one piece of information you see, four that you read, and one you infer.

With the VB code the process is a bit more complex. The initial piece of information from the code shape is that it is a read-only property, although this is a little more disguised than in the C# because the parentheses make it look as if it could be a method. Next you scan the top line and see it's public and overridable, then that it's read-only... but hang on a second, we already knew that from the code shape, so ditch that piece of information. Next we see it's a property, but hang on again, we already knew that, so ditch that piece of information too. Next we see it's called FirstName, then that properties are really methods underneath... but that's not really important, so let's forget that. Eventually we see it's a string, and then we finish by remembering what the backing field is called, because we can't infer that it is the lower-case version of the property name: VB is case-insensitive.

Phew! To parse the VB property we had to see one piece of information and read eight other pieces, two of which we discarded because they were redundant and one of which we discarded because it was irrelevant. And we still have to remember an additional piece of information as we can't trivially infer the backing field name.

When you consider that the human brain can typically hold only between 5 and 9 pieces of information at once, it's worth looking at how this affects the thought process. Reading the C# code you build up 5 pieces of information linearly with no filtering (the sixth, the backing field, is inferred rather than read), which anybody can handle in a single pass because it's at the lower limit. However, when reading the VB code, unless you're one of the lucky people who can store all 9 pieces of information and then filter them down to 6, you need to filter the information on the fly so you don't overflow, and if you don't filter out the redundant pieces quickly enough you'll forget the important ones.

It doesn't take too long for this process to become so quick that you won't even realise you're doing it, but you are. Every time you see a piece of information, you need to filter it and decide whether it is important enough to register. It's very hard to filter information out of things you're focusing on. If you don't believe me, try reading this paragraph again and see if you can automatically filter out all the adverbs (too, so, enough, etc.) which are redundant pieces of information, unnecessary for comprehension of the point.

You may think I'm picking on properties as one of the worst offenders, but this type of redundancy is all over the language: having to declare a method as a Function or a Sub when it's clear from looking at it that it's a method; having to explicitly declare parameters as ByVal or ByRef even though just about every parameter ever is passed by value; the As keyword appearing all over the place purely as padding; the AddressOf keyword, even though it's clear from the context that you want to assign a method to a delegate... enough already!
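To make those complaints concrete, here is a small sketch showing each of those keywords in one place, with a rough C# equivalent in a comment above each declaration. The names RedundancyDemo, Log, Add and logger are made up purely for illustration and don't come from any real code base.

```vb
Imports System

Module RedundancyDemo
    ' Sub: no return value. ByVal and As are spelled out for the parameter.
    ' Rough C# equivalent: static void Log(string message) { ... }
    Sub Log(ByVal message As String)
        Console.WriteLine(message)
    End Sub

    ' Function: returns a value, which the trailing "As Integer" also tells us.
    ' Rough C# equivalent: static int Add(int x, int y) { return x + y; }
    Function Add(ByVal x As Integer, ByVal y As Integer) As Integer
        Return x + y
    End Function

    Sub Main()
        ' AddressOf is required to turn the method into a delegate instance.
        ' Rough C# equivalent: Action<string> logger = Log;
        Dim logger As Action(Of String) = AddressOf Log
        logger("2 + 3 = " & Add(2, 3).ToString())
    End Sub
End Module
```

Count the tokens in the Add declaration that carry no information the C# line lacks: Function, two ByVals and three As keywords.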

When designing a language syntax, why include redundant information that you need to filter out? It's an unnecessary distraction from the real task of comprehending the code. As such, I'm forced to conclude that C# is easier to read than VB. Sorry, Jeff.


Posted Feb 04 2008, 09:39 PM by Greg Beech

Comments

Greg Beech's Tech Blog wrote To var or not to var, implicit typing is the question
on 03-24-2008 10:56 PM

The introduction of the var keyword in C# 3.0 was required to support anonymous types, however it may

Copyright (C) Greg Beech. All rights reserved.