Saturday, August 21, 2010

The easy way out

I was just reading an article by Andrei Alexandrescu called "The Case for D". In it, he compares the traditional C "hello world" program to its D counterpart. Consider the following example:
#include <stdio.h>

main()
{
    printf("Hello, world\n");
}
If I compile this with gcc -o hello hello.c, I get an executable (without any warnings from the compiler).  It runs, and the return value is 13.  Wha-what?  Return value?  I didn't return anything.  But printf() did.

~/stuff> ./hello
Hello, world
~/stuff> echo $?
13
See?  The thirteen happens to be the return value of printf(): the 13 characters in "Hello, world\n" that got printed out to the console.  Because main never executed a return statement of its own, that leftover value is what the shell ended up seeing as the exit status.  Nothing in the C standard guarantees this; it is simply what happened with this compiler on this machine.  The above program is not exactly conforming C, neither to C89 nor to C99.  In C89, main may be declared as above (the return type defaults to int), but falling off the end without a return statement leaves the exit status undefined.  Under C99, main must be declared as one of the following:
int main(void)
int main(int argc, char *argv[]) /* int main(int argc, char **argv)
                             * is also acceptable
                             */
Again, C89 expects an explicit return at the end of main.  C99 does not: if control reaches the closing brace of main without one, a return value of zero is implied.
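
To see where the 13 comes from, here is a minimal sketch. The function name hello_count is my own invention for illustration; the only real point is that printf() returns the number of characters it wrote, or a negative value if it failed.

```c
#include <stdio.h>

/* hello_count is a hypothetical name, used only for this illustration.
 * It prints the greeting and hands back whatever printf() returned:
 * the number of characters written, or a negative value on failure. */
int hello_count(void)
{
    return printf("Hello, world\n");
}
```

Calling hello_count() from main and printing the result should show 13 when the output succeeds, one for each character in the string including the newline.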

Andrei Alexandrescu argues that this all creates incorrect behaviour.  If we write the classic hello world program, such as the first example [1], then a weird value gets returned from main. That is not really desirable at all.  If we add a return (under C89), or if one is implied (C99), we get rid of the weird values being sent back to the OS. This appears to fix the problem. Or does it?

What happens if printf() fails?  It returns a negative value.  Do we let printf() happily fail, and then return 0 (which the C standard defines as success)?  That does not make sense either.  His suggestion was an altered "hello world" program:
#include <stdio.h>

int main(void)
{
    return printf("Hello, world!\n") < 0;
}

This might even make an experienced C programmer do a double take.  For those who may not get what it is doing: the return value of printf() is compared to 0.  If it is less than 0 (meaning printf() failed and returned a negative value), the comparison is true, and in C a true comparison has the value 1.  That non-zero value gets returned from main.  If the return from printf() is not less than zero (meaning it succeeded and returned the number of characters printed), the comparison is false, and a false comparison has the value 0, which gets returned from main.  In short, if the call to printf() fails, a non-zero value is returned, indicating some sort of failure to the OS.  If the call succeeds, 0 is returned, indicating all is well.  This is, in my opinion, a truly elegant bit of code.
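
The comparison trick can be pulled out on its own to make the mapping explicit. This is only a sketch: exit_status_from is a hypothetical helper of mine, not something from Alexandrescu's article.

```c
/* exit_status_from is a hypothetical helper, not part of the article.
 * The < operator yields exactly 1 when the comparison holds and 0
 * otherwise, so a negative printf() result becomes 1 (failure) and
 * any other result becomes 0 (success). */
int exit_status_from(int printf_result)
{
    return printf_result < 0;
}
```

With this in place, exit_status_from(13) is 0 and exit_status_from(-1) is 1, which is exactly what the one-line return statement above hands back to the OS.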


At any rate, the point of this part of his article is not how to fix C, but that D is the preferred option because it does things more correctly.  According to Alexandrescu,
"D attempts not only to allow you to do the right thing, it systematically attempts to make the right thing synonymous to the path of least resistance whenever possible. And it turns out they can be synonymous more often than one might think."

From what I can glean from the article, I find myself having to agree. I mean, the "right thing" for C is to check those return values.  But this is not _at all_ the path of least resistance.  His modified example is for the simplest bit of code.  A less refined approach might look something like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    char message[] = "Hello, world\n";
    size_t message_len;

    message_len = strlen(message);
    if (printf("%s", message) != (int) message_len) {
        fprintf(stderr, "%s\n", "Call to printf() failed!");
        exit(EXIT_FAILURE);
    }

    return 0;
}

As you can see, it gets real ugly, real fast. Furthermore, while it might deal with one problem, it also causes another: if we are to check the return values of all functions, then we must also deal with the call to fprintf() (I will leave that as an exercise to the reader). All of this for one of the simplest programs you can write!  The point is that while we don't have to go to these lengths for a simple "hello world" program (recall the elegant example by Alexandrescu), we do have to address these issues in production code. In truth, it often seems easier to ignore some things than to deal with them. But this will come back to bite you like a rabid dog. Wouldn't it be nice if some of this stuff were covered by the language so that you, the programmer, could get on with writing good code? I think that D just might be headed in that direction.
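
For what it's worth, one way production C code tends to keep this manageable is to push the checking into a small helper. The sketch below is an assumption on my part, not anything from the article: say_checked is a hypothetical name, and it flushes the stream as well, since output is usually buffered and a write error may not show up until the buffer is actually written out.

```c
#include <stdio.h>

/* say_checked is a hypothetical helper name.
 * Returns 0 if msg was written and flushed to out, 1 otherwise. */
int say_checked(FILE *out, const char *msg)
{
    if (fputs(msg, out) == EOF)
        return 1;
    /* Output is usually buffered, so flush to surface late write errors. */
    if (fflush(out) == EOF)
        return 1;
    return 0;
}
```

A main that uses it could be as short as return say_checked(stdout, "Hello, world\n"); but somebody still had to write the helper, which is rather the point.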

1. I believe the example given is straight out of K&R, albeit from the initial introductory/tutorial material of the text. Returning a value from main is explained in more detail further on in the book.
