Archive for October, 2009

From Baker Street To Binary (Henry Ledgard)

Posted in Books on October 18, 2009 by jkiparsky

From Baker Street to Binary (Henry Ledgard, E. Patrick McQuail, & Andrew Singer) – One of the first books I ever read on programming was Henry Ledgard’s BASIC tutorial, which used Sherlock Holmes’ milieu as a pedagogical conceit – Holmes uses the newly-invented Difference Engine to solve cases, and the reader writes or analyzes the programs he uses. It was fun – I was about ten at the time, and missed a lot of the point, but it was still fun and probably influenced the way I think today to some extent. This is not that volume, although it re-uses a chapter from it. It’s more of a tutorial on computers and what they’re capable of, written in 1983. A wonderful historical document, then, showing something of the state of the art 25 years ago. It’s worth reading, just so you can remind yourself when you’re done that this is ten years before the world-wide-whatsit metastasized, and we stepped on the slippery slope down to the mess we’re in today.
In other oddly short periods of time, it’s worth noting that the steam engine in the United States had a run of just over a century, and that when UNIX was being written, steam engines were still being manufactured in this country. File this next to the notion that it’s a hundred years from the Civil War to Woodstock. One human lifespan could encompass the blues from emancipation to Jimi Hendrix.
And that’s why I like to read obsolete technical books – they remind you that life is long and history is short.

Unexpected variable interactions in C

Posted in General on October 16, 2009 by jkiparsky

And now for something completely different…

Doing an assignment for CS240 (C Programming), I wrote myself the sort of error that you have to tell the world about. So I’ll do the next best thing: I’ll make a note of it here, so I know where it is if I ever need it.
It’s actually a pretty cool example of how a piece of C can misfire so subtly that you have to really work to find it.

The assignment is pretty simple: converting an int to hex, and back – the hex, of course, being a char string. Totally useless, since C will do this for you for free, but good practice, and something that I hadn’t actually written before. So here’s what I wrote:

main: call the two functions and report the output

#include the usual stuff
main()
{
    char hexstring[ENOUGH_SPACE];
    int n=0, m=0;
    while (scanf("%d", &n)!=EOF)
    {
        itox(hexstring,n);
        m = xtoi(hexstring);
        printf("n=%12d %s m=%12d\n",n, hexstring, m);
    }
}

itox: convert integer to hexadecimal

char hex[]={'0','1','2','3','4','5','6','7',
            '8','9','A','B','C','D','E','F'};

/* function represents the int n as a hexstring which it places in the hexstring array */
void itox( char hexstring[], int n)
{
    int i=ENOUGH_SPACE;
    hexstring[i]='\0';
    while (i>=0)
    {
        --i;
        hexstring[i]=hex[n%16];
        n/=16;
    }
    // 0/16=0, so leading zeros will be filled in automatically
}

xtoi: convert hex to integer:

/* function converts hexstring array to equivalent integer value */

int xtoi( char hexstring[])
{
    int n = 0; //return integer
    int i = 0; //index
    long int p = 1; //decimal placeholder

    while (hexstring[i]!='\0')
        i++;
    while (i>=0)
    {
        i--;
        if ((hexstring[i]>='0')&&(hexstring[i]<='9'))
            n+= ((hexstring[i] - '0')*p);
        else if ((hexstring[i]>='A')&&(hexstring[i]<='F'))
            n+= ((hexstring[i] - 'A' + 10)*p);

        p*=16;
    }

    return n;
}

I got all of this, cleaned up the usual typos (you always lose a semicolon along the way, that sort of thing) and ran it. Worked great, except it always reported out the input value – n, in main(), the one that is the target of the scanf() and then is never written to again – as 48.
Hm.
I can’t find any place where I’ve changed that value, so let’s see where it changes. I start putting in printf() statements. It’s fine all the way through main, and then it comes back from itox and it’s 48. Okay, that’s easy: it’s impossible. Everyone knows you can’t change a variable in main while you’re in a function, right? Not unless you use pointers and I didn’t use… oh.
Oh. (People with some experience in C are nodding their heads at this point; they’re going back and looking at the code, and they’ve just seen it.)
Okay, for the rest of you, here’s what happened:
I passed the itox function an array of char to hold the text string representing a hex value. Since I have the length of the string defined in a global header, I use that as the length of the string, and set the last char in it to '\0', the string terminator. Then I just divide by sixteen repeatedly, putting the remainders into the array in positions of increasing significance. Worked great, as you’d expect: nothing very difficult about this.

So how does this change the value of a variable in main?

The trouble is in the loop control. As usual, I rolled back down the string from i (length of string, least significant digit, the right side) to 0 (first character in string, most significant digit, the left side). This would be fine, except I put the decrement at the start of the loop body. So if i==0, we go sailing merrily into the loop, and the first thing that happens is --i, so i is now -1. Now the next thing that happens is we assign hexstring[i] a value. That value is pretty much guaranteed – due to the conditions of the program – to be the character '0', or rather, the ASCII value of that character. Which, you will not be surprised to learn, is 48.
What’s happening is that the piece of memory just “below” hexstring[0], that is, hexstring[-1], is in fact my original n (at least with my compiler on my machine; C makes no promises about where variables sit). So when I overshot the start of the string (harmless, right?), I ended up using a pointer (array indexing in C is pointer arithmetic under another name) to directly access a piece of memory that I had no idea I was accessing.
This created a bug that took me half an hour to track down, because it was the sort of thing I’d been taught to think just couldn’t happen.
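
For anyone who wants to poke at the mechanism, here is a minimal sketch of it. In the real program the stray write is undefined behavior and the compiler decides where n lives, so this sketch models the layout by hand instead: one byte array with an int stored directly in front of the string. The names clobbered_value, memory, STRING_START and MODEL_SIZE are made up for this illustration, and the layout is an assumption, not something read off the original stack.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the layout: an int stored directly in front
   of the string buffer, all inside one byte array, so that every
   access below is well defined. */
enum { STRING_START = sizeof(int), MODEL_SIZE = sizeof(int) + 8 };

/* Stores n in the model, performs the stray write one byte before
   the string, and returns n as it reads back afterwards. */
int clobbered_value(int n)
{
    unsigned char memory[MODEL_SIZE] = {0};
    unsigned char *hexstring = memory + STRING_START;
    int result;

    assert(sizeof n == STRING_START);       /* n exactly fills the front */
    memcpy(memory, &n, sizeof n);           /* n sits just below the string */
    hexstring[-1] = '0';                    /* index -1 is the last byte of n */
    memcpy(&result, memory, sizeof result); /* read n back out */
    return result;
}
```

Calling clobbered_value(5) returns something other than 5. Which byte of n gets hit depends on byte order: if the stray byte is the low-order byte of n, a small n reads back as exactly 48 (assuming ASCII); if it is the high-order byte, the result is a much larger number.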

Three things to learn from this. One, of course, is that C is a tricky language, and even the simplest things can creep up and bite you. The second: be careful of your loop counters. Don’t assume that what you meant to do is what you did, and don’t assume that just because it should work fine, it will.
And third: there’s a good reason to know the ASCII values, or at least the rough locations of a-z, A-Z, and 0-9. This would probably have saved me twenty minutes’ debugging, and allowed me to get that cup of coffee I needed before class.
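
As a footnote to the second lesson, here is one way the two functions might be written so the index never leaves the array. It is a sketch under stated assumptions, not the version that was handed in: ENOUGH_SPACE is assumed to be 9 (eight hex digits plus the terminator), since the post never gives its value, and the value is treated as unsigned so that n % 16 can never go negative.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed buffer size: 8 hex digits plus the terminating '\0'.
   The post never gives the real value of ENOUGH_SPACE. */
#define ENOUGH_SPACE 9

static const char hex[] = "0123456789ABCDEF";

/* Writes n as a zero-padded, NUL-terminated hex string.
   hexstring must have room for ENOUGH_SPACE chars; only the low
   8 hex digits of n are kept. */
void itox(char hexstring[], unsigned int n)
{
    int i = ENOUGH_SPACE - 1;      /* last valid index, not the length */
    assert(hexstring != NULL);
    hexstring[i] = '\0';
    while (i > 0) {                /* stop before the index can go negative */
        --i;
        hexstring[i] = hex[n % 16];
        n /= 16;
    }
}

/* Converts a string of upper-case hex digits back to an integer.
   Characters outside 0-9 and A-F are counted as zero. */
unsigned int xtoi(const char hexstring[])
{
    unsigned int n = 0;
    size_t i;
    assert(hexstring != NULL);
    for (i = 0; hexstring[i] != '\0'; i++) {
        unsigned int digit = 0;
        if (hexstring[i] >= '0' && hexstring[i] <= '9')
            digit = (unsigned int)(hexstring[i] - '0');
        else if (hexstring[i] >= 'A' && hexstring[i] <= 'F')
            digit = (unsigned int)(hexstring[i] - 'A' + 10);
        n = n * 16 + digit;        /* shift earlier digits up one place */
    }
    return n;
}
```

With a 9-char buffer, itox(buf, 255) leaves "000000FF" in buf and xtoi(buf) gives back 255. The two changes that matter are starting at ENOUGH_SPACE - 1 and testing i > 0 before the decrement. The "for free" route is the standard library: sprintf(buf, "%08X", n) and strtoul(buf, NULL, 16) should do the same round trip.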