Day 011: The Quiet Byte

K&R 1.8 is short. Call by value. C copies every argument you pass to a function. Arrays are the apparent exception: what gets copied is the address of the first element, not the elements. That one exception shapes the entire security landscape of the language.

What I Did

Worked through sections 1.8 and 1.9. Section 1.8 covers how arguments pass to functions. Every variable gets copied. The function works on the copy. The caller never knows. Simple. Safe. Until arrays show up.

When you pass an array, C passes the address of the first element. Not a copy. The original. The function can reach into the caller’s memory and change whatever it wants. No guardrails.
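A minimal sketch of the difference. The names bump_value, bump_first, and demo_call_by_value are mine, not K&R's. The int argument is a copy, so the caller's variable survives. The array argument is an address, so the write lands in the caller's memory.

```c
#include <stdio.h>

/* Receives a copy of the caller's int. The caller's variable is untouched. */
void bump_value(int n)
{
    n = n + 1;
}

/* Receives the address of the caller's first element.
   The write goes straight into the caller's array. */
void bump_first(int a[])
{
    a[0] = a[0] + 1;
}

/* Returns 1 when x is unchanged and arr[0] was modified in place. */
int demo_call_by_value(void)
{
    int x = 10;
    int arr[3] = {10, 20, 30};

    bump_value(x);    /* x stays 10 */
    bump_first(arr);  /* arr[0] becomes 11 */

    printf("x = %d, arr[0] = %d\n", x, arr[0]);
    return x == 10 && arr[0] == 11;
}
```

Calling demo_call_by_value() from main should report x still at 10 and arr[0] at 11.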

Section 1.9 is where it gets real. Character arrays. The longest-line program. Two functions: getline reads characters into a buffer, copy moves one string into another. C has no string type. It has arrays of characters that end with '\0'. That single byte is the only thing telling the program where the data stops and the unknown begins.
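To make the terminator concrete, here is a small sketch with hypothetical names (my_strlen, storage_of_hello). It walks the array until it hits '\0', the same contract as the standard strlen, and shows that "hello" needs six bytes of storage for five visible characters.

```c
#include <stddef.h>

/* Count characters up to, but not including, the first '\0'.
   There is no length field to consult. The loop only stops at the zero byte. */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* "hello" is five letters plus the terminator the compiler appends. */
size_t storage_of_hello(void)
{
    char word[] = "hello";

    return sizeof word;  /* 6 bytes: 'h' 'e' 'l' 'l' 'o' '\0' */
}
```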

The Questions That Came Up

What happens across multiple files?

K&R assumes the whole program lives in one file. That is a big assumption. If getline and copy were in a separate file, the compiler would not know their signatures when compiling main. You need forward declarations or a header file. Without them, pre-C99 compilers would assume the function returns int and accept whatever arguments it was handed. Silently. Often wrongly. It is a close relative of the problem I hit on Day 1 when main had no return type and clang refused to compile it.
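A sketch of what the declarations buy you, squeezed into one listing so it compiles on its own. I renamed the functions to read_line and copy_line (my names, chosen to avoid clashing with the POSIX getline). The file names lines.h and lines.c in the comments are hypothetical. The function bodies follow the shape of the K&R versions but are not verbatim.

```c
#include <stdio.h>

/* Prototypes. In a multi-file build these would live in a header
   (hypothetically lines.h) that both the caller and lines.c include. */
int read_line(char line[], int maxline);
void copy_line(char to[], const char from[]);

/* A caller that compiles cleanly because the prototypes came first. */
int copied_length(const char from[])
{
    char buf[64];  /* caller's responsibility: from must fit in 64 bytes */
    int n = 0;

    copy_line(buf, from);
    while (buf[n] != '\0')
        n++;
    return n;
}

/* Definitions. These could sit in a separate lines.c. */

/* Read one line from stdin into line, at most maxline - 1 characters.
   Assumes maxline >= 1. Returns the number of characters stored. */
int read_line(char line[], int maxline)
{
    int c = 0, i;

    for (i = 0; i < maxline - 1 && (c = getchar()) != EOF && c != '\n'; i++)
        line[i] = (char)c;
    if (c == '\n')
        line[i++] = (char)c;
    line[i] = '\0';
    return i;
}

/* Copy from into to, terminator included. No size check, as in the book. */
void copy_line(char to[], const char from[])
{
    int i = 0;

    while ((to[i] = from[i]) != '\0')
        i++;
}
```

Delete the two prototype lines and a modern compiler should reject the call in copied_length instead of guessing.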

Is null really just zero?

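I always thought a null pointer was something undefined. It is not. '\0' is a character constant with the integer value 0. A null pointer is a pointer that points at no object. You write it as 0 or NULL in source, and on most platforms it is stored as address 0, though the standard does not require that. Both zero. Different meanings. The confusion between them has caused real bugs in real codebases.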
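A short sketch of the distinction, using helper names I made up (nul_char_is_zero, is_null, is_empty). An empty string is a valid pointer to a '\0' byte. A null pointer is no string at all.

```c
#include <stddef.h>

/* '\0' is just the integer 0 written as a character constant. */
int nul_char_is_zero(void)
{
    return '\0' == 0;
}

/* A null pointer points at no object. Do not dereference it. */
int is_null(const char *p)
{
    return p == NULL;
}

/* An empty string is a real pointer whose first byte is '\0'.
   The NULL check must come first so s[0] is never read through NULL. */
int is_empty(const char *s)
{
    return s != NULL && s[0] == '\0';
}
```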

What about the overflow K&R admits to?

K&R says this directly: getline checks for overflow. copy does not. Their reasoning is that the caller of copy already knows how big the strings are. That sentence is the design philosophy that launched an entire vulnerability class. The programmer assumed the caller would behave. Attackers do not behave.
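For contrast, here is one way a checked copy could look. This is my sketch, not anything from K&R: the name copy_bounded, the extra size parameter, and the truncate-and-report behavior are all assumptions I chose for illustration.

```c
#include <stddef.h>

/* Copy from into to, writing at most size bytes including the '\0'.
   Returns 1 if the whole string fit, 0 if it was truncated.
   With size 0 nothing is written and 0 is returned. */
int copy_bounded(char to[], size_t size, const char from[])
{
    size_t i;

    if (size == 0)
        return 0;

    /* Stop one short of size so there is always room for the terminator. */
    for (i = 0; i + 1 < size && from[i] != '\0'; i++)
        to[i] = from[i];
    to[i] = '\0';

    return from[i] == '\0';
}
```

The destination size travels with the call, so the function no longer has to trust that the caller measured correctly. It still has to trust that the caller passed the true size.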

The Feynman Test

A C string is a row of characters in memory with a zero at the end. That zero is how every function knows when to stop reading. There is no length field. No boundary marker the hardware enforces. Just a convention that says “when you see zero, stop.”

If the string fills the array exactly and there is no room for the zero, the program does not stop. It keeps reading into whatever memory comes next. Maybe that memory happens to be zero already. The program works fine. Ship it. Then the memory layout changes on a different machine or a different day and the program reads into something that is not zero. Now it is reading data it was never supposed to see.

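That is not a crash. That is a data leak. Heartbleed was the same class of bug, a buffer over-read, though its trigger was a trusted length field rather than a missing terminator.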
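Reading past the end of an array is undefined behavior, so I will not demonstrate the leak itself. What can be shown safely is the setup: a string that fills its array exactly, and a bounded check that notices the terminator is missing. The helper names is_terminated and demo_unterminated are hypothetical. memchr and memcpy are the standard library functions.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if a '\0' appears somewhere in buf[0..size-1], else 0.
   memchr never looks past size bytes, so the check itself cannot over-read. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Five letters in a five-byte array: no room left for the terminator. */
int demo_unterminated(void)
{
    char exact[5];

    memcpy(exact, "hello", 5);  /* copies the letters, not the '\0' */

    /* Calling strlen(exact) or printing it with %s here would read past
       the array. The bounded check reports the problem instead. */
    return is_terminated(exact, sizeof exact);  /* 0: not terminated */
}
```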

Hacker Connection

On Day 10 I watched a buffer overflow in GDB. I saw data land past the end of the array and corrupt the stack. Today I understand why that happens at the language level. C does not copy arrays when passing them to functions. Functions write directly into the caller’s memory. And the only thing marking the boundary of a string is a single byte that the programmer is responsible for placing correctly.

The copy function in K&R does not check if the destination is big enough. K&R acknowledges this. Multiply that decision across fifty years of C code and you get the buffer overflow as one of the most exploited vulnerability classes in the history of computing. The null terminator is a gentleman’s agreement. Exploitation begins when someone stops being polite.

What Is Next

Exercises 1-16, 1-17, and 1-18. These are the character array exercises and they build directly on the longest-line program. Then section 1.10 on scope and external variables. That is the last section of Chapter 1. The end of the foundation.


Day 11 of 365. The most dangerous byte is the one that isn’t there.