Day 009: Shortcut Tax Repaid
On Day 6 I copied exercise solutions from Stack Overflow. Got caught. Wrote about it. Said I would do better. Today was the receipt.
What I Did
Ten exercises. 1-3 through 1-12. All rewritten from scratch on a Linux machine. No peeking at the versions on my normal workstation. No Stack Overflow. Just the exercise prompts from K&R and whatever I actually understood.
Some of them came fast. The temperature table exercises were muscle memory at this point. Print a heading. Reverse the loop. Change the formula. Those took seconds.
The verification exercises made me think. Printing the value of
getchar() != EOF required understanding that a comparison in C
is an expression that evaluates to 0 or 1. Printing EOF itself
gave me -1 and reminded me why we declare the input variable as
int instead of char. A char has no spare value for EOF: whatever bit
pattern stands in for -1 is also a legitimate input byte.
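A minimal sketch of both observations. The function names are made up for illustration. It assumes EOF is -1 (the value printed above) and that byte 255 narrows to -1 in a signed char, which is what gcc and clang do on typical Linux systems.

```c
#include <stdio.h>

/* A comparison is an ordinary int expression: it yields 1 or 0. */
int not_eof(int c)
{
    return c != EOF;
}

/* Hypothetical helper: narrow getchar()'s return value to a signed char,
   as if the input variable had been declared char, then test for EOF.
   Assumes EOF is -1 and that 255 narrows to -1 (implementation-defined,
   but the usual behavior with gcc and clang on Linux). */
int narrowed_equals_eof(int c)
{
    signed char narrowed = (signed char)c;
    return narrowed == EOF;
}
```

As an int, byte 255 and EOF stay distinct. Narrowed to a signed char, byte 255 is mistaken for end of input. With an unsigned char the opposite failure happens: the test against EOF never matches.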
Exercise 1-9 was the flag variable pattern. Track whether the last character was a blank. Only print a blank if the previous character was not one. The sequencing matters. Reset the flag, check the flag, set the flag. That order is not arbitrary.
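A minimal sketch of the flag pattern. It works on a string instead of a getchar() loop so it is easy to check. squeeze_blanks and last_was_blank are names chosen for illustration, not from K&R.

```c
#include <stddef.h>

/* Copy in to out, replacing each run of blanks with a single blank.
   out needs room for strlen(in) + 1 bytes. Returns the output length. */
size_t squeeze_blanks(const char *in, char *out)
{
    size_t n = 0;
    int last_was_blank = 0;          /* the flag */

    for (; *in != '\0'; ++in) {
        if (*in == ' ') {
            if (!last_was_blank)     /* check the flag before emitting */
                out[n++] = ' ';
            last_was_blank = 1;      /* set it only after the check */
        } else {
            out[n++] = *in;
            last_was_blank = 0;      /* any other character resets it */
        }
    }
    out[n] = '\0';
    return n;
}
```

Setting the flag before checking it would swallow every blank, including the first one in a run. That is why the order matters.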
Exercise 1-10 was the redemption round. This is the one I copied on
Day 6. Today I wrote it clean. Tabs become \t, backspaces become
\b, backslashes become \\. Testing it taught me something new.
Typing a tab at the shell prompt triggers completion instead of
inserting a literal tab character. Had to use echo -e with escape
sequences piped into the program to actually verify it worked.
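A sketch of the replacement logic, written against a string so it can be checked without a terminal. make_visible is a hypothetical name. The original exercise does the same thing one character at a time with getchar() and putchar().

```c
#include <stddef.h>

/* Copy in to out, spelling out tab, backspace and backslash as the
   two-character sequences \t, \b and \\. out needs room for
   2 * strlen(in) + 1 bytes. Returns the output length. */
size_t make_visible(const char *in, char *out)
{
    size_t n = 0;

    for (; *in != '\0'; ++in) {
        if (*in == '\t') {
            out[n++] = '\\';
            out[n++] = 't';
        } else if (*in == '\b') {
            out[n++] = '\\';
            out[n++] = 'b';
        } else if (*in == '\\') {
            out[n++] = '\\';
            out[n++] = '\\';
        } else {
            out[n++] = *in;          /* everything else passes through */
        }
    }
    out[n] = '\0';
    return n;
}
```

With the same branches inside a getchar() loop, piping something like printf 'a\tb\\c\n' into the program is one way to feed it a real tab and a real backslash.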
Exercise 1-12 took thirty minutes. The state machine. IN and OUT
with #define. Read a character. If it is not whitespace and the
state is OUT, transition to IN. If it is whitespace and the state
is IN, print a newline and transition to OUT. I knew the pattern
but rebuilding it without looking at my old code forced me to think
through every condition again. That was the point.
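The transitions above, sketched on a string rather than stdin. IN and OUT follow the K&R naming. one_word_per_line is a made-up function name, and the whitespace set (blank, tab, newline) is the one the book's word-counting example uses.

```c
#include <stddef.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* Copy in to out with each word followed by a newline.
   out needs room for strlen(in) + 1 bytes. Returns the output length. */
size_t one_word_per_line(const char *in, char *out)
{
    size_t n = 0;
    int state = OUT;

    for (; *in != '\0'; ++in) {
        if (*in == ' ' || *in == '\t' || *in == '\n') {
            if (state == IN) {       /* IN -> OUT: the only place a newline is emitted */
                out[n++] = '\n';
                state = OUT;
            }
            /* already OUT: more whitespace changes nothing */
        } else {
            out[n++] = *in;          /* word characters are copied as-is */
            state = IN;
        }
    }
    out[n] = '\0';
    return n;
}
```

If the input ends in the middle of a word, this sketch emits no final newline. Whether to add one is a choice the exercise leaves open.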
After the exercises I set up proper repo hygiene. Created a build/
directory for compiled binaries. Added a .gitignore so executables
never touch version control. Cleaned up inconsistent file naming.
Pushed everything to GitHub.
The Questions That Came Up
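Testing programs that read from stdin is harder than I expected. You
cannot just type a tab at the shell prompt and expect it to pass
through. The shell treats it as a completion request. A backspace
typed at a running program is no better: the terminal's line editing
consumes it before getchar() ever sees it. Piping input with printf,
or echo -e where the shell supports it, solves this but it was not
obvious at first.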
The else versus else if bug bit me on exercise 1-8. Writing
else (c == '\n'); instead of else if (c == '\n') compiled
without errors. The compiler saw a valid expression statement. But
the logic was completely wrong. The ++nl counter ran on every
character instead of only newlines. Syntactically legal. Logically
broken. Silently wrong.
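A side-by-side sketch of the bug and the fix, counting over a string instead of stdin. The struct and function names are hypothetical. With warnings enabled (for example -Wall), gcc and clang typically flag the discarded comparison, which is a good argument for always compiling with warnings on.

```c
struct counts {
    int blanks;
    int tabs;
    int newlines;
};

/* Buggy: "else (c == '\n');" is a complete statement on its own. */
struct counts count_buggy(const char *s)
{
    struct counts n = {0, 0, 0};

    for (; *s != '\0'; ++s) {
        int c = *s;
        if (c == ' ')
            ++n.blanks;
        else if (c == '\t')
            ++n.tabs;
        else (c == '\n');    /* comparison evaluated, result thrown away */
        ++n.newlines;        /* not part of the else: runs for every character */
    }
    return n;
}

/* Fixed: the comparison guards the increment. */
struct counts count_fixed(const char *s)
{
    struct counts n = {0, 0, 0};

    for (; *s != '\0'; ++s) {
        int c = *s;
        if (c == ' ')
            ++n.blanks;
        else if (c == '\t')
            ++n.tabs;
        else if (c == '\n')
            ++n.newlines;
    }
    return n;
}
```

For the six-character input "a b\tc\n" the buggy version reports six newlines and the fixed version reports one.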
The Feynman Test
A state machine is a program that remembers one thing: what mode it is in. The word-per-line program only needs to know whether it is currently inside a word or outside one. Every character it reads either keeps it in the same mode or flips it to the other one.
Think of a light switch. You walk through a room and every time you cross a doorway you flip the switch. You do not need to remember every doorway you passed through. You just need to know whether the light is on or off right now. That one piece of information controls your next action.
The power of this pattern is that it turns complicated input into simple decisions. No matter how many spaces or tabs or newlines come in a row, the program only prints one newline because it only reacts to the transition from IN to OUT. Not to being OUT.
Hacker Connection
The else bug in exercise 1-8 is the same class of error as
Apple’s “goto fail” vulnerability from 2014. Code that compiles
and runs but does something the programmer did not intend. In that
case a duplicated goto fail; line jumped past the final signature
check in the SSL/TLS handshake while the error code still said
success. The compiler did not complain. The tests did not catch it.
The code shipped.
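A simplified illustration of the shape of that bug. This is not Apple's code. check_hash and check_signature are hypothetical stand-ins, with 0 meaning success and nonzero meaning failure.

```c
/* Hypothetical checks: 0 means success, nonzero means failure. */
static int check_hash(void)      { return 0; }
static int check_signature(void) { return -1; }   /* this one would reject */

/* Buggy: the second goto is not governed by the if, so it always runs. */
int verify_buggy(void)
{
    int err;

    if ((err = check_hash()) != 0)
        goto fail;
        goto fail;                          /* duplicated line */
    if ((err = check_signature()) != 0)     /* never reached */
        goto fail;
fail:
    return err;                             /* still 0 from check_hash() */
}

/* Fixed: every check gets a chance to fail. */
int verify_fixed(void)
{
    int err;

    if ((err = check_hash()) != 0)
        goto fail;
    if ((err = check_signature()) != 0)
        goto fail;
fail:
    return err;
}
```

The buggy version returns success even though the signature check would have failed. The indentation makes the extra goto look conditional. The grammar says otherwise, same as the stray semicolon in the else bug.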
The state machine pattern shows up everywhere in security tooling. Firewalls use state tables to track TCP connections. IDS engines track protocol states to detect anomalies. Parsers use state machines to process input safely. Getting the transitions wrong in any of those contexts means either dropping legitimate traffic or letting malicious traffic through. The pattern is simple. Getting it right under every edge case is where the discipline lives.
What Is Next
Section 1.6 on arrays. This is where C introduces indexed data in
contiguous memory. Digit counting with ndigit[10]. The expression
c - '0' to convert a digit character to its numeric value. This is the
foundation for understanding buffer overflows and off-by-one errors.
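A small preview sketch of that idea, on a string rather than stdin. count_digits is a made-up name. The ndigit array and the c - '0' expression are the ones named above.

```c
/* Count how often each decimal digit occurs in s.
   ndigit must point to an array of 10 ints. */
void count_digits(const char *s, int ndigit[10])
{
    int i;

    for (i = 0; i < 10; ++i)
        ndigit[i] = 0;

    for (; *s != '\0'; ++s) {
        /* The range test keeps the index inside 0..9. Without it,
           any other character would index outside the array. */
        if (*s >= '0' && *s <= '9')
            ++ndigit[*s - '0'];      /* '7' - '0' is 7 */
    }
}
```

The subtraction works because C guarantees the digit characters '0' through '9' have consecutive codes.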
The hacker track stays warm. Lab is operational. Exercises are clean and pushed. Tomorrow the new material starts again.
Day 9 of 365. The shortcut tax was three days of doubt. The repayment took two hours of honest work.