Unix makes Computer Science easy
As I've learned more Computer Science over the years, I've grown to
realize that Unix packs a whole lot of CS power into a bunch of
seemingly small commands with simple interfaces.
Consider the diff utility. It figures out the smallest set of
changes required to transform one file into another. Under the hood
that means solving the longest common subsequence problem, some
pretty heavy computer science, but the end user just tells it which
files to compare and gets back the differences.
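A toy session shows the whole interface (file names hypothetical):

    $ printf 'apple\nbanana\ncherry\n' > old.txt
    $ printf 'apple\nblueberry\ncherry\n' > new.txt
    $ diff old.txt new.txt
    2c2
    < banana
    ---
    > blueberry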
Or how about bc and dc? Arbitrary precision arithmetic was a bit of a
trick in those days, but with Unix, you just run "bc" and type in your
expressions, including exponential or trigonometric functions
(courtesy of the -l math library), and out pop your answers.
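For instance (a() is arctangent in the -l library; output shown as a
modern GNU bc prints it, with a default scale of 20 digits):

    $ bc -l
    4 * a(1)
    3.14159265358979323844
    2^100
    1267650600228229401496703205376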
External sorting takes some skill and work to pull off. To sort a file
that's ten times the size of available RAM, you need to sort lots of
little chunks of the file and merge them together over multiple
passes. Unix's sort utility just does it for you, creating temporary
files as needed and hiding all of the complexity from you. It even
handles variable-length records without batting an eye.
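In practice that's just this (file names hypothetical; -T, which
redirects the temporary files, is in both GNU and BSD sort):

    # One command, however large the file; sort spills sorted runs
    # to temporary files and merges them behind your back.
    $ sort big.txt > big.sorted
    # If /tmp is too small to hold the runs, point them elsewhere.
    $ sort -T /var/tmp big.txt > big.sorted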
Once you've sorted your text file and want to find something in it
quickly, Unix lets you do a binary search over a file of
variable-length records with a single command: look. And you don't
even have to know what log(n) means.
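A toy example (file name hypothetical):

    $ printf 'cherry\nbanana\napple\n' | sort > fruit.txt
    $ look ba fruit.txt
    banana

look binary-searches the sorted file and prints every line that
begins with the prefix you gave it.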
Shell scripting provides something similar to functional programming,
with files treated as lists of lines. The map, reduce, and filter
operations from functional programming show up as familiar commands:
grep filters, sed maps, and wc reduces.
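A pipeline makes the correspondence concrete (log file name
hypothetical):

    grep error server.log |    # filter: keep only the matching lines
      sed 's/error/ERROR/' |   # map: transform each line as it passes
      wc -l                    # reduce: fold the stream to one number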
Awk can do all three. Patterns without actions act as filters. Actions
that produce output perform mappings. Actions that accumulate
information to be spat out by an END block do reductions.
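One sketch of each role, assuming a hypothetical whitespace-separated
data.txt whose third field is numeric:

    # filter: a pattern with no action prints the lines it selects
    awk '$3 > 100' data.txt
    # map: an action with no pattern transforms every line
    awk '{ print $1, $3 * 2 }' data.txt
    # reduce: accumulate per line, report once in the END block
    awk '{ total += $3 } END { print total }' data.txt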
For programmers, things get even better. Topological sorting is where
you take a bunch of things, and a bunch of rules saying that one thing
must come before another thing, and come up with an ordering of the
things that satisfies all of the rules. It's kind of useful when you
need to figure out the order in which to run all of the commands that
build your program. make handles it all in a way that completely hides
the complexity of the topological sort from the programmer.
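A minimal sketch of a Makefile for a hypothetical two-module program
(recipe lines must begin with a tab):

    # make works out that the .o files must exist before the link,
    # and that each .o depends on its sources, no matter what order
    # the rules are written in. That ordering is a topological sort.
    prog: main.o util.o
    	cc -o prog main.o util.o
    main.o: main.c util.h
    	cc -c main.c
    util.o: util.c util.h
    	cc -c util.c

Unix even ships the bare operation as its own command, tsort, which
was traditionally paired with lorder to put object files in the right
order for the loader.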
Compilers, and parsing in particular, used to be deep magic,
understood fully by very few people. lex and yacc made it easy for
anyone to write a few regular expressions and a grammar and get out a
parser (especially if they owned a copy of The Unix Programming
Environment), without any need to worry about the ins and outs of
finite state machines or pushdown automata or a whole bunch of stuff
in the Dragon Book.
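To give the flavor, here's a sketch of a yacc grammar that sums
numbers on a line; the matching lex spec and the main/yylex plumbing
are omitted, so treat it as an outline rather than a drop-in program:

    %{
    #include <stdio.h>
    %}
    %token NUMBER
    %%
    line  : expr '\n'          { printf("%d\n", $1); }
          ;
    expr  : expr '+' NUMBER    { $$ = $1 + $3; }
          | NUMBER
          ;

Behind those few lines, yacc builds the parse tables and the pushdown
automaton that drives them; the grammar writer never sees either.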
And all of this was available in Unix Seventh Edition. In 1979. On a
computer that only gave each program 64k of memory in which to run.
[edit: Changed to reflect the fact that I was talking more about
parsing than compiling in general. From what I have been able to
gather from old books on compilers, parsing used to dominate the
compiler writing effort much more than it does nowadays, but the
distinction is still a valid one. Thanks to simen and Blackheart for
the correction.]