| <html><head><title>The design of toybox</title></head> |
| <!--#include file="header.html" --> |
| |
| <a name="goals"><b><h2><a href="#goals">Design goals</a></h2></b> |
| |
| <p>Toybox should be simple, small, fast, and full featured. In that order.</p> |
| |
| <p>When these goals need to be balanced off against each other, keeping the code |
| as simple as it can be to do what it does is the most important (and hardest) |
| goal. Then keeping it small is slightly more important than making it fast. |
| Features are the reason we write code in the first place but this has all |
| been implemented before so if we can't do a better job why bother?</p> |
| |
| <p>It should be possible to get 80% of the way to each goal |
| before they really start to fight. Here they are in reverse order |
| of importance:</p> |
| |
| <b><h3>Features</h3></b> |
| |
| <p>The hard part is deciding what NOT to include. |
| A project without boundaries will bloat itself |
| to death. One of the hardest but most important things a project must |
| do is draw a line and say "no, this is somebody else's problem, not |
| something we should do."</p> |
| |
| <p>Some things are simply outside the scope of the project: even though |
| posix defines commands for compiling and linking, we're not going to include |
| a compiler or linker (and support for a potentially infinite number of hardware |
| targets). And until somebody comes up with a ~30k ssh implementation (with |
| a crypto algorithm that won't need replacing every 5 years), we're |
| going to point you at dropbear or bearssl.</p> |
| |
| <p>The <a href=roadmap.html>roadmap</a> has the list of features we're |
| trying to implement, and the reasons why we decided to include those |
| features. After the 1.0 release some of that material may get moved here, |
| but for now it needs its own page.</p> |
| |
| <p>There are potential features (such as a screen/tmux implementation) |
| that might be worth adding after 1.0, in part because they could share |
| infrastructure with things like "less" and "vi" so might be less work for |
| us to do than an external from-scratch implementation. But for now, major |
| new features outside posix, android's existing commands, and the needs of |
| development systems, are a distraction from the 1.0 release.</p> |
| |
| <b><h3>Speed</h3></b> |
| |
| <p>It's easy to say lots about optimizing for speed (which is why this section |
| is so long), but at the same time it's the optimization we care the least about. |
| The essence of speed is being as efficient as possible, which means doing as |
| little work as possible. A design that's small and simple gets you 90% of the |
| way there, and most of the rest is either fine-tuning or more trouble than |
| it's worth (and often actually counterproductive). Still, here's some |
| advice:</p> |
| |
| <p>First, understand the darn problem you're trying to solve. You'd think |
| I wouldn't have to say this, but I do. Trying to find a faster sorting |
| algorithm is no substitute for figuring out a way to skip the sorting step |
| entirely. The fastest way to do anything is not to have to do it at all, |
| and _all_ optimization boils down to avoiding unnecessary work.</p> |
| |
| <p>Speed is easy to measure; there are dozens of profiling tools for Linux |
| (although personally I find the "time" command a good starting place). |
| Don't waste too much time trying to optimize something you can't measure, |
| and there's no much point speeding up things you don't spend much time doing |
| anyway.</p> |
| |
| <p>Understand the difference between throughput and latency. Faster |
| processors improve throughput, but don't always do much for latency. |
| After 30 years of Moore's Law, most of the remaining problems are latency, |
| not throughput. (There are of course a few exceptions, like data compression |
| code, encryption, rsync...) Worry about throughput inside long-running |
| loops, and worry about latency everywhere else. (And don't worry too much |
| about avoiding system calls or function calls or anything else in the name |
| of speed unless you are in the middle of a tight loop that's you've already |
| proven isn't running fast enough.)</p> |
| |
| <p>"Locality of reference" is generally nice, in all sorts of contexts. |
| It's obvious that waiting for disk access is 1000x slower than doing stuff in |
| RAM (and making the disk seek is 10x slower than sequential reads/writes), |
| but it's just as true that a loop which stays in L1 cache is many times faster |
| than a loop that has to wait for a DRAM fetch on each iteration. Don't worry |
| about whether "&" is faster than "%" until your executable loop stays in L1 |
| cache and the data access is fetching cache lines intelligently. (To |
| understand DRAM, L1, and L2 cache, read Hannibal's marvelous ram guide at Ars |
| Technica: |
| <a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part1-2.html>part one</a>, |
| <a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part2-1.html>part two</a>, |
| <a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part3-1.html>part three</a>, |
| plus this |
| <a href=http://arstechnica.com/articles/paedia/cpu/caching.ars/1>article on |
| cacheing</a>, and this one on |
| <a href=http://arstechnica.com/articles/paedia/cpu/bandwidth-latency.ars>bandwidth |
| and latency</a>. |
| And there's <a href=http://arstechnica.com/paedia/index.html>more where that came from</a>.) |
| Running out of L1 cache can execute one instruction per clock cycle, going |
| to L2 cache costs a dozen or so clock cycles, and waiting for a worst case dram |
| fetch (round trip latency with a bank switch) can cost thousands of |
| clock cycles. (Historically, this disparity has gotten worse with time, |
| just like the speed hit for swapping to disk. These days, a _big_ L1 cache |
| is 128k and a big L2 cache is a couple of megabytes. A cheap low-power |
| embedded processor may have 8k of L1 cache and no L2.)</p> |
| |
| <p>Learn how <a href=http://nommu.org/memory-faq.txt>virtual memory and |
| memory managment units work</a>. Don't touch |
| memory you don't have to. Even just reading memory evicts stuff from L1 and L2 |
| cache, which may have to be read back in later. Writing memory can force the |
| operating system to break copy-on-write, which allocates more memory. (The |
| memory returned by malloc() is only a virtual allocation, filled with lots of |
| copy-on-write mappings of the zero page. Actual physical pages get allocated |
| when the copy-on-write gets broken by writing to the virtual page. This |
| is why checking the return value of malloc() isn't very useful anymore, it |
| only detects running out of virtual memory, not physical memory. Unless |
| you're using a <a href=http://nommu.org>NOMMU system</a>, where all bets are off.)</p> |
| |
| <p>Don't think that just because you don't have a swap file the system can't |
| start swap thrashing: any file backed page (ala mmap) can be evicted, and |
| there's a reason all running programs require an executable file (they're |
| mmaped, and can be flushed back to disk when memory is short). And long |
| before that, disk cache gets reclaimed and has to be read back in. When the |
| operating system really can't free up any more pages it triggers the out of |
| memory killer to free up pages by killing processes (the alternative is the |
| entire OS freezing solid). Modern operating systems seldom run out of |
| memory gracefully.</p> |
| |
| <p>Also, it's better to be simple than clever. Many people think that mmap() |
| is faster than read() because it avoids a copy, but twiddling with the memory |
| management is itself slow, and can cause unnecessary CPU cache flushes. And |
| if a read faults in dozens of pages sequentially, but your mmap iterates |
| backwards through a file (causing lots of seeks, each of which your program |
| blocks waiting for), the read can be many times faster. On the other hand, the |
| mmap can sometimes use less memory, since the memory provided by mmap |
| comes from the page cache (allocated anyway), and it can be faster if you're |
| doing a lot of different updates to the same area. The moral? Measure, then |
| try to speed things up, and measure again to confirm it actually _did_ speed |
| things up rather than made them worse. (And understanding what's really going |
| on underneath is a big help to making it happen faster.)</p> |
| |
| <p>In general, being simple is better than being clever. Optimization |
| strategies change with time. For example, decades ago precalculating a table |
| of results (for things like isdigit() or cosine(int degrees)) was clearly |
| faster because processors were so slow. Then processors got faster and grew |
| math coprocessors, and calculating the value each time became faster than |
| the table lookup (because the calculation fit in L1 cache but the lookup |
| had to go out to DRAM). Then cache sizes got bigger (the Pentium M has |
| 2 megabytes of L2 cache) and the table fit in cache, so the table became |
| fast again... Predicting how changes in hardware will affect your algorithm |
| is difficult, and using ten year old optimization advice and produce |
| laughably bad results. But being simple and efficient is always going to |
| give at least a reasonable result.</p> |
| |
| <p>The famous quote from Ken Thompson, "When in doubt, use brute force", |
| applies to toybox. Do the simple thing first, do as little of it as possible, |
| and make sure it's right. You can always speed it up later.</p> |
| |
| <b><h3>Size</h3></b> |
| <p>Again, being simple gives you most of this. An algorithm that does less work |
| is generally smaller. Understand the problem, treat size as a cost, and |
| get a good bang for the byte.</p> |
| |
| <p>Understand the difference between binary size, heap size, and stack size. |
| Your binary is the executable file on disk, your heap is where malloc() memory |
| lives, and your stack is where local variables (and function call return |
| addresses) live. Optimizing for binary size is generally good: executing |
| fewer instructions makes your program run faster (and fits more of it in |
| cache). On embedded systems, binary size is especially precious because |
| flash is expensive (and its successor, MRAM, even more so). Small stack size |
| is important for nommu systems because they have to preallocate their stack |
| and can't make it bigger via page fault. And everybody likes a small heap.</p> |
| |
| <p>Measure the right things. Especially with modern optimizers, expecting |
| something to be smaller is no guarantee it will be after the compiler's done |
| with it. Binary size isn't the most accurate indicator of the impact of a |
| given change, because lots of things get combined and rounded during |
| compilation and linking. Matt Mackall's bloat-o-meter is a python script |
| which compares two versions of a program, and shows size changes in each |
| symbol (using the "nm" command behind the scenes). To use this, run |
| "make baseline" to build a baseline version to compare against, and |
| then "make bloatometer" to compare that baseline version against the current |
| code.</p> |
| |
| <p>Avoid special cases. Whenever you see similar chunks of code in more than |
| one place, it might be possible to combine them and have the users call shared |
| code. (This is the most commonly cited trick, which doesn't make it easy. If |
| seeing two lines of code do the same thing makes you slightly uncomfortable, |
| you've got the right mindset.)</p> |
| |
| <p>Some specific advice: Using a char in place of an int when doing math |
| produces significantly larger code on some platforms (notably arm), |
| because each time the compiler has to emit code to convert it to int, do the |
| math, and convert it back. Bitfields have this problem on most platforms. |
| Because of this, using char to index a for() loop is probably not a net win, |
| although using char (or a bitfield) to store a value in a structure that's |
| repeated hundreds of times can be a good tradeoff of binary size for heap |
| space.</p> |
| |
| <b><h3>Simplicity</h3></b> |
| |
| <p>Complexity is a cost, just like code size or runtime speed. Treat it as |
| a cost, and spend your complexity budget wisely. (Sometimes this means you |
| can't afford a feature because it complicates the code too much to be |
| worth it.)</p> |
| |
| <p>Simplicity has lots of benefits. Simple code is easy to maintain, easy to |
| port to new processors, easy to audit for security holes, and easy to |
| understand.</p> |
| |
| <p>Simplicity itself can have subtle non-obvious aspects requiring a tradeoff |
| between one kind of simplicity and another: simple for the computer to |
| execute and simple for a human reader to understand aren't always the |
| same thing. A compact and clever algorithm that does very little work may |
| not be as easy to explain or understand as a larger more explicit version |
| requiring more code, memory, and CPU time. When balancing these, err on the |
| side of doing less work, but add comments describing how you |
| could be more explicit.</p> |
| |
| <p>In general, comments are not a substitute for good code (or well chosen |
| variable or function names). Commenting "x += y;" with "/* add y to x */" |
| can actually detract from the program's readability. If you need to describe |
| what the code is doing (rather than _why_ it's doing it), that means the |
| code itself isn't very clear.</p> |
| |
| <p>Environmental dependencies are another type of complexity, so needing other |
| packages to build or run is a big downside. For example, we don't use curses |
| when we can simply output ansi escape sequences and trust all terminal |
| programs written in the past 30 years to be able to support them. Regularly |
| testing that we work with C libraries which support static linking (musl does, |
| glibc doesn't) is another way to be self-contained with known boundaries: |
| it doesn't have to be the only way to build the project, but should be regularly |
| tested and supported.</p> |
| |
| <p>Prioritizing simplicity tends to serve our other goals: simplifying code |
| generally reduces its size (both in terms of binary size and runtime memory |
| usage), and avoiding unnecessary work makes code run faster. Smaller code |
| also tends to run faster on modern hardware due to CPU cacheing: fitting your |
| code into L1 cache is great, and staying in L2 cache is still pretty good.</p> |
| |
| <p>But a simple implementation is not always the smallest or fastest, and |
| balancing simplicity vs the other goals can be difficult. For example, the |
| atolx_range() function in lib/lib.c always uses the 64 bit "long long" type, |
| which produces larger and slower code on 32 bit platforms and |
| often assigned into smaller interger types. Although libc has parallel |
| implementations for different data sizes (atoi, atol, atoll) we chose a |
| common codepath which can cover all cases (every user goes through the |
| same codepath, with the maximum amount of testing and minimum and avoids |
| surprising variations in behavior).</p> |
| |
| <p>On the other hand, the "tail" command has two codepaths, one for seekable |
| files and one for nonseekable files. Although the nonseekable case can handle |
| all inputs (and is required when input comes from a pipe or similar, so cannot |
| be removed), reading through multiple gigabytes of data to reach the end of |
| seekable files was both a common case and hugely penalized by a nonseekable |
| approach (half-minute wait vs instant results). This is one example |
| where performance did outweigh simplicity of implementation.</p> |
| |
| <p><a href=http://www.joelonsoftware.com/articles/fog0000000069.html>Joel |
| Spolsky argues against throwing code out and starting over</a>, and he has |
| good points: an existing debugged codebase contains a huge amount of baked |
| in knowledge about strange real-world use cases that the designers didn't |
| know about until users hit the bugs, and most of this knowledge is never |
| explicitly stated anywhere except in the source code.</p> |
| |
| <p>That said, the Mythical Man-Month's "build one to throw away" advice points |
| out that until you've solved the problem you don't properly understand it, and |
| about the time you finish your first version is when you've finally figured |
| out what you _should_ have done. (The corrolary is that if you build one |
| expecting to throw it away, you'll actually wind up throwing away two. You |
| don't understand the problem until you _have_ solved it.)</p> |
| |
| <p>Joel is talking about what closed source software can afford to do: Code |
| that works and has been paid for is a corporate asset not lightly abandoned. |
| Open source software can afford to re-implement code that works, over and |
| over from scratch, for incremental gains. Before toybox, the unix command line |
| has already been reimplemented from scratch several times (the |
| original AT&T Unix command line in assembly and then in C, the BSD |
| versions, Coherent was the first full from-scratch Unix clone in 1980, |
| Minix was another clone which Linux was inspired by and developed under, |
| the GNU tools were yet another rewrite intended for use in the stillborn |
| "Hurd" project, BusyBox was still another rewrite, and more versions |
| were written in Plan 9, uclinux, klibc, sash, sbase, s6, and of course |
| android toolbox...). But maybe toybox can do a better job. :)</p> |
| |
| <p>As Antoine de St. Exupery (author of "The Little Prince" and an early |
| aircraft designer) said, "Perfection is achieved, not when there |
| is nothing left to add, but when there is nothing left to take away." |
| And Ken Thompson (creator of Unix) said "One of my most productive |
| days was throwing away 1000 lines of code." It's always possible to |
| come up with a better way to do it.</p> |
| |
| <p>P.S. How could I resist linking to an article about |
| <a href=http://blog.outer-court.com/archive/2005-08-24-n14.html>why |
| programmers should strive to be lazy and dumb</a>?</p> |
| |
| <a name="portability"><b><h2><a href="#portability">Portability issues</a></h2></b> |
| |
| <b><h3>Platforms</h3></b> |
| <p>Toybox should run on Android (all commands with musl-libc, as large a subset |
| as practical with bionic), and every other hardware platform Linux runs on. |
| Other posix/susv4 environments (perhaps MacOS X or newlib+libgloss) are vaguely |
| interesting but only if they're easy to support; I'm not going to spend much |
| effort on them.</p> |
| |
| <p>I don't do windows.</p> |
| |
| <p>We depend on C99 and posix-2008 libc features such as the openat() family of |
| functions. We also assume certain "modern" linux kernel behavior such |
| as large environment sizes (<a href=https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318>linux commit b6a2fea39318</a>, went into 2.6.22 |
| released <a href=faq.html#support_horizon>July 2007</a>). In theory this shouldn't prevent us from working on |
| older kernels or other implementations (ala BSD), but we don't police their |
| corner cases.</p> |
| |
| <b><h3>32/64 bit</h3></b> |
| <p>Toybox should work on both 32 bit and 64 bit systems. 64 bit desktop |
| hardware went mainstream in 2005 and was essentially ubiquitous |
| by the end of the decade, but 32 bit hardware will continue to be important |
| in embedded devices for several more years.</p> |
| |
| <p>Toybox relies on the fact that on any Unix-like platform, pointer and long |
| are always the same size (on both 32 and 64 bit). Pointer and int are _not_ |
| the same size on 64 bit systems, but pointer and long are. |
| This is guaranteed by the LP64 memory model, a Unix standard (which Linux |
| and MacOS X both implement, and which modern 64 bit processors such as |
| x86-64 were <a href=http://www.pagetable.com/?p=6>designed for</a>). See |
| <a href=http://www.unix.org/whitepapers/64bit.html>the LP64 standard</a> and |
| <a href=http://www.unix.org/version2/whatsnew/lp64_wp.html>the LP64 |
| rationale</a> for details.</p> |
| |
| <p>Note that Windows doesn't work like this, and I don't care. |
| <a href=http://blogs.msdn.com/oldnewthing/archive/2005/01/31/363790.aspx>The |
| insane legacy reasons why this is broken on Windows are explained here.</a></p> |
| |
| <b><h3>Signedness of char</h3></b> |
| <p>On platforms like x86, variables of type char default to unsigned. On |
| platforms like arm, char defaults to signed. This difference can lead to |
| subtle portability bugs, and to avoid them we specify which one we want by |
| feeding the compiler -funsigned-char.</p> |
| |
| <p>The reason to pick "unsigned" is that way char strings are 8-bit clean by |
| default, which makes UTF-8 support easier.</p> |
| |
| <p><h3>Error messages and internationalization:</h3></p> |
| |
| <p>Error messages are extremely terse not just to save bytes, but because we |
| don't use any sort of _("string") translation infrastructure. (We're not |
| translating the command names themselves, so we must expect a minimum amount of |
| english knowledge from our users, but let's keep it to a minimum.)</p> |
| |
| <p>Thus "bad -A '%c'" is |
| preferable to "Unrecognized address base '%c'", because a non-english speaker |
| can see that -A was the problem (giving back the command line argument they |
| supplied). A user with a ~20 word english vocabulary is |
| more likely to know (or guess) "bad" than the longer message, and you can |
| use "bad" in place of "invalid", "inappropriate", "unrecognized"... |
| Similarly when atolx_range() complains about range constraints with |
| "4 < 17" or "12 > 5", it's intentional: those don't need to be translated.</p> |
| |
| <p>The strerror() messages produced by perror_exit() and friends should be |
| localized by libc, and our error functions also prepend the command name |
| (which non-english speakers can presumably recognize already). Keep the |
| explanation in between to a minimum, and where possible feed back the values |
| they passed in to identify _what_ we couldn't process. |
| If you say perror_exit("setsockopt"), you've identified the action you |
| were trying to take, and the perror gives a translated error message (from libc) |
| explaining _why_ it couldn't do it, so you probably don't need to add english |
| words like "failed" or "couldn't assign".</p> |
| |
| <p>All commands should be 8-bit clean, with explicit |
| <a href=http://yarchive.net/comp/linux/utf8.html>UTF-8</a> support where |
| necessary. Assume all input data might be utf8, and at least preserve |
| it and pass it through. (For this reason, our build is -funsigned-char on |
| all architectures; "char" is unsigned unless you stick "signed" in front |
| of it.)</p> |
| |
| <p>Locale support isn't currently a goal; that's a presentation layer issue |
| (I.E. a GUI problem).</p> |
| |
| <p>Someday we should probably have translated --help text, but that's a |
| post-1.0 issue.</p> |
| |
| <p><h3>Shared Libraries</h3></p> |
| |
| <p>Toybox's policy on shared libraries is that they should never be |
| required, but can optionally be used to improve performance.</p> |
| |
| <p>Toybox should provide the command line utilities for |
| <a href=roadmap.html#dev_env>self-hosting development envirionments</a>, |
| and an easy way to set up "hermetic builds" (I.E. builds which provide |
| their own dependencies, isolating the build logic from host command version |
| skew with a simple known build environment). In both cases, external |
| dependencies defeat the purpose.</p> |
| |
| <p>This means toybox should provide full functionality without relying |
| on any external dependencies (other than libc). But toybox may optionally use |
| libraries such as zlib and openssl to improve performance for things like |
| deflate and sha1sum, which lets the corresponding built-in implementations |
| be simple (and thus slow). But the built-in implementations need to exist and |
| work.</p> |
| |
| <p>(This is why we use an external https wrapper program, because depending on |
| openssl or similar to be linked in would change the behavior of toybox.)</p> |
| |
| <a name="codestyle" /> |
| <h2>Coding style</h2> |
| |
| <p>The real coding style holy wars are over things that don't matter |
| (whitespace, indentation, curly bracket placement...) and thus have no |
| obviously correct answer. As in academia, "the fighting is so vicious because |
| the stakes are so small". That said, being consistent makes the code readable, |
| so here's how to make toybox code look like other toybox code.</p> |
| |
| <p>Toybox source uses two spaces per indentation level, and wraps at 80 |
| columns. (Indentation of continuation lines is awkward no matter what |
| you do, sometimes two spaces looks better, sometimes indenting to the |
| contents of a parentheses looks better.)</p> |
| |
| <p>I'm aware this indentation style creeps some people out, so here's |
| the sed invocation to convert groups of two leading spaces to tabs:</p> |
| <blockquote><pre> |
| sed -i ':loop;s/^\( *\) /\1\t/;t loop' filename |
| </pre></blockquote> |
| |
| <p>And here's the sed invocation to convert leading tabs to two spaces each:</p> |
| <blockquote><pre> |
| sed -i ':loop;s/^\( *\)\t/\1 /;t loop' filename |
| </pre></blockquote> |
| |
| <p>There's a space after C flow control statements that look like functions, so |
| "if (blah)" instead of "if(blah)". (Note that sizeof is actually an |
| operator, so we don't give it a space for the same reason ++ doesn't get |
| one. Yeah, it doesn't need the parentheses either, but it gets them. |
| These rules are mostly to make the code look consistent, and thus easier |
| to read.) We also put a space around assignment operators (on both sides), |
| so "int x = 0;".</p> |
| |
| <p>Blank lines (vertical whitespace) go between thoughts. "We were doing that, |
| now we're doing this." (Not a hard and fast rule about _where_ it goes, |
| but there should be some for the same reason writing has paragraph breaks.)</p> |
| |
| <p>Variable declarations go at the start of blocks, with a blank line between |
| them and other code. Yes, c99 allows you to put them anywhere, but they're |
| harder to find if you do that. If there's a large enough distance between |
| the declaration and the code using it to make you uncomfortable, maybe the |
| function's too big, or is there an if statement or something you can |
| use as an excuse to start a new closer block?</p> |
| |
| <p>If statments with a single line body go on the same line if the result |
| fits in 80 columns, on a second line if it doesn't. We usually only use |
| curly brackets if we need to, either because the body is multiple lines or |
| because we need to distinguish which if an else binds to. Curly brackets go |
| on the same line as the test/loop statement. The exception to both cases is |
| if the test part of an if statement is long enough to split into multiple |
| lines, then we put the curly bracket on its own line afterwards (so it doesn't |
| get lost in the multple line variably indented mess), and we put it there |
| even if it's only grouping one line (because the indentation level is not |
| providing clear information in that case).</p> |
| |
| <p>I.E.</p> |
| |
| <blockquote> |
| <pre> |
| if (thingy) thingy; |
| else thingy; |
| |
| if (thingy) { |
| thingy; |
| thingy; |
| } else thingy; |
| |
| if (blah blah blah... |
| && blah blah blah) |
| { |
| thingy; |
| } |
| </pre></blockquote> |
| |
| <p>Gotos are allowed for error handling, and for breaking out of |
| nested loops. In general, a goto should only jump forward (not back), and |
| should either jump to the end of an outer loop, or to error handling code |
| at the end of the function. Goto labels are never indented: they override the |
| block structure of the file. Putting them at the left edge makes them easy |
| to spot as overrides to the normal flow of control, which they are.</p> |
| |
| <p>When there's a shorter way to say something, we tend to do that for |
| consistency. For example, we tend to say "*blah" instead of "blah[0]" unless |
| we're referring to more than one element of blah. Similarly, NULL is |
| really just 0 (and C will automatically typecast 0 to anything, except in |
| varargs), "if (function() != NULL)" is the same as "if (function())", |
| "x = (blah == NULL);" is "x = !blah;", and so on.</p> |
| |
| <p>The goal is to be |
| concise, not cryptic: if you're worried about the code being hard to |
| understand, splitting it to multiple steps on multiple lines is |
| better than a NOP operation like "!= NULL". A common sign of trying too |
| hard is nesting ? : three levels deep, sometimes if/else and a temporary |
| variable is just plain easier to read. If you think you need a comment, |
| you may be right.</p> |
| |
| <p>Comments are nice, but don't overdo it. Comments should explain _why_, |
| not how. If the code doesn't make the how part obvious, that's a problem with |
| the code. Sometimes choosing a better variable name is more revealing than a |
| comment. Comments on their own line are better than comments on the end of |
| lines, and they usually have a blank line before them. Most of toybox's |
| comments are c99 style // single line comments, even when there's more than |
| one of them. The /* multiline */ style is used at the start for the metadata, |
| but not so much in the code itself. They don't nest cleanly, are easy to leave |
| accidentally unterminated, need extra nonfunctional * to look right, and if |
| you need _that_ much explanation maybe what you really need is a URL citation |
| linking to a standards document? Long comments can fall out of sync with what |
| the code is doing. Comments do not get regression tested. There's no such |
| thing as self-documenting code (if nothing else, code with _no_ comments |
| is a bit unfriendly to new readers), but "chocolate sauce isn't the answer |
| to bad cooking" either. Don't use comments as a crutch to explain unclear |
| code if the code can be fixed.</p> |
| |
| <!--#include file="footer.html" --> |