Source-Changes-D archive

Re: null-terminated vs. nul-terminated

On 2022-03-26 11:57, Roland Illig wrote:
> The term "null-terminated string" is quite common when talking about C.
> In contrast, the word "nul" in "nul-terminated" always reminds me of
> the character abbreviation in ASCII, which has a narrower scope than C.
> I prefer to keep "null-terminated" here.

Hi all,

While I don't really want to prolong this debate, as the committer who
triggered this discussion, I felt I should respond, in part to explain
why I made my choice (which I reverted, though I don't agree that
"null-terminated" is more correct). TL;DR: NetBSD's code base is not
consistent here, either in man pages or in source code comments, and
there is no applicable style guide I know of, but "NUL-terminated" is
the most common form found. There also seems to have been an attempt to
standardize man pages in 2005-2006, settling on "nul-terminated".

I was taught (several decades ago) that the short form for the null
byte or null character was NUL in ANSI C parlance (not just ASCII), and
that "null-terminated" was incorrect as it's ambiguous. If someone were
to say "null-byte-terminated", "null-character-terminated", or for the
other context "null-pointer-terminated", that would be fine.
"NUL-terminated" was the unambiguous contraction. (As others have
pointed out, a cleverer way to avoid this debate would be to use
entirely different terms.)
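
To make the ambiguity concrete, here is a minimal C sketch of the two
kinds of terminator that "null-terminated" could refer to. The function
names are hypothetical and chosen only for illustration; they are not
taken from NetBSD's sources.

```c
#include <stddef.h>

/*
 * "NUL-terminated": a sequence of characters that ends at the first
 * NUL byte ('\0'). Returns the number of bytes before the NUL.
 * (Hypothetical helper; behaves like strlen.)
 */
static size_t
nul_terminated_length(const char *s)
{
	size_t n = 0;

	while (s[n] != '\0')
		n++;
	return n;
}

/*
 * "null-pointer-terminated": an array of pointers that ends at the
 * first null pointer (NULL), as argv does. Returns the number of
 * entries before the null pointer. (Hypothetical helper.)
 */
static size_t
null_pointer_terminated_count(char *const *v)
{
	size_t n = 0;

	while (v[n] != NULL)
		n++;
	return n;
}
```

Both loops stop at a "null" of some kind, but one compares a char
against '\0' and the other compares a pointer against NULL, which is
why a bare "null-terminated" does not say which convention is meant.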

The most common form in the man pages currently installed in
NetBSD-current is actually "NUL-terminated", by a significant margin. That's
in part because many of those are from third-party projects, e.g.,
OpenBSD and OpenSSL, which standardized on that form. The next most
common is "null-terminated", then (following slightly behind) "nul-
terminated", then (much less commonly) "NULL-terminated" (which seems
quite incorrect to me). I didn't look as closely at comments, but a
similar pattern emerged, with "NUL-terminated" the most common under
/usr/include, for example (in part due to the origins of some upstream
code). (It's not my intent here to quote or debate exact statistics, so
I haven't provided any. I'm sharing my perception of practice, rightly
or wrongly.)

"nul-terminated" and "null-terminated" seemed more common in man pages
that originated from historical BSD sources, so, lacking any style
guide, I inferred the lowercase "nul" was more "correct" as "BSD style"
(excepting modern OpenBSD), even though that looks a bit odd to me. I
then examined where "nul-terminated" came from, and found these bulk
commits, which imply a standard.

date: 2005-01-02 18:38:04 +0000;  author: wiz;
Mark up NULL, and replace null by nul where appropriate.

date: 2006-10-16 08:48:45 +0000;  author: wiz;
nul/null/NULL cleanup:
when talking about characters/bytes, use "nul" and "nul-terminate"
when talking about pointers, use "null pointer" or ".Dv NULL"

So that seemed to me the established style.


