Subject: Re: tftpd
To: David Laight <david@l8s.co.uk>
From: Alex <xela@MIT.EDU>
List: tech-userlevel
Date: 03/02/2003 18:05:28
> > Your workaround prevents the sysadmins from noticing that their PXE
> > ROMs are broken and applying negative reinforcement to the vendor. I
> > agree with what others have said that having the ability to debug this
> > kind of lossage is more important than a workaround for one specific
> > case. What if the next rev of this buggy PXE ROM appends a 0xfe rather
> > than 0xff?
> 
> 
> I agree - the thing to do is to ensure that tftpd's error message
> has the 0xff byte 'suitably' escaped (probably even if 0xff is the
> printable character ÿ).  The sysadmin can always
> link the actual file to that name (isn't that the way you find the
> filename that is being requested anyway?)

Ok, clearly I should have gone into more detail in the first
place.  The fact that it's 0xff isn't arbitrary; it's a protocol
implementation mistake.  What happens is this:  the PXE ROM
broadcasts a DHCPDISCOVER.  Assuming dhcpd.conf has been
configured for it, e.g.:

    match if substring
        (option vendor-class-identifier,0,20)="PXEClient:Arch:00000";
    option bootfile-name "pxeboot_ia32.bin";

dhcpd replies with a DHCPOFFER carrying the bootfile-name option.
In conformance with RFC 2132, dhcpd sets the byte after the last
option in the options field, which is in this case necessarily the
bootfile-name, to 255 (the End option).  The PXE ROM incorrectly
treats this as part of the filename, and tries to tftpboot
"pxeboot_ia32.bin" followed by a literal 0xff byte.
My initial "0xff appended" was shorthand for all of this; there's
no risk they'll append a different value in the next rev.
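
To make the mechanism concrete, here is a minimal sketch in Python
of the options encoding described above.  The option codes (67 for
bootfile name, 0 for Pad, 255 for End) are from RFC 2132; the
function names are hypothetical, and buggy_bootfile() is only one
guess at how a ROM could end up with the extra byte, not the ROM's
actual code.  The magic cookie and all other options are omitted.

```python
BOOTFILE = 67  # RFC 2132 bootfile-name option code
PAD = 0        # RFC 2132 Pad option (single byte, no length)
END = 255      # RFC 2132 End option (single byte, no length)


def build_options(bootfile: bytes) -> bytes:
    """Encode one bootfile-name option followed by the End option."""
    return bytes([BOOTFILE, len(bootfile)]) + bootfile + bytes([END])


def parse_options(data: bytes) -> dict:
    """Walk the code/length/value triples, stopping at the End option."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == END:
            break
        if code == PAD:
            i += 1
            continue
        length = data[i + 1]
        options[code] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options


def buggy_bootfile(data: bytes) -> bytes:
    """Hypothetical off-by-one: reads one byte past the stated length.

    Assumes the bootfile-name option is the first thing in the buffer.
    """
    length = data[1]
    return data[2 : 2 + length + 1]


if __name__ == "__main__":
    opts = build_options(b"pxeboot_ia32.bin")
    print(parse_options(opts)[BOOTFILE])
    print(buggy_bootfile(opts))
```

A correct parser honors the length byte and never sees the 255 as
data; the hypothetical buggy reader swallows the End option into the
filename.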

I agree that just silently working around the bug, as I initially
intended to do, would be poor; I'll add logging.  But I still
believe that the default behavior should be to accept the
malformed bootfile name:  to do otherwise would violate both the
principle of least surprise and the robustness principle.
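
Roughly what I have in mind, as a sketch in Python rather than the
C that tftpd is actually written in.  The function names and the
logger name are hypothetical, and this assumes the requested name
has already been pulled out of the read request as raw bytes:

```python
import logging

log = logging.getLogger("tftpd-sketch")  # hypothetical logger name


def printable(raw: bytes) -> str:
    """Render a requested name for logs, escaping non-ASCII bytes."""
    return raw.decode("ascii", "backslashreplace")


def normalize_request_name(raw: bytes) -> bytes:
    """Accept a name with one trailing 0xff, but log that we did so."""
    if raw.endswith(b"\xff"):
        fixed = raw[:-1]
        log.warning(
            "requested name '%s' ends in 0xff (probably a broken "
            "PXE ROM); serving '%s' instead",
            printable(raw),
            printable(fixed),
        )
        return fixed
    return raw
```

Only a single trailing 0xff is removed, so any other malformed name
still fails in the usual way, and the log line shows the byte
escaped as \xff so the sysadmin can see exactly what was requested.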

Most sysadmins use tftp rarely.  They don't have its intricacies
loaded into their forebrains, and they don't especially want to:
the least surprising thing for a sysadmin who's trying to netboot
a machine with one of these early-generation Intel PXE ROMs is 
for it to just boot.  I got surprised a few months ago, and
consequently became a lot more intimately acquainted with tftp
than I ever wanted to be.  I don't see any point in any other
sysadmin having to tread over that same ground.

IMHO Postel's language in RFC 791, where I believe he first stated
the robustness principle, could hardly be more clear:

    The implementation of a protocol must be robust.  Each
    implementation must expect to interoperate with others
    created by different individuals....  In general, an
    implementation must be conservative in its sending
    behavior, and liberal in its receiving behavior.  That
    is, it must be careful to send well-formed datagrams,
    but must accept any datagram that it can interpret
    (e.g., not object to technical errors where the
    meaning is still clear).

---Alex