Subject: Re: Is this a backward compatability failure in SIOCGIFCONF?
To: Alan Barrett <apb@cequrux.com>
From: Greg Troxel <gdt@ir.bbn.com>
List: current-users
Date: 10/17/2007 08:43:19
Alan Barrett <apb@cequrux.com> writes:

> A slightly old dhclient (built for NetBSD-4.99.30) doesn't work with
> a current kernel (NetBSD-4.99.33/i386).  The following error message
> appears in syslog:
>
> 	Can't get interface flags for I@`:pL: Device not configured
>
> This message is printed in src/dist/dhcp/common/discover.c,
> when it tries to use ioctl(,SIOCGIFFLAGS,) on an interface name that
> was obtained from ioctl(,SIOCGIFCONF,).
>
> The failing dhclient was built with revision 1.9 of
> src/dist/dhcp/common/discover.c, and I see the change that was made in
> revision 1.10 of that file (changing the way dhcp/dhclient interprets
> the result from SIOCGIFCONF), and the change that was made in revision
> 1.200 of src/sys/net/if.c (changing the way the kernel implements
> SIOCGIFCONF).
>
> My question is: Is the change in revision 1.200 of src/sys/net/if.c a
> change to a kernel interface (in which case I'd argue that there should
> be binary backward compatibility for old applications), or is it a
> kernel bug fix (in which case I'd argue that it's fine to break old
> applications that had been relying on the previous buggy behaviour).

The situation is even messier than a kernel bug fix.

Basically, a change made in May added struct sockaddr_storage to
struct ifreq, and this was a real API/ABI change.  SIOCGIFCONF was
versioned.  But the new version had bugs, and the code that's been in
dhclient all along was arguably incorrect.  I fixed both the kernel and
dhclient.  There is a short period of time where a dhclient and kernel
built from the same sources don't work together; in addition, a dhclient
from before discover.c 1.10 won't work with kernels after if.c 1.186
(May 29) and before the if.c fix (1.200 and 1.201).

So the question is whether we maintain binary compatibility when fixing
bugs within current, including for programs that were using interfaces
incorrectly.  Put that way, it's a clear no, I think, but that's perhaps
not a fair characterization.

dhclient's error was in assuming that struct ifreq had struct sockaddr,
instead of using the size of the union that holds sockaddr and (now)
sockaddr_storage.  The original SIOCGIFCONF API is not specified
clearly, but it's very hard to say the dhclient code was correct.
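
To make the difference concrete, here is a minimal sketch of stepping
through a SIOCGIFCONF-style buffer using the size of the union rather
than sizeof(struct sockaddr).  It is not the code from discover.c or
if.c.  The hyp_* types and functions are hypothetical stand-ins for the
real definitions in <sys/socket.h> and <net/if.h>, with sizes chosen for
illustration, and it assumes each record is a name followed by the
union, longer only when sa_len exceeds the union.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HYP_IFNAMSIZ 16

/* Hypothetical stand-ins for struct sockaddr / sockaddr_storage. */
struct hyp_sockaddr {
    unsigned char sa_len;      /* BSD-style length field */
    unsigned char sa_family;
    char          sa_data[14];
};

struct hyp_sockaddr_storage {
    unsigned char ss_len;
    unsigned char ss_family;
    char          ss_pad[126];
};

/* Hypothetical stand-in for struct ifreq after the May change:
 * the union now also holds a sockaddr_storage. */
struct hyp_ifreq {
    char ifr_name[HYP_IFNAMSIZ];
    union {
        struct hyp_sockaddr         ifru_addr;
        struct hyp_sockaddr_storage ifru_space;
    } ifr_ifru;
};

/* Size of one record: the whole struct, unless the address is
 * longer than the union, in which case name + sa_len. */
static size_t hyp_record_size(const struct hyp_ifreq *ifr)
{
    size_t sa_len = ifr->ifr_ifru.ifru_addr.sa_len;

    if (sa_len > sizeof(ifr->ifr_ifru))
        return sizeof(ifr->ifr_name) + sa_len;
    return sizeof(*ifr);
}

/* Walk the buffer and copy out up to max_names interface names.
 * Returns the number of names copied. */
static size_t hyp_walk(const char *buf, size_t len,
                       char names[][HYP_IFNAMSIZ], size_t max_names)
{
    size_t off = 0, n = 0;

    assert(buf != NULL);
    while (n < max_names && off + sizeof(struct hyp_ifreq) <= len) {
        struct hyp_ifreq ifr;

        /* Copy out to avoid alignment assumptions about buf. */
        memcpy(&ifr, buf + off, sizeof(ifr));
        memcpy(names[n], ifr.ifr_name, HYP_IFNAMSIZ);
        names[n][HYP_IFNAMSIZ - 1] = '\0';
        n++;
        off += hyp_record_size(&ifr);
    }
    return n;
}
```

A walker that instead steps by HYP_IFNAMSIZ + sizeof(struct hyp_sockaddr)
lands in the middle of the first record's union and reads whatever bytes
are there as the next interface name, which is consistent with the
garbage name in the syslog message quoted above.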

If a 4.0 or earlier dhclient fails with a current kernel, please let me
know; I think that should be fixed.  But from looking at the compat code
Christos put in when he added sockaddr_storage to ifreq, I think it's
ok, and I've heard no complaints.