Subject: Re: Revised a little, was Re: Multicast oddity
To: Greg Troxel <gdt@ir.bbn.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-net
Date: 04/29/2005 17:39:17

On Fri, Apr 29, 2005 at 08:09:49PM -0400, Greg Troxel wrote:
>   Turns out my printfs were somehow messed up. Now the code's reporting it
>   recvfrom()s the loopback address, and then does a sendto to it.
>
>   So we have a valid destdir, and we're a socket bound to unspec in a
>   multicast group.

And somehow that's turning into something weird. See way below.

> I wonder what happens if the same socket should get the packet as sent
> it.  But since the packet is queued on the IP input queue, this
> special case should not matter.
>
>   The one thing that may be going on is that when we join the multicast
>   group, we don't specify which interface to use, we use the "default"
>   interface. I bet that's not being remembered right...
>
> It should be looked up at join time, and remembered as an ifnet *.
> This code has been around forever (well, probably late 80s), and I
> doubt it is broken.  In ip_setmoptions:
>
> 		/*
> 		 * If no interface address was provided, use the interface of
> 		 * the route to the given multicast address.
> 		 */
> 		if (in_nullhost(mreq->imr_interface)) {
> 			bzero((caddr_t)&ro, sizeof(ro));
> 			ro.ro_rt = NULL;
> 			dst = satosin(&ro.ro_dst);
> 			dst->sin_len = sizeof(*dst);
> 			dst->sin_family = AF_INET;
> 			dst->sin_addr = mreq->imr_multiaddr;
> 			rtalloc(&ro);
> 			if (ro.ro_rt == NULL) {
> 				error = EADDRNOTAVAIL;
> 				break;
> 			}
> 			ifp = ro.ro_rt->rt_ifp;
> 			rtfree(ro.ro_rt);
> 		} else {
> 			ifp = ip_multicast_if(&mreq->imr_interface, NULL);
> 		}
> 		/*
> 		 * See if we found an interface, and confirm that it
> 		 * supports multicast.
> 		 */
> 		if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) {
> 			error = EADDRNOTAVAIL;
> 			break;
> 		}
>
> Typically the interface with the default route will be found.  On a
> machine with one Ethernet, the right thing happens - the group is
> joined on "the" interface.
>
> Being joined should have absolutely nothing to do with sending.
> There is no requirement that a socket be joined to a group to send to
> it (and if it snuck in it's a bug - that violates the RFC1112
> multicast service model).
>
> Note that a socket has a default interface for where to send multicast
> packets (see netinet/ip_output.c:ip_setmoptions and IP_MULTICAST_IF).
> But this should not matter when sending to 127.0.0.1, since that's not
> a multicast address.

Ok, I've been snooping some more, and I know more about the IPv4 case
(which I'll look at first as you're more up-to-speed on it).

I'm sending to 10.0.0.6 (which is the address on fxp0, the default NIC
for v4). My socket is bound to 0.0.0.0, and the multicast address is
239.255.255.253.

As I mentioned, I'm getting a statistics increase for each error (I can
correlate three errors to three stat increments). The stat is
ips_odropped, and the error is EADDRNOTAVAIL. That combination shows up
in ip_output() in the place where we check whether the source address
is multicast!

How in the world did we go from an unbound socket sending to 10.0.0.6 to
having a multicast (239.255.255.253) address as our source?

Take care,

Bill
