tech-net archive

Re: checking m->m_pkthdr.csum_flags in ip_output()



On Fri, May 16, 2008 at 03:05:03PM +0100, Patrick Welche wrote:
> On Sun, May 04, 2008 at 12:33:05PM +0900, Takahiro Kambe wrote:
> > Hi,
> > 
> > In message <20080415.203216.41648300.taca%back-street.net@localhost>
> >     on Tue, 15 Apr 2008 20:32:16 +0900 (JST),
> >     Takahiro Kambe <taca%back-street.net@localhost> wrote:
> > > Today, a NetBSD 4.0_STABLE machine panicked in ip_output() while
> > > forwarding an IPv4 multicast packet.  The packet was a short
> > > (36 octets) UDP/IP packet.
> > ...
> > > The kernel has DIAGNOSTIC option enabled and corresponding code
> > > fragments in ip_output().
> > > 
> > > #ifdef    DIAGNOSTIC
> > >   if ((m->m_flags & M_PKTHDR) == 0)
> > >           panic("ip_output: no HDR");
> > > 
> > >   if ((m->m_pkthdr.csum_flags & (M_CSUM_TCPv6|M_CSUM_UDPv6)) != 0) {
> > >           panic("ip_output: IPv6 checksum offload flags: %d",
> > >               m->m_pkthdr.csum_flags);
> > >   }
> > > 
> > >   if ((m->m_pkthdr.csum_flags & (M_CSUM_TCPv4|M_CSUM_UDPv4)) ==
> > >       (M_CSUM_TCPv4|M_CSUM_UDPv4)) {
> > >           panic("ip_output: conflicting checksum offload flags: %d",
> > >               m->m_pkthdr.csum_flags);
> > >   }
> > > #endif
> > > 
> > > It seems that this diagnostic code treats M_CSUM_TCPv4 and
> > > M_CSUM_UDPv4 as mutually exclusive.
> > I confirmed that bge(4) sets both M_CSUM_TCPv4 and M_CSUM_UDPv4 in
> > m->m_pkthdr.csum_flags for ordinary unicast IP packets.
> > 
> > I don't know whether this is a bug in bge(4) or whether the above
> > DIAGNOSTIC check is wrong or obsolete.
> 
> Don't know whether relevant, but a 4.99.60/i386 box with bge gave:
> 
> uvm_fault(0xcdfae574, 0, 1) -> 0xe
> kernel: supervisor trap page fault, code=0
> Stopped in pid 22172.1 (dhcpd) at       0xc03a6f25:     movl    0x14(%eax),%eax
> db{1}> bt/l
> m_length(0,0,cd985abc,c0377c4f,5) at 0xc03a6f25
> bpf_mtap(c2d822c0,0,cd985aec,c03a8f5d,cd985a05) at netbsd:bpf_mtap+0x17
> bge_start(c2da7004,178,9000003,3,0) at netbsd:bge_start+0x10c
> ifq_enqueue(c2da7004,c3111300,c2da7004,2,cdfae574) at netbsd:ifq_enqueue+0x13f
> ether_output(c2da7004,c3111300,c06077a0,0,c06077a0) at netbsd:ether_output+0x71e
> bpf_write(cdc82300,cdc82300,cd985c60,d5bf99c0,1) at netbsd:bpf_write+0x126
> do_filewritev(7,bfbfc668,3,cdc82300,1) at netbsd:do_filewritev+0x270
> sys_writev(cdfac900,cd985d04,cd985cfc,cd985d10,c03d0d79) at netbsd:sys_writev+0x3f
> syscall(cd985d48,b3,ab,bfbf001f,bfbf001f) at netbsd:syscall+0x141
> 
> yesterday...

It looks like IFQ_POLL()/IFQ_DEQUEUE() did not honor their contract.  In
order to reach the bpf_mtap() statement, IFQ_POLL() had to return m_head
!= NULL.  According to altq(9), "It is guaranteed that IFQ_DEQUEUE()
immediately after IFQ_POLL() returns the same packet."
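
To make the contract concrete, here is a minimal userland sketch of the
poll-then-dequeue pattern.  It is not the bge(4) code: struct pkt,
struct pktq, q_enqueue(), q_poll(), q_dequeue() and start_loop() are
hypothetical stand-ins for the mbuf, the if_snd queue, the IFQ_*()
macros and a driver start routine.  In the trace above, bpf_mtap() and
m_length() both appear to have been handed a 0 packet pointer, which is
what one would expect if the dequeue returned something other than the
packet that was polled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf; only what the sketch needs. */
struct pkt {
	struct pkt *next;
	int len;
};

/* Hypothetical FIFO standing in for the interface send queue. */
struct pktq {
	struct pkt *head;
	struct pkt *tail;
};

/* Append a packet to the tail of the queue. */
static void
q_enqueue(struct pktq *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Like IFQ_POLL(): look at the next packet without removing it. */
static struct pkt *
q_poll(const struct pktq *q)
{
	assert(q != NULL);
	return q->head;
}

/* Like IFQ_DEQUEUE(): remove and return the next packet, or NULL. */
static struct pkt *
q_dequeue(struct pktq *q)
{
	struct pkt *p;

	assert(q != NULL);
	p = q->head;
	if (p == NULL)
		return NULL;
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	p->next = NULL;
	return p;
}

/*
 * Driver-style start loop.  Returns the number of packets handled,
 * or -1 if the dequeue did not return the packet that was polled.
 */
static int
start_loop(struct pktq *q)
{
	int handled = 0;

	for (;;) {
		struct pkt *polled = q_poll(q);
		if (polled == NULL)
			break;		/* queue empty, nothing to send */

		/* A real driver would set up its transmit ring here. */

		struct pkt *taken = q_dequeue(q);
		if (taken != polled)
			return -1;	/* contract violated; do not touch taken */

		/* Only now is it safe to look at the packet (the tap step). */
		(void)taken->len;
		handled++;
	}
	return handled;
}
```

With a plain FIFO the check in start_loop() can never fire.  It only
matters if the queueing discipline is free to pick a different packet,
or none at all, between the poll and the dequeue.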

Are you using ALTQ?

Dave

-- 
David Young             OJC Technologies
dyoung%ojctech.com@localhost      Urbana, IL * (217) 278-3933 ext 24

