Subject: port-sparc64/27134: sparc64 unaligned access in m_tag_find
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <carton@Ivy.NET>
List: netbsd-bugs
Date: 10/04/2004 06:32:20
>Number:         27134
>Category:       port-sparc64
>Synopsis:       sparc64 unaligned access in m_tag_find
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 04 06:33:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Miles Nordin
>Release:        NetBSD 2.0_BETA 2004-08-15
>Organization:
Ivy Ministries
>Environment:
System: NetBSD lucette 2.0_BETA NetBSD 2.0_BETA (LUCETTE-$Revision: 1.1 $) #3: Sun Oct 3 23:07:46 EDT 2004 carton@castrovalva:/scratch/src/sys/arch/sparc64/compile/LUCETTE sparc64
 netinet/fil.c                   pullup from pr#26666, uncommitted patches from Darren and Pavel in pr#26839
 kern/uipc_mbuf.c                1.80.2.3           pr#26733, remove KASSERT on line#713
 sys/mbuf.h                      1.90.2.3           pr#26733
 netinet/ip_fil_netbsd.c         1.3.2.10           pr#26733
 netinet6/raw_ip6.c              1.63.2.2           pr#26733
 kern/kern_lock.c                1.75.2.1

Architecture: sparc64
Machine: sparc64
>Description:
I removed the KASSERT at uipc_mbuf.c:713 (see Environment).  The
"m = xxxxxxxx" and "n = xxxxxxxx" lines in the console log below mark the
point where that assertion would have panicked had it still been there.
Instead, the system kept running for a couple of hours before trapping in
m_tag_find.  I was trying to track down kern/26937.  I think this is a
different panic, but I am not sure.

[...]
NetBSD 2.0_BETA (console on lucette)

login: m = 0x336d290
n = 0x323d840
tlp1: transmit underrun; new threshold: 160/1024 bytes
tlp2: transmit underrun; new threshold: 96/256 bytes
tlp2: transmit underrun; new threshold: 128/512 bytes
trap type 0x34: pc=123618c npc=1236190 pstate=44820006<PRIV,IE>
kernel trap 34: mem address not aligned
Stopped at      netbsd:m_tag_find+0x14: lduh            [%o0 + 0x8], %g1
db> bt
ip_output(5dc, 3142830, 1895e00, c6c1800, 240, 0) at netbsd:ip_output+0x930
ip_forward(3258960, 0, 23, ce, 15, 15) at netbsd:ip_forward+0x2a4
ip_input(3258960, 3258960, 0, 800, 1, 1) at netbsd:ip_input+0x4f0
ipintr(1, 9080010, 80000000, 7fff0000, 47, 1750) at netbsd:ipintr+0x10c
softnet(4, 0, e0017ed0, 5, 130ffe4, 21d800) at netbsd:softnet+0x98
sparc64_ipi_flush_all(0, 0, 137c79c, 0, ffffffffffffffff, 0) at netbsd:sparc64_ipi_flush_all+0x23c
db> reboot 0x104
Frame pointer is at 0xe00164c1
Call traceback:
1310df8(1, 2d7d900, d, f, 0, d, e0016581) fp = e0016581
11d3bdc(104, 0, 0, e0017e7c, 8, e00170f8, e0016641) fp = e0016641
11d3604(123618c, 0, ffffffffffffffff, e0016fe0, 0, 4, e0016711) fp = e0016711
11d32ec(1812cb8, 0, 2b, 8, 0, c1f83fe0, e0016871) fp = e0016871
11d70c8(1236190, 0, 0, 0, 0, 0, e0016951) fp = e0016951
131c228(0, 0, 0, 0, 6, 1000000, e0016a21) fp = e0016a21
1319258(34, e0017500, e0017390, 59, 3082078, 1, e0016ae1) fp = e0016ae1
1008b98(e0017500, 34, 123618c, 44820006, e00176d8, 1000000, e0016c51) fp = e0016c51
108f1a8(67656d310003ba0f, 3, 0, 2, e00175d0, 1000000, e0016e31) fp = e0016e31
102f00c(3303c70, 0, fffffffffffffffc, 8a05, ffffffffffff8a06, ffff, e0016ef1) fp = e0016ef1
102aab0(5dc, 3142830, 1895e00, c6c1800, 240, 0, e0017001) fp = e0017001
10289e0(3258960, 0, 23, ce, 15, 15, e00173d1) fp = e00173d1
10284d8(3258960, 3258960, 0, 800, 1, 1, e00174a1) fp = e00174a1
131007c(1, 9080010, 80000000, 7fff0000, 47, 1750, e0017561) fp = e0017561
100906c(4, 0, e0017ed0, 5, 130ffe4, 21d800, e0017621) fp = e0017621
0(0, 0, 137c79c, 0, ffffffffffffffff, 0, dc530d1) fp = dc530d1

dumping to dev 7,9 offset 1130493
dump esiop0: unable to load cmd DMA map: -1
starting dump, blkno 1130496
device not ready
rebooting

LOM event: +23d+7h36m25s host reset
Resetting ...

(gdb) list *0x123618c
0x123618c is in m_tag_find (../../../../kern/uipc_mbuf2.c:313).
308             if (t == NULL)
309                     p = SLIST_FIRST(&m->m_pkthdr.tags);
310             else
311                     p = SLIST_NEXT(t, m_tag_link);
312             while (p != NULL) {
313                     if (p->m_tag_id == type)
314                             return (p);
315                     p = SLIST_NEXT(p, m_tag_link);
316             }
317             return (NULL);
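
For reference, the faulting instruction is a halfword load (lduh) from
[%o0 + 0x8], and trap 0x34 means the effective address was not 2-byte
aligned.  That suggests the tag pointer p, presumably held in %o0, was odd
and so not a plausible m_tag address.  The user-space sketch below
illustrates that check.  It is not kernel code: struct tag_sketch,
addr_is_aligned() and tag_id_load_is_aligned() are hypothetical names, and
the layout (a link pointer followed by a 16-bit id) is an assumption
modelled on the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct m_tag; the layout is assumed. */
struct tag_sketch {
	struct tag_sketch *link;	/* stands in for the m_tag_link list entry */
	uint16_t id;			/* stands in for m_tag_id */
	uint16_t len;			/* assumed second 16-bit field */
};

/* Return 1 if addr is a multiple of align; align must be a power of two. */
static int
addr_is_aligned(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/*
 * Return 1 if loading the 16-bit id field through pointer value p would
 * use a 2-byte-aligned address, which is what a halfword load requires.
 */
static int
tag_id_load_is_aligned(uintptr_t p)
{
	return addr_is_aligned(p + offsetof(struct tag_sketch, id),
	    sizeof(uint16_t));
}
```

With this assumed layout the id field sits at an even offset, so
tag_id_load_is_aligned() returns 0 for any odd pointer value and 1 for any
even one.  A similar check placed before the dereference at line 313 might
help catch the bad pointer earlier.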

>How-To-Repeat:
Unknown.  The system runs IPv6, bind9, and ipfilter, and has five active
Ethernet interfaces.  It crashes almost every day.  I expect to see more
crashes, but kernel core dumps are not working (see the failed dump above)
and I don't know why.
>Fix:
	
>Release-Note:
>Audit-Trail:
>Unformatted: