Re: port-xen/53074: Daily 8.99.12 DOMU panic
The following reply was made to PR port-xen/53074; it has been noted by GNATS.
From: Brad Spencer <brad%anduin.eldar.org@localhost>
Cc: port-xen-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
Subject: Re: port-xen/53074: Daily 8.99.12 DOMU panic
Date: Wed, 07 Mar 2018 20:43:56 -0500
> The following reply was made to PR port-xen/53074; it has been noted by GNATS.
> From: coypu%sdf.org@localhost
> To: gnats-bugs%NetBSD.org@localhost
> Subject: Re: port-xen/53074: Daily 8.99.12 DOMU panic
> Date: Tue, 6 Mar 2018 01:49:26 +0000
> You can parse the backtrace like so (normally, it would report the
> functions, but that is a bug).
> gdb /my-netbsd-kernel
> info symbol 0xffffffff804f74b8
> and so on.
I noticed something odd about the daily panic I am having. It always
happened around 9:15 in the morning, not every 24 hours from a reboot.
At that time, the DOMU was performing a backup using a line much like
/sbin/dump -0u -a -f - $fs | ssh backup%server.eldar.org@localhost "/usr/pkg/bin/buffer | /usr/bin/bzip2 -9c > some_backup.bz2"
The dump would proceed for a while and then at some point panic 100% of
the time. The latest kernel, which I compiled with debugging symbols,
produced a slightly different panic than was reported initially in the
PR. I have attached the more recent panic to the end of this email along
with the decoding of the symbols. The original panic that was reported
does not happen with the new kernel.
I suspected some sort of file system damage, but that was ruled out.
Multiple fsck runs are clean, and even copying the data using pax to a new
volume produced the same panic on the new volume. The file system is
FFSv2, fslevel 5. WAPBL was enabled on the file system, but I removed
that as well; the panic was slightly different when WAPBL was
present. The volumes are presented to the DOMU as raw lvm devices.
This DOMU was updated from a 6.99 era build to 8.99.12 recently and
performed this sort of backup every day in the past. Its DOM0 is a
NetBSD/amd64 Xen 4.5.1 DOM0 running 7.1_STABLE, also recent. The DOMU
is given a sched_credit weight of 32 and a CPU cap of 10.
I can help if given some guidance as to what may be needed.
panic: biodone2 already
cpu0: Begin traceback...
?() at ffffffff804f74b8
?() at ffffffff804f7575
?() at ffffffff80536841
?() at ffffffff8053689c
?() at ffffffff804cc257
cpu0: End traceback...
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0xffffffff802057a5 cs 0xe030 rflags 0x202 cr2 0xffffa0002b9fe000 ilevel 0 rsp 0xffffa0002b353dc0
curlwp 0xffffa00000741000 pid 0.4 lowest kstack 0xffffa0002b3502c0
Stopped in pid 0.4 (system) at ffffffff802057a5: leave
(gdb) l *(0xffffffff804f74b8)
0xffffffff804f74b8 is in vpanic (../../../../kern/subr_prf.c:342).
339 #ifdef DDB
342 cpu_reboot(bootopt, NULL);
346 * kernel logging functions: log, logpri, addlog
(gdb) l *(0xffffffff804f7575)
0xffffffff804f7575 is in snprintf (../../../../kern/subr_prf.c:1075).
1071 * snprintf: print a message to a buffer
1074 snprintf(char *bf, size_t size, const char *fmt, ...)
1076 int retval;
1077 va_list ap;
1079 va_start(ap, fmt);
(gdb) l *(0xffffffff80536841)
0xffffffff80536841 is in biointr (../../../../kern/vfs_bio.c:1654).
1652 static void
1653 biointr(void *cookie)
1655 struct cpu_info *ci;
1656 buf_t *bp;
1657 int s;
(gdb) l *(0xffffffff8053689c)
0xffffffff8053689c is in biointr (./x86/intr.h:187).
183 static inline int
184 splraiseipl(ipl_cookie_t icookie)
187 return splraise(icookie._ipl);
190 #include <sys/spl.h>
(gdb) l *(0xffffffff804cc257)
0xffffffff804cc257 is in softint_thread (/usr/src/sys/arch/amd64/compile/XEN3_DOMU_ISCSI/xen-ma/machine/cpu.h:55).
50 __inline static struct cpu_info * __unused
53 struct cpu_info *ci;
55 __asm volatile("movq %%gs:%1, %0" :
56 "=r" (ci) :
58 (*(struct cpu_info * const *)offsetof(struct cpu_info, ci_self)));
59 return ci;
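For anyone repeating the decoding above, here is a rough sketch of how the
lookups could be scripted rather than typed one at a time. It assumes the
panic text has been saved to a file; the function name trace_to_gdb_cmds and
the file names panic.txt and decode.gdb are made up for illustration. It only
turns the "?() at <address>" lines into the same "list *(address)" commands
used above.

```sh
#!/bin/sh
# Hypothetical helper: read a saved panic message on stdin and print one
# gdb "list" command for every "?() at <hex address>" traceback line.
# Lines that are not traceback entries are ignored.
trace_to_gdb_cmds() {
    sed -n 's/^?() at \([0-9a-f][0-9a-f]*\)[[:space:]]*$/list *(0x\1)/p'
}
```

The output can then be fed to gdb in batch mode against a kernel built with
debugging symbols, for example:

    trace_to_gdb_cmds < panic.txt > decode.gdb
    gdb -batch -x decode.gdb /my-netbsd-kernel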
Brad Spencer - brad%anduin.eldar.org@localhost - KC8VKS
http://anduin.eldar.org - & - http://anduin.ipv6.eldar.org [IPv6 only]