Subject: Re: amd64 domU panic w/ raidframe re-write
To: None <jakllsch@kollasch.net>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 02/07/2008 17:59:20
On Thu, Feb 07, 2008 at 12:23:09AM -0600, jakllsch@kollasch.net wrote:
> Hi,
> 
> I'm running raidframe in an amd64 domU. I had to do a hard reboot
> because I did something stupid in the Linux dom0 (xm mem-set 0 0).
> 
> Anyway, now when I go to rebuild the parity on the RAID1 array:
> 
> 
> # raidctl -P raid0
> /dev/rraid0d: Parity status: DIRTY
> /dev/rraid0d: Initiating re-write of parity
> panic: kernel diagnostic assertion "seg <= BLKIF_MAX_SEGMENTS_PER_REQUEST" failed: file "/local/jakllsch/nbsd50/src/sys/arch/xen/xen/xbd_xenbus.c", line 759
> Stopped in pid 0.20 (system) at netbsd:breakpoint+0x1:  ret
> breakpoint() at netbsd:breakpoint+0x1
> __kernassert() at netbsd:__kernassert+0x2d
> xbdstart() at netbsd:xbdstart+0x4e8
> dk_start() at netbsd:dk_start+0x3f
> dk_strategy() at netbsd:dk_strategy+0x115
> spec_strategy() at netbsd:spec_strategy+0x5e
> VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x26
> rf_DispatchKernelIO() at netbsd:rf_DispatchKernelIO+0x1ef
> rf_DiskReadFuncForThreads() at netbsd:rf_DiskReadFuncForThreads+0xd7
> FireNodeList() at netbsd:FireNodeList+0x76
> rf_FinishNode() at netbsd:rf_FinishNode+0x306
> rf_DispatchDAG() at netbsd:rf_DispatchDAG+0x12d
> rf_VerifyParityRAID1() at netbsd:rf_VerifyParityRAID1+0x4c4
> rf_VerifyParity() at netbsd:rf_VerifyParity+0x66
> rf_RewriteParity() at netbsd:rf_RewriteParity+0xac
> rf_RewriteParityThread() at netbsd:rf_RewriteParityThread+0x41
> ds          0x5
> es          0xa97c
> fs          0x10d8
> gs          0xaa4f
> rdi         0
> rsi         0xd
> rbp         0xffffa0001247f6c0
> rbx         0x1
> rdx         0
> rcx         0
> rax         0x1
> r8          0xffffffff80581520  cpu_info_primary
> r9          0
> r10         0xffffa0001247f4e0
> r11         0xffffffff80381680  xenconscn_putc
> r12         0x100
> r13         0xffffffff804710d8  copyright+0x8c118
> r14         0xffffa000122f45f0
> r15         0x5000
> rip         0xffffffff8036abb9  breakpoint+0x1
> cs          0xe030
> rflags      0x246
> rsp         0xffffa0001247f5c8
> ss          0xe02b
> netbsd:breakpoint+0x1:  ret
> db> 
> 
> This kernel is 4.99.52 from approximately a week ago.
> 
> Is this because my sectPerSU is 128, or 64KiB,
> which is greater than MAXPHYS in a domU?
Yes. I think I have a PR open about "RAIDframe can send requests larger than
MAXPHYS", or something like that.
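
For illustration, the arithmetic behind the failed assertion can be sketched
as below. This is not the xbd_xenbus.c code. It assumes 512-byte sectors,
4 KiB pages, one page per blkif segment, and BLKIF_MAX_SEGMENTS_PER_REQUEST
equal to 11 (the value in Xen's public blkif.h). The helper names
segments_needed() and fits_in_one_request() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE                    512u   /* assumed bytes per sector */
#define PAGE_SIZE_BYTES                4096u  /* assumed page size */
#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11u    /* assumed, as in Xen's blkif.h */

/*
 * Hypothetical helper: number of pages (one blkif segment each) touched by
 * a buffer of `len` bytes that starts `offset_in_page` bytes into a page.
 */
static unsigned
segments_needed(size_t offset_in_page, size_t len)
{
	assert(offset_in_page < PAGE_SIZE_BYTES);
	if (len == 0)
		return 0;
	/* Round the end of the buffer up to the next page boundary. */
	return (unsigned)((offset_in_page + len + PAGE_SIZE_BYTES - 1) /
	    PAGE_SIZE_BYTES);
}

/* Hypothetical helper: non-zero if the transfer fits in one blkif request. */
static int
fits_in_one_request(size_t offset_in_page, size_t len)
{
	return segments_needed(offset_in_page, len) <=
	    BLKIF_MAX_SEGMENTS_PER_REQUEST;
}
```

Under these assumptions a page-aligned stripe unit of 128 sectors (64 KiB)
needs 16 segments, which is more than a single request can carry, so the
"seg <= BLKIF_MAX_SEGMENTS_PER_REQUEST" check cannot hold for it.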
> 
> And how hard would it be to have a 64KiB
> MAXPHYS Just Work?
Quite hard. If we did it, we could just as well do it at some upper
layer and get rid of MAXPHYS completely (and have each device advertise its
own MAXPHYS instead). I think that's the way to go, for other reasons: all but
legacy ESDI drives would probably be happier with a MAXPHYS of 128k, and
SCSI and SATA would probably be much faster with a MAXPHYS of 256k or 512k.
I had proposed a Google Summer of Code project on this, but it didn't get
selected.
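
As a rough sketch of the "each device advertises its own limit" idea, an
upper layer could split a large transfer into chunks no bigger than what the
device reports. Nothing below is an existing NetBSD interface: struct
dev_limits, its max_xfer field, the issue_fn callback type, and
split_transfer() are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device limit, replacing a single global MAXPHYS. */
struct dev_limits {
	size_t max_xfer;	/* largest transfer the device accepts, in bytes */
};

/* Hypothetical callback that issues one chunk to the device. */
typedef void (*issue_fn)(size_t offset, size_t len, void *arg);

/*
 * Split a transfer of `len` bytes starting at `offset` into chunks of at
 * most dev->max_xfer bytes. Calls `issue` for each chunk if it is not NULL.
 * Returns the number of chunks.
 */
static size_t
split_transfer(const struct dev_limits *dev, size_t offset, size_t len,
    issue_fn issue, void *arg)
{
	size_t nchunks = 0;

	assert(dev != NULL && dev->max_xfer > 0);
	while (len > 0) {
		size_t chunk = len < dev->max_xfer ? len : dev->max_xfer;

		if (issue != NULL)
			issue(offset, chunk, arg);
		offset += chunk;
		len -= chunk;
		nchunks++;
	}
	return nchunks;
}
```

With a device advertising 32 KiB, a 64 KiB stripe unit would go out as two
requests, while a device advertising 128 KiB or more would receive it whole.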
-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference