NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/43986: ataraid(4) doesn't seem to handle weird array configs very gracefully



> uvm_fault(0xc0ab2c40, 0, 1) -> 0xe
> kernel: supervisor trap page fault, code=0
> Stopped in pid 0.1 (swapper) at netbsd:strncmp+0x23:    movzbl  0(%ebx),%eax
> db{0}> trace
> strncmp(c095397a,1c,6,282,c0a9564c) at netbsd:strncmp+0x23

This is an attempt to access the page zero.  Not exactly a null
pointer dererence, but a near-null pointer dereference -- the second
argument is 0x1c.

> devsw_name2blk(1c,0,0,0,0) at netbsd:devsw_name2blk+0x7e

It probably happened here:

    535 		if (strncmp(conv->d_name, name, len) != 0)

https://nxr.netbsd.org/xref/src/sys/kern/subr_devsw.c?r=1.28#535

> ld_ataraid_attach(c3d9c940,c3dede00,c3da0600,c37d4000,c37f4000) at netbsd:ld_ataraid_attach+0x1a8

The devsw_name2blk call is probaby this one:

     78 	bmajor = devsw_name2blk(device_xname(adi->adi_dev), NULL, 0);

https://nxr.netbsd.org/xref/src/sys/dev/ata/ata_raid_subr.c?r=1.2#78

0x1c is offsetof(struct device, dv_xname) on i386 as of around 2010:

https://nxr.netbsd.org/xref/src/sys/sys/device.h?r=1.137#141

So adi->adi_dev is probably null here.

ata_raid_disk_vnode_find assumes that adi->adi_dev is nonnull.  This
is probably a bug on its own: the caller handles a null return, so
either if the input is null, the output should be null, or the caller
should avoid calling ata_raid_disk_vnode_find with a null input.

    225 		adi = &aai->aai_disks[i];
    226 		vp = ata_raid_disk_vnode_find(adi);
    227 		if (vp == NULL) {
    228 			/*
    229 			 * XXX This is bogus.  We should just mark the
    230 			 * XXX component as FAILED, and write-back new
    231 			 * XXX config blocks.
    232 			 */
    233 			break;
    234 		}

https://nxr.netbsd.org/xref/src/sys/dev/ata/ld_ataraid.c?r=1.37#225

How did adi->adi_dev come to be null?

Well, the struct ataraid_disk_info *adi structure is one of the
aai->aai_ndisks entries in a struct ataraid_array_info *aai structure
which is created by ataraid_get_array_info:

    270 	aai = malloc(sizeof(*aai), M_DEVBUF, M_WAITOK | M_ZERO);
...
    289 	TAILQ_INSERT_TAIL(&ataraid_array_info_list, aai, aai_list);

https://nxr.netbsd.org/xref/src/sys/dev/ata/ata_raid.c?r=1.34#258

This is called by the various ata_raid_*.c drivers.  From the dmesg
you shared, it looks like this was an Adaptec adapter, so aai_ndisks
is filled in from the first component here:

    156 		aai->aai_ndisks = be16toh(info->configs[0].total_disks);

https://nxr.netbsd.org/xref/src/sys/dev/ata/ata_raid_adaptec.c?r=1.9#156

And aai->aai_disks[drive].adi_dev is filled in for each component
found here:

    182 	adi = &aai->aai_disks[drive];
    183 	adi->adi_dev = sc->sc_dev;

https://nxr.netbsd.org/xref/src/sys/dev/ata/ata_raid_adaptec.c?r=1.9#182

If any components are missing, there will be some number `drive' for
which aai->aai_disks[drive].adi_dev remains null, and the logic will
crash where you saw this.  That seems to be consistent with your
explanation of the state of the disks.

None of this code has changed much since this PR was filed, so I
suspect the bug is still there.  I'm inclined to say that
ld_ataraid_attach should just check adi->adi_dev == NULL and handle it
like when ata_raid_disk_vnode_find returns null -- in both cases, it
needs to deal with a missing component.

--- ld_ataraid.c
+++ ld_ataraid.c
@@ -226,8 +226,8 @@
 	 */
 	for (i = 0; i < aai->aai_ndisks; i++) {
 		adi = &aai->aai_disks[i];
-		vp = ata_raid_disk_vnode_find(adi);
-		if (vp == NULL) {
+		if (adi == NULL ||
+		    (vp = ata_raid_disk_vnode_find(adi)) == NULL) {
 			/*
 			 * XXX This is bogus.  We should just mark the
 			 * XXX component as FAILED, and write-back new

Now, the failure branch logic here may be wrong -- it just gives up
instead of trying to deal with a missing component -- but that's a
separate issue which, with any luck, should lead to a more graceful
failure than crash, even if the more graceful failure isn't ideal.


Home | Main Index | Thread Index | Old Index