Subject: Re: ccd(4): kernel memory corruption?
To: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 06/27/2007 09:50:02
Brian Buhrow writes:
> 	Hello.  I suspect raidframe is in the same boat as the other two users
> because I've had issues with working on second raid sets in multiuser mode
> similar to the ccd issues described in this thread. 

Please file a PR with as much detail as you are able to.... 

> I was able to deal
> with them by not trying to manipulate two raid frame sets at the same time,
> i.e. not trying to configure a second raid set while the first is
> calculating its parity, 

You should be able to do that just fine.. 

> but I definitely think there's a similar problem
> lurking in there.

There may be an issue, but I don't think it is related to this... 
(it'd be nice if it was, cause that'd be less work for me, but I 
don't think it is...)

Thanks.

Later...

Greg Oster

> On Jun 26, 12:59pm, Quentin Garnier wrote:
> } Subject: Re: ccd(4): kernel memory corruption?
> } 
> } On Tue, Jun 26, 2007 at 12:04:10PM +0200, Jukka Salmi wrote:
> } [...]
> } > Note the strange characters where I would have expected "/dev/wd1e".
> } >
> } > I added some debug printfs to ccdioctl() in sys/dev/ccd.c and noticed
> } > that *(ccio->ccio_disks+1) is NULL even if ccio->ccio_ndisks is 2,
> } > causing cpp[1] to contain garbage, but I'm not familiar with kernel
> } > code to find the problem.
> } 
> } This is a nice bug.  What ccdioctl does wrong is passing cpp[i] to
> } dk_lookup, because it's a userspace pointer and dk_lookup does NDINIT()
> } on it with UIO_SYSSPACE.  My tentative explanation is that the kernel
> } sometimes sleeps when resolving the first name, and when it comes back,
> } the userspace is different and UIO_SYSSPACE will not have the effect of
> } having the relevant pages replaced with the correct ones.  And as those
> } pointers come from argv[], they're unlikely to ever fault.  I might be
> } completely wrong about how the second component is corrupted, but the
> } UIO_SYSSPACE part is a bug nonetheless.
> } 
> } There are 3 users of dk_lookup:  ccd, cgd and raidframe.  cgd is in the
> } same situation as ccd, I'm unsure about raidframe.
> } 
> } If you don't use the latter, can you try simply changing UIO_SYSSPACE
> } into UIO_USERSPACE in dev/dksubr.c:dk_lookup()?
> } 
> } -- 
> } Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
> } "You could have made it, spitting out benchmarks
> } Owe it to yourself not to fail"
> } Amplifico, Spitting Out Benchmarks, Hometakes Vol. 2, 2005.
> } 
> >-- End of excerpt from Quentin Garnier
>
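
The UIO_SYSSPACE vs. UIO_USERSPACE distinction in the quoted analysis can be
illustrated with a small userland toy. This is only a sketch of Quentin's
tentative explanation, not NetBSD kernel code: every name below (toy_proc,
toy_mapped, toy_switch, toy_fetch_path, TOY_SYSSPACE, TOY_USERSPACE) is
hypothetical, a "user address" is modelled as an offset into a per-process
array, and a context switch is modelled by changing which array is currently
mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_USPACE 32   /* size of each toy user address space (assumption) */

/* Hypothetical stand-ins for the kernel's UIO_USERSPACE / UIO_SYSSPACE. */
enum toy_seg { TOY_USERSPACE, TOY_SYSSPACE };

/* One process, owning one private toy address space. */
struct toy_proc {
	char mem[TOY_USPACE];
};

/* The address space that happens to be mapped right now. */
static struct toy_proc *toy_mapped;

/* Model a context switch, e.g. the kernel slept and another process ran. */
static void
toy_switch(struct toy_proc *p)
{
	toy_mapped = p;
}

/*
 * Copy the NUL-terminated string at user address uaddr into dst.
 *
 * TOY_SYSSPACE:  treat uaddr as directly usable and read whatever address
 *                space is mapped at the moment of the call.
 * TOY_USERSPACE: resolve uaddr against the requesting process, as a
 *                copyinstr()-style routine is meant to.
 *
 * Both paths are bounds-checked so the toy itself stays memory-safe.
 * Returns 0 on success, -1 on a bad address or a missing terminator.
 */
static int
toy_fetch_path(struct toy_proc *req, size_t uaddr, enum toy_seg seg,
    char *dst, size_t dstlen)
{
	const struct toy_proc *src = (seg == TOY_SYSSPACE) ? toy_mapped : req;
	size_t i;

	assert(dst != NULL && dstlen > 0);
	if (src == NULL || uaddr >= TOY_USPACE)
		return -1;
	for (i = 0; uaddr + i < TOY_USPACE && i < dstlen; i++) {
		dst[i] = src->mem[uaddr + i];
		if (dst[i] == '\0')
			return 0;
	}
	dst[0] = '\0';	/* no terminator found: hand back an empty string */
	return -1;
}
```

In this model, a TOY_SYSSPACE fetch made while the requesting process is
mapped returns the expected name, but the same fetch made after toy_switch()
to another process returns that other process's bytes. A TOY_USERSPACE fetch
returns the requester's string in both cases. Whether this is really how the
second component name gets corrupted is, as the quoted message says, unconfirmed.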