tech-kern archive

Re: assumption about a device's maxphys

On Wed, Oct 10, 2012 at 10:11:42AM -0400, Thor Lancelot Simon wrote:
> > But this assumes you know the disks at config time. The problem is that
> > you can add or remove drives from a raid volume, which may change the
> > maxphys.
> In the short term, I think RAIDframe should simply disallow such
> operations.  Teaching RAIDframe to split I/O is not going to be pleasant
> and, in practice, adding devices with smaller maxphys to an existing
> RAID set while it's running should be very uncommon.

I'm not sure it's that hard. The split would happen when writing
to the real disks, and if we centralize buffer splitting, it should
just be a matter of calling a function a bit more intelligent than
plain bdev_strategy().

> RAIDframe could, also, of course, impose the old MAXPHYS default on a
> per-component basis, ensuring we cannot ever make the problem any worse
> than it was -- only better.  We could allow overriding this on a per-set
> basis.
> > > That is my thinking: a device driver whose maxphys can change should split
> > > (and potentially even combine, though this is a matter of performance not
> > > correctness -- xbdback can already do this, though) requests as
> > > necessary, since there is no real atomicity guarantee for a request
> > > larger than a single sector anyhow.
> > 
> > This is one solution, but I think it should be centralised. No need to
> > replicate this in every driver that needs it.
> There are very, very few drivers that would ever need this -- only drivers
> that can hide multiple disks beneath a single virtual device.  We could
> provide a common utility function, perhaps (I notice since the last time
> I looked at implementing this in disksort(), it may have become easier,
> because we grew something called nestiobuf, though I am not sure it is
> right for this purpose) but I don't think we should force all I/O through
> a layer that does this; it is almost never wanted.

I guess the cost of such a layer in the common case would just be
an if (__predict_false(bp->b_bcount > dev->dv_maxphys)) check (plus
the cost of looking up dev->dv_maxphys). But we could also have a
bdev_strategy_split() or equivalent for drivers that need it.
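
To make the idea concrete, here is a rough userland sketch of what such a
split helper might look like. It is not kernel code: struct xfer, struct dev,
dev_strategy(), dev_strategy_split() and PREDICT_FALSE are all hypothetical
stand-ins for struct buf, the device, bdev_strategy(), the proposed
bdev_strategy_split() and __predict_false. It only shows the fast-path test
and the chunking loop, and ignores completion handling entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __predict_false(). */
#if defined(__GNUC__)
#define PREDICT_FALSE(e) __builtin_expect((e) != 0, 0)
#else
#define PREDICT_FALSE(e) (e)
#endif

/* Hypothetical stand-in for struct buf: only offset and length. */
struct xfer {
	size_t offset;	/* byte offset of the transfer */
	size_t bcount;	/* byte count, analogous to bp->b_bcount */
};

/* Hypothetical stand-in for the device, with bookkeeping for the demo. */
struct dev {
	size_t maxphys;	/* largest transfer the device accepts */
	size_t nreq;	/* number of requests issued */
	size_t largest;	/* largest single request issued */
	size_t total;	/* total bytes issued */
};

/* Stand-in for the plain strategy routine: just record the request. */
static void
dev_strategy(struct dev *d, const struct xfer *x)
{
	d->nreq++;
	d->total += x->bcount;
	if (x->bcount > d->largest)
		d->largest = x->bcount;
}

/*
 * Issue x to d, splitting it into pieces of at most d->maxphys bytes.
 * Returns 0 on success, -1 if the device reports a zero maxphys.
 */
static int
dev_strategy_split(struct dev *d, const struct xfer *x)
{
	size_t done, len;

	assert(d != NULL && x != NULL);
	if (d->maxphys == 0)
		return -1;

	/* Common case: the request already fits, one compare and go. */
	if (!PREDICT_FALSE(x->bcount > d->maxphys)) {
		dev_strategy(d, x);
		return 0;
	}

	/* Rare case: hand the request down in maxphys-sized pieces. */
	for (done = 0; done < x->bcount; done += len) {
		struct xfer piece;

		len = x->bcount - done;
		if (len > d->maxphys)
			len = d->maxphys;
		piece.offset = x->offset + done;
		piece.bcount = len;
		dev_strategy(d, &piece);
	}
	return 0;
}
```

With a maxphys of 64k, a 150k request goes down as three pieces (64k, 64k,
22k), while anything of 64k or less takes the single-compare path. A real
version would also have to collect the completion of every piece before
completing the parent buffer, which is presumably where something like
nestiobuf would come in.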

Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference