Subject: Re: Custom FIFO filesystem from userspace
To: None <tech-kern@netbsd.org>
From: Matthew Mondor <mm_lists@pulsar-zone.net>
List: tech-kern
Date: 11/29/2005 13:25:55
On Tue, 29 Nov 2005 08:23:07 -0800
Bill Studenmund <wrstuden@netbsd.org> wrote:

> Modern disks don't have a single geometry. 63 sectors/track sounds like a 
> remnant of the BIOS geometry. I doubt you'll gain much advantage using the 
> track size. I think you will, however, gain an advantage using large 
> writes. Note that our i/o system has a limit in i/o transaction size, 
> which is 64k as I recall.

Ah nice, so I can indeed write 64k blocks and use that size for my base
storage blocks.  That will simplify some things and will probably
perform well.
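
A minimal sketch of such fixed-size block writes, assuming a 64 KiB
storage block to match the transfer limit mentioned above.  BLOCK_SIZE,
block_alloc() and block_write() are hypothetical names chosen for
illustration, and the 4096-byte buffer alignment is an assumption rather
than a documented requirement of the raw device:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed storage block size, matching the 64k i/o limit quoted above. */
#define BLOCK_SIZE	(64 * 1024)

/* Allocate one zeroed, block-sized buffer (alignment is an assumption). */
static void *
block_alloc(void)
{
	void *buf = NULL;

	if (posix_memalign(&buf, 4096, BLOCK_SIZE) != 0)
		return NULL;
	memset(buf, 0, BLOCK_SIZE);
	return buf;
}

/* Write block number blkno as one BLOCK_SIZE transfer.  Returns 0 or -1. */
static int
block_write(int fd, uint64_t blkno, const void *buf)
{
	off_t off = (off_t)blkno * BLOCK_SIZE;
	ssize_t n;

	assert(buf != NULL);
	n = pwrite(fd, buf, BLOCK_SIZE, off);
	if (n == -1)
		return -1;
	if (n != BLOCK_SIZE) {
		/* Treat a short write as an I/O error. */
		errno = EIO;
		return -1;
	}
	return 0;
}
```

Addressing storage by block number rather than byte offset keeps every
transfer a whole multiple of the block size, which raw devices generally
expect.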

> 
> > I have seen some cache related ioctls in wd.c.  I'm unsure if these
> > should be used to retrieve/set/flush (DIOCGCACHE, DIOCSCACHE,
> > DIOCCACHESYNC).  I guess that since operations on raw devices are
> > unbuffered, the use of fsync(2) will probably be irrelevant to
> > ensure that the data is synchronized to disk, though.  Would it
> > however make any sense to set the buffer size related to my block size,
> > and to use the flush ioctl after committing transaction data and related
> > log entries?
> 
> You are partially incorrect. The cache in question for these operations is 
> the cache in the drive. While using the raw device ensures that the kernel 
> does no caching, chances are that the disk itself is doing caching. You'll 
> probably be disappointed with the performance if it doesn't cache.

Ah, makes sense.  Would simply issuing the DIOCCACHESYNC ioctl after
committing a transaction log block be considered adequate?  Or should I
simply let the disk/OS take care of that asynchronously?
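
A rough sketch of the first option, flushing the drive cache right after
a log block is written.  The function names are hypothetical.  The sketch
assumes DIOCCACHESYNC comes from <sys/dkio.h> on NetBSD and takes a
pointer to an int "force" flag; on systems without the ioctl, or when the
descriptor is not a disk device, it falls back to fsync(2) so that it
still builds and runs elsewhere:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <unistd.h>
#if defined(__NetBSD__)
#include <sys/dkio.h>		/* DIOCCACHESYNC (assumed location) */
#endif

/* Ask the drive to flush its write cache.  Returns 0 or -1. */
static int
drive_cache_sync(int fd)
{
#if defined(DIOCCACHESYNC)
	int force = 0;		/* assumption: non-zero forces the flush */

	if (ioctl(fd, DIOCCACHESYNC, &force) == 0)
		return 0;
	if (errno != ENOTTY)	/* a real failure, not "not a disk" */
		return -1;
#endif
	/* Fallback when the ioctl is unavailable or fd is not a disk. */
	return fsync(fd);
}

/* Write one log block, then flush the drive cache.  Returns 0 or -1. */
static int
commit_log_block(int fd, off_t off, const void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL);
	n = pwrite(fd, buf, len, off);
	if (n == -1)
		return -1;
	if ((size_t)n != len) {
		/* Treat a short write as an I/O error. */
		errno = EIO;
		return -1;
	}
	return drive_cache_sync(fd);
}
```

Flushing once per committed log block, rather than after every data
write, keeps the drive cache useful for the bulk of the I/O while still
bounding what can be lost to the last uncommitted transaction.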

> I think what will happen is you will either get the unmodified block 
> (write never made it out of cache), you will get the written block, or 
> there will be an i/o error with the block.

Getting the written block or getting the unmodified one would require
no special treatment in my case then, which looks fine.  As for
possible I/O errors, those would be expected from read(2), surely?  I
could then simply reclaim that block for later reuse by write(2) without
problems, and consider those transactions lost.
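
The read-side handling described above could be sketched as follows.
All names here (block_load(), the free map, NBLOCKS) are hypothetical,
and the in-memory bitmap merely stands in for whatever allocation
structure the real storage layer would use.  A block that fails with EIO
or comes back short is marked reusable and reported as lost; any other
error is passed up as fatal:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE	(64 * 1024)	/* assumed storage block size */
#define NBLOCKS		1024		/* assumed number of blocks */

enum block_state { BLOCK_OK, BLOCK_LOST, BLOCK_FATAL };

/* One bit per block; a set bit means the block may be reused. */
static uint8_t free_map[NBLOCKS / 8];

static void
block_mark_free(uint64_t blkno)
{
	assert(blkno < NBLOCKS);
	free_map[blkno / 8] |= (uint8_t)(1u << (blkno % 8));
}

static int
block_is_free(uint64_t blkno)
{
	assert(blkno < NBLOCKS);
	return (free_map[blkno / 8] >> (blkno % 8)) & 1;
}

/*
 * Read block blkno into buf.  On EIO or a short read the block is
 * reclaimed for later reuse and its contents are considered lost.
 */
static enum block_state
block_load(int fd, uint64_t blkno, void *buf)
{
	off_t off = (off_t)blkno * BLOCK_SIZE;
	ssize_t n = pread(fd, buf, BLOCK_SIZE, off);

	if (n == BLOCK_SIZE)
		return BLOCK_OK;
	if (n == -1 && errno != EIO)
		return BLOCK_FATAL;	/* e.g. EBADF: not a media problem */
	block_mark_free(blkno);
	return BLOCK_LOST;
}
```

Whether a later write(2) to the same block succeeds depends on the drive
being able to remap the bad sector, so a real implementation might want
to verify the block after rewriting it.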

Thank you very much for all these useful details,
Matt

-- 
Note: Please only reply on the list, other mail is blocked by default.
Private messages from your address can be allowed by first asking.