NetBSD-Bugs archive
Re: kern/38643: [dM] st tape drive loses data
The following reply was made to PR kern/38643; it has been noted by GNATS.
From: der Mouse <mouse%Rodents-Montreal.ORG@localhost>
To: David Holland <dholland-bugs%netbsd.org@localhost>, Havard Eidnes
<he%netbsd.org@localhost>,
gnats-bugs%netbsd.org@localhost, mouse%Rodents.Montreal.QC.CA@localhost,
kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost, yamt%netbsd.org@localhost,
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Cc:
Subject: Re: kern/38643: [dM] st tape drive loses data
Date: Tue, 2 Sep 2008 03:42:18 -0400 (EDT)
> However, it looks like st.c flatly ignores the passed-in block number
> (b_blkno) and always reads whatever's under the tape head.
I'd expect this, actually; tapes are not random access, and the Unix
tape model treats them accordingly. I don't consider this a problem
any more than I consider it a problem that lseek doesn't work on tty
devices.
> So I think what's happening is that physio is firing off sixteen 64k
> reads, since PHYSIO_CONCURRENCY is 16 and MAXPHYS is 64k;
Ooh. Yes, that explains it very neatly: where the data is going and
why exactly 15 records get lost.
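
To make the arithmetic concrete, here is a toy user-space model of the failure as I read the description above. It is not kernel code. The names (toy_records_lost, TOY_MAXPHYS, TOY_CONCURRENCY) are invented for illustration, and it assumes that every request in flight at once pulls one record off the tape while only the first, short transfer ever reaches the caller.

```c
#include <stddef.h>

/* Toy user-space model, not kernel code.  Assumes each request that is
 * in flight at once consumes one tape record, and that only the first
 * (short) transfer reaches the caller. */
#define TOY_MAXPHYS      (64 * 1024)  /* chunk size, per the text above */
#define TOY_CONCURRENCY  16           /* requests in flight, ditto */

/* Records consumed but never delivered for one read() of len bytes. */
size_t toy_records_lost(size_t len, size_t concurrency)
{
    /* The request is cut into MAXPHYS-sized chunks... */
    size_t chunks = (len + TOY_MAXPHYS - 1) / TOY_MAXPHYS;
    /* ...of which at most `concurrency` are issued at the same time. */
    size_t inflight = chunks < concurrency ? chunks : concurrency;

    return inflight > 0 ? inflight - 1 : 0;
}
```

For a 1MB read, toy_records_lost(1024 * 1024, TOY_CONCURRENCY) comes out to 15, matching the count above; with a concurrency of 1 it is 0.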
> So there are at least two things wrong: (1) physio assumes b_blkno is
> honored, and st doesn't; and (2) physio assumes st will read 64k when
> asked, but it in fact apparently only reads one 10k block at a time.
> (What does it do if the tape is written in 16k blocks?
I would expect it to read 16k at a time. The traditional Unix tape
model is that a tape is a stream of records (with interspersed
filemarks, but let's not complicate things with them), each of which
has an inherent size recorded on the tape along with the data -
basically, the way half-inch nine-track tapes work. If you try to read
R bytes and the next record's size is N bytes, then if R>=N you get
the next record and a return value of N, while if R<N you get...I'm not
sure, probably either an error or a return value of N with the
remaining N-R bytes dropped.
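
A minimal sketch of that record model, in user-space C. Everything here (struct toy_tape, toy_tape_read) is made up for illustration and is not the st driver's interface. For the R<N case it returns an error and leaves the tape where it was; that is an arbitrary pick between the possibilities above, not a claim about what any real driver or drive does.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a record-oriented tape; all names here are made up. */
struct toy_record { const char *data; size_t len; };

struct toy_tape {
    const struct toy_record *rec;  /* records in tape order */
    size_t nrec;                   /* how many there are */
    size_t pos;                    /* index of the record under the head */
};

/*
 * Read the next record into buf, which holds r bytes.
 * Returns the record's size N when r >= N, 0 at end of tape, and
 * -1 when r < N (an assumption; the toy does not move the tape then).
 */
long toy_tape_read(struct toy_tape *t, void *buf, size_t r)
{
    const struct toy_record *rec;

    if (t->pos >= t->nrec)
        return 0;
    rec = &t->rec[t->pos];
    if (r < rec->len)
        return -1;
    memcpy(buf, rec->data, rec->len);
    t->pos++;               /* one read consumes exactly one record */
    return (long)rec->len;
}
```

Note that there is no offset argument anywhere: the only position is wherever the head happens to be, which is the point of the model.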
Some tapes don't fit this model, notably quarter-inch cartridge tapes,
but it's still the paradigm most of the tape subsystems and drivers
I've seen are designed around - and the model all the tape software
I've seen expects.
> Or worse, say, 80k blocks?)
You lose. :( Trying to write - or read - tape records over MAXPHYS has
never worked, not clear back to 4.2, possibly even 4.1c. I remember
running into a 63K limit because of this on the VAX under 4.x.
Well, not with tape drives fitting the nine-track model. Classic
quarter-inch cartridge tapes (eg, the DC600A) are streams of 512-byte
records, with no record boundaries; trying to read or write huge
records (eg, 1MB) just sources or sinks enough 512-byte physical
records to satisfy the request.
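
In toy form, the stream-style behaviour is just a division. The names (toy_qic_blocks, TOY_QIC_BLOCK) are invented, and rounding a request that is not a multiple of 512 up to the next whole record is my assumption, not something stated above.

```c
#include <stddef.h>

#define TOY_QIC_BLOCK 512  /* physical record size, per the text above */

/* Physical records a stream-style drive moves to satisfy nbytes.
 * Rounding up for odd sizes is an assumption of this sketch. */
size_t toy_qic_blocks(size_t nbytes)
{
    return (nbytes + TOY_QIC_BLOCK - 1) / TOY_QIC_BLOCK;
}
```

A 1MB request works out to toy_qic_blocks(1024 * 1024), which is 2048 physical records.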
> I have no idea what the proper way to resolve these discrepancies is.
> It appears the immediate problem can be hacked around by having st
> allocate a buf and pass it to physio, because that will cause physio
> to use only that buf instead of up to PHYSIO_CONCURRENCY of its own;
> but that's hardly a fix and doesn't even cover all the possible
> failure cases.
I offer the thought that maybe PHYSIO_CONCURRENCY should effectively be
forced to 1 for DV_TAPE devices (or, possibly, non-DV_DISK devices).
The physio model of breaking big requests up into multiple concurrent
small requests really doesn't make sense except for random-access
devices, so maybe it shouldn't be done except for random-access
devices.
However, this is coming from someone who doesn't really understand
4.0's physio paradigm, so it could be pure nonsense.
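
For what it's worth, the suggestion amounts to something like the following. This is a user-space sketch with invented names (enum toy_devclass, toy_concurrency_for); it is not how physio is actually structured, and it takes the broader "anything that is not a disk" variant.

```c
/* Toy stand-ins for device classes; not the kernel's definitions. */
enum toy_devclass { TOY_DISK, TOY_TAPE, TOY_TTY };

#define TOY_PHYSIO_CONCURRENCY 16

/* How many chunks of one request to keep in flight at once. */
int toy_concurrency_for(enum toy_devclass c)
{
    /* Only a random-access device can honour a per-chunk block
     * number, so only it gets more than one request at a time. */
    return c == TOY_DISK ? TOY_PHYSIO_CONCURRENCY : 1;
}
```

With that, toy_concurrency_for(TOY_TAPE) is 1, so a tape read would be issued one chunk at a time, in order.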
/~\ The ASCII der Mouse
\ / Ribbon Campaign
X Against HTML mouse%rodents-montreal.org@localhost
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B