Port-amd64 archive


Re: EINVAL from copyin/out - how?



Hi,

I'm not a NetBSD expert, but I had reason to look into its cd9660
code while developing two PRs (48808, 48959).


Robert Elz:
> At this point it looks to be some kind of setup problem on amd64, when it
> reads nested directories, building some data struct that results in EINVAL
> from the copyout

I did not see any architecture-specific code in cd9660.

Given that you located the origin of the EINVAL outside of
cd9660 (or rather underneath it), I expect that ISO 9660
structure aspects are only indirectly involved.
In particular, the difference between i386 and amd64 can hardly
be explained by ISO 9660 aspects.

The connection between ISO 9660 and data file content blocks is
made by cd9660_bmap(), which implements VOP_BMAP(9).


> diff -r of the mount points of the real DVD (/cdrom) and the
> mounted vnd0a (/mnt) - that completed without error.
> What's more, after that, tar had no problem reading the DVD either!

It is unlikely that the DVD delivers varying data blocks
from the ISO 9660 directory files, which would lure cd9660_bmap()
into requesting invalid data block addresses for file content.

My favorite suspect would now be the code underneath bread(9),
especially the mechanism which looks up the buffer and
decides whether a physical read operation on the DVD is needed.


> If anyone has any suggestions for
> possible sources, I'll happily mangle my kernel

How about a wrapper around bread(9) which checks the return
value of bread(9) and, in case of an error, prints a message
which tells the failed block number?
Then use that wrapper in place of all the bread(9) calls in cd9660.
(Should be possible with a few vi commands on the few cd9660*.c
 files.)
This would make clear whether the problem is indeed underneath
bread(9) or whether bread(9) gets an implausible block number
from cd9660.
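
To illustrate the idea, here is a minimal userland sketch of such a
wrapper. It is not kernel code: fake_bread(), blk_t,
cd9660_bread_checked() and bread_fail_count are hypothetical names,
and fake_bread() has a simplified parameter list. The real bread(9)
takes more arguments (vnode, block number, size, flags, buffer
pointer); please check the man page of your kernel version. In the
kernel one would use the kernel printf instead of fprintf().

```c
#include <errno.h>
#include <stdio.h>

typedef long long blk_t;        /* hypothetical stand-in for daddr_t */

int bread_fail_count;           /* number of failed reads seen so far */

/*
 * Hypothetical stand-in for bread(9) so that the sketch compiles in
 * userland. It pretends that only blocks 0..999 are readable.
 */
int fake_bread(blk_t blkno, int size, void **bpp)
{
    (void)size;
    *bpp = NULL;
    return (blkno < 0 || blkno >= 1000) ? EINVAL : 0;
}

/*
 * The wrapper: same parameters and return value as the wrapped call,
 * but a failure is reported together with the block number.
 */
int cd9660_bread_checked(blk_t blkno, int size, void **bpp)
{
    int error = fake_bread(blkno, size, bpp);

    if (error != 0) {
        bread_fail_count++;
        fprintf(stderr, "cd9660: bread failed: blkno=%lld size=%d error=%d\n",
            blkno, size, error);
    }
    return error;
}
```

Since the wrapper returns exactly what the wrapped call returned,
replacing the calls should not change the behavior of cd9660 apart
from the additional messages.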

If the block number of a failure is within the size range of the ISO
image, then its content could be looked up with dd and be made
human readable by od or the like. Directory records show some
characteristic redundancy. So there is hope we can tell whether
it is metadata or data file content.
If the content is inconclusive, then I'd need the whole ISO
image in order to tell what the affected block means.
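
The redundancy in directory records comes from ISO 9660 storing
several numbers twice, once little-endian and once big-endian. A
rough sketch of a plausibility check on the bytes of a block
follows. The function name looks_like_dir_record() and the helpers
are hypothetical; the byte offsets are those of the ISO 9660
directory record layout. A passing check is only a hint, not proof,
since file content could match by accident.

```c
#include <stddef.h>
#include <stdint.h>

uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
        (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

uint32_t be32(const unsigned char *p)
{
    return (uint32_t)p[3] | (uint32_t)p[2] << 8 |
        (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24;
}

/*
 * Return 1 if the bytes at rec look like an ISO 9660 directory
 * record, 0 otherwise. avail is the number of bytes from rec to
 * the end of the block.
 */
int looks_like_dir_record(const unsigned char *rec, size_t avail)
{
    unsigned int len, name_len;

    if (avail < 34)
        return 0;
    len = rec[0];                   /* length of the whole record */
    name_len = rec[32];             /* length of the file identifier */
    if (len < 34 || len > avail)
        return 0;
    if (name_len == 0 || 33 + name_len > len)
        return 0;
    /* Extent location: little-endian at 2..5, big-endian at 6..9 */
    if (le32(rec + 2) != be32(rec + 6))
        return 0;
    /* Data length: little-endian at 10..13, big-endian at 14..17 */
    if (le32(rec + 10) != be32(rec + 14))
        return 0;
    /* Volume sequence number: little-endian 28..29, big-endian 30..31 */
    if (rec[28] != rec[31] || rec[29] != rec[30])
        return 0;
    return 1;
}
```

A directory block begins with a record, so the first call would be
with rec pointing to the start of the block. The next record begins
rec[0] bytes later.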


As an alternative to inspecting the ISO image, or if the block
number is implausible, you could equip the wrapper and its calls
with a string parameter which identifies the wrapper's caller.
That way we could make a connection to particular cd9660 operations
and see whether it is always the same cd9660 operation that fails.
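
A userland sketch of a wrapper with a caller string could look like
this. All names are hypothetical (stub_read(), read_checked_from(),
READ_CHECKED, demo_lookup(), demo_readdir()), and stub_read() only
stands in for bread(9) with a simplified parameter list. As a
variation on the explicit string, the macro passes the C99
identifier __func__, so the call sites need no hand-written names.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef long long blk_t;        /* hypothetical stand-in for daddr_t */

const char *last_failed_caller = "";

/* Hypothetical stand-in for bread(9): only blocks 0..999 are readable. */
int stub_read(blk_t blkno)
{
    return (blkno < 0 || blkno >= 1000) ? EINVAL : 0;
}

/* Wrapper which reports the failed block number and who asked for it. */
int read_checked_from(const char *caller, blk_t blkno)
{
    int error = stub_read(blkno);

    if (error != 0) {
        last_failed_caller = caller;
        fprintf(stderr, "%s: block read failed: blkno=%lld error=%d\n",
            caller, blkno, error);
    }
    return error;
}

/* __func__ expands to the name of the function containing the call. */
#define READ_CHECKED(blkno) read_checked_from(__func__, (blkno))

/* Two hypothetical call sites standing in for cd9660 operations. */
int demo_lookup(blk_t blkno)  { return READ_CHECKED(blkno); }
int demo_readdir(blk_t blkno) { return READ_CHECKED(blkno); }
```

If a function contains more than one read call, an explicit string
per call site (or __LINE__ in addition) would be needed to tell
them apart.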


> ps: I can make the 2.1GiB .iso image available,

If you have an image which never shows problems and one that
reliably shows problems (e.g. on the first tar) then we could
look at the block addresses of the data files and directory
files which are involved.

As I said, it would be interesting to look up the blocks of
failing bread(9) calls.


Have a nice day :)

Thomas

