Port-amd64 archive


Re: EINVAL from copyin/out - how?



    Date:        Fri, 15 Aug 2014 13:38:09 +0200
    From:        "Thomas Schmitt" <scdbackup%gmx.net@localhost>
    Message-ID:  <17202659085853950856%scdbackup.webframe.org@localhost>


  | One should also find out, whether cd9660_read() is called
  | in the course of read(2) on a regular file.

It is, that's where the calls to the functions that eventually call
copyout() occur - and it is copyout() that returns the EINVAL to user space.

  | (In that case one would have to learn how this function actually
  |  works by help of ubc_uiomove(9) resp. how ubc_uiomove(9) learns
  |  about the data block numbers of the regular file.)

In theory (if it works the way I understand it should) the data should
already be present in buffers somewhere by the time it reaches there;
that's also where the error occurs... (cd9660_read() calls ubc_uiomove()
which calls .. which calls .. copyout).

  | Well, cd9660 has a few EINVALs. But except one in cd9660_read()
  |         if (uio->uio_offset < 0)
  |                 return (EINVAL);

No, it isn't that one - it is quite certain that it is happening inside
copyout (the code does "return copyout(k, ua, len);", and that
return is returning EINVAL).  I changed it to
        error = copyout(...);
        if (error == EINVAL) printf();
        return error;
and the printf fires.

Since copyout itself cannot return EINVAL (that is, the .S code that
implements it cannot), it must be happening (so it seems) via some trap that
returns with EINVAL in rax - causing copyout to return EINVAL.  Whether it is
some kind of trap related to reading the kernel address data (the address
itself looks OK, the printf prints the args to copyout), or a trap related
to the write into user space, I have no idea at the minute (it could even be
some unrelated interrupt not restoring registers properly - but given the way
the error occurs, when it occurs, that's incredibly unlikely).
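
To make the suspected mechanism concrete, here is a user-space model of
it.  This is only a sketch, not NetBSD kernel code: every sim_* name is
hypothetical, setjmp/longjmp stands in for the fault recovery point, and
the "trap" is injected by hand rather than raised by the MMU.  It shows
how a copy routine that contains no EINVAL of its own can still hand
back whatever error the fault path chose.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the recovery point a copy routine registers. */
jmp_buf *sim_onfault = NULL;

/* Hypothetical knobs: byte offset at which to inject a fault (-1 = never)
 * and the error code the simulated fault path reports. */
long sim_fault_at = -1;
int sim_fault_error = EFAULT;

/* Error chosen by the simulated trap, read back at the recovery point. */
int sim_pending_error = 0;

/* Simulated trap: record the error and resume at the recovery point. */
void sim_trap(int error)
{
    assert(sim_onfault != NULL);    /* no recovery point: nothing to resume */
    sim_pending_error = error;
    longjmp(*sim_onfault, 1);
}

/* Simulated copyout: note that no error constant appears in this body. */
int sim_copyout(const void *kaddr, void *uaddr, size_t len)
{
    const unsigned char *src = kaddr;
    unsigned char *dst = uaddr;
    jmp_buf recover;

    if (setjmp(recover) != 0) {
        /* Reached only via sim_trap(): return the trap's error code. */
        sim_onfault = NULL;
        return sim_pending_error;
    }
    sim_onfault = &recover;

    for (size_t i = 0; i < len; i++) {
        if ((long)i == sim_fault_at)
            sim_trap(sim_fault_error);  /* the access "faults" here */
        dst[i] = src[i];
    }

    sim_onfault = NULL;
    return 0;
}
```

With sim_fault_at left at -1 the copy completes and returns 0.  Setting
sim_fault_at to an offset inside the buffer and sim_fault_error to EINVAL
makes sim_copyout() return EINVAL, even though the copy loop never
mentions it.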

I did wrap all the bread() (and breadn()) calls in the cd9660
filesystem code (only the ones in that directory - occurrences in 4 files)
and tried it on one of my discs that previously always gave errors on
files in the sub-dirs...  this time it read the whole thing perfectly
(all 4.3GB of it).  No errors reported anywhere.  So, I went back to
an unmodified kernel (before any of my trace hackery) which used to fail
every time, and that one also read the whole disc without errors!

I really hate non-deterministic bugs!   Must be something related to the
phase of the moon!

Thanks anyway - assuming I can get it to start failing again, I'll keep
digging.

kre


