Current-Users archive


Re: fsck seg fault failure on vmware -i386?



    Date:        Mon, 8 Feb 2010 03:47:12 -0000
    From:        yancm%sdf.lonestar.org@localhost
    Message-ID:  <6a4ac0db64a70fbf1ddfa220b5aae58e.squirrel%webmail.freeshell.org@localhost>

  | Hopefully not a lot of calls prior to dump call...

That's why I suggest a breakpoint on ctime() - no-one ever calls that
unless they're going to print the result (somewhere) - and for fsck, the
only possible somewhere is the screen - if you run the fsck_ffs once
(and correct nothing) you'd know how many dates (pieces of dates actually)
appear on the screen before the one in question should appear.  From your
earlier messages, I think there were just one or two of them.
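
To make the counting idea concrete, here is a minimal sketch (the helper
name format_stamp is made up for illustration, and this is not the code in
fsck_ffs) of a routine that prints only pieces of a date and reaches
ctime() exactly once per timestamp shown:

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper, not taken from fsck_ffs: format one timestamp
 * by way of ctime(), keeping only pieces of the date.
 * Returns 0 on success, -1 if ctime() fails or the buffer is too small.
 */
int
format_stamp(time_t t, char *out, size_t outlen)
{
	const char *p = ctime(&t);	/* "Www Mmm dd hh:mm:ss yyyy\n" */
	int n;

	/* Refuse anything that is not the usual 25 character layout. */
	if (p == NULL || strlen(p) < 24)
		return -1;

	/* Keep "Mmm dd hh:mm" and the 4 digit year, drop the rest. */
	n = snprintf(out, outlen, "%.12s %.4s", p + 4, p + 20);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

Called from a small main() and built with -g, a breakpoint set on ctime
in gdb stops once per call to this helper, so the number of dates on the
screen and the number of breakpoint hits stay in step.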

  | I should be, but gdb is in /usr so I can copy it out...don't know about
  | dependencies,

It has a few libraries it needs - copy those to /usr/lib in your root
filesys (copy them to /xxx/lib, then rename /xxx to /usr in single user mode).
That's the easiest way to avoid needing to fiddle the path list.
It will also want the termcap database (/usr/share/misc/termcap*),
which should likewise start in /xxx/share/... so it becomes /usr/share
when you rename /xxx.

  | Kinda annoying though since I can't copy/paste
  | or scroll back on the console.

Yes...   But fortunately not too much should be needed, I'd just really
like to know what is failing in the time code to make sure that there aren't
any more lingering bugs in there (the 64 bit code has been tested pretty
thoroughly now in "normal" environments, but I suspect has seen less use
in unusual circumstances than we'd like - hence the problem that we have
seen).


  | To do what you ask, I'll need a debug version of libc.

Just localtime really but ...

  | Christos had me copy localtime.c into src/sbin/fsck_ffs which
  | for whatever reason did not crash and leave core...

That is a problem, but please check to make sure that you're not using
the new fixed version of localtime (Christos fixed the problem we found
already, which if we're right, should stop the core dumps - now we just want
to know why this happened in the first place, given the time values that
it should have been converting - there doesn't seem to be any rational reason
why your test should have exposed the bug that we did find - so there must
be something irrational, and it would be really nice to know what that is).

If you don't have time, and the problem has gone away for practical purposes
for you with an up to date version of current (corrected localtime, and
corrected fsck_ffs) then we can just write this off as one of those things,
great to have found a problem, no idea why we found the problem type of
issue - with just the lingering doubt that perhaps something is still not
quite right under the hood.

  | Using build.sh (modifying the src/lib/libc/Makefile ?); how can
  | I make a debug version of libc (or at least the localtime.c pieces)?

Don't rely upon me for this, someone else who understands the build
machinery better will be able to give a better answer, but I'd think
that adding

COPTS+=-g

in src/lib/libc/Makefile would probably work (totally untested advice...).
Whether that results in a full release that works is not important,
just whether it results in a libc that has symbols in it.

But as I said, I don't think you will need this in any case, just localtime.

But if you cannot make the corrected localtime crash, then I can't think
of a reason why it would make a difference whether that localtime is in
libc or in the fsck_ffs source - it is the same code compiled by the same
compiler, it either works, or doesn't, either way - that's why copying the
source file from libc into the application was a good (and easy) way to
get it compiled with -g so gdb can look inside.
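
As a rough sketch of what could be stepped through once localtime is
compiled with -g (probe_localtime is a made-up name, and the time values
to feed it are whatever turns out to be in the inodes in question, which
I don't know), something like this reports whether localtime() converts a
given value at all:

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical probe, not part of libc or fsck_ffs.
 * Returns 1 if localtime() converts t, 0 if it returns NULL.
 * On success the calendar year is stored in *year (if year is not NULL).
 */
int
probe_localtime(time_t t, int *year)
{
	struct tm *tm = localtime(&t);

	if (tm == NULL)
		return 0;	/* conversion refused: worth a look in gdb */
	if (year != NULL)
		*year = tm->tm_year + 1900;	/* tm_year counts from 1900 */
	return 1;
}
```

A breakpoint on probe_localtime, then single stepping into localtime,
shows where a given value goes wrong, whichever copy of localtime.c was
linked in.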

  | Hmm. May look at it if time... be nice to not have to work in the
  | console window...

Yes, being able to connect via an xterm makes life easier - if you're
doing this, since you most likely aren't overly concerned with security
of communications between your xterm and the process inside vmware, I'd
be trying telnet rather than ssh, much less will be needed to make telnet
function than is needed for ssh.

kre


