Re: soft-halts machine on 4.99.49 kernel, with 4.99.48 userland.

On Fri, Jan 11, 2008 at 03:33:05PM -0800, Marc Tooley wrote:

I'll double-check the actual filesystem type (v1 or v2.. I think it was
newfs -o time -O 2) tonight, as it's paused again.

Looks like you are using softdep - I fixed a problem yesterday that could
have caused this. Can you please update your kernel sources and try again?

You are correct; good guess.

/dev/wd0a on / type ffs (local)
/dev/wd0e on /usr type ffs (soft dependencies, local)
/dev/wd0f on /v type ffs (soft dependencies, local)
/dev/raid0e on /v2 type ffs (soft dependencies, local)
kernfs on /kern type kernfs (local)

... I am using softdep. And I'm using ffsv2:

history | grep -i newfs
 7973  newfs -O 2 -o time /dev/raid0e

I have rsync'd and updated my sources as of a change you made:


"Initialize caches at IPL_SOFTBIO (not IPL_NONE) so that we are allocating from kmem_map."

.. I'm hoping that's the fix you're talking about. I've built a fresh kernel, installed it, and after monitoring it for approximately ten minutes, found it doing the same thing. This time it appeared to be triggered by my execution in another terminal:


Now it's been frozen for about five minutes. I wouldn't think a 2GB mem machine would generate that kind of complete pause.

Breaking into the kernel debugger nets me..

A bunch of active processes that appear stuck in vm_map again. I had to reboot the machine. I stripped out softdep from the mount options in fstab and am trying again...

... and getting much, much further. Lots of disk activity, lots of processes, I've restarted all the standard system daemons again, everything.
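
For anyone wanting to try the same workaround, stripping the option can be sketched roughly as below. This is only an illustration: the fstab entry is an assumed example rather than a copy of the real file, and strip_softdep is a hypothetical helper name. It filters text on stdin and does not modify /etc/fstab.

```shell
# Hypothetical helper: drop the softdep option from fstab-style lines on stdin.
# Handles softdep appearing after another option or before one.
strip_softdep() {
    sed -e 's/,softdep//' -e 's/softdep,//'
}

# Illustrative entry only (assumed layout, not taken from the real /etc/fstab).
printf '%s\n' '/dev/wd0e /usr ffs rw,softdep 1 2' | strip_softdep
```

After editing the real file by hand and rebooting (or remounting), the mount output should no longer list "soft dependencies" for the affected filesystems.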

Well, at least I have a working system again! Thanks for the hint. If you need more testing, or if it looks like I managed to build a kernel without your fix in it, please let me know.


