tech-kern archive

Re: continued zfs-related lockups



On Thu, Oct 24, 2024 at 09:42:55AM -0400, Greg Troxel wrote:
>   I see processes in flt_noram5 and they persistently remain there after
>   RAM becomes available.

"flt_noram5" means "wait until the pagedaemon signals that it has finished
a cycle of trying to free pages".  when threads stay stuck here even after
pages have been freed, that usually means the pagedaemon itself is stuck in
a locking deadlock.  what is the stack trace of the pagedaemon thread
in your hangs?


>   - Is there a way in ddb to issue a wakeup on flt_noram5?

you could do the ddb equivalent of "wakeup(&uvmexp.free)",
i.e. "call wakeup(ADDR)" where ADDR is the address of uvmexp.free
(the value of "&uvmexp.free").


>   - If I wanted to change the kernel to every so often (30s?) issue a
>     wakeup to flt_noram5, where/how should I do this?  Or, should there
>     be a once/second that goes to the next process and wakes it up, as a
>     debug option?  Or, why I am wrong to want to do this?

there's no "next process"; the pagedaemon always wakes up every thread
that has gone to sleep waiting for the pagedaemon to make some progress.
you could use a periodic wakeup as a debugging tool, sure.  but it's
usually enough to check the stack trace of the pagedaemon thread
to see whether the problem is that the pagedaemon thread is hung.
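
to illustrate the "wakes up every thread" point, here is a minimal
userspace sketch using POSIX threads.  it is an analogy only, not NetBSD
kernel code: the condition variable "wake" stands in for the sleep channel,
and the names chan, waiter, simulate_wakeup and MAX_WAITERS are all
hypothetical.  every waiter sleeps on the same channel, and one broadcast
releases all of them, so there is no queue to walk one process at a time.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WAITERS 64		/* hypothetical cap for this sketch */

/* Userspace stand-in for a kernel sleep channel (assumption, not uvm code). */
struct chan {
	pthread_mutex_t lock;
	pthread_cond_t  wake;	/* waiters sleep here */
	pthread_cond_t  ready;	/* waker learns that a waiter is asleep */
	unsigned        gen;	/* bumped once per wakeup */
	int             asleep;	/* waiters that have gone to sleep */
	int             woken;	/* waiters that have returned from sleep */
};

static void *
waiter(void *arg)
{
	struct chan *c = arg;
	unsigned seen;

	pthread_mutex_lock(&c->lock);
	seen = c->gen;
	c->asleep++;
	pthread_cond_signal(&c->ready);
	/* Sleep until the generation changes; this guards against
	 * spurious wakeups from pthread_cond_wait(). */
	while (c->gen == seen)
		pthread_cond_wait(&c->wake, &c->lock);
	c->woken++;
	pthread_mutex_unlock(&c->lock);
	return NULL;
}

/* Start nwaiters sleepers, issue ONE broadcast, return how many woke up. */
int
simulate_wakeup(int nwaiters)
{
	struct chan c;
	pthread_t tid[MAX_WAITERS];
	int started = 0, woken, i;

	if (nwaiters < 0 || nwaiters > MAX_WAITERS)
		return -1;

	pthread_mutex_init(&c.lock, NULL);
	pthread_cond_init(&c.wake, NULL);
	pthread_cond_init(&c.ready, NULL);
	c.gen = 0;
	c.asleep = 0;
	c.woken = 0;

	for (i = 0; i < nwaiters; i++) {
		if (pthread_create(&tid[started], NULL, waiter, &c) != 0)
			break;
		started++;
	}

	/* Wait until every started thread is asleep on the channel. */
	pthread_mutex_lock(&c.lock);
	while (c.asleep < started)
		pthread_cond_wait(&c.ready, &c.lock);
	/* A single broadcast wakes all sleepers, not just the "next" one. */
	c.gen++;
	pthread_cond_broadcast(&c.wake);
	pthread_mutex_unlock(&c.lock);

	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	woken = c.woken;

	pthread_cond_destroy(&c.ready);
	pthread_cond_destroy(&c.wake);
	pthread_mutex_destroy(&c.lock);
	return woken;
}
```

calling simulate_wakeup(4) from a small main() should report that all four
sleepers returned after the one broadcast.  the flip side is the failure
mode described above: if the thread responsible for the broadcast never
runs, every sleeper stays asleep no matter how much memory is free.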


>   - Somehow, processes waiting on pools do not get woken up when
>     presumably the pool code was waiting on RAM, and RAM becomes
>     available.  Or at least it seems that way.  How is this supposed to
>     work?

the pagedaemon thread isn't supposed to get stuck in locking deadlocks.  :-)


>   - My belief is that even if zfs is piggy, the system should not lock
>     up, and that absent bugs I would be complaining "zfs piggyness leads
>     to paging out stuff and making the system slow" instead.  Correct?

yes, that is correct.

-Chuck

