Current-Users archive


Re: random lockups (now suspecting zfs)



Paul Ripke <stix%stix.id.au@localhost> writes:

> On Fri, Oct 20, 2023 at 01:11:15PM -0400, Greg Troxel wrote:
>> A different machine has locked up, running recent netbsd-10.  I was
>> doing pkgsrc rebuilds in zfs, in a dom0 with 4G of RAM, with 8G total
>> physical.  It has a private patch to reduce the amount of memory used
>> for ARC, which has been working well.

I have had an additional lockup each on my main machine and my xen/pkg
machine.

>> All 3 tmux windows show something like
>> 
>>   [ 373598.5266510] load: 0.00  cmd: bash 21965 [flt_noram5] 0.37u 2.89s 0% 6396k
>> 
>> and I can switch among them and ^T, but trying to run top is stuck (in
>> flt_noram5).  I'll give it an hour or so, and have a look at the
>> console.
>
> Curious - do you have swap configured? On what kind of device?
> I'm wondering if a pageout is wedged waiting for memory...

I do have swap configured.

  wd0 at atabus0 drive 0
  wd0: <Samsung SSD 870 EVO 4TB>
  wd0: drive supports 1-sector PIO transfers, LBA48 addressing
  wd0: 3726 GB, 7752021 cyl, 16 head, 63 sec, 512 bytes/sect x 7814037168 sectors
  wd0: GPT GUID: 7f026840-bd44-4063-be7c-647727ac10d6
  dk2 at wd0: "GDT-3276-4/swap", 83886080 blocks at 4458496, type: swap
  root on dk1 dumps on dk2
  Device      1024-blocks     Used    Avail Capacity  Priority
  /dev/dk2       41943040        0 41943040     0%    0

  wd0 at atabus0 drive 0
  wd0: <SanDisk SD8SB8U1T001122>
  wd0: drive supports 1-sector PIO transfers, LBA48 addressing
  wd0: 953 GB, 1984533 cyl, 16 head, 63 sec, 512 bytes/sect x 2000409264 sectors
  Device      1024-blocks     Used    Avail Capacity  Priority
  /dev/wd0b      16777656    49384 16728272     0%    0

The first is a GPT partition mounted by NAME, and the second is a
disklabel partition.  I don't expect the first machine to really swap,
but the second definitely has memory pressure.  Interestingly, none of
the xen domUs have locked up; that is, I've never found a domU wedged
while the dom0 was ok.
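
As a sanity check on the numbers above, here is a small sketch (not part
of the original report) that parses the two swap listing lines and the
dk2 wedge line and confirms they agree.  It assumes the listing is in
1024-byte blocks, as its header says, and that wedge sizes are counted
in the 512-byte sectors reported by wd0.  The helper names
parse_swap_line and wedge_blocks are made up for illustration.

```python
import re

SECTOR_BYTES = 512  # from the "512 bytes/sect" lines in the dmesg output


def parse_swap_line(line):
    """Split one data line of the swap listing into named integer fields (KiB)."""
    device, total, used, avail, _capacity, priority = line.split()
    return {
        "device": device,
        "total_kib": int(total),
        "used_kib": int(used),
        "avail_kib": int(avail),
        "priority": int(priority),
    }


def wedge_blocks(line):
    """Return the block count from a 'dkN at wd0: ... N blocks at M' line."""
    match = re.search(r"(\d+) blocks at \d+", line)
    if match is None:
        raise ValueError("not a wedge line: " + line)
    return int(match.group(1))


dk2 = parse_swap_line("/dev/dk2       41943040        0 41943040     0%    0")
wd0b = parse_swap_line("/dev/wd0b      16777656    49384 16728272     0%    0")
dk2_blocks = wedge_blocks(
    'dk2 at wd0: "GDT-3276-4/swap", 83886080 blocks at 4458496, type: swap'
)

# Wedge size in KiB; this should match the total reported for /dev/dk2.
dk2_kib_from_wedge = dk2_blocks * SECTOR_BYTES // 1024

for swap in (dk2, wd0b):
    gib = swap["total_kib"] / (1024 * 1024)
    used_mib = swap["used_kib"] / 1024
    print(f"{swap['device']}: {gib:.2f} GiB total, {used_mib:.1f} MiB used")
```

Read this way, the first machine has 40 GiB of swap with none in use,
and the second has about 16 GiB with roughly 48 MiB in use, which is
consistent with only the second machine being under memory pressure.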

So to me this feels like a locking botch in a rare path in zfs.



