tech-kern: Re: security/36712: tar extraction cause cannot create pipe, too

Subject: Re: security/36712: tar extraction cause cannot create pipe, too
To: None <gnats-bugs@NetBSD.org, tech-kern@netbsd.org>
From: George Georgalis <george@galis.org>
List: tech-kern
Date: 10/04/2007 12:41:35
On Tue, Oct 02, 2007 at 06:48:55PM -0400, George Georgalis wrote:
>On Fri, Sep 14, 2007 at 05:16:56PM +0100, Andrew Doran wrote:
>>On Fri, Sep 14, 2007 at 11:51:00AM -0400, George Georgalis wrote:
>>
>>> I've been working on reproducing an archive (with public data)
>>> that causes a kernel panic on netbsd-3 and RC1. It was suggested
>>> I use sysctl to get stats on how much memory softupdates is using,
>>> but that seems available in FreeBSD only.
>>> 
>>> Is there a way I can get memory stats for soft updates in a netbsd
>>> generic?  It would also be useful to know how many pipe resources
>>> are being used.
>>> 
>>> Maybe there are some other resources I could look at too? The
>>> crash happens when extracting a 21GB tar.bz2 archive with lots of
>>> hardlinks and data with a 10:1 compression ratio.
>>
>>Yes, have a look at the output of vmstat -m. The softdep pools are as below.
>>Obviously this system is not running softdep, so there are no numbers. You
>>probably also want to track the number of buffers in use. Look at the buf*
>>pools and/or use 'systat bufcache'.
>>
>>The maximum number of softdep operations is bounded by max_softdeps; by
>>default it's set to:
>>
>>	max_softdeps = desiredvnodes * 4;
>>
>>Andrew
>>
>>Memory resource pool statistics
>>Name        Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
>>sdpcpool     124        0    0        0     0     0     0     0     0   inf    0
>>pagedeppl     68        0    0        0     0     0     0     0     0   inf    0
>>inodedeppl    88        0    0        0     0     0     0     0     0   inf    0
>>newblkpl      36        0    0        0     0     0     0     0     0   inf    0
>>bmsafemappl   36        0    0        0     0     0     0     0     0   inf    0
>>allocdirectpl 80        0    0        0     0     0     0     0     0   inf    0
>>indirdeppl    32        0    0        0     0     0     0     0     0   inf    0
>>allocindirpl  64        0    0        0     0     0     0     0     0   inf    0
>>freefragpl    40        0    0        0     0     0     0     0     0   inf    0
>>freeblkspl   172        0    0        0     0     0     0     0     0   inf    0
>>freefilepl    36        0    0        0     0     0     0     0     0   inf    0
>>diraddpl      36        0    0        0     0     0     0     0     0   inf    0
>>mkdirpl       32        0    0        0     0     0     0     0     0   inf    0
>>dirrempl      36        0    0        0     0     0     0     0     0   inf    0
>>newdirblkpl   20        0    0        0     0     0     0     0     0   inf    0
>>
>
>Thanks. These must be the items related specifically to soft updates?
>
>Trying to narrow my problem I've applied the note in 
>ftp://ftp.netbsd.org/pub/NetBSD-daily/netbsd-3/200709300000Z/LAST_MINUTE
>regarding setting vm.bufmem_hiwater and vm.bufmem_lowater, my
>hiwater was negative 1.7 GB (-1718112256) in a 16GB quad core
>(2x2cpu) opteron. The problem persisted with reasonable values set
>at boot. Next step was to reduce to 2GB RAM, remove FC altogether,
>and stress test sata (w/o softupdates) --- no problem. I'm now
>stressing over LSI FC with 2GB RAM. If that passes, I'll try again
>with soft updates. If that works I think I've identified the cause
>as having over 2GB of RAM on this host. (It failed with 4GB too.)
>
>As I began this run, I got "deadbeef" to stderr with the vmstat -H
>command... it doesn't do it now, but how significant is a
>corrupted hash chain?
>
>All testing of late has been on pre RC2, netbsd-4 kernel and
>userland.

I've not been able to reproduce the lockups on LSI fibre, with
RAM physically reduced to 2GB. If someone thinks this is a
softupdates issue I can try using them. But it seems to be a kernel
initialization issue, something in addition to the vm.bufmem_lowater
and vm.bufmem_hiwater settings, when there is more than 2GB of RAM on
amd64.
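
A possible reading of the negative vm.bufmem_hiwater quoted above, offered
as an assumption rather than a diagnosis: -1718112256 is what a byte count
of roughly 2.4GB looks like when it is stored in a signed 32-bit integer.
The sketch below only checks that arithmetic. The helper names to_u32 and
to_s32 are made up for illustration, and it assumes a shell with 64-bit
arithmetic.

```shell
#!/bin/sh
# Reinterpret a signed 32-bit value as the unsigned value with the same bits.
# Assumes the shell does 64-bit integer arithmetic.
to_u32() {
    echo $(( ($1 + 4294967296) % 4294967296 ))
}

# Wrap a non-negative byte count into the signed 32-bit range.
to_s32() {
    echo $(( (($1 + 2147483648) % 4294967296) - 2147483648 ))
}

hiwater=-1718112256             # value reported earlier in this thread
bytes=$(to_u32 "$hiwater")
echo "unsigned: $bytes bytes ($(( bytes / 1048576 )) MiB)"
echo "wrapped back: $(to_s32 "$bytes")"
```

This prints 2576855040 bytes (2457 MiB) and wraps back to -1718112256.
The unsigned value is close to 15% of 16GB, which would be a plausible
buffer cache size on this host, though nothing in this thread confirms
how the kernel computed it.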

BTW this is a Dual Opteron supermicro H8DME-2 Socket F Motherboard
w/ 2216HE processors.

I don't think this is a security issue anymore; the category
should be kern.  Also, this is happening with netbsd-4 in addition
to netbsd-3.

>Category:      kern
>Release:       NetBSD 3.1_STABLE, NetBSD 4.0_RC1 Wed Sep 26 14:52:57 EDT 2007
>How-To-Repeat: Heavy disk I/O with RAM over 2GB on amd64
>Fix:           Reduce RAM to 2GB or less

Also, still getting deadbeef with vmstat -H

 # grep -v deadbeefdeadbeef tmp/chime.fd2 | wc -l
   26850
 # grep -v deadbeefdeadbf67 tmp/chime.fd2 | wc -l
   25209
 # grep -v deadbeefdeadbf0f tmp/chime.fd2 | wc -l
   41849
 # tail -n1 tmp/chime.fd2                   
vmstat: kptr deadbeefdeadbeef: hash chain corrupted: kvm_read: Bad address

seems to be a 16-digit hex value (an 8-byte pointer)?
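
Note that grep -v prints the lines that do not match, so the counts above
are of lines lacking each value. If the goal is to see how often each
deadbeef-prefixed pointer appears, a tally along these lines may be easier
to read. It is a sketch: the function name tally_deadbeef is hypothetical,
and it assumes each value appears as a whitespace-separated 16-digit token,
optionally followed by a colon as in the last line of the log above.

```shell
#!/bin/sh
# Count occurrences of each 16-digit token starting with "deadbeef"
# on standard input, most frequent first.
tally_deadbeef() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            tok = $i
            sub(/:$/, "", tok)          # drop a trailing colon, as in "kptr X:"
            if (length(tok) == 16 && tok ~ /^deadbeef[0-9a-f]+$/)
                count[tok]++
        }
    }
    END { for (t in count) print count[t], t }
    ' | sort -rn
}

# Demo on the one log line quoted in this message.
echo 'vmstat: kptr deadbeefdeadbeef: hash chain corrupted: kvm_read: Bad address' |
    tally_deadbeef
```

Against the full log it would be run as: tally_deadbeef < tmp/chime.fd2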

// George


-- 
George Georgalis, information system scientist <IXOYE><