NetBSD-Bugs archive
Re: bin/57179 (occasional pkg_add core dumps)
The following reply was made to PR bin/57179; it has been noted by GNATS.
From: Christof Meerwald <cmeerw%cmeerw.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, riastradh%netbsd.org@localhost
Subject: Re: bin/57179 (occasional pkg_add core dumps)
Date: Mon, 11 Dec 2023 20:31:30 +0100
 Ok, just did some debugging on this now.
 
 To reproduce the issue it seems to be important that PKG_PATH is
 actually set to
 
   http://cdn.NetBSD.org/pub/pkgsrc/packages/NetBSD/aarch64/10.0/All
 
 With that setting, the URL actually redirects to
 
   http://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/aarch64/10.0/All/
 
 So we'll actually see two different hosts in the connection cache,
 differing only in case: cdn.NetBSD.org and cdn.netbsd.org
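
 To illustrate why the two spellings end up as separate cache entries,
 here is a minimal sketch. It assumes the cache compares host strings
 byte for byte; both helper names are hypothetical and are not part of
 libfetch.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: byte-exact comparison, so case matters. */
static int same_host_exact(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: case-insensitive comparison (POSIX strcasecmp). */
static int same_host_nocase(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcasecmp(a, b) == 0;
}
```

 Under the byte-exact comparison, "cdn.NetBSD.org" and "cdn.netbsd.org"
 are treated as different hosts; under the case-insensitive one they
 would be the same.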
 
 The source code I am referring to is
 http://cvsweb.netbsd.org/bsdweb.cgi/src/external/bsd/fetch/dist/libfetch/common.c?annotate=1.4&only_with_tag=MAIN
 
 The first bug I noticed is in "fetch_cache_get", where last_conn is
 initialized to NULL but never updated. This is probably just a
 resource/memory leak, as we'll always get into the "else" branch in
 line 382 (and throw away the initial part of the connection_cache).
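
 A minimal sketch of a lookup that keeps the predecessor pointer up to
 date, so that unlinking a match in the middle of the list does not drop
 the entries in front of it. The struct is a simplified stand-in for
 libfetch's connection type, and matching on the host alone is an
 assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the cached connection type. */
struct conn {
    const char *host;           /* assumption: cache keyed on host only */
    struct conn *next_cached;
};

static struct conn *connection_cache;

/* Find a cached connection for host, unlink it, and return it. */
static struct conn *cache_get(const char *host)
{
    struct conn *conn, *last_conn = NULL;

    assert(host != NULL);
    for (conn = connection_cache; conn != NULL;
        last_conn = conn, conn = conn->next_cached) {
        if (strcmp(conn->host, host) != 0)
            continue;
        if (last_conn != NULL)
            last_conn->next_cached = conn->next_cached;
        else
            connection_cache = conn->next_cached;
        conn->next_cached = NULL;
        return conn;
    }
    return NULL;
}
```

 The only difference from a loop that never updates last_conn is the
 "last_conn = conn" step in the loop increment.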
 
 But the main issue (the one that results in the core dumps) is in
 fetch_cache_put. There are actually two parts to it.
 
 First part (the one that leads to the memory corruption) is that after
 closing a connection in line 421, on the next iteration, "last" will
 be set to that closed connection. So if we then also close the next
 connection on that next iteration, the "next_cached" link in the list
 isn't updated correctly (as we are updating the "next_cached" of the
 closed connection). This then leads to the core dump on the next call
 to "fetch_cache_put".
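
 A minimal sketch of an eviction loop that stays consistent when two
 adjacent entries are closed in one pass: "last" only ever points at an
 entry that is still in the list, and the successor is read before the
 current entry is closed. The struct, the "evict" flag and close_conn
 are hypothetical stand-ins, not libfetch code.

```c
#include <assert.h>
#include <stddef.h>

struct conn {
    int id;
    int evict;                  /* hypothetical: 1 = entry should be closed */
    int closed;                 /* set by close_conn for illustration */
    struct conn *next_cached;
};

static struct conn *connection_cache;

/* Stand-in for the close callback; a real one would release the entry. */
static void close_conn(struct conn *c)
{
    assert(!c->closed);         /* closing twice would be a bug */
    c->closed = 1;
}

/* Unlink and close every marked entry; return how many were closed. */
static int evict_marked(void)
{
    struct conn *iter, *next, *last = NULL;
    int closed = 0;

    for (iter = connection_cache; iter != NULL; iter = next) {
        next = iter->next_cached;   /* read before iter may be closed */
        if (!iter->evict) {
            last = iter;            /* only surviving entries become "last" */
            continue;
        }
        if (last != NULL)
            last->next_cached = next;
        else
            connection_cache = next;
        close_conn(iter);
        closed++;
    }
    return closed;
}
```

 With entries 1 to 4 cached and entries 2 and 3 marked, entry 1 ends up
 linked directly to entry 4, because "last" still refers to entry 1 when
 entry 3 is unlinked.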
 
 Now the remaining issue is, why are we even closing two connections
 from the connection_cache in one single call to fetch_cache_put? The
 issue here is that once we reach the host_count limit, we don't reset
 the "host_count" and ultimately close all remaining connections in
 connection_cache (even if those connections are for different hosts
 that haven't reached the host_count limit). In my case
 connection_cache contained 4 connections for "cdn.NetBSD.org",
 followed by a connection for "cdn.netbsd.org". When trying to put
 another "cdn.NetBSD.org" connection into the cache, it realised that
 the fourth connection in the cache is over the host limit, closed it,
 and continued to the last "cdn.netbsd.org" connection. But as
 host_count wasn't decremented, it then proceeded to also close that
 "cdn.netbsd.org" connection (resulting in the linked-list corruption).
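
 One possible way to avoid both problems, as a sketch rather than a
 tested patch against common.c: count an entry against the limits only
 when it is kept, and apply the per-host limit only to entries for the
 same host as the connection being cached. The struct is simplified, and
 the limit values (8 global, 4 per host) are assumptions chosen to match
 the scenario described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct conn {
    const char *host;
    int closed;                 /* set by close_conn for illustration */
    struct conn *next_cached;
};

static struct conn *connection_cache;
static const int cache_global_limit = 8;    /* assumed value */
static const int cache_per_host_limit = 4;  /* assumed value */

/* Stand-in for the close callback; a real one would release the entry. */
static void close_conn(struct conn *c)
{
    assert(!c->closed);
    c->closed = 1;
}

/* Insert conn at the head, first evicting entries that exceed a limit. */
static void cache_put(struct conn *conn)
{
    struct conn *iter, *next, *last = NULL;
    int global_count = 0, host_count = 0;

    assert(conn != NULL);
    for (iter = connection_cache; iter != NULL; iter = next) {
        int same_host = strcmp(conn->host, iter->host) == 0;

        next = iter->next_cached;
        /* Keep the entry if there is still room for it plus conn. */
        if (global_count + 1 < cache_global_limit &&
            (!same_host || host_count + 1 < cache_per_host_limit)) {
            global_count++;
            host_count += same_host;
            last = iter;
            continue;
        }
        if (last != NULL)
            last->next_cached = next;
        else
            connection_cache = next;
        close_conn(iter);
    }
    conn->next_cached = connection_cache;
    connection_cache = conn;
}
```

 With four cached "cdn.NetBSD.org" connections followed by one
 "cdn.netbsd.org" connection, putting another "cdn.NetBSD.org"
 connection closes only the fourth entry and leaves the
 "cdn.netbsd.org" one in the cache.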
 
 Hope that helps.
 