NetBSD-Users archive
Finding bottlenecks on a proxy server
I'm using squid and dansguardian as a content-filtering web-proxy combo.
These are arranged in the chain:
clients -> squid (1) -> dansguardian -> squid (2) -> Internet
The first squid provides the most flexible filtering (time of day, MAC 
address, user authentication, client IP/port, etc.). It then speaks to 
dansguardian as an upstream proxy. dansguardian then speaks to a 
second squid instance to do the actual fetching (some requests bypass 
dansguardian on the basis of certain access rules).
squid (1) runs as user squid, as a single process. Its process 
limits are:
proc.868.rlimit.cputime.soft = unlimited
proc.868.rlimit.cputime.hard = unlimited
proc.868.rlimit.filesize.soft = unlimited
proc.868.rlimit.filesize.hard = unlimited
proc.868.rlimit.datasize.soft = 8589934592
proc.868.rlimit.datasize.hard = 8589934592
proc.868.rlimit.stacksize.soft = 4194304
proc.868.rlimit.stacksize.hard = 134217728
proc.868.rlimit.coredumpsize.soft = unlimited
proc.868.rlimit.coredumpsize.hard = unlimited
proc.868.rlimit.memoryuse.soft = 6220451840
proc.868.rlimit.memoryuse.hard = 6220451840
proc.868.rlimit.memorylocked.soft = 2073483946
proc.868.rlimit.memorylocked.hard = 6220451840
proc.868.rlimit.maxproc.soft = 1024
proc.868.rlimit.maxproc.hard = 2068
proc.868.rlimit.descriptors.soft = 24576
proc.868.rlimit.descriptors.hard = 24576
proc.868.rlimit.sbsize.soft = unlimited
proc.868.rlimit.sbsize.hard = unlimited
proc.868.rlimit.vmemoryuse.soft = unlimited
proc.868.rlimit.vmemoryuse.hard = unlimited
proc.868.rlimit.maxlwp.soft = 1024
proc.868.rlimit.maxlwp.hard = 2048
dansguardian runs as user dangrdn, as a traditional 
forking parent/child pool. It listens on port 8124. There is a limit of 
250 child processes. The parent process has the following limits:
proc.1231.rlimit.cputime.soft = unlimited
proc.1231.rlimit.cputime.hard = unlimited
proc.1231.rlimit.filesize.soft = unlimited
proc.1231.rlimit.filesize.hard = unlimited
proc.1231.rlimit.datasize.soft = 268435456
proc.1231.rlimit.datasize.hard = 8589934592
proc.1231.rlimit.stacksize.soft = 4194304
proc.1231.rlimit.stacksize.hard = 134217728
proc.1231.rlimit.coredumpsize.soft = unlimited
proc.1231.rlimit.coredumpsize.hard = unlimited
proc.1231.rlimit.memoryuse.soft = 6220451840
proc.1231.rlimit.memoryuse.hard = 6220451840
proc.1231.rlimit.memorylocked.soft = 2073483946
proc.1231.rlimit.memorylocked.hard = 6220451840
proc.1231.rlimit.maxproc.soft = 320
proc.1231.rlimit.maxproc.hard = 320
proc.1231.rlimit.descriptors.soft = 320
proc.1231.rlimit.descriptors.hard = 320
proc.1231.rlimit.sbsize.soft = unlimited
proc.1231.rlimit.sbsize.hard = unlimited
proc.1231.rlimit.vmemoryuse.soft = unlimited
proc.1231.rlimit.vmemoryuse.hard = unlimited
proc.1231.rlimit.maxlwp.soft = 1024
proc.1231.rlimit.maxlwp.hard = 2048
The parent process has a unix socket connection to each child and a 
few extra descriptors for logs, files, etc. It appears to use 6 more 
descriptors than it currently has children, so the limits above should be ample.
Each child uses 10 (if dormant) or 12 (if active) descriptors and they 
inherit the above limits.
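
The descriptor arithmetic above can be checked with a short Python sketch. The figures are the ones quoted in this post; the function names are made up for illustration, and the "children + 6" rule is only the observed behaviour, not something taken from the dansguardian source.

```python
# Figures quoted in the post; names are illustrative only.
MAX_CHILDREN = 250        # dansguardian child limit
PARENT_OVERHEAD = 6       # extra descriptors the parent appears to hold
PARENT_FD_LIMIT = 320     # rlimit.descriptors of the parent (soft = hard)
CHILD_FDS_DORMANT = 10    # descriptors per idle child
CHILD_FDS_ACTIVE = 12     # descriptors per busy child


def parent_fds(children):
    """Descriptors held by the parent: one socket per child plus overhead."""
    return children + PARENT_OVERHEAD


def pool_fds(children, active):
    """Descriptors held by the parent and all children together."""
    dormant = children - active
    return (parent_fds(children)
            + active * CHILD_FDS_ACTIVE
            + dormant * CHILD_FDS_DORMANT)


# Parent at the child limit: 250 + 6 = 256, leaving 64 below the 320 limit.
print(parent_fds(MAX_CHILDREN), PARENT_FD_LIMIT - parent_fds(MAX_CHILDREN))

# Whole pool with every child active: 256 + 250 * 12 = 3256.
print(pool_fds(MAX_CHILDREN, MAX_CHILDREN))
```

On these numbers neither the parent (256 of 320) nor any single child (12 of 320) comes near its per-process limit. The pool-wide total only matters against system-wide limits, which are not listed here.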
The second squid process runs as user nobody and is (currently) configured 
to do no caching or access logging. I'll refer to it as squidnc (nc = no 
cache). It listens on port 8123. It has the same process limits as the 
first squid process.
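
For comparing limit listings like the ones above, a small Python sketch can parse the `proc.<pid>.rlimit.<name>.<soft|hard>` lines and report only the entries that differ. It assumes the text was captured from sysctl in exactly the format shown in this post; the function names are hypothetical.

```python
def parse_rlimits(text):
    """Map 'name.soft' / 'name.hard' to an int, or None for 'unlimited'."""
    limits = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        # Expect proc.<pid>.rlimit.<name>.<soft|hard>; skip anything else.
        if len(parts) != 5 or parts[0] != "proc" or parts[2] != "rlimit":
            continue
        name = parts[3] + "." + parts[4]
        limits[name] = None if value == "unlimited" else int(value)
    return limits


def differing(a, b):
    """Return {name: (a_value, b_value)} for limits that are not equal."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# A few lines copied from the squid (868) and dansguardian (1231) listings.
squid = parse_rlimits("""
proc.868.rlimit.datasize.soft = 8589934592
proc.868.rlimit.descriptors.soft = 24576
proc.868.rlimit.sbsize.soft = unlimited
""")
dansguardian = parse_rlimits("""
proc.1231.rlimit.datasize.soft = 268435456
proc.1231.rlimit.descriptors.soft = 320
proc.1231.rlimit.sbsize.soft = unlimited
""")

for name, (left, right) in differing(squid, dansguardian).items():
    print(name, left, right)
```

Fed the full listings, this should single out datasize.soft, maxproc and descriptors as the places where dansguardian is tighter than squid.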
At busy times, the dansguardian processes stack up and hit the 250 limit. 
Web access then slows to a crawl as requests queue waiting for a 
free child. At that point users start shouting and time to 
investigate is short.
Both squid processes have a similar number of file descriptors open. The 
main squid is around 40 higher, reflecting the fact that it is manipulating 
its cache and there are a few helpers running. CPU usage is low. top shows 
the squids are in kqueue state. Inbound bandwidth is not limiting this.
squidnc does not log anything. dansguardian logs:
dansguardian[11094]: Error 9 (Bad file descriptor) connecting to proxy 
127.0.0.1:8123 by client 10.4.4.2
My theory is that squidnc is the bottleneck (even though it is doing the 
least work), but I do not have any hard evidence of this. I am looking 
for help on finding and fixing any such bottlenecks or, if I'm looking in 
entirely the wrong place, suggestions of better places to look.
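
One way to gather evidence for or against the squidnc theory is to time requests sent straight to it, skipping squid (1) and dansguardian. The Python sketch below records the time to connect and the time to the first response byte. It is only a probe under assumptions: the function name is made up, the request bytes are supplied by the caller, and a fast connect proves little on its own because the kernel completes the TCP handshake before the application accepts the connection.

```python
import socket
import time


def time_request(host, port, request, timeout=10.0):
    """Return (connect_seconds, first_byte_seconds, error).

    On failure both timings are None and error holds the OSError.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            connected = time.monotonic()
            sock.sendall(request)
            sock.recv(1)  # blocks until the first byte (or EOF) arrives
            done = time.monotonic()
            return connected - start, done - connected, None
    except OSError as exc:  # includes timeouts and refused connections
        return None, None, exc
```

Calling `time_request("127.0.0.1", 8123, b"GET http://www.example.com/ HTTP/1.0\r\n\r\n")` in a loop during a busy period, and again during a quiet one, would show whether the first-byte time at squidnc rises when the dansguardian children pile up. The URL is a placeholder, and the first-byte time also includes the upstream fetch.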
All on NetBSD 7.2_STABLE amd64 with adequate RAM.
--
Stephen