tech-kern archive


timeouts connecting to pgsql database



I'm not sure whether anyone else has experienced the problem described below. We have a few database servers running NetBSD 7.12 and 9; I can share the postgres configs if needed.

We recently noticed timeouts on some of our postgres database servers. The machines don't appear to be heavily loaded, although they are in steady use. A machine will be working fine, with no swapping and a load average below 1, and then it stops accepting database connections for about 10-20 seconds; queries on already-active connections also fail to return during that same period. I ran these commands in a screen session to get a better sense of the system state when the problem occurs:

while true; do { uptime; iostat -d -w 5 -c 20; } >> iostatlog; done
while true; do { uptime; vmstat 5 20; } >> vmstatlog; done

What I saw on this machine was that the CPU spiked to 100% in the system time column, the number of runnable processes went way up, and the page faults went way down. What I can say about our workload is that it consists of database queries and updates, and I don't believe there's an appreciable difference in load between the problem times and the rest of the time. There are only around 40 sessions active on the system most of the time.

We've had one dmesg entry this week:

Feb 11 15:34:07 adb3 /netbsd: coretemp1: workqueue busy: updates stopped


Here's the output from the vmstat command for one problem period (the problem is where the runnable column increases; the rest is for context):



10:59AM up 903 days, 21:46, 3 users, load averages: 0.97, 0.97, 1.49
procs memory page disks faults cpu
r b avm fre flt re pi po fr sr f0 c0 in sy cs us sy id
0 0 9420036 68852 4933 24 0 0 21 20 0 0 335 9345 2138 2 2 96
0 0 9434852 54592 7729 0 0 0 0 0 0 0 556 11519 2323 3 2 95
0 0 9447464 41252 6714 0 0 0 0 0 0 0 320 14205 3321 3 2 96
0 0 9461720 25864 8054 0 0 0 0 0 0 0 645 29889 7565 6 4 90
0 0 9441260 47096 5061 6 0 0 1641 1648 0 0 563 28897 7485 6 2 92
2 15 9451456 38140 3014 0 0 0 0 0 0 0 418 17954 4874 3 14 82
31 12 9420264 69328 44 0 0 0 0 0 0 0 81 682 106 0 74 26
49 3 9398768 102852 5 0 0 0 0 0 0 0 68 662 90 0 100 0
48 4 9370392 149884 20 0 0 0 0 0 0 0 104 821 151 0 100 0
2 0 9468232 28684 41407 1 0 0 1679 1680 0 0 876 35761 7269 20 28 52
1 0 9448488 49812 8582 0 0 0 1656 1657 0 0 1162 34065 6814 11 5 84
0 0 9469184 31380 8118 0 0 0 0 0 0 0 702 13079 2294 4 3 93
0 0 9460184 38040 7910 0 0 0 1443 1443 0 0 556 7832 1206 2 3 95
1 0 9468816 29488 5877 0 0 0 0 0 0 0 386 6910 1000 2 1 97
0 0 9449376 48788 8022 0 0 0 1647 1647 0 0 642 8821 1286 3 3 94
0 0 9460292 38116 7449 0 0 0 0 0 0 0 693 8402 1407 2 2 96
0 0 9475872 21932 7863 0 0 0 0 0 0 0 603 9650 1585 3 3 94
0 0 9446044 51012 6970 1 0 0 1731 1732 0 0 579 8643 1177 2 2 96
procs memory page disks faults cpu
r b avm fre flt re pi po fr sr f0 c0 in sy cs us sy id
1 0 9457652 38916 6176 0 0 0 0 0 0 0 492 7198 1032 2 3 95
1 0 9472320 24152 6621 0 0 0 0 0 0 0 538 10150 1921 3 2 96
11:01AM up 903 days, 21:47, 3 users, load averages: 4.30, 2.53, 2.06
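To pick the stall samples out of a long vmstatlog without reading it by eye, something like the following could be used. It is only a sketch: the function name flag_stalls and both thresholds (10 runnable processes, 50% system time) are assumptions chosen for illustration, and it assumes the 18-column layout shown above, where the first column is r and the second-to-last column is sy.

```shell
# flag_stalls [RMIN] [SYMIN]
# Reads vmstat output on stdin and prints only the sample rows whose
# runnable count (first column) is at least RMIN or whose system CPU
# percentage (second-to-last column) is at least SYMIN.
# The name and the default thresholds are illustrative assumptions.
flag_stalls() {
    awk -v rmin="${1:-10}" -v symin="${2:-50}" '
        # Skip uptime lines and vmstat headers: a data row starts with a
        # number and has at least 18 fields in the layout shown above.
        NF >= 18 && $1 ~ /^[0-9]+$/ {
            r  = $1 + 0          # runnable processes
            sy = $(NF - 1) + 0   # system CPU percentage
            if (r >= rmin || sy >= symin)
                print
        }'
}
```

Run as `flag_stalls 10 50 < vmstatlog`. On the excerpt above, these thresholds select exactly the three rows that begin with 31, 49 and 48.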



Here's the corresponding iostat output (both commands are just running in loops exactly as shown above, so they're not quite in sync):



10:59AM up 903 days, 21:45, 3 users, load averages: 0.99, 0.95, 1.52
fd0 cd0 sd0 sd1
KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s
0.000 0 0.000 0.000 0 0.000 31.72 6 0.190 26.87 239 6.262
0.000 0 0.000 0.000 0 0.000 39.27 10 0.399 15.36 65 0.981
0.000 0 0.000 0.000 0 0.000 19.59 3 0.061 17.50 117 1.996
0.000 0 0.000 0.000 0 0.000 39.85 9 0.342 16.40 105 1.675
0.000 0 0.000 0.000 0 0.000 33.59 3 0.105 16.13 258 4.062
0.000 0 0.000 0.000 0 0.000 41.11 10 0.417 15.06 185 2.718
0.000 0 0.000 0.000 0 0.000 28.79 3 0.095 16.04 212 3.323
0.000 0 0.000 0.000 0 0.000 41.05 10 0.392 15.35 165 2.481
0.000 0 0.000 0.000 0 0.000 22.39 3 0.061 17.03 199 3.309
0.000 0 0.000 0.000 0 0.000 39.68 9 0.348 16.87 99 1.637
0.000 0 0.000 0.000 0 0.000 23.28 5 0.104 16.72 187 3.056
0.000 0 0.000 0.000 0 0.000 64.00 0 0.012 15.07 130 1.912
0.000 0 0.000 0.000 0 0.000 35.06 15 0.513 15.92 215 3.343
0.000 0 0.000 0.000 0 0.000 48.50 1 0.038 14.95 181 2.637
0.000 0 0.000 0.000 0 0.000 40.21 9 0.361 15.64 164 2.512
0.000 0 0.000 0.000 0 0.000 18.12 2 0.041 17.78 3 0.060
fd0 cd0 sd0 sd1
KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s
0.000 0 0.000 0.000 0 0.000 0.000 0 0.000 34.79 1 0.046
0.000 0 0.000 0.000 0 0.000 32.00 0 0.000 21.44 44 0.924
0.000 0 0.000 0.000 0 0.000 40.78 11 0.429 15.24 372 5.537
0.000 0 0.000 0.000 0 0.000 28.23 3 0.083 14.52 395 5.597
11:00AM up 903 days, 21:46, 3 users, load averages: 8.74, 2.89, 2.16
fd0 cd0 sd0 sd1
KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s
0.000 0 0.000 0.000 0 0.000 31.72 6 0.190 26.87 239 6.262
0.000 0 0.000 0.000 0 0.000 64.00 0 0.000 15.37 257 3.858
0.000 0 0.000 0.000 0 0.000 33.69 14 0.473 17.01 209 3.477
0.000 0 0.000 0.000 0 0.000 32.00 0 0.000 15.10 98 1.446
0.000 0 0.000 0.000 0 0.000 38.87 8 0.311 16.25 108 1.717
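Since the two loops run independently, their samples only line up to within a few seconds. One possible way to make the logs easier to correlate is to prefix every line with the wall-clock time at which it was read. The helper below is a sketch; the name stamp is hypothetical, and it only records when each line arrives on the pipe, so if vmstat or iostat buffers its output when not writing to a terminal, the stamps will lag the actual sample times.

```shell
# stamp: copy stdin to stdout, prefixing each line with the current
# time as HH:MM:SS. The name is a hypothetical helper, not part of
# vmstat or iostat.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}
```

It would slot into the existing loops as, for example, `vmstat 5 20 | stamp >> vmstatlog` and `iostat -d -w 5 -c 20 | stamp >> iostatlog`.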




We had a similar problem on miadb7, which has a lot more memory and cores and runs NetBSD 9:



11:52AM up 85 days, 1:57, 3 users, load averages: 0.21, 0.39, 0.49
procs memory page disks faults cpu
r b avm fre flt re pi po fr sr w0 d0 in sy cs us sy id
0 0 236841096 391840 6533 5 4 4 118 124 6 2 313 8794 1762 0 1 99
0 0 236844824 383760 5372 0 0 0 0 0 1 1 597 8418 1219 0 0 99
0 0 236842660 384344 5482 0 0 0 0 0 0 0 468 6478 889 0 0 100
6 11 236828996 402360 3262 0 0 0 0 0 0 0 636 6890 894 0 5 94
17 0 236804600 453488 3898 0 0 0 0 0 0 0 426 3536 711 0 12 88
25 0 236703876 558724 51 0 0 0 0 0 0 0 270 1487 314 0 30 69
0 0 236705268 567480 10763 0 0 0 0 0 1 1 802 15384 2432 0 1 99
1 0 236720432 554272 8114 0 0 0 0 0 0 0 585 9921 1283 0 0 99
0 0 236702444 579780 4842 0 0 0 0 0 3 3 636 8262 1201 0 1 99
2 0 236715144 572140 6528 0 0 0 0 0 0 0 698 10282 1281 0 0 99
0 0 236715140 568568 5584 0 0 0 0 0 0 0 533 7955 1187 0 0 100
0 0 236725208 557356 3764 0 0 0 0 0 1 1 580 42026 8687 1 0 99
0 1 236727048 550256 5089 0 0 0 0 0 1 1 713 41936 8444 0 1 99
0 0 236742324 529312 4714 0 0 0 0 0 7 7 632 33430 6896 1 0 99
0 0 236738568 530152 4883 0 0 0 0 0 1 1 474 18325 3524 0 0 99
1 0 236797320 486532 16181 0 0 0 0 0 1 1 530 18286 1927 1 1 99
0 0 236793224 495888 20480 0 0 0 0 0 1 1 575 25225 1701 0 1 99
0 0 236745420 536612 9697 0 0 0 0 0 0 0 780 14561 2160 1 1 98
procs memory page disks faults cpu
r b avm fre flt re pi po fr sr w0 d0 in sy cs us sy id
2 0 236732116 548652 5674 0 0 0 0 0 1 1 534 10997 1789 0 0 100
0 0 236760860 521512 6577 0 0 0 0 0 1 1 804 55986 10949 1 1 99
11:54AM up 85 days, 1:58, 3 users, load averages: 1.46, 0.99, 0.72

 



11:51AM up 85 days, 1:56, 3 users, load averages: 0.34, 0.45, 0.51
wd0 dk0 dk1 wd1
KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s
18.53 6 0.100 41.63 2 0.069 8.241 4 0.031 31.40 129 3.943
24.80 1 0.018 24.80 1 0.018 0.000 0 0.000 27.10 82 2.163
24.00 0 0.000 24.00 0 0.000 0.000 0 0.000 29.21 107 3.066
0.000 0 0.000 0.000 0 0.000 0.000 0 0.000 23.09 50 1.133
27.20 0 0.000 27.20 0 0.011 0.000 0 0.000 21.95 88 1.892
32.00 0 0.000 32.00 0 0.000 0.000 0 0.000 25.21 44 1.073
32.00 8 0.000 32.00 8 0.000 0.000 0 0.000 25.65 87 2.176
12.00 0 0.002 12.00 0 0.002 0.000 0 0.000 25.04 54 1.325
29.33 1 0.000 29.33 1 0.000 0.000 0 0.000 25.15 71 1.746
64.00 0 0.020 64.00 0 0.020 0.000 0 0.000 23.01 52 1.168
4.000 0 0.000 4.000 0 0.000 0.000 0 0.000 24.03 62 1.464
33.19 4 0.000 33.19 4 0.000 0.000 0 0.000 25.37 59 1.472
0.000 0 0.000 0.000 0 0.000 0.000 0 0.000 24.92 55 1.334
27.00 0 0.000 27.00 0 0.000 0.000 0 0.000 25.98 57 1.441
0.000 0 0.000 0.000 0 0.000 0.000 0 0.000 26.00 1 0.000
24.00 0 0.003 24.00 0 0.003 0.000 0 0.000 26.55 76 1.973
wd0 dk0 dk1 wd1
KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s
32.00 1 0.000 32.00 1 0.000 0.000 0 0.000 20.97 77 1.581
4.000 1 0.000 4.000 1 0.000 0.000 0 0.000 23.32 81 1.848
0.000 0 0.000 0.000 0 0.000 0.000 0 0.000 26.79 53 1.393
32.59 3 0.000 32.59 3 0.000 0.000 0 0.000 23.88 70 1.633
11:53AM up 85 days, 1:58, 3 users, load averages: 3.10, 1.13, 0.75
wd0 dk0 dk1 wd1
KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s KB/t t/s MB/s
18.53 6 0.100 41.63 2 0.069 8.241 4 0.031 31.40 129 3.943
16.00 0 0.000 16.00 0 0.000 0.000 0 0.000 26.47 84 2.161
38.00 1 0.027 38.00 1 0.027 0.000 0 0.000 21.34 61 0.000
64.00 1 0.066 64.00 1 0.066 0.000 0 0.000 17.73 113 1.950
49.00 2 0.087 49.00 2 0.087 0.000 0 0.000 14.34 80 1.114
35.00 8 0.000 35.00 8 0.266 0.000 0 0.000 25.53 166 4.150
37.00 1 0.031 37.00 1 0.031 0.000 0 0.000 21.97 50 1.072
26.67 0 0.000 26.67 0 0.000 0.000 0 0.000 24.96 77 1.865
44.00 1 0.026 44.00 1 0.026 0.000 0 0.000 24.98 62 1.520
64.00 0 0.000 64.00 0 0.000 0.000 0 0.000 25.03 68 1.663
41.00 1 0.049 41.00 1 0.049 0.000 0 0.000 16.50 61 0.987
64.00 1 0.000 64.00 1 0.000 0.000 0 0.000 18.87 135 2.479
36.48 6 0.219 36.48 6 0.219 0.000 0 0.000 18.88 94 1.738

 


