Port-sparc archive

Re: ntpd wedged by libc?



On Mar 6,  3:01pm, hart%ntp.org@localhost (Dave Hart) wrote:
-- Subject: Re: ntpd wedged by libc?

| On Tue, Mar 6, 2012 at 14:22, Christos Zoulas <christos%zoulas.com@localhost> wrote:
| > On Mar 5,  8:03pm, agcarver+netbsd%acarver.net@localhost (AGC) wrote:
| > | I had to bump it up to 30,000 but it finally did finish:
| >
| > Then I don't think it is leaking... You can try with the old libc,
| > and you'll see it will run out of memory.
| 
| The problem has evolved.  At first, ntpd stopped responding due to out
| of memory due to a leak triggered by lots of snprintf with floating
| point.  With the leak so identified now fixed, ntpd is now
| reported to be "wedging" (I assume meaning spinning using lots of CPU
| and not responding to network traffic) and it's still apparently
| related to snprintf of floating points.  The opening message of this
| thread has a stack trace which I assume came from attaching a debugger
| to the spinning ntpd:
| 
| ======
| Seems I'm still having issues with libc on 5.1/sparc specifically with
| ntpd wedging when doing math:
| 
| #0  0x103d38c8 in __pow5mult_D2A () from /usr/lib/libc.so.12
| #1  0x103d3ac4 in __muldi3 () from /usr/lib/libc.so.12
| #2  0x103d34dc in __mult_D2A () from /usr/lib/libc.so.12
| #3  0x103d3728 in __pow5mult_D2A () from /usr/lib/libc.so.12
| #4  0x103c61d4 in __dtoa () from /usr/lib/libc.so.12
| #5  0x103c315c in __vfprintf_unlocked () from /usr/lib/libc.so.12
| #6  0x103330c4 in snprintf () from /usr/lib/libc.so.12
| #7  0x000256f4 in ctl_putdblf (tag=0x87d79 "", fmt=0x88458 "%.3f",
| d=4.5623779296875)
|    at ntp_control.c:1431
| ======
| 
| There have been over 50 messages in the thread, so I think we can all
| be forgiven for forgetting a detail or two along the way, but I don't
| think anyone has suggested the original leak bug hasn't been fixed.
| Rather, it seems there is still some sort of problem on "5.1" (not
| -current, clearly) on sparc with ntpd being polled every few seconds
| by ntpq triggering a hang snprintf'ing with floating point.
| 
| The stack trace looks very similar to the first go-around.  If
| accurate, it suggests the same code still has issues that ntpd's abuse
| tickles but t_printf.c doesn't.

Sure, let's change the test to be closer to the ntp one; let's make the
format %.3f, for example. The way I tracked it down initially was by
instrumenting all malloc/free's in the dtoa code...

christos
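
A minimal sketch of what such a test could look like, assuming it only
needs to repeat the call seen in frame #7 of the trace (format "%.3f",
d=4.5623779296875). The function name snprintf_stress and its return
convention are hypothetical; this is not the actual t_printf.c test, and
it does not reproduce ntpd's real call pattern beyond the format and value.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the same double with "%.3f" over and over,
 * the way repeated ntpq polls make ntpd's ctl_putdblf() do.
 * Returns 0 if every call succeeds and yields the same string as the
 * first call, -1 on an snprintf error, truncation, or mismatch.
 * If out is non-NULL, the formatted string is copied there.
 */
int
snprintf_stress(double d, unsigned long iterations, char *out, size_t outlen)
{
	char first[64], buf[64];
	unsigned long i;
	int n;

	/* Reference result from the first call. */
	n = snprintf(first, sizeof(first), "%.3f", d);
	if (n < 0 || (size_t)n >= sizeof(first))
		return -1;

	for (i = 0; i < iterations; i++) {
		n = snprintf(buf, sizeof(buf), "%.3f", d);
		if (n < 0 || (size_t)n >= sizeof(buf))
			return -1;
		/* A differing result would point at corrupted dtoa state. */
		if (strcmp(buf, first) != 0)
			return -1;
	}

	if (out != NULL && outlen > 0)
		snprintf(out, outlen, "%s", first);
	return 0;
}
```

Calling snprintf_stress(4.5623779296875, 30000, NULL, 0) from a test
program would mirror the 30,000 iterations mentioned earlier in the
thread. A hang would show up as the call never returning, and a leak as
growing memory use while it runs; the helper itself detects neither
directly.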

