Port-sparc archive


Re: [ntp:questions] Ntpd in uninterruptible sleep?



From the archives, it appears top-posting is common on this list, so
I'll follow suit.  I'm happy with either top or inline/bottom posting.
As I said in a follow-up a few minutes ago on questions%lists.ntp.org@localhost,
I was wrong on two points.

First, NetBSD 5.x snprintf() likely is C99-compliant.  Still, with
ntpd 4.2.7, you can force the use of C99-snprintf's rpl_vsnprintf() by
configuring the NTP package with --enable-c99-snprintf, which will
avoid the code that is implicated in the stack trace.  However...

Second, I'm having a hard time believing this could be a bug in the
NetBSD 5.x SPARC dtoa() implementation or the routines on which it is
built.  If that code were prone to infinite looping, surely that bug
would have been identified long ago.  My best guess is a hardware bug or
failure, but that's the issue I hope we can get some help nailing
down.
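
One way to narrow this down outside ntpd is to exercise the same
formatting path in isolation.  The following is a minimal sketch, not
code from ntpd: format_array() is a hypothetical stand-in for
ctl_putarray() that keeps the " %.2f" format, a 200-byte buffer, and the
NTP_SHIFT of 8 given in the quoted message, but returns the byte count
instead of calling ctl_putdata().  The REPRO_MAIN driver and its sweep of
input values are assumptions; nothing in this thread identifies which
values, if any, trigger the hang.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NTP_SHIFT 8	/* value stated in the quoted message */

/*
 * Hypothetical stand-in for ctl_putarray(): writes the tag followed by
 * the eight array elements (scaled by 1e3) in the same circular order.
 * Returns the number of bytes written, or 0 on error or truncation.
 */
static size_t
format_array(char *buffer, size_t buflen, const char *tag,
	     const double *arr, int start)
{
	char *cp = buffer;
	size_t taglen = strlen(tag);
	size_t left;
	int i = start;
	int n;

	if (taglen >= buflen)
		return 0;
	memcpy(cp, tag, taglen);
	cp += taglen;
	do {
		if (i == 0)
			i = NTP_SHIFT;
		i--;
		assert((size_t)(cp - buffer) < buflen);
		left = buflen - (size_t)(cp - buffer);
		/* Same conversion that appears in the stack trace. */
		n = snprintf(cp, left, " %.2f", arr[i] * 1e3);
		if (n < 0 || (size_t)n >= left)
			return 0;
		cp += n;
	} while (i != start);
	return (size_t)(cp - buffer);
}

#ifdef REPRO_MAIN
/* Assumed driver: sweeps arbitrary small values, not known triggers. */
int
main(void)
{
	char buffer[200];
	double arr[NTP_SHIFT];
	long k;
	int j;

	for (k = 0; k < 1000000L; k++) {
		for (j = 0; j < NTP_SHIFT; j++)
			arr[j] = (double)(k + j) * 1e-6;
		if (format_array(buffer, sizeof(buffer), "filtdelay=",
				 arr, (int)(k % NTP_SHIFT)) == 0) {
			fprintf(stderr, "formatting failed at k=%ld\n", k);
			return 1;
		}
	}
	puts("sweep completed without hanging");
	return 0;
}
#endif
```

Built with -DREPRO_MAIN, a run that spins at near 100% CPU on the
affected machine but completes on another SPARC system with the same
libc would be consistent with a hardware problem rather than a dtoa()
bug.  A run that completes everywhere only shows that these particular
values are not triggers.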

Thanks,
Dave Hart

On Sat, Nov 12, 2011 at 05:46, AGC <agcarver+netbsd%acarver.net@localhost> wrote:
> Hi everyone,
>
> Below is the most recent conversation I've been having over on the ntpd list
> trying to resolve an issue with ntpd locking up on my IPX.  After much back
> and forth with various debugging data, Dave Hart appears to have identified
> the bug as being inside NetBSD's copy of libc and its dtoa().  So the
> thread has been brought over here to continue the discussion and try to
> resolve the issue.
>
> In the text below is a gdb stack trace of ntpd at the point when it is stuck
> and running at near 100% CPU.  This event occurs after ntpd has been running
> for some period of time but two individual stack traces (another one created
> after the quoted one below) show exactly the same stack, ultimately dying in
> libc.
>
>
>
> On 11/11/2011 19:19, Dave Hart wrote:
>>
>> On Fri, Nov 11, 2011 at 20:23, A C <agcarver+ntp%acarver.net@localhost> wrote:
>>>
>>> First attempt with gdb and a back trace after attaching gdb to the hung
>>> process (note this particular running of ntpd was not using the debug
>>> command line options):
>>>
>>>> #0  0x103d1458 in .umul () from /usr/lib/libc.so.12
>>>> #1  0x103c38d4 in __pow5mult_D2A () from /usr/lib/libc.so.12
>>>> #2  0x103c3ac4 in __muldi3 () from /usr/lib/libc.so.12
>>>> #3  0x103c34dc in __mult_D2A () from /usr/lib/libc.so.12
>>>> #4  0x103c3728 in __pow5mult_D2A () from /usr/lib/libc.so.12
>>>> #5  0x103b61d4 in __dtoa () from /usr/lib/libc.so.12
>>>> #6  0x103b315c in __vfprintf_unlocked () from /usr/lib/libc.so.12
>>>> #7  0x103230c4 in snprintf () from /usr/lib/libc.so.12
>>>> #8  0x00023afc in ctl_putarray (tag=<value optimized out>, arr=0xa8fe0, start=1) at ntp_control.c:1307
>>>> #9  0x00024a7c in ctl_putpeer (varid=30, peer=0xa8e70) at ntp_control.c:1777
>>>> #10 0x0002744c in read_variables (rbufp=0x1050d000, restrict_mask=0) at ntp_control.c:2334
>>>> #11 0x0002664c in process_control (rbufp=0x1050d000, restrict_mask=0) at ntp_control.c:809
>>>> #12 0x00035594 in receive (rbufp=0x1050d000) at ntp_proto.c:370
>>>> #13 0x00022c00 in ntpdmain (argc=<value optimized out>, argv=<value optimized out>) at ntpd.c:1150
>>>> #14 0x0001381c in ___start ()
>>>> #15 0x00013754 in _start ()
>>
>> Excellent.  I assume the stack trace is from ntpd 4.2.6p3.  I think
>> you've found a bug in your system's libc dtoa() exposed by its
>> snprintf(s, " %.2f", ...).  I believe you will not be able to
>> reproduce the bug using 4.2.7, as that version of ntpd uses
>> C99-snprintf [1] if the system snprintf() is not C99-compliant.
>> C99-snprintf's rpl_vsnprintf() does not use dtoa(), it hand-rolls the
>> double-to-ascii conversion.  Below is the code in ntpd.  NTP_SHIFT is
>> 8.  I claim the ntpd code is correct and your system's dtoa() and
>> thereby snprintf() of double (floating point) is subject to infinite
>> looping for some values.
>>
>> I suggest we move this discussion to the appropriate NetBSD mailing
>> list.  Please cc me, and I'll subscribe.
>>
>> /*
>>  * ctl_putarray - write a tagged eight element double array into the
>>  * response
>>  */
>> static void
>> ctl_putarray(
>>        const char *tag,
>>        double *arr,
>>        int start
>>        )
>> {
>>        register char *cp;
>>        register const char *cq;
>>        char buffer[200];
>>        int i;
>>        cp = buffer;
>>        cq = tag;
>>        while (*cq != '\0')
>>                *cp++ = *cq++;
>>        i = start;
>>        do {
>>                if (i == 0)
>>                        i = NTP_SHIFT;
>>                i--;
>>                NTP_INSIST((cp - buffer) < sizeof(buffer));
>>                snprintf(cp, sizeof(buffer) - (cp - buffer),
>>                         " %.2f", arr[i] * 1e3);
>>                cp += strlen(cp);
>>        } while(i != start);
>>        ctl_putdata(buffer, (unsigned)(cp - buffer), 0);
>> }
>>
>> [1] http://www.jhweiss.de/software/snprintf.html
>>
>> Cheers,
>> Dave Hart
>>
>
>

