Subject: Re: binary/character floating point conversion
To: Christos Zoulas <christos@tac.gw.com>
From: J.T. Conklin <jtc@acorntoolworks.com>
List: tech-userlevel
Date: 03/03/2005 10:55:22
christos@tac.gw.com (Christos Zoulas) writes:
>>As mentioned in the PRs lib/14168 and lib/18803, our *printf does not
>>support long double.  It also appears to have a problem with thread-
>>safety (__dtoa() returns a Bigint / char* that is freed the next time
>>it is called, if one thread is time-sliced out before it is done with
>>the buffer and another calls *printf with a floating point argument,
>>it looks like bad things may happen).  David M. Gay's gdtoa binary/
>>character library could be used to fix both problems.
>
> Sounds good to me. I don't know how hard it would be to add long double
> support in our dtoa() though, and changing our dtoa() api to be thread
> safe does not seem that difficult. Our compiler long double support is
> a bit in flux too, so I am not sure if that is going to be a short term
> win. It looks like this library is supported, and it would be a good
> thing for the long term. Someone with floating point clue should give
> an opinion here.

I'd argue that gdtoa.tgz is effectively the master copy of dtoa.c with
long double support added, so it makes little sense for us to reproduce
that work.  Integrating gdtoa should be no more difficult than modifying
the -current version to avoid the thread-safety issues.  Both dtoa.c and
gdtoa.tgz are maintained: the latest dtoa.c was released on 4/12/2004,
the latest gdtoa.tgz on 1/16/2005.
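
To make the thread-safety concern concrete, here is a minimal sketch of
the hazard.  It is not the libc code: fmt_shared() and fmt_r() are
hypothetical names, and a static buffer stands in for the result that
__dtoa() frees on its next call.  Passing caller-owned storage is only
one possible fix and is not meant to describe how gdtoa handles it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: the result lives in storage that the next call reuses,
 * similar in spirit to the __dtoa() behaviour described above. */
const char *fmt_shared(double value)
{
    static char shared[32];
    snprintf(shared, sizeof shared, "%g", value);
    return shared;
}

/* Hypothetical reentrant variant: the caller owns the storage. */
const char *fmt_r(double value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%g", value);
    return buf;
}

/* Returns 1 if a second call changes what the first result points at. */
int shared_result_is_clobbered(void)
{
    const char *first = fmt_shared(1.5);
    char saved[32];

    strcpy(saved, first);
    (void)fmt_shared(2.25);   /* stands in for a call from another thread */
    return strcmp(saved, first) != 0;
}

/* Returns 1 if both results survive when each call has its own buffer. */
int caller_buffers_survive(void)
{
    char a[32], b[32];

    fmt_r(1.5, a, sizeof a);
    fmt_r(2.25, b, sizeof b);
    return strcmp(a, "1.5") == 0 && strcmp(b, "2.25") == 0;
}
```

The second call here is made from the same thread to keep the sketch
deterministic; with real threads the same overwrite can happen at any
point while the first caller is still reading its result.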

>>* Where the dtoa implementation was combined with strtod.c in a single
>>  file, the full gdtoa library is ~50 files.  It probably makes sense
>>  to import it into dist/gdtoa.  But in the past (long past), it was
>>  a requirement for libc and the kernel subtrees to be self contained
>>  without vpaths out.  What are the current guidelines?
>
> When I replaced the bind code in libc, I copied the files in the
> libc directories, instead of making reachover changes to the Makefiles.
> My incentive was:
> 	- to be able to upgrade bind independently from the libc
> 	  resolver.
> 	- to keep all the libc code in one place
> 	- to clearly separate out parts of the bind code that were
> 	  not used in libc and would only contribute to bloat.
> I don't know what is best in this case, but I think that following
> what FreeBSD did [s/contrib/dist] and providing the whole package
> is reasonable. On the other hand there are only 16 files used from
> the gdtoa package... I am a bit torn on that. If I was really pressed
> to vote, I'd vote for splitting it like FreeBSD did. But don't take
> my word on that.

That was my preference as well, so that's what I'm doing.  It will be
easy enough to change libc/gdtoa/Makefile.inc to remove the reachover
.PATH: if the final decision is to copy the files into libc itself.

>>* dtoa's strtod.c was modified with platform specific #define's that
>>  describe the floating point type, etc.  The gdtoa library includes
>>  the "arithchk" program which figures it out and generates "arith.h".
>>  Similarly, it includes the "qnan" program which figures out the bit
>>  patterns for quiet NANs.  Like gdtoa.h, I think these headers would 
>>  be private to libc.
>>
>>  Should we have a single "arith.h", or use arithchk to generate one
>>  header per architecture and check it in libc/arch/<cpu>/arith.h or 
>>  libc/arch/<cpu>/gdtoa/arith.h?  Likewise for gd_qnan.h?
>
> It depends on how different they are. Is every platform different, or
> do they fall into categories? How ugly would the ifdefs be putting them
> into a single file? I can't answer that without looking into this in
> detail.

The i386 arith.h is:
#define IEEE_8087
#define Arith_Kind_ASL 1

The ppc arith.h is:
#define IEEE_MC68k
#define Arith_Kind_ASL 2
#define Double_Align

The amd64 arith.h is:
#define IEEE_8087
#define Arith_Kind_ASL 1
#define Long int
#define Intcast (int)(long)
#define Double_Align
#define X64_bit_pointers

My m68k, mips, and alpha boxes are in storage.

I imagine that there will be commonality that could be factored out
with preprocessor conditionals (big/little endian, ieee/vax fp, 32/64
bit).  Whether we end up using separate files or not, we will need to
run the arithchk program on all NetBSD architectures to collect the
data.
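
As a rough idea of what a single factored header could look like, here
is a sketch built only from the three outputs quoted above.  Everything
about the selection logic is an assumption: it relies on the
__BYTE_ORDER__, __LP64__ and __SIZEOF_POINTER__ macros that current GCC
and Clang predefine, the Double_Align list covers just the two machines
where arithchk reported it, and arith_kind(), runtime_little_endian(),
has_64_bit_pointers() and check_arith_h() are hypothetical helpers that
exist only to exercise the result.  VAX and other non-IEEE formats are
not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ---- hypothetical combined arith.h ---- */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define IEEE_8087
#define Arith_Kind_ASL 1
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define IEEE_MC68k
#define Arith_Kind_ASL 2
#else
#error "unknown byte order: run arithchk on this platform"
#endif

#if defined(__LP64__)            /* long is 64 bits, int is 32 bits */
#define Long int
#define Intcast (int)(long)
#endif

#if defined(__SIZEOF_POINTER__) && __SIZEOF_POINTER__ == 8
#define X64_bit_pointers
#endif

/* Only the machines where arithchk actually reported Double_Align. */
#if defined(__x86_64__) || defined(__powerpc__)
#define Double_Align
#endif
/* ---- end of hypothetical arith.h ---- */

/* Report which arithmetic kind the preprocessor selected. */
int arith_kind(void)
{
    return Arith_Kind_ASL;
}

/* Probe the byte order of the running machine. */
int runtime_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Report whether the header decided pointers are 64 bits wide. */
int has_64_bit_pointers(void)
{
#ifdef X64_bit_pointers
    return 1;
#else
    return 0;
#endif
}

/* The compile-time choice must agree with the machine we run on. */
void check_arith_h(void)
{
    assert(arith_kind() == (runtime_little_endian() ? 1 : 2));
}
```

A header like this would still have to be checked against real
arithchk output on every port before it could replace per-architecture
files, since nothing here is measured on the target the way arithchk
does it.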

> Thanks for doing all that! 

You're welcome.

I've run into a slight problem with building gdtoa lint libraries that
I'll bring up in a separate message as I have to run off to a meeting.

> PS: Any updates on your string regression tests? I'd really like to add
> them to src/regress when they are ready!

I've finished my part for the next ACE/TAO release, so I'm free to
work on that as soon as I get C++ wide character strings working.
Most likely that will be this weekend.

    --jtc

-- 
J.T. Conklin