Current-Users archive

SOLVED (mostly) Re: odd behaviour of some programs on i386 cross-built from amd64

So, I can now report I've been a victim of my own aging eyes and
clumsiness.  :-)

In summary, the problem was caused by my accidentally typing an errant
character into a source file while browsing it (sometime back in
February).  Worse yet, I saved the file without knowing I had done so,
and then had the bad luck that the character did not trigger any errors
during compilation.

The errant character was a tilde ('~'), and it landed at the beginning
of line #122 in src/lib/libc/gen/ctype_.c.  Obviously, but unfortunately
for me, this did not generate a syntax error, but instead just changed
the value of one entry in the _ctype_tab_ table (the one for the space
character).
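
As a rough illustration of why a stray tilde in a table initializer can
compile cleanly, here is a minimal sketch.  It is not the NetBSD source:
the DEMO_* bit names and values, and the contents of the entry, are made
up for the example.

```c
/* Hypothetical class bits; the real <ctype.h> internals differ. */
enum { DEMO_DIGIT = 0x04, DEMO_SPACE = 0x08, DEMO_BLANK = 0x80 };

/* The table entry for ' ' as intended ... */
static const short demo_space_ok = DEMO_SPACE | DEMO_BLANK;

/*
 * ... and the same entry with a stray leading '~'.  Unary '~' binds
 * tighter than '|', so this is (~DEMO_SPACE) | DEMO_BLANK.  That is
 * still a valid constant expression, so there is no syntax error; it
 * just sets nearly every bit, including the digit bit.
 */
static const short demo_space_bad = ~DEMO_SPACE | DEMO_BLANK;

/* Table-driven digit test, in the style of a <ctype.h> macro. */
static int
demo_isdigit_entry(short entry)
{
	return (entry & DEMO_DIGIT) != 0;
}
```

With these made-up values, demo_isdigit_entry() is false for the
intended entry and true for the one with the tilde.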

The long version of the story is that after all the previously mentioned
problems with debuggers, etc., I started debugging by inserting some
better error messages in usr.bin/hexdump/parse.c to see if I could
discover exactly what line the problem was occurring on, and sure enough
it seemed to be with the <ctype.h> macros.  Then I found I was able to
work around the problem by locally defining a naive version of
isdigit().  I have not verified this, but it probably worked because the
errant value in _ctype_tab_ caused the space character to be classified
as a digit, and my naive replacement avoided that.
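
A naive replacement of the kind described might look like the sketch
below.  The name naive_isdigit is hypothetical, and this is not
necessarily the exact code used in the local hexdump build; the point is
only that a range comparison never consults the classification table.

```c
/*
 * Range-based digit test.  C guarantees that '0' through '9' are
 * contiguous and increasing, so no table lookup is needed, and a
 * corrupted table entry cannot affect the result.
 */
static int
naive_isdigit(int c)
{
	return c >= '0' && c <= '9';
}
```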

The final mystery is why the affected programs work when run with either
a newer kernel, or on amd64.  Although I can reproduce the bug in
hexdump, I cannot seem to reproduce it exactly.  I.e. if I reproduce the
bug by locally defining _ctype_tab_ et al with the errant value then
hexdump, when compiled for i386, exhibits the same problem on both i386
and amd64 with matching and newer kernels.  That is, the reproduced bug
does not disappear in the scenarios where it disappeared before.  The
old buggy binary still exhibits the bug only on a real i386 with a
matching kernel, and of course it still works OK both on amd64 with a
matching kernel and on a real i386 with a newer kernel.  Keep in mind
this is a static-linked binary.

Here's the buggy version working fine on a real i386 with a newer kernel:

$ uname -a
NetBSD once.local 9.0 NetBSD 9.0 (GENERIC) #0: Fri Feb 14 00:06:28 UTC 2020 i386
$ /more/home/more/woods/tmp/hexdump-
0000000 7361 6664 000a
$ file /more/home/more/woods/tmp/hexdump-
/more/home/more/woods/tmp/hexdump-: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for NetBSD 8.99.32, stripped

Here's the buggy version working fine on amd64 (with a matching kernel):

$ uname -a
NetBSD future 8.99.32 NetBSD 8.99.32 (XEN3_DOMU) #1: Thu Nov 28 18:31:36 PST 2019  woods@future:/build/woods/future/current-amd64-amd64-obj/more/work/woods/m-NetBSD-current/sys/arch/amd64/compile/XEN3_DOMU amd64
$ ~/tmp/hexdump-
0000000 7361 6664 000a

Here's the buggy version failing on a real i386 with a matching kernel:

$ uname -a
NetBSD lilbit 8.99.32 NetBSD 8.99.32 (NET5501) #3: Fri May  1 16:55:04 PDT 2020  woods@once.local:/build/woods/once.local/current-i386-i386-ppro-obj/more/work/woods/m-NetBSD-current/sys/arch/i386/compile/NET5501 i386
$ /more/home/more/woods/tmp/hexdump-
hexdump-: ""%07.7_ax " 8/2 "%04x " "\n"": bad format

I guess the most interesting test would be to step instruction by
instruction through the execution on the real i386 with a newer kernel
and see if I can understand how it manages to work.  I don't think I
kept a copy of hexdump.debug though -- I may have to rebuild the whole
tree with the original error to make that less arduous to do.  Oh well,
I guess it only takes about 4 hours on my speediest build machine.

					Greg A. Woods

Kelowna, BC     +1 250 762-7675           RoboHack
Planix, Inc.     Avoncote Farms
