Port-amd64 archive


clang, arch/i386/stand and the calling convention

Hi all,
the last unresolved issues with clang integration are the CMSG* hack and
the size of the boot loader. The former is outside the scope of this
mail, leaving the size of the boot loader. I have some size-optimised
versions of the common string routines, and with those all of the normal
bootxx images except msdos and ustar fit. The netboot images don't so

The major reason is that LLVM currently doesn't support the
push[bwl]-based argument passing that GCC uses for size-optimised code.
That means that instead of using e.g.
        pushl $3
        pushl $2
        pushl $1
        call foo
it always uses the larger
        movl $3, 8(%esp)
        movl $2, 4(%esp)
        movl $1, (%esp)
        call foo
sequence for foo(1,2,3). The second form is normally preferable as it
allows better scheduling.
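
To put rough numbers on "larger": on i386, pushl with a small constant
encodes in 2 bytes (opcode 6A plus an 8-bit immediate), while
movl $imm32, disp8(%esp) takes 8 bytes (opcode, ModRM, SIB, 8-bit
displacement, 32-bit immediate), or 7 bytes without the displacement.
The sketch below only tallies those encodings for a call with constant
arguments. The helper names are made up for illustration, and any stack
adjustment before or after the call is ignored.

```c
#include <stddef.h>

/* Encoded sizes in bytes of the i386 instructions shown above. */
enum {
	PUSHL_IMM8      = 2,	/* 6A ib:          pushl $imm8 */
	MOVL_IMM32_ESP  = 7,	/* C7 04 24 id:    movl $imm32, (%esp) */
	MOVL_IMM32_DISP = 8,	/* C7 44 24 d8 id: movl $imm32, d8(%esp) */
	CALL_REL32      = 5	/* E8 cd:          call rel32 */
};

/*
 * Hypothetical helper: bytes needed for a push-based call passing
 * nargs constants that each fit in a signed 8-bit immediate.
 */
static size_t
push_call_size(size_t nargs)
{
	return nargs * PUSHL_IMM8 + CALL_REL32;
}

/*
 * Hypothetical helper: bytes needed for a mov-based call passing
 * nargs constants. Assumes all offsets fit in an 8-bit displacement.
 */
static size_t
mov_call_size(size_t nargs)
{
	if (nargs == 0)
		return CALL_REL32;
	/* The first argument is stored at (%esp), the rest at d8(%esp). */
	return MOVL_IMM32_ESP + (nargs - 1) * MOVL_IMM32_DISP + CALL_REL32;
}
```

For foo(1,2,3) this gives push_call_size(3) == 11 bytes against
mov_call_size(3) == 28 bytes per call site.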

There are two basic approaches for dealing with this:
(1) Implement the push-based argument handling style in LLVM for
size-optimised code. This is a relatively large amount of work for
something that is not necessarily used often.

(2) Use the more efficient register-based fastcall convention for stand.
This requires changes to the assembler code (quite a bit of which is
begging for size optimisation anyway, e.g. libkern), but doesn't involve
changes to the code generator. It is beneficial for GCC as well.
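
For a single function, the same effect as a global register-argument
flag can be had with the regparm attribute, which GCC and clang both
accept on i386. A minimal sketch, assuming one of those compilers;
STAND_CALL, foo and call_foo are illustrative names, not identifiers
from the tree. With regparm(3) the first three integer arguments are
passed in %eax, %edx and %ecx instead of on the stack, which is also
the convention any hand-written assembler caller or callee would then
have to follow.

```c
/*
 * Assumption: GCC or clang. regparm only has an effect when targeting
 * i386, so on other targets the macro keeps just noinline.
 */
#if defined(__i386__)
#define STAND_CALL __attribute__((regparm(3), noinline))
#else
#define STAND_CALL __attribute__((noinline))
#endif

/* Hypothetical callee; noinline keeps the call visible in the output. */
STAND_CALL int foo(int a, int b, int c);

STAND_CALL int
foo(int a, int b, int c)
{
	/* Weight the arguments so a swapped register would change the result. */
	return a + b * 2 + c * 3;
}

/* On i386 the three constants are loaded into registers before the call. */
int
call_foo(void)
{
	return foo(1, 2, 3);
}
```

Building this with -m32 -Os -S and reading the generated assembler is
a quick way to compare the call site with and without the attribute.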

Just compiling with -mregparm -mrtd for bootxx gives:
image     before  after
ffsv1     6524    6404
ffsv2     6876    6780
lfsv1     6556    6332
lfsv2     6476    7384
msdos     7512    7372
ustarfs   7612    7372
ext2fs    6916    6856

and allows fitting everything with clang as well. I believe (2) is less
work and helps free some space in general.

