Subject: Re: Adventures in Assembly (Is it even possible? Yes! Here's the
To: Nathan J. Williams <nathanw@MIT.EDU>
From: Marc Tooley <sudog@sudog.com>
List: port-i386
Date: 08/21/2001 12:41:41
Hello Nathan,

Thanks for your note. There are a couple of things I want to address
and thank you for:

> Why in the name of anything holy do you want to write in assembly?
> It's painful, it's tedious, there's no error checking, and you have to
> rewrite it for each of NetBSD's 14-odd CPU types. If you really enjoy
> assembly this much, divert your energies somewhere more useful, like
> writing compilers.
>
> Maybe you shouldn't answer that. I'm going to try to help you anyway.

That's okay. I'll answer anyway--the power, man! The power! Assembly
lets me see precisely what's going on on the CPU, lets me control what
operations are going where, and by jove, it's just plain cool to be
able to fiddle with this sort of thing. "I code Assembly on NetBSD.
Let's see you try." What a t-shirt that would make! :)

Seriously, though, the ability to understand, at the hardware level, a
piece of software and what it's doing seems to me to be a pretty
important skill if you want to do anything interesting. When I first
began to learn assembly on x86 back in the stone ages (university) it
opened a whole new set of doors beyond which there was nothing outside
our grasp. Seems like a silly philosophy now that I put it down. :)

It's also important to know this sort of thing when I have to deal
with exploit code and security problems. I used to anyway. =]

> If this bothers you, link static binaries.

Right--hadn't thought of that. =] Thanks.

> > int main(){write(0,"Hey there\n",10);}
> >
> > ..and gdb stepi through it some more. God hates a coward, after all.
>
> Unnecessarily painful. Use gdb, or better yet objdump --source
> --disassemble on a binary compiled with -g, to show you where the call
> you want is and set a breakpoint there. Using your code as an example:

A simple disassembly didn't show me the calling convention, though. For
some reason the syscall needed a return address on the stack. It wasn't
like this before, which is why I was in such a Eureka sort of mood. :)

> 38 crash-test-dummy:nathanw>gdb foo
> ...
> (gdb) disassemble main
> Dump of assembler code for function main:
> 0x804878c <main>:       pushl  %ebp
> 0x804878d <main+1>:     movl   %esp,%ebp
> 0x804878f <main+3>:     pushl  $0xa
> 0x8048791 <main+5>:     pushl  $0x80488e7
> 0x8048796 <main+10>:    pushl  $0x0
> 0x8048798 <main+12>:    call   0x80484dc <write>

I forgot to take into account call and its effects on the stack. :) I
suppose I was a little confused, since the assembler HOWTOs and the
like don't push a return address on the stack for each syscall. :)

> > int 0x80 is a generic "let's visit the kernel" interrupt, a 0x4 is the
> > write() kernel syscall according to /usr/src/sys/kern/syscalls.master
>
> You should look at src/lib/libc/arch/<whatever>/SYS.h to see how
> system calls are defined on a given platform.

I'll do that if I ever get my hands on a non-x86 machine aside from my
trusty Amiga. :)

> Yes. main() doesn't invoke a syscall directly; it calls the function
> write() in libc, which invokes a syscall. The call instruction puts
> the return address on the stack.
>
> You have missed the forest for the trees.

But my particular "tree" worked before--that's what was confusing me.
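
The two paths being contrasted here (calling libc's write(), versus
handing the kernel a syscall number yourself) can be sketched in C
without touching assembly. This is only a rough illustration, assuming
a Unix-like libc that provides syscall(2) and SYS_write in
<sys/syscall.h>; the names write_via_libc, write_via_syscall, and
check_both_paths are made up for the example.

```c
/* Sketch only: the same write request issued two ways.
 * Assumption: syscall(2) and SYS_write are available (NetBSD, Linux). */
#define _DEFAULT_SOURCE        /* exposes syscall() on glibc; ignored elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Path 1: the ordinary libc wrapper. The caller reaches it with a call
 * instruction, which pushes a return address before libc enters the kernel. */
long write_via_libc(int fd, const char *buf, size_t len)
{
    return (long)write(fd, buf, len);
}

/* Path 2: pass the syscall number ourselves. SYS_write is whatever the
 * platform defines; syscalls.master lists write as number 4 on NetBSD. */
long write_via_syscall(int fd, const char *buf, size_t len)
{
    return (long)syscall(SYS_write, fd, buf, len);
}

/* Both paths should report the same byte count for the same request. */
void check_both_paths(int fd)
{
    const char msg[] = "Hey there\n";
    assert(write_via_libc(fd, msg, 10) == write_via_syscall(fd, msg, 10));
}
```

Calling check_both_paths(1) from main() should print the message twice
on a terminal. Stepping through both helpers with gdb's stepi shows the
extra call into libc that the first path makes.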

> Here's where you're wrong. System calls are complex beasts, and you
> need to know everything that the processor does with the trap invoked
> by the program and everything that the kernel does with that
> information before passing it off to the sys_write() C routine. In the
> i386 case, you'd need to look at IDTVEC(syscall) in
> src/sys/arch/i386/i386/locore.s, line 2413 in my version, and at
> syscall.c in the same directory.

Thank you for the extra pointers--you're very kind to assist me like
this. :) Don't worry--everything I actually write is in C or the like,
but I still think learning assembly is important.

Thanks again for your response,
Sincerely,
Marc Tooley