Subject: Adventures in Assembly Chapter 2 (Is this the last we see of our
To: None <port-i386@netbsd.org>
From: Marc Tooley <sudog@sudog.com>
List: port-i386
Date: 09/03/2001 02:23:01
(apologies if this is a second posting--first one didn't appear to go
through..)

Second in a series of NetBSD Assembly expositions for those of us stubborn,
crazy, and insane enough to care about this sort of thing, sudog presents:

Marc's Adventures in Assembly, Chapter 2

When we last left our hero, Marc was in the process of celebrating and
bringing home from his NetBSD work machine his prize: A so-called working
assembly program that accessed syscalls directly and was a mere few hundred
bytes--his first real NetBSD-based assembly program!

Little did most of us know that when the code was brought home, it failed
utterly to produce anything but an odd ktrace:

 16249 ktrace   EMUL  "netbsd"
 16249 ktrace   RET   ktrace 0
 16249 ktrace   CALL  execve(0xbfbfdc73,0xbfbfdbd0,0xbfbfdbd8)
 16249 ktrace   NAMI  "./hello"
 16249 hello    EMUL  "linux"
 16249 hello    RET   oldolduname -1 errno -2 No such file or directory
 16249 hello    CALL  write(0xbfbfdff0,0,0)
 16249 hello    RET   write -1 errno -9 Bad file descriptor
 16249 hello    CALL  exit(0xbfbfdff0)

Linux emulation strikes us down where we stand, and makes us humble--if a
little overly curious or zealous in our single-minded drive to continue this
to its natural completion. I suddenly realize that by their silence, a very
rare few who actually knew about what I was trying to do were either:

i. Mocking me.

ii. Waiting patiently, buddha-like, for me to arrive at a "true"
answer by myself, and thus granting me enlightenment when I finally
discovered it.

iii. Completely ignoring the thread. :)

Anyhow, to continue. To solve our problem, I suspect we need to learn more
about how NetBSD recognizes code, and how to tell the NetBSD loader code
NOT to try to use Linux emulation. Just as a test, we alter our original
program: instead of pushing our arguments and return address onto the stack
(see Adventures in Assembly, Chapter 1), we move the syscall number into EAX
and the arguments into EBX, ECX, and EDX, Linux-style, and make the syscall:

section .data
msg	db	"Hello World!",0x0a
len	equ	$-msg

section .text
	global _start
_start:
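	mov	eax, 0x04	; syscall number 4: write
	mov	ebx, 0x01	; arg 1: fd 1 (stdout)
	mov	ecx, msg	; arg 2: buffer
	mov	edx, len	; arg 3: byte count
	int	0x80
	mov	eax, 0x01	; syscall number 1: exit
	xor	ebx, ebx	; arg 1: exit status 0 (ebx still held the fd, 1)
	int	0x80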

Compile and execute:

nasm -f elf hello.s
ld -s -o hello hello.o
./hello
Hello World!

ktrace tells us NetBSD thinks it's a Linux binary too.
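
If you get tired of eyeballing the dump for this, the check is easy to
script. What follows is only a sketch, assuming Python 3 is available and
that the kdump text has the same column layout as the trace earlier in this
post. The function name emulations() is made up for illustration and is not
part of any NetBSD tool.

```python
import re

# Sample kdump lines copied from the trace earlier in this post.
KDUMP = """\
 16249 ktrace   EMUL  "netbsd"
 16249 ktrace   RET   ktrace 0
 16249 ktrace   CALL  execve(0xbfbfdc73,0xbfbfdbd0,0xbfbfdbd8)
 16249 ktrace   NAMI  "./hello"
 16249 hello    EMUL  "linux"
"""

def emulations(kdump_text):
    """Return {command: emulation name} for every EMUL record found."""
    found = {}
    for line in kdump_text.splitlines():
        # Columns: pid, command, record type, then the quoted emulation name.
        m = re.match(r'\s*(\d+)\s+(\S+)\s+EMUL\s+"([^"]*)"', line)
        if m:
            found[m.group(2)] = m.group(3)
    return found

if __name__ == "__main__":
    for command, emul in emulations(KDUMP).items():
        print(command, "->", emul)
```

Anything other than "netbsd" next to our program's name means the loader
picked an emulation we did not want.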

So! How do we find out more about the binary formats? Let's do some
searching, shall we? First round, we find the following *very* interesting
web page off the NetBSD main pages that turns out to be a slightly
misleading key to our problem:

NetBSD info on binary file formats..
http://www.netbsd.org/Documentation/kernel/elf-notes.html#note-creation

In there we learn that NetBSD knows about binary file types--at least in ELF
executables--via an odd little ELF feature called a note, which lives in a
PT_NOTE segment. Unfortunately not only is the page GNU as-centric (which is
perfectly reasonable), but the page describes a fictional note that no one
ever uses, called "NaMe", with useless values.. well, for pretty much
everything.

Trying to create a set of bits that matches the English description
of that fictional note yields nothing--my copy of NetBSD still
tries to apply Linux emulation no matter what I try to do.

Perhaps the section is marked incorrectly. Searching the NASM manual for
PT_NOTE tells us nothing. But in the index there is a section on sections
and custom "progbits". Specifically:

NASM elf extensions..
http://web-sites.co.uk/nasm/docs/nasmdoc6.html#section-6.5.1

This tells us about the NASM ability to mark elf sections with ALLOC,
NOWRITE, DATA/CODE, etc. This is crucial, as we'll see later.

But we're still on the right track. A chance click brings up the
following thread from netbsd-bugs:

About .note.netbsd.ident PT_NOTEs:
http://mail-index.netbsd.org/netbsd-bugs/2001/08/01/0006.html

What's that tool they're using to dump object files.. Hrm. Object files.
Dumping. Duh. objdump! Looks like the "--full-contents" option is what we
need:

objdump --all-headers --full-contents `which cal` | less

... showed me the light. But that's awfully odd. Looks like two notes are
crammed together into the one ".note.netbsd.ident" section. Being ever the
one to veer off into side-tracks, I wonder if this is legal. According to
the ELF specification, which mentions "segments" (notice the plural) when
referring to PT_NOTEs, and has a vivid description of a note section (with
two notes in it), it looks like it is.

(See: http://www.muppetlabs.com/~breadbox/software/ELF.txt)

And so here's the objdump of the relevant part, from "/usr/bin/cal":

Contents of section .note.netbsd.ident:
 804810c 07000000 04000000 01000000 4e657442  ............NetB
 804811c 53440000 e10c0300 07000000 07000000  SD..............
 804812c 02000000 4e657442 53440000 6e657462  ....NetBSD..netb
 804813c 73640000                             sd..

Note in the objdump that there is another .note section, but it turns
out that section is unnecessary to get our program working.
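
To make the layout concrete, here is a small sketch that walks the bytes
from the objdump above and splits them into notes. It assumes Python 3 and
the note layout described in the ELF text: three 32-bit little-endian words
(namesz, descsz, type), followed by the name and the desc, each padded to a
4-byte boundary. The function name parse_notes() is invented for this
example.

```python
import struct

# Bytes of .note.netbsd.ident, transcribed from the objdump output above.
DUMP_HEX = """
07000000 04000000 01000000 4e657442
53440000 e10c0300 07000000 07000000
02000000 4e657442 53440000 6e657462
73640000
"""

def pad4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3

def parse_notes(data):
    """Split a note section into (type, name, desc) tuples."""
    notes = []
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from("<III", data, offset)
        offset += 12
        name = data[offset:offset + namesz]
        offset += pad4(namesz)
        desc = data[offset:offset + descsz]
        offset += pad4(descsz)
        notes.append((ntype, name, desc))
    return notes

notes = parse_notes(bytes.fromhex("".join(DUMP_HEX.split())))

if __name__ == "__main__":
    for ntype, name, desc in notes:
        print("type", ntype, "name", name, "desc", desc)
    # The first note's desc is a 32-bit little-endian version number.
    print("OS version:", int.from_bytes(notes[0][2], "little"))
```

The dump splits cleanly into two notes: a type 1 note whose 4-byte desc
decodes to 199905, and a type 2 note whose desc is the string "netbsd".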

Anyhow, with the objdump and the NetBSD page on PT_NOTEs we are armed to the
teeth and ready to make our first attempt at generating our .note in NASM:

section .note.netbsd.ident progbits alloc noexec nowrite
	; (remember the NASM page about section bits? we needed to match them
	; up with the bits that objdump output. Specifically:
	; CONTENTS, ALLOC, LOAD, READONLY, DATA)
; this is the OS version note, or stamp if you will.
	dd	0x00000007	; note name length (namesz in elf.txt)
	dd	0x00000004	; note desc length (descsz in elf.txt)
	dd	0x00000001	; type integer (OS version note type)
				; 0x02 will be emulation note later on
	db	0x4e,0x65,0x74,0x42,0x53,0x44,0x00,0x00
				; "NetBSD\0" (actual "name")
				; objdump shows padded out to 4 byte
				; boundaries, as per doc
	db	0xe1,0x0c,0x03,0x00	; desc part: NetBSD os version, stored
				; little-endian (0x00030ce1 = 199905)
				; see /usr/src/sys/sys/param.h, NetBSD define
; finally, we have the emulation note that says "use NetBSD to handle"
	dd	0x00000007	; note name length
	dd	0x00000007	; note desc length
	dd	0x00000002	; type (emulation note)
	db	0x4e,0x65,0x74,0x42,0x53,0x44,0x00,0x00
				; "NetBSD\0" (padded then to 4 bytes)
	db	0x6e,0x65,0x74,0x62,0x73,0x64,0x00,0x00
				; this is the emulation "name"
				; "netbsd\0" (padded then to 4 bytes)
section .data
msg	db	"Hello World!",0x0a
len	equ	$-msg

section .text
	global _start
_start:
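	push	dword len	; arg 3: byte count
	push	dword msg	; arg 2: buffer
	push	dword 0x01	; arg 1: fd 1 (stdout)
	push	dword myret	; stand-in return address; args sit above it
	mov	eax, 0x04	; syscall number 4: write
	int	0x80
myret:
	push	dword 0x00	; arg 1: exit status 0
	push	dword 0x00	; stand-in return address
	mov	eax, 0x01	; syscall number 1: exit
	int	0x80
	; we should prolly clean up esp if we were to make this into
	; a real program...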

Compile and execute, same as before:

nasm -f elf hello.s
ld -s -o hello hello.o
./hello
Hello World!

Success! On the first try! Perhaps this time, this is a bit more of a
lasting euphoria? With objdump at our disposal, we can alter our
program section header to conform to future changes, since it looks
like the NetBSD loader code is in regular flux.
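
If the loader does change, typing the note bytes out by hand again gets old
fast. Below is a minimal sketch of generating them instead, assuming
Python 3 on the build machine (nothing in this post actually used it). The
names build_note() and as_nasm() are hypothetical helpers, and the values
fed in are just the ones read off the objdump of /usr/bin/cal.

```python
import struct

def build_note(name, ntype, desc):
    """Pack one ELF note: namesz, descsz, type as 32-bit little-endian
    words, then name and desc, each zero-padded to a 4-byte boundary."""
    def padded(b):
        return b + b"\0" * (-len(b) % 4)
    header = struct.pack("<III", len(name), len(desc), ntype)
    return header + padded(name) + padded(desc)

def as_nasm(data):
    """Render bytes as NASM 'db' lines, four bytes per line."""
    lines = []
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        lines.append("\tdb\t" + ",".join("0x%02x" % b for b in chunk))
    return "\n".join(lines)

# The two notes seen in the dump: OS version (type 1) and emulation (type 2).
ident = (
    build_note(b"NetBSD\0", 1, struct.pack("<I", 199905))
    + build_note(b"NetBSD\0", 2, b"netbsd\0")
)

if __name__ == "__main__":
    print("section .note.netbsd.ident progbits alloc noexec nowrite")
    print(as_nasm(ident))
```

The output can be pasted over the hand-written dd/db lines. If objdump on a
newer system shows different values, only the arguments to build_note()
should need to change.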

NOW, we are more ready to create standalone i386 1.5.2 NetBSD assembly. Ha
ha, I wonder how many people would actually run our software, let alone have
the ability to. :) But this is an exercise, nothing more. After all, the
only thing our program does is "Hello World!\n" right? :)

So, thus concludes Chapter 2 of Marc's Adventures in Assembly. Tune in
next time; same bat-brain, same bat-channel, where we decide how this
information might be useful in self-modifying encrypted executables!
Or rediscover how to run Hello World on the next incarnation of
the NetBSD loader anyway.

Good night!
Marc Tooley