Subject: what's in a name? fingerprinted exec
To: None <tech-security@netbsd.org>
From: Brett Lymn <blymn@baesystems.com.au>
List: tech-security
Date: 10/12/2002 22:25:22
<firstly, please CC me on any replies, thanks>

Folks,
        Some of you may be aware that others and I have been
working on an idea I have had for some time.  Basically the idea is to
give the kernel the ability to verify that an executable has not been
modified before it is allowed to be executed.  You can think of this
as a kernel level tripwire if you like.  In the past I have called
this "signed exec" but it has been pointed out to me that this is
probably not the best name.  I am considering calling it "hashed exec"
instead since this is closer to what the thing does but I am open to
better naming suggestions.  If there are no strenuous objections to
the name, or once a better name has been suggested, I shall commit
what I have done to the main NetBSD tree.

For those that are not aware of what hashed exec does, it is a
modification to the exec path of the kernel.  When an executable is
requested to be exec'ed, the kernel performs checks to ensure the
target is a valid thing to execute.  We have added another check that
evaluates a cryptographic hash (either md5 or sha1 at the moment) of
the on disk file and compares this hash to an in kernel list of hashes
that was loaded at boot time.  If the two hashes match then the file
has not been tampered with and the execution is permitted to proceed.
If the hashes do not match or there is no hash for the binary then the
execution is denied.  This does not just apply to binaries - any file
that is exec'ed such as shell scripts can be verified in this manner.
Also, as an extension, a flag can be set so that a binary may be used
as a script interpreter (ie in the #! at the top of a shell script)
but is not allowed to be invoked from the command line - this means
you can have a powerful script interpreter binary that can be used for
approved (and validated) scripts but cannot be used to run
unauthorised scripts.  The hashed exec scheme can also fingerprint
arbitrary files and prevent the open of a file that has been modified
or tampered with - this is used to allow the protection of shared
libraries, preventing a trojan being inserted into a shared library.
Though protecting the shared libraries was the main aim, the upshot is
that any file on the system may be protected from tampering.
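
As a rough userland illustration of the check described above (this
is not the kernel code), the sketch below hashes a file and compares
the result against a table loaded in advance.  The names FINGERPRINTS,
fingerprint, may_exec and the interpreter_only flag, along with the
layout of the table, are assumptions made up for this example.

```python
import hashlib

# Hypothetical table: path -> (algorithm, expected hex digest, interpreter_only).
# In the real scheme this list lives in the kernel and is loaded at boot.
FINGERPRINTS = {}

def fingerprint(path, algorithm):
    """Hash the whole on-disk file with the named algorithm ("md5" or "sha1")."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def may_exec(path, as_interpreter=False):
    """Permit execution only if a fingerprint exists and still matches."""
    entry = FINGERPRINTS.get(path)
    if entry is None:
        return False                      # no fingerprint for this file: deny
    algorithm, expected, interpreter_only = entry
    if interpreter_only and not as_interpreter:
        return False                      # usable via #! only, not directly
    return fingerprint(path, algorithm) == expected
```

A file that is replaced by a new one has no entry in the table, and a
file whose contents change no longer matches its entry, so both cases
end in a denial.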

This scheme is not perfect; it is not going to magically make the
system totally secure.  What it does do is prevent a number of common
attacks from working, and thus "raises the bar" significantly.  This
scheme also paves the way (I am NOT saying this will be done but
simply that it will be possible) for the NetBSD project to release a
kit that contains the bootloader, kernel, fingerprint loader and
fingerprint list with a signature on it.  The package can be
downloaded and the signature on it verified independently of the
target system.  The contents can then be loaded onto the target system
and the machine booted.  If care is taken doing this then the system
will then be automatically validated and trust in the binaries on the
machine established.  Given the current trend of trojaned
distributions I feel this may be a desirable thing to do.

Note that hashed exec is not something that would be desirable on
every machine, indeed, it is a major pain in the arse for a user's
workstation because updates to the system can only be done in single
user mode.  This scheme is more intended for those infrastructure type
machines (firewalls, routers, servers) where it is important that the
correct operation of the machine is ensured in a hostile environment.

Over the years there have been some recurring questions asked when I
have talked about this scheme.  Here are some of them:

Q: How are the fingerprints loaded into the kernel?
A: They are passed into the kernel via a pseudo-device by a loader
   app.

Q: So, how do you stop the list being updated later?
A: by using securelevel - the fingerprints can only be loaded at
   securelevel == 0
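
   A minimal sketch of that gate, assuming a made-up FingerprintTable
   class standing in for the pseudo-device (the class and method names
   are hypothetical, not the real interface):

```python
class FingerprintTable:
    """Toy model of the in-kernel list guarded by securelevel."""

    def __init__(self):
        self.securelevel = 0
        self.entries = {}

    def load(self, path, algorithm, digest):
        # Loading is refused unless securelevel is exactly 0.
        if self.securelevel != 0:
            raise PermissionError("fingerprints can only be loaded at securelevel 0")
        self.entries[path] = (algorithm, digest)

    def raise_securelevel(self):
        # Assumption for this model: the level is only ever raised here.
        self.securelevel += 1
```

   Once raise_securelevel() has been called, further load() calls fail
   and the list is effectively frozen.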

Q: Why not embed the fingerprints in the binary somehow?
A: two reasons 1) my idea was to cause the minimal amount of change to
   the system as possible, embedding fingerprints would mean special
   tools and other complications.  2) shell scripts would not be
   covered.

Q: Does this blow demand paging out of the water?
A: yes and no.  Yes, the first time an executable is loaded the whole
   file will be read thus demand paging goes out the window.  If the
   executable is then re-run, demand paging works because the result
   of the previous evaluation of the fingerprint is cached in the
   file's vnode; as long as the vnode stays in memory the fingerprint
   will not be re-evaluated.
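
   The caching idea can be sketched as follows.  The Vnode class, its
   fields and the check function are stand-ins invented for this
   example; they only model "remember the result while the vnode is in
   memory".

```python
import hashlib

class Vnode:
    """Stand-in for an in-memory vnode; gone once the file falls out of cache."""

    def __init__(self, path):
        self.path = path
        self.verified = None   # None = not evaluated yet, otherwise True/False
        self.full_reads = 0    # how many times the whole file was read

def check(vnode, expected_sha1):
    """Evaluate the fingerprint once, then reuse the cached result."""
    if vnode.verified is None:
        with open(vnode.path, "rb") as f:
            digest = hashlib.sha1(f.read()).hexdigest()
        vnode.full_reads += 1
        vnode.verified = (digest == expected_sha1)
    return vnode.verified
```

   Only the first check() on a given Vnode reads the whole file; later
   calls return the stored result.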

Q: So, you trust the file on the disk... what if someone overwrites
   it?
A: Any fingerprinted file is automatically made read-only.  If the
   file is moved/removed and a new one put in its place there will
   not be a fingerprint in the kernel list for the new file so it will
   not be used.  Even if the new file did have the same inode, if the
   contents are modified then the fingerprint will not match anyway.

Q: What if I have my files on a SAN? I cannot trust that my files will
   not be modified by another machine!
A: Previously I would have said "tough, don't do that" but now that we
   have cgd(4) in NetBSD-current you have the option of encrypting the
   file system on the SAN, thus preventing unauthorised tampering.

Q: Is there a performance decrease?
A: Not that I could realistically measure.  Yes, theoretically there
   should be a performance hit because there is a lot of work involved
   in evaluating the fingerprint.  The real world impact is not as
   great due to the caching of the fingerprint comparison result.
   Since most machines execute the same set of binaries over and over
   there is little overhead added.

Q: Have you seen the unofficial signed exec OpenBSD patch (aka TrojanProof)?
A: Yes.  My work predates it by a long while.  Version 1 of
   TrojanProof was pretty simple.  It evaluated the signature each time
   thus killing both demand paging and performance.  In version 2 it
   looks like the TrojanProof author has used some of my ideas to
   improve performance, giving me due credit which was good.

Q: What hash are you using?
A: either md5 or sha1, it can be selected on a file by file basis

Q: You know that md5 is broken?
A: yes.  The two algorithms are there to demonstrate the framework for
   adding different hashes.  The internals have been done to try and
   make it easier to "bolt on" new cryptographic hashes with a minimum
   of fuss - having two different hashes is supposed to show how to do
   this.
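
   One way to picture such a framework is a table of hash
   implementations keyed by name.  This is only a sketch of the idea;
   HASH_OPS, register_hash and digest are names invented here and say
   nothing about how the kernel code is actually laid out.

```python
import hashlib

# Hypothetical registry: algorithm name -> constructor taking the data.
HASH_OPS = {
    "md5": hashlib.md5,
    "sha1": hashlib.sha1,
}

def register_hash(name, constructor):
    """Bolt on another hash without touching the callers."""
    HASH_OPS[name] = constructor

def digest(name, data):
    """Hash data with whichever algorithm the fingerprint entry names."""
    return HASH_OPS[name](data).hexdigest()
```

   Adding a stronger hash is then a single call, for example
   register_hash("sha256", hashlib.sha256), and the algorithm can still
   be chosen on a file by file basis.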

Q: Doesn't chflags(1) do all this already?
A: Not really.  It can be used to do some of the work but there are
   some things it cannot do: it cannot prevent a file from being
   executed, nor can it give any confidence that what you are
   executing has not been tampered with.

Q: Is this finished?
A: not by a long shot.  The basic system works at protecting the
   executables and files but there is work that needs to be done to
   strengthen the protection offered during the early stages of the
   boot process.

-- 
Brett Lymn