How to keep the kernel from crashing on cd9660 error ?

To: tech-kern%netbsd.org@localhost
Subject: How to keep the kernel from crashing on cd9660 error ?
From: "Thomas Schmitt" <scdbackup%gmx.net@localhost>
Date: Tue, 03 Jun 2014 18:31:52 +0200

Hi,

I could use some advice about getnewvnode(9) and how to revoke
the creation of the vnode.

While testing my next change proposal for stability with
indigestible ISO 9660 files, I experienced kernel crashes which
look like memory corruption.

To prove that my changes are not to blame, I installed a little
error generator in the current cd9660_vfsops.c, at the place
where my new code will throw EOPNOTSUPP because of an indigestible
file.

It triggers the same crash as the real error complaint in my
changed code. So the problem already sits in cd9660.

I could possibly fake an ISO image which would trigger an error
condition that is already in function cd9660_vget_internal() and
very near to the spot where my test causes havoc.

So this could be a DoS attack path.

---------------------------------------------------------------

Roughly, this is what happens in cd9660_vget_internal():

- Input is the inode number.

- A shortcut is tried for a cached vnode. No problem if it triggers.

- getnewvnode() obtains a new vnode.
  (It is needed at the latest when the directory record of the desired
   inode number is read. So its creation cannot be postponed until after
   the error situation which that record triggers.)

- pool_get() obtains an iso_node for the new vnode.

- Apparently a check for a race condition is made. (No problem.)

- Several operations follow which have the potential to cause
  an error. On error, most of them do:

                vput(vp);
                if (bp != 0)
                        brelse(bp, 0);
                return (E...);

  I wanted to do the same. But that seems to be a bad idea.
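
To make the cleanup question concrete, here is a user-space sketch of the
general acquire-then-unwind idiom in C. It is only a model under stated
assumptions: model_vnode, model_node, model_alloc(), model_free() and
model_vget() are hypothetical names, not NetBSD kernel interfaces, and the
sketch says nothing about which kernel calls correctly undo getnewvnode().
It only shows the property wanted here: after an error return, nothing that
was acquired is left behind.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are NetBSD kernel types. */
struct model_node { int ino; };
struct model_vnode { struct model_node *data; };

/* Counts live allocations so that a caller can check for leaks. */
static int live_objects = 0;

static void *
model_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p != NULL)
		live_objects++;
	return p;
}

static void
model_free(void *p)
{
	if (p != NULL) {
		free(p);
		live_objects--;
	}
}

/*
 * Allocate a vnode and its node, then optionally fail.
 * On failure everything acquired so far is released in reverse
 * order and *vpp is left NULL.
 */
static int
model_vget(int ino, int fail, struct model_vnode **vpp)
{
	struct model_vnode *vp;
	struct model_node *np;
	int error;

	*vpp = NULL;
	vp = model_alloc(sizeof(*vp));
	if (vp == NULL)
		return ENOMEM;
	np = model_alloc(sizeof(*np));
	if (np == NULL) {
		error = ENOMEM;
		goto out_vnode;
	}
	np->ino = ino;
	vp->data = np;

	if (fail) {	/* stands in for an unusable directory record */
		error = EOPNOTSUPP;
		goto out_node;
	}
	*vpp = vp;
	return 0;

out_node:
	model_free(np);
out_vnode:
	model_free(vp);
	return error;
}
```

A caller can compare live_objects before and after a failing model_vget()
call; in this model the two values are equal.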

My mock-up in the current cd9660_vfsops.c throws an error on every
third VOP_LOOKUP(9) or VOP_VGET(9) call.
The kernel survives the first such error and crashes on the
second one.
---------------------------------------------------------------
--- cd9660_vfsops.c.patch_006   2014-06-01 13:16:27.000000000 +0000
+++ cd9660_vfsops.c     2014-06-03 15:47:32.000000000 +0000
@@ -858,6 +858,19 @@ cd9660_vget_internal(struct mount *mp, i
                break;
        }
 
+/* <<< Error mock-up */
+{ static uint64_t error_cycler = 0;
+       error_cycler++;
+       if ((error_cycler % 3) == 0) {
+               printf("cd9660_vfsops.c: Deliberate error EOPNOTSUPP\n");
+               vput(vp);
+               if (bp != 0)
+                       brelse(bp, 0);
+               return (EOPNOTSUPP);
+       }
+}
+
+
        if (bp != 0)
                brelse(bp, 0);
---------------------------------------------------------------
(I am aware that this leaks the iso_node.)
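
The counting logic of the mock-up can be tried in isolation with the
following user-space sketch. It is not kernel code: the function name
maybe_inject_error() is hypothetical, and the vput()/brelse() cleanup from
the patch is deliberately left out, so it only shows which calls fail.

```c
#include <errno.h>
#include <stdint.h>

/*
 * User-space model of the error cycler in the patch above.
 * Hypothetical helper, not part of cd9660: returns EOPNOTSUPP on
 * every third call and 0 on all other calls.
 */
static int
maybe_inject_error(void)
{
	static uint64_t error_cycler = 0;

	error_cycler++;
	if ((error_cycler % 3) == 0)
		return EOPNOTSUPP;
	return 0;
}
```

Calls number 3 and 6 fail, which matches the two error occasions described
in this mail.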

With this kernel booted, I do

  netbsd# mount_cd9660  '/dev/wd1f' '/mnt/iso'
  netbsd# ls -l /mnt/iso
  ls: my: Operation not supported
  total 8
  dr-x------  1 thomas  wheel  2048 May  3 14:58 dev
  dr-x------  1 thomas  wheel  2048 Jan 19 14:41 reg
  -r--------  1 thomas  dbus      6 May  6 15:34 small_file
  netbsd# ls -l /mnt/iso

This yields a crash and reboot.

  netbsd# crash
  crash> dmesg
  ...
  cd9660_vfsops.c: Deliberate error EOPNOTSUPP
  panic: kernel diagnostic assertion "(*vpp)->v_size != VSIZENOTSET && (*vpp)->v_writesize != VSIZENOTSET" failed: file "/usr/src/sys/kern/vnode_if.c", line 124
  cpu0: Begin traceback...
  vpanic(c2f5d840,c26a8800,2,0,daabdf68,c093e072,c2fb4d40,ffffff9c,bb90a3f4,0) at netbsd:vpanic+0x120
  cpu0: End traceback...

I only see the message of the first occasion. The second one did
not come through. But I am quite sure a second one happened.
At least I had to "ls -l" my bad file two times, before I began
to worsen the situation by adding diagnostic code.

---------------------------------------------------------------

What makes me think of memory corruption:

- Varying last screams in the "dmesg" output of crash, when I tried
  to hunt down the problem in my changed code.

- Implausible code paths. E.g. the above KASSERT in
    /usr/src/sys/kern/vnode_if.c
  is supposed to take effect only with error == 0, but it triggers
  only when cd9660 should have returned error != 0.

- Symptoms get worse if I insert printf() to trace the
  upward propagation of the error return value.
  It then crashes already on the first error occasion and with more
  dramatic messages in crash's dmesg:

    uvm_fault(0xc2a2a920, 0, 2) -> 0xe
    fatal page fault in supervisor mode
    trap type 6 code 2 eip c02579d1 cs 8 eflags 10282 cr2 1c ilevel 0 esp c0943d3d
    curlwp 0xc2fb5d40 pid 815 lid 1 lowest kstack 0xda96b2c0
    panic: trap
    cpu0: Begin traceback...
    uvm_fault(0xc2a2a920, 0, 1) -> 0xe
    fatal page fault in supervisor mode
    trap type 6 code 0 eip c029e324 cs 8 eflags 10246 cr2 6 ilevel 0 esp 0
    curlwp 0xc2fb5d40 pid 815 lid 1 lowest kstack 0xda96b2c0
    Skipping crash dump on recursive panic
    panic: trap
    Faulted in mid-traceback; aborting...

---------------------------------------------------------------

My question is: How shall I repair this function, so that it
can revoke the creation of the vnode in case of errors which
indicate that the vnode will be unusable or worse?

(The actual test object is a data file with two sections.
 The first is not aligned to block size. So VOP_BMAP(9) cannot
 neatly map file blocks to partition blocks.
 Debian 6 GNU/Linux tolerates such a file but shows wrong
 content, partly from a different data file.)
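
Why such a file cannot be mapped block by block can be shown with a little
arithmetic. The sketch below assumes a block size of 2048 bytes and that
every section starts on a block boundary of the medium; the function name
sections_block_aligned() is hypothetical. If the first section is, say,
3000 bytes long, then file block 1 (bytes 2048 to 4095) consists of 952
bytes from the end of section 1 and 1096 bytes from the start of section 2,
so no single partition block holds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if every section except the last
 * has a byte length that is a multiple of block_size. Only then does
 * each file block lie entirely within one block of one section.
 */
static bool
sections_block_aligned(const uint64_t *section_len, size_t count,
    uint32_t block_size)
{
	size_t i;

	assert(block_size > 0);
	for (i = 0; i + 1 < count; i++) {
		if (section_len[i] % block_size != 0)
			return false;
	}
	return true;
}
```

With section lengths { 3000, 4096 } and block size 2048 the check fails;
with { 4096, 100 } it succeeds, because only the last section may end in
the middle of a block.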


Have a nice day :)

Thomas
