Current-Users archive


Strange boot problems on amd64-current (6.99.40)



(Resending without attachments, as I seem to have exceeded a size limit)

My current working kernel (amd64) is a customized 6.99.28 built from sources that were current on 2013-12-14 at 23:34:52 UTC. Everything works correctly, although there have been some strange errors coming from my CD drive (complaints about "incompatible media").

I just tried to update to 6.99.40 (sources from 2014-04-14 at 15:43:14). With no changes other than the updated kernel (and modules), it crashes with:

kernel: pagefault trap, code=0
uvm_fault(0xfffffe813aec5e60, 0x0, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff80230fe2 cs 8 rflags 10246 cr2 0 ilevel 8 rsp fffffe813aebb048
curlwp 0xfffffe813aec8880 pid 1.1 lowest kstack fffffe8a3aebc2c0

The above messages repeat several times, until reaching the bottom of the screen, with a db-more prompt.
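For readers who want to decode the trap line, here is a minimal sketch. It assumes the standard x86 page-fault error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode) and that the numeric fields are printed in hexadecimal. The function name `decode_trap` is illustrative only and not part of any NetBSD tool.

```python
import re

# The trap line exactly as printed in the report above.
TRAP_LINE = ("trap type 6 code 2 rip ffffffff80230fe2 cs 8 rflags 10246 "
             "cr2 0 ilevel 8 rsp fffffe813aebb048")

def decode_trap(line):
    """Extract code, rip and cr2 from the trap line and decode the error code."""
    # Assumption: all three fields are hexadecimal without a 0x prefix.
    fields = dict(re.findall(r"(code|rip|cr2) ([0-9a-f]+)", line))
    code = int(fields["code"], 16)
    return {
        "rip": int(fields["rip"], 16),         # address of the faulting instruction
        "fault_addr": int(fields["cr2"], 16),  # cr2 holds the address that was accessed
        "page_present": bool(code & 0x1),      # bit 0: set if the page was present
        "write": bool(code & 0x2),             # bit 1: set for a write access
        "user_mode": bool(code & 0x4),         # bit 2: set if the fault came from user mode
    }

info = decode_trap(TRAP_LINE)
print(info)
```

Read this way, the line describes a kernel-mode write to address 0 on a page that is not mapped, which agrees with the "fatal page fault in supervisor mode" message and the 0x0 address in the uvm_fault line.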

Funny thing is, the reported rip seems to be in the middle of setting up the keyboard (pckbport_set_poll)!

(gdb) disass 0xffffffff80230fe2
Dump of assembler code for function pckbport_set_poll:
   0xffffffff80230fd7 <+0>:     push   %rbp
   0xffffffff80230fd8 <+1>:     mov    %rsp,%rbp
   0xffffffff80230fdb <+4>:     movslq %esi,%rax
   0xffffffff80230fde <+7>:     mov    (%rdi,%rax,8),%rax
   0xffffffff80230fe2 <+11>:    mov    %edx,(%rax)
   0xffffffff80230fe4 <+13>:    mov    0x90(%rdi),%rax
   0xffffffff80230feb <+20>:    mov    0x98(%rdi),%rdi
   0xffffffff80230ff2 <+27>:    mov    0x28(%rax),%rax
   0xffffffff80230ff6 <+31>:    pop    %rbp
   0xffffffff80230ff7 <+32>:    jmpq   *%rax
End of assembler dump.
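To pin down which instruction the rip refers to, the gdb listing can be searched mechanically. The sketch below does this on an excerpt of the dump above. The helper name `find_instruction` is hypothetical, and the regular expression assumes the gdb line format shown here.

```python
import re

# Excerpt of the gdb disassembly shown above (first five instructions).
DISASSEMBLY = """\
   0xffffffff80230fd7 <+0>:     push   %rbp
   0xffffffff80230fd8 <+1>:     mov    %rsp,%rbp
   0xffffffff80230fdb <+4>:     movslq %esi,%rax
   0xffffffff80230fde <+7>:     mov    (%rdi,%rax,8),%rax
   0xffffffff80230fe2 <+11>:    mov    %edx,(%rax)
"""

def find_instruction(disassembly, rip):
    """Return (offset, instruction) for the line at address rip, or None."""
    pattern = re.compile(r"^\s*0x([0-9a-f]+) <\+(\d+)>:\s+(.*)$")
    for line in disassembly.splitlines():
        m = pattern.match(line)
        if m and int(m.group(1), 16) == rip:
            # Collapse the column padding between mnemonic and operands.
            return int(m.group(2)), " ".join(m.group(3).split())
    return None

hit = find_instruction(DISASSEMBLY, 0xffffffff80230fe2)
print(hit)
```

The match is the store at <+11>, which writes %edx through the pointer loaded at <+7> from (%rdi,%rax,8). Together with cr2 being 0, one plausible reading is that the loaded pointer was NULL. That is an inference from the listing, not a confirmed diagnosis.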


So I tried rebooting, this time with a PS/2 keyboard attached (in addition to the original USB keyboard). It now gets all the way through boot, right up to the very end, where it says:

WARNING: double match for boot device (wd0, wd1)
raid0: RAID Level 1
raid0: Components: /dev/wd0e /dev/wd1e
raid0: Total sectors 488395008 (238474 MB)
raid1: RAID Level 1
raid1: Components: /dev/wd2e /dev/wd2e
raid1: Total sectors 976770944 (476938 MB)
boot device: raid0
root on raid0e dumps on raid0b
warning: no /dev/console
exec /sbin/init: error 2

It then tries to exec oinit, init.bak, and rescue/init, all resulting in the same "error 2".
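Assuming the "error 2" printed here is a raw errno value, it can be looked up as sketched below. The lookup uses the errno table of whatever host runs the script; value 2 is ENOENT on POSIX systems, but the message text varies by platform.

```python
import errno
import os

code = 2  # the "error 2" reported for each init candidate

# Map the number to its symbolic name and the host's message text.
name = errno.errorcode[code]
print(name, "-", os.strerror(code))
```

ENOENT for every candidate path, combined with the "no /dev/console" warning, would be consistent with those files not being found on whatever filesystem was mounted as root.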

For now, I have "retreated" to my older 6.99.28 kernel, with no problems. But I'd really like to move forward with this upgrade.

FWIW, I have several other machines, all running the exact same 6.99.40 kernel. Those other machines are running just fine. There are some significant differences in hardware configuration on those machines (as compared to the machine-with-the-problems). I suspect that one of the most important differences is that those other machines are NOT running raid.


The custom kernel config file, as well as the dmesg text from both kernels, are available.



-------------------------------------------------------------------------
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |                          | pgoyette at netbsd.org  |
-------------------------------------------------------------------------

