Current-Users archive

Re: Strange boot problems on amd64-current (6.99.40)



I have a similar problem; I discussed it earlier.


Copied from my earlier mail to this list:
...
root file system type: ffs
fatal protection fault in supervisor mode
trap type 4 code 0 rip ffffffff80511a2a cs 8 rflags 10246 cr2 ffff800036c88000
ilevel 0 rsp fffffe8006d08a20
curlwp 0xfffffe8006d4da00 pid 1.1 lowest kstack 0xfffffe8006d052c0
kernel: protection fault trap, code=0
Stopped in pid 1.1 (init) at    netbsd:check_exec+0x319:      call     *8(%rax)

db{1}>  bt
check_exec() at netbsd:check_exec+0x319
execve_loadvm() at netbsd:execve_loadvm+0x1ca
execve1() at netbsd:execve1+0x28
start_init() at netbsd:start_init+0x26f
db{1}>
.....
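
To compare the two crash reports in this thread field by field, here is a minimal sketch that pulls the values out of the "trap type ..." line. The function name `parse_trap` and the small trap-name table are my own; the table only covers the two trap types that appear in this thread (4 and 6), paired with the messages the kernel printed next to them. I am assuming every field except the trap type is printed in hex.

```python
import re

# Field order as it appears in the "trap type ..." lines quoted in this thread.
TRAP_RE = re.compile(
    r"trap type (?P<type>\d+) code (?P<code>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) cs (?P<cs>[0-9a-f]+) "
    r"rflags (?P<rflags>[0-9a-f]+) cr2 (?P<cr2>[0-9a-f]+) "
    r"ilevel (?P<ilevel>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+)"
)

# Only the two trap types seen in this thread, with the kernel's own wording.
TRAP_NAMES = {4: "protection fault", 6: "page fault"}


def parse_trap(text):
    """Return the trap fields from a (possibly line-wrapped) crash report."""
    flat = " ".join(text.split())  # undo mail line wrapping
    match = TRAP_RE.search(flat)
    if match is None:
        raise ValueError("no trap line found")
    fields = match.groupdict()
    result = {"type": int(fields.pop("type"))}
    # Assumption: the remaining fields are hexadecimal.
    result.update({name: int(value, 16) for name, value in fields.items()})
    result["name"] = TRAP_NAMES.get(result["type"], "unknown")
    return result


MY_REPORT = """
trap type 4 code 0 rip ffffffff80511a2a cs 8 rflags 10246 cr2
ffff800036c88000 ilevel 0 rsp fffffe8006d08a20
"""

if __name__ == "__main__":
    trap = parse_trap(MY_REPORT)
    print(trap["name"], hex(trap["rip"]), hex(trap["cr2"]))
```

Running the same function on Paul's report below gives trap type 6 with cr2 0 at ilevel 8, so the two crashes differ in both trap type and faulting address.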

I've tried bisecting with no success. We've located the exact place of
the crash (Masao pointed it out to me, and I sprinkled printfs in the
code).

Still no idea what the problem is.


On 11 May 2014 14:15, Paul Goyette <paul%whooppee.com@localhost> wrote:
> BTW, I just checked a 6.99.41 GENERIC kernel, and it fails in the same
> manner.  So this is not likely a result of my customized kernel config.
>
> I'm suspecting that this is due to booting from an auto-configured raid
> mirror.
>
>
>
> On Sat, 10 May 2014, Paul Goyette wrote:
>
>> On Sat, 10 May 2014, Jonathan A. Kollasch wrote:
>>
>>> On Sat, May 10, 2014 at 04:17:10PM -0700, Paul Goyette wrote:
>>>>
>>>> 15:43:14). With no other changes than the updated kernel (and
>>>> modules), it crashes with
>>>>
>>>> kernel: pagefault trap, code=0
>>>> uvm_fault(0xfffffe813aec5e60, 0x0, 2) -> e
>>>> fatal page fault in supervisor mode
>>>> trap type 6 code 2 rip ffffffff80230fe2 cs 8 rflags 10246 cr2 0
>>>> ilevel 8 rsp fffffe813aebb048
>>>> curlwp 0xfffffe813aec8880 pid 1.1 lowest kstack fffffe8a3aebc2c0
>>>>
>>>> The above messages repeat several times, until reaching the bottom
>>>> of the screen, with a db-more prompt.
>>>>
>>>> Funny thing is, the rip reported seems to be in the middle of
>>>> setting the keyboard!
>>>>
>>>
>>> Is this a DEBUG and/or DIAGNOSTIC kernel?

The funny thing is, with DEBUG and DIAGNOSTIC it works fine. And yes,
I also have an autoconfigured RAID1 root, and another RAID5 array in
this system.

>>
>> No.  Neither option is included in the kernel config file.
>>
>>>> WARNING: double match for boot device (wd0, wd1)
>>>> raid0: RAID Level 1
>>>> raid0: Components: /dev/wd0e /dev/wd1e
>>>> raid0: Total sectors 488395008 (238474 MB)
>>>> raid1: RAID Level 1
>>>> raid1: Components: /dev/wd2e /dev/wd2e
>>>> raid1: Total sectors 976770944 (476938 MB)
>>>> boot device: raid0
>>>> root on raid0e dumps on raid0b
>>>> warning: no /dev/console
>>>> exec /sbin/init: error 2
>>>
>>>
>>> raid0e looks weird.
>>
>>
>> There is no raid0e.
>>
>> wd0e and wd1e are partitioned as follows:
>>
>> screamer:netbsd-local {129} disklabel wd0
>> # /dev/rwd0d:
>> type: ESDI
>> disk: ST3250318AS
>> <snip>
>> 5 partitions:
>> #        size    offset  fstype [fsize bsize cpg/sgs]
>> c: 488395120      2048  unused      0     0        # (Cyl.   2*- 484520)
>> d: 488397168         0  unused      0     0        # (Cyl.   0 - 484520)
>> e: 488395120      2048    RAID                     # (Cyl.   2*- 484520)
>>
>>
>> And raid0 is partitioned as
>>
>> # /dev/rraid0d:
>> type: RAID
>> disk: raid
>> <snip>
>> 6 partitions:
>> #        size    offset     fstype [fsize bsize cpg/sgs]
>> a:  41943040         0  4.2BSD   2048 16384     0  # (Cyl.      0 -  40959)
>> b:  62914560  41943040    swap                     # (Cyl.  40960 - 102399)
>> c: 488395008         0  unused      0     0        # (Cyl.      0 - 476948*)
>> d: 488395008         0  unused      0     0        # (Cyl.      0 - 476948*)
>> e: 125829120 104857600  4.2BSD   2048 16384     0  # (Cyl. 102400 - 225279)
>> f: 257708288 230686720  4.2BSD   2048 16384     0  # (Cyl. 225280 - 476948*)
>>
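
As a sanity check on the raid0 label quoted above, here is a small sketch that verifies the a, b, e and f partitions are contiguous and together cover the whole device, and that the sector counts match the MB figures in the boot messages. The helper names are made up for illustration, and I am assuming 512-byte sectors.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors

# (name, size, offset) in sectors, copied from the raid0 disklabel above.
# c and d are the whole-device entries and are left out.
RAID0_PARTS = [
    ("a", 41943040, 0),
    ("b", 62914560, 41943040),
    ("e", 125829120, 104857600),
    ("f", 257708288, 230686720),
]
RAID0_TOTAL = 488395008  # "raid0: Total sectors" from the boot messages


def check_layout(parts, total):
    """Return a list of gaps or overlaps; an empty list means the layout is clean."""
    problems = []
    expected = 0
    for name, size, offset in sorted(parts, key=lambda p: p[2]):
        if offset != expected:
            problems.append(f"{name}: starts at {offset}, expected {expected}")
        expected = offset + size
    if expected != total:
        problems.append(f"partitions end at {expected}, device has {total}")
    return problems


def sectors_to_mb(sectors):
    """Convert a sector count to whole MB (2**20 bytes), rounding down."""
    return sectors * SECTOR_BYTES // (1024 * 1024)


if __name__ == "__main__":
    print(check_layout(RAID0_PARTS, RAID0_TOTAL))
    print(sectors_to_mb(RAID0_TOTAL))
```

With the numbers from this thread the layout check comes back empty, and the conversion gives 238474 MB for raid0 and 476938 MB for raid1, matching the boot output. So the label itself looks consistent.
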
>>
>> And raid0 config looks like this:
>>
>> # raid0.conf RAID-1 configuration
>> #
>> # This array should be made bootable!
>>
>> # Describe the array
>> START array
>>
>> #numrow numcol numspare
>> 1 2 0
>>
>> # Identify physical disks
>> START disks
>> /dev/wd0e
>> /dev/wd1e
>>
>> # Layout is simple - 64 sectors per stripe
>> START layout
>>
>> #Sect/StripeUnit StripeUnit/ParityUnit StripeUnit/ReconUnit RaidLevel
>> 128 1 1 1
>>
>> # No spares
>> #START spare
>>
>> # Command queueing
>> START queue
>> fifo 100
>>
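
Note that the layout line in this config sets 128 sectors per stripe unit, although the comment above it says 64. A rough sketch of how the reported array size could follow from the component size, under two assumptions of mine: that RAIDframe reserves 64 sectors at the start of each component, and that the usable size is rounded down to a whole number of stripe units. Neither is stated in this thread, and the function name is made up.

```python
RESERVED_SECTORS = 64  # assumption: sectors RAIDframe keeps for itself per component


def raid1_sectors(component_sectors, sect_per_su):
    """Estimate usable RAID-1 sectors for one component size and stripe unit."""
    usable = component_sectors - RESERVED_SECTORS
    # Assumption: round down to a whole number of stripe units.
    return (usable // sect_per_su) * sect_per_su


if __name__ == "__main__":
    # wd0e size from the disklabel, stripe unit from raid0.conf
    print(raid1_sectors(488395120, 128))
```

For wd0e (488395120 sectors) and a stripe unit of 128 this gives 488395008, which is the "Total sectors" value raid0 reports at boot, so the component and array sizes at least agree with each other under those assumptions.
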
>>
>>
>>
>>
>> -------------------------------------------------------------------------
>> | Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
>> | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
>> | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
>> | Kernel Developer |                          | pgoyette at netbsd.org  |
>> -------------------------------------------------------------------------
>>


Chavdar Ivanov

