Current-Users archive


Re: Strange boot problems on amd64-current (6.99.40)



Someone has to step up and bisect the whole kernel tree.
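
For anyone attempting it, a rough sketch of the search itself, assuming the
breakage is monotone (every snapshot before some point boots, every one after
it fails). The date list and the is_bad predicate below are made up for
illustration; in practice is_bad would mean "check out the tree as of that
date, build the same non-DEBUG kernel config, and see whether it boots".

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, revisions[0] is known
    good, revisions[-1] is known bad, and the failure is monotone.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # failure already present at mid
        else:
            lo = mid  # mid still boots
    return revisions[hi]


# Hypothetical example: 30 daily snapshots, breakage introduced on the 17th.
dates = [f"2014-04-{day:02d}" for day in range(1, 31)]
print(first_bad(dates, lambda d: d >= "2014-04-17"))
```

With 30 candidate dates this needs about five builds rather than thirty. If
the failure is intermittent or depends on kernel layout, the monotone
assumption does not hold and the result is not trustworthy.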

On Sun, May 11, 2014 at 11:04 PM, Chavdar Ivanov <ci4ic4%gmail.com@localhost> 
wrote:
> I have a similar problem - I've discussed it earlier.
>
>
> C/P from my earlier mail to this list:
> ...
> root file system type: ffs
> fatal protection fault in supervisor mode
> trap type 4 code 0 rip ffffffff80511a2a cs 8 rflags 10246 cr2
> ffff800036c88000 ilevel 0 rsp fffffe8006d08a20
> curlwp 0xfffffe8006d4da00 pid 1.1 lowest kstack 0xfffffe8006d052c0
> kernel: protection fault trap, code=0
> Stopped in pid 1.1 (init) at    netbsd:check_exec+0x319:      call     *8(%rax)
>
> db{1}>  bt
> check_exec() at netbsd:check_exec+0x319
> execve_loadvm() at netbsd:execve_loadvm+0x1ca
> execve1() at netbsd:execve1+0x28
> start_init() at netbsd:start_init+0x26f
> db{1}>
> .....
>
> I've tried bisecting with no success. We've located the exact place of
> the crash (Masao pointed it out to me, and I sprinkled printfs in the code).
>
> Still no idea what the problem is.
>
> .
>
>
> On 11 May 2014 14:15, Paul Goyette <paul%whooppee.com@localhost> wrote:
>> BTW, I just checked a 6.99.41 GENERIC kernel, and it fails in the same
>> manner.  So this is not likely a result of my customized kernel config.
>>
>> I'm suspecting that this is due to booting from an auto-configured raid
>> mirror.
>>
>>
>>
>> On Sat, 10 May 2014, Paul Goyette wrote:
>>
>>> On Sat, 10 May 2014, Jonathan A. Kollasch wrote:
>>>
>>>> On Sat, May 10, 2014 at 04:17:10PM -0700, Paul Goyette wrote:
>>>>>
>>>>> 15:43:14). With no other changes than the updated kernel (and
>>>>> modules), it crashes with
>>>>>
>>>>> kernel: pagefault trap, code=0
>>>>> uvm_fault(0xfffffe813aec5e60, 0x0, 2) -> e
>>>>> fatal page fault in supervisor mode
>>>>> trap type 6 code 2 rip ffffffff80230fe2 cs 8 rflags 10246 cr2 0
>>>>> ilevel 8 rsp fffffe813aebb048
>>>>> curlwp 0xfffffe813aec8880 pid 1.1 lowest kstack fffffe8a3aebc2c0
>>>>>
>>>>> The above messages repeat several times, until reaching the bottom
>>>>> of the screen, with a db-more prompt.
>>>>>
>>>>> Funny thing is, the rip reported seems to be in the middle of
>>>>> setting the keyboard!
>>>>>
>>>>
>>>> Is this a DEBUG and/or DIAGNOSTIC kernel?
>
> The funny thing is, with DEBUG and DIAGNOSTIC it works fine. And yes,
> I also have an autoconfigured RAID1 root - and another RAID5 array in
> this system.
>
>>>
>>> No.  Neither option is included in the kernel config file.
>>>
>>>>> WARNING: double match for boot device (wd0, wd1)
>>>>> raid0: RAID Level 1
>>>>> raid0: Components: /dev/wd0e /dev/wd1e
>>>>> raid0: Total sectors 488395008 (238474 MB)
>>>>> raid1: RAID Level 1
>>>>> raid1: Components: /dev/wd2e /dev/wd2e
>>>>> raid1: Total sectors 976770944 (476938 MB)
>>>>> boot device: raid0
>>>>> root on raid0e dumps on raid0b
>>>>> warning: no /dev/console
>>>>> exec /sbin/init: error 2
>>>>
>>>>
>>>> raid0e looks weird.
>>>
>>>
>>> There is no raid0e.
>>>
>>> wd0e and wd1e are partitioned as follows:
>>>
>>> screamer:netbsd-local {129} disklabel wd0
>>> # /dev/rwd0d:
>>> type: ESDI
>>> disk: ST3250318AS
>>> <snip>
>>> 5 partitions:
>>> #        size    offset  fstype [fsize bsize cpg/sgs]
>>> c: 488395120      2048  unused      0     0        # (Cyl.   2*- 484520)
>>> d: 488397168         0  unused      0     0        # (Cyl.   0 - 484520)
>>> e: 488395120      2048    RAID                     # (Cyl.   2*- 484520)
>>>
>>>
>>> And raid0 is partitioned as
>>>
>>> # /dev/rraid0d:
>>> type: RAID
>>> disk: raid
>>> <snip>
>>> 6 partitions:
>>> #        size    offset     fstype [fsize bsize cpg/sgs]
>>> a:  41943040         0  4.2BSD   2048 16384     0  # (Cyl.      0 -  40959)
>>> b:  62914560  41943040    swap                     # (Cyl.  40960 - 102399)
>>> c: 488395008         0  unused      0     0        # (Cyl.      0 - 476948*)
>>> d: 488395008         0  unused      0     0        # (Cyl.      0 - 476948*)
>>> e: 125829120 104857600  4.2BSD   2048 16384     0  # (Cyl. 102400 - 225279)
>>> f: 257708288 230686720  4.2BSD   2048 16384     0  # (Cyl. 225280 - 476948*)
>>>
>>>
>>> And raid0 config looks like this:
>>>
>>> # raid0.conf RAID-1 configuration
>>> #
>>> # This array should be made bootable!
>>>
>>> # Describe the array
>>> START array
>>>
>>> #numrow numcol numspare
>>> 1 2 0
>>>
>>> # Identify physical disks
>>> START disks
>>> /dev/wd0e
>>> /dev/wd1e
>>>
>>> # Layout is simple - 64 sectors per stripe
>>> START layout
>>>
>>> #Sect/StripeUnit StripeUnit/ParityUnit StripeUnit/ReconUnit RaidLevel
>>> 128 1 1 1
>>>
>>> # No spares
>>> #START spare
>>>
>>> # Command queueing
>>> START queue
>>> fifo 100
>>>
>>>
>>>
>>>
>>>
>>> -------------------------------------------------------------------------
>>> | Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
>>> | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
>>> | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
>>> | Kernel Developer |                          | pgoyette at netbsd.org  |
>>> -------------------------------------------------------------------------
>>>
>>
>
>
> Chavdar Ivanov
>
>
> --
> ----
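
As a side check on the raid0 disklabel quoted above, here is a small sketch
that verifies the data partitions are contiguous and exactly fill the device.
The sizes and offsets are copied from the label and the total from the
"Total sectors" boot message; the function name and structure are only
illustrative. It says nothing about the cause of the crash, only that the
label itself is internally consistent.

```python
TOTAL_SECTORS = 488395008  # raid0 "Total sectors" from the boot messages

# (name, size, offset) in sectors, from the raid0 disklabel.
# c and d are omitted because they cover the whole device.
partitions = [
    ("a", 41943040, 0),
    ("b", 62914560, 41943040),
    ("e", 125829120, 104857600),
    ("f", 257708288, 230686720),
]


def layout_problems(parts, total):
    """Return a list of human-readable problems; empty if the layout is clean."""
    problems = []
    expected = 0  # first sector not yet covered by any partition
    for name, size, offset in sorted(parts, key=lambda p: p[2]):
        if offset < expected:
            problems.append(f"{name} overlaps the previous partition")
        elif offset > expected:
            problems.append(f"gap of {offset - expected} sectors before {name}")
        expected = max(expected, offset + size)
    if expected > total:
        problems.append(f"partitions extend {expected - total} sectors past the end")
    elif expected < total:
        problems.append(f"{total - expected} sectors unused at the end")
    return problems


print(layout_problems(partitions, TOTAL_SECTORS))  # prints []
```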

