NetBSD-Users archive


Re: NVMM not working, NetBSD 9x amd64



On Wed, 15 Jul 2020 at 11:08, Chavdar Ivanov <ci4ic4%gmail.com@localhost> wrote:
>
> Hi,
>
> I decided to reuse this thread; nvmm stopped working again as of yesterday.
>
> On
>
> # uname -a
> NetBSD ymir 9.99.69 NetBSD 9.99.69 (GENERIC) #15: Tue Jul 14 11:07:52
> BST 2020  sysbuild@ymir:/home/sysbuild/amd64/obj/home/sysbuild/src/sys
> /arch/amd64/compile/GENERIC amd64
>
> I can 'modload nvmm', but when I try to start a vm with nvmm
> acceleration, I get a hard lock, immediately after the message about
> the interface being initialized. I cannot break into the debugger to
> trace and I don't get a dump on reboot. It appears the machine is in a
> deep CPU loop, although it doesn't appear too hot.
>
> I then tried booting onetbsd, which is from the 12th of July and on
> which nvmm used to work just fine. It is also the same micro version -
> 9.99.59, so in theory it should work - but in this case I get a panic
> when I 'modload nvmm'. Again, I see the short panic message on the
> screen and the machine apparently gets into another loop, from which I
> cannot break into the debugger the usual way; the only thing I can do
> is hit the power button. There weren't that many kernel changes in this
> period, most notably the per-CPU IDT patch, but I don't know if it is
> relevant.
>

I rebuilt my system again today; this time I managed to get a core
dump after the panic:

 crash -M netbsd.22.core -N netbsd.22
Crash version 9.99.69, image version 9.99.69.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NARCNET() at 0
?() at ffffa0819ba16000
sys_reboot() at sys_reboot
vpanic() at vpanic+0x15b
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x19
kqueue_register() at kqueue_register+0x43e
kevent1() at kevent1+0x138
sys___kevent50() at sys___kevent50+0x33
syscall() at syscall+0x26e
--- syscall (number 435) ---
syscall+0x26e:

Any ideas?

The dmesg shows, BTW:

Jul 15 14:09:33 ymir /netbsd: [ 108.7517032] nvmm0: attached, using
backend x86-vmx
Jul 15 14:11:40 ymir syslogd[946]: restart
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] fatal protection fault in
supervisor mode
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] trap type 4 code 0x323
rip 0xffffffff80c89e21 cs 0x8 rflags 0x10282 cr2 0x784321f9f000 ilevel
0
 rsp 0xffffa0819ba1ac50
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] curlwp 0xffffd066fc45b100
pid 2869.2869 lowest kstack 0xffffa0819ba162c0
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] panic: trap
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] cpu0: Begin traceback...
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] vpanic() at netbsd:vpanic+0x152
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] snprintf() at netbsd:snprintf
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] startlwp() at netbsd:startlwp
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] alltraps() at netbsd:alltraps+0xc3
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] kqueue_register() at
netbsd:kqueue_register+0x43e
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] kevent1() at netbsd:kevent1+0x138
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] sys___kevent50() at
netbsd:sys___kevent50+0x33
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] syscall() at netbsd:syscall+0x26e
Jul 15 14:11:40 ymir /netbsd: [ 131.2116186] --- syscall (number 435) ---
Jul 15 14:11:40 ymir /netbsd: [ 131.2216185] netbsd:syscall+0x26e:
Jul 15 14:11:40 ymir /netbsd: [ 131.2216185] cpu0: End traceback...
Jul 15 14:11:40 ymir /netbsd:
Jul 15 14:11:40 ymir /netbsd: [ 131.2216185] dumping to dev 168,2
(offset=8, size=5225879):
Jul 15 14:11:40 ymir /netbsd: [ 131.2216185] dump <5>ktrace timeout
Jul 15 14:11:40 ymir /netbsd: ktrace timeout
Jul 15 14:11:40 ymir /netbsd: [ 131.2216185] ktrace timeout
Jul 15 14:11:40 ymir syslogd[946]: last message repeated 2 times
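
For anyone who wants to pull the register values out of a trap report like
the one above (for example, to look up the faulting rip in the kernel's
symbol table), here is a minimal sketch. The function name and the set of
fields it extracts are my own assumptions for illustration; this is not part
of any NetBSD tool. It collapses whitespace first, because the mail client
has wrapped the log lines.

```python
import re

# Hypothetical helper, not part of any NetBSD tool: it extracts selected
# register values from an amd64 trap report such as the one quoted above.
TRAP_FIELD = re.compile(r"\b(rip|cs|rflags|cr2|rsp)\s+(0x[0-9a-fA-F]+)")


def parse_trap_registers(text):
    """Return {register_name: int} for each 'name 0x...' pair found in text."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    return {name: int(value, 16) for name, value in TRAP_FIELD.findall(flat)}


# The trap line from the dmesg above, with syslog prefixes removed.
sample = """trap type 4 code 0x323
rip 0xffffffff80c89e21 cs 0x8 rflags 0x10282 cr2 0x784321f9f000 ilevel
0
 rsp 0xffffa0819ba1ac50"""

regs = parse_trap_registers(sample)
print(hex(regs["rip"]))
```

The "code" and "ilevel" fields are deliberately not matched; extending the
pattern to cover them would be straightforward.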

> Chavdar
>
> On Wed, 20 May 2020 at 22:09, Maxime Villard <max%m00nbsd.net@localhost> wrote:
> >
> > On 09/05/2020 at 10:54, Maxime Villard wrote:
> > > On 01/05/2020 at 19:13, Chavdar Ivanov wrote:
> > >> On Fri, 1 May 2020 at 13:59, Rhialto <rhialto%falu.nl@localhost> wrote:
> > >>>
> > >>> On Sun 26 Apr 2020 at 21:39:12 +0200, Maxime Villard wrote:
> > >>>> Maybe I should add a note in the man page to say that you cannot expect a CPU
> > >>>> from before ~2010 to have virtualization support.
> > >>>
> > >>> Or even better, what one should look for in the output of, for example,
> > >>> "cpuctl identify 0". Since I didn't exactly know, I made some guesses
> > >>> and assumed that my cpu ("Intel(R) Core(TM) i3-2120 CPU @ 3.30GHz")
> > >>> didn't have the required features (it is from 2009 or so).  But this
> > >>> thread inspired me to modload nvmm, which actually helped, so I found
> > >>> out that it even works on this cpu.
> > >
> > > On Intel CPUs the information is hidden in privileged registers that cpuctl
> > > cannot access, so no, it won't be possible.
> > >
> > > However the day before I had added clear warnings:
> > >
> > >      https://mail-index.netbsd.org/source-changes/2020/04/30/msg116878.html
> > >
> > > So now it will tell you what's missing.
> > >
> > >>> Of course I immediately tried it with Haiku (the BeOS clone) from
> > >>> https://download.haiku-os.org/nightly-images/x86_64/ and I got mixed
> > >>> results. Once it manages to boot it works fine and nicely fast (much
> > >>> better than without nvmm), but quite often it crashes into its kernel
> > >>> debugger during the first 10 seconds of booting, with different messages
> > >>> (I have seen "General Protection Exception" and "ASSERT failed ...
> > >>> fCPUCount >= 0").  ("qemu-system-x86_64 -accel nvmm -m 2G -cdrom
> > >>> haiku-master-hrev54106-x86_64-anyboot.iso" on a 9.0 GENERIC kernel)
> > >
> > > This was a missing filtering in the CPU identification, on CPUs that have SMT,
> > > leading Haiku to believe it had SMT threads that it didn't.
> > >
> > >      https://mail-index.netbsd.org/source-changes/2020/05/09/msg117188.html
> > >
> > > As far as I can tell, your CPU has SMT.
> > >
> > >> I've never used Haiku so far; upon reading this I decided to try it on
> > >> my NetBSD-current laptop with nvmm.
> > >>
> > >> So far, with several attempts, it works with no problem whatsoever,
> > >> directly booting the newest image on the site pointed above.
> > >>
> > >> Another OS to play with...
> > >>
> > >> The host cpu is Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz, id 0x306a9.
> > >
> > > This CPU too has SMT.
> > >
> > > On 01/05/2020 at 20:10, Rhialto wrote:
> > >> There might well be an improvement between 9.0 and -current, of course.
> > >> It's good to hear that it works for you; I might upgrade to a -current
> > >> kernel.
> > >
> > > Overall, no, each improvement in -current is propagated to 9, so you should
> > > get the same results on both (modulo kernel bugs added in places not
> > > related to NVMM).
> > >
> > > On 01/05/2020 at 20:52, Chavdar Ivanov wrote:
> > >> Earlier I had similar issues with OmniOS under qemu-nvmm - sometimes
> > >> it worked without a problem, sometimes I couldn't even boot. I still
> > >> have no idea why.
> > >
> > > Maybe that's the same problem, I'll test.
> >
> > I tested the other day, and I saw no problem. With debugging I noticed that
> > OmniOS, too, uses the CPU information that used to be mis-reported by NVMM,
> > so probably my fix must have helped.
> >
> > Please confirm the issues are fixed (HaikuOS+OmniOS).
>