Subject: Re: current kernel on amd64 crashes
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Christoph Egger <Christoph_Egger@gmx.de>
List: port-xen
Date: 01/14/2008 11:21:45
On Thursday 10 January 2008 14:55:52 Manuel Bouyer wrote:
> On Thu, Jan 10, 2008 at 02:43:02PM +0100, Christoph Egger wrote:
> > > What is the "PIT timer" ?
> >
> > the i8254 thingy.
> >
> > I get a series of endless xen messages:
> >
> > (XEN) i8254.c:534:d1 PIT bad access
> > (XEN) i8254.c:534:d1 PIT bad access
> > (XEN) i8254.c:534:d1 PIT bad access
> > (XEN) i8254.c:534:d1 PIT bad access
> > (XEN) i8254.c:534:d1 PIT bad access
> >
> > The message comes from Xen in xen/arch/x86/hvm/i8254.c,
>
> Ha OK, it's on a HVM guest.
> I guess the relevant NetBSD code is x86/isa/clock.c
>
> Could it be this change ?
> -       low = inb(IO_TIMER1 + TIMER_CNTR0);
> -       high = inb(IO_TIMER1 + TIMER_CNTR0);
> -       count = rtclock_tval - ((high << 8) | low);
> -
> +       /* insb to make the read atomic */
> +       insb(IO_TIMER1+TIMER_CNTR0, &rdval, 2);
> +       count = rtclock_tval - rdval;

Yes, the "rep insb" was it. It happened to do a cross-page access, and
Xen could not handle cross-page accesses for various emulated devices
(PIT, RTC, VGA and more).
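
To illustrate what "cross-page" means here, a minimal sketch, assuming
4 KiB pages and assuming that it is the memory buffer of the string I/O
instruction that straddles the boundary. The helper name crosses_page is
made up for this example; it is not a function in Xen or NetBSD.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as normally used on amd64. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: returns 1 if the byte range [addr, addr + len)
 * touches more than one page, 0 otherwise.
 */
static int
crosses_page(uint64_t addr, uint64_t len)
{
	if (len == 0)
		return 0;
	/* Compare the page number of the first and the last byte. */
	return (addr / SKETCH_PAGE_SIZE) !=
	    ((addr + len - 1) / SKETCH_PAGE_SIZE);
}
```

Under these assumptions, a 2-byte buffer at page offset 0xfff crosses a
page, and so does a 4-byte store at 0xffff8000088c2ffe (the %rdi value in
the DDB register dump further down), since it starts two bytes before a
page boundary. Whether that is related to the fault below is only a guess.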

Xen 3.2.0-rc6 can now handle that. However, I am still not able to run
an installation in an HVM guest (amd64cd.iso from releng 20080107).
NetBSD crashes when it starts to untar the first set, right after
selecting the "progress bar". Something is going wrong with the softclk
kernel thread. This is not necessarily a NetBSD bug; it may be that Xen
handles certain instructions incorrectly.


This is what I get (unfortunately, install kernels don't have a symbol table):

    Status: Running
  Command: progress -zf /mnt2//amd64/binary/sets/kern-GENERIC.tgz 
tar --chroot -xhepf -

-------------------------------------------------------------------------------------
  0% |                                      |    0     0.00 KiB/s   --:-- ETA
kernel: page fault trap, code=0
Stopped in pid 0.5 (system) at 0xffffffff804acd8e: repe insl %dx,%es:(%rdi)
db{0}> show reg
ds   0x10
es  0x3520
fs  0
gs  0x7000
rdi 0xffff8000088c2ffe
rsi 0x170
rbp 0xffff80000a359db8
rbx 0xffff8000005318a8
rdx 0x170
rcx 0xffffffffffffffff
rax 0x1
r8  0
r9  0
r10 0xffff800000533000
r11 0x8
r12 0x2
r13 0xffff8000088c2ffe
r14 0xffff800000533520
r15 0x2
rip  0xffffffff804acd8e
cs  0x8
rflags 0x10246
rsp  0xffff80000a359d90
ss  0x10
0xffffffff804acd8e: repe insl %dx,%es:(%rdi)
db{0}> bt
?() at 0xffffffff804acd8e
?() at 0xffffffff804ba957
?() at 0xffffffff802bc1e7
?() at 0xffffffff8049f43f
?() at 0xffffffff80107e1f
[...]
db{0}> ps /l
PID    LID S    FLAGS STRUCT LWP *     NAME        WAIT
1278     1 3       84 ffff80000ad30ba0 tar         pipe
1277     1 7 20000004 ffff80000abf2080 gzip
1274     1 2        4 ffff80000abf2340 progress
1261     1 2        4 ffff80000abf28c0 sysinst
1254     1 3       84 ffff80000abe2060 sh          wait
10       1 3       84 ffff80000abf2b80 mount_mfs   mfsidl
1        1 3       84 ffff80000995330  init        wait
>0      20 3      204 ffff80000abf2600 physiod     physiod
        19 3      204 ffff80000abe2320 vmem_rehash vmem_rehash
        18 2      204 ffff80000abe25e0 aiodoned
        17 3      204 ffff80000abe28a0 ioflush     syncer
        16 3      204 ffff80000abe2b60 pgdaemon    pgdaemon
        15 3      204 ffff800009953040 atapibus0   sccomp
        14 3      204 ffff8000099535c0 cryptoret   crypto_wait
        13 3      204 ffff800009953880 atabus1     atath
        12 3      204 ffff800009953b40 atabus0     atath
        11 3      204 ffff80000994b020 pms0        pmsreset
        10 3      204 ffff80000994b2e0 sysmon      smtaskq
         9 3      204 ffff80000994b5a0 pmfevent    pmfevent
         8 3      204 ffff80000994b860 vrele       vrele
         7 3 80000204 ffff80000994bb20 xcall/0     xcall
         6 1 80000204 ffff800009949000 softser/0
      >  5 7 a0000204 ffff8000099492c0 softclk/0
         4 1 80000204 ffff800009949580 softbio/0
         3 1 80000204 ffff800009949840 softnet/0
         2 1 80000205 ffff800009949b00 idle/0
         1 3      204 ffffffff80b207a0 swapper     schedule
db{0}>