Port-xen archive


Re: guests not starting properly on 7/amd64 dom0 freshly with xen45



On 29/12/2015 20:53, S.P.Zeidler wrote:
> I get in /var/log/messages:
> Dec 29 18:39:10 pkgbuild-DOM0 /netbsd: xvif1i0: Ethernet address
> 00:16:3e:31:30:61
> Dec 29 18:39:11 pkgbuild-DOM0 /netbsd: xbd backend domain 1 handle 0xca00
> (51712) using event channel 19, protocol x86_64-abi
> 
> with the config being:
> kernel = "/home/sets/amd64-nb7/netbsd-XEN3_DOMU.gz"
> memory = 8192
> name = "amd64-nb7"
> vcpus = 3
> cpus = [ "13", "14", "15" ]
> vif = [ 'mac=00:16:3e:30:30:61, bridge=bridge0' ]
> disk = [ '/dev/sd0k,raw,xvda,rw' ]
> 
> May I consider the log messages as indication that both xennet0 and xbd0
> in fact do get configured?

At least for the network, yes: xennet(4) does not log the transfer mode
(in your case, "RX copy") unless it initialized properly.

The surest way to verify this is to look in xenstore and check the
"state" of each device in use. Taking DOMID as "1", from dom0:

# xenstore-ls /local/domain/1/device
...

and look for the "state" entries; each of them should be "4".

FWIW, the possible values are:

enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,

    /*
     * InitWait: Finished early initialisation but waiting for information
     * from the peer or hotplug scripts.
     */
    XenbusStateInitWait      = 2,

    /*
     * Initialised: Waiting for a connection from the peer.
     */
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,

    /*
     * Closing: The device is being closed due to an error or an unplug
     * event.
     */
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,

    /*
     * Reconfiguring: The device is being reconfigured.
     */
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8
};
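
As a convenience when reading xenstore dumps, the enum above can be
mirrored in a small lookup. This is only a sketch: the Python names
XenbusState and describe_state are made up for illustration, and only the
numeric values come from the enum quoted above.

```python
from enum import IntEnum


class XenbusState(IntEnum):
    """Numeric values copied from the xenbus_state enum quoted above."""
    Unknown = 0
    Initialising = 1
    InitWait = 2
    Initialised = 3
    Connected = 4
    Closing = 5
    Closed = 6
    Reconfiguring = 7
    Reconfigured = 8


def describe_state(raw):
    """Turn the string stored under a "state" key into a readable name.

    describe_state is a hypothetical helper, not part of any Xen tool.
    """
    try:
        return XenbusState(int(raw)).name
    except ValueError:
        # Not a number, or a number outside the enum.
        return "INVALID(%s)" % raw


for value in ("2", "4", "6"):
    print(value, "=", describe_state(value))
```

A device stuck in "InitWait" or "Initialised" is still waiting for its
peer, which is the situation to look for here.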

Ideally, you should also check that the backend counterpart of each
device is in state "4" (the "backend" entry of each domU device gives
its xenstore path).
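
The check described above (frontend state, then the backend path to
inspect next) could be scripted roughly as follows. This is a sketch under
assumptions: SAMPLE is invented text in the general shape xenstore-ls
prints (one space of indentation per level, key = "value"), not output
from the system discussed here, and parse_xenstore_ls and device_report
are hypothetical helper names.

```python
import re

# One entry per line: leading spaces give the depth, then key = "value".
LINE = re.compile(r'^(?P<indent> *)(?P<key>[^ =]+) = "(?P<value>.*)"')


def parse_xenstore_ls(text):
    """Return {relative/path: value} from indented xenstore-ls style text.

    Assumes one space of indentation per tree level.
    """
    entries = {}
    stack = []  # keys of the current ancestors, one per indent level
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        depth = len(m.group("indent"))
        del stack[depth:]  # drop ancestors deeper than this line
        stack.append(m.group("key"))
        entries["/".join(stack)] = m.group("value")
    return entries


def device_report(entries):
    """Map each device (e.g. 'vbd/51712') to (state, backend path)."""
    report = {}
    for path, value in entries.items():
        if path.endswith("/state"):
            dev = path[:-len("/state")]
            report[dev] = (value, entries.get(dev + "/backend"))
    return report


# Invented sample: a block device that never reached state 4.
SAMPLE = '''\
vbd = ""
 51712 = ""
  backend = "/local/domain/0/backend/vbd/1/51712"
  state = "3"
vif = ""
 0 = ""
  backend = "/local/domain/0/backend/vif/1/0"
  state = "4"
'''

for dev, (state, backend) in sorted(device_report(parse_xenstore_ls(SAMPLE)).items()):
    status = "connected" if state == "4" else "NOT connected"
    print(dev, "state", state, status, "backend:", backend)
```

On a real dom0 the text would come from the xenstore-ls command shown
earlier, and the same parsing can then be applied to each backend path.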

> 
> Interfaces:
> bridge0: flags=41<UP,RUNNING> mtu 1500
> xvif1i0: flags=8822<BROADCAST,NOTRAILERS,SIMPLEX,MULTICAST> mtu 1500
>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
> 		enabled=0
> 		address: 00:16:3e:31:30:61

This looks normal.

> disk:
>  k: 188743680 1262524928     4.2BSD      0     0     0  # (Cyl.  43837*-50391*)
> 
> (and no, it is not mounted by anything else)

Does it have a vnd(4) attached to it in dom0?

> 
> xl dmesg only says:
> (XEN) d1 attempted to change d1v1's CR4 flags 00002660 -> 00000620
> (XEN) d1 attempted to change d1v2's CR4 flags 00002660 -> 00000620

Should be harmless for your case.

> 
> and booting xen-debug has not increased verbosity.

I doubt that the hypervisor is at fault here. Like John suggested, I also
expect a backend/frontend connection problem.

> 
> starting the domU with 1 vcpu and 1024 MB ram to make some lists shorter:
> db{0}> ps
> PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
> 1        1 3   0         0   ffffa000021900c0               init lbolt
> 0       30 3   0       200   ffffa00002190900        xen_balloon xen_balloon
> 0       29 3   0       200   ffffa000021904e0          cryptoret crypto_w
> 0       28 3   0       200   ffffa00002191920              unpgc unpgc
> 0       27 3   0       200   ffffa00002191500        vmem_rehash vmem_rehash
> 0       26 3   0       200   ffffa00002192940             xenbus rdst
> 0       25 3   0       200   ffffa000021910e0           xenwatch evtsq
> 0       16 5   0       200   ffffa00002180080           (zombie)
> 0       15 3   0       200   ffffa000021804a0         pmfsuspend pmfsuspend
> 0       14 3   0       200   ffffa000021808c0           pmfevent pmfevent
> 0       13 3   0       200   ffffa00001e25060         sopendfree sopendfr
> 0       12 3   0       200   ffffa00001e25480           nfssilly nfssilly
> 0       11 3   0       200   ffffa00001e258a0            cachegc cachegc
> 0       10 3   0       200   ffffa00001e23040              vrele vrele
> 0        9 3   0       200   ffffa00001e23460             vdrain vdrain
> 0        8 3   0       200   ffffa00001e23880          modunload mod_unld
> 0        7 3   0       200   ffffa00001e19020            xcall/0 xcall
> 0        6 1   0       200   ffffa00001e19440          softser/0
> 0        5 1   0       200   ffffa00001e19860          softclk/0
> 0        4 1   0       200   ffffa00001e16000          softbio/0
> 0        3 1   0       200   ffffa00001e16420          softnet/0
> 0    >   2 7   0       201   ffffa00001e16840             idle/0
> 0        1 3   0       200   ffffffff8060b220            swapper 
> ce: pid 1 lid 1 at 0xffffa0002e476da0
> sleepq_block() at netbsd:sleepq_block+0xa2
> cv_wait() at netbsd:cv_wait+0x9a
> start_init() at netbsd:start_init+0x70
> db{0}> bt/a ffffffff8060b220
> trace: pid 0 lid 1 at 0xffffffff80b0de58
> sleepq_block() at netbsd:sleepq_block+0xa2
> cv_wait() at netbsd:cv_wait+0x9a
> config_finalize() at netbsd:config_finalize+0x30
> main() at netbsd:main+0x406
> db{0}> x/x config_pending
> netbsd:config_pending:  1
> 
> but there's no config thread still running.
> 
> Trying a DEBUG_AUTOCONF kernel next (but since I currently start work
> unholy early, this may not get tested before tomorrow).

Given that the hang occurs after the balloon driver got initialized, I
think the issue lies with the block device. There is no "xbd0: ..." line
giving the geometry of your block device, so I think it failed to
initialize.

Usually, when everything goes well, the domU prints something like:

xbd0: *** MB, *** bytes/sect, *** sectors

I suppose it "hangs" as it cannot mount / and execute init from it.

Various possibilities:
- vnd(4) not being able to mount the block device within dom0;
- failure to connect the event channel between xbdback(4) and xbd(4)
(xenstore daemon failure, can happen when mixing xentools of different
revisions);
- invalid label (start and size out of bounds).
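
For the last possibility, the bounds check is simple arithmetic on the
disklabel line quoted earlier (size 188743680, offset 1262524928, both in
sectors). A minimal sketch: partition_in_bounds and the total_sectors
parameter are hypothetical, and the real total has to be read from the
disklabel of sd0, which does not appear in this thread.

```python
def partition_in_bounds(offset, size, total_sectors):
    """True if [offset, offset + size) fits inside a disk of total_sectors.

    total_sectors is an assumption supplied by the caller; it is not
    taken from this thread.
    """
    return offset >= 0 and size > 0 and offset + size <= total_sectors


# Values from the "k:" line quoted above (in sectors).
size = 188743680
offset = 1262524928

# The disk must have at least this many sectors for the label to be valid.
required_sectors = offset + size
print("partition k ends at sector", required_sectors)
```

If the "total sectors" value reported for sd0 is smaller than that end
sector, the label is out of bounds.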

The easiest way to debug this is to insert a couple of echo statements in
/usr/pkg/etc/xen/scripts/block and see which operation is failing.

Regards

-- 
Jean-Yves Migeon

