Port-alpha archive


Making port-alpha usable again



In preparation for NetBSD 8 I'm giving the Alpha port a try again. So far I'm seeing a number of little issues and one big issue. All of the below was observed while testing on two AlphaServer DS20L machines, although I also saw the same disk corruption issue on an AlphaServer DS25 a while back.

One, the installer doesn't work on an empty disk. Running disklabel manually to create a disklabel and write it to disk makes the installer capable of using the disk, newfs'ing it and running installboot against it.

dhcpcd on fxp1 (which corresponds to eia0, not eia1, for some reason - I feel like there was an explanation given at some point that I've forgotten) gives a continuous loop of:

hera# dhcpcd fxp1
dhcpcd[2596]: version 6.7.1 starting
dhcpcd[2596]: DUID 00:01:00:01:20:f0:69:c5:00:02:56:00:0e:56
dhcpcd[2596]: fxp1: IAID 56:00:0e:56
dhcpcd[2596]: fxp1: soliciting an IPv6 router
dhcpcd[2596]: fxp1: soliciting a DHCP lease
dhcpcd[2596]: fxp1: offered 10.0.100.106 from 10.0.100.1
dhcpcd[2596]: fxp1: Router Advertisement from fe80::ec4:7aff:feb5:e10b
dhcpcd[2596]: fxp1: adding address 2001:470:a068:1:3132:7c74:e953:c499/64
dhcpcd[2596]: fxp1: adding route to 2001:470:a068:1::/64
dhcpcd[2596]: fxp1: adding default route via fe80::ec4:7aff:feb5:e10b
dhcpcd[2596]: fxp1: carrier lost
dhcpcd[2596]: fxp1: deleting address 2001:470:a068:1:3132:7c74:e953:c499/64
dhcpcd[2596]: fxp1: deleting default route via fe80::ec4:7aff:feb5:e10b
dhcpcd[2596]: fxp1: deleting route to 2001:470:a068:1::/64
dhcpcd[2596]: fxp1: carrier acquired

Note that this happens with the motherboard ethernet, but not with an Intel gigabit PCI card, and the port can do 100 Mbps with no problems if the port is configured manually.

A multiprocessor kernel still hangs (hard - I've never gotten into the debugger when it happens) after a day or two of compiling. This may just be an impression, but it appears to happen less often than with NetBSD from a year ago. The fact that it happens so infrequently makes it difficult to narrow down which activity causes it.

We have I2C busses which show 8 and 14 devices, but the device nodes for /dev/iic0 and iic1 needed to be made manually. I also had to uncomment the environment sensors in the kernel config. Is there a reason they're not in the kernel by default?

The times listed for some things in top are screwy:

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
11041 root      79    0    12M 5904K select/1   2:21 49.37% 49.37% cvs
    0 root     125    0     0K   11M vdrain/1    ??? 35.30% 35.30% [system]
  134 root      79    0    12M 5896K select/1   1:44 33.50% 33.50% cvs
 1841 john      85    0  3832K 2088K wait/0      ???  0.00%  0.00% sh
    1 root      85    0  3576K 1536K wait/1      ???  0.00%  0.00% init
 2483 root      84    0  3832K 1432K wait/1      ???  0.00%  0.00% sh
  637 john      85    0    15M 5328K select/0   0:01  0.00%  0.00% sshd

Most critically, when the filesystem has been in use for a while, it gets nuked. Sometimes the disklabel goes away, sometimes the filesystem suffers irreparable corruption, and occasionally it's OK (or seems OK) after an fsck.

This happens with both IDE (aceride0) and SCSI (esiop0). It happens when the kernel is booted off a small, read-only ffs filesystem at the beginning of the disk and the root filesystem is the rest of the disk. It happens with and without WAPBL (log). It seems to happen sooner with log, but that's just an impression.
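
One way to narrow this down might be a small write-and-verify test: fill a file with a deterministic pattern, then re-check it later to see which blocks changed. The following is only a minimal sketch, assuming Python is available on the machine. The function names, the seed and the 8192-byte block size are all made up for illustration and have nothing to do with the kernel's actual I/O sizes.

```python
import hashlib
import os


def pattern_block(seed: int, index: int, size: int) -> bytes:
    """Return a deterministic block derived from (seed, index)."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{index}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def write_pattern(path: str, seed: int, blocks: int, size: int = 8192) -> None:
    """Write `blocks` deterministic blocks to `path` and fsync the file."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(seed, i, size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path: str, seed: int, blocks: int, size: int = 8192) -> list:
    """Return the indices of blocks that no longer match the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(size) != pattern_block(seed, i, size):
                bad.append(i)
    return bad
```

Something like write_pattern("/var/tmp/pattern.bin", 1, 100000) followed by periodic verify_pattern() calls with the same arguments would at least show whether damage lands on particular block offsets. A verify run straight after the write will probably be served from the buffer cache, so re-checking after a reboot or remount is more likely to reflect what is really on disk.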

If anyone has any thoughts about how best to diagnose the disk corruption issues, please do let me know. I'd love to see NetBSD 8 running properly on Alpha :)

John

