Port-zaurus archive


NetBSD/zaurus 8.1 problems and possible fixes



Hi,

I've been trying to install NetBSD/zaurus 8.1 on my SL-C1000 since
last weekend, and noticed that there are multiple issues with it.

I've managed to solve most of them and even have a working
disk image, but some of them have not been investigated yet.
I would like to ask the ARM folks what is going on.

There are the following problems on NetBSD/zaurus 8.1:

1. Resolved issues

(1) Incorrect entry point in kernel LINKFLAGS

On netbsd-7, src/sys/arch/zaurus/conf/Makefile.zaurus.inc
explicitly sets LINKFLAGS:
> LINKFLAGS=		-T ldscript

In this case, the entry address is specified in
src/sys/arch/zaurus/conf/ldscript.zaurus:
> ENTRY(KERNEL_BASE_phys)

However, between netbsd-7 and netbsd-8,
src/sys/arch/zaurus/conf/Makefile.zaurus.inc was changed to use
the default LINKFLAGS, defined in the MI src/sys/conf/Makefile.kern.inc:
 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/zaurus/conf/Makefile.zaurus.inc#rev1.8
 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/zaurus/conf/Makefile.zaurus.inc#rev1.7
 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/zaurus/conf/Makefile.zaurus.inc#rev1.6

In this case, the default ENTRYPOINT "start" is used and
"-e start" is specified in LINKFLAGS via LINKENTRY.

As a result, "ENTRY(KERNEL_BASE_phys)" in ldscript is overridden,
so the bootloader (zbsdmod.o) jumps to the wrong address.
On my SL-C1000 this triggers power-off.

Adding explicit (empty) LINKENTRY and TEXTADDR definitions to
Makefile.zaurus.inc solves this issue:

---
Index: conf/Makefile.zaurus.inc
===================================================================
RCS file: /cvsroot/src/sys/arch/zaurus/conf/Makefile.zaurus.inc,v
retrieving revision 1.9
diff -u -p -d -r1.9 Makefile.zaurus.inc
--- conf/Makefile.zaurus.inc	25 Aug 2015 02:38:15 -0000	1.9
+++ conf/Makefile.zaurus.inc	22 Oct 2019 04:49:32 -0000
@@ -20,6 +20,8 @@ SYSTEM_LD_TAIL_EXTRA+=; \
 KERNEL_BASE_VIRT=	$(LOADADDRESS)
 
 KERNLDSCRIPT=		ldscript
+TEXTADDR=		# defined in ldscript
+LINKENTRY=		# defined in ldscript
 
 EXTRA_CLEAN+=		netbsd.map assym.d ldscript tmp
 
---
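
To check which entry address actually ended up in a linked kernel,
the ELF header can be inspected directly.  The following is only a
minimal sketch, assuming the kernel file is a plain 32-bit ELF
binary; the function names elf32_entry() and kernel_entry() are made
up for this illustration and are not part of any NetBSD tool.

```python
import struct


def elf32_entry(header: bytes) -> int:
    """Return e_entry from the first 52 bytes of a 32-bit ELF file."""
    if len(header) < 52 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 1:  # EI_CLASS: 1 means ELFCLASS32
        raise ValueError("not a 32-bit ELF file")
    # EI_DATA: 1 means little endian, 2 means big endian
    endian = "<" if header[5] == 1 else ">"
    # After the 16-byte e_ident come e_type (2 bytes), e_machine (2),
    # e_version (4) and e_entry (4).
    _type, _machine, _version, entry = struct.unpack_from(
        endian + "HHII", header, 16)
    return entry


def kernel_entry(path: str) -> int:
    """Read the entry address of the ELF file at 'path' (hypothetical helper)."""
    with open(path, "rb") as f:
        return elf32_entry(f.read(52))
```

For example, print(hex(kernel_entry("netbsd"))) on kernels built
with and without the change should show whether the address from
ldscript or the address of "start" was recorded.  "readelf -h"
reports the same field as "Entry point address".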

(2) Unexpected zbsdmod.o (Linux kmod) behavior

zbsdmod.o (Linux kernel module to load and exec a NetBSD/zaurus kernel)
binaries from 6.1.5 and 7.2 can load and exec NetBSD/zaurus kernels
(even 8.1 ones with the above fix), but zbsdmod.o from 8.1 can't load
them correctly.  On my SL-C1000 this also triggers power-off.

There are no particular changes in it between netbsd-7 and netbsd-8.
Furthermore, building zbsdmod.c from the netbsd-8 tree with the
netbsd-7 toolchain produces a working zbsdmod.o.
So the problem looks to be caused by gcc changes (4.8.5 vs 5.5.0).

The only visible difference is the cacheline alignment of the asm code
that flushes the I$ and jumps to the entry point of the loaded kernel:

asm source in src/sys/arch/zaurus/stand/zbsdmod/zbsdmod.c:
---
		/* Disable MMU and jump to kernel entry address */
		"mov	r0, %0;"
		"mcr	p15, 0, r1, c7, c7, 0;" /* flush I+D cache */
		"mrc	p15, 0, r1, c2, c0, 0;" /* CPWAIT */
		"mov	r1, r1;"
		"sub	pc, pc, #4;"
		"mov	r1, #(0x00000010 | 0x00000020);"
		"mcr	p15, 0, r1, c1, c0, 0;" /* Write new control register */
		"mcr	p15, 0, r1, c8, c7, 0;" /* invalidate I+D TLB */
		"mcr	p15, 0, r1, c7, c5, 0;" /* invalidate I$ and BTB */
		"mcr	p15, 0, r1, c7, c10, 4;" /*drain write and fill buffer*/
		"mrc	p15, 0, r1, c2, c0, 0;" /* CPWAIT_AND_RETURN */
		"sub	pc, r0, r1, lsr #32;"
		:: "r" (addr), "r" (datacacheclean) : "r0", "r1", "r2");
---

objdump -d zbsdmod.o from 7.2:
---
 4a8:	e1a0000c 	mov	r0, ip
 4ac:	ee071f17 	mcr	15, 0, r1, cr7, cr7, {0}
 4b0:	ee121f10 	mrc	15, 0, r1, cr2, cr0, {0}
 4b4:	e1a01001 	mov	r1, r1
 4b8:	e24ff004 	sub	pc, pc, #4
 4bc:	e3a01030 	mov	r1, #48	; 0x30
 4c0:	ee011f10 	mcr	15, 0, r1, cr1, cr0, {0}
 4c4:	ee081f17 	mcr	15, 0, r1, cr8, cr7, {0}
 4c8:	ee071f15 	mcr	15, 0, r1, cr7, cr5, {0}
 4cc:	ee071f9a 	mcr	15, 0, r1, cr7, cr10, {4}
 4d0:	ee121f10 	mrc	15, 0, r1, cr2, cr0, {0}
 4d4:	e040f021 	sub	pc, r0, r1, lsr #32
---

objdump -d zbsdmod.o from 8.1:
---
 534:	e1a00003 	mov	r0, r3
 538:	ee071f17 	mcr	15, 0, r1, cr7, cr7, {0}
 53c:	ee121f10 	mrc	15, 0, r1, cr2, cr0, {0}
 540:	e1a01001 	mov	r1, r1
 544:	e24ff004 	sub	pc, pc, #4
 548:	e3a01030 	mov	r1, #48	; 0x30
 54c:	ee011f10 	mcr	15, 0, r1, cr1, cr0, {0}
 550:	ee081f17 	mcr	15, 0, r1, cr8, cr7, {0}
 554:	ee071f15 	mcr	15, 0, r1, cr7, cr5, {0}
 558:	ee071f9a 	mcr	15, 0, r1, cr7, cr10, {4}
 55c:	ee121f10 	mrc	15, 0, r1, cr2, cr0, {0}
 560:	e040f021 	sub	pc, r0, r1, lsr #32
---
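
The alignment difference between the two listings can be checked
with a little arithmetic.  This is only a sketch, assuming a 32-byte
I$ line (the size that ".align 5" corresponds to) and assuming that
the module is loaded so that 32-byte alignment of these section
offsets is preserved; the helper names are made up for this
illustration.

```python
LINE_SIZE = 32  # assumed I$ line size in bytes


def cache_line(offset: int, line_size: int = LINE_SIZE) -> int:
    """Index of the cache line that contains the given offset."""
    return offset // line_size


def lines_touched(first: int, last: int, line_size: int = LINE_SIZE) -> int:
    """Number of cache lines spanned by instructions from first to last."""
    return cache_line(last, line_size) - cache_line(first, line_size) + 1


# Offsets taken from the objdump listings: the first one is the
# control register write (mcr ... cr1, cr0), the last one is the
# final "sub pc, r0, r1, lsr #32".
TAIL_7_2 = (0x4c0, 0x4d4)
TAIL_8_1 = (0x54c, 0x560)

print(lines_touched(*TAIL_7_2))  # 1
print(lines_touched(*TAIL_8_1))  # 2
```

Under these assumptions the 7.2 sequence starts exactly on a line
boundary and fits in one line, while in the 8.1 binary the final
jump at 0x560 falls into the next line.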

I'm not sure about the ARM and XScale restrictions,
but adding the following alignment adjustment, as done in
src/sys/arch/arm/xscale/pxa2x0_apm_asm.S, makes the
zbsdmod.o built for 8.1 work:

---
Index: stand/zbsdmod/zbsdmod.c
===================================================================
RCS file: /cvsroot/src/sys/arch/zaurus/stand/zbsdmod/zbsdmod.c,v
retrieving revision 1.9
diff -u -p -d -r1.9 zbsdmod.c
--- stand/zbsdmod/zbsdmod.c	2 Dec 2013 18:36:11 -0000	1.9
+++ stand/zbsdmod/zbsdmod.c	22 Oct 2019 04:49:32 -0000
@@ -284,6 +284,13 @@ elf32bsdboot(void)
 		"mov	r1, r1;"
 		"sub	pc, pc, #4;"
 		"mov	r1, #(0x00000010 | 0x00000020);"
+		/*
+		 * Put the rest of instructions into the same cacheline
+		 * to make sure no I$ refill after invalidation.
+		 */
+		"b	2f;"
+		".align 5;"
+		"2:"
 		"mcr	p15, 0, r1, c1, c0, 0;" /* Write new control register */
 		"mcr	p15, 0, r1, c8, c7, 0;" /* invalidate I+D TLB */
 		"mcr	p15, 0, r1, c7, c5, 0;" /* invalidate I$ and BTB */

---
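
The effect of the added "b 2f; .align 5; 2:" can be estimated in the
same way.  This is a sketch under stated assumptions: the code before
the branch keeps the 8.1 offsets (so the branch would land at the
hypothetical offset 0x54c), each instruction is 4 bytes, and the I$
line is 32 bytes.  The real offsets of a rebuilt zbsdmod.o may
differ.

```python
def align_up(offset: int, power: int) -> int:
    """Round offset up to the next multiple of 2**power, like '.align power'."""
    size = 1 << power
    return (offset + size - 1) & ~(size - 1)


INSN = 4          # size of one ARM instruction in bytes
LINE = 1 << 5     # 32 bytes, the boundary requested by ".align 5"

branch = 0x54c    # hypothetical offset of the inserted "b 2f"
label = align_up(branch + INSN, 5)   # where "2:" ends up

# Instructions after the label: mcr c1, mcr c8, mcr c7/c5,
# mcr c7/c10, mrc (CPWAIT) and the final "sub pc".
tail_bytes = 6 * INSN

print(hex(label))              # 0x560
print(tail_bytes <= LINE)      # True
```

Since the label is line-aligned and the remaining six instructions
take 24 bytes, they all sit in the same 32-byte line, which is what
the comment in the diff asks for.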

2. Unaddressed issues (with workarounds)

(1) Kernel fault 'Alignment Fault 3'

With the above two changes, the stock GENERIC kernel from the
NetBSD/zaurus 8.1 distribution is properly loaded and
shows screen console messages, but it hits the following
error right after mounting root:

---
 :

boot device: ld0
root on ld0a dumps on ld0b
root file system type: ffs
kern.module.path=/stand/zaurus/8.1/modules
Fatal kernel mode data abort: 'Alignment Fault 3'
trapframe 0xc4b95ef8
FSR=000000f3, FAR=e1a0c00d, spsr=a0000053
r0 =e1a0c00d, r1 =00000022, r2 =00000001, r3 =00000000
r4 =c0339c50, r5 =c118eb40, r6 =c05a3d10, r7 =c0339c50
r8 =00000000, r9 =c0510058, r10=c050b9f8, r11=c4b95fac
r12=c4b95f60, ssp=c4b95f48, slr=c0441ab0, pc =c03d0494

Stopped in pid 0.44 (system) at netbsd:mutex_tryentr+0x8:	ldr	r1, [r0]

db> bt
0xc4b95fac: netbsd:sched_sync+0xc
db> 
---
(manually typed from screen)

Note that this kernel fault does not occur when the kernel is
booted with a serial console (via "boot -1" at the zboot prompt).

After some trial and error, the following GENERIC config changes
make the GENERIC kernel boot up to multi-user:

---
Index: conf/GENERIC
===================================================================
RCS file: /cvsroot/src/sys/arch/zaurus/conf/GENERIC,v
retrieving revision 1.73.6.3
diff -u -p -d -r1.73.6.3 GENERIC
--- conf/GENERIC	18 Apr 2018 14:45:09 -0000	1.73.6.3
+++ conf/GENERIC	22 Oct 2019 04:49:32 -0000
@@ -30,6 +30,7 @@ maxusers	32			# estimated number of user
 options 	CPU_XSCALE_PXA250	# Support the XScale PXA25x core
 options 	CPU_XSCALE_PXA270	# Support the XScale PXA27x core
 makeoptions	CPUFLAGS="-mcpu=xscale"
+makeoptions	COPTS="-Os"
 
 # Architecture options
 options 	XSCALE_CACHE_READ_WRITE_ALLOCATE
@@ -60,7 +61,7 @@ file-system	MSDOSFS		# MS-DOS file syste
 file-system	KERNFS		# /kern
 file-system	NULLFS		# loopback file system
 #file-system	OVERLAY		# overlay file system
-file-system	PUFFS		# Userspace file systems (e.g. ntfs-3g & sshfs)
+#file-system	PUFFS		# Userspace file systems (e.g. ntfs-3g & sshfs)
 file-system	PROCFS		# /proc
 #file-system	UMAPFS		# NULLFS + uid and gid remapping
 #file-system	UNION		# union file system
@@ -164,14 +165,14 @@ options 	WSDISPLAY_COMPAT_RAWKBD		# can 
 
 # Development and Debugging options
 
-#options 	DIAGNOSTIC		# internal consistency checks
+options 	DIAGNOSTIC		# internal consistency checks
 #options 	DEBUG
 #options 	VERBOSE_INIT_ARM	# verbose bootstraping messages
 options 	DDB			# in-kernel debugger
 options 	DDB_HISTORY_SIZE=100	# Enable history editing in DDB
 #options 	KGDB
 #makeoptions 	DEBUG="-g"		# compile full symbol table
-makeoptions	COPY_SYMTAB=1
+#makeoptions	COPY_SYMTAB=1
 
 
 # Kernel root file system and dump configuration.
@@ -398,7 +399,7 @@ pseudo-device	pty			# pseudo-terminals
 pseudo-device	clockctl		# user control of clock subsystem
 pseudo-device	drvctl			# user control of drive subsystem
 pseudo-device	ksyms			# /dev/ksyms
-pseudo-device	putter			# for puffs and pud
+#pseudo-device	putter			# for puffs and pud
 
 # a pseudo device needed for Coda	# also needs CODA (above)
 #pseudo-device	vcoda			# coda minicache <-> venus comm.

---

I don't think any of these individual changes is the actual cause
of the fault, but I'm afraid there is something like an address
alignment issue or random memory corruption.

"r0 =e1a0c00d" is the argument of mutex_tryenter(), but the value
looks like the first instruction of mutex_tryenter().
I'm not even sure whether ddb shows correct information.
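
The observation about r0 can be checked by decoding the value as an
ARM instruction word.  The following is only a sketch that handles
the single "mov Rd, Rm" form needed here (it is not a general
disassembler), and it assumes that the low four bits of FSR are the
fault status; the function names are made up for this illustration.

```python
REG_NAMES = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
             "r8", "r9", "r10", "fp", "ip", "sp", "lr", "pc"]


def decode_mov_reg(word: int):
    """Decode an unconditional ARM 'mov Rd, Rm' without shift, else None."""
    cond = word >> 28
    # Bits 27..20 == 0x1a: data processing, register operand,
    # opcode MOV, S bit clear.
    is_mov = ((word >> 20) & 0xff) == 0x1a
    shift = (word >> 4) & 0xff   # shift amount and type, must be zero
    if cond != 0xe or not is_mov or shift != 0:
        return None
    rd = (word >> 12) & 0xf
    rm = word & 0xf
    return "mov %s, %s" % (REG_NAMES[rd], REG_NAMES[rm])


def is_word_aligned(addr: int) -> bool:
    """True if addr is a multiple of 4, as a word-sized ldr expects."""
    return (addr & 3) == 0


R0 = 0xe1a0c00d    # r0 (and FAR) from the trap output
FSR = 0x000000f3   # FSR from the trap output

print(decode_mov_reg(R0))    # mov ip, sp
print(is_word_aligned(R0))   # False
print(FSR & 0xf)             # 3
```

So the value in r0 decodes to "mov ip, sp", which is a typical first
instruction of a function prologue rather than a plausible pointer,
and using it as an address for "ldr r1, [r0]" is unaligned, which is
consistent with the reported 'Alignment Fault 3'.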

(2) Yet another kernel load failure

The modified GENERIC kernel above works, but when I removed
the "COPTS" line, "options DIAGNOSTIC", or the "COPY_SYMTAB" line,
each resulting kernel was not loaded (or executed) properly.

The behavior is similar to what is shown in the following movie,
reported by Sevan Janiyan on this list back in 2017:
 https://mail-index.netbsd.org/port-zaurus/2017/04/01/msg000065.html
 >> https://www.geeklan.co.uk/files/tmp/zaurus3.mov

I have no idea what happens in this case.

---

Anyway, I've updated my old liveimage script for NetBSD/zaurus 8.1
and put up a working disk image for 1GB SD cards, built from local
binaries with the above changes:
 http://teokurebsd.org/netbsd/liveimage/20191022-zaurus/
  liveimage-zaurus-20191022.img.gz	
 https://github.com/tsutsui/netbsd-teokureliveimage-zaurus

Instructions similar to those for the old 6.0_BETA image can be applied:
 https://mail-index.netbsd.org/port-zaurus/2012/02/04/msg000050.html

with the following minor differences:
 - no comp.tgz and xcomp.tgz (and games, man, modules, and tests) sets
   due to 1GB size restriction
 - "expand-image-fssize.sh" script can be used to expand FS size to
   the actual media size (can be executed at the single-user prompt)
 - untested on C7x0 and C860 models (yet)
 - dumb private binary packages are also available:
 http://teokurebsd.org/netbsd/packages/earmv4/8.0_2019Q1/All/
   (built for NetBSD/hpcarm 8.0 (earmv4) from pkgsrc-2019Q1)

I'll put more summarized information somewhere else
if I get extra motivation.

Have fun,
---
Izumi Tsutsui

