Subject: repne ins[bwl] vs rep ins[bwl]
To: None <port-i386@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-i386
Date: 01/22/2006 16:24:08
--vtzGhvizbBRQ85DL
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,
while working on Xen3 support, I've run into a problem I don't fully
understand. Basically, insw() doesn't read data while a C loop around
inw() works as expected.

Here is the interesting part of a ddb section.
port to read is 0x1f0, len is 512, so 0x100 words to read. Target is
0xc6318b64.
db> sh reg
ds          0x11
es          0x11
fs          0x31
gs          0x11
edi         0xc6318b64
esi         0xc6318b64
ebp         0xc6318a2c
ebx         0xc0b0ab80
edx         0x1f0 
ecx         0x100 
eax         0
eip         0xc01b85d6  wdc_datain_pio+0x106 
cs          0x9
eflags      0x206
esp         0xc63189f4
ss          0x11
netbsd:wdc_datain_pio+0x106:    cld
db> s
Stopped in pid 2.1 (atabus0) at netbsd:wdc_datain_pio+0x107:    repne insw
%
dx,%es:(%edi)
db>
Stopped in pid 2.1 (atabus0) at netbsd:wdc_datain_pio+0xc0:     xorl    %edi,%ed
i
db> sh r
ds          0x11
es          0x11
fs          0x31
gs          0x11
edi         0xc6318b66
esi         0xc6318b64
ebp         0xc6318a2c
ebx         0xc0b0ab80
edx         0x1f0
ecx         0x100
eax         0
eip         0xc01b8590  wdc_datain_pio+0xc0
cs          0x9
eflags      0x206
esp         0xc63189f4
ss          0x11
netbsd:wdc_datain_pio+0xc0:     xorl    %edi,%edi
db> s

As you can see, the target (%edi) was incremented by only one word, and
the counter (%ecx) was not touched. Looking at the target buffer, the
first 2 bytes contains the expected values, while the remaining are left
untouched (all 0, while the 3rd byte should be 0xff). The device (an IDE
drive) is still in a state where is expect the host to read data (DRQ asserted
in the status register).

Now I see we're doing a repne insw here. From the intel docs, only rep
is allowed with {in,out}s{b,w,l} instructions. So I remplaced repne with
rep in sys/arch/x86/include/pio.h (see attached diff), and this made things
work properly:
db> sh re
ds          0x11
es          0x11
fs          0x31
gs          0x11
edi         0xc6318b64
esi         0xc6318b64
ebp         0xc6318a2c
ebx         0xc0b0ab80
edx         0x1f0
ecx         0x100
eax         0
eip         0xc01b85d7  wdc_datain_pio+0x107
cs          0x9
eflags      0x206
esp         0xc63189f4
ss          0x11
netbsd:wdc_datain_pio+0x107:    repe insw       %dx,%es:(%edi)
db> s
Stopped in pid 2.1 (atabus0) at netbsd:wdc_datain_pio+0xc0:     xorl    %edi,%ed
i
db> sh re
ds          0x11
es          0x11
fs          0x31
gs          0x11
edi         0xc6318d64
esi         0xc6318b64
ebp         0xc6318a2c
ebx         0xc0b0ab80
edx         0x1f0
ecx         0
eax         0
eip         0xc01b8590  wdc_datain_pio+0xc0
cs          0x9
eflags      0x206
esp         0xc63189f4
ss          0x11

Now %edi is incremented by the expected value, %ecx is 0 and the data are
read properly. The IDE driver works. This change also made the ahc driver
work, while it would panic in setup phase with the original pio.h.

I tested a kernel on Xen2, and a native i386 kernel with this patch, and
they work properly.
Linux also use rep, not repne.

Now I don't know why repne doesn't work with Xen3 while it works otherwise.
Maybe it's interrupt-related, and Xen3 doesn't restart a repne ins properly
(although, from the Intel docs, and interrupted rep should be restarted
automatically by the processor).
In case this matters, all this is on a:
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel Pentium II (686-class), 447.51 MHz, id 0x652
cpu0: features 183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 183fbff<PGE,MCA,CMOV,PAT,PSE36,MMX>
cpu0: features 183fbff<FXSR>
cpu0: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
cpu0: L2 cache 512 KB 32B/line 4-way
cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu0: 32 page colors
hypervisor0 at mainbus0

Does anyone know if my patch can cause problems on some CPUs ? I don't think
so, as linux doens't seem to have special-case for this.
Does anyone object if I commit this patch ?
 
-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--

--vtzGhvizbBRQ85DL
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff

Index: include/pio.h
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/include/pio.h,v
retrieving revision 1.3
diff -u -r1.3 pio.h
--- include/pio.h	24 Dec 2005 20:07:42 -0000	1.3
+++ include/pio.h	22 Jan 2006 14:49:07 -0000
@@ -84,7 +84,7 @@
 {
 	void *dummy1;
 	int dummy2;
-	__asm volatile("cld\n\trepne\n\tinsb"			:
+	__asm volatile("cld\n\trep\n\tinsb"			:
 			 "=D" (dummy1), "=c" (dummy2) 		:
 			 "d" (port), "0" (addr), "1" (cnt)	:
 			 "memory");
@@ -114,7 +114,7 @@
 {
 	void *dummy1;
 	int dummy2;
-	__asm volatile("cld\n\trepne\n\tinsw"			:
+	__asm volatile("cld\n\trep\n\tinsw"			:
 			 "=D" (dummy1), "=c" (dummy2)		:
 			 "d" (port), "0" (addr), "1" (cnt)	:
 			 "memory");
@@ -144,7 +144,7 @@
 {
 	void *dummy1;
 	int dummy2;
-	__asm volatile("cld\n\trepne\n\tinsl"			:
+	__asm volatile("cld\n\trep\n\tinsl"			:
 			 "=D" (dummy1), "=c" (dummy2)		:
 			 "d" (port), "0" (addr), "1" (cnt)	:
 			 "memory");
@@ -171,7 +171,7 @@
 {
 	void *dummy1;
 	int dummy2;
-	__asm volatile("cld\n\trepne\n\toutsb"		:
+	__asm volatile("cld\n\trep\n\toutsb"		:
 			 "=S" (dummy1), "=c" (dummy2)		:
 			 "d" (port), "0" (addr), "1" (cnt));
 }
@@ -197,7 +197,7 @@
 {
 	void *dummy1;
 	int dummy2;
-	__asm volatile("cld\n\trepne\n\toutsw"		:
+	__asm volatile("cld\n\trep\n\toutsw"		:
 			 "=S" (dummy1), "=c" (dummy2)		:
 			 "d" (port), "0" (addr), "1" (cnt));
 }
@@ -223,7 +223,7 @@
 {
 	void *dummy1;
 	int dummy2;
-	__asm volatile("cld\n\trepne\n\toutsl"		:
+	__asm volatile("cld\n\trep\n\toutsl"		:
 			 "=S" (dummy1), "=c" (dummy2)		:
 			 "d" (port), "0" (addr), "1" (cnt));
 }

--vtzGhvizbBRQ85DL--