Subject: Re: pcmcia patches in current (6/30/97)
To: Herb Peyerl <hpeyerl@beer.org>
From: Chris G. Demetriou <cgd@pa.dec.com>
List: tech-kern
Date: 07/04/1997 21:36:04
> Wolfgang Rupprecht <wolfgang@wsrcc.com> wrote:
> > The problem is that any network activity runs a very high risk of a
> > panic in the vm_* code. I can ftp out and do an ls on the server, but
> > the laptop usually panics before the ls listing finishes.
>
> Yes, I was having pretty bad problems with a machine that had 5 3com's
> in it and Chris was also having problems... We (mostly chris) spent a
> fair bit of time trying to narrow down the problem and, last I saw of
> the discussion, it seemed to be a compiler issue.
Yah.
My testing indicated that with the do/while(0) changes in mbuf.h,
dev/ic/elink3.c (the back-end for the 'ep' driver), in particular the
function epget(), is miscompiled when the default i386 kernel
optimization, -O2, is turned on.
My data points:
(1) when all kernel files are compiled with -O2 and with no
special patches applied, the kernel crashes when I try
to 'ssh' in or use X11 applications (in particular,
ghostview) with a remote display.
(2) when all kernel files are compiled with -O2 except for
elink3.c compiled without -O2 (and with a few warning
flags disabled so some sketchy inlines don't cause
warnings) and with no special patches applied, the kernel
works properly.
(3) when all kernel files are compiled with -O2 and with the
patch below applied to mbuf.h and "#define
_LOSING_COMPILER" added only to elink3.c, the kernel
works properly.
epget() uses MGET, MGETHDR, and MCLGET, and that's what the diff below
changes to not use do/while(0). I tried changing fewer, but that
didn't yield a kernel that was stable. I also tested changing
elink3.c to use m_get instead of MGET and m_gethdr instead of MGETHDR,
and with that change only MCLGET needed the do/while(0) removed. None
of the uses of the mbuf macros in elink3.c should have their behaviour
changed by use of the do/while(0).
That's why I think this is a compiler bug. It could be something
different, but I know where I'd put my money. 8-)
When I defined _LOSING_COMPILER for all files in the kernel and
compiled with optimization, I found that elink3.o and a handful of
other object files (uipc_mbuf.o, and a few NFS objects, if i recall
correctly) differ from a non-_LOSING_COMPILER optimized build. This
makes sense (because register allocation is likely to change in the
presense of a do/while(0)), but indicates that other functionality
which I wasn't testing (most importantly NFS) may cause similar
problems using the new macro definitions.
Anyway, I have a workaround that makes my system function properly.
I'm not even going to attempt to dig into gcc to find the problem.
(It's just not worth my time.)
chris
===================================================================
Index: mbuf.h
===================================================================
RCS file: /cvsroot/src/sys/sys/mbuf.h,v
retrieving revision 1.1.1.3
diff -c -r1.1.1.3 mbuf.h
*** mbuf.h 1997/06/09 03:43:15 1.1.1.3
--- mbuf.h 1997/07/05 04:03:41
***************
*** 41,46 ****
--- 41,54 ----
#ifndef _SYS_MBUF_H_
#define _SYS_MBUF_H_
+ #ifdef _LOSING_COMPILER
+ #define _MBUF_LOSING_DO
+ #define _MBUF_LOSING_WHILE
+ #else
+ #define _MBUF_LOSING_DO do
+ #define _MBUF_LOSING_WHILE while (0)
+ #endif
+
#ifndef M_WAITOK
#include <sys/malloc.h>
#endif
***************
*** 198,204 ****
* If 'how' is M_WAIT, these macros (and the corresponding functions)
* are guaranteed to return successfully.
*/
! #define MGET(m, how, type) do { \
MALLOC((m), struct mbuf *, MSIZE, mbtypes[type], (how)); \
if (m) { \
MBUFLOCK(mbstat.m_mtypes[type]++;); \
--- 206,212 ----
* If 'how' is M_WAIT, these macros (and the corresponding functions)
* are guaranteed to return successfully.
*/
! #define MGET(m, how, type) _MBUF_LOSING_DO { \
MALLOC((m), struct mbuf *, MSIZE, mbtypes[type], (how)); \
if (m) { \
MBUFLOCK(mbstat.m_mtypes[type]++;); \
***************
*** 209,217 ****
(m)->m_flags = 0; \
} else \
(m) = m_retry((how), (type)); \
! } while (0)
! #define MGETHDR(m, how, type) do { \
MALLOC((m), struct mbuf *, MSIZE, mbtypes[type], (how)); \
if (m) { \
MBUFLOCK(mbstat.m_mtypes[type]++;); \
--- 217,225 ----
(m)->m_flags = 0; \
} else \
(m) = m_retry((how), (type)); \
! } _MBUF_LOSING_WHILE
! #define MGETHDR(m, how, type) _MBUF_LOSING_DO { \
MALLOC((m), struct mbuf *, MSIZE, mbtypes[type], (how)); \
if (m) { \
MBUFLOCK(mbstat.m_mtypes[type]++;); \
***************
*** 222,228 ****
(m)->m_flags = M_PKTHDR; \
} else \
(m) = m_retryhdr((how), (type)); \
! } while (0)
/*
* Macros for tracking external storage associated with an mbuf.
--- 230,236 ----
(m)->m_flags = M_PKTHDR; \
} else \
(m) = m_retryhdr((how), (type)); \
! } _MBUF_LOSING_WHILE
/*
* Macros for tracking external storage associated with an mbuf.
***************
*** 280,286 ****
* MEXTADD adds pre-allocated external storage to
* a normal mbuf; the flag M_EXT is set upon success.
*/
! #define MCLGET(m, how) do { \
MBUFLOCK( \
if (mclfree == 0) \
(void)m_clalloc(1, (how)); \
--- 288,294 ----
* MEXTADD adds pre-allocated external storage to
* a normal mbuf; the flag M_EXT is set upon success.
*/
! #define MCLGET(m, how) _MBUF_LOSING_DO { \
MBUFLOCK( \
if (mclfree == 0) \
(void)m_clalloc(1, (how)); \
***************
*** 299,305 ****
(m)->m_ext.ext_arg = NULL; \
MCLINITREFERENCE(m); \
} \
! } while (0)
#define MEXTMALLOC(m, size, how) do { \
(m)->m_ext.ext_buf = \
--- 307,313 ----
(m)->m_ext.ext_arg = NULL; \
MCLINITREFERENCE(m); \
} \
! } _MBUF_LOSING_WHILE
#define MEXTMALLOC(m, size, how) do { \
(m)->m_ext.ext_buf = \