NetBSD-Bugs archive


bin/44466: savecore tries to save NULL kernel -> clear of core-flag fails



>Number:         44466
>Category:       bin
>Synopsis:       savecore tries to save NULL kernel -> clear of core-flag fails
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 26 12:35:00 +0000 2011
>Originator:     Dr. Wolfgang Stukenbrock
>Release:        NetBSD 5.1
>Organization:
Dr. Nagler & Company GmbH
>Environment:
        
        
System: NetBSD test-s0 4.0 NetBSD 4.0 (NSW-WS) #0: Tue Aug 17 17:28:09 CEST 2010 wgstuken@test-s0:/usr/src/sys/arch/amd64/compile/NSW-WS amd64
Architecture: x86_64
Machine: amd64
>Description:
        Due to past changes to savecore, the variable kernel may be set to NULL.
        That is fine for kvm_openfiles() and related calls, but not for the stat() and open() syscalls, which need a valid path.
        Output of ktruss:

   ..... lots of lines deleted - including the stat(NULL, ...) call ....
   437      1 savecore write(0x5, 0x7f7ffd46b000, 0x319d) = 12701
       
"\^]\^?T^\M^W\^W\M-_\M-;\M-w\M-i\M-E\M-O\M-c3\M-?\M-l\M-R\M-O\M-w\M-x\M->\M-8{\M-~x\M-~M\M-CS\M-N\M^?zy\M-Q\M-=\M-;\M^O"
   437      1 savecore close(0x5)                  = 0
   437      1 savecore gettimeofday(0x7f7fffffc8e0, 0) = 0
   437      1 savecore writev(0x2, 0x7f7fffffc980, 0x2) = 63
       "savecore: writing compressed kernel to /var/crash/netbsd.13.gz\n"
   437      1 savecore fcntl(0x3, 0x3, 0)          = 2
   437      1 savecore sendto(0x3, 0x7f7fffffc9b0, 0x52, 0, 0, 0) = 82
       "<29>Jan 26 12:52:27 savecore: writing compressed kernel to /var/crash/netbsd.13.gz"
   437      1 savecore open("/var/crash/netbsd.13.gz", 0x601, 0x1b6) = 5
   437      1 savecore __fstat30(0x5, 0x7f7fffffca10) = 0
   437      1 savecore open(0, 0, 0)               Err#14 EFAULT
   437      1 savecore gettimeofday(0x7f7fffffc8d0, 0) = 0
   437      1 savecore issetugid()                 = 0
   437      1 savecore issetugid()                 = 0
   437      1 savecore open("/usr/share/nls/nls.alias.db", 0, 0) Err#2 ENOENT
   437      1 savecore open("/usr/share/nls/nls.alias", 0, 0) = 6
   437      1 savecore fcntl(0x6, 0x2, 0x1)        = 0
   437      1 savecore __fstat30(0x6, 0x7f7fffffbb20) = 0
   437      1 savecore mmap(0, 0x5f0, 0x1, 0x2, 0x6, 0, 0) = 0x7f7ffdff5000
   437      1 savecore close(0x6)                  = 0
   437      1 savecore munmap(0x7f7ffdff5000, 0x5f0) = 0
   437      1 savecore open("/usr/share/nls/C/libc.cat", 0, 0x7f7ffd5e0a01) = 6
   437      1 savecore __fstat30(0x6, 0x7f7fffffbfc0) = 0
   437      1 savecore mmap(0, 0x10be, 0x1, 0x1, 0x6, 0, 0) = 0x7f7ffdff4000
   437      1 savecore close(0x6)                  = 0
   437      1 savecore munmap(0x7f7ffdff4000, 0x10be) = 0
   437      1 savecore writev(0x2, 0x7f7fffffc970, 0x2) = 30
       "savecore: (null): Bad address\n"
   437      1 savecore fcntl(0x3, 0x3, 0)          = 2
   437      1 savecore sendto(0x3, 0x7f7fffffc9a0, 0x31, 0, 0, 0) = 49
       "<27>Jan 26 12:52:27 savecore: (null): Bad address"
   437      1 savecore write(0x5, 0x7f7ffd46b000, 0xa) = 10
       "\^_\M^K\b\0\0\0\0\0\0\^C"
   437      1 savecore exit(0x1)

        The really bad part is that the helper function Open() calls exit(1) when open() fails.
        So no kernel is saved (that can happen) and, worse, the core-present flag is not cleared!
        As a result, a core is saved on every boot from then on, until "savecore -c" is run.

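        The exit-on-failure behaviour described above can be contrasted with a wrapper that reports the error and returns, so the caller can still clear the flag. This is only a sketch: open_or_report() is a hypothetical name, it is not the Open() helper from savecore.c, and it is not what the patch in this PR does.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical alternative to an exit-on-failure open wrapper (not the
 * Open() helper in savecore.c): report the error and return -1 so that
 * the caller can still run its cleanup, e.g. clearing the core flag.
 */
static int
open_or_report(const char *name, int flags)
{
	int fd;

	if (name == NULL) {
		/* Never hand a NULL path to open(2); it would fail with EFAULT. */
		fprintf(stderr, "open: no file name given\n");
		errno = EINVAL;
		return -1;
	}
	if ((fd = open(name, flags)) == -1)
		fprintf(stderr, "%s: %s\n", name, strerror(errno));
	return fd;
}
```

        A caller would test the return value for -1, skip the kernel copy, and continue with the rest of its work instead of terminating.
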
        The following patch removes the NULL calls to stat() and open().
        As a side effect, a kernel is now saved only if -N is given on the command line.
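
        The idea of saving a kernel only when one is named can be sketched in isolation as follows. copy_kernel_if_named() is a hypothetical helper written for illustration; savecore.c has no such function and uses its own file handling and compression.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of savecore.c: copy the kernel image to
 * dest, but only when a kernel path is known.
 * Returns 1 if copied, 0 if skipped (kernel == NULL), -1 on error.
 */
static int
copy_kernel_if_named(const char *kernel, const char *dest)
{
	char buf[8192];
	ssize_t n;
	int ifd, ofd, rv = 1;

	if (kernel == NULL)
		return 0;	/* no kernel named: nothing to open or save */

	if ((ifd = open(kernel, O_RDONLY)) == -1)
		return -1;
	if ((ofd = open(dest, O_WRONLY | O_CREAT | O_TRUNC, 0600)) == -1) {
		(void)close(ifd);
		return -1;
	}
	while ((n = read(ifd, buf, sizeof(buf))) > 0) {
		if (write(ofd, buf, (size_t)n) != n) {
			rv = -1;	/* short write or write error */
			break;
		}
	}
	if (n == -1)
		rv = -1;	/* read error */
	(void)close(ifd);
	if (close(ofd) == -1)
		rv = -1;
	return rv;
}
```

        With kernel == NULL the function returns before any path reaches open(), so the destination file is not even created.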

        This PR is related to the still-open PRs 41310, 41441 and 41583.

        - PR 41310: describes a workaround: add the -N option on the command line.
        - PR 41441: another way to fix this problem: force kernel to be initialized as described in the savecore manual.
                    That approach conflicts with the effect of passing NULL to kvm_openfiles() if -N is not given.
        - PR 41583: the problem there seems to be the missing clear after an open failure in a previous run.

>How-To-Repeat:
        Force the system to write a 
>Fix:
        Apply the following patch to /usr/src/sbin/savecore/savecore.c.

diff -u -r1.1 savecore.c
--- savecore.c  2011/01/26 12:13:56     1.1
+++ savecore.c  2011/01/26 12:20:38
@@ -774,6 +774,7 @@
                (void)close(ifd);
        (void)fclose(fp);
 
+      if (kernel != NULL) { /* only if -N is specified ... */
        /* Create a kernel. */
        (void)snprintf(path, sizeof(path), "%s/netbsd.%d%s",
            dirname, bounds, compress ? ".gz" : "");
@@ -804,6 +805,7 @@
                (void)fclose(fp);
        else
                (void)close(ofd);
+      }
 
        /*
         * For development systems where the crash occurs during boot
@@ -911,8 +913,12 @@
        char mbuf[100], path[MAXPATHLEN];
 
        /* XXX assume a reasonable default, unless we find a kernel. */
-       kernelsize = 20 * 1024 * 1024;
-       if (!stat(kernel, &st)) kernelsize = st.st_blocks * S_BLKSIZE;
+       if (kernel == NULL)
+               kernelsize = 0;
+       else {
+               kernelsize = 20 * 1024 * 1024;
+               if (kernel != NULL && !stat(kernel, &st)) kernelsize = st.st_blocks * S_BLKSIZE;
+       }
        if (statvfs(dirname, &fsbuf) < 0) {
                syslog(LOG_ERR, "%s: %m", dirname);
                exit(1);
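
        For reference, the size estimate from the last hunk can be written as a standalone function. This is a sketch under assumptions: estimate_kernel_size(), DEFAULT_KERNEL_SIZE and BLOCK_UNIT are names invented here, and BLOCK_UNIT stands in for S_BLKSIZE on the assumption that st_blocks counts 512-byte units.

```c
#include <sys/stat.h>
#include <stddef.h>

#define DEFAULT_KERNEL_SIZE	(20LL * 1024 * 1024)	/* fallback from the patch */
#define BLOCK_UNIT		512	/* assumed st_blocks unit (S_BLKSIZE) */

/*
 * Hypothetical standalone version of the kernel size estimate:
 * 0 when no kernel is named, the on-disk size when stat() succeeds,
 * and the 20 MB default otherwise.
 */
static long long
estimate_kernel_size(const char *kernel)
{
	struct stat st;

	if (kernel == NULL)
		return 0;	/* no kernel will be saved, reserve no space */
	if (stat(kernel, &st) == 0)
		return (long long)st.st_blocks * BLOCK_UNIT;
	return DEFAULT_KERNEL_SIZE;
}
```

        The NULL check comes first, so stat() is only ever called with a real path.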

>Unformatted:
        
        

