NetBSD-Bugs archive


kern/47846: panic/lockups in raidframe during detach at shutdown



>Number:         47846
>Category:       kern
>Synopsis:       panic/lockups in raidframe during detach at shutdown
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 21 21:05:00 +0000 2013
>Originator:     Frank Kardel
>Release:        NetBSD 6.99.19
>Organization:
        
>Environment:
System: NetBSD pip.kardel.name 6.99.19 NetBSD 6.99.19 (PIPGEN) #28: Tue May 21 22:10:18 CEST 2013 kardel%pip.kardel.name@localhost:/usr/src/sys/arch/amd64/compile/PIPGEN amd64
Architecture: x86_64
Machine: amd64
>Description:
        raidframe panics or locks up at shutdown time.
        Problem was introduced with the commit of dynamic allocation of
        raidX devices.

Module Name:    src
Committed By:   christos
Date:           Sat Apr 27 21:18:43 UTC 2013

Modified Files:
        src/sys/dev/raidframe: rf_engine.c rf_netbsd.h rf_netbsdkintf.c
            rf_raid.h

Log Message:
allocate devices dynamically.


To generate a diff of this commit:
cvs rdiff -u -r1.47 -r1.48 src/sys/dev/raidframe/rf_engine.c
cvs rdiff -u -r1.29 -r1.30 src/sys/dev/raidframe/rf_netbsd.h
cvs rdiff -u -r1.299 -r1.300 src/sys/dev/raidframe/rf_netbsdkintf.c
cvs rdiff -u -r1.43 -r1.44 src/sys/dev/raidframe/rf_raid.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

        The issue is caused by:
                - softc structures are now allocated dynamically and
                  independently within the raidframe code
                - CFATTACH_DECL3_NEW(raid, sizeof(struct raid_softc), ...)
                  still requests a separate copy of struct raid_softc
                - raid_detach() uses device_private() to access that
                  uninitialized softc structure, which was allocated during
                  config_attach_pseudo().
        So either you get stuck in raidlock() waiting forever, or a panic is
        triggered in e.g. iostat_free() when it tries to free iostat
        structures that were never initialized.
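
The mismatch described above can be modelled in plain userland C. The
sketch below is not the raidframe code: model_softc, model_device,
model_create(), model_lookup() and model_attach() are hypothetical
stand-ins for struct raid_softc, device_t, the driver's own allocation,
a lookup by unit number, and the private area allocated at attach time.
It only illustrates that the two structures are different objects and
that only the driver-side one is ever initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct raid_softc (fields are illustrative). */
struct model_softc {
	int unit;
	int initialized;		/* set only by model_create() */
	struct model_softc *next;
};

/* Hypothetical stand-in for device_t plus its private area. */
struct model_device {
	int unit;
	void *priv;			/* what device_private() would return */
};

static struct model_softc *model_list;	/* driver-owned list of softcs */

/* Driver-side allocation: this copy is the one that gets initialized. */
static struct model_softc *
model_create(int unit)
{
	struct model_softc *sc = calloc(1, sizeof(*sc));

	if (sc == NULL)
		return NULL;
	sc->unit = unit;
	sc->initialized = 1;
	sc->next = model_list;
	model_list = sc;
	return sc;
}

/* Find the driver-side copy by unit number; NULL if there is none. */
static struct model_softc *
model_lookup(int unit)
{
	struct model_softc *sc;

	for (sc = model_list; sc != NULL; sc = sc->next)
		if (sc->unit == unit)
			return sc;
	return NULL;
}

/* Framework-side attach: allocates a second, zeroed private area. */
static struct model_device *
model_attach(int unit, size_t privsize)
{
	struct model_device *dev = calloc(1, sizeof(*dev));

	if (dev == NULL)
		return NULL;
	dev->unit = unit;
	dev->priv = calloc(1, privsize);
	if (dev->priv == NULL) {
		free(dev);
		return NULL;
	}
	return dev;
}
```

In this model, a detach routine that reads dev->priv sees the zeroed
copy, while one that calls model_lookup(dev->unit) reaches the
initialized copy. That is the distinction the fix below relies on.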

>How-To-Repeat:
        Try to shut down a system that uses a raidframe configuration with
        the current raidframe code.

>Fix:
Quick fix - more analysis is advisable.

Index: rf_netbsdkintf.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_netbsdkintf.c,v
retrieving revision 1.302
diff -u -r1.302 rf_netbsdkintf.c
--- rf_netbsdkintf.c    29 Apr 2013 21:21:10 -0000      1.302
+++ rf_netbsdkintf.c    21 May 2013 20:20:34 -0000
@@ -3841,15 +3843,20 @@
 raid_detach(device_t self, int flags)
 {
        int error;
-       struct raid_softc *rs = device_private(self);
+       struct raid_softc *rs = raidget(device_unit(self));
+
+       if (rs == NULL)
+               return ENXIO;
 
        if ((error = raidlock(rs)) != 0)
                return (error);
 
        error = raid_detach_unlocked(rs);

        raidunlock(rs);
 
+       /* XXXkd: raidput(rs) ??? */
+
        return error;
 }

The above is not completely analysed. The separate softc structure does not
need to be allocated (set DVF_PRIV_ALLOC?). Could there be more oversights?
Where should the raidput()s be done, and are there possible leaks (not
verified yet)?
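
To make the open question about releasing the softc concrete, here is a
minimal userland sketch of one possible placement: release the structure
only after a successful detach, once the lock has been dropped. This is
an assumption for illustration, not a verified answer for raidframe. All
names (model_softc, model_find(), model_get(), model_put(),
model_detach(), model_live) are hypothetical, and the get/put semantics
(get returns or creates the entry for a unit, put unlinks and frees it)
are assumed rather than taken from rf_netbsdkintf.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct raid_softc. */
struct model_softc {
	int unit;
	int locked;			/* models raidlock()/raidunlock() */
	int busy;			/* nonzero: detach fails with EBUSY */
	struct model_softc *next;
};

static struct model_softc *model_list;	/* driver-owned list */
static int model_live;			/* live allocations, for leak checks */

/* Lookup only: never allocates. */
static struct model_softc *
model_find(int unit)
{
	struct model_softc *sc;

	for (sc = model_list; sc != NULL; sc = sc->next)
		if (sc->unit == unit)
			return sc;
	return NULL;
}

/* Assumed "get" semantics: return the entry for unit, creating it if needed. */
static struct model_softc *
model_get(int unit)
{
	struct model_softc *sc = model_find(unit);

	if (sc != NULL)
		return sc;
	if ((sc = calloc(1, sizeof(*sc))) == NULL)
		return NULL;
	sc->unit = unit;
	sc->next = model_list;
	model_list = sc;
	model_live++;
	return sc;
}

/* Assumed "put" semantics: unlink the entry from the list and free it. */
static void
model_put(struct model_softc *sc)
{
	struct model_softc **pp;

	for (pp = &model_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == sc) {
			*pp = sc->next;
			break;
		}
	}
	free(sc);
	model_live--;
}

/* Detach by unit number; release the softc only if detach succeeded. */
static int
model_detach(int unit)
{
	struct model_softc *sc = model_find(unit);
	int error;

	if (sc == NULL)
		return ENXIO;
	sc->locked = 1;
	error = sc->busy ? EBUSY : 0;	/* stands in for the unlocked detach */
	sc->locked = 0;
	if (error == 0)
		model_put(sc);		/* must come after the unlock: put frees sc */
	return error;
}
```

The model uses a lookup-only helper in the detach path so that detaching
an unknown unit does not allocate anything, and the model_live counter
makes it easy to check that a failed detach keeps the entry while a
successful one leaves nothing behind. Whether the same ordering is right
for the real driver still needs the analysis asked for above.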



