Port-i386 archive

Re: Problem with raidframe under NetBSD-3 and NetBSD-4



        Hello.  Following up on my own message, I can now say it's a memory
deadlock issue.  If I remove the swap device from the system, which is
the b partition of the raid set, and then issue the raidctl -F component0
command to get the reconstruction going, I get:
panic: malloc: out of space in kmem_map
        
        Since I assume it's a lot of work to change raidframe to use MALLOC
and check whether it failed, perhaps a reasonable workaround, although
I'd prefer to see a real fix, is to note in the raidctl man page that users
who are swapping to raid sets may need to attach temporary swap devices to
their systems when attempting to reconstruct raid sets with large disks.
I'd also be happy with a kernel message saying that the allocation failed
and that the reconstruction could not be completed due to a lack of memory.
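
To illustrate the behaviour asked for above (fail right away and report
it, rather than sleep waiting for memory that can only be freed by
writing to the raid set), here is a toy user-space model.  It is only a
sketch: it is not RAIDframe code, and the names KernelPool, alloc_nowait
and start_reconstruction are hypothetical.  The pool figures are the ones
from the vmstat -m output quoted below; the request sizes are made up.

```python
class KernelPool:
    """Toy stand-in for a fixed-size kernel memory map (sizes in KB)."""

    def __init__(self, total_kb, in_use_kb):
        self.total_kb = total_kb
        self.in_use_kb = in_use_kb

    def alloc_nowait(self, size_kb):
        """Return True on success, or False immediately if the pool is full.

        A "wait OK" allocator would sleep here instead, which is where the
        deadlock described in this thread would occur.
        """
        if self.in_use_kb + size_kb > self.total_kb:
            return False
        self.in_use_kb += size_kb
        return True


def start_reconstruction(pool, needed_kb):
    """Hypothetical caller: report the failure instead of blocking."""
    if not pool.alloc_nowait(needed_kb):
        return "reconstruction aborted: out of kernel memory"
    return "reconstruction started"


if __name__ == "__main__":
    pool = KernelPool(total_kb=337732, in_use_kb=335974)
    print(start_reconstruction(pool, needed_kb=4096))
```

With only 1758K left in the modelled pool, the 4096K request is refused
at once and the caller gets a message it can pass on to the user.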

-Brian

On Apr 5,  2:34pm, Brian Buhrow wrote:
} Subject: Re: Problem with raidframe under NetBSD-3 and NetBSD-4
}       Hello.  Hmm.  Given what you say, and a brief look at the code, I'm
} thinking that what I'm seeing is a memory deadlock.  The reconstruction calls
} malloc, through the RF_Malloc macro, which says it's OK to wait for memory
} to become available.  Then, we get stuck because to get more memory, we
} must write to swap, which means we must write to the raid device, which
} we can't do because we're reconstructing it.
}       I'll try to test this theory by attaching a non-raid disk to swap to
} during the rebuild operation.
} 
} -Brian
} 
} On Apr 5,  3:05pm, Greg Oster wrote:
} } Subject: Re: Problem with raidframe under NetBSD-3 and NetBSD-4
} } Brian Buhrow writes:
} } >   Hello Greg.   In looking at my problematic machine a little further,
} } > I'm wondering if my problem is a lack of kernel memory.  vmstat -m output
} } > looks like:
} } > [starting with the last line of output]
} } > 
} } > In use 335974K, total allocated 337732K; utilization 99.5%
} } > 
} } > Can the allocated amount change, or is that a fixed size?  
} } 
} } It can change... but there is a maximum...
} } 
} } > By my
} } > calculations, assuming the allocated size is a fixed number, that's less
} } > than 2MB of kernel memory available.  I could imagine that with disks this
} } > large, the reconstruction thread could easily use more than this amount of
} } > memory.
} } >   Do you know if and how to get more kernel memory, either by tuning the
} } > kernel options, or by tuning sysctl? 
} } 
} } you can set:
} } 
} }  options NKMEMPAGES=value
} } 
} } in the kernel config.  You can find out what you have right now with:
} } 
} }   sysctl -a | grep kmem
} } 
} } For example, my box w/ 2GB has:
} } 
} } vm.nkmempages = 32768
} } 
} } > The machine has 2GB of RAM, and the
} } > working machine I alluded to earlier also has 2GB of RAM.  However, the
} } > working machine only has 118MB of allocated kernel space, and its
} } > utilization percentage is 96.5, as opposed to 99.5.
} } > 
} } > Thoughts?
} } 
} } Hmm.... I recall there being some per-strip structure being allocated
} } for reconstructions, but I can't find it in the 2 seconds that I've 
} } looked for it (and I'm heading out in a few moments :( ).  If such an 
} } allocation was failing, that would certainly cause the reconstruction 
} } to stop (and, if there was a bug, perhaps to bung things up like you 
} } describe).
} } 
} } If you bump up vm.nkmempages to 65536 (for example), and the rebuild 
} } still fails, then it's most likely not a memory issue... 
} } 
} } I'll attempt to have a closer look at this again tomorrow...
} } 
} } Later...
} } 
} } Greg Oster
} } 
} } 
} >-- End of excerpt from Greg Oster
} 
} 
>-- End of excerpt from Brian Buhrow
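
The figures quoted above can be checked with a little arithmetic.  The
sketch below takes the vmstat -m totals and the vm.nkmempages values from
the thread and works out the remaining headroom and the size each page
count corresponds to.  The 4096-byte page size is an assumption (the usual
value on i386), and the function names are made up for illustration.

```python
PAGE_SIZE_BYTES = 4096  # assumption: i386 page size


def kmem_headroom_kb(in_use_kb, total_kb):
    """Memory allocated to the kernel malloc pool but not in use, in KB."""
    return total_kb - in_use_kb


def utilization_pct(in_use_kb, total_kb):
    """Utilization as a percentage, as reported by vmstat -m."""
    return 100.0 * in_use_kb / total_kb


def nkmempages_to_mib(pages, page_size=PAGE_SIZE_BYTES):
    """Size in MiB covered by a given NKMEMPAGES value."""
    return pages * page_size // (1024 * 1024)


if __name__ == "__main__":
    in_use_kb, total_kb = 335974, 337732  # from the vmstat -m line above
    print("headroom: %dK" % kmem_headroom_kb(in_use_kb, total_kb))
    print("utilization: %.1f%%" % utilization_pct(in_use_kb, total_kb))
    for pages in (32768, 65536):
        print("NKMEMPAGES=%d -> %d MiB" % (pages, nkmempages_to_mib(pages)))
```

Under these assumptions the headroom comes to 1758K, and doubling
NKMEMPAGES from 32768 to 65536 doubles the covered size from 128 MiB to
256 MiB.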




