NetBSD-Bugs archive


kern/39526: ahd driver crashes system if it runs out of memory



>Number:         39526
>Category:       kern
>Synopsis:       ahd driver crashes system if it runs out of memory
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Sep 12 13:50:00 +0000 2008
>Originator:     Wolfgang Stukenbrock
>Release:        NetBSD 4.0
>Organization:
Dr. Nagler  & Company GmbH
        
>Environment:
        
        
System: NetBSD s012 4.0 NetBSD 4.0 (NSW-S012) #1: Thu Sep 11 12:21:03 CEST 2008 
root@s012:/usr/src/sys/arch/amd64/compile/NSW-S012 amd64
Architecture: x86_64
Machine: amd64
>Description:
        I'm running a 29320ALP-R in a PCI-X slot on an Intel motherboard with
        8 GB RAM installed (Intel S3210SHLX with E3110, with the 6MB-cache
        fix; see PR kern/39242).
        If the system starts using memory beyond 4 GB, the ahd driver fails
        to allocate memory for some structures. The following messages appear:
        ahd0: failed to create DMA map for Sense Data structures, error = 12
        ahd_createdmamem error (2)
        ahd0: failed to create DMA map for SG data structures, error = 12
        ahd_createdmamem error (2)
        ahd0: failed to create DMA map for hardware SCB structures, error = 12
        ahd_createdmamem error (2)

        I've added a debug print to the uvm_pglistalloc_simple routine to find
        the cause of the problem. It reports the following (an example; the
        number of free pages varies over time):

        plistalloc - waiting orig num 1 - num 1 low 0x1000000 high 0x100000000 - free 162 pd_res 1 kres 5

        This means that one page was requested and that one page has still
        not been found.
        The page must lie between 0x1000000 and 0x100000000.
        The vm-stat structure reports that 162 pages are still free, one is
        reserved for the pagedaemon and 5 pages are reserved for the kernel.
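
        The debug line can be modelled outside the kernel. The following is
        a minimal sketch, not UVM code: struct free_page,
        alloc_page_in_range() and count_free() are hypothetical names
        invented for illustration. It only shows how a request bounded by
        [low, high) can fail while the free-page count is still nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HYP_PAGE_SIZE 4096ULL	/* assumed page size for this model */

/* Hypothetical model of a physical page; not a UVM structure. */
struct free_page {
	uint64_t paddr;		/* physical address of the page */
	int	 in_use;	/* nonzero once the page is handed out */
};

/* Count pages that are still free, regardless of their address. */
static size_t
count_free(const struct free_page *pages, size_t npages)
{
	size_t n = 0;

	for (size_t i = 0; i < npages; i++)
		if (!pages[i].in_use)
			n++;
	return n;
}

/*
 * Hand out the first free page that lies entirely in [low, high).
 * Returns the index of the page, or -1 if no free page qualifies.
 */
static int
alloc_page_in_range(struct free_page *pages, size_t npages,
    uint64_t low, uint64_t high)
{
	assert(low < high);
	for (size_t i = 0; i < npages; i++) {
		if (pages[i].in_use)
			continue;
		if (pages[i].paddr >= low &&
		    pages[i].paddr + HYP_PAGE_SIZE <= high) {
			pages[i].in_use = 1;
			return (int)i;
		}
	}
	return -1;
}
```

        With a pool whose free pages all sit at or above 0x100000000, a
        request with the bounds from the debug line (low 0x1000000, high
        0x100000000) returns -1 in this model although count_free() is
        nonzero.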

        So physical memory is available, but it lies outside the requested
        range.
        Therefore the error is returned to the ahd driver - it always fails
        in dma-mem-alloc().
        I've tried to enable 64-bit access by setting AHD_64BIT_ADDRESSING,
        but that does not help.
        (As a first try, I set the flag if a PCI_CAP_PCIX capability is found.)
        For unknown reasons AHD_64BIT_ADDRESSING is defined and used in
        several places, but the code never sets it. Is this a bug/feature
        where setting the flag is just missing somewhere, or is the flag
        obsolete?
        Without this flag there are many more memory allocation failures, so
        I think setting the flag was simply lost through other changes in the
        past.
        Nevertheless, the crash described below still happens every time.

        After the 6 error messages above have repeated several times, the
        following output appears on the console:
        ahd0: failed to create DMA map for SG data structures, error = 12
        ahd_createdmamem error (2)
        ahd0: failed to create DMA map for Sense Data structures, error = 12
        ahd_createdmamem error (2)
        uvm_fault(0xffffffff80606ee0, 0xffff80009b797000, 2) -> e
        kernel: page fault trap, code=0
        Stopped in pid 9.1 (scsibus0) at        netbsd:ahd_alloc_scbs+0x1ed:    repe stosq      %es:(%rdi)

        The system is just trying to set up an additional request.
        The number of times the 6 lines above repeat before the crash varies.
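
        The faulting instruction is a string store (repe stosq), which is
        what a memory fill such as memset() typically compiles to, and
        error = 12 is ENOMEM on NetBSD. One possible reading, which is an
        assumption and not confirmed against the driver source, is that a
        buffer is cleared although its allocation or mapping failed. The
        sketch below is userland C and not ahd code: struct scb_block,
        dma_alloc_fn and alloc_block_checked() are hypothetical names. It
        only illustrates checking the allocation result before filling.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a DMA-memory allocator that may fail. */
typedef void *(*dma_alloc_fn)(size_t size);

/* Hypothetical descriptor for one allocated block. */
struct scb_block {
	void	*vaddr;		/* NULL when nothing is allocated */
	size_t	 size;
};

/*
 * Allocate and zero a block. The fill is only reached when the
 * allocation succeeded; on failure the descriptor is left empty and
 * ENOMEM is returned so the caller can back off.
 */
static int
alloc_block_checked(struct scb_block *blk, size_t size, dma_alloc_fn alloc)
{
	void *p;

	assert(blk != NULL && alloc != NULL);
	p = alloc(size);
	if (p == NULL) {
		blk->vaddr = NULL;
		blk->size = 0;
		return ENOMEM;
	}
	memset(p, 0, size);
	blk->vaddr = p;
	blk->size = size;
	return 0;
}

/* Test allocators: one that always fails, one backed by malloc(). */
static void *
failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

static void *
working_alloc(size_t size)
{
	return malloc(size);
}
```

        In this model the failing allocator yields ENOMEM and an empty
        descriptor, so no fill is attempted on an invalid address.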

        Another point that I don't understand is that, even if 64-bit
        addressing is allowed, the requested range is below 4 GB. Is this a
        bug, or is it necessary that DMA memory is below 4 GB even with
        64-bit addressing?
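
        To make the question concrete, this is the dependency one might
        expect between such a flag and the upper bound of the allocation
        range. It is purely a sketch under assumptions: struct hyp_softc,
        HYP_FLAG_64BIT, hyp_note_pcix() and hyp_dma_high_bound() are
        hypothetical and do not reproduce the ahd driver or bus_dma code.
        The observed behaviour differs, since the debug print shows high
        0x100000000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HYP_FLAG_64BIT	0x1u		/* hypothetical flag value */
#define HYP_4GB		0x100000000ULL	/* first address above 4 GB */

/* Hypothetical per-device state; not the real ahd softc. */
struct hyp_softc {
	unsigned int flags;
};

/* Record 64-bit capability when a PCI-X capability was found. */
static void
hyp_note_pcix(struct hyp_softc *sc, int has_pcix)
{
	assert(sc != NULL);
	if (has_pcix)
		sc->flags |= HYP_FLAG_64BIT;
}

/* Upper bound (exclusive) that a DMA allocation would request. */
static uint64_t
hyp_dma_high_bound(const struct hyp_softc *sc)
{
	assert(sc != NULL);
	return (sc->flags & HYP_FLAG_64BIT) ? UINT64_MAX : HYP_4GB;
}
```

        In this model the bound rises above 4 GB once the flag is set. The
        debug output in this report shows the bound unchanged, which is what
        prompts the question.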
>How-To-Repeat:
        Install an ahd controller (e.g. 29320ALP-R) in a system with 8 GB
        memory and put it under heavy load.
        After some minutes you will see the crash.
>Fix:
        Not known to me so far - sorry.
        Everything I've tried so far has failed to solve the problem
        completely; I've had only partial success.
        Setting AHD_64BIT_ADDRESSING in ahd_pci_attach() at line 403 in
        sys/dev/pci/ahd_pci.c, after
        "if (!pci_get_capability(pa->pa_pc, pa->pa_tag, PCI_CAP_PCIX, &bd->pcix_off, NULL))"
        has succeeded, removes many failed allocations for transfers, but the
        system still crashes after some failed DMA memory allocation attempts.

>Unformatted:
        
        

