Port-amd64 archive


Re: boot fail with large RAM



On Wed, Nov 18, 2009 at 08:23:22PM +0100, Manuel Bouyer wrote:
> [...]
> I also noticed that amd64 has only 2 free lists, one for the low 16MB
> of memory and one for anything else. Would it help to move to three
> free lists: one for the low 16MB, a second for the low 4GB and a third
> for anything else? uvm(9) doesn't say much about free lists ...

I tried it; it helps a lot (see attached dmesg). The system can boot,
and devices seem to be working fine. I also tested it on a desktop with
8GB of RAM, and on another one with 2GB.
I think this patch can also help systems which don't have trouble at
boot time but could end up with bus_dma(9) allocation failures at
run time. So unless someone objects, I'm going to commit the attached
patch in the next few days and request a pullup to netbsd-5.

-- 
Manuel Bouyer, LIP6, Universite Paris VI.           
Manuel.Bouyer%lip6.fr@localhost
     NetBSD: 26 years of experience will always make the difference
--
booting hd0a:netbsd64
10584768+572992+823680 [731328+479128]=0xd96018
kernel text is mapped with 6 large pages and 25 normal pages
BIOS MEMORY MAP (8 ENTRIES):
    addr 0x0  size 0xa0000  type 0x1
    addr 0x100000  size 0x6f599000  type 0x1
    addr 0x6f699000  size 0x16000  type 0x2
    addr 0x6f6af000  size 0x1f000  type 0x3
    addr 0x6f6ce000  size 0x932000  type 0x2
    addr 0xe0000000  size 0x10000000  type 0x2
    addr 0xfe000000  size 0x2000000  type 0x2
    addr 0x100000000  size 0xb88000000  type 0x1
loading first16q 0xe000-0xa0000 (0xe-0xa0)
loading first16q 0xf0a000-0x1000000 (0xf0a-0x1000)
loading first4gq 0x1000000-0x6f699000 (0x1000-0x6f699)
loading default 0x100000000-0xc88000000 (0x100000-0xc88000)
Loaded initial symtab at 0xffffffff80d6dfc0, strtab at 0xffffffff80e21080, # en0
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 5.0_STABLE (GENERIC) #2: Thu Nov 19 11:59:12 CET 2009
        bouyer@roll:/dsk/l1/misc/bouyer/tmp/amd64/obj/dsk/l1/misc/bouyer/netbsdC
total memory = 49014 MB
avail memory = 47509 MB
SMBIOS rev. 2.6 @ 0x6f79c000 (83 entries)
Dell Inc. PowerEdge R710
mainbus0 (root)
ACPI Error (tbutils-0314): Null physical address for ACPI table [(null)] [20080]
cpu0 at mainbus0 apid 16: Intel 686-class, 2527MHz, id 0x106a5
cpu1 at mainbus0 apid 0: Intel 686-class, 2527MHz, id 0x106a5
cpu2 at mainbus0 apid 18: Intel 686-class, 2527MHz, id 0x106a5
cpu3 at mainbus0 apid 2: Intel 686-class, 2527MHz, id 0x106a5
cpu4 at mainbus0 apid 20: Intel 686-class, 2527MHz, id 0x106a5
cpu5 at mainbus0 apid 4: Intel 686-class, 2527MHz, id 0x106a5
cpu6 at mainbus0 apid 22: Intel 686-class, 2527MHz, id 0x106a5
cpu7 at mainbus0 apid 6: Intel 686-class, 2527MHz, id 0x106a5
cpu8 at mainbus0 apid 17: Intel 686-class, 2527MHz, id 0x106a5
cpu9 at mainbus0 apid 1: Intel 686-class, 2527MHz, id 0x106a5
cpu10 at mainbus0 apid 19: Intel 686-class, 2527MHz, id 0x106a5
cpu11 at mainbus0 apid 3: Intel 686-class, 2527MHz, id 0x106a5
cpu12 at mainbus0 apid 21: Intel 686-class, 2527MHz, id 0x106a5
cpu13 at mainbus0 apid 5: Intel 686-class, 2527MHz, id 0x106a5
cpu14 at mainbus0 apid 23: Intel 686-class, 2527MHz, id 0x106a5
cpu15 at mainbus0 apid 7: Intel 686-class, 2527MHz, id 0x106a5
ioapic0 at mainbus0 apid 0
ioapic1 at mainbus0 apid 1
acpi0 at mainbus0: Intel ACPICA 20080321
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x5f irq 0
COMA (PNP0501) at acpi0 not configured
COMB (PNP0501) at acpi0 not configured
hpet0 at acpi0 (HPET, PNP0103-0): mem 0xfed00000-0xfed003ff
ipmi0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x8086 product 0x3406 (rev. 0x13)
ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x3408 (rev. 0x13)
ppb0: unsupported PCI Express version
pci1 at ppb0 bus 1
bnx0 at pci1 dev 0 function 0: Broadcom NetXtreme II BCM5709 1000Base-T
bnx0: Ethernet address 00:24:e8:46:7e:61
bnx0: interrupting at ioapic1 pin 4
brgphy0 at bnx0 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-Fo
bnx1 at pci1 dev 0 function 1: Broadcom NetXtreme II BCM5709 1000Base-T
bnx1: Ethernet address 00:24:e8:46:7e:63
bnx1: interrupting at ioapic1 pin 16
brgphy1 at bnx1 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-Fo
ppb1 at pci0 dev 3 function 0: vendor 0x8086 product 0x340a (rev. 0x13)
ppb1: unsupported PCI Express version
pci2 at ppb1 bus 2
bnx2 at pci2 dev 0 function 0: Broadcom NetXtreme II BCM5709 1000Base-T
bnx2: Ethernet address 00:24:e8:46:7e:65
bnx2: interrupting at ioapic1 pin 0
brgphy2 at bnx2 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-Fo
bnx3 at pci2 dev 0 function 1: Broadcom NetXtreme II BCM5709 1000Base-T
bnx3: Ethernet address 00:24:e8:46:7e:67
bnx3: interrupting at ioapic1 pin 10
brgphy3 at bnx3 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-Fo
ppb2 at pci0 dev 4 function 0: vendor 0x8086 product 0x340b (rev. 0x13)
ppb2: unsupported PCI Express version
pci3 at ppb2 bus 3
mpt0 at pci3 dev 0 function 0: vendor 0x1000 product 0x0058
mpt0: interrupting at ioapic1 pin 1
mpt0: Phy 0: Link Rate 3.0 Gbps
mpt0: Phy 1: Link Rate 3.0 Gbps
mpt0: Phy 2: Link Rate 3.0 Gbps
scsibus0 at mpt0: 112 targets, 8 luns per target
ppb3 at pci0 dev 5 function 0: vendor 0x8086 product 0x340c (rev. 0x13)
ppb3: unsupported PCI Express version
pci4 at ppb3 bus 4
ppb4 at pci0 dev 6 function 0: vendor 0x8086 product 0x340d (rev. 0x13)
ppb4: unsupported PCI Express version
pci5 at ppb4 bus 5
vendor 0x197b product 0x2382 (miscellaneous system, revision 0x80) at pci5 dev d
sdhc0 at pci5 dev 0 function 2: vendor 0x197b product 0x2381 (rev. 0x80)
sdhc0: interrupting at ioapic1 pin 14
sdmmc0 at sdhc0
vendor 0x197b product 0x2383 (miscellaneous system, revision 0x80) at pci5 dev d
vendor 0x197b product 0x2384 (miscellaneous system, revision 0x80) at pci5 dev d
jme0 at pci5 dev 0 function 5
jme0: JMicron JMC260 Gigabit Ethernet Controller
jme0: Ethernet address 00:1b:8c:26:ad:30
jme0: interrupting at ioapic1 pin 3
ukphy0 at jme0 phy 1: Generic IEEE 802.3u media interface
ukphy0: OUI 0x00d831, model 0x0022, rev. 1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ppb5 at pci0 dev 7 function 0: vendor 0x8086 product 0x340e (rev. 0x13)
ppb5: unsupported PCI Express version
pci6 at ppb5 bus 6
ppb6 at pci0 dev 9 function 0: vendor 0x8086 product 0x3410 (rev. 0x13)
ppb6: unsupported PCI Express version
pci7 at ppb6 bus 7
vendor 0x8086 product 0x342e (interrupt system, revision 0x13) at pci0 dev 20 fd
vendor 0x8086 product 0x3422 (interrupt system, revision 0x13) at pci0 dev 20 fd
vendor 0x8086 product 0x3423 (interrupt system, revision 0x13) at pci0 dev 20 fd
uhci0 at pci0 dev 26 function 0: vendor 0x8086 product 0x2937 (rev. 0x02)
uhci0: interrupting at ioapic0 pin 17
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 26 function 1: vendor 0x8086 product 0x2938 (rev. 0x02)
uhci1: interrupting at ioapic0 pin 18
usb1 at uhci1: USB revision 1.0
ehci0 at pci0 dev 26 function 7: vendor 0x8086 product 0x293c (rev. 0x02)
ehci0: inteuhidev0 at uhub1 port 2 configuration 1 interface 0
uhidev0: Avocent USB Composite Device-0, rev 1.10/0.00, addr 2, iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes
ehci0: handing over low speed device on port 1 to uhci0
wskbd0 at ukbd0 mux 1
uhidev1 at uhub1 port 2 configuration 1 interface 1
uhidev1: Avocent USB Composite Device-0, rev 1.10/0.00, addr 2, iclass 3/1
ums0 at uhidev1
ums0: X report 0x0002 not supported
uhidev2 at uhub4 port 1 configuration 1 interface 0
uhidev2: CHESEN PS2 to USB Converter, rev 1.10/0.10, addr 2, iclass 3/1
ukbd1 at uhidev2: 8 modifier keys, 6 key codes
wskbd1 at ukbd1 mux 1
uhidev3 at uhub4 port 1 configuration 1 interface 1
uhidev3: CHESEN PS2 to USB Converter, rev 1.10/0.10, addr 2, iclass 3/1
uhidev3: 3 report ids
ums1 at uhidev3 reportid 1: 5 buttons and Z dir.
wsmouse0 at ums1 mux 0
uhid0 at uhidev3 reportid 2: input=1, output=0, feature=0
uhid1 at uhidev3 reportid 3: input=3, output=0, feature=0
atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 0: <TEAC DVD-ROM DV28SV, 09082903120833, D.0K> cdrom reme

Index: amd64/machdep.c
===================================================================
RCS file: /cvsroot/src/sys/arch/amd64/amd64/machdep.c,v
retrieving revision 1.102.4.11
diff -u -p -u -r1.102.4.11 machdep.c
--- amd64/machdep.c     3 Oct 2009 23:49:50 -0000       1.102.4.11
+++ amd64/machdep.c     19 Nov 2009 11:45:08 -0000
@@ -1375,7 +1375,7 @@ init_x86_64(paddr_t first_avail)
        struct mem_segment_descriptor *ldt_segp;
        int x;
 #ifndef XEN
-       int first16q, ist;
+       int first16q, first4gq, ist;
        extern struct extent *iomem_ex;
        uint64_t seg_start, seg_end;
        uint64_t seg_start1, seg_end1;
@@ -1585,11 +1585,19 @@ init_x86_64(paddr_t first_avail)
         * all of the ISA DMA'able memory won't be eaten up
         * first-off).
         */
-       if (avail_end <= (16 * 1024 * 1024))
+#define ADDR_16M (16 * 1024 * 1024)
+#define ADDR_4G (4ULL * 1024 * 1024 * 1024)
+
+       if (avail_end <= ADDR_16M)
                first16q = VM_FREELIST_DEFAULT;
        else
                first16q = VM_FREELIST_FIRST16;
 
+       if (avail_end <= ADDR_4G)
+               first4gq = VM_FREELIST_DEFAULT;
+       else
+               first4gq = VM_FREELIST_FIRST4G;
+
        /* Make sure the end of the space used by the kernel is rounded. */
        first_avail = round_page(first_avail);
 
@@ -1636,19 +1644,19 @@ init_x86_64(paddr_t first_avail)
 
                /* First hunk */
                if (seg_start != seg_end) {
-                       if (seg_start < (16 * 1024 * 1024) &&
+                       if (seg_start < ADDR_16M &&
                            first16q != VM_FREELIST_DEFAULT) {
                                uint64_t tmp;
 
-                               if (seg_end > (16 * 1024 * 1024))
-                                       tmp = (16 * 1024 * 1024);
+                               if (seg_end > ADDR_16M)
+                                       tmp = ADDR_16M;
                                else
                                        tmp = seg_end;
 
                                if (tmp != seg_start) {
 #ifdef DEBUG_MEMLOAD
-                                       printf("loading 0x%"PRIx64"-0x%"PRIx64
-                                           " (0x%lx-0x%lx)\n",
+                                       printf("loading first16q 0x%"PRIx64
+                                           "-0x%"PRIx64" (0x%lx-0x%lx)\n",
                                            seg_start, tmp,
                                            atop(seg_start), atop(tmp));
 #endif
@@ -1658,10 +1666,32 @@ init_x86_64(paddr_t first_avail)
                                }
                                seg_start = tmp;
                        }
+                       if (seg_start < ADDR_4G &&
+                           first4gq != VM_FREELIST_DEFAULT) {
+                               uint64_t tmp;
+
+                               if (seg_end > ADDR_4G)
+                                       tmp = ADDR_4G;
+                               else
+                                       tmp = seg_end;
+
+                               if (tmp != seg_start) {
+#ifdef DEBUG_MEMLOAD
+                                       printf("loading first4gq 0x%"PRIx64
+                                           "-0x%"PRIx64" (0x%lx-0x%lx)\n",
+                                           seg_start, tmp,
+                                           atop(seg_start), atop(tmp));
+#endif
+                                       uvm_page_physload(atop(seg_start),
+                                           atop(tmp), atop(seg_start),
+                                           atop(tmp), first4gq);
+                               }
+                               seg_start = tmp;
+                       }
 
                        if (seg_start != seg_end) {
 #ifdef DEBUG_MEMLOAD
-                               printf("loading 0x%"PRIx64"-0x%"PRIx64
+                               printf("loading default 0x%"PRIx64"-0x%"PRIx64
                                    " (0x%lx-0x%lx)\n",
                                    seg_start, seg_end,
                                    atop(seg_start), atop(seg_end));
@@ -1674,19 +1704,19 @@ init_x86_64(paddr_t first_avail)
 
                /* Second hunk */
                if (seg_start1 != seg_end1) {
-                       if (seg_start1 < (16 * 1024 * 1024) &&
+                       if (seg_start1 < ADDR_16M &&
                            first16q != VM_FREELIST_DEFAULT) {
                                uint64_t tmp;
 
-                               if (seg_end1 > (16 * 1024 * 1024))
-                                       tmp = (16 * 1024 * 1024);
+                               if (seg_end1 > ADDR_16M)
+                                       tmp = ADDR_16M;
                                else
                                        tmp = seg_end1;
 
                                if (tmp != seg_start1) {
 #ifdef DEBUG_MEMLOAD
-                                       printf("loading 0x%"PRIx64"-0x%"PRIx64
-                                           " (0x%lx-0x%lx)\n",
+                                       printf("loading first16q 0x%"PRIx64
+                                           "-0x%"PRIx64" (0x%lx-0x%lx)\n",
                                            seg_start1, tmp,
                                            atop(seg_start1), atop(tmp));
 #endif
@@ -1696,10 +1726,32 @@ init_x86_64(paddr_t first_avail)
                                }
                                seg_start1 = tmp;
                        }
+                       if (seg_start1 < ADDR_4G &&
+                           first4gq != VM_FREELIST_DEFAULT) {
+                               uint64_t tmp;
+
+                               if (seg_end1 > ADDR_4G)
+                                       tmp = ADDR_4G;
+                               else
+                                       tmp = seg_end1;
+
+                               if (tmp != seg_start1) {
+#ifdef DEBUG_MEMLOAD
+                                       printf("loading first4gq 0x%"PRIx64
+                                           "-0x%"PRIx64" (0x%lx-0x%lx)\n",
+                                           seg_start1, tmp,
+                                           atop(seg_start1), atop(tmp));
+#endif
+                                       uvm_page_physload(atop(seg_start1),
+                                           atop(tmp), atop(seg_start1),
+                                           atop(tmp), first4gq);
+                               }
+                               seg_start1 = tmp;
+                       }
 
                        if (seg_start1 != seg_end1) {
 #ifdef DEBUG_MEMLOAD
-                               printf("loading 0x%"PRIx64"-0x%"PRIx64
+                               printf("loading default 0x%"PRIx64"-0x%"PRIx64
                                    " (0x%lx-0x%lx)\n",
                                    seg_start1, seg_end1,
                                    atop(seg_start1), atop(seg_end1));
Index: include/vmparam.h
===================================================================
RCS file: /cvsroot/src/sys/arch/amd64/include/vmparam.h,v
retrieving revision 1.18
diff -u -p -u -r1.18 vmparam.h
--- include/vmparam.h   20 Jan 2008 13:43:38 -0000      1.18
+++ include/vmparam.h   19 Nov 2009 11:45:08 -0000
@@ -157,9 +157,10 @@
 #define VM_PHYSSEG_STRAT       VM_PSTRAT_BIGFIRST
 #define VM_PHYSSEG_NOADD               /* can't add RAM after vm_mem_init */
 
-#define        VM_NFREELIST            2
+#define        VM_NFREELIST            3
 #define        VM_FREELIST_DEFAULT     0
-#define        VM_FREELIST_FIRST16     1
+#define        VM_FREELIST_FIRST4G     1
+#define        VM_FREELIST_FIRST16     2
 
 #include <x86/pmap_pv.h>
 

