NetBSD-Bugs archive


Re: kern/39242: NetBSD 4.0 will start busy-loop an hang on machines with more than 4 GB memory



The following reply was made to PR kern/39242; it has been noted by GNATS.

From: Wolfgang Stukenbrock <Wolfgang.Stukenbrock%nagler-company.com@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: kern-bug-people%NetBSD.org@localhost, gnats-admin%NetBSD.org@localhost, 
netbsd-bugs%NetBSD.org@localhost
Subject: Re: kern/39242: NetBSD 4.0 will start busy-loop an hang on machines 
with more than 4 GB memory
Date: Thu, 31 Jul 2008 10:24:28 +0200

 Hi,
 
 I think I've located the problem with the looping scsibus0 kernel process!
 
 I've added some printf() calls to the uvm_pglistalloc_simple() routine; they
 start printing once the routine is about to sleep waiting for more memory.
 The problem happened after unpacking a tar file into the filesystem on the
 SCSI disks and then calling sync to flush the cache to disk.
 I got the following output:
 
 
 plistalloc - waiting orig num 1 - num 1 low 0x1000000 high 0x100000000 - free 245 pd_res 1 kres 5
 plistalloc - loop fl 0 psi 3 ps-fl 1 num 1
 plistalloc - loop fl 0 psi 2 ps-fl 1 num 1
 plistalloc - loop fl 0 psi 1 ps-fl 0 num 1
 plistalloc - loop fl 0 psi 0 ps-fl 0 num 1
 plistalloc - loop fl 1 psi 3 ps-fl 1 num 1
 plistalloc - loop fl 1 psi 2 ps-fl 1 num 1
 plistalloc - loop fl 1 psi 1 ps-fl 0 num 1
 plistalloc - loop fl 1 psi 0 ps-fl 0 num 1
 plistalloc - waiting orig num 1 - num 1 low 0x1000000 high 0x100000000 - free 245 pd_res 1 kres 5
 plistalloc - loop fl 0 psi 3 ps-fl 1 num 1
 plistalloc - loop fl 0 psi 2 ps-fl 1 num 1
 plistalloc - loop fl 0 psi 1 ps-fl 0 num 1
 plistalloc - loop fl 0 psi 0 ps-fl 0 num 1
 plistalloc - loop fl 1 psi 3 ps-fl 1 num 1
 plistalloc - loop fl 1 psi 2 ps-fl 1 num 1
 plistalloc - loop fl 1 psi 1 ps-fl 0 num 1
 plistalloc - loop fl 1 psi 0 ps-fl 0 num 1
 plistalloc - waiting orig num 1 - num 1 low 0x1000000 high 0x100000000 - free 245 pd_res 1 kres 5
 plistalloc - loop fl 0 psi 3 ps-fl 1 num 1
 plistalloc - loop fl 0 psi 2 ps-fl 1 num 1
 plistalloc - loop fl 0 psi 1 ps-fl 0 num 1
 plistalloc - loop fl 0 psi 0 ps-fl 0 num 1
 plistalloc - loop fl 1 psi 3 ps-fl 1 num 1
 plistalloc - loop fl 1 psi 2 ps-fl 1 num 1
 plistalloc - loop fl 1 psi 1 ps-fl 0 num 1
 plistalloc - loop fl 1 psi 0 ps-fl 0 num 1
 ...
 
 ... and so on, endlessly.
 
 The controller seems to request one additional page in the range from
 0x1000000 to 0x100000000, that is, a physical address between 16 MB and 4 GB.
 The page daemon is kicked but does nothing, because it knows nothing
 about the range in which the required memory must reside.
 This looks like a conceptual problem in 4.0 to me.
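
The stall can be illustrated with a small user-space model. This is only a
sketch under stated assumptions, not the UVM code: struct model_page,
alloc_in_range(), daemon_would_run() and model_wait_loop() are made-up names,
and the daemon is assumed to compare only the global free count with its
target. The figures (245 free pages, free-target 85, range 0x1000000 to
0x100000000) are the ones reported in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_MODEL 4096ULL
#define NFREE 245	/* free pages reported in the output above */
#define FREE_TARGET 85	/* free-target from "show uvmexp" */

struct model_page {		/* hypothetical, not a UVM type */
	uint64_t pa;		/* physical address of the page */
	bool free;
};

/* Take one free page with low <= pa < high; true on success. */
static bool
alloc_in_range(struct model_page *pg, int n, uint64_t low, uint64_t high)
{
	assert(low < high);
	for (int i = 0; i < n; i++) {
		if (pg[i].free && pg[i].pa >= low && pg[i].pa < high) {
			pg[i].free = false;
			return true;
		}
	}
	return false;
}

/* Assumed daemon policy: act only when the global free count is low. */
static bool
daemon_would_run(int nfree, int target)
{
	return nfree < target;
}

/*
 * Model the "goto again" loop.  All free pages start at first_free_pa.
 * Returns the round in which the allocation succeeded, or -1 if the
 * caller was still waiting after max_rounds.
 */
int
model_wait_loop(uint64_t first_free_pa, uint64_t low, uint64_t high,
    int max_rounds)
{
	static struct model_page pg[NFREE];

	for (int i = 0; i < NFREE; i++) {
		pg[i].pa = first_free_pa + (uint64_t)i * PAGE_SIZE_MODEL;
		pg[i].free = true;
	}
	for (int round = 0; round < max_rounds; round++) {
		if (alloc_in_range(pg, NFREE, low, high))
			return round;
		/* Kick the daemon.  It only compares counts, so it idles. */
		assert(!daemon_would_run(NFREE, FREE_TARGET));
	}
	return -1;
}
```

Calling model_wait_loop(0x100000000ULL, 0x1000000ULL, 0x100000000ULL, 1000)
places every free page at or above 4 GB and returns -1: nothing changes
between rounds, so the caller waits forever. With the free pages placed inside
the range, the first round succeeds.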
 
 Some additional information from DDB, after I got the system into the
 debugger ...
 
 Stopped in pid 28.1 (aiodoned) at       netbsd:cpu_Debugger+0x5: leave
 db{0}> trace
 cpu_Debugger() at netbsd:cpu_Debugger+0x5
 comintr() at netbsd:comintr+0x6e0
 Xintr_ioapic_edge4() at netbsd:Xintr_ioapic_edge4+0xd4
 --- interrupt ---
 _kernel_lock() at netbsd:_kernel_lock+0xad
 intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x16
 Xintr_ioapic_level10() at netbsd:Xintr_ioapic_level10+0xd8
 --- interrupt ---
 Xspllower() at netbsd:Xspllower+0xe
 DDB lost frame for netbsd:Xsoftclock+0x1a, trying 0xffff800056ff4db8
 Xsoftclock() at netbsd:Xsoftclock+0x1a
 --- interrupt ---
 0x56ff4e30:
 db{0}> show uvmexp
 Current UVM status:
    pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
    2039030 VM pages: 1276109 active, 623406 inactive, 2284 wired, 372 free
    pages  1470432 anon, 429766 file, 1604 exec
    freemin=64, free-target=85, wired-max=679676
    faults=6949833, traps=7652305, intrs=16596032, ctxswitch=30047566
    softint=6211694, syscalls=17975420, swapins=287, swapouts=306
    fault counts:
      noram=112, noanon=0, pgwait=11, pgrele=0
      ok relocks(total)=1590(1591), anget(retrys)=1256742(505), amapcopy=452013
      neighbor anon/obj pg=900772/6909020, gets(lock/unlock)=1606322/1086
      cases: anon=966859, anoncow=288741, obj=1336907, prcopy=269411, przero=865860
    daemon and swap counts:
      woke=67411, revs=5048, scans=1391672, obscans=1386891, anscans=2492
      busy=0, freed=1389383, reactivate=103, deactivate=2015339
      pageouts=89708, pending=1299675, nswget=99734
      nswapdev=1, swpgavail=6291455
      swpages=6291455, swpginuse=1389021, swpgonly=1289440, paging=0
 db{0}>
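
The figures in this output can be set against the requested range with a
little arithmetic. This is a sketch: the helper names are made up, and it
treats the range as if every page frame in it were backed by RAM, which real
machines do not guarantee.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12	/* pageshift=12 from "show uvmexp" */

/* Size in bytes of npages pages of 4096 bytes each. */
uint64_t
pages_to_bytes(uint64_t npages)
{
	return npages << PAGE_SHIFT_4K;
}

/* Number of whole page frames in the physical range [low, high). */
uint64_t
frames_in_range(uint64_t low, uint64_t high)
{
	assert(low <= high);
	/* Round low up to a page boundary, high down. */
	return (high >> PAGE_SHIFT_4K) - ((low + 4095) >> PAGE_SHIFT_4K);
}
```

With the values above, pages_to_bytes(2039030) is 8351866880 bytes (about
7.8 GB), while frames_in_range(0x1000000, 0x100000000) is 1044480. So at most
roughly half of the machine's pages can satisfy the controller's request.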
 
 I failed to get a stack-listing of the scsibus0 process
 
 ps output:
 9                0        0          0 2 0x20200    1         scsibus0
 
 but "trace/t 9" hangs up and I'm not able to get back into DDB
 
 
 For better understanding, here are the printf() statements I've inserted into
 uvm_pglistalloc_simple():
 
 XXX - start of modified routine ...
 
 static int
 uvm_pglistalloc_simple(int num, paddr_t low, paddr_t high,
      struct pglist *rlist, int waitok)
 {
          int fl, psi, s, error;
          struct vm_physseg *ps;
 int o_num = num;
 int xx = 0;
 
          /* Default to "lose". */
          error = ENOMEM;
 
 again:
          /*
           * Block all memory allocation and lock the free list.
           */
          s = uvm_lock_fpageq();
 
          /* Are there even any free pages? */
          if (uvmexp.free <= (uvmexp.reserve_pagedaemon +
              uvmexp.reserve_kernel))
                  goto out;
 
          for (fl = 0; fl < VM_NFREELIST; fl++) {
 #if (VM_PHYSSEG_STRAT == VM_PSTRAT_BIGFIRST)
                  for (psi = vm_nphysseg - 1 ; psi >= 0 ; psi--)
 #else
                  for (psi = 0 ; psi < vm_nphysseg ; psi++)
 #endif
                  {
                          ps = &vm_physmem[psi];
 if (xx != 0) printf("plistalloc - loop fl %d psi %d ps-fl %d num %d\n",
     fl, psi, ps->free_list, num);
 
                          if (ps->free_list != fl)
                                  continue;
 
                          num -= uvm_pglistalloc_s_ps(ps, num, low, high,
                              rlist);
                          if (num == 0) {
                                  error = 0;
                                  goto out;
                          }
                  }
 
          }
 
 out:
          /*
           * check to see if we need to generate some free pages waking
           * the pagedaemon.
           */
 
          uvm_kick_pdaemon();
          uvm_unlock_fpageq(s);
          if (error) {
                  if (waitok) {
                          /* XXX perhaps some time limitation? */
 #ifdef DEBUG
                          printf("pglistalloc waiting\n");
 #endif
 printf("plistalloc - waiting orig num %d - num %d low 0x%lx high 0x%lx - "
     "free %d pd_res %d kres %d\n",
     o_num, num, low, high, uvmexp.free, uvmexp.reserve_pagedaemon,
     uvmexp.reserve_kernel); xx=1;
                          uvm_wait("pglalloc");
                          goto again;
                  } else
                          uvm_pglistfree(rlist);
          }
 #ifdef PGALLOC_VERBOSE
          if (!error)
                  printf("pgalloc: %lx..%lx\n",
                         VM_PAGE_TO_PHYS(TAILQ_FIRST(rlist)),
                         VM_PAGE_TO_PHYS(TAILQ_LAST(rlist, pglist)));
 #endif
          return (error);
 }
 
 XXX - end of modified routine ...
 
 
 
 
 I think we need something like a list of ranges where memory is needed,
 so that the pagedaemon can free some memory in those ranges.
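
One way to picture this suggestion is sketched below. Everything in it is
hypothetical (struct range_req, range_req_add(), range_req_remove(),
page_wanted()); nothing like it exists in NetBSD 4.0 as far as this mail
shows, and a real implementation would need locking and a hook in the
pagedaemon's scan. The idea is that a waiter records the physical range it
needs, and the daemon can ask whether reclaiming a given page would help any
waiter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RANGE_REQS 8	/* arbitrary limit for the sketch */

struct range_req {		/* hypothetical: one waiter's constraint */
	uint64_t low;		/* lowest acceptable physical address */
	uint64_t high;		/* first address past the acceptable range */
	bool in_use;
};

static struct range_req range_reqs[MAX_RANGE_REQS];

/* Register a range; returns a slot index, or -1 if the table is full. */
int
range_req_add(uint64_t low, uint64_t high)
{
	assert(low < high);
	for (int i = 0; i < MAX_RANGE_REQS; i++) {
		if (!range_reqs[i].in_use) {
			range_reqs[i] = (struct range_req){ low, high, true };
			return i;
		}
	}
	return -1;
}

/* Drop a request once the waiter has got its pages. */
void
range_req_remove(int slot)
{
	assert(slot >= 0 && slot < MAX_RANGE_REQS);
	range_reqs[slot].in_use = false;
}

/* Would freeing the page at physical address pa help any waiter? */
bool
page_wanted(uint64_t pa)
{
	for (int i = 0; i < MAX_RANGE_REQS; i++) {
		if (range_reqs[i].in_use &&
		    pa >= range_reqs[i].low && pa < range_reqs[i].high)
			return true;
	}
	return false;
}
```

In this model the allocator would call range_req_add(low, high) before
sleeping and range_req_remove() after it succeeds, and a range-aware daemon
would prefer pages for which page_wanted() returns true.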
 
 At the moment, any system with more than 4 GB of RAM may run into this
 problem. The SCSI controller here is an Adaptec 29160A-R.
 
 I will remove 4 GB of memory from the system in order to get it stable,
 but that cannot be the final solution.
 
 
 The problem switches back from the "CPU cache problem" to my initial subject,
 at least for some supported controllers.
 
 best regards
 
 W. Stukenbrock
 
 Wolfgang Stukenbrock wrote:
 
 > The following reply was made to PR kern/39242; it has been noted by GNATS.
 > 
 > From: Wolfgang Stukenbrock 
 > <Wolfgang.Stukenbrock%nagler-company.com@localhost>
 > To: gnats-bugs%NetBSD.org@localhost
 > Cc: Simon Burge <simonb%NetBSD.org@localhost>, 
 > kern-bug-people%NetBSD.org@localhost
 > Subject: Re: kern/39242: NetBSD 4.0 will start busy-loop an hang on machines 
 > with more than 4 GB memory
 > Date: Wed, 30 Jul 2008 20:13:32 +0200
 > 
 >  Hi again,
 >  
>  I've seen response to my last mail directly to Simon, but I continue 
>  testing my system.
>  
>  Here is the patch I've added to /usr/src/sys/amd64/amd64/cpu.c:
 >  
 >  s012# rcsdiff -c -r1.1 c*
 >  ===================================================================
 >  RCS file: RCS/cpu.c,v
 >  retrieving revision 1.1
 >  diff -c -r1.1 cpu.c
 >  *** cpu.c       2008/07/29 07:28:03     1.1
 >  --- cpu.c       2008/07/29 08:05:21
 >  ***************
 >  *** 217,222 ****
 >  --- 217,239 ----
 >                           tcolors /= cai->cai_associativity;
 >                   }
 >                   ncolors = max(ncolors, tcolors);
 >  +               /*
 >  +                * If the desired number of colors is not a power of
 >  +                * two, it won't be good.  Find the greatest power of
 >  +                * two which is an even divisor of the number of colors,
 >  +                * to preserve even coloring of pages.
 >  +                */
 >  +               if (ncolors & (ncolors - 1) ) {
 >  +                       int try, picked = 1;
 >  +                       for (try = 1; try < ncolors; try *= 2) {
 >  +                               if (ncolors % try == 0) picked = try;
 >  +                       }
 >  +                       if (picked == 1) {
>  +                               panic("desired number of cache colors %d is "
>  +                               " > 1, but not even!", ncolors);
 >  +                       }
 >  +                       ncolors = picked;
 >  +               }
 >           }
 >  
 >           /*
 >  
 >  
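
The rounding step in the patch can be tried in isolation. The sketch below
mirrors the patch's loop in a standalone function; the name
largest_pow2_divisor() is made up, and the upper bound in the assertion is an
addition that only keeps the doubling from overflowing an int.

```c
#include <assert.h>

/*
 * Return the largest power of two that divides n evenly.
 * Illustrative only; in the patch this logic is inline in cpu.c.
 */
int
largest_pow2_divisor(int n)
{
	int picked = 1;

	assert(n > 0 && n < (1 << 30));	/* keeps p2 *= 2 in range */
	for (int p2 = 1; p2 <= n; p2 *= 2) {
		if (n % p2 == 0)
			picked = p2;
	}
	return picked;
}
```

For 96 colors this returns 32, which matches the two color counts mentioned in
this mail. For an odd value such as 7 it returns 1, which is the case in which
the patch panics.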
>  Just a few minutes ago I got two new kernel crashes.
>  
>  1. The kernel process [scsibus0] starts looping and sometimes sleeps in
>  pglalloc. After reinstalling the system with the patch above, there was
>  no DDB in the kernel, so I could not get any other information.
>  
>  2. Now I have DDB in the kernel and tried to reproduce the problem, but it
>  crashes in the pagedaemon before reaching this state ...
 >  Some output from the console below:
 >  
 >  
 >  uvm_fault(0xffffffff80628800, 0x0, 1) -> e
 >  kernel: page fault trap, code=0
>  Stopped in pid 26.1 (pagedaemon) at     netbsd:uvm_rb_insert+0x37: movq    0x40(%rax),%rax
 >  db{0}> trace
 >  uvm_rb_insert() at netbsd:uvm_rb_insert+0x37
 >  uvm_map_enter() at netbsd:uvm_map_enter+0x290
 >  uvm_map() at netbsd:uvm_map+0xfe
 >  uvm_pagermapin() at netbsd:uvm_pagermapin+0x92
 >  uvm_swap_io() at netbsd:uvm_swap_io+0x3c
 >  swapcluster_flush() at netbsd:swapcluster_flush+0x55
 >  uvm_pageout() at netbsd:uvm_pageout+0x42b
 >  db{0}> show uvmexp
 >  Current UVM status:
 >     pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
 >     2039030 VM pages: 1333964 active, 651446 inactive, 2248 wired, 5 free
 >     pages  1674503 anon, 311641 file, 1580 exec
 >     freemin=64, free-target=85, wired-max=679676
 >     faults=3420439, traps=3975116, intrs=8916183, ctxswitch=14697719
 >     softint=710475, syscalls=12629494, swapins=177, swapouts=210
 >     fault counts:
 >       noram=5, noanon=0, pgwait=0, pgrele=0
 >       ok relocks(total)=1068(1069), anget(retrys)=73451(286), amapcopy=31017
 >       neighbor anon/obj pg=65282/474220, gets(lock/unlock)=116248/783
 >       cases: anon=51370, anoncow=21184, obj=96126, prcopy=20121, przero=63798
 >     daemon and swap counts:
 >       woke=32081, revs=4228, scans=1204885, obscans=1161315, anscans=2987
 >       busy=0, freed=1164012, reactivate=128, deactivate=1857899
 >       pageouts=75209, pending=1088875, nswget=78493
 >       nswapdev=1, swpgavail=6291455
 >       swpages=6291455, swpginuse=1164021, swpgonly=1085489, paging=66
 >  db{0}>
 >  
 >  
>  This patch seems to enable the system to work with the 6 MB cache of the
>  E3110 CPU, but the kernel is not really stable at all.
>  Any idea? What should I try next?
 >  
>  By the way, "vmstat -s | grep colo" reports 32 colors now.
>  The system was busy again when the crash happened: a raidframe sync was
>  running on one SATA RAID and one SCSI RAID, and I had transferred something
>  around 12 GB into /tmp (tmpfs), so nearly 5 GB of the 24 GB swap was in use.
>  The kernel crashed at the moment I tried to copy one of the
>  archives from /tmp to a filesystem on the RAID that was just syncing.
 >  
>  OK, that may be a lot of work for the system, and it may get slow, but it
>  must not crash!
>  I failed to get a core image this time, sorry.
>  "continue" does not work, and the system freezes in sync from DDB ...
 >  
>  By the way: I've saved the 8 GB core image from the crash below, but
>  that one still has 96 page colors active, because the patch was missing.
>  I think it makes no sense to look at it at all.
>  If nobody comes to me in the next few days with a request for it, I will
>  remove it.
 >  
 >  
 >  W. Stukenbrock
 >  
 >  Wolfgang Stukenbrock wrote:
 >  
 >  > Hi,
 >  > 
>  > I've taken the diffs from x86/x86/cpu.c rev 1.32 and merged them into
>  > amd64/amd64/cpu.c, since there is no x86/x86/cpu.c in 4.0 ...
 >  > 
 >  > It looks like it will solve the problem.
 >  > Thanks
 >  > 
 >  > The system will no longer freeze after using something around 4 GB 
 >  > memory (of the 8 GB installed ...).
 >  > 
>  > But I've noticed that under (very) heavy load the system will panic
>  > with "out of memory" in the pagedaemon.
>  > I have an 8 GB core file here. Is anybody interested in analysing it? I
>  > think it will not pass through any mailing system (the bzip2-compressed
>  > version is still larger than 2 GB, and compression is still running).
 >  > 
>  > In DDB exactly one page was reported free in "show uvmexp";
>  > the output follows:
 >  > 
 >  > db{1}> show uvmexp
 >  > Current UVM status:
 >  >   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
 >  >   2039036 VM pages: 1119946 active, 547095 inactive, 4031 wired, 1 free
 >  >   pages  1328086 anon, 341555 file, 1764 exec
 >  >   freemin=64, free-target=85, wired-max=679678
 >  >   faults=13246446, traps=18516116, intrs=17316633, ctxswitch=64031864
 >  >   softint=6934242, syscalls=65761334, swapins=431, swapouts=1372
 >  >   fault counts:
 >  >     noram=2008, noanon=0, pgwait=0, pgrele=0
>  >     ok relocks(total)=2409(2418), anget(retrys)=12931894(1119), amapcopy=365963
 >  >     neighbor anon/obj pg=702333/4863720, gets(lock/unlock)=1135392/1299
>  >     cases: anon=6302100, anoncow=256103, obj=947341, prcopy=188007, przero=2437247
 >  >   daemon and swap counts:
 >  >     woke=9819, revs=6729, scans=2176414, obscans=1733276, anscans=143716
 >  >     busy=0, freed=1876669, reactivate=14780, deactivate=3760043
 >  >     pageouts=9330, pending=134669, nswget=1111
 >  >     nswapdev=1, swpgavail=6291455
 >  >     swpages=6291455, swpginuse=143978, swpgonly=142546, paging=323
 >  > db{1}> trace
 >  > cpu_Debugger() at netbsd:cpu_Debugger+0x5
 >  > panic() at netbsd:panic+0x1f5
 >  > pmap_growkernel() at netbsd:pmap_growkernel+0x446
 >  > uvm_map_prepare() at netbsd:uvm_map_prepare+0x371
 >  > uvm_map() at netbsd:uvm_map+0xae
 >  > uvm_km_alloc() at netbsd:uvm_km_alloc+0x73
 >  > vmem_xalloc() at netbsd:vmem_xalloc+0x130
 >  > vmem_alloc() at netbsd:vmem_alloc+0x86
 >  > amap_alloc() at netbsd:amap_alloc+0xdb
 >  > uvm_map_enter() at netbsd:uvm_map_enter+0x24b
 >  > uvm_map() at netbsd:uvm_map+0xfe
 >  > sys_obreak() at netbsd:sys_obreak+0x106
 >  > syscall_plain() at netbsd:syscall_plain+0x1fc
 >  > uvm_fault(0xffff8000a666f440, 0x6907000, 1) -> e
 >  > kernel: page fault trap, code=0
 >  > Faulted in DDB; continuing...
 >  > db{1}> continue
 >  > syncing disks... 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 giving up
 >  > 
 >  > dumping to dev 18,1 offset 33560487
 >  > dump 8189 8188 8187 8186 8185 8184 8183 8182 8181 8180 8179 8178 8177 
 > .....
 >  > 
>  > I started more than 1000 processes to see what happens when "some"
>  > memory is needed for processes. (cat ... | dd obs=... | dd obs=... | ... 
>  >  >/dev/null)
>  > The system reduced the amount of memory used by the file cache to
>  > about 1.2 GB of the 8 GB main memory, as expected.
>  > At the time of the crash around 800 MB of the 24 GB swap
>  > space was in use.
 >  > 
>  > I know that this is not related to the previous problem. Does it make
>  > sense to create another bug report for that? I'm not sure about it.
 >  > 
 >  > W. Stukenbrock
 >  > 
 >  > Simon Burge wrote:
 >  > 
 >  >> Wolfgang Stukenbrock wrote:
 >  >>
 >  >>
 >  >>> The following reply was made to PR kern/39242; it has been noted by 
 >  >>> GNATS.
 >  >>>
 >  >>> From: Wolfgang Stukenbrock 
 > <Wolfgang.Stukenbrock%nagler-company.com@localhost>
 >  >>> To: gnats-bugs%NetBSD.org@localhost
 >  >>> Cc: Subject: Re: kern/39242: NetBSD 4.0 will start busy-loop an hang 
 >  >>> on machines with more than 4 GB memory
 >  >>> Date: Tue, 29 Jul 2008 09:07:09 +0200
 >  >>>
 >  >>> Hi,
 >  >>>
>  >>> Yes, it is an E3110 CPU with 6 MB cache.
>  >>> Which files do I need to take from -current to integrate the changes
>  >>> into my 4.0 version of NetBSD?
>  >>> You talk about the "cache detection stuff" AND the file named below.
>  >>>
>  >>> I think it would be a really great idea to bring this fix into the
>  >>> releases as soon as possible.
>  >>> This CPU is "very" cheap compared to the other ones (at least in
>  >>> Germany) and is therefore very attractive for new systems.
 >  >>>
 >  >>
 >  >> Does rev 1.32 of sys/arch/x86/x86/cpu.c apply cleanly to netbsd-4 ?  I
 >  >> don't recall which bits of which files moved around with x86 recently.
 >  >> If so, that at least guarantees that a bogus number doesn't get passed
 >  >> deeper into UVM, and that'll be enough for netbsd-4.
 >  >>
 >  >> Note also the problem was to do with "half memory used", not "4G of
 >  >> memory used".
 >  >>
 >  >> Cheers,
 >  >> Simon.
 >  >>
 >  > 
 >  > 
 >  
 >  
 > 
 
 

