Subject: Re: "diff" loses on large files?
To: Mike Cheponis <mac@Wireless.Com>
From: Chuck Silvers <chuq@chuq.com>
List: port-i386
Date: 07/12/2001 06:44:47
hi,

it's been known for quite some time that GNU diff's algorithm isn't
so great for very large files; I thought there was even a PR about it
(though I can't find it now).
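
(just an untested aside: if all you need is to see whether and where the
two files differ, cmp(1) reads them as streams and doesn't have to pull
both files into memory the way diff does)

$ cmp bad.odx ok.odx            # reports the first differing byte
$ cmp -l bad.odx ok.odx | head  # byte offsets of the first few differences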

however, the panic is something that definitely needs to be fixed,
and I'll look into that in 10 days when I get back from vacation
(if no one else fixes it while I'm gone).

-Chuck


On Thu, Jul 12, 2001 at 12:16:14AM -0700, Mike Cheponis wrote:
> > Date: Thu, 12 Jul 2001 12:30:58 +0200 (CEST)
> > From: Wojciech Puchar <wojtek@wojtek.3miasto.net>
> > To: Mike Cheponis <mac@Wireless.Com>
> > Subject: Re: "diff" loses on large files?
> >
> > > cpu0: AMD Athlon Model 4 (Thunderbird) (686-class), 1100.12 MHz
> > > total memory = 511 MB
> > > avail memory = 468 MB
> > > using 6574 buffers containing 26296 KB of memory
> > >
> > > $ ll *odx
> > > 3065307 -rw-r--r--  1 mac  user  661,969,123 Jul 11 21:45 bad.odx
> > > 3065315 -rw-r--r--  1 mac  user  661,969,123 Jul 11 21:45 ok.odx
> > >
> > > $ diff -H  *odx
> > > diff: memory exhausted
> >
> > ulimit -m / ulimit -d should do
> 
> Well.....
> 
> $ su
> Password:
> # ulimit -a
> cpu time (seconds)         unlimited
> file size (blocks)         unlimited
> data seg size (kbytes)     131072
> stack size (kbytes)        2048
> core file size (blocks)    unlimited
> resident set size (kbytes) 506156
> locked-in-memory size (kb) 168718
> processes                  80
> file descriptors           64
> # ulimit -d 999999
> # ulimit -a
> cpu time (seconds)         unlimited
> file size (blocks)         unlimited
> data seg size (kbytes)     999999
> stack size (kbytes)        2048
> core file size (blocks)    unlimited
> resident set size (kbytes) 506156
> locked-in-memory size (kb) 168718
> processes                  80
> file descriptors           64
> # diff *odx
> 
> At this point, the kernel panics:
> 
> uvm_fault (0xe3846468, 0x0, 0, 3) -> 1
> kernel: page fault trap, code=0
> stopped in sendmail at uvm_swap_markbad+0xf   addl %ebx, 0x24(%eax)
> 
> Trace produces:
> 
> uvm_swap_markbad+0xf
> uvmfault_anonget+0x262
> uvm_fault+0x943
> trap() at trap+0x413
> -- trap (number 6) --
> 0x480ff95c
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Now, I did set the data seg size > my available memory.  But should this
> cause a panic?  Panicking like that seems a rather rude thing to do!
> 
> And, to get back to my original issue: diff is braindead.  It should
> -automatically- grab more memory if it can, and, if it can't, it should
> figure out how to use /tmp or some other part of the filesystem if its
> algorithms are so pathetic that it needs so darn much memory.
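
(an untested sketch, not a fix: since the two files are exactly the same
size, a stopgap is to chop both into equal-sized pieces and diff the
pieces pair by pair, so no single diff run has to hold 600+ MB at once;
the 64m chunk size and the "bad.chunk." / "ok.chunk." prefixes are
arbitrary, and the reported line numbers will be relative to each chunk)

$ split -b 64m bad.odx bad.chunk.
$ split -b 64m ok.odx  ok.chunk.
$ for f in bad.chunk.*; do diff "$f" "ok.chunk.${f#bad.chunk.}"; done

(this only lines up because nothing has been inserted or deleted between
the two files; any difference in length would throw the chunks out of
alignment)
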
> 
> Thanks again -Mike
> 
> 
> p.s. At first I typed in "ulimit -m / ulimit -d should do" at the command
> line (it's late...).
> 
> This had the effect of interpreting "-m /" as "-m 0" and therefore, after
> that command, no other command could be executed.  This would appear to
> be another serious bug.  (fwiw, I'm using "zsh")