Subject: Re: bin/10625: /usr/bin/cmp is unable to compare rather large files
To: None <netbsd-bugs@netbsd.org>
From: James Chacon <jchacon@genuity.net>
List: netbsd-bugs
Date: 08/19/2000 21:38:40
I know this thread died a while back but I'm about to send-pr the same
thing on tail so I'm picking up part of this now :-)

>> >of the process's address space (the manual page makes no obvious claims
>> >along these lines)?
>> 
>> It really has to fail, since there is no way to give you back a
>> char * that can access data beyond the end of your address space.
>> And in fact it is more hairy than that because it must find a free
>> contiguous region in your address space to hand back to you.  This
>> may be the reason that 2.6GB of files couldn't be mapped.
>
>But does it actually *really* always fail when there's no more room in
>your maximum allowed process size?  I've not tested it yet, and I am
>very leary of mmap() given previous bugs it has suffered.

Yes. Here's the relevant comment from uvm_mmap.c:


        /*
         * XXX (in)sanity check.  We don't do proper datasize checking
         * XXX for anonymous (or private writable) mmap().  However,
         * XXX know that if we're trying to allocate more than the amount
         * XXX remaining under our current data size limit, _that_ should
         * XXX be disallowed.
         */

After these checks, sys_mmap calls uvm_mmap, which sets up some things
(and does some more sanity checking) and finally calls uvm_map to attempt the
actual mapping. sys_mmap is quite small and ends up in the generic uvm mapping
routines to do the actual work, so I'm unsure what you're leery of here. Bad
userland code (like presuming you can mmap() anything) doesn't mean the
implementation itself is the issue.
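
To make the userland side concrete, here is a minimal sketch of what I mean
by not presuming mmap() will work: check for MAP_FAILED and fall back to
plain read(). The helper name count_newlines and the 64K buffer size are
made up for illustration; this is not what cmp or tail actually do.

```c
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count '\n' bytes in the file behind fd.
 * Tries to mmap() the whole file first; if that is impossible or
 * fails, falls back to a read() loop.  Returns -1 on error.
 */
long
count_newlines(int fd)
{
	struct stat st;
	long count = 0;

	if (fstat(fd, &st) == -1)
		return (-1);

	/* Only try mmap() if the size is non-zero and fits in a size_t. */
	if (st.st_size > 0 && (uintmax_t)st.st_size <= SIZE_MAX) {
		size_t len = (size_t)st.st_size;
		char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

		if (p != MAP_FAILED) {
			for (size_t i = 0; i < len; i++)
				if (p[i] == '\n')
					count++;
			(void)munmap(p, len);
			return (count);
		}
		/* mmap() failed (e.g. no address space): fall through. */
	}

	/* Fallback: sequential read() from the start of the file. */
	if (lseek(fd, 0, SEEK_SET) == -1)
		return (-1);

	char buf[65536];
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		for (ssize_t i = 0; i < n; i++)
			if (buf[i] == '\n')
				count++;

	return (n == -1 ? -1 : count);
}
```

The fallback is slower but has no dependence on how much address space is
free, which is the whole point.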

In uvm_map, if space can't be found for the mapping, it bails:

        if ((prev_entry = uvm_map_findspace(map, *startp, size, startp, 
            uobj, uoffset, flags & UVM_FLAG_FIXED)) == NULL) {
                UVMHIST_LOG(maphist,"<- uvm_map_findspace failed!",0,0,0,0);
                vm_map_unlock(map);
                return (KERN_NO_SPACE);

uvm_map_findspace fails almost right away if you ask for a mapping that
extends off the end of the available address space.

>
>If it does fail then this failure case should be documented (more?)
>clearly in mmap(2).

     [ENOMEM]      MAP_FIXED was specified and the addr parameter wasn't
                   available.  MAP_ANON was specified and insufficient memory
                   was available.


How is that not clear? This looks like any other man page with regard to
error codes and caveats. It should be fairly obvious to folks that attempting
to map a file/memory area larger than the virtual address space isn't going to
work. As a matter of fact, I'd be highly unhappy if the interface did rolling
windows or some other nonsense in an attempt to accommodate me. That would
certainly violate the principle of least surprise.

>
>Is there a regression test to make sure it continues to fail when it
>should?  It doesn't appear src/regress/sys/uvm/mmap/mmap.c has a test
>for this case....

It probably can't hurt to create one, but as I point out above, if mmap
somehow doesn't fail here the whole system has issues, since sys_mmap depends
on the standard uvm_map calls to actually perform the work.

As I mentioned at the top, I'm about to send-pr the exact same condition with
tail (and probably anything else in the source tree using mmap()). I have a
2.3G log file that I attempt to tail, and it bails with:

tail: /usr/local/sup/logs/fetch.log.071100.1: File too large

Unfortunately the tail code doesn't already have stdio fallback routines in
it. Fixing this requires either writing such file routines or rewriting the
code to handle moving mmap windows. I'm going to go down the latter route, as
I'm betting I'll get reuse out of it in other cases that crop up.
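
Roughly what I have in mind by moving mmap windows, as a sketch and not the
eventual patch: map one bounded, page-aligned chunk at a time, process it,
unmap it, and slide forward. The name count_newlines_windowed and the
window parameter are invented for this example, and it scans forward for
simplicity where tail would really want to walk backwards from the end.

```c
#include <assert.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count '\n' bytes in the file behind fd while
 * never mapping more than one window at a time.  The window is
 * rounded up to a multiple of the page size so every mmap() offset
 * stays page aligned.  Returns -1 on error.
 */
long
count_newlines_windowed(int fd, size_t window)
{
	struct stat st;
	long pagesize = sysconf(_SC_PAGESIZE);
	long count = 0;

	assert(window > 0);
	if (pagesize <= 0 || fstat(fd, &st) == -1)
		return (-1);

	/* Round the window up to a whole number of pages. */
	size_t pg = (size_t)pagesize;
	window = (window + pg - 1) / pg * pg;

	for (off_t off = 0; off < st.st_size; off += (off_t)window) {
		off_t left = st.st_size - off;
		/* The last window may be shorter than the rest. */
		size_t len = left < (off_t)window ? (size_t)left : window;
		char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);

		if (p == MAP_FAILED)
			return (-1);

		for (size_t i = 0; i < len; i++)
			if (p[i] == '\n')
				count++;

		if (munmap(p, len) == -1)
			return (-1);
	}
	return (count);
}
```

Address space use is bounded by the window size no matter how big the file
is, so a 2.3G log is no different from a 2.3K one.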

James