Subject: Re: semantics of off_t changes
To: None <Matthieu.Herrb@laas.fr>
From: Chris G. Demetriou <cgd@postgres.Berkeley.EDU>
List: current-users
Date: 04/11/1994 20:04:19
>  The fact that off_t is 64 bits while size_t is still 32 bits means
>  that a file cannot always be loaded in memory in one read().

yup.  as a matter of fact, given that 32 bits is the largest address
space available to programs on the i386, you'll *never* be able to
have a process malloc more than that, so some files will be just
too big.

> This is quite awful ! X sources (and other) are full of code like:
> 
> 	fstat(fd, &st);
> 	buf = malloc(st.st_size);
> 	if (buf == NULL) {
> 	   /* error */
> 	   ...
> 	}
> 	if (read(fd, buf, st.st_size) != st.st_size) {
> 	   /* error */
> 	   ....
> 	}
> 
> which obviously can fail if the file is large enough. 

Right.  If the file is too large, the user's most likely done
something stupid, and will be squished by the error returned
from malloc() (if the system is smart enough to do that).
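
As a rough illustration of the failure mode (not code from X or from
this thread), the size from fstat() can be range-checked before it
ever reaches malloc().  The helper name size_fits is made up for this
sketch, and it assumes a C99 <stdint.h> for SIZE_MAX and uintmax_t:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 and stores the converted value in
 * *out if a file size held in an off_t can be represented as a
 * size_t (and so handed to malloc()), 0 otherwise.
 */
static int
size_fits(off_t size, size_t *out)
{
	assert(out != NULL);

	if (size < 0)
		return 0;	/* negative sizes are never valid */
	if ((uintmax_t)size > (uintmax_t)SIZE_MAX)
		return 0;	/* would be truncated by the conversion */
	*out = (size_t)size;
	return 1;
}
```

A caller would test size_fits(st.st_size, &len) and treat a zero
return as "file too big" before calling malloc(len).  Since read()
returns a signed count, a stricter version might compare against
SSIZE_MAX instead.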

> In the case of X, I suppose that (st.st_size > 2^31) is an error
> condition anyway, but in the general case, such code has to be replaced
> by an mmap() call (which does take an off_t size).

No, in addition to being incorrect, that's not really sensible.

First of all, mmap looks like:

     caddr_t
     mmap(caddr_t addr, int len, int prot, int flags, int fd, off_t offset)

i.e. the len is an int (but should really be a size_t -- on an i386,
the same)...

Second, given that, and the physical realities involved, you
*still* can't allocate more of an address space than size_t (or
e.g. caddr_t) will allow, so it's not possible anyway.

In most cases, programs which decide to malloc and read in whole
files without regard for the file size are broken.  For (small)
files with a predictable length, this is a reasonable technique,
but for most purposes it's just wrong...
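
A minimal sketch of the alternative, assuming the program can process
its input piecewise: read through a fixed-size buffer, so that memory
use does not depend on the file size.  The name consume_fd and the
8192-byte buffer are arbitrary choices for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE 8192	/* arbitrary buffer size for this sketch */

/*
 * Hypothetical helper: reads fd to end-of-file through a fixed-size
 * buffer and returns the number of bytes seen, or -1 on a read error.
 * A real program would process the first n bytes of buf at the point
 * where the running total is updated.
 */
static off_t
consume_fd(int fd)
{
	char buf[CHUNK_SIZE];
	off_t total = 0;
	ssize_t n;

	assert(fd >= 0);

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == 0)
			break;			/* end of file */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		total += n;			/* off_t holds the full count */
	}
	return total;
}
```

The running total is kept in an off_t, so it stays correct even when
the file is larger than anything size_t could describe.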


chris
