Re: lib/43310: dirent(5) standard compliance

To: lib-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,jruohonen%iki.fi@localhost
Subject: Re: lib/43310: dirent(5) standard compliance
From: Matthew Mondor <mm_lists%pulsar-zone.net@localhost>
Date: Sat, 15 May 2010 23:45:01 +0000 (UTC)

The following reply was made to PR lib/43310; it has been noted by GNATS.

From: Matthew Mondor <mm_lists%pulsar-zone.net@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: lib/43310: dirent(5) standard compliance
Date: Sat, 15 May 2010 19:41:03 -0400

 On Sat, 15 May 2010 10:35:02 +0000 (UTC)
 Jukka Ruohonen <jruohonen%iki.fi@localhost> wrote:
 
 > The following reply was made to PR lib/43310; it has been noted by GNATS.
 > 
 > From: Jukka Ruohonen <jruohonen%iki.fi@localhost>
 > To: gnats-bugs%NetBSD.org@localhost
 > Cc: 
 > Subject: Re: lib/43310: dirent(5) standard compliance
 > Date: Sat, 15 May 2010 13:26:38 +0300
 > 
 >  Actually this appears to be more involved.
 >  
 >  The d_name is statically allocated:
 >  
 >  > struct dirent {
 >  >         ino_t d_fileno;                 /* file number of entry */
 >  >         uint16_t d_reclen;              /* length of this record */
 >  >         uint16_t d_namlen;              /* length of string in d_name */
 >  >         uint8_t  d_type;                /* file type, see below */
 >  > #if defined(_NETBSD_SOURCE)
 >  > #define MAXNAMLEN       511
 >  >         char    d_name[MAXNAMLEN + 1];  /* name must be no longer than this */
 >  > #else
 >  >         char    d_name[511 + 1];        /* name must be no longer than this */
 >  > #endif
 >  > };
 >  
 >  While the standard specifically stresses that:
 >  
 >      "The array of char d_name is not a fixed size. Implementations may
 >       need to declare struct dirent with an array size for d_name of 1,
 >       but the actual number of characters provided matches (or only
 >       slightly exceeds) the length of the filename."
 >  
 >  To my reading this means that some implementations may define only a pointer
 >  to d_name and then allocate it dynamically.
 >  
 >  Consider what the Linux manual page recommends:
 >  
 >         "Since POSIX.1 does not specify the size of the d_name field, and
 >         other nonstandard fields may precede that field within the dirent
 >         structure, portable applications that use readdir_r() should
 >         allocate the buffer whose address is passed in entry as follows:
 >  
 >             len = offsetof(struct dirent, d_name) +
 >                       pathconf(dirpath, _PC_NAME_MAX) + 1
 >             entryp = malloc(len);
 >  
 >      (POSIX.1 requires that d_name is the last field in a struct dirent.)"
 >  
 >  And indeed glibc goes and defines this as:
 >  
 >      struct linux_dirent64 {
 >              u64             d_ino;
 >              s64             d_off;
 >              unsigned short  d_reclen;
 >              unsigned char   d_type;
 >              char            d_name[0];
 >      };
 >  
 >  I do not know how severely this affects portability, but at least the
 >  readdir_r(3) function, where the dirent is supplied by the caller, is
 >  affected.
 
 However, I still see this on a not-so-old Linux system here:
 
 #ifdef __USE_LARGEFILE64
 struct dirent64
   {
     __ino64_t d_ino;
     __off64_t d_off;
     unsigned short int d_reclen;
     unsigned char d_type;
     char d_name[256];           /* We must not include limits.h! */
   };
 #endif
 
 and I know of plenty of code that assumes the following should work:
 
 [...]
 {
     struct dirent ent, *entp;
 
     if ((err = readdir_r(dir, &ent, &entp)) == 0) {
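
 For comparison, here is a minimal sketch of the allocation that the
 quoted Linux manual page recommends, written as a helper.  The names
 dirent_buf_size() and alloc_dirent() are hypothetical, the fallback of
 255 when pathconf(2) reports no limit is an assumption, and rounding
 the result up to sizeof(struct dirent) is an extra precaution that the
 manual page text does not ask for.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: number of bytes needed for a caller-supplied
 * struct dirent that can hold any name found under dirpath.
 */
size_t dirent_buf_size(const char *dirpath)
{
    long name_max;
    size_t len;

    assert(dirpath != NULL);
    name_max = pathconf(dirpath, _PC_NAME_MAX);
    if (name_max < 0)
        name_max = 255;     /* assumption: fallback when no limit is reported */

    /* Formula from the manual page: header fields + longest name + NUL. */
    len = offsetof(struct dirent, d_name) + (size_t)name_max + 1;

    /* Extra precaution: never return less than the declared struct size. */
    return len < sizeof(struct dirent) ? sizeof(struct dirent) : len;
}

/* Hypothetical helper: allocate such a buffer; the caller must free(3) it. */
struct dirent *alloc_dirent(const char *dirpath)
{
    return malloc(dirent_buf_size(dirpath));
}
```

 Code written this way does not depend on how large the implementation
 declares d_name, which is the portability concern raised above.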
 
 To me it seems to make sense to have this API, and what I understand
 from the Open Group text is not that we're obliged to dynamically
 allocate the d_name string, but that some implementations may do so
 internally (i.e. provide a string from an existing buffer filled by
 getdents(2) or equivalent), and that strlen(3) should be used on the
 string rather than memcpy(3) with NAME_MAX...  so that simply changing
 NAME_MAX would make sense.  NetBSD provides the nonstandard d_namlen
 extension, which makes it possible to avoid the strlen(3) overhead,
 though.
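
 To illustrate the strlen(3) point, here is a minimal sketch of copying
 an entry name by its actual length instead of a fixed NAME_MAX-sized
 memcpy(3).  The helper name copy_entry_name() and its return
 convention are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copy the name of a directory entry into dst,
 * which has room for cap bytes.  The length is measured with strlen(3)
 * rather than assumed to be NAME_MAX.  Returns the name length, or -1
 * if the name plus its terminating NUL does not fit.
 */
ssize_t copy_entry_name(const struct dirent *ent, char *dst, size_t cap)
{
    size_t len;

    assert(ent != NULL && dst != NULL);

    /* Where the d_namlen extension exists, it already holds this value. */
    len = strlen(ent->d_name);
    if (len + 1 > cap)
        return -1;

    memcpy(dst, ent->d_name, len + 1);  /* copy the NUL as well */
    return (ssize_t)len;
}
```

 A caller that only ever reads strlen(d_name) + 1 bytes keeps working
 whether d_name is declared with 256, 512 or 1 elements.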
 -- 
 Matt
