NetBSD-Bugs archive


kern/48959: Misrepresentation of files of 4 GiB or larger in cd9660



>Number:         48959
>Category:       kern
>Synopsis:       Misrepresentation of files of 4 GiB or larger in cd9660
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jul 02 09:50:00 +0000 2014
>Originator:     Thomas Schmitt
>Release:        6.99.44
>Organization:
>Environment:
NetBSD netbsdcur.local 6.99.44 NetBSD 6.99.44 (GENERIC) #32: 
Wed Jul  2 07:55:04 UTC 2014  ...:/usr/obj/sys/arch/i386/compile/GENERIC i386

>Description:
Data files of size 4 GiB or larger have to be represented in ISO 9660
as multiple file sections, a form which the cd9660 filesystem driver
misrepresents.

  netbsd# mount_cd9660 /dev/cd0a /mnt/iso
  netbsd# ls -l /mnt/iso/my
  total 16777208
  -rw-r--r--  1 thomas  dbus  4294965248 May  6 15:30 large_file
  -rw-r--r--  1 thomas  dbus  4294965248 May  6 15:30 large_file

Whereas the Debian 6 host operating system of my NetBSD VM says

  $ ls -l /mnt/iso/my
  total 4227906
  -rw-r--r-- 1 thomas thomas 4329375744 May  6 17:30 large_file

Mounting without Rock Ridge POSIX names yields a different result:

  netbsd# mount_cd9660 -o norrip /dev/cd0a /mnt/iso
  netbsd# ls -l /mnt/iso/my
  total 67208
  -r-xr-xr-x  1 root  wheel  34410496 May  6 15:30 large_file

which is indeed the missing end piece of the two identical start
pieces shown with Rock Ridge.
(The reason is unification code which is a precaution for handling
 the rarely used ISO 9660 feature of multiple File Versions.)
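
The sizes shown above add up: 4294965248 + 34410496 = 4329375744, the
size reported by Debian. The following is a minimal sketch of that
arithmetic, not code from the changeset. It assumes that the ISO writer
fills each section up to the largest 2048-byte aligned value that fits
into the 32-bit Data Length field of a directory record (2^32 - 2048).
The names SECTION_MAX and split_into_sections() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: largest block-aligned size that fits the 32-bit Data Length
 * field with 2048-byte blocks, i.e. 2^32 - 2048 = 4294965248. */
#define SECTION_MAX UINT64_C(0xFFFFF800)

/*
 * Split file_size into section sizes, each at most SECTION_MAX bytes.
 * Returns the number of sections written to sizes[], or 0 if more than
 * max_sections would be needed.
 */
static size_t
split_into_sections(uint64_t file_size, uint64_t *sizes, size_t max_sections)
{
	size_t n = 0;

	assert(sizes != NULL);
	while (file_size > SECTION_MAX && n < max_sections) {
		sizes[n++] = SECTION_MAX;	/* a full section */
		file_size -= SECTION_MAX;
	}
	if (n == max_sections)
		return 0;			/* no room for the last section */
	sizes[n++] = file_size;			/* last, possibly short section */
	return n;
}
```

For a file of 4329375744 bytes this yields two sections of 4294965248
and 34410496 bytes, which matches the two sizes seen with and without
Rock Ridge.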

>How-To-Repeat:
The problem is demonstrated by

    http://scdbackup.webframe.org/large.iso.bz2

4470 bytes, MD5 7d78dc3efaec8ea3f1801335329f410d.
It inflates to 4,329,897,984 bytes.

Try this only if the fix for kern/48787 is applied.
Otherwise you will not get rid of the /mnt/iso mount point until reboot!

>Fix:
I have now implemented a changeset which corrects the above problem.

An auxiliary text is available at
  http://scdbackup.webframe.org/cd9660_level3_notes.txt
It assesses ISO 9660 specs and current implementation, motivates 
the proposal for a model change in struct iso_node, and publishes
my (now nearly fulfilled) todo sheet.

The changeset anticipates the adoption of PR 48808 (mount option -s).

The goal is to let cd9660 recognize files with multiple file sections
and represent their multiple directory records as a single vnode with
a uniform byte space.

There shall be no duplicate filenames presented to VFS. If files with
equal names are found, then only the last one of those with the highest
ISO 9660 version number will be visible.
This guarantee depends on properly sorted ISO 9660 directories.
The decision which intervals of directory records form a single file
depends on properly set Multi-Extent flag bits.
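
The two rules above can be sketched as follows. This is an illustration
under simplifying assumptions, not the code of the changeset: struct
dirrec is a hypothetical, already decoded directory record, and
file_record_count() and visible_file() are hypothetical helper names.
The only fact taken from the standard is that bit 7 (0x80) of the File
Flags byte is the Multi-Extent bit, which marks a record that is not the
final record of its file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ISO_FLAG_MULTI_EXTENT 0x80	/* File Flags bit 7: more records follow */

/* Hypothetical, simplified view of one ISO 9660 directory record. */
struct dirrec {
	const char *name;	/* file identifier without ";version" */
	unsigned version;	/* ISO 9660 file version number, 1 or higher */
	unsigned char flags;	/* File Flags byte */
};

/*
 * Count the records which form the file that starts at recs[i].
 * Returns 0 if the directory ends before a record without the
 * Multi-Extent bit is found.
 */
static size_t
file_record_count(const struct dirrec *recs, size_t n, size_t i)
{
	size_t k = i;

	while (k < n && (recs[k].flags & ISO_FLAG_MULTI_EXTENT))
		k++;
	if (k >= n)
		return 0;	/* run of Multi-Extent records not terminated */
	return k - i + 1;
}

/*
 * Return the index of the first record of the visible file named "name":
 * the last file among those with the highest version number.
 * Returns n if no file of that name exists.
 */
static size_t
visible_file(const struct dirrec *recs, size_t n, const char *name)
{
	size_t winner = n, i = 0;
	unsigned best = 0;

	while (i < n) {
		size_t cnt = file_record_count(recs, n, i);

		if (cnt == 0)
			break;	/* damaged directory, stop scanning */
		/* ">=" lets a later file of equal version win */
		if (strcmp(recs[i].name, name) == 0 && recs[i].version >= best) {
			best = recs[i].version;
			winner = i;
		}
		i += cnt;	/* skip the further sections of this file */
	}
	return winner;
}
```

With records modeled after file "01" of the test image exoten.iso
(versions 1, 2, and a two-section version 3), visible_file() selects the
first record of version 3 and file_record_count() reports 2 records.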

The filesystem specific vnode.v_data struct iso_node needed a change
to represent the 1:n relation between file and file section.
This change caused code adjustments all over the code of cd9660.
It makes nearly full use of the 64 bits of NetBSD's ino_t and
employs kmem(9) memory for files with more than one section.

Several implementations of interface methods are affected:

- cd9660_readdir() serving as VOP_READDIR(9)

  The case of mount -o norrip,nogens already used a delivery function
  with delayed file candidates: cd9660_vnops.c : iso_shipdir().
  Originally its only task was to find the youngest version of an
  ISO 9660 data file and to separate associated files from the files
  to which they are bound.
  Now it decides which directory records form a valid inode,
  counts the records of the same file, and skips over them.

- cd9660_lookup() serving as VOP_LOOKUP(9)

  mount -o norrip returned the last record of matching name,
  whereas -o rrip returned the first matching record.
  Now norrip with a healthy ISO 9660 filesystem drops only older
  versions of the same name.
  All three filesystem interpretation types now return the inode number
  based on the byte address of the first record of the winning file and on
  its number of file sections.

- cd9660_loadvnode() serving as struct vfsops.vfs_loadvnode to equip
  vnodes with struct iso_node. 
  Used indirectly by VFS_VGET(9), VFS_FHTOVP(9), VOP_LOOKUP(9).

  If the ino_t input parameter indicates a number of file sections larger
  than one, then the created iso_node gets kmem(9) memory attached as
  iso_node.iso_sections.
  The iso_node.i_number will indicate a file section count larger than 1
  only if such memory is attached to the iso_node.

  Because the function was already quite long and my changes added
  more complexity, I factored out several functions:

  - iso_read_next_isodir()
    enters the linked list of ISO 9660 directory records at the
    first record of a file and may then iterate over the list.
    It uses bread(9) and brelse(9) as appropriate.
    The caller has to make the decision whether the list is at its end
    or whether iso_read_next_isodir() may be called again.

  - iso_register_fsects()
    records the start blocks and byte sizes of one or more file sections.

  - isodir_read_isoextattr()
    reads the ISO 9660 Extended Attribute blocks, if present.

  Regrettably this makes the diff of cd9660_loadvnode() hard to read.

- cd9660_bmap() as VOP_BMAP(9)

  Nothing changes for files with a single file section.
  Those with more file sections need a loop to find the section
  which holds the desired block. As in the case of a single section,
  the last section will be the base of the resulting block address,
  regardless of whether its size includes the input block.
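
The section search described for cd9660_bmap() can be sketched like
this. It is a user-space illustration, not the kernel code: struct fsect
and fsect_bmap() are hypothetical names, and the sketch relies on the
restriction that all sections but the last have sizes which are
multiples of the block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one file section. */
struct fsect {
	uint32_t start_block;	/* first filesystem block of the section */
	uint32_t size;		/* section size in bytes */
};

/*
 * Map logical block lbn of the file to a filesystem block number.
 * bshift is log2 of the block size (11 for 2048-byte blocks).
 * The last section is never tested against lbn: like a single
 * section, it serves as base even if lbn lies beyond its end.
 */
static uint64_t
fsect_bmap(const struct fsect *sects, size_t nsect, unsigned bshift,
    uint64_t lbn)
{
	uint64_t first = 0;	/* logical block at which section i starts */
	size_t i;

	assert(sects != NULL && nsect >= 1);
	for (i = 0; i + 1 < nsect; i++) {
		uint64_t nblk = sects[i].size >> bshift;

		if (lbn < first + nblk)
			break;	/* lbn lies inside section i */
		first += nblk;
	}
	return (uint64_t)sects[i].start_block + (lbn - first);
}
```

With two sections of 8192 and 4096 bytes starting at blocks 100 and 500,
logical block 3 maps to block 103 and logical block 4 maps to block 500.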


---------------------------------------------------------------------

The changes are tested by an atf-like test script and two ISO images:

  http://scdbackup.webframe.org/t_cd9660_regression_v02.tgz

with content

  cd9660/t_cd9660_regression.sh
  cd9660/cd9660_rgr.image.at0.bz2.uue
  cd9660/cd9660_rgr.image.at2114008.bz2.uue
  cd9660/exoten.iso.bz2.uue

When executed by
  cd cd9660 && ./t_cd9660_regression.sh
the script runs 5 test cases
 - pr_kern_48787   modified atf test of Martin Husemann/Paul Goyette
 - rock_ridge_rgr  Rock Ridge regression test
 - mount_s         for mount_cd9660 option -s
 - large_file      for large file
 - exotic          for exotic or undigestible file situations

The first four cases are exercised with cd9660_rgr.image, which contains
an ISO 9660 filesystem with a large data file and examples of the
Rock Ridge POSIX file types regular, directory, block device, fifo,
and symbolic link.
xorriso perception of Rock Ridge aspect:
  dr-x------    1 1000     0               0 May  6 15:31 '/'
  dr-x------    1 1000     0               0 May  3 14:58 '/dev'
  prw-------    1 1000     0               0 May 24 14:29 '/dev/test.fifo'
  br--------    1 1000     5            0,12 May 14 14:33 '/dev/wd1e'
  dr-x------    1 1000     1000            0 May  6 15:30 '/my'
  -r--------    1 1000     1000     4329375744 May  6 15:30 '/my/large_file'
  dr-x------    1 1000     0               0 Jan 19 14:41 '/reg'
  -r-x------    1 1000     0          133411 Jan 19 14:41 '/reg/tar'
  lr-x------    1 1000     0               0 May 24 14:29 '/reg/to_regfile' -> 'tar'
  -r--------    1 1000     1000            6 May  6 15:34 '/small_file'

exoten.iso exercises rather exotic situations. It began its life as
a normal xorriso ISO and was manipulated by binary editing.
- File "01" exists in three versions.
  The last one "01.;3" has two sections of 4 KB each.
  The Rock Ridge names of those three versions differ, as they ought.
- File "07" has a first file section of 4095 bytes. It is therefore
  not digestible for cd9660_bmap()/VOP_BMAP(9).
- File "09" contains an ISO 9660 Extended Attribute block with
  user id 1234, group id 5678, and a timestamp of June 2014.
- File "12" has an associated file (Rock Ridge name "11").
  Associated files are shown by cd9660 with a leading "=".
  (Whether this is a wise choice is a different question.)

-o norrip,gens :
  netbsd$ ls -l /mnt/iso/exotic
  ls: 07.;1: Operation not supported
  total 224
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 00.;1
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 01.;1
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 01.;2
  -r-xr-xr-x  1 root  wheel  8192 May 26 08:00 01.;3
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 05.;1
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 08.;1
  -r-xr-xr-x  1 root  wheel  2048 May 26 08:00 09.;1
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 10.;1
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 12.;1
  ...
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 =12.;1

-o norrip :
  ls: 07: Operation not supported
  total 208
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 00
  -r-xr-xr-x  1 root  wheel  8192 May 26 08:00 01
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 05
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 08
  -r-xr-xr-x  1 root  wheel  2048 May 26 08:00 09
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 10
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 12
  ...
  -r-xr-xr-x  1 root  wheel  4096 May 26 08:00 =12

-o norrip,extatt
  ...
  -r-x--xr--  1 1234  5678  2048 Jun  4 07:51 09
  ...

default (Rock Ridge) :
  ls: 07: Operation not supported
  total 224
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 00
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 01
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 02
  -rw-r--r--  1 thomas  dbus  8192 May 26 08:00 03
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 05
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 08
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 10
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 11
  -rw-r--r--  1 thomas  dbus  4096 May 26 08:00 12
  ...

----------------------------------------------------------------

API/ABI compatibility:

The ABI of struct iso_node was recently broken by Revision 1.16
of cd9660_node.h which removed an obsolete internal handle:
  -       LIST_ENTRY(iso_node) i_hash;

My change proposal is API/ABI compatible with Revision 1.16.

----------------------------------------------------------------

Remaining restrictions:

- ISO 9660 allows a file to be composed of multiple file sections
  with sizes which are not aligned to the filesystem block size.
  cd9660 demands that all but the last file section of a file must
  have sizes which are multiples of the block size. Usually 2 KiB.
  (Debian 6 GNU/Linux accepts unaligned file sections but messes
   up the content by truncating the inner section to block end,
   and filling up the file end by the lost number of bytes from
   the next data file. Not desirable.)

- My change imposes a deliberate limit of 128 on the number of sections
  per file. CD9660_FSECT_MAX can be adjusted in cd9660_node.h.
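
The two restrictions can be expressed as one check. This is a sketch
under assumptions, not the code of the changeset: fsects_usable() is a
hypothetical name, and only the constant name CD9660_FSECT_MAX and its
value 128 are taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CD9660_FSECT_MAX 128	/* limit named above, adjustable */

/*
 * Return 1 if a file with the given section sizes (in bytes) is usable:
 * at most CD9660_FSECT_MAX sections, and every section except the last
 * one is a multiple of block_size. Return 0 otherwise.
 */
static int
fsects_usable(const uint32_t *sizes, size_t nsect, uint32_t block_size)
{
	size_t i;

	assert(sizes != NULL && block_size != 0);
	if (nsect == 0 || nsect > CD9660_FSECT_MAX)
		return 0;
	for (i = 0; i + 1 < nsect; i++) {
		if (sizes[i] % block_size != 0)
			return 0;	/* unaligned inner section */
	}
	return 1;
}
```

A file like "07" in exoten.iso, with a first section of 4095 bytes
followed by another section, is rejected by this check. A single section
of 4095 bytes is accepted, because the last section may have any size.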

Remaining problems:

- The inode numbers of Rock Ridge PX are ignored. Hardlink siblings
  will get different ino_t values assigned.

- The name comparison for finding identical names is still not
  guaranteed to be consistent between VOP_READDIR(9) and VOP_LOOKUP(9).
  It is done by two different functions in cd9660_util.c :
  isofntrans() and isofncmp(). For the sake of consistency, it
  would be desirable to unify them.
  I do not propose it now, because it has an impact on lookup
  performance, has its own potential for regressions, and is not
  urgently needed yet.

- I could not yet find ISO images or software which would provide
  test opportunities for ISO 9660 Associated Files or Extended
  Attributes (which are not related to getextattr(1)/extattr(9)).
  So I could only test my self-crafted examples in exoten.iso.

About the inode number inflation:

  Large data files get giant inode numbers, because the file section
  count is encoded above bit 48 of ino_t.
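
  A possible layout is sketched below. It is an assumption for
  illustration only: the exact bit layout of the changeset is not stated
  here. The sketch puts the byte address of the first directory record
  into the low 48 bits and the section count into the 16 bits above, and
  stores a count only if it is larger than 1, so that single-section
  files keep small inode numbers. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define INO_ADDR_BITS	48	/* assumed width of the record address */
#define INO_ADDR_MASK	((UINT64_C(1) << INO_ADDR_BITS) - 1)

/* Combine record byte address and section count into one 64-bit number. */
static uint64_t
ino_encode(uint64_t rec_addr, unsigned nsect)
{
	assert(rec_addr <= INO_ADDR_MASK);
	assert(nsect >= 1 && nsect <= 0xFFFF);
	/* Assumption: a count of 1 is stored as 0 to keep the number small. */
	return ((uint64_t)(nsect > 1 ? nsect : 0) << INO_ADDR_BITS) | rec_addr;
}

/* Byte address of the first directory record of the file. */
static uint64_t
ino_rec_addr(uint64_t ino)
{
	return ino & INO_ADDR_MASK;
}

/* Number of file sections, at least 1. */
static unsigned
ino_nsect(uint64_t ino)
{
	unsigned n = (unsigned)(ino >> INO_ADDR_BITS);

	return n > 1 ? n : 1;
}
```

  With such a layout, a number handed to VFS_VGET(9) alone is enough to
  tell whether section memory has to be allocated.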

  The strongest reason why this information has to be encoded in ino_t
  is the desire to implement method VFS_VGET(9). If VOP_LOOKUP(9) were
  the only method which leads to creation of a vnode, then the address
  and count could be stored in some other members of struct iso_node.
  Returning a simple EOPNOTSUPP from VFS_VGET(9) would open this path.

  One could cut inode numbers to 32 bit and then port the cd9660
  improvements to FreeBSD. (Not that freebsd-hackers would be much
  interested in cd9660.)

Copyright:

  I do not plan to claim copyright of my own. But the old copyright texts in
  the cd9660 sources should probably be updated.
  I have seen in a test script
    # Copyright (c) 2014 The NetBSD Foundation, Inc.
  which would be fine with me.


