NetBSD-Bugs archive
kern/48959: Misrepresentation of files of 4 GiB or larger in cd9660
>Number: 48959
>Category: kern
>Synopsis: Misrepresentation of files of 4 GiB or larger in cd9660
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jul 02 09:50:00 +0000 2014
>Originator: Thomas Schmitt
>Release: 6.99.44
>Organization:
>Environment:
NetBSD netbsdcur.local 6.99.44 NetBSD 6.99.44 (GENERIC) #32:
Wed Jul 2 07:55:04 UTC 2014 ...:/usr/obj/sys/arch/i386/compile/GENERIC i386
>Description:
Data files of 4 GiB or larger must be stored in ISO 9660 as multiple
file sections, each with its own directory record. The cd9660
filesystem driver misrepresents such files.
netbsd# mount_cd9660 /dev/cd0a /mnt/iso
netbsd# ls -l /mnt/iso/my
total 16777208
-rw-r--r-- 1 thomas dbus 4294965248 May 6 15:30 large_file
-rw-r--r-- 1 thomas dbus 4294965248 May 6 15:30 large_file
Whereas the Debian 6 host operating system of my NetBSD VM says
$ ls -l /mnt/iso/my
total 4227906
-rw-r--r-- 1 thomas thomas 4329375744 May 6 17:30 large_file
Mounting without Rock Ridge POSIX names yields a different result:
netbsd# mount_cd9660 -o norrip /dev/cd0a /mnt/iso
netbsd# ls -l /mnt/iso/my
total 67208
-r-xr-xr-x 1 root wheel 34410496 May 6 15:30 large_file
This is indeed the missing end piece. The two identical entries shown
with Rock Ridge each represent the start piece.
(The cause is unification code, a precaution for handling
the rarely used ISO 9660 feature of multiple File Versions.)
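
The sizes in the listings above fit together arithmetically: 4294965248
is 2^32 - 2048, the largest multiple of 2048 that fits a 32-bit length
field, and 4294965248 + 34410496 = 4329375744, the size reported by
Debian. The following is a minimal sketch of such a split. The section
limit is an assumption inferred from the listing, and the names
SECTION_MAX_BYTES and split_into_sections() are hypothetical, not taken
from any image producer or from cd9660.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: largest 2 KiB aligned size below 2^32 (= 4294965248). */
#define SECTION_MAX_BYTES UINT64_C(0xFFFFF800)

/*
 * Split file_size into file section sizes (hypothetical helper).
 * Stores at most max_sections sizes in out[] and returns the number
 * of sections that the file needs.
 */
static size_t
split_into_sections(uint64_t file_size, uint64_t *out, size_t max_sections)
{
	size_t n = 0;

	/* Every section except the last one is filled to the limit. */
	while (file_size > SECTION_MAX_BYTES) {
		if (n < max_sections)
			out[n] = SECTION_MAX_BYTES;
		n++;
		file_size -= SECTION_MAX_BYTES;
	}
	/* The last section takes the remainder. */
	if (n < max_sections)
		out[n] = file_size;
	return n + 1;
}
```

Called with 4329375744, the function reports two sections whose sizes
match the Rock Ridge and norrip listings respectively.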
>How-To-Repeat:
The problem is demonstrated by
http://scdbackup.webframe.org/large.iso.bz2
4470 bytes, MD5 7d78dc3efaec8ea3f1801335329f410d.
It inflates to 4,329,897,984 bytes.
Do this only if the fix of kern/48787 is applied.
Otherwise you will not get rid of the /mnt/iso mount until reboot!
>Fix:
I have now implemented a changeset which corrects the above problem.
An auxiliary text is available at
http://scdbackup.webframe.org/cd9660_level3_notes.txt
It assesses ISO 9660 specs and current implementation, motivates
the proposal for a model change in struct iso_node, and publishes
my (now nearly fulfilled) todo sheet.
The changeset anticipates the adoption of PR 48808 (mount option -s).
The goal is to let cd9660 recognize files with multiple file sections
and represent their multiple directory records as a single vnode with
a uniform byte space.
There shall be no duplicate filenames presented to VFS. If files with
equal names are found, then only the last one of those with the highest
ISO 9660 version number will be visible.
This guarantee depends on properly sorted ISO 9660 directories.
The decision which intervals of directory records form a single file
depends on properly set Multi-Extent flag bits.
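
The two rules above can be sketched as follows. This is an
illustration only, not the code of the changeset: struct dir_rec,
count_file_records() and pick_visible() are hypothetical names, and the
record is reduced to the three fields needed here. The only fact taken
from ISO 9660 is that bit 7 (0x80) of the File Flags marks a directory
record which is not the last record of its file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ISO_FLAG_MULTI_EXTENT 0x80	/* ISO 9660 File Flags, bit 7 */

/* Hypothetical, simplified view of one directory record. */
struct dir_rec {
	const char *name;	/* file identifier without version */
	unsigned version;	/* ISO 9660 file version number */
	uint8_t flags;		/* ISO 9660 File Flags */
};

/*
 * Count the records which form the file that starts at recs[first].
 * Returns 0 if the directory ends before a record without the
 * Multi-Extent bit is found.
 */
static size_t
count_file_records(const struct dir_rec *recs, size_t nrecs, size_t first)
{
	size_t i;

	for (i = first; i < nrecs; i++) {
		if ((recs[i].flags & ISO_FLAG_MULTI_EXTENT) == 0)
			return i - first + 1;
	}
	return 0;
}

/*
 * Return the index of the first record of the file which shall be
 * visible under the given name: the last one of those with the
 * highest version number. Returns -1 if no file matches.
 */
static long
pick_visible(const struct dir_rec *recs, size_t nrecs, const char *name)
{
	long winner = -1;
	unsigned best = 0;
	size_t i = 0, n;

	while (i < nrecs) {
		n = count_file_records(recs, nrecs, i);
		if (n == 0)
			break;	/* unterminated multi-extent run */
		if (strcmp(recs[i].name, name) == 0 &&
		    recs[i].version >= best) {
			best = recs[i].version;
			winner = (long)i;
		}
		i += n;	/* skip the other records of the same file */
	}
	return winner;
}
```

A directory with three versions of "01", the last one consisting of
two records, would yield the first record of version 3 as the winner.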
The filesystem specific vnode.v_data struct iso_node needed a change
to represent the 1:n relation between file and file section.
This change caused code adjustments all over the code of cd9660.
It makes nearly full use of the 64 bits of NetBSD's ino_t and
employs kmem(9) memory for files with more than one section.
Several implementations of interface methods are affected:
- cd9660_readdir() serving as VOP_READDIR(9)
The case of mount -o norrip,nogens already used a delivery function
with delayed file candidates: cd9660_vnops.c : iso_shipdir().
Originally it only had the task of finding the youngest version of an
ISO 9660 data file and of separating associated files from the files
to which they are bound.
Now it decides which directory records form a valid inode,
counts the records of the same file, and skips over them.
- cd9660_lookup() serving as VOP_LOOKUP(9)
mount -o norrip returned the last record of matching name,
whereas -o rrip returned the first matching record.
Now norrip with a healthy ISO 9660 filesystem drops only older
versions of the same name.
All three filesystem interpretation types now return the inode number
based on the byte address of the first record of the winning file and
on its number of file sections.
- cd9660_loadvnode() serving as struct vfsops.vfs_loadvnode to equip
vnodes with struct iso_node.
Used indirectly by VFS_VGET(9), VFS_FHTOVP(9), VOP_LOOKUP(9).
If the ino_t input parameter indicates a number of file sections larger
than one, then the created iso_node gets kmem(9) memory attached as
iso_node.iso_sections.
The iso_node.i_number will indicate a file section count larger than 1
only if such memory is attached to the iso_node.
Because the function was already quite long and my changes added
more complexity, I factored out several functions:
- iso_read_next_isodir()
enters the linked list of ISO 9660 directory records at the
first record of a file and may then iterate over the list.
It uses bread(9) and brelse(9) as appropriate.
The caller has to make the decision whether the list is at its end
or whether iso_read_next_isodir() may be called again.
- iso_register_fsects()
records the start blocks and byte sizes of one or more file sections.
- isodir_read_isoextattr()
reads the ISO 9660 Extended Attribute blocks, if present.
Regrettably this makes the diff of cd9660_loadvnode() hard to read.
- cd9660_bmap() as VOP_BMAP(9)
Nothing changes for files with a single file section.
Those with more file sections will need a loop to find the section
which holds the desired block. Similar to the case of a single section,
the last section will be the base of the resulting block address,
regardless of whether its size includes the input block.
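
The section lookup described for cd9660_bmap() can be sketched like
this. It is not the code of the changeset: struct fsect and map_block()
are hypothetical names, and the sketch assumes at least one section and
that all sections but the last have sizes which are multiples of the
block size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one file section. */
struct fsect {
	uint32_t start_blk;	/* first block of the section */
	uint32_t size_bytes;	/* byte size of the section */
};

/*
 * Map the logical block number lblkno of a file to a block address.
 * Requires nsect >= 1. The last section is the base of the result
 * even if lblkno lies beyond its size.
 */
static uint64_t
map_block(const struct fsect *s, size_t nsect, uint64_t lblkno,
    uint32_t bsize)
{
	size_t i;

	/* Walk all sections except the last one. */
	for (i = 0; i + 1 < nsect; i++) {
		uint64_t nblk = s[i].size_bytes / bsize;

		if (lblkno < nblk)
			break;		/* block lies in section i */
		lblkno -= nblk;		/* skip the whole section */
	}
	return (uint64_t)s[i].start_blk + lblkno;
}
```

With a single section the loop body never runs, so the result is the
start block plus the logical block number, as before.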
---------------------------------------------------------------------
The changes are tested by an atf-like test script and two ISO images:
http://scdbackup.webframe.org/t_cd9660_regression_v02.tgz
with content
cd9660/t_cd9660_regression.sh
cd9660/cd9660_rgr.image.at0.bz2.uue
cd9660/cd9660_rgr.image.at2114008.bz2.uue
cd9660/exoten.iso.bz2.uue
When executed by
cd cd9660 && ./t_cd9660_regression.sh
the script runs 5 test cases
- pr_kern_48787 modified atf test of Martin Husemann/Paul Goyette
- rock_ridge_rgr Rock Ridge regression test
- mount_s for mount_cd9660 option -s
- large_file for large file
- exotic for exotic or undigestible file situations
The first four cases are exercised with cd9660_rgr.image, which contains
an ISO 9660 filesystem with a large data file and examples of the
Rock Ridge POSIX file types regular, directory, block device, fifo,
symbolic link.
xorriso perception of Rock Ridge aspect:
dr-x------ 1 1000 0 0 May 6 15:31 '/'
dr-x------ 1 1000 0 0 May 3 14:58 '/dev'
prw------- 1 1000 0 0 May 24 14:29 '/dev/test.fifo'
br-------- 1 1000 5 0,12 May 14 14:33 '/dev/wd1e'
dr-x------ 1 1000 1000 0 May 6 15:30 '/my'
-r-------- 1 1000 1000 4329375744 May 6 15:30 '/my/large_file'
dr-x------ 1 1000 0 0 Jan 19 14:41 '/reg'
-r-x------ 1 1000 0 133411 Jan 19 14:41 '/reg/tar'
lr-x------ 1 1000 0 0 May 24 14:29 '/reg/to_regfile' -> 'tar'
-r-------- 1 1000 1000 6 May 6 15:34 '/small_file'
exoten.iso exercises rather exotic situations. It began its life as a
normal xorriso ISO and was manipulated by binary editing.
- File "01" exists in three versions.
The last one "01.;3" has two sections of 4 KB each.
The Rock Ridge names of those three versions differ, as they ought.
- File "07" has a first file section of 4095 bytes. It is therefore
not digestible for cd9660_bmap()/VOP_BMAP(9).
- File "09" contains an ISO 9660 Extended Attribute block with
user id 1234, group id 5678, and a timestamp of June 2014.
- File "12" has an associated file (Rock Ridge name "11").
Associated files are shown by cd9660 with a leading "=".
(Whether this is a wise choice is a different question.)
-o norrip,gens :
netbsd$ ls -l /mnt/iso/exotic
ls: 07.;1: Operation not supported
total 224
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 00.;1
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 01.;1
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 01.;2
-r-xr-xr-x 1 root wheel 8192 May 26 08:00 01.;3
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 05.;1
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 08.;1
-r-xr-xr-x 1 root wheel 2048 May 26 08:00 09.;1
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 10.;1
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 12.;1
...
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 =12.;1
-o norrip :
ls: 07: Operation not supported
total 208
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 00
-r-xr-xr-x 1 root wheel 8192 May 26 08:00 01
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 05
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 08
-r-xr-xr-x 1 root wheel 2048 May 26 08:00 09
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 10
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 12
...
-r-xr-xr-x 1 root wheel 4096 May 26 08:00 =12
-o norrip,extatt
...
-r-x--xr-- 1 1234 5678 2048 Jun 4 07:51 09
...
default (Rock Ridge) :
ls: 07: Operation not supported
total 224
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 00
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 01
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 02
-rw-r--r-- 1 thomas dbus 8192 May 26 08:00 03
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 05
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 08
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 10
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 11
-rw-r--r-- 1 thomas dbus 4096 May 26 08:00 12
...
----------------------------------------------------------------
API/ABI compatibility:
The ABI of struct iso_node was recently broken by Revision 1.16
of cd9660_node.h which removed an obsolete internal handle:
- LIST_ENTRY(iso_node) i_hash;
My change proposal is API/ABI compatible with Revision 1.16.
----------------------------------------------------------------
Remaining restrictions:
- ISO 9660 allows a file to be composed of multiple file sections
with sizes which are not aligned to the filesystem block size.
cd9660 demands that all but the last file section of a file
have sizes which are multiples of the block size, usually 2 KiB.
(Debian 6 GNU/Linux accepts unaligned file sections but messes
up the content by truncating the inner section to block end,
and filling up the file end by the lost number of bytes from
the next data file. Not desirable.)
- My change imposes a deliberate limit of 128 on the number of sections
per file. CD9660_FSECT_MAX can be adjusted in cd9660_node.h.
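
Both restrictions can be expressed as one check. The following is a
sketch only: check_sections() is a hypothetical name, and returning
EOPNOTSUPP for both cases is an assumption which merely matches the
"Operation not supported" message shown for file "07" above. Only the
name CD9660_FSECT_MAX and its value 128 are taken from the text.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define CD9660_FSECT_MAX 128	/* limit named in the text above */

/*
 * Return 0 if the section sizes are acceptable, else EOPNOTSUPP.
 * All sections but the last must be multiples of the block size.
 */
static int
check_sections(const uint32_t *sizes, size_t nsect, uint32_t bsize)
{
	size_t i;

	if (nsect == 0 || nsect > CD9660_FSECT_MAX)
		return EOPNOTSUPP;
	for (i = 0; i + 1 < nsect; i++) {
		if (sizes[i] % bsize != 0)
			return EOPNOTSUPP;
	}
	return 0;
}
```

A first section of 4095 bytes is rejected, whereas an unaligned size of
the last section is accepted.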
Remaining problems:
- The inode numbers of Rock Ridge PX are ignored. Hardlink siblings
will get different ino_t values assigned.
- The name comparison for finding identical names is still not
guaranteed to be in sync underneath VOP_READDIR(9) and VOP_LOOKUP(9).
It is done by two different functions in cd9660_util.c :
isofntrans() and isofncmp(). For the sake of consistency, it
would be desirable to unify them.
I do not propose it now, because it has impact on lookup
performance, has its own potential for regressions, and is not
urgently needed yet.
- I could not yet find ISO images or software which would provide
test opportunities for ISO 9660 Associated Files or Extended
Attributes (which are not related to getextattr(1)/extattr(9)).
So I could only test my self-crafted examples in exoten.iso.
About the inode number inflation:
Large data files get giant inode numbers, because the file section
count is encoded above bit 48 of ino_t.
The strongest reason why this information has to be encoded in ino_t
is the desire to implement method VFS_VGET(9). If VOP_LOOKUP(9) were
the only method which leads to creation of a vnode, then the address
and count could be stored in some other members of struct iso_node.
Returning a simple EOPNOTSUPP from VFS_VGET(9) would open this path.
One could cut inode numbers to 32 bit and then port the cd9660
improvements to FreeBSD. (Not that freebsd-hackers would be much
interested in cd9660.)
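
To illustrate why multi-section files get giant inode numbers, here is
a sketch of one possible encoding. The exact bit layout is an
assumption made for this example: the lower 48 bits hold the byte
address of the first directory record, the bits above hold the section
count, and the count is stored only if it is larger than 1. The names
ino_encode(), ino_rec_addr() and ino_nsect() are hypothetical.

```c
#include <stdint.h>

#define INO_ADDR_BITS 48	/* assumed width of the address part */
#define INO_ADDR_MASK ((UINT64_C(1) << INO_ADDR_BITS) - 1)

/* Combine record byte address and section count into one number. */
static uint64_t
ino_encode(uint64_t rec_byte_addr, unsigned nsect)
{
	uint64_t ino = rec_byte_addr & INO_ADDR_MASK;

	/* Single-section files keep their plain address as number. */
	if (nsect > 1)
		ino |= (uint64_t)nsect << INO_ADDR_BITS;
	return ino;
}

/* Byte address of the first directory record of the file. */
static uint64_t
ino_rec_addr(uint64_t ino)
{
	return ino & INO_ADDR_MASK;
}

/* Number of file sections; an empty count field means 1. */
static unsigned
ino_nsect(uint64_t ino)
{
	unsigned n = (unsigned)(ino >> INO_ADDR_BITS);

	return n > 1 ? n : 1;
}
```

Under this assumed layout a file with two sections gets an inode number
of at least 2^49, while single-section files stay below 2^48.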
Copyright:
I do not plan to claim my own copyright. But the old copyright texts in
the cd9660 sources should probably be updated.
I have seen in a test script
# Copyright (c) 2014 The NetBSD Foundation, Inc.
which would be fine with me.