Subject: Boot NetBSD CD on Beige G3 OF 2.4 should be possible (after fix)
To: None <port-macppc@netbsd.org>
From: =?ISO-8859-15?Q?Christian_M=FCller?= <cmue81@gmx.de>
List: port-macppc
Date: 11/02/2005 16:43:58
Ok,
some of you have described the
*Warning, unexpected short transfer 0/10240*
problem before. OF can read and load ofwboot.xcf from an iso9660 CD
without a problem (with the OF variable boot-device set to
ide1/@0:,\ofwboot.xcf, for example), but when NetBSD's ofwboot is given
the kernel to load (via the OF variable boot-file, e.g.
ide1/@0:1,/NETBSD.MACPPC) it fails and prints the error message above
in an endless loop.
I've been looking at the code for a while now and think I understand how
ofwboot.xcf looks for the kernel (a misunderstanding along the way made
me wrongly post the "Bug in ofdev.c?" question on this list, shoot *g*).
The relevant files are (in order of appearance):
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/macppc/stand/ofwboot/boot.c?rev=1.18&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/lib/libsa/loadfile.c?rev=1.22.2.3&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/lib/libsa/open.c?rev=1.24&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/macppc/stand/ofwboot/ofdev.c?rev=1.15&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/lib/libsa/ufs.c?rev=1.45&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/lib/libsa/ustarfs.c?rev=1.24&content-type=text/x-cvsweb-markup
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/lib/libsa/cd9660.c?rev=1.18&content-type=text/x-cvsweb-markup
Here goes:
* In boot.c//main the function loadfile(kernels[i], marks, LOAD_KERNEL)
gets called, with kernels[i] being the string you supplied via boot-file
or a default filled in by ofwboot code.
* In loadfile.c//loadfile the function open(fname, 0) is called to get a
file descriptor fd, which is then passed to fdloadfile. fname is the
same pointer as kernels[i].
* In open.c//open the function devopen(f, fname, &file) is called, which
is device-specific code again. fname is still unprocessed (==
kernels[i]). devopen uses OF_finddevice and other functions to check
whether OF knows the device you supplied in your boot-file string. If no
device is given, the device is set to bootdev, the device ofwboot.xcf
was loaded from.
================================================================================================================================================================
ofdev.c//devopen also does the partition handling (together with
ofdev.c//filename), in a rather obscure way: it expects letters, "a"
for the first partition, "b" for the second and so on (OF uses
numbers). Doing so, however, cripples the file path, as the relevant
code [[ from src/sys/arch/macppc/stand/ofwboot/ofdev.c//devopen() ]]
shows. Assume boot-file is set to ide1/ata-disk@0:a/path/to/kernel. If
you use ide1/ata-disk@0:a,/path/to/kernel instead, the filename function
will not find your partition (see ^^^ below)!!
	cp = filename(fname, &partition);
	if (cp) {
		strcpy(buf, cp);
		*cp = 0;
	}
	if (!cp || !*buf)
		return ENOENT;
	if (!*fname)
		strcpy(fname, bootdev);
	strcpy(opened_name, fname);
	if (partition) {
		cp = opened_name + strlen(opened_name);
		*cp++ = ':';
		*cp++ = partition;
		*cp = 0;
	}
	if (*buf != '/')
		strcat(opened_name, "/");
	strcat(opened_name, buf);
	*file = opened_name + strlen(fname) + 1;
After filename() is done, cp points at the slash after the partition
letter in "ide1/ata-disk@0:a/path/to/kernel" and partition contains 'a'.
After the if (cp) block, buf contains "/path/to/kernel" and fname
contains "ide1/ata-disk@0:a" (*cp = 0 replaced the slash with the
string terminator). After the if (partition) block, opened_name contains
"ide1/ata-disk@0:a:a"; then buf gets appended, so opened_name contains
"ide1/ata-disk@0:a:a/path/to/kernel". Now it really goes wrong: fname
was not changed, so opened_name + strlen(fname) + 1 skips only fname
and the following ':' and lets (*file) point to "a/path/to/kernel",
NOT "/path/to/kernel".
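To make the pointer arithmetic easier to check, here is a minimal sketch that replays only the string handling quoted above. It is not the NetBSD function: file_pointer_sketch and its parameter layout are made up for illustration, and the caller is assumed to pass fname and buf in the state filename() and the if (cp) block leave them in.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for the tail of devopen(), for illustration only.
 * fname:       device part as left after "*cp = 0", e.g. "ide1/ata-disk@0:a"
 * buf:         path part, e.g. "/path/to/kernel"
 * partition:   letter parsed by filename(), or 0 if none was found
 * opened_name: caller-supplied buffer, assumed large enough
 * Returns the pointer that devopen() would store in *file.
 */
static const char *
file_pointer_sketch(const char *fname, const char *buf, char partition,
    char *opened_name)
{
	char *cp;

	assert(fname != NULL && buf != NULL && opened_name != NULL);

	strcpy(opened_name, fname);
	if (partition) {
		/* Appends ":<letter>" even if fname already ends in it. */
		cp = opened_name + strlen(opened_name);
		*cp++ = ':';
		*cp++ = partition;
		*cp = 0;
	}
	if (*buf != '/')
		strcat(opened_name, "/");
	strcat(opened_name, buf);

	/* Skips fname plus one character, not the appended ":<letter>". */
	return opened_name + strlen(fname) + 1;
}
```

Called with "ide1/ata-disk@0:a", "/path/to/kernel" and 'a', the buffer ends up holding "ide1/ata-disk@0:a:a/path/to/kernel" and the returned pointer lands on the second 'a', which matches the walk-through.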
If you use numbers instead of letters (ide1/ata-disk@0:5/path/to/kernel)
this bug won't affect you, but the code later on will always use
partition zero, regardless of which number you gave, because the
filename() function does not set partition for a digit!
	} else {
		part = partition ? partition - 'a' : 0;
		ofdev.partoff = label.d_partitions[part].p_offset;
	}
================================================================================
^^^ [[ from src/sys/arch/macppc/stand/ofwboot/ofdev.c//filename() ]]
	if (!strcmp(devtype, "block")) {
		/* search for arguments */
		for (cp = lp;
		    --cp >= str && *cp != '/' && *cp != ':';)
			;
		if (cp >= str && *cp == ':') {
			/* found arguments */
			for (cp = lp;
			    *--cp != ':' && *cp != ',';)
				;
			if (*++cp >= 'a' &&
			    *cp <= 'a' + MAXPARTITIONS)
				*ppart = *cp;
		}
	}
	return lp;
When the code reaches the block above, lp points at the forward slash
following the "," in "ide1/@0:a,/path/to/kernel". After the first for
loop cp points at the ":", so the if is true, and the second for loop
leaves cp pointing at the ",". The if that should parse the partition
letter therefore operates on (*lp)==(*++cp)=='/', which is unintended.
The __solution__ is to delete the second for loop completely and use
":" as the only delimiter for the partition:
	if (!strcmp(devtype, "block")) {
		/* search for arguments */
		for (cp = lp;
		    --cp >= str && *cp != '/' && *cp != ':';)
			;
		if (cp >= str && *cp == ':') {
			/* found arguments */
			if (*++cp >= 'a' &&
			    *cp <= 'a' + MAXPARTITIONS)
				*ppart = *cp;
		}
	}
	return lp;
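The difference between the two variants can be tried outside the boot loader with a small sketch. The helpers part_original and part_fixed, the find_back routine and the MAXPARTITIONS value are assumptions made for this illustration; only the comparison logic is taken from the code quoted above. str is the start of the boot-file string and lp is where the path begins.

```c
#include <assert.h>
#include <stddef.h>

#define MAXPARTITIONS 16	/* assumed value, for this sketch only */

/* Walk back from lp to the nearest ':' or '/'; NULL if there is none. */
static const char *
find_back(const char *str, const char *lp)
{
	const char *cp = lp;

	assert(str != NULL && lp != NULL && lp >= str);
	while (cp > str) {
		cp--;
		if (*cp == '/' || *cp == ':')
			return cp;
	}
	return NULL;
}

/* Mirrors the quoted code: a second scan stops at ':' or ','. */
static char
part_original(const char *str, const char *lp)
{
	const char *cp = find_back(str, lp);

	if (cp != NULL && *cp == ':') {
		/* Terminates because a ':' is known to precede lp. */
		for (cp = lp; *--cp != ':' && *cp != ',';)
			;
		if (*++cp >= 'a' && *cp <= 'a' + MAXPARTITIONS)
			return *cp;
	}
	return 0;
}

/* Mirrors the proposed change: the letter directly follows the ':'. */
static char
part_fixed(const char *str, const char *lp)
{
	const char *cp = find_back(str, lp);

	if (cp != NULL && *cp == ':') {
		if (*++cp >= 'a' && *cp <= 'a' + MAXPARTITIONS)
			return *cp;
	}
	return 0;
}
```

With "ide1/@0:a,/path/to/kernel" and lp at the slash after the comma, part_original returns 0 while part_fixed returns 'a'. Both return 'a' for "ide1/@0:a/path/to/kernel", and both still return 0 for a digit such as "ide1/@0:5/path/to/kernel".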
================================================================================
================================================================================================================================================================
* In open.c//open the ofdev.c//devopen() function is done now and did
the right thing (tm), since we used "ide1/@0:0/path/to/kernel". Our
struct open_file f and char *file are properly set up, let's get down to
the part where the open function does the following:
	besterror = ENOENT;
	for (i = 0; i < nfsys; i++) {
		error = FS_OPEN(&file_system[i])(file, f);
		if (error == 0) {
			f->f_ops = &file_system[i];
			return (fd);
		}
		if (error != EINVAL)
			besterror = error;
	}
	error = besterror;
file_system and nfsys have been set up in ofdev.c//devopen before that:
	file_system[0] = file_system_ufs;
	file_system[1] = file_system_ustarfs;
	file_system[2] = file_system_cd9660;
	file_system[3] = file_system_hfs;
	nfsys = 4;
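The probing order can be modelled in a few lines. The following is only a sketch of the pattern in the open() loop quoted above: probe_fn, probe_all and the stub probes are hypothetical names, and real FS_OPEN routines take an open_file argument that is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one FS_OPEN routine. */
typedef int (*probe_fn)(const char *path);

/* Stub probes: wrong filesystem, I/O error, and a successful open. */
static int probe_mismatch(const char *path) { (void)path; return EINVAL; }
static int probe_ioerror(const char *path) { (void)path; return EIO; }
static int probe_match(const char *path) { (void)path; return 0; }

/*
 * Try each filesystem in order. Returns the index of the first one that
 * opens the path, or -1 with *errp set to the most informative error.
 * A probe that never returns blocks every entry after it.
 */
static int
probe_all(const probe_fn *fs, int nfsys, const char *path, int *errp)
{
	int besterror = ENOENT, error, i;

	assert(fs != NULL && path != NULL && errp != NULL);
	for (i = 0; i < nfsys; i++) {
		error = fs[i](path);
		if (error == 0) {
			*errp = 0;
			return i;
		}
		if (error != EINVAL)
			besterror = error;
	}
	*errp = besterror;
	return -1;
}
```

With the table { probe_mismatch, probe_mismatch, probe_match, probe_mismatch } the third entry wins, but only because the second one returned. That is the property the loop depends on.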
So the standalone library's open function tries to be really useful by
offering the user who wants to load the kernel from the given device
four possible filesystems to read from at this early stage.
Unfortunately this works only if every single
FS_OPEN(&file_system[i])(file, f) routine returns. The cd9660
filesystem would be tried after ustarfs, but it never gets a chance,
since src/sys/lib/libsa/ustarfs.c//real_fs_cylinder_read refuses to
give up on this while loop:
	while(xferrqst > 0) {
#if !defined(LIBSA_NO_TWIDDLE)
		twiddle();
#endif
		for (i = 0; i < 3; ++i) {
			e = DEV_STRATEGY(f->f_dev)(f->f_devdata, F_READ,
			    seek2 / 512, xferrqst, xferbase, &xfercount);
			if (e == 0)
				break;
			printf("@");
		}
		if (e)
			break;
		if (xfercount != xferrqst)
			printf("Warning, unexpected short transfer %d/%d\n",
			    (int)xfercount, (int)xferrqst);
		xferrqst -= xfercount;
		xferbase += xfercount;
		seek2 += xfercount;
	}
So basically we try to read 10240 bytes with ustarfs code from an
iso9660 filesystem. Since that, naturally, fails (xfercount reports 0
bytes transferred), xferrqst never shrinks and this loop will only end
when you hit the power button. A quick hack might be to try iso9660
before ustarfs (of course, if the iso9660 code doesn't return you will
have the same trouble when trying to boot the kernel from ustarfs); a
good fix should probably make the ustarfs code return from the loop...
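One way such an exit could look is sketched below. This is not a tested patch for ustarfs.c: strategy_fn, read_fully_sketch and the two stub devices are invented for the illustration, offsets are plain byte counts, and returning EIO on a zero-length transfer is an assumption about what the caller would want.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for DEV_STRATEGY: read up to len bytes at off. */
typedef int (*strategy_fn)(void *devdata, size_t off, size_t len,
    char *buf, size_t *done);

/* Stub device that reports success but never transfers anything. */
static int
stub_zero(void *devdata, size_t off, size_t len, char *buf, size_t *done)
{
	(void)devdata; (void)off; (void)len; (void)buf;
	*done = 0;
	return 0;
}

/* Stub device that hands out at most four bytes per call. */
static int
stub_chunk(void *devdata, size_t off, size_t len, char *buf, size_t *done)
{
	size_t n = len < 4 ? len : 4, i;

	(void)devdata; (void)off;
	for (i = 0; i < n; i++)
		buf[i] = 'x';
	*done = n;
	return 0;
}

/* Same shape as the quoted loop, but it gives up when nothing moves. */
static int
read_fully_sketch(strategy_fn strategy, void *devdata, size_t off,
    size_t len, char *buf)
{
	size_t done = 0;
	int e = 0, i;

	assert(strategy != NULL && buf != NULL);
	while (len > 0) {
		for (i = 0; i < 3; ++i) {
			done = 0;
			e = strategy(devdata, off, len, buf, &done);
			if (e == 0)
				break;
		}
		if (e)
			return e;
		/* No progress, or a bogus count: fail instead of spinning. */
		if (done == 0 || done > len)
			return EIO;
		len -= done;
		buf += done;
		off += done;
	}
	return 0;
}
```

With stub_zero the call returns EIO after the first pass instead of looping, so a caller like the open() loop could move on to the next filesystem. With stub_chunk a 10 byte request completes in three passes.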
Regards,
Christian