NetBSD-Bugs archive
Re: kern/51148: i386 install floppies no longer boot
The following reply was made to PR kern/51148; it has been noted by GNATS.
From: David Holland <dholland-bugs%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: Maxime Villard <max%m00nbsd.net@localhost>
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Tue, 31 May 2016 04:32:32 +0000
This whole subthread didn't get sent to gnats.
------
From: Maxime Villard <max%m00nbsd.net@localhost>
To: Andreas Gustafsson <gson%gson.org@localhost>, netbsd-bugs%netbsd.org@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Wed, 25 May 2016 13:06:09 +0200
First of all, I'm not subscribed to netbsd-bugs@, so please forward your mails
to me.
I have carefully investigated the mappings on amd64 and i386 with a kernel page
explorer I wrote, and there is no issue. The levels are all linear, with no holes
in the middle, they are correctly linked, and they cover the whole kernel image,
preloaded modules and bootstrap tables.
In fact, there appears to be one bug in the L1 slot that should normally point
to the first page of the data segment: it seems to be destroyed. But this issue
was already here before my changes, so I didn't introduce it.
The changes from me you mentioned are all trivial, and it seems highly unlikely
to me that they cause the install failure. Normally, if there were a bug, it
should have been in the previous commits. Also, my changes are in no way
install-related, and as far as I know, the mappings are the same on
CD/USB/floppy/whatever.
My guess, right now, is that my alignment changes in kern.ldscript somehow
trigger the aforementioned L1 slot bug on floppy installs.
I don't have a floppy device, and right now my NetBSD resources are limited. The
only thing I can do is ask.
Is the problem still present? (I don't see new entries in the log)
We are talking about GENERIC, and not GENERIC-PAE, right?
Does reverting only [1] fix the problem?
What if you put 'fillkpt' instead of 'fillkpt_nox' in [1]?
Thanks.
[1] 2016.05.15.07.17.53 maxv src/sys/arch/i386/i386/locore.S 1.124
From: Andreas Gustafsson <gson%gson.org@localhost>
To: Maxime Villard <max%m00nbsd.net@localhost>
Cc: netbsd-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Wed, 25 May 2016 15:17:10 +0300
Maxime,
You wrote:
> First of all, I'm not subscribed to netbsd-bugs@, so please forward your mails
> to me.
Will do. I would have mailed you about the initial report if you had
been the only developer to commit during the period of build breakage
when the problem appeared, but there were commits by four developers,
and no easy way for me to determine which of them was at fault.
> I have carefully investigated the mappings on amd64 and i386 with a kernel page
> explorer I wrote, and there is no issue. The levels are all linear, with no holes
> in the middle, they are correctly linked, and they cover the whole kernel image,
> preloaded modules and bootstrap tables.
>
> In fact, there appears to be one bug in the L1 slot that should normally point
> to the first page of the data segment: it seems to be destroyed. But this issue
> was already here before my changes, so I didn't introduce it.
>
> The changes from me you mentioned are all trivial, and it seems highly unlikely
> to me that they cause the install failure. Normally, if there were a bug, it
> should have been in the previous commits. Also, my changes are in no way
> install-related, and as far as I know, the mappings are the same on
> CD/USB/floppy/whatever.
>
> My guess, right now, is that my alignment changes in kern.ldscript somehow
> trigger the aforementioned L1 slot bug on floppy installs.
>
> I don't have a floppy device, and right now my NetBSD resources are limited.
If you can run misc/py-anita from pkgsrc against an i386 release
build, that should reproduce the problem without the need for a
physical floppy device or even a NetBSD host.
> The
> only thing I can do is ask.
>
> Is the problem still present? (I don't see new entries in the log)
Yes, the problem is still present. I'm not sure what you mean about
not seeing new entries; the newest test runs are from today, and still
failing with the same error:
http://releng.netbsd.org/b5reports/i386/commits-2016.05.html#2016.05.25.10.15.01
> We are talking about GENERIC, and not GENERIC-PAE, right?
Yes.
> Does reverting only [1] fix the problem?
I will try that and report back.
> What if you put 'fillkpt' instead of 'fillkpt_nox' in [1]?
I will try that, too.
> Thanks.
>
> [1] 2016.05.15.07.17.53 maxv src/sys/arch/i386/i386/locore.S 1.124
--
Andreas Gustafsson, gson%gson.org@localhost
From: Maxime Villard <max%m00nbsd.net@localhost>
To: Andreas Gustafsson <gson%gson.org@localhost>
Cc: netbsd-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Wed, 25 May 2016 16:30:58 +0200
On 25/05/2016 at 14:17, Andreas Gustafsson wrote:
> [...]
>>
>> I don't have a floppy device, and right now my NetBSD resources are limited.
>
> If you can run misc/py-anita from pkgsrc against an i386 release
> build, that should reproduce the problem without the need for a
> physical floppy device or even a NetBSD host.
>
I would be happy to do the tests myself. But the only i386 machine I have right
now is a VirtualBox VM, and there is PR 51134 that reboots the machine every ~5
minutes. I can do almost nothing on it.
From: Christos Zoulas <christos%zoulas.com@localhost>
To: Maxime Villard <max%m00nbsd.net@localhost>, Andreas Gustafsson <gson%gson.org@localhost>
Cc: netbsd-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Wed, 25 May 2016 14:17:20 -0400
On May 25, 4:30pm, max%m00nbsd.net@localhost (Maxime Villard) wrote:
-- Subject: Re: kern/51148: i386 install floppies no longer boot
| On 25/05/2016 at 14:17, Andreas Gustafsson wrote:
| > [...]
| >>
| >> I don't have a floppy device, and right now my NetBSD resources are limited.
| >
| > If you can run misc/py-anita from pkgsrc against an i386 release
| > build, that should reproduce the problem without the need for a
| > physical floppy device or even a NetBSD host.
| >
|
| I would be happy to do the tests myself. But the only i386 machine I have right
| now is a VirtualBox VM, and there is PR 51134 that reboots the machine every ~5
| minutes. I can do almost nothing on it.
I am fixing that.
christos
From: Andreas Gustafsson <gson%gson.org@localhost>
To: Maxime Villard <max%m00nbsd.net@localhost>
Cc: netbsd-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Wed, 25 May 2016 20:06:11 +0300
Maxime,
I have now run the tests you asked for.
> Does reverting only [1] fix the problem?
Yes. The system still doesn't install because the kernel is unable
to exec /sbin/init, but this is a different bug; when I don't revert
[1], the kernel does not even start (there are no kernel messages
on the console).
> What if you put 'fillkpt' instead of 'fillkpt_nox' in [1]?
I tested with this patch against 2016.05.22.09.10.37 sources:
diff -u -r1.124 locore.S
--- locore.S    15 May 2016 07:17:53 -0000      1.124
+++ locore.S    25 May 2016 14:33:35 -0000
@@ -731,7 +731,7 @@
        movl    RELOC(tablesize),%ecx   /* length of BOOTSTRAP TABLES */
        shrl    $PGSHIFT,%ecx
        orl     $(PG_V|PG_KW),%eax
-       fillkpt_nox
+       fillkpt
 
        /* We are on (4). Map ISA I/O mem (later atdevbase) RWX. */
        movl    $(IOM_BEGIN|PG_V|PG_KW/*|PG_N*/),%eax
and it did _not_ fix the problem.
Later, you wrote:
> I would be happy to do the tests myself. But the only i386 machine I have right
> now is a VirtualBox VM, and there is PR 51134 that reboots the machine every ~5
> minutes. I can do almost nothing on it.
What do you host VirtualBox on? You can test the i386 port using anita+qemu
even on a non-i386 host.
--
Andreas Gustafsson, gson%gson.org@localhost
From: Maxime Villard <max%m00nbsd.net@localhost>
To: Andreas Gustafsson <gson%gson.org@localhost>
Cc: netbsd-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Thu, 26 May 2016 09:33:52 +0200
I've committed a patch. Please let me know whether it fixes the issue.
From: Maxime Villard <max%m00nbsd.net@localhost>
To: Andreas Gustafsson <gson%gson.org@localhost>
Cc: netbsd-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/51148: i386 install floppies no longer boot
Date: Thu, 26 May 2016 09:00:44 +0200
On 25/05/2016 at 19:06, Andreas Gustafsson wrote:
> Maxime,
>
> I have now run the tests you asked for.
>
>> Does reverting only [1] fix the problem?
>
> Yes. The system still doesn't install because the kernel is unable
> to exec /sbin/init, but this is a different bug; when I don't revert
> [1], the kernel does not even start (there are no kernel messages
> on the console).
>
>> What if you put 'fillkpt' instead of 'fillkpt_nox' in [1]?
>
> I tested with this patch against 2016.05.22.09.10.37 sources:
>
> diff -u -r1.124 locore.S
> --- locore.S 15 May 2016 07:17:53 -0000 1.124
> +++ locore.S 25 May 2016 14:33:35 -0000
> @@ -731,7 +731,7 @@
> movl RELOC(tablesize),%ecx /* length of BOOTSTRAP TABLES */
> shrl $PGSHIFT,%ecx
> orl $(PG_V|PG_KW),%eax
> - fillkpt_nox
> + fillkpt
>
> /* We are on (4). Map ISA I/O mem (later atdevbase) RWX. */
> movl $(IOM_BEGIN|PG_V|PG_KW/*|PG_N*/),%eax
>
> and it did _not_ fix the problem.
Thanks for the tests. I see where the problem is, and I'll commit a patch
soon.
>
> Later, you wrote:
>
>> I would be happy to do the tests myself. But the only i386 machine I have right
>> now is a VirtualBox VM, and there is PR 51134 that reboots the machine every ~5
>> minutes. I can do almost nothing on it.
>
> What do you host VirtualBox on? You can test the i386 port using anita+qemu
> even on a non-i386 host.
>
I'll answer in the other PR.