OK, so netbsd-10 appears to be "better" than netbsd-9, if not fixed
for this case. At least things are moving in the right direction!
(though it may make it harder to determine when things are fixed :)
The reset printf comes from:
https://nxr.netbsd.org/xref/src/sys/dev/ic/wdc.c#1049
I wonder if:
- NetBSD pokes the chipset at the wrong time or in the wrong way and
manages to wedge it, hence the lost interrupts and hangs
- wdc->reset() is failing to reset things
- __wdcwait_reset() is failing to read a good state (too short timeout or...?)
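To make the "too short timeout" idea concrete, here is a tiny standalone
sketch (not the real wdc.c code; every name in it is made up for
illustration). It models a drive that keeps the ATA status BSY bit (0x80)
set for some number of status reads after a reset, and a wait loop that
gives up after a fixed poll budget. If the drive needs longer than the
budget, the wait reports failure even though the reset itself worked:

```c
#include <assert.h>
#include <stdint.h>

/* ATA status "busy" bit (bit 7). The macro name is hypothetical. */
#define FAKE_STATUS_BSY 0x80

/* Hypothetical stand-in for a drive: stays busy for a fixed number of reads. */
struct fake_drive {
	int polls_until_ready;
};

/* Simulated status register read; not a real bus access. */
static uint8_t
fake_read_status(struct fake_drive *d)
{
	if (d->polls_until_ready > 0) {
		d->polls_until_ready--;
		return FAKE_STATUS_BSY;
	}
	return 0x00;
}

/*
 * Poll until BSY clears or the poll budget runs out.
 * Returns 0 if the drive became ready, -1 on timeout.
 */
static int
fake_wait_reset(struct fake_drive *d, int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++) {
		if ((fake_read_status(d) & FAKE_STATUS_BSY) == 0)
			return 0;
	}
	return -1;
}
```

With a drive that needs 50 reads to come ready, fake_wait_reset(&d, 10)
times out while fake_wait_reset(&d, 100) succeeds, so from the outside a
slow-but-healthy device looks the same as a failed reset.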
It might be interesting to see what a kernel with ATADEBUG shows,
though there is a possibility that outputting the debug will change
the timing enough not to trigger the issue. It may also be interesting
to see if running with polling IO rather than interrupts avoids the
issue (though the latter definitely just for testing).
I'm not sure I'd feel right trying to encourage testing of that
nature on a co-lo datacentre machine tho' :)