current-users: RAIDframe halts

Subject: RAIDframe halts
To: None <tech-kern@netbsd.org>
From: Mike M. Volokhov <mishka@apk.od.ua>
List: current-users
Date: 01/26/2004 13:23:15

Greetings!

Using RAIDframe (RAID level 5) on NetBSD 1.6.2_RC3 with three IBM SCSI drives,
the system hung in the following state (as reported by top(1)):

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
  498 root     -18    0   220K 3604K uvn_fp1    0:01 11.84%  6.25% cp
   17 root     -18    0     0K   19M km_getwa   0:05  3.56%  3.56% [raid]
  499 root     -18    0  1032K 1768K uvn_fp1    0:00  2.45%  1.17% sendmail
  500 root     -18    0  1032K 1768K uvn_fp1    0:00  1.32%  0.34% sendmail

This happened when cp (PID 498) was invoked on a large set of assorted data
(about 500 MB in total) with the following command:

	cp -Rp /from/non/raid/partition /to/raid-5/partition

The system hung after part of the data (about 20%) had already been copied.
It still replies to ICMP echo requests, but does not respond to anything else,
not even the console keyboard. It may also hang when copying from the RAID, or
when copying from RAID to RAID, but I have not tested this yet.

The two sendmail processes (499, 500) are shown only as an example of the
system state.

My disks are:

	Non-RAID:
sd0 at scsibus0 target 0 lun 0: <IBM, DCAS-34330W, S65A> SCSI2 0/direct fixed
sd0: 4134 MB, 8205 cyl, 6 head, 171 sec, 512 bytes/sect x 8467200 sectors
sd0: sync (50.0ns offset 15), 16-bit (40.000MB/s) transfers, tagged queueing
sd5 at scsibus0 target 8 lun 0: <SEAGATE, ST34501W, 0018> SCSI2 0/direct fixed
sd5: 4339 MB, 6576 cyl, 8 head, 168 sec, 512 bytes/sect x 8887200 sectors
sd5: sync (50.0ns offset 15), 16-bit (40.000MB/s) transfers, tagged queueing

	Under RAID level 5 control:
sd1 at scsibus0 target 1 lun 0: <IBM, DDRS-34560D, DC1B> SCSI2 0/direct fixed
sd1: 4357 MB, 8387 cyl, 5 head, 212 sec, 512 bytes/sect x 8925000 sectors
sd1: sync (50.0ns offset 15), 16-bit (40.000MB/s) transfers, tagged queueing
sd2 at scsibus0 target 2 lun 0: <IBM, DDRS-34560D, DC1B> SCSI2 0/direct fixed
sd2: 4357 MB, 8387 cyl, 5 head, 212 sec, 512 bytes/sect x 8925000 sectors
sd2: sync (50.0ns offset 15), 16-bit (40.000MB/s) transfers, tagged queueing
sd3 at scsibus0 target 3 lun 0: <IBM, DDRS-34560D, DC1B> SCSI2 0/direct fixed
sd3: 4357 MB, 8387 cyl, 5 head, 212 sec, 512 bytes/sect x 8925000 sectors
sd3: sync (50.0ns offset 15), 16-bit (40.000MB/s) transfers, tagged queueing

---- RAID configuration ----
Components:
           /dev/sd1d: optimal
           /dev/sd2d: optimal
           /dev/sd3d: optimal
No spares.
Component label for /dev/sd1d:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 3
   Version: 2, Serial Number: 2003012502, Mod Counter: 3157682
   Clean: No, Status: 0
   sectPerSU: 32, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 8924928
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid0
Component label for /dev/sd2d:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 3
   Version: 2, Serial Number: 2003012502, Mod Counter: 3157682
   Clean: No, Status: 0
   sectPerSU: 32, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 8924928
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid0
Component label for /dev/sd3d:
   Row: 0, Column: 2, Num Rows: 1, Num Columns: 3
   Version: 2, Serial Number: 2003012502, Mod Counter: 3157682
   Clean: No, Status: 0
   sectPerSU: 32, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 8924928
   RAID Level: 5
   Autoconfig: Yes
   Root partition: No
   Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
---- END of RAID configuration ----
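
For reference, the geometry implied by the component labels above can be
worked out with a quick sketch. It assumes that numBlocks is the
per-component block count and that one column's worth of every stripe holds
parity, as is usual for RAID 5. The function and variable names are made up
for illustration and are not part of RAIDframe.

```python
# Values copied from the component labels above.
SECT_PER_SU = 32        # sectPerSU: sectors per stripe unit
BLOCK_SIZE = 512        # blocksize, in bytes
NUM_BLOCKS = 8924928    # numBlocks (assumed to be per component)
NUM_COLUMNS = 3         # Num Columns: components in the set


def raid5_geometry(sect_per_su, block_size, num_blocks, num_columns):
    """Return (stripe_unit_bytes, full_stripe_data_bytes, usable_bytes).

    Hypothetical helper: assumes a single-row RAID 5 set in which one
    column's worth of each stripe is used for parity.
    """
    data_columns = num_columns - 1
    stripe_unit = sect_per_su * block_size
    full_stripe_data = stripe_unit * data_columns
    usable = num_blocks * block_size * data_columns
    return stripe_unit, full_stripe_data, usable


stripe_unit, full_stripe_data, usable = raid5_geometry(
    SECT_PER_SU, BLOCK_SIZE, NUM_BLOCKS, NUM_COLUMNS
)
print(f"stripe unit:      {stripe_unit // 1024} KiB")
print(f"full stripe data: {full_stripe_data // 1024} KiB")
print(f"usable capacity:  {usable / 2**20:.2f} MiB")
```

Under these assumptions, a write that does not cover a whole stripe's worth
of data needs a parity update based on existing data, which is the kind of
load a recursive cp of many small files would generate.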

What might be causing this?
Any help would be appreciated.

--
Kind regards,
Mishka.