Current-Users archive


iscsi-target core dumps



Hi,

I've been having some problems with iscsi-target on -current recently.
This is an old dual PIII/500 MHz system with 512 MB of memory which I
could not put to better use than as an iSCSI target; it has been doing
this for quite some time, I guess since some 4.99.50 or similar,
without any problems. Recently I installed a poor-man's gigabit
network in my test lab using:

re0 at pci0 dev 18 function 0: US Robotics (3Com) USR997902 Gigabit
Ethernet (rev. 0x10)
re0: interrupting at ioapic0 pin 18
re0: Ethernet address 00:14:c1:4c:3a:f4
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 0

I don't remember whether the version installed at the time did not
recognize the card or I just wanted to see how -current behaves as an
iSCSI target; anyway, I switched to 5.99.XX, at present 5.99.21. My
targets file is as follows:

# extent        file or device          start           length
extent0         /dev/wd0d       0               78533MB
extent1         /dev/wd1d       0               78533MB
extent2         /dev/sd1d       0               8676MB
# target        flags   storage         netmask
target0=wd0             rw      extent0         0.0.0.0/0
target1=wd1             rw      extent1         0.0.0.0/0
target2=sd1             rw      extent2         0.0.0.0/0

so it is exposing only three physical disk drives. The initiator is
a Windows 2008 R2 server.
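
For reference, the layout of the targets file above can be read mechanically. Below is a minimal sketch in Python, assuming whitespace-separated fields exactly as shown, 512-byte sectors, and "MB" meaning 2**20 bytes; parse_targets is a hypothetical helper written for illustration and is not iscsi-target's own parser.

```python
# Illustrative only: parse_targets is a hypothetical helper,
# not the parser used by iscsi-target itself.
TARGETS = """\
# extent        file or device          start           length
extent0         /dev/wd0d       0               78533MB
extent1         /dev/wd1d       0               78533MB
extent2         /dev/sd1d       0               8676MB
# target        flags   storage         netmask
target0=wd0             rw      extent0         0.0.0.0/0
target1=wd1             rw      extent1         0.0.0.0/0
target2=sd1             rw      extent2         0.0.0.0/0
"""

SECTOR_BYTES = 512       # assumption: 512-byte sectors
MB_BYTES = 1024 * 1024   # assumption: "MB" means 2**20 bytes


def parse_targets(text):
    """Split the file into extent and target records keyed by name."""
    extents, targets = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented column headers
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"unexpected line: {raw!r}")
        if fields[0].startswith("extent"):
            name, device, start, length = fields
            if not length.endswith("MB"):
                raise ValueError(f"unsupported length unit: {length!r}")
            extents[name] = {
                "device": device,
                "start": int(start),
                "sectors": int(length[:-2]) * MB_BYTES // SECTOR_BYTES,
            }
        elif fields[0].startswith("target"):
            # "target1=wd1": treat the part after "=" as a free-form label
            name, _, label = fields[0].partition("=")
            targets[name] = {
                "label": label,
                "flags": fields[1],
                "storage": fields[2],
                "netmask": fields[3],
            }
        else:
            raise ValueError(f"unknown record type: {raw!r}")
    return extents, targets


extents, targets = parse_targets(TARGETS)
for name, target in targets.items():
    extent = extents[target["storage"]]
    print(name, target["label"], extent["device"], extent["sectors"], "sectors")
```

Under those assumptions each 78533MB extent comes to 160835584 sectors, slightly less than the 160836480 total sectors that fdisk reports for wd1 further down.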

I am getting regular core dumps like the following:

This GDB was configured as "i386--netbsdelf"...(no debugging symbols found)

Reading symbols from /usr/lib/libiscsi.so.2...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libiscsi.so.2
Reading symbols from /usr/lib/libpthread.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libpthread.so.1
Reading symbols from /usr/lib/libc.so.12...
(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libc.so.12
Reading symbols from /usr/libexec/ld.elf_so...(no debugging symbols
found)...done.
Loaded symbols for /usr/libexec/ld.elf_so

Core was generated by `iscsi-target'.
Program terminated with signal 11, Segmentation fault.
#0  0xbbb2486b in _malloc_prefork () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbbb2486b in _malloc_prefork () from /usr/lib/libc.so.12
#1  0xbbb24b2b in free () from /usr/lib/libc.so.12
#2  0xbbbd9c1d in iscsi_free () from /usr/lib/libiscsi.so.2
#3  0xbbbd2aae in param_text_parse () from /usr/lib/libiscsi.so.2
#4  0xbbbc9002 in find_target_iqn () from /usr/lib/libiscsi.so.2
#5  0xbbbca1e1 in iscsi_target_start () from /usr/lib/libiscsi.so.2
#6  0xbbbb4f1b in pthread_create () from /usr/lib/libpthread.so.1
#7  0xbbb06fe0 in ___lwp_park50 () from /usr/lib/libc.so.12
#8  0xbb200000 in ?? ()
#9  0xbb400000 in ?? ()
#10 0x11110001 in ?? ()
#11 0x00000001 in ?? ()
#12 0x33330003 in ?? ()
#13 0x00000000 in ?? ()
(gdb)

On the initiator host I have to disconnect from the target, remove the
portal, then start the target, then discover the portal and connect to
the targets, which is a bit annoying. As an additional complication,
this initiator uses another two or three portals - one NetBSD 5.99.22
with some 7 dk slices exposed and several [Open]Solaris COMSTAR iSCSI
targets, for the grand total of about 15 iSCSI disks in use (obviously
I am trying to push the initiator as well):

--- DISKPART output - the last three disks are the ones in question:

DISKPART> list disk

  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          223 GB  1024 KB
  Disk 1    Online          233 GB  1024 KB
  Disk 2    Online          335 GB      0 B
  Disk 3    Online          232 GB      0 B
  Disk 4    Online          153 GB      0 B
  Disk 5    Online           19 GB      0 B
  Disk 6    Online           14 GB      0 B
  Disk 7    Online           19 GB      0 B
  Disk 8    Online           14 GB      0 B
  Disk 9    Online           19 GB      0 B
  Disk 10   Online           14 GB      0 B
  Disk 11   Online            9 GB      0 B
  Disk 12   Online         4882 MB      0 B
  Disk 13   Online         4882 MB      0 B
  Disk 14   Online         4882 MB      0 B
  Disk 15   Online         4882 MB      0 B
  Disk 16   Online            8 GB      0 B
  Disk 17   Online           76 GB  1024 KB
  Disk 18   Online           76 GB  1024 KB

DISKPART> select disk 16
Disk 16 is now the selected disk.
DISKPART> list partition
  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Primary              8 GB    31 KB
DISKPART> select disk 17
Disk 17 is now the selected disk.
DISKPART> list partition
  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Primary             76 GB  1024 KB
DISKPART> select disk 18
Disk 18 is now the selected disk.
DISKPART> list partition
  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Primary             29 GB  1024 KB
  Partition 2    Primary             47 GB    29 GB
DISKPART>
-------------------------------------------

I see there is one more possibly relevant bit of information. For some
inexplicable reason this system finds

# dkctl  wd1 listwedges
/dev/rwd1d: 2 wedges:
dk0: zfs, 160818911 blocks at 256, type:
dk1: 4868e609-6160-5f69-d388-db197e8b93ae, 16384 blocks at 160819167, type:
# dmesg | grep dk
dk0 at wd1: zfs
dk0: 160818911 blocks at 256, type:
dk1 at wd1: 4868e609-6160-5f69-d388-db197e8b93ae
dk1: 16384 blocks at 160819167, type:

To the best of my knowledge, I have never used this disk in a Solaris
environment, but I might be wrong; ATM:

# fdisk /dev/rwd1d
Disk: /dev/rwd1d
NetBSD disklabel disk geometry:
cylinders: 159560, heads: 16, sectors/track: 63 (1008 sectors/cylinder)
total sectors: 160836480

BIOS disk geometry:
cylinders: 1024, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
total sectors: 160836479

Partition table:
0: NTFS, OS/2 HPFS, QNX2 or Advanced UNIX (sysid 7)
    start 2048, size 61440000 (30000 MB, Cyls 0/32/33-3824/150/38)
1: NTFS, OS/2 HPFS, QNX2 or Advanced UNIX (sysid 7)
    start 61442048, size 99389440 (48530 MB, Cyls 3824/150/39-10011/75/48)
2: <UNUSED>
3: <UNUSED>
No active partition.
Drive serial number: 3827248766 (0xe41f2e7e)

it contains only two NTFS partitions.
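
The two layouts can be cross-checked with a little arithmetic. The sketch below is in Python and assumes 512-byte sectors; the start and size figures are copied from the dkctl and fdisk output above, while the names ntfs0/ntfs1 and the helper functions are made up for illustration.

```python
SECTOR_BYTES = 512          # assumption: 512-byte sectors
TOTAL_SECTORS = 160836480   # "total sectors" from the disklabel geometry

# (start, size) in sectors, copied from the dkctl listing
wedges = {
    "dk0": (256, 160818911),
    "dk1": (160819167, 16384),
}
# (start, size) in sectors, copied from the fdisk partition table;
# the names ntfs0/ntfs1 are illustrative
mbr = {
    "ntfs0": (2048, 61440000),
    "ntfs1": (61442048, 99389440),
}


def end(region):
    """First sector past the end of a (start, size) region."""
    return region[0] + region[1]


def overlaps(a, b):
    """True if two (start, size) regions share at least one sector."""
    return a[0] < end(b) and b[0] < end(a)


def mib(sectors):
    """Size in MiB for a sector count."""
    return sectors * SECTOR_BYTES / 2**20


for name, region in {**wedges, **mbr}.items():
    print(f"{name}: sectors {region[0]}..{end(region) - 1}, {mib(region[1]):.0f} MiB")

for wname, wedge in wedges.items():
    for pname, part in mbr.items():
        if overlaps(wedge, part):
            print(f"{wname} overlaps {pname}")
```

With these figures dk0 spans both NTFS partitions and dk1 (8 MiB) falls inside the tail of the second one, so the wedges and the MBR partitions cannot both describe live data; presumably one of the two layouts is stale, though which one the kernel should trust is a separate question.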

When the disks are in use, they seem to be working fine - apart from
the flood of

Dec  4 11:33:02 support9 iscsi-target: pid 23508:/usr/src/external/bsd/iscsi/lib/../dist/src/lib/target.c:1845: ***ERROR*** Final bit
Dec  4 11:33:02 support9 iscsi-target: pid 23508:/usr/src/external/bsd/iscsi/lib/../dist/src/lib/disk.c:1399: ***ERROR*** target_transfer_data() failed

messages (which BTW I've tried to eliminate by removing the
corresponding lines of target.c and disk.c, with the result that the
initiator hard-locked when it tried to access the target...).

Is there anything apparently wrong in my setup? Maybe a network
capture would be of interest, I wonder.

-- 
Chavdar Ivanov
----
Samuel Goldwyn  - "I'm willing to admit that I may not always be
right, but I am never wrong." -
http://www.brainyquote.com/quotes/authors/s/samuel_goldwyn.html

