Current-Users archive
iscsi-target core dumps
Hi,
I've been having some problems with iscsi-target on -current recently.
This is an old dual PIII/500 MHz system with 512 MB of memory, for which
I could find no better use than as an iSCSI target; it has been doing
this for quite some time, I guess since 4.99.50 or thereabouts,
without any problems. Recently I installed a poor-man's gigabit
network in my test lab using:
re0 at pci0 dev 18 function 0: US Robotics (3Com) USR997902 Gigabit Ethernet (rev. 0x10)
re0: interrupting at ioapic0 pin 18
re0: Ethernet address 00:14:c1:4c:3a:f4
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 0
I don't remember whether the version installed at the time did not
recognize the card or I just wanted to see how -current behaves as an
iSCSI target; anyway, I switched to 5.99.XX, at present 5.99.21. My
targets file is as follows:
# extent        file or device  start   length
extent0         /dev/wd0d       0       78533MB
extent1         /dev/wd1d       0       78533MB
extent2         /dev/sd1d       0       8676MB
# target        flags   storage         netmask
target0=wd0     rw      extent0         0.0.0.0/0
target1=wd1     rw      extent1         0.0.0.0/0
target2=sd1     rw      extent2         0.0.0.0/0
so it is exposing only three physical disk drives. The initiator is
a Windows 2008 R2 server.
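
For anyone who wants to cross-check a targets file like the one above, here is a minimal Python sketch. It assumes only the simple four-column layout shown in this message (extent lines, then target lines); it is not the parser iscsi-target itself uses, and the function name parse_targets is made up for illustration.

```python
# Contents of the targets file quoted in this message.
TARGETS = """\
# extent file or device start length
extent0 /dev/wd0d 0 78533MB
extent1 /dev/wd1d 0 78533MB
extent2 /dev/sd1d 0 8676MB
# target flags storage netmask
target0=wd0 rw extent0 0.0.0.0/0
target1=wd1 rw extent1 0.0.0.0/0
target2=sd1 rw extent2 0.0.0.0/0
"""

def parse_targets(text):
    """Split the file into extents and targets (hypothetical helper,
    handles only the four-field lines shown above)."""
    extents, targets = {}, {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        name = fields[0].split("=", 1)[0]     # "target0=wd0" -> "target0"
        if name.startswith("extent"):
            _, device, start, length = fields
            extents[name] = (device, start, length)
        elif name.startswith("target"):
            _, flags, storage, netmask = fields
            targets[name] = (flags, storage, netmask)
    return extents, targets

extents, targets = parse_targets(TARGETS)

# Every target should be backed by an extent that is actually defined.
missing = [t for t, (_, storage, _) in targets.items() if storage not in extents]

for t, (flags, storage, netmask) in targets.items():
    device = extents.get(storage, ("?",))[0]
    print(t, flags, storage, "->", device, netmask)
print("targets referencing undefined extents:", missing)
```

For this file the list of undefined extents comes out empty, so the file at least looks self-consistent.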
I am getting regular core dumps like the following:
This GDB was configured as "i386--netbsdelf"...(no debugging symbols found)
Reading symbols from /usr/lib/libiscsi.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libiscsi.so.2
Reading symbols from /usr/lib/libpthread.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libpthread.so.1
Reading symbols from /usr/lib/libc.so.12...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libc.so.12
Reading symbols from /usr/libexec/ld.elf_so...(no debugging symbols found)...done.
Loaded symbols for /usr/libexec/ld.elf_so
Core was generated by `iscsi-target'.
Program terminated with signal 11, Segmentation fault.
#0 0xbbb2486b in _malloc_prefork () from /usr/lib/libc.so.12
(gdb) bt
#0 0xbbb2486b in _malloc_prefork () from /usr/lib/libc.so.12
#1 0xbbb24b2b in free () from /usr/lib/libc.so.12
#2 0xbbbd9c1d in iscsi_free () from /usr/lib/libiscsi.so.2
#3 0xbbbd2aae in param_text_parse () from /usr/lib/libiscsi.so.2
#4 0xbbbc9002 in find_target_iqn () from /usr/lib/libiscsi.so.2
#5 0xbbbca1e1 in iscsi_target_start () from /usr/lib/libiscsi.so.2
#6 0xbbbb4f1b in pthread_create () from /usr/lib/libpthread.so.1
#7 0xbbb06fe0 in ___lwp_park50 () from /usr/lib/libc.so.12
#8 0xbb200000 in ?? ()
#9 0xbb400000 in ?? ()
#10 0x11110001 in ?? ()
#11 0x00000001 in ?? ()
#12 0x33330003 in ?? ()
#13 0x00000000 in ?? ()
(gdb)
On the initiator host I have to disconnect from the target, remove the
portal, then start the target, then discover the portal and connect to
the targets, which is a bit annoying. As an additional complication,
this initiator uses another two or three portals - one NetBSD 5.99.22
with some 7 dk slices exposed and several [Open]Solaris COMSTAR iSCSI
targets, for the grand total of about 15 iSCSI disks in use (obviously
I am trying to push the initiator as well):
--- DISKPART output - the last three disks are the ones in question:
DISKPART> list disk
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          223 GB  1024 KB
  Disk 1    Online          233 GB  1024 KB
  Disk 2    Online          335 GB      0 B
  Disk 3    Online          232 GB      0 B
  Disk 4    Online          153 GB      0 B
  Disk 5    Online           19 GB      0 B
  Disk 6    Online           14 GB      0 B
  Disk 7    Online           19 GB      0 B
  Disk 8    Online           14 GB      0 B
  Disk 9    Online           19 GB      0 B
  Disk 10   Online           14 GB      0 B
  Disk 11   Online            9 GB      0 B
  Disk 12   Online         4882 MB      0 B
  Disk 13   Online         4882 MB      0 B
  Disk 14   Online         4882 MB      0 B
  Disk 15   Online         4882 MB      0 B
  Disk 16   Online            8 GB      0 B
  Disk 17   Online           76 GB  1024 KB
  Disk 18   Online           76 GB  1024 KB
DISKPART> select disk 16
Disk 16 is now the selected disk.
DISKPART> list partition
  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Primary              8 GB    31 KB
DISKPART> select disk 17
Disk 17 is now the selected disk.
DISKPART> list partition
  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Primary             76 GB  1024 KB
DISKPART> select disk 18
Disk 18 is now the selected disk.
DISKPART> list partition
  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Primary             29 GB  1024 KB
  Partition 2    Primary             47 GB    29 GB
DISKPART>
-------------------------------------------
I see there is one more possibly relevant bit of information. For some
inexplicable reason this system finds
# dkctl wd1 listwedges
/dev/rwd1d: 2 wedges:
dk0: zfs, 160818911 blocks at 256, type:
dk1: 4868e609-6160-5f69-d388-db197e8b93ae, 16384 blocks at 160819167, type:
# dmesg | grep dk
dk0 at wd1: zfs
dk0: 160818911 blocks at 256, type:
dk1 at wd1: 4868e609-6160-5f69-d388-db197e8b93ae
dk1: 16384 blocks at 160819167, type:
To the best of my knowledge, I have never used this disk in a Solaris
environment, but I might be wrong; at the moment:
# fdisk /dev/rwd1d
Disk: /dev/rwd1d
NetBSD disklabel disk geometry:
cylinders: 159560, heads: 16, sectors/track: 63 (1008 sectors/cylinder)
total sectors: 160836480
BIOS disk geometry:
cylinders: 1024, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
total sectors: 160836479
Partition table:
0: NTFS, OS/2 HPFS, QNX2 or Advanced UNIX (sysid 7)
start 2048, size 61440000 (30000 MB, Cyls 0/32/33-3824/150/38)
1: NTFS, OS/2 HPFS, QNX2 or Advanced UNIX (sysid 7)
start 61442048, size 99389440 (48530 MB, Cyls 3824/150/39-10011/75/48)
2: <UNUSED>
3: <UNUSED>
No active partition.
Drive serial number: 3827248766 (0xe41f2e7e)
it contains only two NTFS partitions.
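
The sector numbers from dkctl and fdisk above can be compared with a little arithmetic. The sketch below is only a sanity check under the assumption of 512-byte sectors; the region names (mbr0, mbr1) and the helpers end, mb and overlaps are made up for illustration. It shows that both wedges sit on top of the NTFS partitions, which would be consistent with them coming from a leftover label rather than from anything currently in use, though that is an inference and not something the tools report.

```python
SECTOR = 512  # bytes per sector (assumption, not reported by the tools above)

total_sectors = 160836480  # "total sectors" from the NetBSD disklabel geometry

# (name, start sector, size in sectors), copied from the fdisk and dkctl output
mbr = [("mbr0", 2048, 61440000), ("mbr1", 61442048, 99389440)]
wedges = [("dk0", 256, 160818911), ("dk1", 160819167, 16384)]

def end(region):
    """First sector past the end of a region."""
    return region[1] + region[2]

def mb(sectors):
    """Size in MB (2**20 bytes), the unit fdisk prints."""
    return sectors * SECTOR / 2**20

def overlaps(a, b):
    """True if the two half-open sector ranges share at least one sector."""
    return a[1] < end(b) and b[1] < end(a)

for name, start, size in mbr + wedges:
    print(f"{name}: sectors {start}..{start + size - 1}, {mb(size):.1f} MB")

for w in wedges:
    for p in mbr:
        if overlaps(w, p):
            print(f"{w[0]} overlaps {p[0]}")

print("everything fits on the disk:",
      all(end(r) <= total_sectors for r in mbr + wedges))
```

The two MBR partitions come out as 30000 MB and 48530 MB, matching what fdisk prints, and dk1 starts exactly where dk0 ends.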
When the disks are in use, they seem to be working fine - apart from
the flood of
Dec 4 11:33:02 support9 iscsi-target: pid 23508:/usr/src/external/bsd/iscsi/lib/../dist/src/lib/target.c:1845: ***ERROR*** Final bit
Dec 4 11:33:02 support9 iscsi-target: pid 23508:/usr/src/external/bsd/iscsi/lib/../dist/src/lib/disk.c:1399: ***ERROR*** target_transfer_data() failed
messages (which, BTW, I tried to eliminate by removing the
corresponding lines of target.c and disk.c, with the result that the
initiator hard-locked when it tried to access the target...).
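
To get a feel for how heavy the flood is without touching the sources, the messages can be tallied by source location. This is a minimal Python sketch that assumes the syslog line format quoted above; the pattern and the tally function are made up for illustration, and in practice the lines would come from the system log file instead of the inline sample.

```python
import re
from collections import Counter

# The two sample lines quoted in this message, each on a single line.
LOG = """\
Dec 4 11:33:02 support9 iscsi-target: pid 23508:/usr/src/external/bsd/iscsi/lib/../dist/src/lib/target.c:1845: ***ERROR*** Final bit
Dec 4 11:33:02 support9 iscsi-target: pid 23508:/usr/src/external/bsd/iscsi/lib/../dist/src/lib/disk.c:1399: ***ERROR*** target_transfer_data() failed
"""

# pid, source path, source line number, message text
PATTERN = re.compile(
    r"iscsi-target: pid (\d+):(\S+?):(\d+): \*\*\*ERROR\*\*\* (.*)$"
)

def tally(lines):
    """Count error messages per (source file, line number, message)."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            _pid, path, lineno, message = m.groups()
            key = (path.rsplit("/", 1)[-1], int(lineno), message.strip())
            counts[key] += 1
    return counts

counts = tally(LOG.splitlines())
for (src, lineno, message), n in counts.most_common():
    print(f"{n:6d}  {src}:{lineno}  {message}")
```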
Is anything obviously wrong with my setup? Maybe a network capture
would be interesting, I wonder.
--
Chavdar Ivanov
----
Samuel Goldwyn - "I'm willing to admit that I may not always be
right, but I am never wrong." -
http://www.brainyquote.com/quotes/authors/s/samuel_goldwyn.html