iSCSI - istgt pthread_create() error

Status
Not open for further replies.

Hipcrostino

Cadet
Joined
May 24, 2012
Messages
2
Hi all,

I've got a FreeNAS server set up to do two things. On the one hand it's a normal file server in a Windows network using CIFS. On the other hand it's an iSCSI target for two Windows 2008 R2 Hyper-V machines in a cluster.

Most of the time it works just fine. The iSCSI system is set up on its own separate network through its own switch so that iSCSI traffic isn't clogging the main network. That part seems to be working well.

My problem is that every couple of weeks the istgt server fails. It gives this error:

"istgt[3006]: istgt_iscsi.c:5088:worker: ***ERROR*** pthread_create() failed" (3006 is the PID an always is)

This is followed by many lines saying "last message repeated xx times", where xx can be any number. As a result, the Hyper-V cluster falls over (just test and dev servers, no big deal) and soon after CIFS stops responding. The box still responds via keyboard, so I tell it to restart from the console menu (option 10). When I do this the following happens:

Waiting for PIDS: 3006

This is followed 30 seconds later by the watchdog timeout. The shutdown is then aborted, the disks get synced (thank god), and the box stops there. I have to use the reset button to fire it back up.

I've tried to kill process 3006 manually but no commands seem to do anything. I'd be happy if I could just get the server to restart properly after an error like this.
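For what it's worth, this is roughly the sort of thing I tried (a sketch from memory; 3006 is just the PID from my logs):

Code:
# see what state the process is stuck in; "D" (uninterruptible disk wait)
# would explain why signals appear to do nothing
ps -o pid,state,wchan,command -p 3006

# normal terminate first, then force it
kill -TERM 3006
kill -KILL 3006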

The server is running 8.0.4 at the moment. I plan to upgrade to p2 ASAP, although there's no indication that any iSCSI changes were made between those releases.

Has anyone else got this problem? It's starting to get annoying and I can't move production machines into the Hyper-V cluster until it's resolved.

istgt version is: "istgt version 0.4 (20110928)"
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Could you post a little more information about your system?

I've been tracking a performance issue for some time now. In some cases, it's possible to slag ZFS to the point of near-unresponsiveness, depending on what you're doing. The simple case appears to be slowish disks on a large, fast FreeNAS box. I've been able to get it to go catatonic, flushing its cache, for minutes at a time. See Bug #1531.

I've seen variations of this with other combinations of hardware, but it was always hard to identify exactly what was going on because in many cases the server's CPU was pegged when it happened.

You might want to check the following things:

1) Is there any indication that large scale file transfers of any sort were happening?

2) Does "gstat" from the command line show that your component disks are busy, especially 99-100% busy?

3) If you cd over to "/mnt/yourpool" and start accessing files (ls, more, etc) is that successful or does it also hang?

4) Can you see what state the istgt process is in? ("ps agxlww|grep istgt") If it shows "tx->tx", it may be getting hung up in tx->tx_quiesce_done_cv, which basically means the system's busy flushing stuff out to disk. This isn't a lockup - the condition will clear as the cache finishes flushing - but clients may not care.
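A rough sketch of how I'd run those checks from an SSH session while the box is wedged ("yourpool" and "somefile" are placeholders, obviously):

Code:
# 2) per-disk load; the tell-tale is component disks pinned at 99-100% busy
gstat

# 3) does plain filesystem access stall as well?
cd /mnt/yourpool && ls -la
dd if=somefile of=/dev/null bs=1m count=32

# 4) what is istgt blocked on? Look for "tx->tx" in the wait-channel column
ps agxlww | grep istgt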

If this is the problem you're hitting, it has nothing to do with istgt; it's ZFS. A fix, or more accurately a workaround, is to lower vfs.zfs.txg.timeout to 5, and find a workable value to set vfs.zfs.txg.write_limit_override to.
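A minimal sketch of what that tuning looks like from the CLI; the write_limit_override value below is only an example starting point, and depending on the FreeBSD version you may need to set these through the FreeNAS GUI (System -> Sysctls/Tunables) so they survive a reboot:

Code:
# flush transaction groups more often, so less dirty data piles up per flush
sysctl vfs.zfs.txg.timeout=5

# cap how much dirty data a txg may accumulate (1 GB here; tune to taste)
sysctl vfs.zfs.txg.write_limit_override=1073741824

# confirm the current values
sysctl vfs.zfs.txg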

Other random thought: if this is happening about monthly (you say "every couple of weeks"), you might also want to preface all of this with a check of whether or not ZFS is performing a scrub ("zpool status"), which happens monthly, and eats up huge amounts of I/O capacity for a long period of time (like maybe a day if your disks are big). I have a feeling that the scrub places substantial additional strain on the system and may tease systems that are already struggling on IOPS into bad behaviour.
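Checking for (and if need be stopping) a scrub is quick; "yourpool" is a placeholder:

Code:
# the scrub/scan line shows whether a scrub is running and how far along it is
zpool status yourpool

# stop an in-progress scrub to get the I/O capacity back (it can be re-run later)
zpool scrub -s yourpool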
 

Hipcrostino

Cadet
Joined
May 24, 2012
Messages
2
Hi jgreco,

That's some great information there! To answer your questions as best I can.

System Specs:
16 GB RAM
3.1 GHz quad-core Xeon E3-1220
Asus P8B-C/4L ATX server motherboard (4 onboard gigabit LAN ports)
Areca ARC-1261 RAID card (all drives configured as pass-through)
1x Intel 40 GB SSD
9x 1 TB 7200 RPM SATA HDD
6x 2 TB 7200 RPM SATA HDD

FreeNAS is configured as follows:

2 zpools:

The first is the 9x 1 TB drives set up as three 3-disk mirror vdevs striped together. This is normal file storage.
The second is the 6x 2 TB drives as two 3-disk mirror vdevs, plus the SSD as a log device (by itself :-/ ).

On the second pool there are two iSCSI targets set up. One is a 1 GB witness disk for the Windows cluster; the other is the actual storage for the VM disks.


1) Is there any indication that large scale file transfers of any sort were happening?

Yes, I have seen this happen consistently with large file transfers. Not always, but most of the time; things like copying a 30 GB VM HDD. But I've also seen it freeze during normal usage.

2) Does "gstat" from the command line show that your component disks are busy, especially 99-100% busy?
3) If you cd over to "/mnt/yourpool" and start accessing files (ls, more, etc) is that successful or does it also hang?
4) Can you see what state the istgt process is in? ("ps agxlww|grep istgt") If it shows "tx->tx", it may be getting hung up in tx->tx_quiesce_done_cv, which basically means the system's busy flushing stuff out to disk. This isn't a lockup - the condition will clear as the cache finishes flushing - but clients may not care.


For these three I'll have to wait until it happens again to test, but I will get back to you when it does. I might induce it this arvo via a large file transfer once the users have left, to see what I can find out. Currently the system is working fine; gstat shows minimal disk usage, <1% here and there.
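If I do try to induce it, the plan is roughly this (a sketch; the pool name, file name and size are just examples, and if compression is enabled on the dataset, writing zeros won't exercise the disks much):

Code:
# session 1: generate a large sequential write into the pool (~30 GB)
dd if=/dev/zero of=/mnt/yourpool/bigfile.tmp bs=1m count=30720

# session 2: watch per-disk busy % while the write runs
gstat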

if this is happening about monthly (you say "every couple of weeks")?

It can happen more often than monthly. The scrub is set to 35 days presently (I get daily admin emails which tell me this). I'll monitor to see if a scrub makes it happen as well.

Hope that information helps. I'll get back to you when I know more.

Thanks, Paul.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Don't rule out other possibilities, but I think you're seeing the same thing I am.

https://support.freenas.org/ticket/1531

I've got an E3-1230 w/ 32GB RAM on a Supermicro X9SCi-LN4F in a 1U chassis with 4x 3.5" HDD slots. We're upgrading/recycling/etc. some older gear; the old Opterons were great small fileservers in their day, but I was seeing them slag out under FreeNAS, getting maxed out on CPU among other things. So I had been relatively curious to try out the old drives on the new system, to see just how things compared... the old Opteron 240s are maybe 15-20% of the speed, and were only set up with 8GB. So the platform was massively bigger and better.

So imagine my shock when I started seeing lagouts measurable in *minutes*. Any pending I/O to the ZFS mount would just go catatonic.

Basically the problem is that ZFS is designed for certain workloads and makes some assumptions about the hardware that might not be right.

For example, with 32GB, you have the opportunity to have a very large amount of cached write buffer in ZFS. That's essentially a killer for systems with slow I/O. If you look at the output from my ticket, you'll notice that dd writes a HUGE amount of data very quickly, but the disks are trickling it out to the platters at a much slower 30-40 MB/sec each. So basically if you write a bunch out to ZFS very quickly, it gets to a point where it decides that the cached data all needs to be flushed out to disk, and then proceeds to *hang* until it has flushed it. For my older slower 400GB drives, that turned into minutes. For some more modern 4TB 7200 drives, that was still in the uncomfortably-long-seconds range.

ZFS can be tuned somewhat to help reduce this. See my ticket and also various FreeNAS threads like http://forums.freenas.org/showthread.php?6347-ZFS-performance-with-sphere-5

What I'm trying right now is to make a point to the FreeNAS dev team that this is not a desirable thing for a NAS to do; I would much rather suffer slower I/O with minimum latency than faster I/O with periodic hangs (and the associated problems this causes for all the fileserver's clients).

Please report back, and feel free to add anything useful to the bug ticket too.
 

wagmachine

Cadet
Joined
May 31, 2012
Messages
2
I think I'm seeing either the same or a very similar problem; it's happening now (my FreeNAS 8.2.0 beta 3 box serves only me, so I can afford to let it sit in a non-functional state while I type this). My symptoms differ from jgreco's, but I'm seeing the same complaints about pthread_create() in istgt.

I'm not running on very beefy server-class hardware; mine is scraped together from left-over parts to serve my home/work/lab network. I'm running the 64-bit version of FreeNAS on a Core2Duo-based IP35 Pro motherboard with 8 GB RAM and 3x 2 TB SATA drives in a RAID-Z config with an SSD for L2ARC and ZIL. I'm serving up some NFS shares, CIFS shares, and 5 or 6 iSCSI targets for some virtual disks for KVM on a CentOS 6 box. My FreeNAS box was pretty much rock-solid until I realized that it was only using 2 GB of RAM and enabled autotune (which bumped kmem up to 5 GB or so and arc_max up to 4 GB). Now it hangs on almost a daily basis; usually I can't even ping it, but today I can get in via SSH.
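For reference, this is how I've been checking what autotune actually set versus what the ARC is really using (a rough sketch; on FreeNAS the loader tunables themselves live under System in the GUI):

Code:
# what the kernel was booted with
sysctl vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max

# how much ARC is actually in use right now, and its current ceiling
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max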

1) Is there any indication that large scale file transfers of any sort were happening?

For me, generally yes. I believe this happened around 1:15 AM this morning during an incremental backup of the NFS-mounted shares. Sometimes it happens when doing big transfers via iSCSI, and once I believe it happened while copying some ISOs to the CIFS share. However, I also can't seem to force it to occur; after seeing the first couple of hangs and losing my confidence in the server, I started trying to reproduce it by simultaneously dd'ing from multiple iSCSI targets and NFS shares, with no luck. The problem only seems to manifest after the server has been up for a while.

2) Does "gstat" from the command line show that your component disks are busy, especially 99-100% busy?

No; on my server, everything is at 0%.

3) If you cd over to "/mnt/yourpool" and start accessing files (ls, more, etc) is that successful or does it also hang?

I can list directories and access files (a "dd if=<file> of=/dev/null" succeeds).

4) Can you see what state the istgt process is in? ("ps agxlww|grep istgt") If it shows "tx->tx", it may be getting hung up in tx->tx_quiesce_done_cv, which basically means the system's busy flushing stuff out to disk. This isn't a lockup - the condition will clear as the cache finishes flushing - but clients may not care.

"istgt" on my box always indicates that it's in the "ucond" state (even under normal operation; maybe I don't push it hard enough). Right now here's what I get:

Code:
0 3004 1 0 44 0 3547700 178264 ucond SLs ?? 24:58.36 /usr/local/bin/istgt -c /usr/local/etc/istgt/istgt.conf



if this is happening about monthly (you say "every couple of weeks")?

Just about daily for me; I can probably find more information for you tomorrow :-(

If you guys think this is a different issue, should I file a bug as well?

Thanks!

Scott
 

wagmachine

Cadet
Joined
May 31, 2012
Messages
2
A little more information; here's the output from "top" on the FreeNAS server at the time of the hang:

Code:
last pid: 44177;  load averages:  0.18,  0.12,  0.08    up 1+13:04:17  09:46:18
1497 processes:1 running, 1496 sleeping
CPU:  5.2% user,  0.0% nice,  1.5% system,  0.0% interrupt, 93.3% idle
Mem: 207M Active, 71M Inact, 5450M Wired, 158M Buf, 2193M Free
Swap: 6144M Total, 6144M Free
Connection to freenas closed by remote host.
Connection to freenas closed.
[header and start of this line garbled in the paste]  71508K 13936K ucond   1  10:12  2.29% collectd
  598 staylor       1          13324K  7616K CPU0    0   3:31  0.39% top
 3004            1439  44       3413M   173M ucond   0  13:47  0.00% istgt
 1930 root          4  45    0  5688K  1240K zfs     1   7:14  0.00% nfsd
 2692 root          8  44    0   136M 76312K uwait   1   7:03  0.00% python
 3593 staylor       1  44    0 15564K  2032K nanslp  0   0:37  0.00% zpool
 2472 root          1  44    0  6776K  1252K select  1   0:31  0.00% powerd
 3596 staylor       1  44    0 35876K  5064K select  1   0:13  0.00% sshd
 1978 root          1  44    0 11788K  2664K select  1   0:13  0.00% ntpd
 3570 staylor       1  44    0 35876K  5056K select  1   0:09  0.00% sshd
 2316 root          1  44    0 39224K  6300K select  0   0:05  0.00% nmbd
 1761 root          1  44    0  6908K  1508K select  0   0:03  0.00% syslogd
38489 www           1  44    0 14376K  4852K kqread  0   0:01  0.00% nginx
 2147 root          1  44    0 46800K  9300K select  0   0:00  0.00% smbd
 1962 root          1  76    0  7960K  1556K rpcsvc  0   0:00  0.00% rpc.lockd
 2309 root          1  44    0 13416K  3072K nanslp  0   0:00  0.00% smartd
 1907 root          1  44    0  7840K  1608K select  0   0:00  0.00% rpcbind


Interesting to me that "istgt" was using about 3.5 GB of RAM; right now, after a reboot, with 5 iSCSI initiators running, it's at 135M. I'm wondering if there's some sort of memory leak in the iSCSI target (although if so, I would've expected this problem to be worse with less RAM).
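In case it's useful, I've started logging istgt's memory use over time to see whether it really does creep up (a quick-and-dirty sketch; the log path and interval are arbitrary):

Code:
# append a virtual/resident memory sample for istgt every 5 minutes
while true; do
    date >> /mnt/tank/istgt-mem.log
    ps -axo pid,vsz,rss,command | grep '[i]stgt' >> /mnt/tank/istgt-mem.log
    sleep 300
done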

Scott
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That seems like a different (set of?) problem(s?) to me. I'll keep an eye out for some of this as I'm going to be testing and then deploying some new hardware in the next month or three.
 