SOLVED An Unexpected Network Error Has Occurred

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
OK, so you do have the IT mode HBA. How is your pool constructed? What specific drives are you using?
 

jwsl224

Dabbler
Joined
Feb 13, 2021
Messages
30
I am using Dell SAS 7200 rpm drives in RAIDZ2, with default encryption and default settings from there. I have tried going with and without compression. Still getting the occasional "unexpected network error".
 

Samuel Tai

Please show the output of zpool status -v. RAIDZ2 may not be the best pool layout for your particular workload. Do you have any L2ARC or SLOG drives in your pool?
 

jwsl224

Code:
root@truenas[~]# zpool status -v
  pool: Old Server 1TB
 state: ONLINE
config:

        NAME                                            STATE     READ WRITE CKSUM
        Old Server 1TB                                  ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/369864be-8a8f-11eb-a66a-ecf4bbd7a0a0  ONLINE       0     0     0
            gptid/371d250c-8a8f-11eb-a66a-ecf4bbd7a0a0  ONLINE       0     0     0
            gptid/38eca520-8a8f-11eb-a66a-ecf4bbd7a0a0  ONLINE       0     0     0

errors: No known data errors

  pool: Old Server 500GB
 state: ONLINE
config:

        NAME                                            STATE     READ WRITE CKSUM
        Old Server 500GB                                ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/1d482fa2-8b13-11eb-a66a-ecf4bbd7a0a0  ONLINE       0     0     0
            gptid/1db991cd-8b13-11eb-a66a-ecf4bbd7a0a0  ONLINE       0     0     0

errors: No known data errors

  pool: SAS3TB
 state: ONLINE
  scan: scrub repaired 0B in 03:41:53 with 0 errors on Sun Mar 21 05:41:57 2021
config:

        NAME                                            STATE     READ WRITE CKSUM
        SAS3TB                                          ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/026d70d3-88ef-11eb-b19d-ecf4bbd7a0a0  ONLINE       0     0     0
            gptid/02a43280-88ef-11eb-b19d-ecf4bbd7a0a0  ONLINE       0     0     0
            gptid/02b25a9e-88ef-11eb-b19d-ecf4bbd7a0a0  ONLINE       0     0     0
            gptid/02cd9731-88ef-11eb-b19d-ecf4bbd7a0a0  ONLINE       0     0     0

errors: No known data errors

  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:06 with 0 errors on Fri Apr  2 03:45:06 2021
config:

        NAME                                            STATE     READ WRITE CKSUM
        boot-pool                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/5

i am running a test on the one mirror pool right now, seeing if it will also have any errors like the others.

now hold on. how can RaidZ2 not be the best layout for my workload?

also, i'm not quite sure what L2ARC or SLOG drives are..
 

jwsl224

update: i just ran a copy test on the mirror pool on the server and am getting the same network error. so it can't be the raid setup on the pool. it has to be something in the communication between windows and TrueNAS. Something isn't in agreement.
 

Samuel Tai

What does ifconfig -a show?
 

jwsl224


Code:
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e53bbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether ec:f4:bb:d7:a0:a0
        inet 10.1.3.245 netmask 0xffffff00 broadcast 10.1.3.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether ec:f4:bb:d7:a0:a1
        inet 10.1.6.241 netmask 0xffffff00 broadcast 10.1.6.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
igb2: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether ec:f4:bb:d7:a0:a2
        media: Ethernet autoselect
        status: no carrier
        nd6 options=1<PERFORMNUD>
igb3: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether ec:f4:bb:d7:a0:a3
        media: Ethernet autoselect
        status: no carrier
        nd6 options=1<PERFORMNUD>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:6d:c8:0b:23:00
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 ho


also, as a further update, the Dell T410 TrueNAS instance is not actually using a raid card; it's got 5 external USB drives in RaidZ plugged into it, and we're having the same issue with that one.
 

Samuel Tai

SMH. USB attached external storage is not a recipe for success. Most USB controllers can only really handle FAT filesystems, and have horrible performance with other file systems. I'm not surprised you're seeing errors with this setup.
 

jwsl224

did you "shake my head" at me? :=D

oh well. it's ok. I just set it up as a test bench and to familiarize myself with TrueNAS. we're not actually using it for production.

But the Dell R730 setup is enterprise grade hardware; it's got no reason to fail. there must be some setting off somewhere.
 

jwsl224

Samuel. Update. i have created a virtual machine running Windows Server on the RaidZ pool on the server, and have been absolutely shellacking it with SMB traffic all day: 0 network errors. there are different brands of HDs on this pool, and one of them is a low-end desktop hard drive which is a lot slower than the others. so the Windows Server instance is really sluggish, and sometimes even hangs. still: 0 network errors, 0 invalid handles, going at full tilt, with over 100k files transferred. i've also been pounding it with hash checks. still no errors.

however, as soon as i start copying files to the RaidZ2 pool, which has all enterprise SAS drives from Dell, i get "unexpected network error" and "invalid file handle".

there is some kind of software problem going on with TrueNAS in how it handles SMB traffic, or how it talks with Windows machines. what are your thoughts?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
What version of TrueNAS is this?
 

jwsl224

well, i'm not a big believer in updates fixing much in one swing, but jeepers. this one SEEMS off to a good start. i've sent about 500k files towards the box as a preliminary test, 0 errors so far. even the server with a USB drive array on RaidZ hasn't come up with any errors. and it's on a wireless connection! i'm beginning to wonder if the update wasn't just to disable any error reporting :cool:

if i don't see any errors for the rest of the day i'll do a torture test at night by submitting all 3 pools simultaneously to an MD5 hash check over SMB and see if that will make them fail. hopefully not. *fingers crossed*
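
For anyone wanting to run a similar overnight check, here is a minimal Python sketch of an MD5 comparison between a local source folder and its copy on the share. It assumes the SMB share is already mapped or mounted at a path the script can read. The function names and the example paths are hypothetical, not something from this thread. Read errors are recorded rather than raised, so an intermittent failure shows up in the report instead of stopping the run.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of one file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Hash every file under root.

    Returns (hashes, errors): hashes maps relative path -> MD5 digest,
    errors is a list of (relative path, message) for files that could
    not be read (for example because of a network error on the share).
    """
    hashes = {}
    errors = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            try:
                hashes[rel] = md5_of_file(full)
            except OSError as exc:
                errors.append((rel, str(exc)))
    return hashes, errors


def compare_trees(source_root, copy_root):
    """Compare a source tree against its copy and summarize the differences."""
    src_hashes, src_errors = hash_tree(source_root)
    dst_hashes, dst_errors = hash_tree(copy_root)
    # Files that failed to read on the copy side appear in both
    # "missing" and "errors", since no digest could be computed for them.
    missing = sorted(rel for rel in src_hashes if rel not in dst_hashes)
    mismatched = sorted(
        rel for rel in src_hashes
        if rel in dst_hashes and src_hashes[rel] != dst_hashes[rel]
    )
    return {
        "checked": len(src_hashes),
        "missing": missing,
        "mismatched": mismatched,
        "errors": src_errors + dst_errors,
    }


if __name__ == "__main__":
    # Hypothetical paths: a local folder and the same data on a mapped share.
    report = compare_trees(r"C:\testdata", r"Z:\testdata")
    print(report)
```

Running one copy of the script per pool at the same time would approximate the "all 3 pools simultaneously" test.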
 

jwsl224

Update to latest version. There were SMB fixes.
this seemed to solve it. been running SMB traffic most of yesterday and through the night. out of over 1 million operations there was only 1 network error. i think that's close enough :)
 