Samba, Win10, and Invalid Signature

Status
Not open for further replies.

greco

Dabbler
Joined
Oct 29, 2012
Messages
18
Hello- I recently bought new hardware and installed 11.1U4 (and even more recently updated to U5) .

Hardware is:
Core i3-6100 CPU @ 3.70GHz
16GB ram
3x 4TB drives in a mirror config
1x ssd boot drive

I'm having an issue with my Samba shares that did not exist in my FreeNAS 8.2.0. (The v8 box has not been decom yet, so I can still test against it, but it's non in use.)

On the 11.1 box, I can authenticate and access the shares. Directory list, file creation, edit, delete, all that good stuff. The problem is long time file handles break at some point with an "Invalid Signature" error on the next write. Aka:
Code:
NT_STATUS_INVALID_SIGNATURE  0xc000a000
The cryptographic signature is invalid.


I have an application (securecrt) running on my Win10 machine that I have configured to write log files to the Windows (SMB) share. The log files are created and written to and remain open for as long as the ssh session is open. After some undetermined period of (idle?) time, the next write to the log file causes an error to pop up saying Invalid Signature and no more logging.

According to smbstatus -v running in a loop with a 60 second delay,
Code:
83772		21109	  DENY_WRITE 0x100085	RDONLY	 LEASE(RWH)	   /mnt/primary/software   Internet/VanDyke/scrt/logs/20180611-bucket.log   Mon Jun 11 18:25:31 2018

the handle was still open until the write was attempted, and then disappeared from the list.

According to sysinternals procmon, secure crt still seemed to think it had an open file handle.

Most of my research regarding this error has been about the shares just not working at all. I can't find anything about "it mostly works except for long term file handles."

Secure CRT is just how I discovered this issue. I was able to create a simple script that created/opened a file on the share, and after some period of time, it also got the same error.

Code:
irb(main)> fh = open('filehandle-test.txt', 'wb')
=> #<File:filehandle-test.txt> (note: this is a "good" response)

irb(main)> fh.write("\n"+Time.now.to_s) ; fh.flush
=> #<File:filehandle-test.txt>

irb(main)> fh.write("\n"+Time.now.to_s) ; fh.flush
=> #<File:filehandle-test.txt>

irb(main)> fh.write("\n"+Time.now.to_s) ; fh.flush
Errno::E-21468: Invalid Signature.
		from (irb):17:in `flush'
		from (irb):17
		from C:/Ruby24-x64/bin/irb.cmd:19:in `<main>'


I'm not sure where to look next or how to address this. Seems like it would be a pretty obvious problem for people. My freenas is pretty stock, I don't have any jails or other such customizations. The machine is working great otherwise. Only other problem (annoyance, really) I have with it is that all of the "Reporting" graphs get erased on reboot, and that's pretty minor.

I also have Win7 and WinXP I can test with and cal also do some more sophisticated tests with writes to see if constant writing fails after some time, or only when there's some period of idleness after the last write.
 

greco

Dabbler
Joined
Oct 29, 2012
Messages
18
Thanks, I will try setting that. Though after a number of various tests, it doesn't seem to be idle related. I made a script that writes every a line every 5 minutes and it still gets the error. It takes a long time for the error to happen, but does consistently.
Three tests showed failure at 7.5, 6.9, and 9.3 hours. Meanwhile, the same test to the FreeNAS 8.2 system is still going since Monday night just fine.

It's my home rig, and I'm the only user, so usage is extremely low.

I also tried the same script from a linux host with an NFS mount in case it wasn't samba directly, but that test has been running fine since yesterday afternoon, about 17 hours so far.
 

greco

Dabbler
Joined
Oct 29, 2012
Messages
18
I'm still seeing this after adding deadtime = 0 -- though I noticed that while it added deadtime = 0 to the end of smb.conf [global] section, it did not remove the one already in the config that says deadtime = 15 (which is the same setting on the 8.2 system)

It still gets the errors after about 8-9 hours of the handle being opened. The 8.2 freenas is still happily writing the open file handle since Monday.

I've captured some samba logs at the time the error occurred which I'm still researching:

Code:
--- log.smbd

[2018/06/14 11:47:00.225599,  3] ../source3/smbd/smb2_server.c:3115(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_NETWORK_SESSION_EXPIRED] || at ../source3/smbd/smb2_server.c:2452
[2018/06/14 11:47:00.237296,  3] ../auth/kerberos/kerberos_pac.c:409(kerberos_decode_pac)
  Found account name from PAC: gregg [Gregg]
[2018/06/14 11:47:00.237314,  3] ../source3/auth/user_krb5.c:51(get_user_from_kerberos_info)
  Kerberos ticket principal name is [gregg@MYDOMAIN.COM]
[2018/06/14 11:47:00.241428,  3] ../source3/param/loadparm.c:3856(lp_load_ex)
  lp_load_ex: refreshing parameters


Code:
--- log.winbindd

[2018/06/14 11:47:00.236628,  3] ../source3/winbindd/winbindd_misc.c:395(winbindd_interface_version)
  [ 3262]: request interface version (version = 29)
[2018/06/14 11:47:00.236706,  3] ../source3/winbindd/winbindd_misc.c:428(winbindd_priv_pipe_dir)
  [ 3262]: request location of privileged pipe
[2018/06/14 11:47:00.237036,  2] ../auth/kerberos/kerberos_pac.c:96(check_pac_checksum)
  check_pac_checksum: PAC Verification failed: Decrypt integrity check failed (-1765328353)
[2018/06/14 11:47:00.237071,  2] ../auth/kerberos/kerberos_pac.c:96(check_pac_checksum)
  check_pac_checksum: PAC Verification failed: Decrypt integrity check failed (-1765328353)
[2018/06/14 11:47:00.237126,  3] ../auth/kerberos/kerberos_pac.c:409(kerberos_decode_pac)
  Found account name from PAC: gregg [Gregg]


Code:
--- log.wb-FARM

[2018/06/14 11:46:43.301139,  3] ../source3/winbindd/winbindd_dual_srv.c:704(_wbint_CheckMachineAccount)
  domain FARM secret is good
[2018/06/14 11:47:00.252312,  3] ../source3/winbindd/winbindd_pam.c:2375(winbindd_dual_pam_logoff)
  [68158]: pam dual logoff FARM\gregg
[2018/06/14 11:47:45.037697,  3] ../source3/winbindd/winbindd_cm.c:2085(connection_ok)
  connection_ok: Connection to pasture.mydomain.com for domain FARM is not connected
[2018/06/14 11:47:45.038346,  3] ../source3/libads/ldap.c:634(ads_connect)
  Successfully contacted LDAP server 10.0.0.4
[2018/06/14 11:47:45.038399,  3] ../source3/libsmb/namequery.c:3160(get_dc_list)
  get_dc_list: preferred server list: "pasture.mydomain.com, *"
[2018/06/14 11:47:45.038842,  3] ../source3/libsmb/namequery.c:3160(get_dc_list)
  get_dc_list: preferred server list: "pasture.mydomain.com, *"
[2018/06/14 11:47:45.039674,  3] ../source3/lib/util_sock.c:515(open_socket_out_send)
  Connecting to 10.0.0.4 at port 445
[2018/06/14 11:47:45.040618,  3] ../lib/ldb-samba/ldb_wrap.c:326(ldb_wrap_connect)
  ldb_wrap open of secrets.ldb
[2018/06/14 11:47:45.040731,  3] ../source3/libsmb/cliconnect.c:271(cli_session_creds_prepare_krb5)
  got OID=1.2.840.48018.1.2.2
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Can you provide the full contents of log.smbd, and winbind logs via PM?

NT_STATUS_NETWORK_SESSION_EXPIRED
Per MS-SMB2:
If the Status field in the SMB2 header is STATUS_NETWORK_SESSION_EXPIRED, the client MUST attempt to reauthenticate the session that is identified by the SessionId in the SMB2 header, as specified in section 3.2.4.2.3. If the reauthentication attempt succeeds, the client MUST retry the request that failed with STATUS_NETWORK_SESSION_EXPIRED. If the reauthentication attempt fails, the client MUST fail the operation and terminate the session, as specified in section 3.2.4.23.

So it looks like the reauth attempt is failing. Now to find why.
 

greco

Dabbler
Joined
Oct 29, 2012
Messages
18
Thanks for helping with this. I sent more log data as PM.

Something interesting I've noticed in my testing, all connections die at about the same time. So if I have 3x scripts running (3 different processes each having its own unique open file), they all die at pretty much the same time no matter if they were started close to the same time or hours apart (especially noticeable in the last set). You can see in the smb log where it writes every 5 minutes, then the session expires. Is that the session between the client and the freenas, or the freenas and the AD for authentication?

Code:
End:		2018-06-12 06:17:58 -0400 465 minutes since start  7.75h
End:		2018-06-12 06:20:59 -0400 455 minutes since start  7.5h
-
End:		2018-06-12 16:02:51 -0400 415 minutes since start  6.91h
-
End:		2018-06-13 01:47:36 -0400 560 minutes since start  9.33h
-
End:		2018-06-13 11:51:19 -0400 35 minutes since start
End:		2018-06-13 11:49:28 -0400 125 minutes since start
-
End:		2018-06-13 22:02:00 -0400 580 minutes since start
End:		2018-06-13 22:02:00 -0400 580 minutes since start
-
End:		2018-06-13 21:58:42 -0400 575 minutes since start
End:		2018-06-13 22:02:23 -0400 545 minutes since start
-
End:		2018-06-14 07:47:59 -0400 465 minutes since start
End:		2018-06-14 07:47:00 -0400 565 minutes since start
-
End:		2018-06-14 19:20:17 -0400 440 minutes since start
End:		2018-06-14 19:19:44 -0400 370 minutes since start
End:		2018-06-14 19:21:56 -0400 145 minutes since start
 

greco

Dabbler
Joined
Oct 29, 2012
Messages
18
I started the same tests on an XP vm, Win7 VM, and a physical Win7 machine last night. All are still running 16 hours later.

The Win10 machine stopped around 5am this morning and again around 3pm today. So definitely seems related to Win10.

I'm not a fan of Win 10, but it's the company issued laptop, so kinda stuck with it.
And I don't have an immediate way to test with a different Win 10 device.
 
Joined
Jun 11, 2017
Messages
11
Hello,

have you ever found a solution to this?

I have a simular issue and I am pretty sure it is related to the Kerberos Ticket expiration.
=> default in the Default domain policy GPO is 10 hours ("Maximum lifetime for service ticket" or "Maximum lifetime for user ticket")
 

greco

Dabbler
Joined
Oct 29, 2012
Messages
18
No, I never did. I haven't had time to research it after opening the ticket. They wanted a debug bundle that I wasn't comfortable uploading so I made a clean FreeNAS install as a VM and that clean install did not have a problem, though it wasn't synced to my "prod" AD and was providing AD itself. My next phase was going to be to sync it to the real AD and continue testing, but I just kind of stalled out, even though it's still an issue for me.

I also wanted to test from another Win10 machine or VM, but I don't have access to one right now.
 
Joined
Jun 11, 2017
Messages
11
I did some more tests, but I simply cannot get it to work (I went back to 11.0 Release => same issue).

Can anybody confirm they have this working:
  • FreeNAS 11.something (I tested most with 11.1 U5)
  • Windows server 2016 AD Domain (Functional level: 2016) - latest upgrades/patches
  • Windows 10 Client - latest upgrades/patches
  • SMB share on FreeNAS, mapped on Windows 10 client using an AD account
  • Do a file copy towards this share for more then 10 hours
I get errors after 10 hours...
If somebody can confirm it is working for them, I know it's something in my setup...
 
Joined
Jun 11, 2017
Messages
11
BTW: no need to do a copy for 10 hours.
I get the same error when playing an MP3 file.

This is what I get in log.smbd at the time of disconnect:

[2018/07/16 16:42:47.469443, 3] ../source3/smbd/smb2_server.c:3115(smbd_smb2_request_error_ex)
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_NETWORK_SESSION_EXPIRED] || at ../source3/smbd/smb2_server.c:2452


and I can change the 10-hour to 1 hour by changing the kerberos ticket lifetime from 10 hours to 1 hour in the Default Domain Policy GPO within the domain => 100% sure kerberos ticket related...


Is there a way to re-open Bug report 35092 ?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
BTW: no need to do a copy for 10 hours.
I get the same error when playing an MP3 file.

This is what I get in log.smbd at the time of disconnect:

[2018/07/16 16:42:47.469443, 3] ../source3/smbd/smb2_server.c:3115(smbd_smb2_request_error_ex)
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_NETWORK_SESSION_EXPIRED] || at ../source3/smbd/smb2_server.c:2452


and I can change the 10-hour to 1 hour by changing the kerberos ticket lifetime from 10 hours to 1 hour in the Default Domain Policy GPO within the domain => 100% sure kerberos ticket related...


Is there a way to re-open Bug report 35092 ?

Perhaps you can open up a new bug ticket with a debug file attached and steps to reproduce. I'll work some in my testing environment to see what I can figure out about why this is failing.
 
Joined
Jun 11, 2017
Messages
11
I can confirm it is working correctly using a Windows 2012 R2 server as domain controller.
So it is not Windows 10 related, but AD.

Also a Server 2016 domain controller, but setting the domain functional level to 2012 R2 is not OK => errors.

Is nobody else using a Server 2016 domain controllers?
 
Joined
Jun 11, 2017
Messages
11
Hello Guys,

is really nobody using Windows Server 2016 as domain controller?

Or am I doing something wrong and is it working for you all?
 
Status
Not open for further replies.
Top