Kerberised NFS not working after TrueNAS 12.0-U4 to 12.0-U8.1 upgrade

Fab Sidoli

Contributor
Joined
May 15, 2019
Messages
114
Hi All,

I'm in need of some fairly urgent help to debug an issue.

After upgrading to TrueNAS-12.0-U8.1, kerberised NFS has stopped working for my Linux clients (around 80).

My exports are set up in /etc/zfs/exports (for various reasons) and take the following format.

/mnt/store/home/username -sec=krb5:krb5i:krb5p

The NFS service is running and has been restarted a couple of times for good measure. An rpcinfo from a Linux client suggests that the "necessary" services and ports are open:

[root@linux-client ~]# rpcinfo -p fileserver.fqdn
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100000    4     7    111  portmapper
    100000    3     7    111  portmapper
    100000    2     7    111  portmapper
    100005    1   udp    867  mountd
    100005    3   udp    867  mountd
    100005    1   tcp    867  mountd
    100005    3   tcp    867  mountd
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100024    1   udp    602  status
    100024    1   tcp    602  status
    100021    0   udp    604  nlockmgr
    100021    0   tcp    956  nlockmgr
    100021    1   udp    604  nlockmgr
    100021    1   tcp    956  nlockmgr
    100021    3   udp    604  nlockmgr
    100021    3   tcp    956  nlockmgr
    100021    4   udp    604  nlockmgr
    100021    4   tcp    956  nlockmgr
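
A listing like the one above is easier to read when collapsed to one line per service and transport. The following is a minimal sketch, assuming the five-column layout shown (program, version, protocol, port, service); `summarise_rpcinfo` is a hypothetical helper name, and the sample input is a subset of the listing.

```shell
#!/bin/sh
# Collapse `rpcinfo -p` output to one line per service/protocol pair,
# listing the RPC versions registered for it.
# Assumes the five-column layout: program vers proto port service.
summarise_rpcinfo() {
    awk '
        $1 ~ /^[0-9]+$/ {                 # skip the header line
            key = $5 "/" $3
            if (key in seen) {
                vers[key] = vers[key] "," $2
            } else {
                seen[key] = 1
                order[++n] = key          # remember first-seen order
                vers[key] = $2
            }
        }
        END { for (i = 1; i <= n; i++) print order[i], "versions", vers[order[i]] }
    '
}

# Real use from a client would be:
#   rpcinfo -p fileserver.fqdn | summarise_rpcinfo
summarise_rpcinfo <<'EOF'
program vers proto port service
100005 1 tcp 867 mountd
100005 3 tcp 867 mountd
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
EOF
```

For the full listing this reports only versions 2 and 3 for nfs over tcp. That is not conclusive on its own, since an NFSv4 server does not have to register version 4 with the portmapper.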

I include the following in case it helps, but I really don't know how to debug this. I'm trying to open a support ticket, but I'm currently having issues with my account that are preventing me from doing so. Any help would be very much appreciated. I suspect this is a Kerberos ticket issue of some sort.

root@filestore[~]# net ads info
LDAP server: 111.22.33.44
LDAP server name: ad-server.fqdn
realm: F.Q.D.N
Bind Path: dc=F,dc=Q,dc=D,dc=N
LDAP port: 389
Server time: Tue, 16 Aug 2022 09:38:36 BST
KDC server: 111.22.33.44
Server time offset: 0
Last machine account password change: Fri, 24 Jul 2020 17:02:01 BST

root@filestore[~]# ktutil -k /etc/krb5.keytab list | grep nfs
1 des-cbc-crc nfs/filestore.fqdn@F.Q.D.N
1 des-cbc-crc nfs/FILESTORE@F.Q.D.N
1 des-cbc-md5 nfs/filestore.fqdn@F.Q.D.N
1 des-cbc-md5 nfs/FILESTORE@F.Q.D.N
1 aes128-cts-hmac-sha1-96 nfs/filestore.fqdn@F.Q.D.N
1 aes128-cts-hmac-sha1-96 nfs/FILESTORE@F.Q.D.N
1 aes256-cts-hmac-sha1-96 nfs/filestore.fqdn@F.Q.D.N
1 aes256-cts-hmac-sha1-96 nfs/FILESTORE@F.Q.D.N
1 arcfour-hmac-md5 nfs/filestore.fqdn@F.Q.D.N
1 arcfour-hmac-md5 nfs/FILESTORE@F.Q.D.N
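
The keytab above holds DES, RC4 and AES keys for each nfs principal. DES and RC4 are deprecated in Kerberos, so it can be useful to see at a glance how many entries of each kind exist, although nothing in this thread shows that this is the cause. A minimal sketch, assuming the three-column listing format shown above (vno, type, principal); `classify_nfs_enctypes` is a hypothetical helper name.

```shell
#!/bin/sh
# Count keytab entries for nfs/ principals, split into AES and legacy
# (DES / RC4) encryption types. Reads a `ktutil list`-style listing on stdin,
# assuming three columns: vno, type, principal.
classify_nfs_enctypes() {
    awk '
        $3 ~ /^nfs\// {
            kind = ($2 ~ /^aes/) ? "aes" : "legacy"
            count[$3 " " kind]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Real use on the server (command taken from the post above):
#   ktutil -k /etc/krb5.keytab list | classify_nfs_enctypes
classify_nfs_enctypes <<'EOF'
1 des-cbc-crc nfs/filestore.fqdn@F.Q.D.N
1 aes128-cts-hmac-sha1-96 nfs/filestore.fqdn@F.Q.D.N
1 aes256-cts-hmac-sha1-96 nfs/filestore.fqdn@F.Q.D.N
1 arcfour-hmac-md5 nfs/filestore.fqdn@F.Q.D.N
EOF
```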
 

Fab Sidoli

Just an update. Setting sec=sys allows me to mount the NFS share manually.

With Kerberos enabled, and with automount debugging on, I'm seeing the following logs:

automount[9390]: >> mount.nfs4: access denied by server while mounting filestore.fqdn:/mnt/store/home/username

Do I need to leave and rejoin the domain on the TrueNAS box after an update?
 

Fab Sidoli

Red Hat have provided me with an update which I hope might be useful. If anyone has any ideas please let me know!

From the pcap as well, the nfs-server is responding with NFS4ERR_WRONGSEC indicating
it's not supporting the security mechanism with which the share is being mounted:
~~~
$ tshark -tad -nr 0040-2022-08-16-linux-client-nfs-client.pcap -Y 'rpc.xid in {0xd0487b02 0x99c488df}'
261 2022-08-16 12:49:45.392087 ${IP1} → ${IP2} NFS 334 V4 Call EXCHANGE_ID
265 2022-08-16 12:49:45.392231 ${IP2} → ${IP1} NFS 106 V4 Reply (Call In 261) Status: NFS4ERR_MINOR_VERS_MISMATCH
430 2022-08-16 12:49:45.407558 ${IP1} → ${IP2} NFS 334 V4 Call EXCHANGE_ID
434 2022-08-16 12:49:45.407663 ${IP2} → ${IP1} NFS 114 V4 Reply (Call In 430) EXCHANGE_ID Status: NFS4ERR_WRONGSEC <---
~~~

from the RFC:
~~~
NFS4ERR_WRONGSEC The security mechanism being used by the client
for the operation does not match the server's
security policy. The client should change the
security mechanism being used and retry the
operation.
~~~

This is a server side issue which needs to be looked at with the help of the NFS server vendor.
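
The RFC text quoted above says the client should retry with a different security mechanism. One way to see which combinations the server accepts is to try them one at a time from a Linux client. The following is a minimal dry-run sketch that only prints the mount commands. The server name and export path come from this thread; `/mnt/test` is a hypothetical mount point, and the `vers=4.0` and `vers=4.1` spellings assume a reasonably recent nfs-utils on the client.

```shell
#!/bin/sh
# Print (dry run) one mount command per NFS version / security flavour pair,
# so each combination can be tried by hand on a Linux client.
# SERVER and EXPORT come from the thread; MOUNTPOINT is a hypothetical path.
SERVER="filestore.fqdn"
EXPORT="/mnt/store/home/username"
MOUNTPOINT="/mnt/test"

print_mount_matrix() {
    for vers in 3 4.0 4.1; do
        for sec in sys krb5 krb5i krb5p; do
            echo "mount -t nfs -o vers=${vers},sec=${sec} ${SERVER}:${EXPORT} ${MOUNTPOINT}"
        done
    done
}

print_mount_matrix
```

Running each printed command as root, and unmounting between attempts, should show whether the failure is tied to one protocol version, one Kerberos flavour, or all of them.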
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Try updating to 13.0-U1.1 and see if it's resolved there.
 

Fab Sidoli

Hi @anodos. I can't do that, I'm afraid. It's a production server, so it would need much more planning. I also don't want to go down a rabbit hole at this stage with further upgrades. In fact, I'm almost tempted to roll back to see if that helps.
 

Fab Sidoli

Hi @anodos. It turns out that I have no rollback option for 12.0-U4. I have two boxes, the second of which does give me an option to roll back.

Is there something I need to set to make this happen going forward? I thought this was an automatic thing.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@Fab Sidoli Can you post a screenshot of the System > Boot menu? You should be able to activate the older boot environment there and then reboot into it.
 

Fab Sidoli

Please see the attached. The failover server has the older 12.0 boot environments. The prod server does not, even though it was previously running 12.0-U4. I also note that the boot environment on prod for 12.0-U8.1 is over 50 GB in size.


root@prod[~]# zfs list | grep "freenas-boot/ROOT"
freenas-boot/ROOT                  57.2G  357G    88K  none
freenas-boot/ROOT/11.3-U4           332K  357G  1.25G  /
freenas-boot/ROOT/11.3-U4.1         408K  357G  1.25G  /
freenas-boot/ROOT/11.3-U5           404K  357G  50.6G  /
freenas-boot/ROOT/12.0-U8.1        57.2G  357G  50.8G  /
freenas-boot/ROOT/FreeNAS-12.0-U4   372K  357G  50.8G  /
freenas-boot/ROOT/Initial-Install     8K  357G  1.24G  legacy
freenas-boot/ROOT/default           316K  357G  1.24G  legacy

root@failover[~]# zfs list | grep "freenas-boot/ROOT"
freenas-boot/ROOT                  9.33G  405G    88K  none
freenas-boot/ROOT/11.3-U4.1         392K  405G  1.25G  /
freenas-boot/ROOT/11.3-U5           432K  405G  1.26G  /
freenas-boot/ROOT/12.0-U4           608K  405G  1.40G  /
freenas-boot/ROOT/12.0-U7           388K  405G  1.41G  /
freenas-boot/ROOT/12.0-U8.1        9.33G  405G  1.42G  /
freenas-boot/ROOT/FreeNAS-12.0-U3   352K  405G  1.37G  /
freenas-boot/ROOT/Initial-Install     8K  405G  1.24G  legacy
freenas-boot/ROOT/default           316K  405G  1.24G  legacy
 

Attachments

  • failover.png
  • prod.png

Patrick M. Hausen

There is a 12.0-U4 at the bottom of the prod screenshot. Also beadm list is way more convenient than the zfs | grep you used :wink:
 

Fab Sidoli

I wasn't sure this was the correct thing because the failover has an actual 12.0-U4 listing (as well as a FreeNAS-12.0-U3), so I was expecting to see the same. I'm not sure if the naming convention is significant.

I'm also not sure why the 12.0-U8.1 boot environment on prod is nearly 60 GB in size, compared with the other images on both systems.

root@prod[~]# beadm list
BE               Active  Mountpoint  Space  Created           Nickname
default          -       -           6.6M   2020-07-08 19:15  default
Initial-Install  -       -           6.2M   2020-07-08 19:35  Initial-Install
11.3-U4          -       -           1.3G   2020-07-22 19:20  11.3-U4
11.3-U4.1        -       -           1.3G   2020-08-02 20:31  11.3-U4.1
11.3-U5          -       -           1.3G   2020-12-01 07:06  11.3-U5
FreeNAS-12.0-U4  -       -           1.4G   2021-07-12 07:32  FreeNAS-12.0-U4
12.0-U8.1        NR      /           57.2G  2022-08-16 06:58  12.0-U8.1

root@failover[~]# beadm list
BE               Active  Mountpoint  Space  Created           Nickname
default          -       -           6.1M   2020-08-07 14:52  default
Initial-Install  -       -           5.8M   2020-08-07 15:02  Initial-Install
11.3-U4.1        -       -           1.3G   2020-08-07 15:53  11.3-U4.1
11.3-U5          -       -           1.3G   2020-11-03 08:34  11.3-U5
FreeNAS-12.0-U3  -       -           1.4G   2021-04-29 09:50  FreeNAS-12.0-U3
12.0-U4          -       -           1.4G   2021-07-05 16:29  12.0-U4
12.0-U7          -       -           1.4G   2022-01-05 10:33  12.0-U7
12.0-U8.1        NR      /           9.3G   2022-08-08 12:23  12.0-U8.1
 

Patrick M. Hausen

You can rename them all you like. Possibly someone did? Both are from July 2021, so that seems to match with prod having been updated one week after failover.

As for the size - where is your system dataset located? Possibly there is a core dump somewhere in that huge BE.
 

Fab Sidoli

The system dataset is in the default location. I don't believe I have changed this. It is listed as freenas-boot in the UI, which I believe lives on the boot drives, although I could be wrong.
 

Patrick M. Hausen

The default is to use the first storage pool when it is created. Depending on the size of your boot media you should change that. That also reduces write load on the boot drives.

To find the space hog:
Code:
cd /var/db/system
du -sk * | sort -rn
 

Fab Sidoli

OK. That's good to know.


root@prod[/var/db/system]# du -sk * | sort -rn
143259 rrd-646f8dae97d646cc8946ddeb0ca79d97
120437 configs-646f8dae97d646cc8946ddeb0ca79d97
18114 ixdiagnose
14866 syslog-646f8dae97d646cc8946ddeb0ca79d97
404 samba4
1 webui
1 update
1 services
1 nfs-stablerestart.bak
1 nfs-stablerestart
1 cores
 

Patrick M. Hausen

So it's not in your system dataset but in your BE.
Code:
cd /
du -skx * | sort -rn

Then cd into the topmost, i.e. largest, directory and repeat the process until you find something big.
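
The repeat-until-found process described above can be sketched as a small loop. This is a minimal sketch rather than a polished tool: `find_space_hog` is a hypothetical helper name, it follows only the single largest entry at each level, and like the `du -skx *` command above it ignores hidden files and stays on one filesystem.

```shell
#!/bin/sh
# Follow the largest entry at each level, staying on one filesystem (-x),
# until the largest entry is a file or there is nothing left to descend into.
# Prints one line per level: size in KiB, then the path.
find_space_hog() {
    dir=${1%/}                      # strip a trailing slash; "/" becomes ""
    while [ -d "${dir:-/}" ]; do
        # Largest entry directly under the current directory.
        biggest=$(cd "${dir:-/}" && du -skx -- * 2>/dev/null | sort -rn | head -n 1)
        [ -n "$biggest" ] || break  # empty directory: stop
        size=$(printf '%s\n' "$biggest" | cut -f1)
        name=$(printf '%s\n' "$biggest" | cut -f2-)
        dir="$dir/$name"
        printf '%s KiB\t%s\n' "$size" "$dir"
    done
}

# Example, starting from the root of the boot environment:
#   find_space_hog /
```

The last line printed is the deepest point reached. If several large items sit side by side in one directory, only the biggest is followed, so it is worth looking at the full `du` output for that level as well.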
 

Fab Sidoli

Thank you! I've found the culprit directory. One of my colleagues has been doing something they shouldn't have!

I'll try the rollback in the hope it sorts out my Kerberised NFS issue
 

Fab Sidoli

Hi All,

Is anyone able to tell me what versions of NFS are supported in TrueNAS 12.0 and if any changes have taken place between U4 and U8.1?

I would like to go back and address the NFS4ERR_WRONGSEC and NFS4ERR_MINOR_VERS_MISMATCH errors I'm seeing above in the packet dumps, which indicate a server side issue. Red Hat have reached the end of what they'll support, and whilst I can run with sec=sys, which still works, I'd like to do Kerberised NFS.

How do I debug NFS connections on the TrueNAS box?
 

Fab Sidoli

I'm told that this is apparently a "non-critical" issue and that a fix, if one is required, won't be implemented as a result. The suggestion is to upgrade to 13. Apologies for the rant, but this is very poor service. I'm appalled by the apparent lack of willingness of the engineering team to look into this issue despite their update tanking my system. I'll happily hold my hands up if the issue is on my end, but given that the only change in my environment was the update to U8.1 I find it shocking that they won't even look into fixing it.

If anyone can offer any useful suggestions it would be much appreciated.
 

Fab Sidoli

Hi All,

I decided to leave and rejoin the domain to see if this would shed some light. I noticed the following errors in the middleware logs, which are interesting.

I've always used the rfc2307 schema, so I'm not sure if this is no longer a valid option. The drop-down still exists, so I assume it is. Apart from throwing the error in the UI, it does appear to have joined the domain, and tests with net ads and wbinfo suggest it's OK.

[2022/08/19 20:34:47] (DEBUG) ActiveDirectoryService.start():689 - Test join to DOMAIN failed. Performing domain join.
[2022/08/19 20:34:49] (WARNING) ActiveDirectoryService.start():703 - Failed to add NFS spn to active directory computer object.
middlewared.schema.Error: [schema_mode] Invalid choice: rfc2307

The failure to add the NFS spn to the active directory computer object is also interesting, not least because it does appear to be present in the AD. That said, it's not as far as the filestore itself is concerned - see below.

Code:
root@filestore[~]# net ads keytab list                                                       
Vno  Type                                        Principal
  1  aes256-cts-hmac-sha1-96                     restrictedkrbhost/filestore.domain@DOMAIN
  1  aes256-cts-hmac-sha1-96                     restrictedkrbhost/FILESTORE@DOMAIN
  1  aes128-cts-hmac-sha1-96                     restrictedkrbhost/filestore.domain@DOMAIN
  1  aes128-cts-hmac-sha1-96                     restrictedkrbhost/FILESTORE@DOMAIN
  1  arcfour-hmac-md5                            restrictedkrbhost/filestore.domain@DOMAIN
  1  arcfour-hmac-md5                            restrictedkrbhost/FILESTORE@DOMAIN
  1  aes256-cts-hmac-sha1-96                     host/filestore.domain@DOMAIN
  1  aes256-cts-hmac-sha1-96                     host/FILESTORE@DOMAIN
  1  aes128-cts-hmac-sha1-96                     host/filestore.domain@DOMAIN
  1  aes128-cts-hmac-sha1-96                     host/FILESTORE@DOMAIN
  1  arcfour-hmac-md5                            host/filestore.domain@DOMAIN
  1  arcfour-hmac-md5                            host/FILESTORE@DOMAIN
  1  aes256-cts-hmac-sha1-96                     FILESTORE@DOMAIN
  1  aes128-cts-hmac-sha1-96                     FILESTORE@DOMAIN
  1  arcfour-hmac-md5                            FILESTORE@DOMAIN


I manually added the nfs principals to the keytab with

Code:
net ads keytab add nfs
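
To confirm whether a keytab listing contains principals for a given service class, before or after a command like the one above, the listing can be checked mechanically. A minimal sketch, assuming the principal is the last column as in the `net ads keytab list` output above; `missing_services` is a hypothetical helper name.

```shell
#!/bin/sh
# Read a keytab listing on stdin and print each service class given as an
# argument that has no "<service>/..." principal in it.
# Assumes the principal is the last whitespace-separated column.
missing_services() {
    listing=$(cat)
    for svc in "$@"; do
        printf '%s\n' "$listing" |
            awk -v p="$svc/" 'index($NF, p) == 1 { found = 1 } END { exit !found }' ||
            echo "$svc"
    done
}

# Real use on the server would be:
#   net ads keytab list | missing_services nfs host
missing_services nfs host <<'EOF'
  1  aes256-cts-hmac-sha1-96  restrictedkrbhost/FILESTORE@DOMAIN
  1  aes256-cts-hmac-sha1-96  host/FILESTORE@DOMAIN
  1  aes256-cts-hmac-sha1-96  FILESTORE@DOMAIN
EOF
```

With the sample input, which mirrors the listing above, only `nfs` is reported as missing. The match is anchored at the start of the principal, so `restrictedkrbhost/...` does not count as a `host/` entry.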


If anyone has any thoughts please do get in touch.
 

Fab Sidoli

Hi @admihai3, I assume your response was in error, or did you mean to post something?
 