Link aggregation config corruption after upgrade from Cobia 13.10.01 to 13.10.1

gwjunk

Dabbler
Joined
Aug 30, 2023
Messages
18
Homelab pains... I applied the Cobia update from 13.10.01 to 13.10.1 and my bond01 link aggregation disappeared from the UI. Everything that was pointing to it (bridges, VLANs, etc.) flipped to pointing to the first NIC in the LACP group. When I try to add a link aggregation again I get the UNIQUE constraint error at the tail of this message. bond01 is not listed in the web UI or in the console network editor. It seems there is some corruption in the config for the network devices. I can enable the second NIC with a new config on a different subnet and it works; I just cannot get a configuration change to save because of the errors. I tried a fresh install of 13.10.1 on a new disk and restored the config backup from before the upgrade, with the same results. Not sure where to go from here.

TrueNAS Cobia 13.10.1
Motherboard: ASUS x299 Sage 10G
CPU: Intel i9-10980xe
RAM: 256GB (8x32) Corsair Vengeance
NIC: Intel x550-T2 (built into motherboard)
GPU: NVIDIA 730
Passthrough GPU: MSI NVIDIA 4060 Ti
HBA: 9500-16i 12Gb/s HBA TriMode SAS/NVMe
Disks: 8x8TB Ultrastar, 8x2TB Samsung 870 Pro, 4x2TB Samsung 980 Pro in Highpoint NVMe HBA, 4x2TB Samsung 980 Pro in U.2 enclosure connected to MB U.2

Error details --------------------------

(sqlite3.IntegrityError) UNIQUE constraint failed: network_lagginterfacemembers.lagg_physnic [SQL: INSERT INTO network_lagginterfacemembers (lagg_ordernum, lagg_physnic, lagg_interfacegroup_id) VALUES (?, ?, ?)] [parameters: (0, 'enp194s0f1', 8)] (Background on this error at: https://sqlalche.me/e/14/gkpj)

- More info...
Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: network_lagginterfacemembers.lagg_physnic

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 201, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1342, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/service/crud_service.py", line 169, in create
    return await self.middleware._call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1342, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/service/crud_service.py", line 194, in nf
    rv = await func(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 44, in nf
    res = await f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 177, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/network.py", line 802, in do_create
    lagports_ids += await self.__set_lag_ports(lag_id, data['lag_ports'])
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/network.py", line 1149, in __set_lag_ports
    await self.middleware.call(
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1399, in call
    return await self._call(
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1342, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 177, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/datastore/write.py", line 62, in insert
    result = await self.middleware.call(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1399, in call
    return await self._call(
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1353, in _call
    return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1251, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/datastore/connection.py", line 106, in execute_write
    result = self.connection.execute(sql, binds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1365, in execute
    return self._exec_driver_sql(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1669, in _exec_driver_sql
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
    util.raise_(
  File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: network_lagginterfacemembers.lagg_physnic [SQL: INSERT INTO network_lagginterfacemembers (lagg_ordernum, lagg_physnic, lagg_interfacegroup_id) VALUES (?, ?, ?)] [parameters: (0, 'enp194s0f1', 8)] (Background on this error at: https://sqlalche.me/e/14/gkpj)
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I'm running a LAGG under 23.10.1 without a problem, but I'm seeing a number of threads mentioning this or something similar. Please report a bug (the link is at the top of the page, or you can do it through the TrueNAS GUI) and attach a debug file so the devs can see what's going on.
 

gwjunk

I am a home lab user and cannot report an official bug for support. I did encounter the NIC rename issue months back when moving from Bluefin to Cobia, where I had to repoint everything at the new NIC names. I assume whatever happened here came from fixes in the 13.10.1 update that tried to address that initial issue. To me it looks like each NIC can only be used once, and the entry tying it to bond01 is still in the database, but inactive and not visible for me to change from the UIs.
 

danb35

I am a home lab user and cannot report an official bug for support.
What makes you think that a home lab user can't report an "official bug"? The link is on every page of this forum, and it's also inside the product you're using--it's there for a reason.
 

gwjunk

I found the data issue. I downloaded current configuration .tar. Opening the database in sqlite3 I see that there are no laggs defined in network_lagginterface, however there are 2 NICs listed in network_lagginterfacemembers. These are dangling and since these NICs already belong to a LAGG, they are not available to be added to a new LAGG.


sqlite> select * from network_lagginterface
...> /
2|4|lacp|SLOW|LAYER2+3
sqlite> select * from network_lagginterfacemembers
...> /
37|0|enp194s0f0|2
38|1|enp194s0f1|2
sqlite>
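
Why the leftover member rows block a new LAGG can be reproduced outside TrueNAS. The following is a minimal sketch, not the actual TrueNAS schema: the column names come from the INSERT statement in the error message, but the column types, the UNIQUE placement on lagg_physnic, and the helper names build_demo_db and try_add_member are assumptions made for illustration. It loads the two rows shown above into an in-memory database and then repeats the failing insert.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the member table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE network_lagginterfacemembers (
            id INTEGER PRIMARY KEY,
            lagg_ordernum INTEGER NOT NULL,
            lagg_physnic TEXT NOT NULL UNIQUE,  -- assumption: one LAGG per NIC
            lagg_interfacegroup_id INTEGER NOT NULL
        )
        """
    )
    # The two leftover rows from the sqlite3 session in the post.
    conn.executemany(
        "INSERT INTO network_lagginterfacemembers VALUES (?, ?, ?, ?)",
        [(37, 0, "enp194s0f0", 2), (38, 1, "enp194s0f1", 2)],
    )
    conn.commit()
    return conn


def try_add_member(
    conn: sqlite3.Connection, ordernum: int, physnic: str, group_id: int
) -> Optional[str]:
    """Try to attach a NIC to a LAGG; return the error text if it is refused."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO network_lagginterfacemembers "
                "(lagg_ordernum, lagg_physnic, lagg_interfacegroup_id) "
                "VALUES (?, ?, ?)",
                (ordernum, physnic, group_id),
            )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    db = build_demo_db()
    # Same parameters as in the reported error: (0, 'enp194s0f1', 8).
    print(try_add_member(db, 0, "enp194s0f1", 8))
```

Under these assumptions the insert is refused for as long as the old row for enp194s0f1 exists, regardless of whether the LAGG it belonged to is still visible anywhere in the UI.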
 

danb35

I do not pay for support.
Why do you think that's relevant? I'm asking because I'm honestly baffled--I don't think, in the 12 years I've been here, I've run into anyone who thinks they have to pay in order to file a bug, and I have no idea where such a belief would come from. You don't. File the bug.
 

gwjunk

Thanks. I have prior history with other vendors where you can use and post in the forums, but issue reporting and support are not available without a subscription... I thought the bug filing went through official support. I filed it. If no solution is provided, I am considering deleting the lagg member rows from the db in the config backup and then trying to upload the config.
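
The row deletion being considered here could be sketched roughly as follows. This is an illustration only: the function name remove_lagg_members and the ".edited" suffix are hypothetical, the database path is whatever file was extracted from the config backup, and nothing here confirms that TrueNAS will accept an edited database on upload. The sketch works on a copy so the extracted original stays untouched.

```python
import shutil
import sqlite3
from pathlib import Path
from typing import Iterable, Tuple


def remove_lagg_members(
    db_path: str, nics: Iterable[str], suffix: str = ".edited"
) -> Tuple[Path, int]:
    """Copy the database, then delete member rows for the given NICs from the copy.

    Returns the path of the edited copy and the number of rows deleted.
    """
    nic_list = list(nics)
    src = Path(db_path)
    dst = src.with_name(src.name + suffix)
    shutil.copy2(src, dst)  # never touch the extracted original

    if not nic_list:
        return dst, 0

    placeholders = ", ".join("?" for _ in nic_list)
    conn = sqlite3.connect(str(dst))
    try:
        with conn:  # commit on success, roll back on error
            cur = conn.execute(
                "DELETE FROM network_lagginterfacemembers "
                f"WHERE lagg_physnic IN ({placeholders})",
                nic_list,
            )
            deleted = cur.rowcount
    finally:
        conn.close()
    return dst, deleted
```

Called with the two NIC names from the earlier query (enp194s0f0 and enp194s0f1), it would report how many rows were removed from the copy; comparing that count with what was expected is a cheap sanity check before doing anything with the edited file.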
 

danb35

It's reasonable to expect that deleting those entries would fix the problem, but the question is why the problem happens in the first place--and you aren't the first to report it here, though it doesn't seem to happen to everyone (there's at least one person using a LAGG interface on 23.10.1--me--who isn't experiencing this). The debug file that you'll be asked to upload if you haven't already should have the information the devs need to figure out what's going on.
 

gwjunk

Great. Uploaded that a while ago. Good to have such a support community. Have been working with Oracle products for 25 years so my expectations have been lowered... :)
 

gwjunk

Commented out line 207 of /usr/lib/python3/dist-packages/middlewared/plugins/interface/link_address.py. It now reads "# interface_renamer.rename(db_interface["interface"], real_interface_by_link_address["name"])". Analyzed the data in the network_ tables and cleaned it up (renamed the VLAN back to its VLAN name where it had been changed to the interface name, removed the ghost lagg and lagg members, etc.). It reboots without issue and everything is working OK now. Will watch for a bugfix in the next release and be prepared to replicate the fix if it is not bundled in the next update.
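
For anyone repeating this kind of review, a first pass over the network_ tables might look like the sketch below. It relies only on SQLite's own sqlite_master catalog; the function name network_table_counts is hypothetical, and which tables actually exist in a given TrueNAS config database is not assumed.

```python
import sqlite3
from typing import Dict


def network_table_counts(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return {table_name: row_count} for every table whose name starts with 'network_'."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name LIKE 'network\\_%' ESCAPE '\\' "
            "ORDER BY name"
        )
    ]
    counts = {}
    for name in names:
        # Quote the identifier; table names cannot be bound as parameters.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts
```

A table with rows in it whose parent table is empty, or that references an id no longer present elsewhere, is a candidate for the kind of ghost entry described above.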
 