Broken APT Repos

mobrien118

Dabbler
Joined
Jun 22, 2020
Messages
25
First, let me say that I'm absolutely loving this approach that TrueNAS SCALE is taking. I've been playing with it for several months now and appreciate being able to play with the Linux underpinnings.

I made a "mistake" the other day (just wanted to see what would happen) and ran "apt full-upgrade -y". This, of course, upgrades every package it can and settles dependency conflicts by "forging ahead", removing installed packages where necessary. Ultimately, it removed these essential packages: cifs-utils fuse libnvpair3 libuutil3 libzfs4 libzpool4 middlewared migrate93 nfs-ganesha nfs-ganesha-gluster nfs-ganesha-vfs openzfs python3-libzfs python3-midcli truenas truenas-samba zectl zfs

Obviously, those are needed for TrueNAS to function. Some were replaced by alternative packages, while others were simply removed. I was foolish not to take a snapshot first, but NBD: this system is expendable, or I wouldn't have done this in the first place.
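For anyone who wants to see what a full upgrade would remove before committing to it, here is a minimal sketch in Python. It assumes the usual `apt-get -s` (simulate) output, in which each planned removal appears on a line starting with `Remv`. The `ESSENTIAL` set is only a subset of the packages listed above, and the function names are made up for illustration.

```python
import subprocess

# Subset of the packages this post reports as essential to TrueNAS SCALE.
ESSENTIAL = {"middlewared", "truenas", "truenas-samba", "openzfs", "zfs",
             "libzfs4", "python3-libzfs"}


def packages_to_remove(simulation_output):
    """Return package names from the 'Remv <name> ...' lines of `apt-get -s` output."""
    removed = []
    for line in simulation_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "Remv":
            removed.append(parts[1])
    return removed


def essential_removals(simulation_output):
    """Return the essential packages that the simulated run would remove, sorted."""
    return sorted(set(packages_to_remove(simulation_output)) & ESSENTIAL)


def simulate_full_upgrade():
    """Run a dry run of full-upgrade; -s (simulate) does not change the system."""
    result = subprocess.run(["apt-get", "-s", "full-upgrade"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `essential_removals(simulate_full_upgrade())` on the box itself should give an empty list before it is safe to continue; anything else is a reason to stop.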

The simplest "test fix" I've tried is to simply install the "truenas" package (I was waiting to post this to see if 21.04 would fix it on its own), but I get some dependency errors that are tricky to resolve:
Code:
The following packages have unmet dependencies:
 python3-libzfs : Depends: libnvpair1linux (>= 0.8.2) but it is not installable
                  Depends: libzfs2linux (>= 0.8.2) but it is not installable
                  Depends: python3 (< 3.9) but 3.9.2-2 is to be installed
 zectl : Depends: libnvpair1linux (>= 0.8.2) but it is not installable
         Depends: libzfs2linux (>= 0.8.2) but it is not installable
E: Unable to correct problems, you have held broken packages.


I notice that python3 has progressed to version 3.9.2, which is marked as incompatible, and no backported version is available. That's pretty straightforward. I might try to force it through to see just what happens with that version of Python and what breaks (maybe someone here already knows).

The other two (libnvpair1linux and libzfs2linux) are a little trickier, since they seem to have been removed from Debian completely and, presumably, replaced by other packages or features. If they are specific libraries or binaries, whatever features they provide will obviously be broken unless "truenas" is updated to use whatever replaces them long term.
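To separate the two kinds of failure described above (a dependency that no longer exists versus a version conflict), the apt error text can be sorted mechanically. This is a rough sketch that assumes the message layout shown in the error output above; the regular expression and the function names are illustrative, not part of apt.

```python
import re

# Matches lines such as:
#   " zectl : Depends: libzfs2linux (>= 0.8.2) but it is not installable"
# and the indented continuation lines that omit the package name.
DEP_LINE = re.compile(
    r"^\s*(?:(?P<pkg>\S+) : )?Depends: (?P<dep>\S+)"
    r"(?: \((?P<constraint>[^)]*)\))? but (?P<reason>.+)$"
)

SAMPLE = """The following packages have unmet dependencies:
 python3-libzfs : Depends: libnvpair1linux (>= 0.8.2) but it is not installable
                  Depends: libzfs2linux (>= 0.8.2) but it is not installable
                  Depends: python3 (< 3.9) but 3.9.2-2 is to be installed
 zectl : Depends: libnvpair1linux (>= 0.8.2) but it is not installable
         Depends: libzfs2linux (>= 0.8.2) but it is not installable
E: Unable to correct problems, you have held broken packages.
"""


def parse_unmet(text):
    """Return (package, dependency, constraint, reason) tuples from apt's error text."""
    problems = []
    current = None
    for line in text.splitlines():
        match = DEP_LINE.match(line)
        if not match:
            continue
        if match.group("pkg"):
            current = match.group("pkg")  # continuation lines reuse this name
        problems.append((current, match.group("dep"),
                         match.group("constraint"), match.group("reason")))
    return problems


def not_installable(problems):
    """Return the sorted names of dependencies that apt cannot find at all."""
    return sorted({dep for _, dep, _, reason in problems
                   if reason == "it is not installable"})
```

With the output above, `not_installable(parse_unmet(SAMPLE))` picks out libnvpair1linux and libzfs2linux, leaving the python3 version conflict as the one remaining entry.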

I guess this is the downside of using Debian's "testing" channel as your base, and I assume you're just mirroring their repos, not snapshotting them.

Is there anything I can do to help with this? My system is running, so I can do anything but reboot. I tried this same test on a test VM and after doing this and rebooting, ssh isn't on and the main screen keeps cycling:
Login incorrect

Login incorrect

Login incorrect
...
(He's dead, Jim)

Thanks!!
 

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,471
Ha, glad you are enjoying playing with (and breaking) SCALE :)

Yeah, you should avoid using apt unless you are very careful. At minimum, create a new boot environment first so you can roll back if something like this happens due to apt misadventures.

In our case we are mirroring the Debian repos, so that for developer types it is possible to do 'apt update' and then selectively 'apt install' packages for testing, debug or dev work. By default those point to our mirrors so that we don't have upstream Debian pushing updates to the TrueNAS base images which break us at random times. We tend to sync from upstream shortly after a major release.

But a word of caution again to other power users out there: use "apt" at your own risk, and please, please, please be sure to have a good BE snapshot first.
 

mobrien118

That makes sense. I assume that the long term goal will be to have apt stable (package references and dependencies), even if it is not recommended to use it for updating the system.

I guess I'll be reloading the OS on this.

Do you know if there is a way to back up the TrueNAS config from the command line when TrueCommand and the TrueNAS UI are not available? Or is there maybe a backup config stored somewhere automatically that can be retrieved? If not, it's not too hard for me to rebuild this config.
 

Kris Moore

Check inside /var/db/system/configs-*/. There may be some DB backups in there that you can re-upload to the UI after the reinstall.
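As a small helper for that search, the sketch below lists whatever files sit under the configs-* directories, newest first. It is written in Python and makes no assumption about file names or extensions; the function name and the `system_root` parameter are illustrative only.

```python
import glob
import os


def find_config_backups(system_root="/var/db/system"):
    """Return regular files under <system_root>/configs-*/, newest first."""
    # "**" with recursive=True also matches files directly inside configs-*/.
    pattern = os.path.join(system_root, "configs-*", "**", "*")
    files = [p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p)]
    return sorted(files, key=os.path.getmtime, reverse=True)
```

Copy the newest candidates off the machine before reinstalling, since the boot device gets overwritten.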
 

mobrien118

Kris Moore said: "Check inside of /var/db/system/configs-*/ there may be some DB backups in there which you can re-upload to the UI after re-install."

Great! I found some recent configs and pulled them over.

I'm still playing around with this and I've noticed that I can boot into the "Initial Install" pool, which allows me to log in, but, of course, there is no config. I also can't process the 21.04 "upgrade" because it says "Error: [ENOSPC] Insufficient disk space available on boot-pool (1.5 GB). Need 5.32 GB". I'm thinking this is because it is trying to upgrade the "Initial Install", but I'm wondering if the whole "boot-pool" is *actually* too small.

If I'm correct that it is just trying to upgrade the "Initial-install", then I should be able to clone that pool over the production pool, boot up, then import my backup...? Can someone confirm if I'm correct in this thinking, or show me how to confirm myself?

I'm sure it is 1/2 dozen vs 6 between doing this and just installing over the existing install, but I like to take the "scenic route" sometimes because I usually learn something and enjoy the view. If I'm right about the boot-pool, can anyone point me in the right direction to restore the "initial-install" over my production pool so I can give this a try?

Thanks!!

--mobrien118
 

Kris Moore

So the "Initial Install" boot environment is just that: the state of the system device at the time of the fresh install. If you roll back to that and import your config file, you should be able to import the data pool and continue on. However, the warnings about space on boot-pool are a bit concerning. How big is that device? You'd usually want something 20GB+ so that you can comfortably keep 2-3 boot environments on there.
 

mobrien118

I agree that it doesn't seem right. Here is the output of "df -h":
Code:
Filesystem                                                  Size  Used Avail Use% Mounted on
udev                                                        3.9G     0  3.9G   0% /dev
tmpfs                                                       797M  1.1M  796M   1% /run
boot-pool/ROOT/Initial-Install                              3.6G  2.4G  1.2G  67% /
tmpfs                                                       3.9G   88K  3.9G   1% /dev/shm
tmpfs                                                       100M     0  100M   0% /run/lock
tmpfs                                                       4.0M     0  4.0M   0% /sys/fs/cgroup
tmpfs                                                       3.9G  2.8M  3.9G   1% /tmp
boot-pool/grub                                              1.2G  7.3M  1.2G   1% /boot/grub
boot-pool/.system                                           2.4G  1.3G  1.2G  51% /var/db/system
boot-pool/.system/cores                                     1.0G  128K  1.0G   1% /var/db/system/cores
boot-pool/.system/samba4                                    1.2G  128K  1.2G   1% /var/db/system/samba4
boot-pool/.system/syslog-9d3f24a1580a424f9dc66a6bc1dbb54d   1.2G  2.3M  1.2G   1% /var/db/system/syslog-9d3f24a1580a424f9dc66a6bc1dbb54d
boot-pool/.system/rrd-9d3f24a1580a424f9dc66a6bc1dbb54d      1.2G  6.7M  1.2G   1% /var/db/system/rrd-9d3f24a1580a424f9dc66a6bc1dbb54d
boot-pool/.system/configs-9d3f24a1580a424f9dc66a6bc1dbb54d  1.2G  384K  1.2G   1% /var/db/system/configs-9d3f24a1580a424f9dc66a6bc1dbb54d
boot-pool/.system/webui                                     1.2G  128K  1.2G   1% /var/db/system/webui
boot-pool/.system/services                                  1.2G  128K  1.2G   1% /var/db/system/services
boot-pool/.system/glusterd                                  1.2G  128K  1.2G   1% /var/db/system/glusterd
boot-pool/.system/ctdb_shared_vol                           1.2G  128K  1.2G   1% /var/db/system/ctdb_shared_vol


Looking deeper, I see that for this (my test machine that I'm trying to "solve" this on before performing it on my "real" machine) I only allocated an 8GB system disk. Seems like this might be the problem. My "real" machine has 20GB allocated for this, so maybe I should update my test and try again.
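The arithmetic behind that conclusion can be sketched in a few lines of Python. This assumes the "GB" figures in the upgrade error use the same units as the `G` suffix in `df -h` (which counts in powers of 1024); that equivalence is a guess, and the function names are illustrative.

```python
# `df -h` suffixes are powers of 1024.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a `df -h` style size such as '1.2G' or '128K' to bytes."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)  # plain number, e.g. the '0' in the Used column


def shortfall_gib(available, required):
    """Return how many GiB are missing, or 0.0 if `available` already suffices."""
    missing = parse_size(required) - parse_size(available)
    return max(0.0, missing / UNITS["G"])
```

With the figures from the error message, `shortfall_gib("1.5G", "5.32G")` comes to roughly 3.82, so the upgrade is short by nearly 4 GB on the 8GB test disk.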
 

mobrien118

So, I was able to do a fresh OS install, import the data pool, and then restore the db file (yay!) with just one recognizable issue ( :-/ ):
Code:
Web UI HTTPS certificate setup failed.
2021-05-05 08:32:34 AM (America/Chicago)


Looks like it broke SSL on the web. Port 80 still works, but I can't get in via 443.
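A quick way to confirm that symptom from another machine is a plain TCP probe of both ports. The Python sketch below only checks whether something accepts connections; it does not attempt a TLS handshake, so it cannot tell a bad certificate from a healthy one. The function name is illustrative.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `port_open(host, 80)` and `port_open(host, 443)` with the NAS address as `host` should return True and False respectively if the web server never started listening for HTTPS.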

Without digging into how SSL works with whatever web server this uses, has anyone dealt with this before?

Also, should this be considered a bug? *Maybe* SSL certs should be included in backup files (assuming that is the cause and not something else, and also assuming this doesn't start a war of opinions on security :smile:).

If this is not a known issue and it needs to be solved, I will work on a solution, but I don't want to double work if it is already known and has a known solution.
 

mobrien118

I used the "Credentials" --> "Certificates" menu to create a new CA (there was none), then created a certificate from that CA. I guess the "freenas_default" cert that was there did not pass the certificate chain, or possibly was missing a private key. I don't know; I didn't look deep into it. I just solved the problem by generating the CA and a cert, then going into "System Settings" --> "General", clicking "Settings" up by the "GUI" section, and selecting the new cert. It prompted me to restart the web interface, and when it came back up https was working again! Easy peasy.

Hope this helps someone.
 