Linux Jails - Experimental Script

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
I run Syncthing with Docker inside jailmaker and it works great with a few TB worth of files (large, but also many small files). My jail uses bridge networking. Before that I ran Syncthing directly on the host; it's only a single binary after all, and I started it with a systemd-run command using the same options as the official systemd service file for Syncthing.
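If it helps, a minimal sketch of what such a systemd-run command could look like, assuming the binary is at /usr/bin/syncthing and a dedicated syncthing user exists (both assumptions, not my exact setup); the properties mirror the upstream Syncthing service unit:

Code:
# run Syncthing as a transient systemd unit with the upstream service options
systemd-run --unit=syncthing \
  --uid=syncthing \
  --property=Restart=on-failure \
  --property=SuccessExitStatus="3 4" \
  --property=RestartForceExitStatus="3 4" \
  /usr/bin/syncthing serve --no-browser --no-restart --logflags=0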

My compose file looks somewhat like this:

Code:
version: "3"
services:
  syncthing:
    image: syncthing/syncthing:1.27.2
    container_name: syncthing
    hostname: syncthing
    environment:
      - PUID=4000
      - PGID=4000
      - STGUIADDRESS=172.17.0.1:8384
    volumes:
      - /mnt/syncthing/data:/var/syncthing
    healthcheck:
      disable: true
    network_mode: host
    restart: unless-stopped
    labels:
      - traefik.enable=true
      - traefik.http.routers.syncthing.tls=true
      - traefik.http.routers.syncthing.rule=Host(`syncthing.example.com`)
      - traefik.http.routers.syncthing.entrypoints=websecure
      - traefik.http.services.syncthing.loadbalancer.server.port=8384
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Weird. My instance of syncthing really does not like running inside jailmaker on my server.

After a short period of indexing, the Syncthing database gets corrupted, and this happens even after clearing it and starting with a fresh database. I'm also seeing a lot of dmesg errors on the TrueNAS host complaining about the process ID linked to Syncthing, plus some errors mentioning macvlan. The macvlan issues could be something funky going on with my NIC (Intel X540-T2 10G), but that shouldn't affect the Syncthing database since it lives locally on an NVMe SSD.

Other services that rely on large databases (Plex and Photoview) haven't been a problem.
Probably need sync=always for databases.

Is this being tested on 23.10 or a 24.04 nightly? Please specify.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
Thanks morganL. I'm running TrueNAS-SCALE-23.10.1.1

I can try again with sync=always on the jailmaker dataset.
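For reference, a minimal sketch of what I mean (the dataset name is a placeholder, not my actual path):

Code:
# set synchronous writes on the jailmaker dataset (placeholder name)
zfs set sync=always tank/jailmaker
# confirm the property
zfs get sync tank/jailmaker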

As a side note, does that mean you would recommend using sync=always on the dataset for the kubernetes instance of syncthing (the official iX app) I have running now on the SSD?

Since the recurring dmesg errors kept mentioning macvlan in the call trace, I'll try using host networking if the above doesn't solve it.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
I think the errors I was getting were due to passing through bridge interfaces from the host with macvlan.

For background, I have a single 10GbE connection to TrueNAS from my switch via an Intel X540-T2 NIC, set up with multiple VLANs, each with a corresponding bridge interface. I used the arguments below to pass these bridge interfaces into the jail so that multiple interfaces would be accessible inside it (with the management VLAN excluded), mostly so data-heavy containers like Syncthing wouldn't need to cross the firewall and would have direct access to clients on the three subnets.

--network-macvlan=br10 --network-macvlan=br20 --network-macvlan=br50

After some testing, both passing through a bridge interface and using host networking seem much more stable for me.

Is there any way to use host networking but effectively 'block' the nspawn container from accessing one specific bridge/VLAN (br100 in this case)? I just don't want the jail to be on the management VLAN. It's a bit of a shame that you can only pass through one bridge interface without macvlan, but if I have to, I could compromise and limit the jail to one bridge interface, then use firewall rules to allow traffic to pass for Syncthing peers.
 
Last edited:

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
Thanks, I did have several read-throughs of the manual and did come across that GitHub post, but ultimately decided it would potentially introduce more problems to troubleshoot in future.

After thinking about it more, 95% of the large-file syncing I do is on one particular VLAN, so I'm just going to pass through that single bridge. The rest can be routed via firewall rules.
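For reference, a rough sketch of the user argument I'd use for that (br20 is just a placeholder for whichever bridge that VLAN maps to):

Code:
--network-bridge=br20 --resolv-conf=bind-host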

Thanks for the help! I hope nspawn gets some official support in future.
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
I have a few jails. I have noticed that only one of the jails starts automatically when the server is restarted. I am pretty sure that all of the jails were configured for automatic startup during setup.
When I look at the configuration, I can't see any difference. What could be the reason for this?
Is it the same for you?

This is the Logitech Media Server, which runs alone in a jail (without Docker) and starts automatically:
Code:
startup=1
docker_compatible=0
gpu_passthrough_intel=0
gpu_passthrough_nvidia=0
systemd_nspawn_user_args=--network-bridge=br0 --resolv-conf=bind-host --private-users=262144:65536 --private-users-ownership=chown --bind-ro=/mnt/default/media/center/Musik
# You generally will not need to change the options below
systemd_run_default_args=--property=KillMode=mixed --property=Type=notify --property=RestartForceExitStatus=133 --property=SuccessExitStatus=133 --property=Delegate=yes --property=TasksMax=infinity --collect --setenv=SYSTEMD_NSPAW>
systemd_nspawn_default_args=--keep-unit --quiet --boot


This is a jail with Docker that does not start automatically:
Code:
startup=1
docker_compatible=1
gpu_passthrough_intel=1
gpu_passthrough_nvidia=0
systemd_nspawn_user_args=--network-bridge=br0 --resolv-conf=bind-host --private-users=65536:65536 --private-users-ownership=chown --bind-ro=/mnt/default/media/center
# You generally will not need to change the options below
systemd_run_default_args=--property=KillMode=mixed --property=Type=notify --property=RestartForceExitStatus=133 --property=SuccessExitStatus=133 --property=Delegate=yes --property=TasksMax=infinity --collect --setenv=SYSTEMD_NSPAW>
systemd_nspawn_default_args=--keep-unit --quiet --boot
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
Config looks good, so if it doesn't work it could be a bug. Please open an issue about it on GitHub.

Just to check: are you running
/mnt/mypool/jailmaker/jlmkr.py startup
as a Post Init script?
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
Config looks good, so if it doesn't work it could be a bug. Please open an issue about it on GitHub.
I don't currently have an account, but I can change that.

Just to check: are you running
/mnt/mypool/jailmaker/jlmkr.py startup
as a Post Init script?
Yes.

Edit Init/Shutdown Script:

sudo /mnt/software/jailmaker/jlmkr.py startup
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
This is a jail with Docker that does not start automatically:
Code:
startup=1
docker_compatible=1
gpu_passthrough_intel=1
gpu_passthrough_nvidia=0
systemd_nspawn_user_args=--network-bridge=br0 --resolv-conf=bind-host --private-users=65536:65536 --private-users-ownership=chown --bind-ro=/mnt/default/media/center
# You generally will not need to change the options below
systemd_run_default_args=--property=KillMode=mixed --property=Type=notify --property=RestartForceExitStatus=133 --property=SuccessExitStatus=133 --property=Delegate=yes --property=TasksMax=infinity --collect --setenv=SYSTEMD_NSPAW>
systemd_nspawn_default_args=--keep-unit --quiet --boot
I'm experiencing something similar, but in a different way, with a fresh jail using host networking and the same private-users and private-users-ownership arguments. It's odd because I was using a different jail last week with the same settings and host networking, and that had no problems.

Creating a new jail with host networking without any nspawn arguments (so root in jail is root on host) works. Also, creating a new jail with bridge networking with --private-users=65536:65536 --private-users-ownership=chown works.

So basically host networking on an effectively unprivileged jail isn't working with docker for me. Not a big deal for me personally since bridge networking is fine for what I need, but just thought I'd post the error log below if it helps.

OS Version: TrueNAS-SCALE-23.10.1.3

Log output is below.

Code:
root@docker:~# journalctl -xeu docker.service
Jan 31 10:16:05 docker systemd[1]: Starting docker.service - Docker Application Container Engine...
░░ Subject: A start job for unit docker.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit docker.service has begun execution.
░░
░░ The job identifier is 442.
Jan 31 10:15:56 docker dockerd[96]: time="2024-01-31T10:15:56.272285018+08:00" level=info msg="Starting up"
Jan 31 10:15:58 docker dockerd[96]: time="2024-01-31T10:15:58.174975516+08:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Jan 31 10:15:58 docker dockerd[96]: time="2024-01-31T10:15:58.175578202+08:00" level=info msg="Loading containers: start."
Jan 31 10:15:58 docker dockerd[96]: time="2024-01-31T10:15:58.185455644+08:00" level=info msg="unable to detect if iptables supports xlock: 'iptables --wait -L -n': `iptables v1.8.9 (nf_tables): Could not fetch rule set generation id: Permission denied (you must be root)`" error="exit status 4"
Jan 31 10:15:58 docker dockerd[96]: time="2024-01-31T10:15:58.259087050+08:00" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
Jan 31 10:15:58 docker dockerd[96]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to register "bridge" driver: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.9 (nf_tables): Could not fetch rule set generation id: Permission denied (you must be root)
Jan 31 10:15:58 docker dockerd[96]:  (exit status 4)
Jan 31 10:15:58 docker dockerd[96]: time="2024-01-31T10:15:58.261690348+08:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
Jan 31 10:15:58 docker systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ An ExecStart= process belonging to unit docker.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Jan 31 10:16:08 docker systemd[1]: docker.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit docker.service has entered the 'failed' state with result 'exit-code'.
Jan 31 10:16:08 docker systemd[1]: Failed to start docker.service - Docker Application Container Engine.
░░ Subject: A start job for unit docker.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit docker.service has finished with a failure.
░░
░░ The job identifier is 442 and the job result is failed.
Jan 31 10:16:10 docker systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
 
Last edited:

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
--private-network


Disconnect networking of the container from the host. This makes all network interfaces unavailable in the container, with the exception of the loopback device and those specified with --network-interface= and configured with --network-veth. If this option is specified, the CAP_NET_ADMIN capability will be added to the set of capabilities the container retains.
The manual mentions that CAP_NET_ADMIN is added when using --private-network. Thus when using host networking, this capability isn't added. Perhaps that's what's missing to start your unprivileged jail @skittlebrau?
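If that turns out to be the cause, explicitly granting the capability via the nspawn user args might be worth a try. A rough sketch (only the --capability= part is the point here; the other arguments are just the ones used earlier in the thread):

Code:
systemd_nspawn_user_args=--capability=CAP_NET_ADMIN --resolv-conf=bind-host --private-users=65536:65536 --private-users-ownership=chown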
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
A jail not being able to start (at all) is a different issue from a jail not starting automatically after running /mnt/software/jailmaker/jlmkr.py startup as root from a Post Init script.

@cap please make a GitHub issue and post the output of the /mnt/data/jailmaker/jlmkr.py startup command. You can capture the output of this command at boot by changing your Post Init script to this:

Code:
#!/usr/bin/env bash

# Description:              Jails Start
# When:                     Post Init
# Enabled:                  True
# Timeout:                  10

/mnt/data/jailmaker/jlmkr.py startup | mail -s "Jail" "youremail@example.com"
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
The manual mentions that CAP_NET_ADMIN is added when using --private-network. Thus when using host networking, this capability isn't added. Perhaps that's what's missing to start your unprivileged jail @skittlebrau?
Thanks, I just realised that! When I was trying simple ping commands in the jail while using host networking, it would give me the error below about the operation not being permitted, which was a clear clue that certain network capabilities were missing.

Code:
ping: socktype: SOCK_RAW
ping: socket: Operation not permitted


Thanks again. Sorry I know a lot of what I've asked essentially comes down to 'RTFM' - but sometimes it helps to contextualise things. Even though I've been using Linux in various forms for 20 years (started on early Red Hat), there's just so much I struggle with. The Arch wiki expands on a few topics with some practical examples and that’s almost always helpful as well - https://wiki.archlinux.org/title/systemd-nspawn#Networking
 
Last edited:

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
The manual mentions that CAP_NET_ADMIN is added when using --private-network. Thus when using host networking, this capability isn't added. Perhaps that's what's missing to start your unprivileged jail @skittlebrau?
One thing I noticed is that --capability=all is passed by default in all jailmaker instances, regardless of networking type or presence of --private-network in user arguments, so CAP_NET_ADMIN should already be granted as part of that.
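For what it's worth, a quick way to check what the jail actually ends up with is to look at the capability mask of PID 1 inside the jail (the decode step assumes capsh is installed, e.g. from libcap2-bin):

Code:
# inside the jail: raw effective capability mask of PID 1
grep CapEff /proc/1/status
# optional: decode the mask into capability names
capsh --decode=$(grep CapEff /proc/1/status | awk '{print $2}')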
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
So basically host networking on an effectively unprivileged jail isn't working with docker for me.
Ah, sorry, I must have overlooked that you're running Docker in it. When docker_compatible=1, all capabilities are granted to the jail. It isn't added by default, though.
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
I think the reason the rootless jail using host networking with Docker in it can't work is that Docker needs to modify iptables. Since it's using the host's networking, it would attempt to modify the host's iptables rules, for which it doesn't have permission as it is rootless. It would need its own network namespace, which e.g. bridge or macvlan networking provide.
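That matches the iptables error in your log. As a rough check (assuming iptables is installed in the jail), running the same command dockerd attempts should fail the same way in a rootless jail with host networking:

Code:
# the command dockerd runs when creating its NAT chain (from the log above)
iptables -t nat -N DOCKER
# expected failure in this setup:
# iptables v1.8.9 (nf_tables): Could not fetch rule set generation id: Permission denied (you must be root)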
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
I think the reason the rootless jail using host networking with Docker in it can't work is that Docker needs to modify iptables. Since it's using the host's networking, it would attempt to modify the host's iptables rules, for which it doesn't have permission as it is rootless. It would need its own network namespace, which e.g. bridge or macvlan networking provide.
That'd be it. I recalled that the log output above mentioned errors about iptables, so that sounds about right.

I also discovered it's not just me who experiences instability when using macvlan on top of a host bridge in a container. There are a few others on kernel 6.1.x seeing similar kernel call trace errors. Another piece of the puzzle is likely my switch: its default Port Security value is 64, which limits the number of unique MAC addresses per port, and hitting the limit causes interfaces to randomly get dropped. I currently have 60 MAC addresses on that port (due to virtual ethernet interfaces), so I would definitely have hit the limit previously.

I ran a separate container for testing (I ultimately felt that a single bridge interface was going to be too limiting long-term) and ipvlan has been stable so far, but I'll see what happens. I'll give it another week and then probably migrate my main Docker stack over. I'd prefer to use macvlan, so I'll see if increasing the Port Security value helps with that.
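For reference, at the jail level the ipvlan variant is just a different nspawn user argument (the interface name is a placeholder for whichever bridge/VLAN the jail should sit on):

Code:
--network-ipvlan=br20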

Thanks for putting together such a neat script @Jip-Hop
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118