Fun and difficulties with ports/pkg on FreeBSD

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Maintaining a local modified copy of ports sucks rocks. BTDT!

So, some of us maintain customized copies of FreeBSD with engineering requirements that an image be reproducible. This can be done in a variety of ways. I've been doing this since the very earliest days of FreeBSD and some design decisions are dictated by externalities you might or might not have, but in the specific case of images made here, we start with a RELEASE version of FreeBSD and add a specific list of ports to it. Ports are built just for "administrative" use, adding things like shells, benchmarks, utilities, etc. External services are an entirely different thing and aren't handled via ports. This allows a platform to be validated as functional and correct, and not subject to the random whim of external influences. I'm always entertained when something happens like someone deleting something and it breaks the world.

So one of the terrible things about FreeBSD ports is that it is basically a separate thing that is developed independently of FreeBSD. This means that because of the order in which releases happened, FreeBSD 11.4R ports more closely resemble FreeBSD 12.1R than 11.3R ports.

This basically blew up our FreeBSD 11 VMDK size, which used to be 23GB, but for 11.4R it has to be 24GB (and 12.1 is a staggering 30GB). We have a requirement that an image be able to host its ports tree, source tree, and have sufficient space to do a buildworld.

Unfortunately, lots of modern ports are pulling in craptons of dependencies. For example, I have no idea why FRR7 needs to pull in THREE different versions of Python ( python27-2.7.18_1, python37-3.7.8_1, python38-3.8.4 ) or why GIT needs to bring in things like BASH and SQLITE and dozens of other things. Worse, sometimes newer ports actually break old functionality, such as when FRR7's watchfrr script broke basic stuff that worked fine in Quagga and FRR4 (I think).

Ports itself kinda sucks rocks if you just want a reproducible result.

In many environments, it's just not cool to have a platform that is at some random and indeterminate state based on the date of your ports tree, and you want to validate that the thing works correctly. Some places do this thru gold masters, some thru branched ports trees. Lots of options.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Ports itself kinda sucks rocks if you just want a reproducible result.
Ever thought of building your own packages in poudriere? We do this from the quarterly ports tree branches.
You are probably well aware that this way you can disable many dependencies via custom options. E.g. no X11 ...

I don't get your point about 11.3 -> 11.4 -> 12.1 BTW. There is no such thing as a release ports tree. At least not in the form of an SVN tag. What used to be on the CDs in the good old days was just a snapshot at the time of release. Don't you follow at least the quarterly branch for security updates?

What I wonder is: how do they treat upstream updates to e.g. collectd? They cannot just sync but have to merge their local changes again and again and again. That's what I meant by "local copies suck".
 

Tigersharke

BOfH in User's clothing
Administrator
Moderator
Joined
May 18, 2016
Messages
893
Ever thought of building your own packages in poudriere? We do this from the quarterly ports tree branches.
You are probably well aware that this way you can disable many dependencies via custom options. E.g. no X11 ...

I know that when you grab a FreeBSD pkg, you get a prebuilt port with whatever the default config is, a one-size-fits-all approach that satisfies most situations. That is one problem with pkg. The other is that ports built on site with specific configurations may be replaced if the port involved is also a dependency of a pkg chosen to be installed. Those are the pkg issues and how they conflict with the freedom of ports configuration. However, there is another layer which is often unseen.

A big reason why some inexplicable dependencies are included is that things come from Linux, and some may be purpose-built for a particular distro. Most if not all of the stuff that originates from Linux uses a different method of determining the build configuration, which may be environment variable settings or make flags at invocation. Beyond this is our ports system, where we expect and assume that the maintainers and the original porter expose all the build options that exist on Linux.

Many of the inexplicable dependencies may not be directly needed by the software being built, and it is likely that those same dependencies are not exposed as easy-to-adjust config options in the ncurses menu. We who use the ports tree or pkg repo are at the mercy of those maintainers unless we fight through the Linux documentation (when it exists), or the code, or whatever else is necessary to discover why and how to eliminate those dependencies, or perhaps how to use more current versions (installed on the system) of things bundled with it.

  • I have used the ports tree alone and it worked as expected and drew in the needed dependencies. This is not always true anymore.
  • I have used pkg but even this is no guarantee: whether I build the port or install the pkg of openoffice-devel, both crash.
  • I have used portmaster and portupgrade in the relative past when PC-BSD was still around.
  • I tried poudriere and liked it, but this was at a time when the config for pkg repos or pkg itself was in flux. I had success for a time and then it broke, then I gave up on it.
  • I've been using synth most of the time but even it has issues at times. For unknown reasons I can sometimes build a port "manually" but synth fails for that same port.
  • I have also tried my hand at modifying port builds to eliminate dependencies or to have them produce a better result.
It seems that it has become ever more recommended to use a ports management program to do work that the ports tree is supposed to handle properly on its own. I vehemently oppose this direction.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Ever thought of building your own packages in poudriere?

No, because I've already got a tool that does the job, and I'm not really a "contrib tool of the day" fanboi. Poudriere is relatively new (2012) and I'm not looking for an intermediate step that requires even more time/space/IOPS anyways. Over the years FreeBSD has had numerous things come and go, and having been bitten by various fiascos such as vinum in the past, I don't care to invest significant engineering effort in possible losers.

We do this from the quarterly ports tree branches.
You are probably well aware that this way you can disable many dependencies via custom options. E.g. no X11 ...

Which you can do for ports as well.

I don't get your point about 11.3 -> 11.4 -> 12.1 BTW. There is no such thing as a release ports tree. At least not in the form of an SVN tag. What used to be on the CDs in the good old days was just a snapshot at the time of release.

Correct, and that's what has been tested to build by the FreeBSD ports team.

Don't you follow at least the quarterly branch for security updates?

To what end? I'm not using ports to create services. I don't really care if there's a bug in a utility that's been installed for admin use. We can't get security "updates" anyways, because the platform is frozen with chflags schg and securelevel set. The platform is an appliance that exists to run services.
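For anyone unfamiliar with that kind of freeze, here is a minimal sketch of how it can be done, not the actual build tooling used here. The function names and the directories in the usage line are hypothetical. The only real commands are chflags(1) with the schg flag and the kern.securelevel sysctl. The run helper prints commands instead of executing them when DRY_RUN is set, so the sequence can be reviewed first.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# freeze_platform DIR...: set the system-immutable flag on each tree,
# then raise the securelevel so the flag cannot be cleared again
# until the host is rebooted into a lower securelevel.
freeze_platform() {
    for dir in "$@"; do
        run chflags -R schg "$dir"
    done
    run sysctl kern.securelevel=1
}
```

Something like `DRY_RUN=1; freeze_platform /bin /sbin` would print the three commands without touching the system. To keep the securelevel across reboots, rc.conf has the kern_securelevel_enable and kern_securelevel knobs.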

I work with a different model than what you may be used to. I suspect you may have logged onto an *IX system somewhere and found something like, for example, three or four different partial Apache installs, two or three different ones from ports (especially back in the day when there were non-SSL and SSL versions) and one done in frustration by hand. This was actually pretty common back around FreeBSD ... 4? ... I wanna say, and there were divergent competing ports that did not do conflict checking. Anyways it was always very entertaining to log into some web server host that had been installed years prior and which had tried to follow FreeBSD and ports updates, because it would be a trainwreck. It was always particularly fun to find when some Linuxhead had gone and installed Apache by hand using paths that were familiar to them, so you ended up with several different httpd.conf files in various places on the system.

Anyways, I find it incredibly painful to have applications spammed throughout the base system. Inside /usr/local/etc, for example, which of those files are needed for Apache, or PHP, or perl? Which have been modified somehow to make an app run?

So I use a different model that isolates apps within their own directories. Traditional UNIX sometimes did this as /opt/${thing}. I typically do it out of root, so a web server might be "/www". /www is its own filesystem on a separate disk. This allows a host to be shut down, fed back into the OS installer, and come back out with a fresh OS build or image. You know that you're not toasting anything of value because all the important stuff (the app itself) is on its own disk.

What I wonder is: how do they treat upstream updates to e.g. collectd? They cannot just sync but have to merge their local changes again and again and again. That's what I meant by "local copies suck".

I try to do that as little as possible, but especially now with the transition from python27 to 37/38, ports is a mess.

This is one of the reasons it is useful to have fixed points in the ports tree. We have some local patches and maybe a few local ports that get injected back into the ports tree. Because we have a requirement that hosts must be able to build offline, having all of this wired down and having all the appropriate content for /usr/ports/distfiles automatically installed is part of what the local build system here does.
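As a rough illustration of the offline-build part (a sketch under assumptions, not the local build system itself): given a fixed list of port origins, one per line, the distfiles for each port and its dependencies can be pulled in ahead of time with the ports framework's fetch-recursive target. The list file format, the function names, and the DRY_RUN switch are made up for this example.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# prefetch_distfiles LIST [PORTSDIR]
# LIST holds one port origin per line (e.g. shells/bash); blank lines
# and lines starting with '#' are skipped. Distfiles end up in the
# tree's DISTDIR, which defaults to PORTSDIR/distfiles.
prefetch_distfiles() {
    list=$1
    portsdir=${2:-/usr/ports}
    while read -r origin; do
        case $origin in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so make cannot swallow the rest of the list
        run make -C "$portsdir/$origin" fetch-recursive </dev/null
    done < "$list"
}
```

Running it with DRY_RUN=1 first shows which make invocations would happen. After a real run, the contents of distfiles can be archived alongside the pinned ports tree so that a later build never needs the network.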

You can try to manage your own fork of ports, and I can imagine that you would experience some ongoing local skull ache if you tried to do this on an ongoing basis for ports that were actually used for a wide variety of services -- this would be a full-time job, I would think.

I suspect that the ports being used by FreeNAS would be more of a fixed list, along the lines of what I'm doing for the base OS, but probably larger than my list, because I'm guessing that stuff like collectd has local changes for FreeNAS. Some of this stuff is just going to take manual effort to maintain, because not all of it will be appropriate to send upstream. Speaking from experience, it isn't that hard to integrate local changes to updated ports *most* of the time.

But it's really frustrating to work out of a constantly changing ports tree that doesn't produce a consistent reliable result. That's the normal reason people fork ports, or do some hybrid strategy such as what I do.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
No, because I've already got a tool that does the job, and I'm not really a "contrib tool of the day" fanboi. Poudriere is relatively new (2012) and I'm not looking for an intermediate step that requires even more time/space/IOPS anyways. Over the years FreeBSD has had numerous things come and go, and having been bitten by various fiascos such as vinum in the past, I don't care to invest significant engineering effort in possible losers.
That's 8 years! In that time entire open source projects come and go. Linux will have seen three new overlay filesystem implementations for Docker ... :tongue:

But seriously - in a time way before ZFS, in which way was Vinum a fiasco? I ran it in production like I ran Veritas on Solaris (that's why it is named that way) and couldn't complain ... only ever did mirroring, though.

And the python versions ... you have to shift to python 3.7 at some point. We set DEFAULT_VERSIONS accordingly and there are only two or three packages left that pull in python 2. We told our customers we would remove it at the end of this year.
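A minimal sketch of what such a setting can look like, assuming the fragment ends up in the make.conf that poudriere reads (typically /usr/local/etc/poudriere.d/make.conf). The function name is hypothetical, and the version numbers and the X11 option are only examples taken from this thread, not a dump of our real config.

```shell
#!/bin/sh
# emit_make_conf: print an example make.conf fragment on stdout.
emit_make_conf() {
    cat <<'EOF'
# Use Python 3.7 wherever a port follows the default version.
DEFAULT_VERSIONS+= python=3.7 python3=3.7
# Switch off the X11 option for every port that offers it.
OPTIONS_UNSET+= X11
EOF
}
```

Usage would be something like `emit_make_conf > /usr/local/etc/poudriere.d/make.conf` before the next bulk build. Ports that hard-require python 2 will still pull it in regardless of the default.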

Our approach is to build a completely new installation with all packages needed by a particular customer every month. Then switch the images on "patch day" while leaving the customer data in place.

Anyway, I'd really appreciate some feedback from iX, because I have to decide if I try to submit the cputemp module to OPNsense or upstream the patch to FreeBSD and I'd like to know why they didn't. Perhaps I am overlooking something.

Kind regards,
Patrick
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Nice post, good explanation of a bunch of stuff I didn't touch on --

  • I have also tried my hand at modifying port builds to eliminate dependencies or to have them produce a better result.
It seems that it has become ever more recommended to use a ports management program to do work that the ports tree is supposed to handle properly on its own. I vehemently oppose this direction.

I would s/better/consistent/ in the above bullet point.

I gave up using the ports tree for doing serious work a long time ago, because it would continue what I deemed to be a bad strategy, which is that it spams stuff into the host OS. This makes it nearly impossible to determine what dependencies are, especially if you have a host that is performing multiple functions, and it also makes it very difficult to independently update things like OpenSSL or to choose alternative implementations but only for a single service.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That's the point I don't get. All ports install into /usr/local by default. None of the base OS utilities touch anything in there, neither do the ports touch anything outside. What do you mean by "spams"?

rm -rf /usr/local /var/db/pkg/* --> mostly virgin OS.

Stuff like OpenSSL, BIND, SAMBA, etc. do (or have in the past, I don't care to quibble) tinker with stuff that affects the platform. This has gotten somewhat better over the years, but things are definitely still spammed into base, especially /var and /etc. The classic example would be adding passwd entries.

But what I'm really getting at is that you cannot install a system that uses Apache plus LibreSSL along with MariaDB and OpenSSL from ports, and that once you have your mail server, database server, web server, etc., all stacked on a platform, you have a mess in /usr/local. Which applications need which files? What happens when you need to upgrade OpenSSL for one of them and another package breaks because of it?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
especially /var and /etc
Well, yes. Not /etc in general but definitely group and passwd entries. I cannot imagine a better system currently. When I really want to "clean up" instead of installing a fresh system, I clean the group and passwd entries, then look in /var for stuff with a numerical UID and nuke that.
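The "numerical UID" hunt can be sketched with find(1), which has -nouser and -nogroup tests for files whose owner or group no longer maps to a passwd or group entry. The function name is made up, and this version only lists candidates. Deleting them is left as a separate, deliberate step.

```shell
#!/bin/sh
# list_orphans [ROOT]: print every file under ROOT (default /var)
# whose UID or GID has no matching passwd/group entry.
list_orphans() {
    find "${1:-/var}" \( -nouser -o -nogroup \) -print
}
```

Run it after removing the leftover passwd and group entries, e.g. `list_orphans /var`, and review the output before nuking anything.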

Which applications need which files? What happens when you need to upgrade OpenSSL for one of them and another package breaks because of it?
Our solution is to install *everything* fresh each time and ro-mount these "blueprints" into the application jail. And why wouldn't you upgrade OpenSSL for all packages that depend on it? That's the reason for shared libraries, isn't it?

Our use cases as well as our expectations seem to be different to a degree that we will probably agree to disagree forever. I for one find the FreeBSD system once you use poudriere vastly superior to any Linux distribution I have tried. Regularly we have "brand spanking new" versions of e.g. PHP (say, 7.4) weeks before they make it into the Linux distros as regular packages. Creating Debian/Ubuntu packages is considered a black art. Our customers so far like what we do.
The biggest problem is that ports/packages are removed and due to the "everything depends on everything" and "build everything in poudriere each month" approach we cannot keep e.g. PHP 7.0 if we want to follow 2020Q3 and keep PHP 7.2, 7.3 and 7.4 up-to-date with patches. Absolutely no way. Transparency and clear communications are key here.
January 2021 we will remove python 2 from all our servers. We have been telling the customers for months already. Adapt your scripts, please. It *will* be gone.

Kind regards,
Patrick
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, yes. Not /etc in general but definitely group and passwd entries. I cannot imagine a better system currently. When I really want to "clean up" instead of installing a fresh system, I clean the group and passwd entries, then look in /var for stuff with a numerical UID and nuke that.

It's hard to deal with this when things get spammed together.

Our solution is to install *everything* fresh each time and ro-mount these "blueprints" into the application jail. And why wouldn't you upgrade OpenSSL for all packages that depend on it? That's the reason for shared libraries, isn't it?

Oh, I dunno, maybe because for years there were three separate and incompatible versions of OpenSSL, and some things were simply incompatible....? Or because you might want to run an alternative SSL for one service but not another? Some of the SSL alternatives out there work only in certain roles.

Our use cases as well as our expectations seem to be different to a degree that we will probably agree to disagree forever. I for one find the FreeBSD system once you use poudriere vastly superior to any Linux distribution I have tried. Regularly we have "brand spanking new" versions of e.g. PHP (say, 7.4) weeks before they make it into the Linux distros as regular packages. Creating Debian/Ubuntu packages is considered a black art. Our customers so far like what we do.
The biggest problem is that ports/packages are removed and due to the "everything depends on everything" and "build everything in poudriere each month" approach we cannot keep e.g. PHP 7.0 if we want to follow 2020Q3 and keep PHP 7.2, 7.3 and 7.4 up-to-date with patches. Absolutely no way. Transparency and clear communications are key here.
January 2021 we will remove python 2 from all our servers. We have been telling the customers for months already. Adapt your scripts, please. It *will* be gone.

You're experiencing the limitations of what you can do when you're dependent on ports. That's fine. In some ways, it is very hard work to build something more specialized, but service provider class hardening and security forces tradeoffs, and I'm fine with not relying on ports for those things.
 