I thought that was the point too. However, I just did "docker pull postgres", "docker run -e POSTGRES_HOST_AUTH_METHOD=trust postgres", "docker run postgres ls -l /bin/sh" and it shows a link to dash (a minimal POSIX shell, much smaller than bash).
To the best of my understanding, this isn't someone's private copy of Postgres, it's the version that most of the Docker world uses. It downloaded a crapton of stuff from the Internet in doing this. I'm too Docker-naive to have done anything complicated or clever installing the thing.
As far as I have seen, from developing and administering a wide variety of things, the somewhat-dirty secret is that few images are as stripped-down as they could be. This is good and bad.
It's good because Docker's tooling is byzantine when things work well and infuriatingly user-hostile when they don't. The main problem, from my perspective - keeping in mind that I need to peek into the containers for stuff like debugging, configuration, etc. - is that the container's filesystem is an opaque image. To access anything inside the container, there's no reasonable alternative to executing a shell process inside the container and attaching a TTY to it (
docker exec -it ${CONTAINERNAME} bash
, slightly less horrible with docker-compose:
docker-compose exec ${CONTAINERNAME} bash
).
It's bad because the attack surface is larger and it adds bloat.
What happens in practice, behind the buzzwords, is the following:
1. Take a "base image", which is the base userland. Think bsdinstall or debootstrap, but as a prebuilt, proprietary image.
2. Add/change your application's dependencies on top of that, using the standard tooling. Hopefully the author cleaned up the cruft!
3. Add the actual application bits on top of 2.
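Sketched as a (hypothetical, much-simplified) Dockerfile, the three steps look like this; "libfoo" and "myapp" are made-up names:

```dockerfile
# 1. base image: a prebuilt userland, here Debian
FROM debian:bookworm-slim

# 2. dependencies, installed with the distro's standard tooling;
#    removing apt's cache afterwards is the "cleaned up the cruft" part
RUN apt-get update \
    && apt-get install -y --no-install-recommends libfoo \
    && rm -rf /var/lib/apt/lists/*

# 3. the actual application bits
COPY myapp /usr/local/bin/myapp
CMD ["myapp"]
```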
3. is pretty straightforward. 2. suffers from all the same things that have always plagued admins.
1. is "pick a distro" territory, with all that entails. Postgres, for instance, offers two sets of options: Debian or Alpine. The former is something like a minimal install of Debian. Alpine is weird and less-than-compatible, by virtue of shipping musl instead of glibc, which means that things break sometimes.
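The size difference is easy to see by pulling both official variants (tag names are examples, not guaranteed current):

```shell
# the two flavors of the official image; the Alpine one is
# considerably smaller
docker pull postgres:16-bookworm
docker pull postgres:16-alpine
docker images postgres
```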
The crapton of stuff consists of the various layers, basically a stack of CoW overlays. These can be merged together, but common practice is to keep several layers to allow for reuse. E.g. one layer for the base image, one for major dependencies that don't change much, one for dependencies that do change frequently, and one for the actual application.
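The layers are easy to see for yourself; `docker history` lists them along with the instruction that created each one:

```shell
# show the layer stack of the image pulled earlier, newest layer
# first, with the creating instruction and the size of each
docker history postgres
```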
Although it's not quite "Bob's copy of Postgres", it's not too far off. It's effectively "Postgres installed in a semi-custom environment that Bob set up", shipped in binary format. For open-source stuff, the expectation is that the Dockerfiles defining the build are also OSS.
> As for Docker - I thought that was the point? Containerising a single binary, not a full blown OS environment. That's how Docker proponents always explained it to me.
That's mostly a fantasy along the lines of "the OS is the kernel and the rest is like, your opinion, man". It's a line of reasoning that serves to dump problems and design flaws on top of others, but adds nothing to any real-world scenario. Conceptually, you could compile a single binary that does everything, but there are good reasons why nobody does that outside of embedded systems.
So, what's good about Docker? I think there are two things that made it popular: Dockerfiles and Dockerhub. A Dockerfile doesn't do anything you can't do in a shell script, but it makes a lot of boilerplate stuff easy, allowing for nice, repeatable (not necessarily reproducible, in practice) builds without the frustration of a typical shell script. And Dockerhub makes it easy to get started by just downloading an image and then letting you grab the Dockerfile and work from there.
Some of the networking options are neat: they automate boring stuff like DNS and are good enough for containers that should not communicate with the outside world, only with other containers. I say "some" because the options are somewhat bloated, as I understand it in order to support Docker Swarm (which I will venture to say is used by nobody of consequence).
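The DNS automation in particular is a one-liner to try (the network and container names here are made up):

```shell
# user-defined bridge networks get Docker's embedded DNS;
# containers on the same network resolve each other by name
docker network create backend
docker run -d --name db --network backend \
    -e POSTGRES_HOST_AUTH_METHOD=trust postgres
docker run --rm --network backend alpine ping -c 1 db
```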
I think that's about where my praise ends, at least relative to jails. Sure, it saves my ass to be able to run an aggressively closed-source app developed for a very specific CentOS 6 environment without having to run CentOS 6 on the host machine, but jails can do that, too.
Most of Docker's problems are probably caused by unfocused development efforts, throwing random features at the wall to see what sticks. There seems to be exactly zero planning involved in the development of Docker. Let's take a look at docker-compose: Compose files are where the problems begin, and I invite those fortunate enough to not yet have been scarred to have a look at the documentation. Here are some nuggets:
- You'd think 2.x would supersede 1.x and the same for 3.x and 2.x. But that's not true, they evolved in parallel over multiple versions of Docker.
- Big deal, just update Docker and use 3.x, you say? Nope, 2.x has features not in 3.x.
- What really changed? I'm not sure I can explain it. The changes seem more like a Wikipedia edit war over the capitalization of an article's title than any serious change to the API.
- Want to know more about a given entry? Chances are, it'll just point to the main Docker docs, which tend to be more bizarre than helpful.
- Wait, what's this "Compose Specification"? Excellent question; I think they gave up on versioning and now it's a fscking free-for-all. Oh, and the docs are worse now: they're just a crummy markdown file on GitHub.
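For the record, this is the kind of file being argued over. The top-level `version` key is what selects the 2.x or 3.x schema, and the newer Compose Specification simply ignores it; service names and values here are made up:

```yaml
version: "3.8"        # schema selector; 2.x and 3.x diverge here
services:
  db:
    image: postgres
    environment:
      POSTGRES_HOST_AUTH_METHOD: trust
```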
Docker also suffers a bit from having to reinvent some wheels, being a Linux thing:
- Image handling. For systems without clean water (i.e., ZFS), they need to be able to support the image stuff. If they could rely on ZFS, it could trivially be replaced with ZFS snapshots and send/recv. There is a ZFS storage driver, but I'm not sure it uses any ZFS features beyond creating tons of datasets. Still, it's better than partitioning off some physical disk space and handing it over to Docker, which was the recommended approach and maybe still is.
- A million distros mean a million base images and it becomes hard to standardize tooling.
- Linux doesn't have a first-class container primitive, so Docker has to cobble together the various bits and pieces (namespaces, cgroups, and friends) to create that abstraction.
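Which of those reinvented wheels your daemon is actually using is at least easy to check:

```shell
# print the storage driver the daemon uses for image layers
# (e.g. overlay2, or zfs if the ZFS driver is configured)
docker info --format '{{.Driver}}'
```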