Shadow on a sparse null diet

Tomato wrapped with a tape measure
Image by Myriams-Fotos on pixabay

Summer is slowly ending (at least in my hemisphere), so dieting tips to slim down in time for the beach are a bit out of season at the moment, but who doesn't like a guy who suggests that everyone's shadow is too fat and implements a \0 diet anyhow? Thankfully that isn't only surprisingly easy, but also surprisingly impactful:

This is the story of how I ended up potentially saving terabytes of disk space across the world by changing two lines of code.

So, what happened?

Some background first: If you want to create a new user on Debian, you tend to invoke adduser, which is, simply put, a wrapper around useradd. That tool is part of shadow-utils, commonly packaged as just shadow, as it deals with the /etc/passwd and /etc/shadow files. It also deals with two log files: /var/log/lastlog and /var/log/faillog.

As you might have guessed from their names, those logs store (among other things) the time a user last logged in and whether login attempts failed. They do so in a binary format, and not by appending to the file: each user has a predefined place, determined by their UID, to record this information.

So far so good (or bad, depending on how good your spider-sense is). Now, users can not only be added but also deleted, and hence their UID can be recycled. So adding a user has to ensure that old log data of a since-deleted user isn't reused for the new user with that UID. The data hence needs to be reset, which in this binary format means overwritten with zeros.

My initial problem with this was on a personal obsession level: If you bootstrap a Debian system, /var/log/lastlog (/var/log/faillog) is empty before apt's postinst runs, and afterwards they are (sparse) files of 29492 (and 3232) bytes containing nothing but zeros. That seemed like a totally pointless waste of resources to me (and something I can't reasonably explain and certainly don't want to emulate in apt's postinst script).
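Where those odd-looking sizes come from can be sketched in a few lines. The record sizes below (292 bytes per lastlog entry, 32 bytes per faillog entry) are an assumption based on the usual 64-bit Linux layout; they are not stated above, but they do reproduce the quoted file sizes for UID 100. The function names are made up for illustration.

```python
# Record sizes are assumptions (typical 64-bit Linux/glibc layout), chosen
# because they reproduce the file sizes quoted in the text.
LASTLOG_ENTRY_SIZE = 292  # assumed sizeof(struct lastlog)
FAILLOG_ENTRY_SIZE = 32   # assumed sizeof(struct faillog)


def entry_offset(uid, entry_size):
    """Byte offset at which the fixed-size record for `uid` starts."""
    return uid * entry_size


def size_after_reset(uid, entry_size):
    """Minimum file size once the record for `uid` has been zeroed."""
    return entry_offset(uid, entry_size) + entry_size


if __name__ == "__main__":
    # UID 100 is the one apt's system user usually ends up with.
    print(size_after_reset(100, LASTLOG_ENTRY_SIZE))  # 29492
    print(size_after_reset(100, FAILLOG_ENTRY_SIZE))  # 3232
```

The point is that the file size depends only on the highest UID ever touched, not on how many users exist.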

Okay, sparse means they are full of holes (yes, that is a technical term, see the lseek(2) manpage for details), meaning most of those zeros are only implicitly there and don't take up any space in reality. Or at least in some realities, but we will come back to that. Let's just assume for now that I would prefer those all-zero files, sparse or not, to not be created (or, well, filled in, as they are already created, just empty).
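If you want to see a hole for yourself, here is a minimal sketch that creates a file consisting of nothing but a hole and compares its apparent size with what is actually allocated. It assumes Linux semantics where st_blocks counts 512-byte units; the helper names are invented for this example, and the allocated figure depends on your filesystem.

```python
import os
import tempfile


def make_hole_only_file(path, size):
    """Create a file of `size` bytes without writing a single byte of data."""
    with open(path, "wb") as f:
        # Extending via truncate leaves a hole on filesystems that support it.
        f.truncate(size)


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for `path`."""
    st = os.stat(path)
    # Assumption: st_blocks is counted in 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lastlog")
        make_hole_only_file(path, 29492)
        apparent, allocated = apparent_and_allocated(path)
        print("apparent:", apparent, "allocated:", allocated)
```

Reading such a file back still yields 29492 zero bytes, which is exactly why unsuspecting tools can't tell the difference.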

If we called useradd directly, we could use the --no-log-init flag, which is documented to avoid resetting the databases. But Debian uses adduser, which doesn't have such a flag. And what would happen if the files weren't empty and a user with that UID had in fact existed previously and been deleted, so that we indeed need to reset… wait: Didn't I say the files were empty before useradd was run in my use case? Empty files surely hold no data of previously deleted users, so why should a reset be performed? Turns out the reset is performed as long as the files exist, regardless of their size.

So, how about changing the access() call to a stat() call and not performing the reset if the file is smaller than the offset we want to reset? That was merged upstream within a single week after a short discussion. Now I am patiently waiting for an upstream release and/or for Debian to package it (an MR to fast-track the patch into Debian was already merged on salsa).
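The idea can be paraphrased in Python as follows. This is only a sketch of the decision, not the actual C patch: the function name and the entry_size parameter are made up for illustration, and the exact comparison used upstream may differ in detail.

```python
import os


def should_reset_entry(log_path, uid, entry_size):
    """Decide whether the record for `uid` needs to be zeroed.

    Old behaviour (sketched): reset whenever the file exists.
    New behaviour (sketched): reset only if the file is large enough
    that it could contain stale data at this UID's offset.
    """
    offset = uid * entry_size
    try:
        size = os.stat(log_path).st_size
    except FileNotFoundError:
        return False  # no log file, nothing to reset
    # A file that ends at or before the offset cannot hold an old record.
    return size > offset
```

With an empty /var/log/lastlog this returns False for every UID, so the file stays empty until a real login writes to it.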

And there you have it: the files remain empty after apt's postinst runs, waiting to be filled with actual data when someone actually performs a login (probably not as the _apt user, but that doesn't matter). I am happy and the rest of the world doesn't care…

Or perhaps they do?

The UID apt usually gets assigned is 100. That is pretty tiny. The libvirt-daemon-system package, for example, creates a user with the fixed UID 64055. In other words: On install your lastlog file grows to 18 MB, even though you will probably never log in as that user and likely never had a user with that UID before. Okay, pretty much all of those 18 MB are one big hole, but with a popcon of 12694 (6.64%) that means a lower bound of ~230 GB of holes on Debian machines alone. And that is still a tiny UID: some upstreams want to avoid stepping on the toes of individual distributions and pick UIDs well past 1000000000…
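For those who like to check the arithmetic, a rough back-of-the-envelope sketch follows. It assumes 292 bytes per lastlog record (the usual 64-bit Linux layout, not a number given above); with that assumption the exact figures come out slightly above the rounded ones in the text.

```python
LASTLOG_ENTRY_SIZE = 292  # assumed bytes per record on 64-bit Linux
LIBVIRT_UID = 64055       # fixed UID mentioned in the text
POPCON_INSTALLS = 12694   # popcon count mentioned in the text


def lastlog_size(uid, entry_size=LASTLOG_ENTRY_SIZE):
    """Apparent lastlog size once the record for `uid` has been written."""
    return (uid + 1) * entry_size


if __name__ == "__main__":
    per_machine = lastlog_size(LIBVIRT_UID)
    print(per_machine)                      # 18704352 bytes, about 18.7 MB
    print(per_machine * POPCON_INSTALLS)    # 237433044288 bytes, about 237 GB
    print(lastlog_size(1_000_000_000))      # 292000000292 bytes, about 292 GB
```

The last line is the per-machine size for a single user with UID 1000000000, which shows how quickly this escalates.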

It is just a hole and nobody cares about the size of a hole, right?!

Just ship it in a container

Some people seem to disagree on that. So much so that the Docker documentation calls using --no-log-init a best practice, mentioning both that Debian's adduser doesn't have that flag and that an unresolved Go bug from 2015 is ultimately to blame for Docker images blowing out of proportion if you deal with big holes in lastlog.

Whether sparse files and their holes work depends on support in the operating system and the underlying storage (all the most interesting filesystems like btrfs, ext4 and tmpfs have supported them since at least Linux kernel 3.8). But as sparse files by design behave like normal files to unsuspecting applications, your choice of tool for file copy/transfer, backup and tarball creation can result in a sudden decompression of the holes.
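To illustrate what "supporting holes" means for a copy tool, here is a minimal sketch of a hole-preserving copy using the SEEK_DATA and SEEK_HOLE whence values from lseek(2). It assumes Linux and a Python build that exposes os.SEEK_DATA and os.SEEK_HOLE; sparse_copy and its helper are names invented for this example, not an existing tool's API. A naive read-everything, write-everything loop would instead write every zero out explicitly.

```python
import errno
import os

CHUNK = 1 << 20  # copy data extents in pieces of at most 1 MiB


def _pwrite_all(fd, data, offset):
    """Write all of `data` at `offset`, coping with short writes."""
    view = memoryview(data)
    while view:
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written


def sparse_copy(src_path, dst_path):
    """Copy a file, writing only its data extents so holes stay holes."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                data_start = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # only a hole remains until end of file
                raise
            data_end = os.lseek(src, data_start, os.SEEK_HOLE)
            pos = data_start
            while pos < data_end:
                chunk = os.pread(src, min(CHUNK, data_end - pos), pos)
                if not chunk:
                    break
                _pwrite_all(dst, chunk, pos)
                pos += len(chunk)
            offset = data_end
        # Restore the apparent size, including any trailing hole.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)
```

On filesystems without hole support the whole file is simply reported as data, so the copy degrades to an ordinary one rather than failing.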

Go's tar implementation is, for example, far from the only one missing support for sparse files. It is even ahead of the curve in supporting them while reading. So, if your docker/podman/… images, tarballs or off-site backups turn out to be a bit (or a lot, depending on the UIDs you set up in them) smaller than before, even though you didn't follow the best practice of opting out of the reset: You are welcome.

And the moral of the story? Don't document workarounds as best practice.

Thanks to Johannes Schauer Marin Rodrigues for apt!254 triggering my obsession and later pointing out Docker as a potential beneficiary of my shadow!558 change, and of course to the shadow upstream maintainers for entertaining my tiny contribution.

Sidenote: This of course changes nothing about the problem inherent to lastlog of potentially growing absurdly and spontaneously if you happen to log in as a user with a high UID. It just delays this potential failure mode on the assumption that such a login attempt will not happen in practice, or that you at least do not place the resulting system back into a tarball. Workarounds like --no-log-init, LASTLOG_UID_MAX or systemd-sysusers bank on that as well: They all have the latent problem of a yo-yo diet…

Sidenote 2: useradd could be made more clever by e.g. not writing zeros into an existing hole, trimming the file on resets and so on. That wouldn't resolve the underlying problem of a fixed, unchangeable binary format though. Alternatives exist, and they are even on your system already: Hello utmp and co, see last (although, if you ask me, that should have been an alias for tail -n1, just saying). systemd in the meantime has a TODO entry to fold them all into the journal and doesn't care about lastlog otherwise.