Winning the Google Open Source Lottery

I don't know about you, but I frequently get mails announcing that I was picked as the lucky winner of a lottery, a compensation program or simply as a "business associate". Obvious spam of course, that never happens in reality. Just like my personal "favorite" at the moment: mails notifying me of an inheritance from a previously (more or less) unknown relative. It's just that this is basically what happened to me a few weeks ago in reality (over the phone, though) – and I am still dealing with the bureaucracy required to teach the legal system that I have absolutely no intention of becoming the legal successor of someone I haven't seen in 20 years, regardless of how close the family relation is on paper… but that might be the topic of another day.

On the 1st of March a mail titled "Google Open Source Peer Bonus Program" looked at first as if it would fall into this lottery-spam class. It didn't exactly help that the mail was multipart HTML and text, with the text part really containing only the text, not mentioning the embedded links used in the HTML part. It even included a prominent and obvious red flag: "Please fill out the form". A 20% Bayes score didn't come from nothing. Still, for better or worse the words "Open Source" made it unlikely to be spam, similar to how the word PGP indicates authenticity. So it happened: another spam message became true for me. I wonder which one will be next…

You have probably figured out by now that I didn't know that program before. Kinda embarrassing for a previous Google Summer of Code student (GSoC is run by the same office), but the idea behind it is simple: Google employees can nominate contributors to open source stuff for a small monetary "thank you!" gift card. Earlier this week the winners for this round were announced – 52 contributors including yours truly. You might be surprised, but the rationale given behind my name is APT (I got a private mail with the full rationale from my "patron", just in case you wonder if at least I would know more).

It is funny how a guy who was taken aback by the prospect of needing a package manager like YaST to use Linux contributed the first patch to apt just months later and has, roughly 8 years later, amassed more than 2400 commits. It's birthday season in my family, with e.g. mine just a few days ago, so it seems natural that apt has its own birthday today, just as if it were part of my family: 19 years old this little bundle of bugs joy is now! In more sober moments I sometimes wonder how apt and I would have turned out if we hadn't met. Would apt have met someone else? Would I? Given that I am still the newest team member and only recently joined Debian as a DD at all…

APT has some strange ways of showing that it loves you: it e.g. helps users compose mails which end in a dilemma, to give a recent example. Perhaps you need to be a special kind of crazy1 to consider this good, but as I see it apt has a big enough userbase that regardless of what your patch is doing, someone will like it. That drastically increases the chances that someone will also like it enough to say so in public – offsetting complaints from all those who don't like the (effects of the) patch, which are omnipresent. And twice in a blue moon some of those will even step forward and thank you explicitly. Not that it would be necessary, but it is nice anyhow. So, thanks for the love supercow, Google & apt users! 🙂

Or in other words: APT might very well be one of the most friendly (package-manager-related) projects to contribute to, as the language-specific managers have smaller userbases and hence a smaller chance of having someone liking your work (in public)… so contribute a patch or two and be loved, too! 💖

Disclaimer: I get no bonus for posting this, nor are any other strings attached. Birthdays are just a good time to reflect. In terms of what I do with my newfound riches (in case I really receive them – I haven't yet, so that could still be an elaborate scam…): APT is a very humble program, but even it is thinking about moving away from a dev-box with less than 4 GB of RAM and no SSD, so it is happily accepting the gift and expects me to upgrade sooner now. What kind of precedent does this set for the two-decades milestone next year? If APT isn't obsolete by then… we will see.


  1. which even ended up topping Hacker News around New Year's Eve… who would have thought that apt and reproducibility bugs are top news ;) 

Switching from ikiwiki to staticsite

Earlier this year Enrico Zini experimented with static site generators and ended up writing his own in Python via component reuse. I was running ikiwiki myself until now, which is okay, but I never became ultra happy with it. One of the factors was that I don't speak Perl – but then, I don't speak Python either. The biggest annoyance of ikiwiki is that it supports so many features which I don't want/need, like online editing, while other changes, like working on the theme, are very hard to do.

staticsite (aka ssite) on the other hand provides very little in terms of features in comparison, but that just means I had less to disable and could play a bit with Python, Markdown and Jinja2 on my own to implement what I really wanted to have (which mostly isn't available in ikiwiki either, so I would have had to anyway) – and I like playing with such small features, compared to rolling my own static site generator in perfect NIH fashion. So, what features? Thanks for asking!

Tag cloud

ssite supports tags just fine and clouds are a display thing, so that is a matter of the template – or so I thought at least. Counting is a bit hard, as Jinja2 doesn't let you reassign a variable inside a loop and tries to print whatever an expression returns. So what I use at the moment looks like a scary hack, but it seems to work for me for now:

{%- for taxonomy in taxonomies() -%}
<ul class="tags">
  {#- Jinja2 won't let a counter set inside a loop survive it, so (ab)use a
      one-element list instead: pop/append mutate it in place. The empty
      if-block only swallows the None that append() returns. -#}
  {%- set total_tag_number = [ 0 ] -%}
  {%- for t in taxonomy.items -%}
    {%- if total_tag_number.append(total_tag_number.pop() + 1) -%}{%- endif -%}
  {%- endfor -%}
  {%- for key in taxonomy.items.keys()|sort -%}
    {%- set t = taxonomy.items[key] -%}
{#- rate each tag by its page count relative to the total number of tags -#}
<li data-rate="{{((10/total_tag_number[0])*(t.pages|length))|round(0,'ceil')|int}}"><a href="{{url_for_tags(t)}}">{{t.name}}</a></li>
  {%- endfor -%}
</ul>
{%- endfor -%}

The resulting unordered list can be styled at will with CSS; I opted for changing the font-size based on the data-rate field – manually for now, as support for attr() isn't completely there yet – and for making the list look less like a list.

ul li[data-rate] a { font-size: 80%; }
ul li[data-rate="1"] a { font-size: 90%; }
ul li[data-rate="2"] a { font-size: 100%; }
ul li[data-rate="3"] a { font-size: 110%; }

ul.tags, ul.tags li {
    list-style-type: none;
    display: inline;
}
ul.tags li:after { content: ", "; }
ul.tags li:last-child:after { content: ""; }

The result can be seen in the sidebar of this blog. Not completely happy yet, but a good first draft.

Smilies, Emojis, Emoticons and Unicode

ikiwiki has a plugin which uses a markdown list to define which text is mapped to other markdown text (usually an image). That is okayish, but results in a wild mixture of picture styles. It just looks strange if you have a blogpost and one of your smilies has a 3D effect while the other hasn't. Additionally, those are a bunch of small image files… for what is effectively a bit of text – or one (displayed) character: Unicode supports all kinds of smilies and other lovely pictures. Or at least it does if the font you are using does, but that might be the topic of another post. Assuming good support you can do many cool things with it. The usual smilies like 🙂 😉 😃 😎 the absolutely needed cat variants like 🐱 😸 places on a 🗺 like the 🗽 and of course also various people like 👨 👩 💂 👧 and 👦 who can travel to those places by 🚗 or 🚂 And some of those codepoints can be Fitzpatrick-skinned so that 👍 becomes 👍‍🏻 👍‍🏼 👍‍🏽 👍‍🏾 👍‍🏿
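Under the hood these are plain codepoint sequences which simply compose – something a few lines of Python can demonstrate (how it renders still depends entirely on your font):

# A Fitzpatrick modifier (U+1F3FB..U+1F3FF) follows a base emoji, and two
# regional indicator letters (U+1F1E6..U+1F1FF) pair up into a flag.
thumbs_up = "\U0001F44D"            # 👍
print(thumbs_up + "\U0001F3FD")     # 👍🏽 type-4 skin tone, font permitting
print("\U0001F1E9" + "\U0001F1EA")  # 🇩🇪 regional indicators D + E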

That feature isn't really ssite-specific: it is implemented as a simple python-markdown extension replacing things like :) with 🙂 – just with many, many more of those replacements. You would think there are many pre-existing extensions dealing with this, but I couldn't really find one to my liking. Most of them actually do the same as the ikiwiki plugin: map text to tiny images. An honorable mention is githubemoji, which hotlinks the emojis github supports (& has replacement images for). There are others which do similar (hot)linking of images, but a small supported character range is common, and funky stuff like the mentioned Fitzpatrick modifiers or region letters creating flags ( 🇩 + 🇪 = 🇩‍🇪 ) isn't supported in any. So, you guessed it: I implemented it myself. Partly at least. I am still working on making this great while learning Python with it, so that will take some time still, as I have basically reimplemented it a few times already… and there is also the problem that font support for this isn't even close to universal, so something needs to be done about that, too.
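For the curious, the core mechanism is small enough to sketch. This is not my actual extension, just a minimal toy against the current python-markdown 3.x API – and a preprocessor is the crudest possible hook, as it also replaces inside code blocks, which a real implementation has to avoid:

import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor

# tiny excerpt of what would be a much longer mapping table
EMOTICONS = {":)": "\U0001F642", ";)": "\U0001F609", ":D": "\U0001F603"}

class EmoticonPreprocessor(Preprocessor):
    def run(self, lines):
        # naive textual replace over the raw source lines
        for text, emoji in EMOTICONS.items():
            lines = [line.replace(text, emoji) for line in lines]
        return lines

class EmoticonExtension(Extension):
    def extendMarkdown(self, md):
        md.preprocessors.register(EmoticonPreprocessor(md), "emoticons", 25)

md = markdown.Markdown(extensions=[EmoticonExtension()])
print(md.convert("Hello world :)"))  # <p>Hello world 🙂</p>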

Theme of the site

ssite comes by default with a bootstrap-based theme. That is okay, but it also means it looks like nearly every other page on the planet in terms of colors and stuff. It also means I have to include hundreds of kilobytes of frameworks in CSS and JavaScript (preferably via some CDN) to get that. And then I have to change the HTML to drop classes everywhere which control the look and feel of the elements I have styled with them. That is okayish for large projects, I guess. I used it myself in the past, but perhaps the conversion from bootstrap 2 to 3 I did as part of a university project some time ago distorts my feeling in the negative direction. I kinda like fiddling with CSS and JavaScript (after all, I created my own Firefox extension to fiddle with them on all sites I visit: dotPageMod), and as this is a personal page I don't have much of a problem if it isn't working in IE8 or whatnot, so the theme is a personal creation. Very minimal, as I actually liked that property in ikiwiki, but with less ugly changes in the ikiwiki-specific template syntax and more with Jinja2 – which is its own template syntax, but at least an independent engine used by others as well, so I might stumble over it again, and it feels overall more powerful & natural.

So, perhaps not pretty and not a maximum in browser compatibility, but mine, and that makes me happy. 😃 There are various things I want/might change in this section as well, but a website is never really done anyhow… 🚧 👷.

anything else?

Not much actually. Enrico implemented a markdown jinja2-filter based on a proof-of-concept patch from me and really quickly fixed some bugs I had reported. All in all the journey so far was quite enjoyable and it will be interesting to see how my impression is next year! As mentioned I have some ideas still, but I wanted to make a cut now and declare it version 1…

Also: one last farewell to ikiwiki. I leave you for a younger & prettier alternative, but don't you worry: you served me well (and still do in some places), there are many others who still depend on you, and who knows, perhaps I will leap back if I ever want to get into Perl. After all, one of the reasons I opted for ikiwiki back then was that I might learn some Perl in the process. Didn't work, but perhaps next time. So long and thanks for all the fish, ikiwiki!

the new apt-transport-tor

It happened: now that I have been an uploading DD for a few months I finally made my first upload of a package – mind you, not of apt, but of a package I declared my intent to "steal" from another person a few weeks ago on deity@ and later also in a bugreport (#835128).

The result is that apt-transport-tor which used to be maintained by Tim Retout as a modified copy of apt code is now maintained by the APT team (with him and me as uploaders) using the apt code directly via a few symlinks.

That brings along a bunch of changes which I mentioned in the list/bug as well, but for completeness:

I had tried a few times to get people to provide feedback, but there wasn't much. I guess this is good, as it means nobody has any complaints about it. We will see if that changes now that it is on its way to the archive, buildds, mirrors and users: brace for impact in any case!

GSoC 2016: Summary

Intro, Outro and tl;dr

I participated again as a student in this year's edition of the Google Summer of Code with Debian, on the project APT↔dpkg communication rework. My initial proposal on the wiki details me and the plan, while this post serves as link hub and explanation of what I did. You can also find personal week-to-week posts starting with Day 0 right here on this blog, too.

The code for this project was already merged and uploaded to Debian unstable multiple times over the course of the summer, so everything described here can be experienced directly. The official GSoC 2016 final release is 1.3~rc2, but APT always moves forward and I have no intention of leaving it alone, so this tag just marks the end of the GSoC 2016 period and my return to "normal" contribution levels.

On a social front I finally applied for and shortly after received "Debian Developer, uploading" status. This is also the moment I want to thank Michael Vogt (mentor, apt), Julian Andres Klode (apt), Manuel A. Fernandez Montecelo (aptitude), Guillem Jover (dpkg), Enrico Zini (AM), the Debian Outreach team and the countless people I should have mentioned here, too, who have all helped me in many ways over the course of this GSoC and my entire Debian journey up to this point.

It was an overall great experience to work again on something as important as APT in Debian on a full-time basis. After two (very different) GSoCs in 2010 and now 2016 I can wholeheartedly recommend that any student with a passion for open source apply next year. Perhaps in Debian, and maybe in an APT project? We are waiting for YOU!

Statistics

My first commit as part of GSoC was made on 25 April, titled edsp: ask policy engine for the pin of the version directly (minor bugfix); the last commit I will be counting was made on 17 August, titled methods: read config in most to least specific order (regression fix). Not all of them are directly related to the GSoC project itself (the first is) – some "just" fall into the timeframe (like the last) but were handled as part of general emergencies or for similar reasons described later and/or in the weekly reports. This timeframe of 115 days saw a total of 222 commits authored by me + 9 commits committed by me for others (translations, patches, …). The timeframe saw 336 commits as a whole, making me responsible for a bit shy of ⅔ of all APT commits in this timeframe, with an average of nearly 2 commits each day. A diffstat run over my commits says "322 files changed, 11171 insertions(+), 5847 deletions(-)" consisting of code, documentation and tests (this doesn't include automatic churn like the regeneration of po and pot files, which would distort the statistic). As a special mention, our tests alone changed by "109 files changed, 2759 insertions(+), 1063 deletions(-)". In my weekly reports here on this blog I used ~10574 words (not including this post), another ~23555 words in the IRC channel #debian-apt and sometimes very long mails to deity@ and bugreports (~100 mails) [not counting private chit-chat with my mentor via IRC/mail].

APT External Installation Planner Protocol (EIPP)

The meat of the GSoC project was the ability to let libapt talk to (external) executables (called planners) which are tasked with creating a plan for the installation (and removal) of packages from the system in the order required by their various dependency relations, similar to how libapt can talk to external dependency solvers like aspcud via EDSP. The protocol (current, archive) details how apt and a planner communicate. APT ships such an external planner already in the form of 'apt', which is "just" using the internal, always existing planner implementation, but reads and talks proper EIPP. The major benefit is testing, as it is now possible to generate an EIPP request, feed it to different versions and compare the results to find regressions and similar. It also helps with bugreports, as such a request is now auto-generated and logged so that it can be easily attached to a bugreport and a triager can use that file to reproduce the problem. Previously, recreating the system state a user had before the failed upgrade was a very involved, error-prone and time-consuming task (actually fixing the problem still is, but at least the first step got a lot easier).
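To give a feel for the mechanics – the linked protocol document is the authority, and all field names and the response format below are purely illustrative – a planner is just an executable reading a deb822-style request on stdin and answering on stdout. A Python skeleton could start out like this:

import sys

def stanzas(stream):
    # yield each deb822-style paragraph as a dict (naive: no folded fields)
    current = {}
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            if current:
                yield current
            current = {}
        elif ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    if current:
        yield current

paragraphs = list(stanzas(sys.stdin))
request, packages = paragraphs[0], paragraphs[1:]
print(f"planning for {len(packages)} package stanzas", file=sys.stderr)
# a real planner would compute an ordering now; this toy only reports
# failure (in an illustrative format) so the caller can fall back
print("Error: toy-planner-gave-up")
print("Message: ordering not implemented in this sketch")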

APT's good old planner implementation also saw the activation (and fixing) of many previously experimental options intended to optimize the process, previously blocked by items of the next paragraph, which makes it look like a new planner now. Entirely new planners exist as prototypes, but they aren't suitable for real use yet due to not handling "edgecases" and being affected by bugs in dpkg. Summary: everyone can create and work on their own planner in the programming language of their choice and run it against real-world cases, directly opening a competition space for the invention of future improvements.

APT↔dpkg communication

The other major building block, and donor of much of the project's name. Assuming a planner has figured out a plan, there is still much left to do which is of no concern for each planner but handled centrally in libapt: the actual calling of dpkg and interpreting its replies. That sounds easy enough, but if you imagine thousands of packages needing to be installed/configured at once, you fear hitting something as barebones as the kernel's maximum allowed command-line length. That happened once in a while in the past, so finding better solutions to that problem within easy reach (as in: existing in dpkg already; new interfaces for possible future use are a different matter) is in order. Other problems included the overuse of --force options, not communicating the purge/removal intentions to dpkg, insufficient crossgrade handling and avoiding the loss of user configuration on conffile moves involving packages to be purged, just to name a few. But also listening to dpkg in terms of how it processes triggers and how all this should be reported in the form of progress reports to the user, especially if some steps aren't explicitly planned anymore by a planner but left to dpkg to do at some point.
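The command-line limit is easy to poke at yourself; a toy calculation (package names invented) shows how a big operation with full archive paths eats into it:

import os

# the kernel rejects execve() calls whose argv+environ exceed this
limit = os.sysconf("SC_ARG_MAX")
debs = [f"/var/cache/apt/archives/package-{i}_1.0-1_amd64.deb"
        for i in range(5000)]
needed = len("dpkg --unpack") + sum(len(d) + 1 for d in debs)
print(f"ARG_MAX: {limit} bytes, command line: {needed} bytes")

The limit is generous on modern kernels, but the historic 128 KiB limit was within reach of exactly such runs.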

The result is mostly invisible to the user, except that it should now all be slightly faster, as e.g. triggers are run less often, and most "strange" errors are a thing of the past.

Side projects, emergency calls and random bugfixes

These weren't actually on my plan for GSoC and are at best only marginally related, if at all, but I ended up working on them to deal with important bugs on an "as long as we have a full-time developer" basis.

This includes hunting for strange errors if rred is involved in updating indexes, further preparing for a binary-all future, fixing SRV support, being the master of time, improving security by allowing it to be sidestepped sometimes, improving security by potentially breaking backward-compatibility a bit, stumble into libstdc++6 bugs, implement SOCKS5 proxy support and generic config fallback for acquire methods to be able to propose the mergeback of apt-transport-tor among very many other things.

A complete list can be found with the previously shared git-branch browsing starting at my first commit in the GSoC timeframe (see also statistics above).

Future

I would love to keep working on APT full-time, but that seems rather unrealistic, and all good things need to come to an end I guess, so the next weeks will have me adjust to a more "normal" contribution level of "just" my (extended) free time again. I will also be casually "shopping" for a new money source in the form of a small job, while returning to university, which hasn't seen a lot of me in the last few months, and picking up some community work I had delayed for after GSoC. That means I will surely not keep my daily commit average up, but my journey here is far from over:

After many years in APT and Debian there is still something new to explore each week as both are evolving continuously – but most of it hidden in plain sight and unnoticed by the general public: around the start of GSoC I was talking on #gsoc with an admin of another org who couldn't imagine that Debian participated at all, as all projects Debian could offer would be bitesized in nature: it is just a distribution, right, not a real org producing value (slightly exaggerated for drama). I would of course like to concur with this view, for obvious reasons.

My life would be massively different if I hadn't started to contribute to Debian and APT in particular ~7 years ago – even though I thought I wouldn't be "good enough" back then. I think it's fair to say that I showed my past self that in fact I am. Now it is your turn!

Week 16: Constant Optimization

This week saw the release of 1.3~rc1, sporting my tor-changes disguised as general acquire-method changes I mentioned last week, as well as the revamp of apt talking to dpkg (and back) I worked on the last couple of weeks as part of GSoC. It doesn't include any new planner – a stub is still lying in a wip branch – but our good old planner looks and behaves slightly different, so it feels like a new one. And surprising as it is: so far not a single bugreport related to this. Probably all user systems caught fire instantly! 🙂

So, the week was further used to get used to cmake, build and run apt on various porterboxes to fix testcase failures, fix other incoming issues and especially pull some hair out while debugging the bug of the week, which lends this blogpost its title: a one-word fix for an issue manifesting itself only at optimization level -O3 on ppc64el. Optimizations are evil…

Beside wasting quite some time this week, as it did in previous years, it also closes a loop: I introduced this problem myself while being a GSoC student… in 2010. Time really flies. And I have no idea what I was thinking either… I could be describing more of these "tiny" bugs, but the commit messages tend to do a reasonable job, and if you are really that damn interested: feel free to ask. 🙂

This next week will be the last official one in GSoC from a student's POV, as I am supposed to clean up all bases & submit my work for the final evaluation – this submission will be a blogpost describing & linking to everything, which equals miles long and relatively soon, so I have purposefully kept this one very short so you will have enough energy to bear with me for the next one.

Week 15: Onion ordering

The week started badly: I have had a dead 'e' key on my keyboard for a long while now, but I didn't care that much… I just remapped CAPSLOCK, retrained my fingers and was done with it [I have some fucked-up typing style anyhow]. All good, but this week additional keys started to give up. You have no idea how annoying it is to not be able to use the arrow keys. Many things I work with have at least the vim keybindings, but even in vim picking an autocompletion becomes a nightmare (or navigating shell history)… So, replacement keyboard please! That took a while, especially the replacing itself, as my laptop makes that extra hard it seems, but oh well. All working again now! The c-key is actually working a bit too well (you only have to touch it now, which had me worried as it started out producing an endless stream of 'c' out-of-the-box before I removed the cap once), but so be it for now.

As you might guess that wasn't the ideal work environment and slowed me down (beside being annoying), so what I intended as a sideproject ended up covering most of the week. Mergeback of apt-transport-tor into apt? Yes, no, maybe? The first few responses by mail & IRC regarding the plan are in, but that still left a lot of code to be written and refactored. I have to say, implementing SOCKS5 proxy support in apt was kinda fun and not nearly as hard as I had imagined. Slightly harder was getting a setup working in which I could test it properly. Everyone knows netcat, but that really targets more text-based protocols, not binary ones like SOCKS5. Still, I managed to figure out how to do it with socat eventually, resulting in a testscript we can at least run manually (as it requires a specific port – not every tool is as nice as our webserver, which can be started on port 0 and reports the port it eventually picked for real). Playful as I am, I even compared my implementation to others like curl, which our https method is using, where I ended up reporting minor bugs.
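If you have never looked at it: the client side of SOCKS5 (RFC 1928) really is just a handful of handshake bytes, which is why it wasn't that hard. A rough Python equivalent of a no-authentication CONNECT – a sketch assuming well-behaved full reads, not apt's actual C++ code:

import socket
import struct

def socks5_connect(proxy, dest_host, dest_port):
    s = socket.create_connection(proxy)
    s.sendall(b"\x05\x01\x00")  # version 5, offering 1 method: no-auth
    if s.recv(2) != b"\x05\x00":
        raise OSError("proxy refused the no-auth handshake")
    host = dest_host.encode("idna")  # CONNECT via domain-name address type
    s.sendall(b"\x05\x01\x00\x03" + bytes([len(host)]) + host
              + struct.pack(">H", dest_port))
    _, reply, _, atyp = s.recv(4)
    if reply != 0:  # 0 == succeeded
        raise OSError(f"SOCKS5 connect failed with code {reply}")
    # consume the bound address the proxy reports, then the tunnel is up
    if atyp == 1:
        s.recv(4 + 2)   # IPv4 + port
    elif atyp == 4:
        s.recv(16 + 2)  # IPv6 + port
    else:
        s.recv(s.recv(1)[0] + 2)  # length-prefixed domain + port
    return s

# e.g. through a local Tor daemon:
# socks5_connect(("127.0.0.1", 9050), "example.org", 80)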

But why, and why now, you might ask: apt-transport-tor can be (surprise, surprise) used to let apt talk to the Tor network. Conceptually that isn't incredibly hard: the Tor software provides a SOCKS5 proxy an application can connect to and be done. Two years ago, when apt-transport-tor was introduced, only our curl-backed https method could do that, & the intention was to be able to make backports of that transport available, too, so even though I wasn't all that happy about it, we ended up with a modified copy of our https method named tor in the archive – and as it is with most modified copies of code, they aren't really kept in sync with the original. I really want to get this resolved for stretch, so it slowly gets high time to start: if it turns out that I need to take over maintenance without previous-maintainer consent there is quite a bit of waiting involved, and stuff like this should really not be changed last-minute before the freeze… you will find more details on the reasons for proposing this solution in the mentioned mail.

Besides, SOCKS support is such a requested feature that the internet actually believes apt already supports it via Acquire::socks::proxy … which is, and will also be in the future, wrong, as there is no socks method we would configure a proxy for – if anything you configure a method like http to use a socks proxy…

Of course, bundled with this comes a bunch of other things like better redirection handling across methods and stuff, but that isn't really user-visible, so I skip it here and instead refer you to the git branches if you are really interested. A few things I will surely also mention once the relevant code is in the archive so that interested peers can test…

On my actual battle front, the ordering progress was insignificant. I got lots of manual testing and review done, but not much new stuff. The problem is mostly that ordering is easy as long as the problem is easy, but as soon as Pre-Depends enter the picture you suddenly have to account for all kinds of strange things like temporary removals, loops, conflicting or-groups, … stuff you don't want to lose hair over while already losing hair over your broken keyboard. 😉

This week for realz, although the target is now really more to merge the current stuff for apt 1.3. A new ordering algorithm is, as detailed in the initial proposal, buster material anyhow – and given all the changes in terms of trigger delaying and pending calls you are likely not to recognize our "old" ordering anymore, but more on this in the next two weeks, as that will be the end of GSoC and hence I am going to look back at "the good old times" before GSoC compared to what we have now. 🙂

P.S.: This week's weekend marks the start of a big wine festival in our state capital, of which my family is one of the founding members. I am "just" going to help build the booth though, so no giant timesink this time – just a couple of hours – just in case you hear me saying something about wine again on IRC.

Week 14: This is CROSS!

Picture me as a messenger kicked into an endless pit of complexity by what was supposed to be an easy victim. It wasn't /that/ bad, but massaging apt to treat crossgrades right took some time – and then some more to request that dpkg handle some of it, too, as it gets confused by multi-instance packages. Much like apt, although for apt it was just affecting the progress reporting, so not that bad… perhaps I am play-testing too much.

In unrelated news I dealt with the two acquire bugs which came in last week, as such bugs are annoying & carry the risk of being security problems which would require immediate attention. Thankfully none of them seems to be one, but #831762 had me seriously worried for a while. Trivial in retrospect, but getting to a point at which you consider the possibility of that happening at all…

But back to the topic: there was one thing still needed to get our current internal planner a bit "smarter" by enabling options which have existed for a while now but were never activated by default, and that thing was simulation. While it's kinda appealing to have the simulation only display what the planner explicitly told us to do, ignoring what would implicitly be done by the --pending calls, we can't really do that, as it would be an interface break. There are surely scripts out there doing funny things with this output, so having it be incomplete is not an option, which in turn means that what I did internally for the progress reporting (and hook scripts) must also be done in the simulation. Easier said than done though, as its implementation followed a direct approach, running the simulation of each action as soon as the action was called, rather than collecting all the actions first to post-process them (as I want to do it) and only then execute them. Add to this that this is a public class, so ABI is a concern… the solution I arrived at is slightly wrong, but it is going to satisfy all existing callers (which is only aptitude in the archive, thanks to codesearch.d.n) and reuses the "dpkg specific" code, which is a layer violation, but reuse is better than copy&paste without breaking ABI, so I am happy all things considered.

So, with that out of the way glory awaits: changing the default of PackageManager::Configure from "all" to "smart"… and tada: it works! The simulation shows everything, the dpkg invocations are much shorter, trigger executions are delayed, we rely more on --pending calls and the progress reporting is properly moving forward as well! Not 100% production-ready, but good enough for a public wip branch for now (wip aka: going to be rebased at will).

This also includes the barebones 'dpkg' planner I mentioned last week; based on that I was playing with ideas to find a more viable implementation (= handling Pre-Depends), but nothing of particular note has been produced yet. Maybe I can get something working this week – it is at least part of the plan, beside polishing my wip branch – after leaving that pit, that is… Hello, is anyone up there? Hello? Hello? …

Week 13: Progress reporting

On the public side of things I did a bunch of things this week which weren't exactly related to the GSoC project, triggered by incoming mails (and IRC highlights) with bugreports and interesting wishlists – and that isn't completed yet, as there are two new responses with debug logs waiting for me next week.

I say that as I brushed up a smallish commit for merge this week which is supposed to deal better with triggers in progress reporting. That sounds boring – writing progress reporting stuff – but many people care deeply about it. While this one is more of a cosmetic change, conceptually it's a big one: with the actions the planner proposed, apt builds for each package a list of states it will pass through while it is installed/upgraded/removed/purged. Triggers rain on this parade, as we don't know beforehand that a package will be triggered. Instead, a status message from dpkg will tell us that a package was triggered, so if we don't want to end up in a state in which apt tells us via progress report that it is done, but still has a gazillion triggers to run, we have to notice this and add some progress states to the list of the triggered package – easy, right? The "problem" starts with packages which are triggered but are destined to be upgraded. A triggered package will lose its trigger state if it is unpacked, so our progress report has to skip the trigger states if the package is unpacked – and we can't just exclude packages that will be unpacked, as it can easily be that a package is triggered, the trigger is acted upon, and the package is upgraded "ages" later in this apt run.

My wip branch contains many more progress-related commits, as there is a big gotcha in the description above: I said "with the actions the planner proposed" – so what about the actions the planner isn't proposing but which will happen as part of dpkg --configure/--remove/--purge --pending calls? And what about hook scripts like apt-listbugs/changes which get told the actions to perform their own magic?

The solution is simple: we don't tell dpkg about this, but for our own usage we do trivial expansions of the --pending commands and use these for progress report planning as well as for telling the hookscripts about them. That sounds like a very simple and optional thing, but it was actually what blocked the activation of various config options I had implemented years ago – which delay trigger execution, avoid explicit configuration of all packages at the end and all that – and which I could now all enable. A bit more on that after this hits the master branch. 🙂
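As a toy model of that expansion (all names invented for illustration): tracking what is unpacked but not yet configured while walking the plan is enough to turn a trailing --pending call into the explicit action list the progress planner and the hook scripts want to see:

def expand_pending(plan):
    # plan: list of (action, package) tuples; returns it with a
    # ("configure", "--pending") entry replaced by explicit actions
    unconfigured = set()
    expanded = []
    for action, pkg in plan:
        if (action, pkg) == ("configure", "--pending"):
            expanded.extend(("configure", p) for p in sorted(unconfigured))
            unconfigured.clear()
        else:
            expanded.append((action, pkg))
            if action == "unpack":
                unconfigured.add(pkg)
            elif action == "configure":
                unconfigured.discard(pkg)
    return expanded

print(expand_pending([("unpack", "foo"), ("unpack", "bar"),
                      ("configure", "--pending")]))
# [('unpack', 'foo'), ('unpack', 'bar'),
#  ('configure', 'bar'), ('configure', 'foo')]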

I also looked into supporting crossgrades better. In apt's conception a crossgrade is the removal of a package of arch A and the installation of a new package (with the same name) of arch B. dpkg on the other hand deals with it like a 'normal' upgrade, just that the architecture of the package changes. The issue with that usually isn't gigantic, but it becomes big with essential packages, like if you try to crossgrade dpkg itself with apt: APT refuses to do that by default, but with enough force it will tell dpkg to remove dpkg:A and then it tells dpkg to unpack dpkg:B – just that there is no dpkg anymore which could unpack itself. At least in that case we can skip the removal of dpkg:A, but we can't do it unconditionally, as that might very well be part of an ordering requirement, so progress reporting should be prepared for either to (not) happen… That isn't finished yet and will surely leak into next week.

Next week will also see my freshly built planner 'dpkg' get a proper tour: with all the --pending calls it seems like a good idea to try to be extra dumb and have a planner just unpack everything in one go, let the rest be covered by --pending calls and see what breaks: obviously the harder stuff, but I have two directions I would like to explore based on this minimal planner to make it viable. First I have to finish the crossgrading though, and my usual self-review of commits and the bugreports I participated in this week want to trigger further actions, too… see you next week!

Week 11+12: Where did all the time go

Julian said that in his talk about the last year in APT in regards to (further) fixes to the performance of pdiff and apt in general. I want to say that about the last two weeks in general though. I was two days at the university (the last two weeks of lectures even), a day helping someone move, a day away with the youth group, a combined day busy with the mental and organisational buildup and teardown of that away-day, and 5 days building, tearing down and in between manning our wine booth as well as other booths at a festival… Somehow that left only 4 days for day-to-day business like GSoC work… not ideal, especially if the plan isn't working out as it should either. I had to deal with a few issues uncovered by our testsuite, caused (on non-amd64 architectures) more or less by the EIPP merge. Mind you, not actual hardcore bugs, but mostly: for some esoteric reason, like additional lines in the output, the testcases are unhappy, but it's not always reproducible and absolutely not on your local system.

It turns out that one of them was a deeper issue with how compression methods can be dynamically configured and how the testcases do this to force testing of certain aspects, which interfered with me hardcoding xz compression for EIPP – time-dependent of course, as EIPP tends to run faster than /bin/false (no joke, as the errors were pipe write errors, which only happen if the program we called closes the pipes, aka exits). Sounds simple, but the fix is split over 6 logical commits (which all just change a few lines, but finding them is the trick…). Another one was dpkg's inability to change the architecture it believes to be native, while apt can do that easily and uses it in the testcases frequently.

Another issue was a lot bigger: a bug in which apt ended up changing permissions, ownership & access/modification times on files it should actually only read, through symlinks created by the file:// method. Finding and fixing this was an interesting walk in the kernel/libc API park: even if it isn't strictly needed to change the ownership of a symlink, it feels better to avoid too much special-casing if it can be helped, and what we really don't want is singling out the Linux kernel against other kernels – I did that once accidentally (#738567) and have tried to avoid it ever since (that report is related, by the way, as that call would have been handy if it could have been used here…).
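The toggle in question is nicely visible from Python, where the same lchown/utimensat-style calls surface as a follow_symlinks flag – a self-contained illustration, not the actual apt fix:

import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "file")
    link = os.path.join(d, "link")
    open(target, "w").close()
    os.symlink(target, link)
    # operate on the link itself instead of the file it points to
    if os.chown in os.supports_follow_symlinks:
        os.chown(link, os.getuid(), os.getgid(), follow_symlinks=False)
    os.utime(link, follow_symlinks=False)  # utimensat() with AT_SYMLINK_NOFOLLOW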

With the weeks so grossly derailed I poured some time into an issue with our relatively new SRV records support, which was working only if the first connect was successful – which is kinda against the point of SRV. Testing this was fun, as I was actually convinced I would need a fullblown DNS server setup (much like we need a fullblown webserver setup, so much so that we implemented our own), but it turned out that it's dead simple to add SRV records via dnsmasq, so that was easy to test (after I figured that out, that is), at least interactively.
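The point of SRV (RFC 2782) is exactly that failover list: sort the candidates by priority, pick weighted-randomly within equal priorities, and on a failed connect move on to the next candidate instead of giving up. Sketched in Python with invented records:

import random

def srv_candidates(records):
    # records: (priority, weight, host, port); yield in RFC 2782 order
    for prio in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == prio]
        while group:
            pick = random.choices(group, weights=[r[1] or 1 for r in group])[0]
            group.remove(pick)
            yield pick

for prio, weight, host, port in srv_candidates(
        [(10, 60, "a.example.org", 80), (10, 40, "b.example.org", 80),
         (20, 0, "backup.example.org", 80)]):
    print(f"candidate: {host}:{port}")  # try the next one if connecting fails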

I also implemented support for "install ./foo.changes", partly in response to the question in the APT talk about securing the download of individual debs – now you can distribute a bunch of debs (and dscs) via a changes file, which can be signed (which the USER has to verify!) and includes the hashes of said files (which apt checks against). Mostly I was doing it for myself though, as I do local rebuilds at times (like dpkg with a fix for #830267) and picking the individual debs can be a bit annoying; with changes it's slightly less bad.

Apart from that I looked into the progress reporting thing as intended and tried a few options, but that isn't all too easy given that apt is expecting to know EVERY action performed… I started with this, but I will probably tell you more about that next week, as I fully intend to quickly make up for the last two weeks, so I better get started right about now…

Week 10: That is plane wrong!

As a kid I liked Asterix, as comics as well as the animated films – I am not so fond of the actor films and haven't looked at the newer comics, but that is beside the point. I want to quote Obelix here: "These Brits are crazy!". Not because of the recent vote to leave the EU, or because the English soccer team decided that this vote applies to their play as well. No, I am thinking of the scene in the film in which the heroes walk down a street in search of a house and all of the houses in the street, the block, even the district look the same, but luckily they have the number, find the one and beat up the inhabitant, only to be told that an 'I' is missing in the number…

After last week's merge, I was told that "planer" doesn't mean what I thought it does. A planer is actually a type of machine to make stuff plane – which is again not making something into a plane to fly with, but making something plane so you don't get splinters (if it's wood); all of this can be done according to a plan, btw. What I meant is a "planner" – aka I was missing an 'n'. Perhaps only because I have light dyslexia, but I hate these language things. German isn't any better here: "The 'Planer' makes a 'Plan'.", but the stuff can still be made 'plan' (even though 'planar' is what you usually hear, and it hasn't the wood-meaning; you can just do this to the ground). But at least we don't drag planes or plants into this, the latter being a masterpiece of word meanings all by itself… but enough of a "rant", I am pretty sure it would all make sense if I looked up the history more than I did, and in the end it's just me trying to hide my embarrassment over such trivial mistakes being made by me for weeks…

Anyway, on to the real topic of this post, as I did more than just correct this mistake (which did take a while though). Beside various things which are only slightly related or not at all, like: not overriding an atomic file if the new file had failures in creation (oh my god, how did we miss that for so long?!?), various testing things, an overhaul of the autoremover code + implementing 'protect only the latest same-source provider', which has explicit benefits e.g. in terms of out-of-tree kernel modules (if built e.g. with module-assistant instead of dkms) as well as in other cases, fixing the handling of local deb files already installed in that version so it doesn't behave like a downgrade, and finally a bunch of apt-key changes to make its badness more visible… which all went directly to git/master.

I started another of my wip branches for the things I am actually paid to do: the plan was to tackle some "how apt invokes dpkg" things. Silly things like not setting --force-remove-essential on each and every remove/purge (if possible; there are non-default apt options which prevent apt from knowing which packages are essential in dpkg's view).

I also worked on codifying the --recursive usage for --unpack, which sets up a symlink farm in a temporary directory and tells dpkg said directory instead of listing each and every package explicitly on the commandline (which can turn out to be quite long, see weeks earlier) – but as mentioned back then this doesn't work unconditionally yet, as dpkg has a bug in its ordering code, so for the time being apt needs to be nice and still help dpkg out here (that is the type of stuff I mentioned in my proposal as a likely hindrance to using another planner for stretch already).
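The trick itself fits in a few lines; grossly simplified (paths invented, error handling omitted), setting up such a farm and handing it to dpkg amounts to:

import os
import subprocess
import tempfile

def unpack_recursive(deb_paths):
    # avoid a mile-long command line: symlink the debs into a temporary
    # directory and let dpkg discover them itself via --recursive
    with tempfile.TemporaryDirectory(prefix="apt-dpkg-farm-") as farm:
        for deb in deb_paths:
            os.symlink(os.path.abspath(deb),
                       os.path.join(farm, os.path.basename(deb)))
        subprocess.run(["dpkg", "--unpack", "--recursive", farm], check=True)

# unpack_recursive(["/var/cache/apt/archives/foo_1.0_amd64.deb", ...])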

I also began saving, clearing and restoring dpkg selections before/after running dpkg so that a planner can eventually outsource the removal orders to dpkg. That is quite a dance, as these selections are user settings "for the future" which we can't just discard, but we also can't just ignore them, as they can include purge selections – and if we did not unset them before calling dpkg, in a new world order apt would implicitly instruct dpkg to perform removals the user hadn't agreed to… frankly, it's a freaking mess, but the interface exists, so I have to make do with what we have rather than what we would have liked to have had if only there were a time machine.
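The interface for that dance is dpkg's selection triad of --get-selections, --clear-selections and --set-selections, which – grossly simplified and without all the care apt actually has to apply – looks like this:

import subprocess

def with_clean_selections(run_dpkg_work):
    # save the user's selections, clear them for the duration of our own
    # dpkg calls and restore them afterwards (needs root; a sketch, not apt)
    saved = subprocess.run(["dpkg", "--get-selections", "*"],
                           capture_output=True, text=True, check=True).stdout
    subprocess.run(["dpkg", "--clear-selections"], check=True)
    try:
        run_dpkg_work()
    finally:
        subprocess.run(["dpkg", "--set-selections"],
                       input=saved, text=True, check=True)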

I am somewhere deep down in this clusterfuck, but I see light at the end of the tunnel – or perhaps that is just an oncoming train. Hopefully I am out of this by next week. I also want to give --no-triggers a chance again: a few years back I implemented support for it, but it never really saw the light of day thanks to bugs in dpkg and elsewhere; perhaps now is a good moment for retrying. It does deal the final blow to sensible progress reporting though, so I would need to figure something out for that, too… we will see; planning plane plans isn't always working out for me.

(This weekend I have a daytrip with ~70 kids, with the usual last-minute emergencies and much-needed decompression afterwards, so reports might be delayed and my replies/work even more off business hours than usual.)