GSoC 2016: Summary

Screenshot of my GSoC 2016 project page

Intro, Outro and tl;dr

I participated again as a student in this year's edition of the Google Summer of Code with Debian, on the project APT↔dpkg communication rework. My initial proposal on the wiki details me and the plan, while this post serves as a link hub and an explanation of what I did. You can also find personal week-to-week posts, starting with Day 0, right here on this blog.

The code for this project was already merged and uploaded to Debian unstable multiple times over the course of the summer, so everything described below can be tried out directly. The official GSoC2016 final tag is 1.3~rc2, but APT always moves forward and I have no intention of leaving it alone, so this tag just marks the end of the GSoC2016 period and my return to "normal" contribution levels.

On the social front, I finally applied for and shortly afterwards received "Debian Developer, uploading" status. This is also the moment to thank Michael Vogt (mentor, apt), Julian Andres Klode (apt), Manuel A. Fernandez Montecelo (aptitude), Guillem Jover (dpkg), Enrico Zini (AM), the Debian Outreach team and the countless people I should have mentioned here, too, who have all helped me in many ways over the course of this GSoC and my entire Debian journey up to this point.

It was overall a great experience to work again on something as important as APT in Debian on a full-time basis. After two (very different) GSoCs, in 2010 and now 2016, I can wholeheartedly recommend that any student with a passion for open source apply next year. Perhaps in Debian, and maybe in an APT project? We are waiting for YOU!


My first commit as part of GSoC was made on 25 April, titled edsp: ask policy engine for the pin of the version directly (a minor bugfix); the last commit I am counting was made on 17 August, titled methods: read config in most to least specific order (a regression fix). Not all of the commits are directly related to the GSoC project itself (the first one is); some are "just" in the timeframe (like the last one) and were handled as part of general emergencies or for similar reasons described later and/or in the weekly reports.

This timeframe of 115 days saw a total of 222 commits authored by me, plus 9 commits committed by me for others (translations, patches, …). The timeframe saw 336 commits as a whole, making me responsible for a bit shy of ⅔ of all APT commits in this timeframe, with an average of nearly 2 commits each day. A diffstat run over my commits says "322 files changed, 11171 insertions(+), 5847 deletions(-)", consisting of code, documentation and tests (this doesn't include automatic churn like the regeneration of po and pot files, which distorts the global statistic). As a special mention, our tests alone changed by "109 files changed, 2759 insertions(+), 1063 deletions(-)".

In my weekly reports here on this blog I used ~10574 words (not including this post), plus another ~23555 words in the IRC channel #debian-apt and sometimes very long mails to deity@ and bugreports (~100 mails), not counting private chit-chat with my mentor via IRC/mail.
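For anyone who wants to double-check the commit figures quoted above, here is a small sketch of the arithmetic. It assumes the timeframe counts both the first and the last day, and that the "⅔" and "nearly 2 per day" figures refer to the 222 authored commits only; the variable names are mine.

```python
from datetime import date

first_commit = date(2016, 4, 25)
last_commit = date(2016, 8, 17)

# Assumption: both the first and the last day are counted, hence the +1.
days = (last_commit - first_commit).days + 1

authored = 222  # commits authored by me in the timeframe
total = 336     # all APT commits in the timeframe

share = authored / total   # fraction of all commits authored by me
per_day = authored / days  # average authored commits per day

print(days)              # 115
print(f"{share:.3f}")    # 0.661, a bit shy of 2/3
print(f"{per_day:.2f}")  # 1.93, nearly 2
```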

APT External Installation Planner Protocol (EIPP)

The meat of the GSoC project was the ability to let libapt talk to (external) executables, called planners, which are tasked with creating a plan for the installation (and removal) of packages from the system in the order required by their various dependency relations, similar to how libapt can talk to external dependency solvers like aspcud via EDSP. The protocol (current, archive) details how apt and a planner can communicate. APT already ships such an external planner in the form of 'apt', which is "just" using the internal, always existing planner implementation, but reads and talks proper EIPP. The major benefit here is testing, as it is now possible to generate an EIPP request, feed it to different versions and compare the results to find regressions and similar. It also helps with bugreports, as such a request is now auto-generated and logged so that it can easily be attached to a bugreport, and a triager can use that file to reproduce the problem. Previously, recreating the system state a user had before the failed upgrade was a very involved, error-prone and time-consuming task (actually fixing the problem still is, but at least the first step got a lot easier).
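To give a feel for what "an order required by dependency relations" means, here is a deliberately tiny sketch. It is not EIPP and not how APT's planner works: it is a plain topological sort over a made-up dependency map, using Python's standard graphlib module. The package names and the plan_install helper are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_install(depends):
    """Return package names ordered so that dependencies come first.

    `depends` maps a package name to the set of packages it depends on.
    This toy ignores everything that makes real planning hard
    (dependency cycles, conflicts, removals, ...).
    """
    return list(TopologicalSorter(depends).static_order())


# Hypothetical packages, purely for illustration.
depends = {
    "app": {"libfoo", "libbar"},
    "libfoo": {"libc"},
    "libbar": {"libc"},
    "libc": set(),
}

order = plan_install(depends)
print(order)  # libc comes first and app last; the two libs may appear in either order
```

A real planner has far more to consider than this, which is exactly why being able to replay a logged request against different planners is so useful.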

APT's good old planner implementation also saw the activation (and fixing) of many previously experimental options intended to optimize the process, which had been blocked by the items of the next paragraph; this makes it look like a new planner now. Entirely new planners exist as prototypes, but they aren't suitable for real use yet, as they don't handle "edge cases" and are affected by bugs in dpkg. Summary: Everyone can create and work on their own planner in the programming language of their choice and run it directly against real-world cases, opening a competition space for the invention of future improvements.

APT↔dpkg communication

This is the other major building block and the donor of much of the project's name. Assuming a planner has figured out a plan, there is still much left to do which is of no concern to the individual planner but is handled centrally in libapt: the actual calling of dpkg and the interpretation of its replies. That sounds easy enough, but if you imagine thousands of packages needing to be installed/configured at once, you start to fear hitting something as barebones as the kernel's maximum allowed command-line length. That happened once in a while in the past, so finding better solutions to that problem within easy reach (as in: existing in dpkg already; new interfaces for possible future use are a different matter) was in order. Other problems included the overuse of --force options, not communicating the purge/removal intentions to dpkg, insufficient crossgrade handling, and the risk of losing user configuration on conffile moves involving packages to be purged, just to name a few. It was also about listening to dpkg in terms of how it processes triggers, and about how all this should be reported to the user in the form of progress reports, especially if some steps aren't explicitly planned by a planner anymore but are left to dpkg to do at some point.

The result is mostly invisible to the user, except that it should all be slightly faster now, as e.g. triggers are run less often, and most "strange" errors should be a thing of the past.
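The command-line length worry mentioned above is easy to picture with a small sketch. This is not what APT does (APT is C++ and, as said, prefers solutions already existing in dpkg); it only illustrates the constraint by splitting a long argument list into several invocations that each stay under a byte budget. The batch_args helper, the package names and the 256-byte limit are made up for the example; on a real system the limit would come from something like os.sysconf("SC_ARG_MAX"), and the environment counts against it as well.

```python
def batch_args(base_cmd, args, max_bytes):
    """Yield command lines (as lists) whose arguments fit into max_bytes.

    A single argument larger than the budget is still emitted on its own;
    handling that case is left out of this sketch.
    """
    def size(words):
        # Each argument occupies its bytes plus a terminating NUL.
        return sum(len(w.encode()) + 1 for w in words)

    base_size = size(base_cmd)
    batch, used = [], base_size
    for arg in args:
        need = size([arg])
        if batch and used + need > max_bytes:
            yield base_cmd + batch
            batch, used = [], base_size
        batch.append(arg)
        used += need
    if batch:
        yield base_cmd + batch


# Hypothetical package names and an artificially small limit.
packages = [f"pkg{i}" for i in range(1000)]
commands = list(batch_args(["dpkg", "--configure"], packages, 256))
print(len(commands), "invocations instead of one")
```

Splitting like this works around the limit, but every extra invocation has a cost, which is one reason better options were worth looking for.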

Side projects, emergency calls and random bugfixes

These were not actually on my plan for GSoC and at best only marginally related, if at all, but I ended up working on them to deal with important bugs on an "as long as we have a full-time developer" basis.

This includes hunting for strange errors if rred is involved in updating indexes, further preparing for a binary-all future, fixing SRV support, being the master of time, improving security by allowing it to be sidestepped sometimes, improving security by potentially breaking backward-compatibility a bit, stumbling into libstdc++6 bugs, and implementing SOCKS5 proxy support and a generic config fallback for acquire methods to be able to propose the mergeback of apt-transport-tor, among very many other things.

A complete list can be found with the previously shared git-branch browsing starting at my first commit in the GSoC timeframe (see also statistics above).


I would love to keep working on APT full-time, but that seems rather unrealistic, and all good things need to come to an end I guess, so the next weeks will have me adjust to a more "normal" contribution level of "just" my (extended) free time again. I will also be casually "shopping" for a new money source in the form of a small job while returning to university, which hasn't seen a lot of me in the last few months, and picking up some community work I had delayed until after GSoC. That means I will surely not keep my daily commit average up, but my journey here is far from over:

After many years in APT and Debian there is still something new in them to explore each week, as both are evolving continuously, but most of it is hidden in plain sight and goes unnoticed by the general public. Around the start of GSoC I was talking on #gsoc with an admin of another org who couldn't imagine that Debian participated at all, as all projects Debian could offer would be bite-sized in nature: it is just a distribution, right, not a real org producing value (slightly exaggerated for drama). I would of course like to disagree with this view, for obvious reasons.

My life would be massively different if I hadn't started to contribute to Debian, and APT in particular, ~7 years ago, even though I thought I wouldn't be "good enough" back then. I think it's fair to say that I showed my past self that in fact I am. Now it is your turn!