Intro, Outro and tl;dr
I participated again as a student in this year's edition of the Google
Summer of Code with
Debian on the project APT↔dpkg communication
rework.
My initial proposal on the
wiki
details me and the plan, while this post serves as link hub and
explanation of what I did. You can also find personal week-to-week posts
starting with Day 0 right here on this
blog, too.
The code for this project was already
merged and uploaded to
Debian unstable multiple times over
the course of the summer, so everything described here later can be
experienced directly. The official GSoC2016 final is
1.3~rc2,
but APT always moves forward and I have no intention of leaving it
alone, so this tag just marks the end of the GSoC2016 period and my
return to "normal" contribution levels.
On a social front I finally applied for and shortly after
received "Debian
Developer, uploading" status. This is also the moment I want to thank
Michael Vogt (mentor, apt), Julian Andres Klode (apt), Manuel A.
Fernandez Montecelo (aptitude), Guillem Jover (dpkg), Enrico Zini (AM), the Debian Outreach
team and the countless people
I should have mentioned here, too, who have all helped me in many ways
over the course of this GSoC and my entire Debian journey up to this
point.
It was an overall great experience to work again on something as
important as APT in Debian on a full-time basis. After two (very
different) GSoCs in 2010 and now 2016 I can wholeheartedly recommend that any
student with a passion for open source apply next year. Perhaps in
Debian and maybe in an APT project? We are waiting for YOU!
Statistics
My first commit as part of GSoC was made on 25. April titled edsp: ask
policy engine for the pin of the version
directly
(minor bugfix), the last commit I will be counting on 17. August titled
methods: read config in most to least specific
order
(regression fix). Not all commits in between are directly related to the GSoC
project itself (the first is): some are "just" in the timeframe (like the
last) and were handled as part of general emergencies or for similar
reasons described later and/or in the weekly reports. This timeframe of
115 days saw a total of 222
commits
authored by me + 9 commits committed by me for others (translations,
patches, …). The timeframe saw 336 commits as a whole, making me
responsible for a bit shy of ⅔ of all APT commits in this timeframe,
with an average of nearly 2 commits each day. A diffstat run over my
commits says "322 files changed, 11171 insertions(+), 5847 deletions(-)"
consisting of code, documentation and tests (this doesn't include
automatic churn like regeneration of po and pot files, which distorts the
global statistic). As a special mention our tests alone changed by:
"109 files changed, 2759 insertions(+), 1063 deletions(-)". In my weekly
reports here on this blog I used ~10574 words (not including this post),
another ~23555 words in the IRC channel #debian-apt and sometimes very
long mails to deity@ and bugreports (~100 mails) [Not counting private
chit-chat with mentor via IRC/mail].
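
The headline numbers above can be re-derived with a few lines of Python. This is only a sanity check of the arithmetic, using the dates and commit counts quoted in this section (with both end dates counted):

```python
from datetime import date

# Figures quoted in the statistics above.
first_commit = date(2016, 4, 25)
last_commit = date(2016, 8, 17)
authored = 222
total_in_timeframe = 336

# Both the first and the last day count, hence the +1.
days = (last_commit - first_commit).days + 1
share = authored / total_in_timeframe
per_day = authored / days

print(f"{days} days, {share:.1%} of all commits, {per_day:.2f} commits/day")
```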
APT External Installation Planner Protocol (EIPP)
The meat of the GSoC project was the ability to let libapt talk to (external)
executables (called planners) which are tasked with creating a plan for the
installation (and removal) of packages from the system in the order required by
their various dependency relations, similar to how libapt can talk to external
dependency solvers like aspcud via EDSP.
The protocol
(current,
archive)
details how apt and a planner can communicate. APT ships such an external
planner already in the form of 'apt' which is "just" using the internal always
existing planner implementation, but reads and talks proper EIPP. The major
benefit here is testing, as it is now possible to generate an EIPP request, feed
it to different versions and compare results to find regressions and similar.
It also helps in bugreports as such a request is now auto-generated and logged
so that it can be easily attached to bugreports and a triager can use that file
to reproduce the problem. Previously recreating the system state a user had
before the failed upgrade was a very involved, error-prone and time-consuming
task (actually fixing the problem still is, but at least the first step got a
lot easier).
APT's good old planner implementation also saw the activation (and
fixing) of many previously experimental
options intended to optimize
the process blocked previously by items of the next paragraph, which
makes it look like a new planner now. Entirely new planners exist as
prototypes, but they aren't suitable for real use yet due to not
handling "edgecases" and being affected by bugs in dpkg. Summary:
Everyone can create and work on their own planner in the programming
language of choice and run it against real-world cases directly, opening
a competition space for the invention of future improvements.
APT↔dpkg communication
The other major building block and donor of much of the project's name. Assuming
a planner has figured out a plan there is still much left to do which is of no
concern for each planner but handled centrally in libapt: The actual calling of
dpkg and interpreting its replies. That sounds easy enough, but if you imagine
the need of thousands of packages to be installed/configured at once you fear
hitting something as barebones as the kernel's maximum allowed commandline
length. That happened once in a while in the past so finding better
solutions to that problem within easy reach (as in: existing in dpkg
already, new interfaces for
possible future use are a different matter) is in order. Other problems
included the overuse of --force
options, not communicating
the purge/removal intentions to
dpkg, insufficient
crossgrade handling and
avoiding losing user configuration on conffile moves involving
packages to be purged just to
name a few. But also listening to dpkg in terms of how it processes
triggers and how all this should be reported in the form of progress
reports to the user
especially if some
steps aren't explicitly
planned anymore by a planner, but left to dpkg to do at some
point.
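
To make the commandline-length problem a bit more tangible, here is a minimal sketch of the generic workaround of splitting a long package list over several invocations. It is not how libapt does it (the work described here leans on dpkg doing more by itself instead); the function name `chunk_arguments` and the byte accounting are assumptions made purely for illustration:

```python
def chunk_arguments(base_command, packages, max_bytes):
    """Split packages over several command lines of at most max_bytes each.

    Hypothetical helper for illustration, not part of apt or dpkg.
    """
    def size(args):
        # Each argument costs its encoded length plus a terminating NUL byte.
        return sum(len(arg.encode()) + 1 for arg in args)

    base_size = size(base_command)
    chunks, current, current_size = [], [], base_size
    for pkg in packages:
        pkg_size = size([pkg])
        if base_size + pkg_size > max_bytes:
            raise ValueError(f"{pkg!r} does not fit on any command line")
        if current and current_size + pkg_size > max_bytes:
            # The current command line is full: close it and start a new one.
            chunks.append(base_command + current)
            current, current_size = [], base_size
        current.append(pkg)
        current_size += pkg_size
    if current:
        chunks.append(base_command + current)
    return chunks


commands = chunk_arguments(["dpkg", "--configure"], ["aa", "bb", "cc"], max_bytes=23)
for command in commands:
    print(" ".join(command))
```

The real limit also has to leave room for the environment; on POSIX systems it can be queried with `os.sysconf("SC_ARG_MAX")`. Splitting has a cost, though: every extra invocation is another dpkg run, which is one reason shorter invocations are the nicer fix.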
The result is mostly invisible to the user, except that it should all be
now slightly faster as e.g. triggers are run
less and most "strange"
errors are a thing of the past.
Side projects, emergency calls and random bugfixes
Not actually on my plan for GSoC and at best only marginally related, if
at all, I ended up working on these to deal with important bugs on an "as
long as we have a full-time developer" basis.
This includes hunting for strange errors if rred is involved in
updating indexes, further
preparing for a binary-all
future, fixing SRV
support, being the master
of time, improving security
by allowing it to be sidestepped
sometimes, improving
security by potentially breaking backward-compatibility
a bit, stumbling into
libstdc++6 bugs, implementing
SOCKS5 proxy support and
generic config fallback for acquire
methods to be able to propose
the mergeback of
apt-transport-tor
among very many other things.
A complete list can be found with the previously shared git-branch
browsing starting at my first
commit
in the GSoC timeframe (see also statistics above).
Future
I would love to keep working on APT full-time, but that seems rather
unrealistic and all good things need to come to an end I guess, so the
next weeks will have me adjust to a more "normal" contribution level of
"just" in my (extended) free time again. I will also be casually
"shopping" for a new money source in the form of a small job while
returning to university which hasn't seen a lot of me the last few
months and picking up some community work I had delayed for after GSoC.
That means I will surely not keep my daily commit average up, but my
journey here is far from over:
After many years in APT and Debian there is still something new in it to
explore each week as both are evolving continuously – but most of it
hidden in plain sight and unnoticed by the general public: Around the
start of GSoC I was talking on #gsoc with an admin of another org who
couldn't imagine that Debian participated at all as all projects Debian
could offer would be bite-sized in nature: It is just a distribution,
right, not a real org producing value (slightly exaggerated for drama).
I would of course like to counter this view, for obvious reasons.
My life would be massively different if I hadn't started to contribute
to Debian and APT in particular ~7 years ago – even though
I thought I wouldn't be "good enough" back then. I think it's fair to say
that I showed my past self that in fact I am. Now it is your turn!
This week saw the release of 1.3~rc1 sporting my tor-changes disguised as
general acquire methods changes I mentioned last week as well as the revamp
of apt talking to dpkg (and back) I worked on the last couple weeks as part
of GSoC. It doesn't include any new planner, a stub is still lying in a wip
branch, but our good old planner looks and behaves slightly differently, so it
feels like a new one – and surprising as it is: So far no bugreport related to
this. Probably all user systems caught instantly fire! 🙂︎
So, the week was further used to get used to cmake,
build and run apt on various porterboxes to fix testcase failures, fixing other
incomings and especially pull some hair out while debugging the bug of the week
which lends the title to this blogpost: A one word fix for an issue
manifesting itself only in optimization level -O3 on
ppc64el.
Optimizations are evil…
Besides causing quite some time waste for this week as well as in previous years
it is also closing a loop: I introduced this problem myself while being a GSoC
student… in 2010. Time really flies. And I have no idea what I was thinking either…
I could be describing more of these "tiny" bugs, but the commit messages tend
to do a reasonable job and if you are really that damn interested: Feel free to
ask. 🙂︎
This next week will be the last official one in GSoC from a student's POV as I am
supposed to clean up all bases & submit my work for the final evaluation – this
submission will be a blogpost describing & linking to everything, which equals
miles long and relatively soon, so I have purposefully kept this one
very short so you will have enough energy to bear with me for the next one.
The week started badly: I had for a long while now a dead 'e' key on my
keyboard, but I didn't care that much… I just remapped CAPSLOCK,
retrained my fingers and was done with it [I have some fucked up typing
style anyhow]. All good, but this week additional keys started to
give up. You have no idea how annoying it is to not be able to use the
arrow keys. Many things I work with have at least the vim-keybindings,
but even in vim picking an autocompletion becomes a nightmare (or
navigating shell history)… So, replacement keyboard please! That took
a while, especially replacing it as my laptop makes that extra hard it
seems but oh well. All working again now! The c-key is actually working
a bit too well (you only have to touch it now, which had me worried as
it started out with producing an endless stream of 'c' out-of-the-box
before I removed the cap once) but so be it for now.
As you might guess that wasn't the ideal work-environment and slowed me
down (besides being annoying), so what I intended to do as a sideproject
turned out to be covering most of the week. Mergeback of
apt-transport-tor into
apt? Yes, no,
maybe? The first few responses by mail & IRC are in regards to the plan,
but that still had the need for a lot of code to be written and
refactored. I have to say, implementing SOCKS5 proxy support in apt was
kinda fun and not nearly as hard as I had imagined. It was slightly harder
to get a setup working in which I could test it properly. Everyone
knows netcat, but that really targets more text-based protocols, not
binary ones like SOCKS5. Still, I managed to figure out how to do it
with socat eventually, resulting in a testscript we can at least run
manually (as it requires a specific port. Not every tool is as nice as
our webserver which can be started on port 0 and reports the port it
eventually picked for real). Playful as I am I even compared my
implementation to others like curl, which our https method is using,
where I ended up reporting minor
bugs.
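
For readers wondering what "binary" means here: the first bytes a SOCKS5 client sends are laid out in RFC 1928. The following is a minimal sketch of building them, not the code that went into apt; the function names are made up for this illustration and only the "no authentication" method and the CONNECT command with a domain name are covered:

```python
import struct

# Constants as defined by RFC 1928.
SOCKS_VERSION = 0x05
NO_AUTHENTICATION = 0x00
CMD_CONNECT = 0x01
ATYP_DOMAINNAME = 0x03


def build_greeting():
    """Client greeting offering exactly one method: no authentication."""
    return bytes([SOCKS_VERSION, 1, NO_AUTHENTICATION])


def build_connect_request(hostname, port):
    """CONNECT request which leaves resolving the hostname to the proxy."""
    name = hostname.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1 to 255 bytes long")
    # VER, CMD, RSV (always 0), ATYP, then a length-prefixed name.
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00, ATYP_DOMAINNAME, len(name)])
    # The port follows as two bytes in network byte order.
    return header + name + struct.pack("!H", port)


print(build_greeting().hex(" "))
print(build_connect_request("example.org", 80).hex(" "))
```

Sending the hostname instead of an already resolved address is the interesting choice for a Tor setup, as the name lookup then happens on the proxy side rather than on the client. None of these bytes are printable text, which is why a plain netcat session is of little help for testing.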
But why and why now you might ask: apt-transport-tor can be (surprise
surprise) used to let apt talk to the Tor network. Conceptually that isn't
incredibly hard: The Tor software provides a SOCKS5 proxy an application
can connect to, and be done. Two years ago, when apt-transport-tor was
introduced, only our curl-backed https method could do that & the
intention was to be able to make backports of that transport available,
too, so even though I wasn't all that happy about it, we ended up with
a modified copy of our https method named tor in the archive and as it
is with most modified copies of code, they aren't really kept in sync
with the original. I really want to get this resolved for stretch, so it
slowly gets high time to start this: if it turns out that I need to
take over maintenance for it without previous-maintainer consent there
is quite a bit of waiting involved, and stuff like this should really not be
changed last-minute before the freeze… you will find more details on the
reasons for proposing this solution in the mentioned mail.
Besides, SOCKS support is actually a so much requested feature that the
internet actually believes apt would support it already via
Acquire::socks::proxy … which is and will also be in future wrong as
there is no socks method we would configure a proxy for – if at all you
configure a method like http to use a socks proxy…
Of course, bundled with this comes a bunch of other things like better
redirection handling across methods and stuff, but that isn't really
user visible, so I skip it here and instead refer you to the git
branches if you are really interested. A few things I will surely also
mention when the relevant code is in the archive so that interested
peers can test…
On my actual battle front, the ordering progress was insignificant.
I got lots of manual testing and review done, but not much new stuff.
The problem is mostly that ordering is easy as long as the problem is
easy, but as soon as Pre-Depends are entering the picture you suddenly
have to account for all kinds of strange things like temporary removals,
loops, conflicting or-groups, … stuff you don't want to lose hair over
while losing hair over your broken keyboard already. 😉︎
This week for realz, although the target is now really more to merge the
current stuff for apt 1.3. A new ordering algorithm is as detailed in
the initial proposal buster material anyhow – and given all the changes
in terms of trigger delaying and pending calls you are likely not to
recognize our "old" ordering anymore, but more on this in the next two
weeks as that will be the end of GSoC and hence I am going to look back
at "the good old times" before GSoC compared to what we have now. 🙂︎
P.S.: This week's weekend marks the start of a big wine festival in our
state
capital
my family is one of the founding members of. I am "just" going to help
building the booth though, so no giant timesink this time – just
a couple hours – just in case you hear me saying something about wine
again on IRC.
Picture me as a messenger kicked into an endless pit of complexity by
what was supposed to be an easy victim. It wasn't /that/ bad, but
massaging apt to treat crossgrades right took some time – and then some
more to request that dpkg would handle some of
it, too, as
it gets confused by multi-instance packages. Much like apt although that
was just affecting apt's progress reporting so not that bad… perhaps I am
play-testing too
much…
In unrelated news I dealt with the two acquire bugs which started last
week as such bugs are annoying & carry the risk of being security
problems which would require immediate attention. Thankfully none of them
seems to be one, but #831762 had me seriously worried for a while.
Trivial in retrospect, but getting to a point in which you consider
the possibility of that happening at all…
But back to the topic: There was one thing still needed to get our
current internal planner a bit "smarter" by enabling options which
existed for a while now, but were never activated by default, and that
thing was simulation. While it's kinda appealing to have the simulation
only display what the planner explicitly told us to do, ignoring what
would implicitly be done by the --pending calls we can't really do that
as this would be an interface break. There are surely scripts out there
doing funny things with this output so having it be incomplete is not an
option, which in turn means that what I did internally for the progress
reporting (and hook scripts) must also be done in the simulation. Easier
said than done though as the implementation of it followed a direct
approach running the simulation of each action as soon as the action was
called rather than collecting all the actions first to post-process them
(as I want to do it) and execute them only then. Add to this that this
is a public class, so ABI is a concern… the solution I arrived at is
slightly wrong, but is going to satisfy all existing callers (which is
only aptitude in the archive thanks to codesearch.d.n) and reuses the
"dpkg specific" code, which is a layer violation, but reuse is better
than copy&paste without breaking ABI, so I am happy all things
considered.
So, with that out of the way glory awaits: Changing the default of
PackageManager::Configure from "all" to "smart"… and tada: It works! The
simulation shows everything, the dpkg invocations are much shorter,
trigger executions delayed, we rely more on --pending calls and progress
reporting is properly moving forward as well! Not 100% production ready but
good enough for a public wip branch for
now (= wip aka: going to be
rebased at will).
This also includes the barebones 'dpkg' planner I mentioned last week,
based on that I was playing with ideas to find a more viable
implementation (= handling Pre-Depends) but nothing of particular note
produced yet. Maybe I can get something working this week – it is at
least part of the plan beside polishing my wip branch – after leaving
that pit that is… Hello, is anyone up there? Hello? Hello? …
On the public side of things I did a bunch of things this week which
weren't exactly related to the GSoC project but were triggered by
incoming mails (and IRC highlights) with bugreports and interesting
wishlists. That isn't completed yet as there are two new responses
with debug logs waiting for me next week.
I say that as I brushed up a smallish commit for merge this week which
is supposed to deal better with triggers in progress reporting. That
sounds boring – writing progress reporting stuff – but many people care
deeply about it. While this one is more of a cosmetic change it's
conceptually a big one: With the actions the planner proposed, apt
builds for each package a list of states it will pass through while it
is installed/upgraded/removed/purged. Triggers rain on this parade as we
don't know beforehand that a package will be triggered. Instead,
a status message from dpkg will tell us that a package was triggered, so
if we don't want to end up in a state in which apt tells via progress
report that it is done, but still has a gazillion triggers to run we
have to notice this and add some progress states to the list of the
triggered package – easy right? The "problem" starts with packages which
are triggered but are destined to be upgraded. A triggered package
will lose its trigger state if it is unpacked, so our progress report
has to skip the trigger states if the package is unpacked – we can't
just exclude packages if they will be unpacked as it can easily be that
a package is triggered, the trigger is acted upon and is upgraded "ages"
later in this apt run.
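
The bookkeeping described above can be modelled in a few lines. This is a toy model only, not apt's implementation: the class `ProgressPlan`, its method names and the state names (including `triggers-run`) are assumptions chosen for this sketch.

```python
TRIGGER_STATE = "triggers-run"  # hypothetical state name for this sketch


class ProgressPlan:
    """Toy model: every package has a list of states it still has to reach."""

    def __init__(self, planned_states):
        self.pending = {pkg: list(states) for pkg, states in planned_states.items()}
        self.done = 0

    def remaining(self):
        return sum(len(states) for states in self.pending.values())

    def percent(self):
        total = self.done + self.remaining()
        return 100.0 if total == 0 else 100.0 * self.done / total

    def on_triggered(self, pkg):
        # Only known once dpkg reports it: expect a trigger run for pkg.
        states = self.pending.setdefault(pkg, [])
        if TRIGGER_STATE not in states:
            states.insert(0, TRIGGER_STATE)

    def on_state_reached(self, pkg, state):
        states = self.pending.get(pkg, [])
        if state == "unpacked" and TRIGGER_STATE in states:
            # Unpacking drops the trigger state, so that step is skipped.
            states.remove(TRIGGER_STATE)
        if state in states:
            states.remove(state)
            self.done += 1


plan = ProgressPlan({"foo": ["unpacked", "installed"]})
plan.on_triggered("man-db")   # triggered, not part of the plan at all
plan.on_triggered("foo")      # triggered, but still to be upgraded
plan.on_state_reached("foo", "unpacked")
plan.on_state_reached("foo", "installed")
print(f"{plan.percent():.0f}% done, man-db still has a trigger to run")
plan.on_state_reached("man-db", TRIGGER_STATE)
print(f"{plan.percent():.0f}% done")
```

The point of the model is the first printed line: the progress report does not claim to be finished while a triggered package still waits for its trigger run.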
My wip branch contains many more progress related commits as there is
a big gotcha in the description above: I said "with the actions the
planner proposed", so what about the actions the planner isn't proposing
but will happen as part of dpkg --configure/--remove/--purge --pending
calls? And what about hook scripts like apt-listbugs/changes which get
told the actions to perform their own magic?
The solution is simple: We don't tell dpkg about this, but for our own
usage we do trivial expansions of the --pending commands and use these
for progress report planning as well as for telling the hookscripts
about them. That sounds like a very simple and optional thing, but it
was actually what blocked the activation of various config options I had
implemented years ago which delay trigger execution, avoid explicit
configuration of all packages at the end and all that which I could now
all enable – a bit more on that after this hits the master branch. 🙂︎
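
As a rough illustration of such a "trivial expansion", the sketch below rewrites `--pending` style entries of a plan into explicit package lists for internal use only. The plan format, the `PENDING` marker and the function name `expand_pending` are invented for this example and say nothing about how libapt represents these things:

```python
PENDING = None  # hypothetical marker: a "--pending" call without explicit packages


def expand_pending(plan, to_remove=(), to_purge=()):
    """Return the plan with every --pending entry replaced by explicit packages.

    plan is a list of (action, packages) tuples; unpack always lists packages.
    """
    unconfigured = []  # unpacked so far, but not configured yet
    waiting = {
        "configure": unconfigured,
        "remove": list(to_remove),
        "purge": list(to_purge),
    }
    expanded = []
    for action, packages in plan:
        if action == "unpack":
            unconfigured.extend(packages)
        elif packages is PENDING:
            # Everything still waiting for this action is what dpkg would do.
            packages = list(waiting[action])
        if action in waiting:
            for pkg in packages:
                if pkg in waiting[action]:
                    waiting[action].remove(pkg)
        expanded.append((action, list(packages)))
    return expanded


plan = [("unpack", ["a", "b"]), ("configure", ["a"]), ("configure", PENDING)]
for action, packages in expand_pending(plan):
    print(action, " ".join(packages))
```

dpkg itself would still be called with `--pending`; the expanded list is only what progress reporting and hook scripts get to see.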
I also looked into supporting crossgrades better. In apt's conception
a crossgrade is the removal of a package of arch A and the installation
of a new package (with the same name) of arch B. dpkg on the other hand
deals with it like a 'normal' upgrade, just that the architecture of the
package changes. The issue with that isn't gigantic usually, but it
becomes big with essential packages like if you try to crossgrade dpkg
itself with apt: APT refuses to do that by default, but with enough
force it will tell dpkg to remove dpkg:A and then it tells dpkg to
unpack dpkg:B – just that there is no dpkg anymore which could unpack
itself. At least in that case we can skip the remove of dpkg:A, but we
can't do it unconditionally as that might very well be some part of an
order requirement, so progress reporting should be prepared for either
to (not) happen… That isn't finished yet and will surely leak into next
week.
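
Spotting a crossgrade in a list of planned actions is the easy part and can be sketched as below. The tuple format and the function name `find_crossgrades` are assumptions for illustration, and the sketch deliberately ignores multi-instance packages, where several architectures of the same name can be installed side by side:

```python
def find_crossgrades(actions):
    """Map package name to (old_arch, new_arch) for remove+install pairs.

    actions is an iterable of (verb, name, arch) with verb being
    'remove' or 'install'; a simplified, hypothetical plan format.
    """
    removed, installed = {}, {}
    for verb, name, arch in actions:
        if verb == "remove":
            removed[name] = arch
        elif verb == "install":
            installed[name] = arch
    return {
        name: (removed[name], installed[name])
        for name in removed
        if name in installed and removed[name] != installed[name]
    }


actions = [
    ("remove", "dpkg", "i386"),
    ("install", "dpkg", "amd64"),
    ("remove", "foo", "amd64"),
    ("install", "bar", "amd64"),
]
print(find_crossgrades(actions))
```

The hard part described above starts after detection: deciding whether the explicit remove can be skipped without violating an ordering requirement.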
Next week will also see my freshly built planner 'dpkg' get a proper
tour: With all the --pending calls it seems like a good idea to try to
be extra dumb and have a planner just unpack everything in one go, let
the rest be covered by --pending calls and see what breaks: Obviously
the harder stuff, but I have two directions I would like to explore
based on this minimal planner to make it viable. First I have to finish
the crossgrading though, my usual self-review of commits and the
bugreports I participated in this week want to trigger further actions,
too… see you next week!
These are the latest five out of a total of 17 posts tagged gsoc2016.