Drop unnecessary blocking of all net udev rules

Bug #1577844 reported by Martin Pitt
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
cloud-init (Ubuntu)
Fix Released
Medium
Scott Moser
Xenial
Fix Released
Medium
Scott Moser

Bug Description

cloud-inits networking setup currently jumps through a lot of bad hoops to make sure that ifup@ does not run until after cloud-init-local.service. This includes blocking udev rules for an indefinite time, which is racy, a potential deadlock, and highly non-elegant.

This is also not necessary: while ifupdown's net udev rule certainly can fire before cloud-init-local, it only asynchronously starts ifup@.service which will be deferred until after network-pre.target and thus after cloud-init-local.service.

-------
Original description, which turned out to be completely false and just us being misled:

ifup@.service can (and often does) run for a particular interface before networking.service runs. This is brittle as during early boot ifup is prone to fail: / might still be read-only, /var might not yet exist or be writable, dhclient-enter-hooks.d/ or if-up.d/ hooks might silently fail, etc. It is also unnecessary as networking.service will bring up all "auto" and all present "allow-hotplug" interfaces anyway, and it runs at the right time.

We should make either 80-ifupdown.rules or ifup@.service ignore events until networking.service is active, or wait until after it has run (slower, but avoids race conditions when hotplug events happen while networking.service is running). Thus we need to add After=networking.service to ifup@.service, so that this only does stuff after doing the "coldplug" configuration.

This also affects cloud-init's setup of networking: this currently jumps through a lot of bad hoops to make sure that ifup@ does not run until after cloud-init-local.service. This includes blocking udev rules for an indefinite time, which is racy, a potential deadlock, and highly non-elegant.

https://bugs.debian.org/752919 is related to this issue.

Related branches

Revision history for this message
Martin Pitt (pitti) wrote :

Adding cloud-init task as once we fix ifup@.service we can drop the udev rule blocking.

description: updated
Changed in ifupdown (Ubuntu):
status: New → In Progress
assignee: nobody → Martin Pitt (pitti)
Martin Pitt (pitti)
description: updated
Changed in ifupdown (Debian):
status: Unknown → New
Martin Pitt (pitti)
no longer affects: ifupdown (Debian)
Revision history for this message
Martin Pitt (pitti) wrote :

Turns out this was all just an illusion. ifup@.service already has "After=network-pre.target", i. e. is actual execution will already be deferred until after network-pre.target. cloud-init-local.service is Before=network-pre.target, so the effect of any hotplug event will sort correctly.

I added a "ExecStart=/bin/sleep 5" to cloud-init-local.service, and this shows that network-pre.target and ifup@ correctly blocks:

May 04 14:57:05 autopkgtest cloud-init[336]: [CLOUDINIT] stages.py[INFO]: network config is disabled by /var/lib/cloud/data/upgraded-network
May 04 14:57:05 autopkgtest cloud-init[336]: Cloud-init v. 0.7.7 running 'init-local' at Wed, 04 May 2016 12:57:04 +0000. Up 3.64 seconds.
May 04 14:57:10 autopkgtest systemd[1]: Started Initial cloud-init job (pre-networking).
May 04 14:57:10 autopkgtest systemd[1]: Reached target Network (Pre).
May 04 14:57:10 autopkgtest systemd[1]: Starting Raise network interfaces...
May 04 14:57:11 autopkgtest systemd[1]: Started ifup for ens3.

So there is nothing to fix in ifupdown, and it should be safe to drop all the hackery in cloud-init: /run/cloud-init/network-config-ready from cloud-init-local.service, the whole cloud-init-wait, and /lib/udev/rules.d/79-cloud-init-net-wait.rules.

Changed in ifupdown (Ubuntu):
status: In Progress → Invalid
Changed in cloud-init (Ubuntu):
status: New → Triaged
summary: - don't run ifup@.service before networking.service
+ Drop unnecessary blocking of all net udev rules
description: updated
Changed in cloud-init (Ubuntu):
assignee: nobody → Scott Moser (smoser)
Changed in ifupdown (Ubuntu):
assignee: Martin Pitt (pitti) → nobody
Revision history for this message
Scott Moser (smoser) wrote :

some futher thought one reason for the blocking of udev was to have the .link files that cloud-init wrote apply.
It turns out that systemd.link will only rename devices once (based on a property in the devices /sys that says it has been renamed).

So, for cloud-init to have these apply we will probably have to unbind and bind the device to clear that. that will also generate a hotplug event to get the device changed.

http://paste.ubuntu.com/16220589/

Revision history for this message
Martin Pitt (pitti) wrote :

Also note that blocking the uevents will not actually work for the renaming if there is an initramfs involved. I. e. it will work in containers, but not QEMU or bare metal.

Scott Moser (smoser)
Changed in cloud-init (Ubuntu):
importance: Undecided → Medium
Scott Moser (smoser)
no longer affects: ifupdown (Ubuntu)
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.7~bzr1227-0ubuntu1

---------------
cloud-init (0.7.7~bzr1227-0ubuntu1) yakkety; urgency=medium

  * New upstream snapshot.
    - fix one more unit test to run inside buildd.

 -- Scott Moser <email address hidden> Sat, 04 Jun 2016 20:55:07 -0400

Changed in cloud-init (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Scott Moser (smoser) wrote :

Hello,
An SRU upload of cloud-init for 16.04 that contains a fix for this bug has been made under bug 1595302. Please track that bug if you are interested.

Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → Medium
status: New → In Progress
assignee: nobody → Scott Moser (smoser)
Revision history for this message
Scott Moser (smoser) wrote :

fix is now released to xenial under bug 1595302. daily cloud-images with this newer version of cloud-init should appear in the next few days. Any image with a serial number *newer* than 20160707 should have cloud-init at 0.7.7~bzr1246-0ubuntu1~16.04.1 .

Changed in cloud-init (Ubuntu Xenial):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.