DHCP addresses for containers should be released on teardown

Bug #1348663 reported by Mark Shuttleworth
This bug affects 2 people
Affects      Status        Importance  Assigned to       Milestone
MAAS         Invalid       Undecided   Unassigned
juju-core    Fix Released  High        Dimiter Naydenov
  1.24       Fix Released  High        Dimiter Naydenov

Bug Description

We've noticed a problem in the Garage MAAS where container IPs that are leased and no longer used, but never released, clog up DHCP. Containers get their IP address from DHCP. When they die that address could be released, but unless it is told to do so the DHCP server keeps the address reserved for 8 hours.
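
As a rough illustration of why unreleased leases exhaust a pool, the Python sketch below models leases that are never released and only expire after the lease time. The pool size and churn rate are hypothetical numbers, not measurements from the Garage MAAS, and the model ignores leases held by containers that are still running.

```python
def hours_until_exhaustion(pool_size, containers_per_hour, lease_hours=8):
    """Estimate when a DHCP pool fills up if dead containers never release.

    Simplified model (an assumption, not MAAS behaviour): every container
    takes one lease, and that lease stays reserved for lease_hours.
    Returns None when the pool can absorb the churn indefinitely.
    """
    # In steady state the number of reserved leases is rate * lease time.
    if containers_per_hour * lease_hours <= pool_size:
        return None
    # Otherwise the pool fills before the first lease expires.
    return pool_size / containers_per_hour


if __name__ == "__main__":
    # Hypothetical: 200 dynamic addresses, 50 containers created per hour.
    print(hours_until_exhaustion(200, 50))
    # Hypothetical: same pool, 20 containers per hour.
    print(hours_until_exhaustion(200, 20))
```

With an explicit release on teardown the lease time drops out of the picture, and only the addresses of live containers count against the pool.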

dhclient can release IP addresses:

  dhclient -r

In cases where a container is being destroyed, it would be polite to explicitly release the leases assigned to it.

And of course MAAS needs to notice and free up the lease when the container signals it to do so.

This may be just as applicable to full machines as it is to containers. Essentially, Juju is in the privileged position of knowing that the machine is *going away* conceptually, not just rebooting or shutting down temporarily. If it sent a DHCP Release message, the resources consumed would be immediately available for another machine.

I don't believe that normal ifupdown scripts or network-manager scripts use DHCP Release at all - I see no logged messages to that effect on my home network, but I may be wrong.

This will need careful consideration as to timing - the machine may need to access the network on its way down as part of unmounting network drives or terminating services. So we can't just turn its network off. But we should try to find an elegant way of releasing the IP addresses as a final step on the way out before we turn out the lights.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

If we call ifdown on the way to shutdown or reboot of a container, then it might be we can set a flag which will make ifdown send the dhcp release message.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

MAAS doesn't know about the containers that juju creates; do you think MAAS should grow support for them so they can be managed natively in the MAAS API?

As things stand, once 1.6.0 is released the new static IP work provides all the API necessary for juju to manage the extra IPs that it needs for containers, and I apprised Dimiter of the feature.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote : Re: [Bug 1348663] Re: DHCP addresses for containers should be released on teardown

No need for MAAS to learn about containers, in this case MAAS just needs
to make sure that its DHCP respects DHCP-RELEASE.

Mark

Revision history for this message
Curtis Hovey (sinzui) wrote :

This issue follows the theme of some recently closed openstack bugs that ensure Juju releases addresses and security groups when the unit is deleted.

Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → next-stable
tags: added: networking
tags: added: maas-provider
Revision history for this message
Stéphane Graber (stgraber) wrote :

I just checked and ifdown does call -r and so should release the address, specifically, we call:
    dhclient -v -r -pf /run/dhclient.%iface%.pid -lf /var/lib/dhcp/dhclient.%iface%.leases %iface%

Now, that'd obviously only be called if the container was cleanly shut down and not simply killed.
So assuming a container called "test":
 - sudo lxc-stop -n test => will call ifdown in the shutdown sequence, timing out after 30s and forcefully killing the container at that point
 - sudo lxc-stop -n test -k => will NOT call ifdown as all the container tasks will just get SIGKILLed

This is for LXC 1.x, before it, we had a bunch of different commands doing those steps. So if you are shelling out to LXC and need to support 12.04 and 14.04, you'd want something like this:

Clean shutdown:
 - If lxc-shutdown exists, call: lxc-shutdown -n <container> -t 30
 - Otherwise, call: lxc-stop -n <container>

Forcefully killing stuff (ifdown won't get called):
 - If lxc-shutdown exists, call: lxc-stop -n <container>
 - Otherwise, call: lxc-stop -n <container> -k
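
The selection rules above can be sketched in Python for a caller that shells out to LXC. This is only an illustration: the function name and its parameters are made up here, and the `which` argument exists so the lookup can be replaced in tests. It builds the command list and does not run it.

```python
import shutil


def lxc_stop_command(name, force=False, timeout=30, which=shutil.which):
    """Return the argv for stopping container `name`.

    Follows the rules listed above: if lxc-shutdown exists (pre-1.x
    tooling), it is the clean path and plain lxc-stop is the forceful
    one. Otherwise (LXC 1.x) plain lxc-stop is clean and -k is forceful.
    """
    has_lxc_shutdown = which("lxc-shutdown") is not None
    if not force:
        if has_lxc_shutdown:
            return ["lxc-shutdown", "-n", name, "-t", str(timeout)]
        return ["lxc-stop", "-n", name]
    # Forceful paths: ifdown will not be called inside the container.
    if has_lxc_shutdown:
        return ["lxc-stop", "-n", name]
    return ["lxc-stop", "-n", name, "-k"]


if __name__ == "__main__":
    print(lxc_stop_command("test"))
    print(lxc_stop_command("test", force=True))
```

The resulting list can be handed to subprocess.run() as is. Only the clean path gives the container a chance to run ifdown.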

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

Thanks Stgraber. Can anyone comment on:

 * whether Juju has anything like the "clean shutdown of a container"
outlined above?
 * whether MAAS' dhcpd is seeing and acting on those releases?
     - and if so, why anyone has run out of DHCP addresses because of
containers?

Mark

Revision history for this message
Adam Collard (adam-collard) wrote :

Hi Mark,

On 28 July 2014 19:11, Mark Shuttleworth <email address hidden> wrote:

> Thanks Stgraber. Can anyone comment on:
>
> * whether Juju has anything like the "clean shutdown of a container"
> outlined above?
>

Landscape's cloud-installer does a juju destroy-environment --force, to
forcefully bring down the Juju environment. This was necessary in one of
the Juju 1.19.x builds because of a bug where it was trying to tell MAAS to
release the containers (which MAAS knew nothing about). We also want to
make sure that the environment is really torn down. I mention this, because
it's likely to side step any clean shutdowns that Juju might otherwise do :/

Ryan Harper (raharper)
tags: added: oil
Revision history for this message
John A Meinel (jameinel) wrote :

Generally, "juju destroy-environment --force" will indeed bypass any clean shutdown and simply stop instances from the client, going directly to the Provisioner.

Curtis Hovey (sinzui)
tags: added: network
removed: networking
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Invalidating the maas task as there's nothing for it to do here, unless someone finds a bug in the dhcp handling.

Changed in maas:
status: New → Invalid
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Juju uses the golxc package to manage LXC containers. Right now golxc uses "lxc-stop -n name" which, as stgraber commented earlier, should do a clean shutdown and release the DHCP-allocated address. So it seems juju-core does the right thing, but MAAS perhaps does not see the DHCP_RELEASE request?

Anyway, I've filed a bug #1390429 to improve golxc so that it uses "lxc-shutdown" when available, but even if implemented it doesn't look like it will change the current behavior.

Changed in juju-core:
status: Triaged → Invalid
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Reopening, because after a few experiments with both LXC and KVM containers using DHCP-assigned addresses from MAAS, I can confirm we *don't* explicitly release the addresses.

Changed in juju-core:
status: Invalid → Triaged
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.21 → 1.22
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

I'm currently investigating the issue to determine why DHCPRELEASE is not happening as it should.

Changed in juju-core:
assignee: nobody → Dimiter Naydenov (dimitern)
Changed in juju-core:
milestone: 1.22-alpha1 → 1.23
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.23 → 1.24-alpha1
Changed in juju-core:
importance: High → Critical
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

We can do proper cleanup using the new MAAS devices API (not yet released, but should be soon I'm told). Should I assume it's critical for 1.23 as well?

Revision history for this message
Alexis Bruemmer (alexis-bruemmer) wrote :

We are targeting 1.24 for this fix and currently do not have plans to add the fix to a 1.23 point release

Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.24-alpha1 → 1.25.0
Revision history for this message
Michael Foord (mfoord) wrote :

As far as I can tell we don't release DHCP leases even in the graceful shutdown cases (neither remove-machine on a container nor a destroy-environment without force). The MAAS DHCPLease database seems to still contain all dhcp leases.

The new "addressable containers" in 1.23 (behind a feature flag) will address this for containers by explicitly acquiring IP addresses from the provider and releasing them. Any reason for the machine agent in the container not to call release on graceful shutdown in the meantime? (In 1.23 / 1.24.)

Revision history for this message
Michael Foord (mfoord) wrote :

In the graceful shutdown case we should be calling lxcBroker.StopInstances() which calls manager.DestroyContainer(id) which goes to golxc container.Stop() which *does* call lxc-stop (without the -k). So we *should* see DHCP release (according to comments earlier in this bug). Adding some logging to confirm this is really called.

Revision history for this message
Michael Foord (mfoord) wrote :

The specific call to lxc-stop for a "remove-machine" on a container is (logging output):
machine-0: 2015-05-06 22:49:10 WARNING golxc.stop golxc.go:278 lxc-stop [--name juju-machine-0-lxc-0]

Still trying to confirm this *should* do ifdown which should release the lease. Will try manually with a new container on the machine to see.

Revision history for this message
Michael Foord (mfoord) wrote :

Manually creating an lxc container on a machine owned by juju, bridged to juju-br0, then calling lxc-stop does *not* clear the DHCP lease with MAAS. It looks like we'd have to call "dhclient -r" inside the container on shutdown.

Revision history for this message
Michael Foord (mfoord) wrote :

Inside an lxc container, bridged via juju-br0 to get an IP address from MAAS, neither "dhclient -r" *nor* "ifdown eth0" causes MAAS to forget the lease.

However if I do ifdown, delete the dhclient leases file and *then* do an ifup the lease is removed and then re-added (with the same ip address of course but I definitely see the removal). I think it's going to be hard for us to fix this outside of the addressable containers story.

Ian Booth (wallyworld)
Changed in juju-core:
assignee: Dimiter Naydenov (dimitern) → Michael Foord (mfoord)
Revision history for this message
Michael Foord (mfoord) wrote :

Sniffing traffic with an lxc container on my network (not inside a juju environment - shouldn't be any different though):

* "dhclient -r" (without further arguments) does not send a DHCP Release
* "ifdown eth0" does send it

Given that doing an ifdown inside a container does not cause the lease to be released, this would *appear* to be a MaaS DHCP bug. Needs confirming. (I'm checking DHCP leases by inspecting the DHCPLease model in the MAAS database.)
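
One way to double-check what the sniffer shows is to look at the DHCP message type option directly. The Python sketch below searches a raw BOOTP/DHCP UDP payload for option 53 with value 7, which RFC 2132 defines as DHCPRELEASE. The helper names and the sample packet are made up for illustration, and capturing the payload is left to whatever tool is in use.

```python
DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # follows the 236-byte BOOTP header
OPTION_PAD = 0
OPTION_END = 255
OPTION_MESSAGE_TYPE = 53
DHCPRELEASE = 7


def dhcp_message_type(payload):
    """Return the DHCP message type of a UDP payload, or None if absent."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC_COOKIE:
        return None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == OPTION_END:
            break
        if code == OPTION_PAD:  # single-byte padding, no length field
            i += 1
            continue
        if i + 1 >= len(payload):
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == OPTION_MESSAGE_TYPE and len(value) == 1:
            return value[0]
        i += 2 + length
    return None


def is_dhcp_release(payload):
    return dhcp_message_type(payload) == DHCPRELEASE


def sample_payload(message_type):
    """Build a minimal, otherwise empty DHCP payload (illustration only)."""
    options = bytes([OPTION_MESSAGE_TYPE, 1, message_type, OPTION_END])
    return bytes(236) + DHCP_MAGIC_COOKIE + options


if __name__ == "__main__":
    print(is_dhcp_release(sample_payload(DHCPRELEASE)))
```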

Revision history for this message
Michael Foord (mfoord) wrote :

Ah, no. I need to retry all of this from a clean environment - however "lxc-stop" does *not* appear to cause a DHCP release. ifdown *does*, so an explicit ifdown in the right place (prior to the lxc-stop) might at least cause the DHCP release.

I'm not seeing the result of that ifdown in the MAAS database, but it does get written to dhcpd.leases - maybe this would be sufficient(?).

Revision history for this message
Michael Foord (mfoord) wrote :

I've had it confirmed that the leases file is authoritative, not the MAAS dhcplease database table - so my previous assertions about dhcp release from inspecting the database are flawed.

However, from investigation it seems that neither an lxc-stop from outside the container, nor a "halt" from inside the container, cause a dhcp release. An explicit "ifdown" from inside the container *does*. So it seems like the shutdown sequence does not do an "ifdown". Investigating further. (To see if we *should* be seeing this as Stephane asserted, and if so why it isn't happening, if not to see what it would take for us to add it to the shutdown.)
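
Since the leases file is the authoritative record, a release can be checked there rather than in the database. The Python sketch below reads dhcpd.leases style text and reports the latest binding state per address, relying on the fact that ISC dhcpd appends entries so the last one for an address wins. The sample entries, addresses and MAC are invented, and the assumption that a released lease shows up as "binding state free" should be checked against the dhcpd in use.

```python
import re

# One "lease <ip> { ... }" block; the closing brace sits at the start of a line.
LEASE_BLOCK = re.compile(r"^lease\s+(\S+)\s*\{(.*?)^\}", re.MULTILINE | re.DOTALL)
# Match "binding state X;" but not "next binding state X;".
BINDING_STATE = re.compile(r"^[ \t]*binding state\s+(\w+);", re.MULTILINE)


def current_binding_states(leases_text):
    """Map each IP to the binding state of its most recent lease entry."""
    states = {}
    for ip, body in LEASE_BLOCK.findall(leases_text):
        match = BINDING_STATE.search(body)
        if match:
            states[ip] = match.group(1)  # later entries overwrite earlier ones
    return states


# Invented sample: the second entry for 10.0.0.12 models a released lease.
SAMPLE = """\
lease 10.0.0.12 {
  starts 3 2015/05/06 22:40:00;
  ends 4 2015/05/07 06:40:00;
  binding state active;
  next binding state free;
  hardware ethernet 00:16:3e:aa:bb:cc;
}
lease 10.0.0.12 {
  starts 3 2015/05/06 22:40:00;
  ends 3 2015/05/06 22:49:10;
  binding state free;
  hardware ethernet 00:16:3e:aa:bb:cc;
}
lease 10.0.0.13 {
  binding state active;
}
"""

if __name__ == "__main__":
    print(current_binding_states(SAMPLE))
```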

Revision history for this message
Michael Foord (mfoord) wrote :

The "right" fix for this problem is to use addressable containers which statically allocate addresses. Bug #1441206 is about getting those addresses released and is much easier to complete. I'm demoting this bug to high (as it is still fixable as is - but more work).

Changed in juju-core:
importance: Critical → High
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

We've planned to fix this issue in our next 2-week iteration starting May 25, but since most of the team is off on the 25th it won't make it for the pre-scheduled 1.24.0 release on the same day. It should make it shortly after, in a point release, or in 1.24.0 itself if the release slips a few days.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

Curtis, please reset this to high, it causes a lot of problems in
high-traffic environments like OIL. Every time we build a cloud we use
many containers, tearing down the cloud needs to release the addresses
properly.

Thanks,
Mark

Revision history for this message
Curtis Hovey (sinzui) wrote :

I have tentatively scheduled this issue to be fixed in 1.24.1. Maybe Core can fix this before the 1.24.0 release this week, or we choose to delay 1.24.0 until this issue is fixed.

Revision history for this message
Alexis Bruemmer (alexis-bruemmer) wrote :

A fix for this bug will land in 1.25 as continued work on addressable containers. The team will also investigate possible solutions for 1.24 that would land in a 1.24 point release.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

We've successfully verified that the normal LXC shutdown path (the lxc-stop -n <name> that juju uses) DOES NOT send a DHCPRELEASE packet (verified by running dhcpdump -i juju-br0 on the host). However, injecting a simple init job (upstart or systemd) that does /sbin/ifdown -a --force during shutdown does the trick. Forcefully destroying the host machine (with juju or outside of it) prevents that job in the container from executing, so the proposed fix will only work for non-forceful machine (and environment) destruction. Containers can still be destroyed with --force (in case there are units on them) and the job still works.
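
For readers wondering what such an init job might look like on systemd, here is a hypothetical sketch, rendered from Python so it can be written into a container image. It is not the job juju ships: the unit layout, its description and the function name are assumptions. It uses the common pattern of a oneshot unit that stays active after boot, so that systemd runs its ExecStop command at shutdown.

```python
UNIT_TEMPLATE = """\
[Unit]
Description=Take interfaces down at shutdown to release DHCP leases (illustrative)
# Units are stopped in reverse start order, so ordering this unit after
# network.target means it is stopped before the network is torn down.
After=network.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop={ifdown} -a --force

[Install]
WantedBy=multi-user.target
"""


def render_clean_shutdown_unit(ifdown="/sbin/ifdown"):
    """Return the text of a hypothetical clean-shutdown systemd unit."""
    return UNIT_TEMPLATE.format(ifdown=ifdown)


if __name__ == "__main__":
    print(render_clean_shutdown_unit())
```

As noted above, a job like this only helps when the container or machine actually goes through its shutdown sequence.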

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

On 09/06/15 10:23, Dimiter Naydenov wrote:
> We've successfully verified the normal LXC shutdown path (lxc-stop -n
> <name> juju uses) DOES NOT send a DHCPRELEASE packet (verified by
> running dhcpdump -i juju-br0 on the host). However, injecting a simple
> init job (upstart or systemd) that does /sbin/ifdown -a --force during
> shutdown does the trick. Forcefully destroying the host machine (with
> juju or outside of it) prevents that job in the container from
> executing, so the proposed fix will only work for non-forceful machine
> (and environment) destruction. Containers can still be destroyed with
> --force (in case there are units on them) and the job still works.

Is there a reason not to ensure that all system shutdowns (physical,
virtual, container) do DHCPRELEASE? I would prefer to generalise this
than have a container-specific behaviour.

Also, we need to handle cases where addresses were allocated directly by
APIs and not at random by DHCP. Currently, a container does DHCP and
gets an IP address. But in future, the container should be *given* the
address by Juju based on API conversations between Juju and the
substrate (cloud / maas). In that case, statically allocated addresses
should ALSO be released.

Mark

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

I don't see a reason not to do as suggested for both machines and containers on them. Will do.

As for the case where containers get static IPs from MAAS, we have 2 scenarios:
1. with the address-allocation feature flag and with MAAS 1.7+ we use the ipaddresses API to reserve addresses for containers (this is available in juju 1.23.2+)
2. with the (upcoming) devices-api-maas feature branch and with MAAS 1.8+ we use the devices API to register/unregister containers into MAAS (with their parent node) and then claim-sticky-ip API to reserve addresses for them.

In case 1. we release the allocated addresses on shutdown of the container (or of its parent node), but there's still a chance the addresses are not released (e.g. with destroy-environment --force).

In case 2. we notify maas when a container is removed and maas takes care of releasing the addresses used. This will also work with destroy-environment --force, as maas knows the parent/child relationships between nodes and devices on them.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

On 09/06/15 13:29, Dimiter Naydenov wrote:
> I don't see a reason not to do as suggested for both machines and
> containers on them. Will do.

I think there is a risk that this means that a machine which reboots
comes back with a new IP address. But having declared the machine to use
DHCP, this cannot be an entirely unexpected event to the administrator
:) We would just be making it much more likely to happen. Let's try it
in Wily and see if there is fallout (though the Ubuntu development list
should be consulted before an upload).

Mark

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

That's correct, but whether a machine will get another address depends on the lease duration as well.
We didn't have issues with machines leaking leases AFAIK, only containers so that's why I suggested initially to fix it for containers only. But it should be OK to do it for machines as well.

We're testing on trusty and vivid first, and wily should come after, yes.
Do you think we should prevent the "ifdown on shutdown" behavior by default for wily, unless a feature flag is set?

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

During the live tests with LXC, KVM, and regular machines (on MAAS) I've discovered an interesting discrepancy between trusty and vivid. On vivid, without the "clean-shutdown" job, reboots and shutdowns of containers or machines cause DHCPRELEASE to be sent, while on trusty (with the same setup and commands) DHCPRELEASE is not sent. With the "clean-shutdown" job in both vivid and trusty DHCPRELEASE is sent as expected. I'm working on a branch to introduce the clean-shutdown job for machines and containers in 1.24.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Fix for 1.24 proposed with https://github.com/juju/juju/pull/2548. Live tested on EC2 and MAAS in various setups.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

The fix for 1.24 has landed (affects all machines - containers or instances). I've filed a separate bug 1464237 to track the devices API feature implementation on MAAS.

I'm testing the port of that fix for 1.25 and will propose it when done.

Changed in juju-core:
status: Triaged → In Progress
assignee: Michael Foord (mfoord) → Dimiter Naydenov (dimitern)
Revision history for this message
Dimiter Naydenov (dimitern) wrote :
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released