Can't force delete an errored instance with no info cache

Bug #1316373 reported by Sam Morrison
This bug affects 8 people
Affects                   Status        Importance  Assigned to        Milestone
OpenStack Compute (nova)  Fix Released  Medium      Vladik Romanovsky
Icehouse                  Fix Released  Medium      Abhishek Kekane

Bug Description

Sometimes, when an instance fails to launch, trying to delete it with nova delete or nova force-delete doesn't work.

This is when using cells, but I think it possibly isn't cells related. Deleting expects an info cache no matter what. Ideally, force delete should ignore all errors and delete the instance.

The delete fails with the following error:

2014-05-06 10:48:58.368 21210 ERROR nova.cells.messaging [req-a74c59d3-dc58-4318-87e8-0da15ca2a78d d1fa8867e42444cf8724e65fef1da549 094ae1e2c08f4eddb444a9d9db71ab40] Error processing message locally: Info cache for instance bb07522b-d705-4fc8-8045-e12de2affe2e could not be found.
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging Traceback (most recent call last):
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/cells/messaging.py", line 200, in _process_locally
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging resp_value = self.msg_runner._process_message_locally(self)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/cells/messaging.py", line 1532, in _process_message_locally
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging return fn(message, **message.method_kwargs)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/cells/messaging.py", line 894, in terminate_instance
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging self._call_compute_api_with_obj(message.ctxt, instance, 'delete')
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/cells/messaging.py", line 855, in _call_compute_api_with_obj
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging instance.refresh(ctxt)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/objects/base.py", line 151, in wrapper
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging return fn(self, ctxt, *args, **kwargs)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/objects/instance.py", line 500, in refresh
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging self.info_cache.refresh()
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/objects/base.py", line 151, in wrapper
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging return fn(self, ctxt, *args, **kwargs)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/objects/instance_info_cache.py", line 103, in refresh
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging self.instance_uuid)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/objects/base.py", line 112, in wrapper
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging result = fn(cls, context, *args, **kwargs)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging File "/opt/nova/nova/objects/instance_info_cache.py", line 70, in get_by_instance_uuid
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging instance_uuid=instance_uuid)
2014-05-06 10:48:58.368 21210 TRACE nova.cells.messaging InstanceInfoCacheNotFound: Info cache for instance bb07522b-d705-4fc8-8045-e12de2affe2e could not be found.

Tracy Jones (tjones-i)
tags: added: compute
Tiago Mello (timello) wrote :

I'm wondering why the info_cache does not exist for that instance.

Sam Morrison (sorrison) wrote :

The instance failed to build for some reason and was never scheduled to a host or given an IP address, etc. It went to the error state.

Changed in nova:
assignee: nobody → Vladik Romanovsky (vladik-romanovsky)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/93860

Changed in nova:
status: New → In Progress
Vladik Romanovsky (vladik-romanovsky) wrote :

It looks to me like this is mostly related to cells.
All methods in cells messaging try to refresh the instance, and the refresh fails when the instance doesn't have an info_cache.
I think we can just handle the exception and not re-raise it when the operation is a delete.

I've sent a patch, please take a look.
Thanks.

Changed in nova:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/93860
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=47898ba8f9526c88a03209dbc35a59d90b79e809
Submitter: Jenkins
Branch: master

commit 47898ba8f9526c88a03209dbc35a59d90b79e809
Author: Vladik Romanovsky <email address hidden>
Date: Mon May 12 17:24:48 2014 -0400

    Do not fail cell's instance deletion, if it's missing info_cache

    Currently the methods in cell messaging are trying to refresh the
    instance. However, in some corner cases info_cache is not being
    created for instances in ERROR state. This makes the delete
    operation, of such instances, to fail, while it should not.

    Handling the InstanceInfoCacheNotFound exception and not
    re-raising it, for delete operations.

    Closes-Bug: #1316373
    Change-Id: I33c33e3ac1180e8293d950d60fb126e325a2c0cf
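
The approach described in the commit message above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual nova patch: FakeInstance, call_compute_api_with_obj, and the compute_api dictionary are hypothetical stand-ins, and the exception class is a local placeholder for nova's InstanceInfoCacheNotFound. It assumes that only the 'delete' method should tolerate a missing info cache.

```python
class InstanceInfoCacheNotFound(Exception):
    """Local placeholder for nova's exception of the same name."""


class FakeInstance:
    """Hypothetical stand-in for an instance object (not nova's class)."""

    def __init__(self, uuid, has_info_cache):
        self.uuid = uuid
        self.has_info_cache = has_info_cache

    def refresh(self):
        # Mimics the failure in the traceback: refreshing the instance
        # also refreshes its info cache, which may not exist.
        if not self.has_info_cache:
            raise InstanceInfoCacheNotFound(
                "Info cache for instance %s could not be found." % self.uuid)


def call_compute_api_with_obj(instance, method, compute_api):
    """Refresh the instance, then dispatch `method` (hypothetical helper)."""
    try:
        instance.refresh()
    except InstanceInfoCacheNotFound:
        # Assumption: a missing info cache is ignored only for deletes;
        # every other operation still sees the error.
        if method != "delete":
            raise
    return compute_api[method](instance)


# Hypothetical compute API: maps method names to callables.
compute_api = {
    "delete": lambda inst: "deleted %s" % inst.uuid,
    "reboot": lambda inst: "rebooted %s" % inst.uuid,
}

broken = FakeInstance("bb07522b", has_info_cache=False)
print(call_compute_api_with_obj(broken, "delete", compute_api))
```

With this shape, deleting an errored instance that has no info cache goes through, while any other operation on the same instance still raises InstanceInfoCacheNotFound.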

Changed in nova:
status: In Progress → Fix Committed
tags: added: icehouse-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/113208

Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-3
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/113208
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=82cc3be42fbcb7b3088d15ed15af520ae3fa0cec
Submitter: Jenkins
Branch: stable/icehouse

commit 82cc3be42fbcb7b3088d15ed15af520ae3fa0cec
Author: Vladik Romanovsky <email address hidden>
Date: Mon May 12 17:24:48 2014 -0400

    Do not fail cell's instance deletion, if it's missing info_cache

    Currently the methods in cell messaging are trying to refresh the
    instance. However, in some corner cases info_cache is not being
    created for instances in ERROR state. This makes the delete
    operation, of such instances, to fail, while it should not.

    Handling the InstanceInfoCacheNotFound exception and not
    re-raising it, for delete operations.

    (cherry picked from commit 47898ba8f9526c88a03209dbc35a59d90b79e809)

    Conflicts:
            nova/tests/cells/test_cells_messaging.py

    Closes-Bug: #1316373
    Change-Id: I33c33e3ac1180e8293d950d60fb126e325a2c0cf

tags: added: in-stable-icehouse
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-3 → 2014.2