Instance's XXX_resize dir never be deleted if we resize a pre-grizzly instance in havana

Bug #1290294 reported by wangpan
This bug affects 1 person
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  Undecided   wangpan
Icehouse                  Fix Released  Undecided   wangpan

Bug Description

Steps to reproduce:
1. Create an instance under Folsom.
2. Upgrade nova to Havana.
3. Resize the instance to another host.
4. Confirm the resize.
5. Examine the instance dir on the source host.

The instance-0000xxxx_resize dir still exists there; it was not deleted when the resize was confirmed.

The reason is in _cleanup_resize in the libvirt driver:
def _cleanup_resize(self, instance, network_info):
        target = libvirt_utils.get_instance_path(instance) + "_resize"

We get the instance path from the get_instance_path method in libvirt utils,
which checks whether the pre-grizzly instance dir exists before returning it.
If the instance has been resized, its original instance dir exists only on another host (the dest host),
so on the source host the check fails and the wrong, uuid-based instance path is returned.
The existence check on `target` then fails,
and the instance-xxxx_resize dir is never deleted.

def get_instance_path(instance, forceold=False, relative=False):
    """Determine the correct path for instance storage.

    This method determines the directory name for instance storage, while
    handling the fact that we changed the naming style to something more
    unique in the grizzly release.

    :param instance: the instance we want a path for
    :param forceold: force the use of the pre-grizzly format
    :param relative: if True, just the relative path is returned

    :returns: a path to store information about that instance
    """
    pre_grizzly_name = os.path.join(CONF.instances_path, instance['name'])
    # Here we check the original instance dir, but if the instance has been
    # resized to another host, this check fails on the source host and a
    # wrong dir based on the instance uuid is returned.
    if forceold or os.path.exists(pre_grizzly_name):
        if relative:
            return instance['name']
        return pre_grizzly_name

    if relative:
        return instance['uuid']
    return os.path.join(CONF.instances_path, instance['uuid'])
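
The failure can be reproduced outside nova with a small simulation. This is a minimal sketch, not nova code: `instances_path` is passed as an argument instead of being read from `CONF`, the instance name and uuid are hypothetical, and a temporary directory stands in for the source host after the resize.

```python
import os
import tempfile


def get_instance_path(instances_path, instance, forceold=False):
    """Simplified stand-in for libvirt_utils.get_instance_path."""
    pre_grizzly_name = os.path.join(instances_path, instance['name'])
    if forceold or os.path.exists(pre_grizzly_name):
        return pre_grizzly_name
    return os.path.join(instances_path, instance['uuid'])


def simulate_source_host_after_resize():
    # Hypothetical instance record; only 'name' and 'uuid' are used.
    instance = {'name': 'instance-00000001', 'uuid': 'hypothetical-uuid'}
    instances_path = tempfile.mkdtemp()
    # On the source host only the backup dir is left: the original
    # pre-grizzly dir was renamed to <name>_resize during the resize.
    backup_dir = os.path.join(instances_path, instance['name'] + '_resize')
    os.mkdir(backup_dir)

    # Same computation as in _cleanup_resize.
    target = get_instance_path(instances_path, instance) + '_resize'
    return target, backup_dir


target, backup_dir = simulate_source_host_after_resize()
print(os.path.basename(target))    # name of the dir the cleanup looks for
print(os.path.exists(target))      # whether the cleanup finds it
print(os.path.exists(backup_dir))  # whether the real backup dir remains
```

The computed `target` is built from the uuid, so it does not exist, while the name-based backup dir is still on disk.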

wangpan (hzwangpan)
summary: Instance's XXX_resize dir never be deleted if we resize a pre-grizzly
- instance
+ instance in havana
wangpan (hzwangpan) wrote :

This is a terrible bug: if we resize this instance several times between two hosts, the qcow2 conversion is not performed, and the remote disk copied by ssh/rsync is a qcow2 image that still has a backing file.
Consider this sequence:
1. hostA has an instance-a_resize dir (the residual dir left after the first resize).
2. hostA also has the instance-a dir, the running instance's dir, whose disk file is a whole qcow2 image without a backing file.
3. If we resize instance-a to hostB, we first mv instance-a to instance-a_resize. Because the instance-a_resize dir already exists, instance-a is moved into it (instance-a_resize now contains an instance-a dir), so the instance's dir layout is wrong.
4. nova believes instance-a's disk is a whole qcow2 file without a backing file (it checks this before the instance dir is moved), so it copies the wrong disk, the residual one with a backing file, and the instance goes to ERROR if the backing file does not exist on the dest hostB.
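
Step 3 can be illustrated with a short sketch. It assumes the directory is renamed with `mv` semantics; `shutil.move` stands in for that here, since it also moves the source inside the destination when the destination directory already exists. The directory names, file contents, and the helper `move_to_resize_backup` are hypothetical.

```python
import os
import shutil
import tempfile


def move_to_resize_backup(instances_path, name):
    """Move <name> to <name>_resize the way `mv` would."""
    src = os.path.join(instances_path, name)
    dst = src + '_resize'
    shutil.move(src, dst)
    return dst


instances_path = tempfile.mkdtemp()
name = 'instance-a'

# Residual backup dir left over from the first resize (stale disk).
stale = os.path.join(instances_path, name + '_resize')
os.mkdir(stale)
with open(os.path.join(stale, 'disk'), 'w') as f:
    f.write('stale qcow2 with backing file')

# Current instance dir with a flattened disk.
current = os.path.join(instances_path, name)
os.mkdir(current)
with open(os.path.join(current, 'disk'), 'w') as f:
    f.write('whole qcow2 without backing file')

backup = move_to_resize_backup(instances_path, name)

# The current dir ended up nested inside the stale backup dir, and the
# disk found directly under the backup dir is still the stale one.
nested = os.path.join(backup, name)
print(os.path.isdir(nested))
with open(os.path.join(backup, 'disk')) as f:
    print(f.read())
```

Anything that later reads `instance-a_resize/disk` therefore picks up the residual disk rather than the current one.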

Changed in nova:
assignee: nobody → wangpan (hzwangpan)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/79288

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/79288
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b4964eb6a570e290545f95d45411dc8441985cd5
Submitter: Jenkins
Branch: master

commit b4964eb6a570e290545f95d45411dc8441985cd5
Author: Wangpan <email address hidden>
Date: Mon Mar 10 18:19:40 2014 +0800

    libvirt: return the correct instance path while cleanup_resize

    If we resized a pre-grizzly instance with grizzly or later nova
    to another host, while the resize confirmation process,
    _cleanup_resize will find the instance resize backup dir and
    delete it, but a wrong xxx_resize dir like ${uuid}_resize,
    instead of the correct ${name}_resize will be found.
    This is because the instance is a resized one which original
    instance dir exists on another host(the dest host),
    get_instance_path method could not find the original instance
    dir on the source host, so the path with uuid will be returned,
    and the `target` existing check in _cleanup_resize is failed,
    then the ${name}_resize dir will never be deleted.

    Closes-bug: #1290294
    Change-Id: I904b6751dec740e001f5ec29f637ef456528746f
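
One way to make the cleanup find the right directory is to try the pre-grizzly name first and fall back to the normal lookup. The sketch below illustrates that idea only; it may differ from the merged change. `get_instance_path` is a simplified stand-in that takes `instances_path` as an argument, and `cleanup_resize` and the instance values are hypothetical.

```python
import os
import shutil
import tempfile


def get_instance_path(instances_path, instance, forceold=False):
    """Simplified stand-in for libvirt_utils.get_instance_path."""
    pre_grizzly_name = os.path.join(instances_path, instance['name'])
    if forceold or os.path.exists(pre_grizzly_name):
        return pre_grizzly_name
    return os.path.join(instances_path, instance['uuid'])


def cleanup_resize(instances_path, instance):
    """Delete the resize backup dir, trying the pre-grizzly name first."""
    target = get_instance_path(instances_path, instance,
                               forceold=True) + '_resize'
    if not os.path.exists(target):
        # Fall back to the normal lookup for grizzly-or-later instances.
        target = get_instance_path(instances_path, instance) + '_resize'
    if os.path.exists(target):
        shutil.rmtree(target)
        return target
    return None


# Source host after a resize: only <name>_resize is left on disk.
instance = {'name': 'instance-00000001', 'uuid': 'hypothetical-uuid'}
instances_path = tempfile.mkdtemp()
backup_dir = os.path.join(instances_path, instance['name'] + '_resize')
os.mkdir(backup_dir)

removed = cleanup_resize(instances_path, instance)
print(removed == backup_dir)
print(os.path.exists(backup_dir))
```

Forcing the old-style name avoids depending on whether the original instance dir still exists on the source host.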

Changed in nova:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/104050

Changed in nova:
milestone: none → juno-2
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/104050
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=aeb71a88ae8d05ff6f5f3f092965f12369fec07a
Submitter: Jenkins
Branch: stable/icehouse

commit aeb71a88ae8d05ff6f5f3f092965f12369fec07a
Author: Wangpan <email address hidden>
Date: Mon Mar 10 18:19:40 2014 +0800

    libvirt: return the correct instance path while cleanup_resize

    If we resized a pre-grizzly instance with grizzly or later nova
    to another host, while the resize confirmation process,
    _cleanup_resize will find the instance resize backup dir and
    delete it, but a wrong xxx_resize dir like ${uuid}_resize,
    instead of the correct ${name}_resize will be found.
    This is because the instance is a resized one which original
    instance dir exists on another host(the dest host),
    get_instance_path method could not find the original instance
    dir on the source host, so the path with uuid will be returned,
    and the `target` existing check in _cleanup_resize is failed,
    then the ${name}_resize dir will never be deleted.

    Closes-bug: #1290294
    Change-Id: I904b6751dec740e001f5ec29f637ef456528746f
    (cherry picked from b4964eb6a570e290545f95d45411dc8441985cd5)

tags: added: in-stable-icehouse
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-2 → 2014.2