udev NIC renaming race with mlx5_core driver

Bug #2002445 reported by Nick Rosbrook
Affects             Status        Importance  Assigned to    Milestone
systemd (Ubuntu)    Fix Released  Undecided   Nick Rosbrook
Focal               Fix Released  Medium      Nick Rosbrook
Jammy               Fix Released  Medium      Nick Rosbrook
Kinetic             Fix Released  Undecided   Nick Rosbrook
Lunar               Fix Released  Undecided   Nick Rosbrook

Bug Description

[Impact]
On systems with Mellanox NICs, udev's NIC renaming races with the mlx5_core driver's own configuration of subordinate interfaces. When the kernel wins this race, the device cannot be renamed as udev has attempted, and this causes systemd-networkd-wait-online.service to time out waiting for links to be configured. As a result, network-online.target is delayed and boot takes about 2 minutes longer.

[Test Plan]
Repeated launches of the Standard_D8ds_v5 instance type generally hit this race in around 1 in 10 runs. Create a VM snapshot with the updated systemd from ppa:enr0n/systemd-245. Launch 100 Standard_D8ds_v5 instances with the updated systemd. Assert that cloud-init status reports no failure and that there is no 2-minute delay in network-online.target.
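
As a rough sanity check on the sample size, the sketch below estimates how likely a batch of 100 launches is to exercise the race at all. It assumes the launches are independent and that the "around 1 in 10" rate holds; both are simplifying assumptions rather than measured facts, and the function names are illustrative.

```python
def prob_at_least_one_hit(hit_rate: float, launches: int) -> float:
    """Chance that at least one of `launches` independent boots hits the race."""
    return 1.0 - (1.0 - hit_rate) ** launches


def expected_hits(hit_rate: float, launches: int) -> float:
    """Average number of boots expected to hit the race."""
    return hit_rate * launches


if __name__ == "__main__":
    # 0.1 is the "around 1 in 10" rate quoted in the test plan (an estimate).
    print(f"P(at least one hit in 100) = {prob_at_least_one_hit(0.1, 100):.5f}")
    print(f"Expected hits in 100 launches = {expected_hits(0.1, 100):.0f}")
```

Under these assumptions the chance of never seeing the race in 100 launches is well below 0.01%, so a run with zero network delays is unlikely to be explained by the race simply not occurring.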

To check for failure symptom:
  - Assert that systemd-networkd-wait-online.service is not the slowest unit reported by systemd-analyze blame.
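
A minimal sketch of this check, assuming that `systemd-analyze blame` prints one unit per line, slowest first, with the unit name as the last whitespace-separated token. The helper names and the sample text are hypothetical; the full script further down does the same thing by inspecting the first blame line.

```python
WAIT_ONLINE_UNIT = "systemd-networkd-wait-online.service"


def top_blame_unit(blame_output: str) -> str:
    """Return the unit named on the first non-empty line of blame output."""
    for line in blame_output.splitlines():
        tokens = line.split()
        if tokens:
            # The time comes first ("2min 215ms"), the unit name comes last.
            return tokens[-1]
    return ""


def wait_online_is_slowest(blame_output: str) -> bool:
    """True when the wait-online service tops the blame list (failure symptom)."""
    return top_blame_unit(blame_output) == WAIT_ONLINE_UNIT


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample = (
        "2min 215ms systemd-networkd-wait-online.service\n"
        "     1.204s cloud-init.service\n"
    )
    print(wait_online_is_slowest(sample))
```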

To assert the success condition when the rename race occurs:
  - Assert that when "eth1" is still the primary device name, two altnames are listed (the altnames are preserved because the rename of the primary NIC hit the race).
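
The altname check can be sketched on its own as follows. It assumes the JSON shape produced by `ip -j addr` (a list of objects with an `ifname` key and an optional `altnames` list), which is what the full script relies on; the function name and the sample interface names are hypothetical.

```python
import json
from typing import Optional


def altnames_preserved(ip_json: str, ifname: str = "eth1") -> Optional[bool]:
    """Check altnames on an interface that kept its kernel name.

    Returns None when `ifname` is absent (the rename succeeded, so the race
    was not hit), True when at least two altnames are listed, else False.
    """
    for link in json.loads(ip_json):
        if link.get("ifname") == ifname:
            return len(link.get("altnames", [])) >= 2
    return None


if __name__ == "__main__":
    # Hypothetical `ip -j addr` output, trimmed to the relevant keys.
    sample = json.dumps(
        [
            {"ifname": "lo"},
            {"ifname": "eth0"},
            {"ifname": "eth1", "altnames": ["enP1s1", "enP1p0s2"]},
        ]
    )
    print(altnames_preserved(sample))
```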

The sample script below uses pycloudlib to create a modified base image for the test, then launches 100 VMs of type Standard_D8ds_v5, counting both the successes and any failures seen.

#!/usr/bin/env python3
# This file is part of pycloudlib. See LICENSE file for license information.
"""Basic examples of various lifecycle with an Azure instance."""

import json
import logging
import os
import sys
from enum import Enum

import pycloudlib

LOG = logging.getLogger()

base_cfg = """#cloud-config
ssh-import-id: [chad.smith, enr0n, falcojr, holmanb, aciba]
"""

# source: "deb [allow-insecure=yes] https://ppa.launchpadcontent.net/enr0n/systemd-245/ubuntu focal main"
# - apt install systemd udev -y --allow-unauthenticated

apt_cfg = """
# Add developer PPA
apt:
 sources:
   systemd-testing:
     source: {source}
# upgrade systemd after cloud-init is nearly done
runcmd:
 - apt install systemd udev -y --allow-unauthenticated
"""

debug_systemd_cfg = """
# Create systemd-udev debug override.conf in base image
write_files:
- path: /etc/systemd/system/systemd-networkd.service.d/override.conf
  owner: root:root
  defer: {defer}
  content: |
    [Service]
    Environment=SYSTEMD_LOG_LEVEL=debug

- path: /etc/systemd/system/systemd-udevd.service.d/override.conf
  owner: root:root
  defer: {defer}
  content: |
    [Service]
    Environment=SYSTEMD_LOG_LEVEL=debug
    LogRateLimitIntervalSec=0
"""

cloud_config = base_cfg + apt_cfg + debug_systemd_cfg
cloud_config2 = base_cfg + debug_systemd_cfg

class BootCondition(Enum):
    SUCCESS_WITHOUT_RENAME_RACE = "network bringup success without rename race"
    SUCCESS_WITH_RENAME_RACE = "network bringup success rename race condition"
    ERROR_NETWORK_TIMEOUT = "error: timeout on systemd-networkd-wait-online"

def batch_launch_vm(
    client, instance_type, image_id, user_data, instance_count=5
):
    instances = []
    while len(instances) < instance_count:
        instances.append(
            client.launch(
                image_id=image_id,
                instance_type=instance_type,
                user_data=user_data,
            )
        )
    return instances

def get_boot_condition(test_idx, instance):
    blame = instance.execute("systemd-analyze blame").splitlines()
    try:
        LOG.info(
            f"--- Attempt {test_idx} ssh ubuntu@{instance.ip} Blame: {blame[0]}"
        )
    except IndexError:
        LOG.warning(f"--- Attempt {test_idx} Empty blame {blame}?")
        LOG.info(instance.execute("systemd-analyze blame"))
        blame = [""]
    altnames_persisted = False
    ip_addr = json.loads(instance.execute("ip -j addr").stdout)
    rename_race_present = False # set true when we see eth1 not renamed
    for d in ip_addr:
        if d["ifname"] == "eth1":
            rename_race_present = True
            if len(d.get("altnames", [])) > 1:
                LOG.info(
                    f"--- SUCCESS persisting altnames {d['altnames']} due to rename race on resource busy on {d['ifname']}"
                )
                altnames_persisted = True
            else:
                LOG.error(
                    f"FAILURE: to preserve altnames for {d['ifname']}. Only preserved {d.get('altnames', [])}"
                )
                LOG.info(
                    instance.execute(
                        "journalctl -u systemd-udevd.service -b 0 --no-pager"
                    )
                )
    LOG.info(
        "\n".join([f'{d["ifname"]}: {d.get("altnames")}' for d in ip_addr])
    )
    if "systemd-networkd-wait-online.service" not in blame[0]:
        if rename_race_present:
            return BootCondition.SUCCESS_WITH_RENAME_RACE, altnames_persisted
        else:
            LOG.info(f"Destroying instance, normal boot seen: {blame[0]}")
            return (
                BootCondition.SUCCESS_WITHOUT_RENAME_RACE,
                altnames_persisted,
            )
    else:
        LOG.info(
            f"--- Attempt {test_idx} found delayed instance boot: {blame[0]}: ssh ubuntu@{instance.ip}"
        )
        r = instance.execute(
            "journalctl -u systemd-udevd.service -b 0 --no-pager"
        )
        LOG.info(r)
        if "Failure to rename" in str(r):
            LOG.info(f"Found rename refusal!: {r[0]}")
        return BootCondition.ERROR_NETWORK_TIMEOUT, altnames_persisted

def debug_systemd_image_launch_overlake_v5_with_snapshot(
    release="jammy", with_ppa=False
):
    """Test overlake v5 timeouts

    test procedure:
    - Launch base jammy image
    - enable ppa:enr0n/systemd-245 and systemd/udev debugging
    - cloud-init clean --logs && deconfigure walinuxagent before shutdown
    - snapshot a base image
    - launch v5 system from snapshot
    - check systemd-analyze for expected timeout
    """
    apt_source = (
        '"deb http://archive.ubuntu.com/ubuntu $RELEASE-proposed main"'
    )
    if with_ppa:
        apt_source = '"deb [allow-insecure=yes] https://ppa.launchpadcontent.net/enr0n/{ppa}/ubuntu $RELEASE main"'
        ppas = {
            "focal": "systemd-245",
            "jammy": "systemd-249",
            "kinetic": "systemd-251",
        }
        apt_source = apt_source.format(ppa=ppas.get(release, "systemd"))

    client = pycloudlib.Azure(tag="azure")

    image_id = client.daily_image(release=release)
    pub_path = "/home/ubuntu/.ssh/id_rsa.pub"
    priv_path = "/home/ubuntu/.ssh/id_rsa"

    client.use_key(pub_path, priv_path)

    base_instance = client.launch(
        image_id=image_id,
        instance_type="Standard_DS1_v2",
        user_data=cloud_config.format(defer="true", source=apt_source),
    )

    LOG.info(f"base instance: ssh ubuntu@{base_instance.ip}")
    base_instance.wait()
    LOG.info(base_instance.execute("apt policy systemd"))
    snapshotted_image_id = client.snapshot(base_instance)

    reproducer = False
    success_count_with_race = 0
    success_count_no_race = 0
    failure_count_network_delay = 0
    failure_count_no_altnames = 0
    tests_launched = 0
    TEST_SUMMARY_TMPL = """
    ----- Test run complete: {tests_launched} attempted -----
    Successes without rename race: {success_count_no_race}
    Successes with rename race and preserved altname: {success_count_with_race}
    Failures due to network delay: {failure_count_network_delay}
    Failures due to no altnames persisted: {failure_count_no_altnames}
    ===================================
    """
    instances = [base_instance]
    for batch_count in [10] * 10:
        test_instances = batch_launch_vm(
            client=client,
            image_id=snapshotted_image_id,
            instance_type="Standard_D8ds_v5",
            user_data=cloud_config.format(defer="false", source=apt_source),
            instance_count=batch_count,
        )
        for test_idx, instance in enumerate(test_instances, tests_launched):
            LOG.info(f"--- Attempt {test_idx} ssh ubuntu@{instance.ip}")
            instance.wait()
            boot_condition, altnames_persisted = get_boot_condition(
                test_idx, instance
            )
            if boot_condition == BootCondition.SUCCESS_WITH_RENAME_RACE:
                instance.delete(wait=False)
                success_count_with_race += 1
                if not altnames_persisted:
                    failure_count_no_altnames += 1
            elif boot_condition == BootCondition.SUCCESS_WITHOUT_RENAME_RACE:
                instance.delete(wait=False)
                success_count_no_race += 1
                if not altnames_persisted:
                    failure_count_no_altnames += 1
            elif boot_condition == BootCondition.ERROR_NETWORK_TIMEOUT:
                instances.append(instance)
                failure_count_network_delay += 1
                if not altnames_persisted:
                    failure_count_no_altnames += 1
            else:
                raise RuntimeError(f"Invalid boot condition: {boot_condition}")
        tests_launched += len(test_instances)
    LOG.info(
        TEST_SUMMARY_TMPL.format(
            success_count_with_race=success_count_with_race,
            success_count_no_race=success_count_no_race,
            failure_count_network_delay=failure_count_network_delay,
            failure_count_no_altnames=failure_count_no_altnames,
            tests_launched=tests_launched,
        )
    )
    base_instance.delete(wait=False)

if __name__ == "__main__":
    # Avoid polluting the log with azure info
    logging.getLogger("paramiko").setLevel(logging.WARNING)
    logging.getLogger("pycloudlib").setLevel(logging.WARNING)
    logging.getLogger("adal-python").setLevel(logging.WARNING)
    logging.getLogger("cli.azure.cli.core").setLevel(logging.WARNING)
    release = "jammy" if len(sys.argv) < 2 else sys.argv[1]
    with_ppa = os.environ.get("WITH_PPA", "").lower() in ["y", "true", "1"]
    prefix = "ppa" if with_ppa else "sru"
    logging.basicConfig(
        filename=f"{prefix}-systemd-{release}.log", level=logging.INFO
    )
    debug_systemd_image_launch_overlake_v5_with_snapshot(release, with_ppa)

[Where problems could occur]
The patches effectively make it so that if an interface cannot be renamed by udev, the new name is left as an alternative name as a fallback. If problems occur, they would be related to device renaming, and particularly to the device's alternative names.
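
The fallback behaviour described above can be illustrated with a toy model. This is not systemd's code: the `Link` class, the `busy` flag standing in for the kernel refusing the rename, and the interface names are all hypothetical, and the real patches operate over netlink.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    """Toy stand-in for a network interface (hypothetical model)."""
    name: str
    altnames: List[str] = field(default_factory=list)
    busy: bool = False  # models the kernel refusing a rename


def rename_with_fallback(link: Link, new_name: str) -> bool:
    """Try to rename `link`; on failure keep `new_name` as an altname."""
    # Register the desired name as an alternative name first, so it is
    # usable even if the rename itself is refused.
    if new_name not in link.altnames:
        link.altnames.append(new_name)
    if link.busy:
        # Rename refused: the primary name is unchanged, the altname stays.
        return False
    # Rename accepted: the new name becomes primary and is no longer an altname.
    link.altnames.remove(new_name)
    link.name = new_name
    return True


if __name__ == "__main__":
    raced = Link(name="eth1", busy=True)
    print(rename_with_fallback(raced, "enP1s1"), raced)
```

In this model a refused rename still leaves the interface reachable under the intended name, which is the property the test plan checks through the altnames list.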

For Jammy and Kinetic, there are additional patches in udev. These patches clean up/revert device properties that were changed as a part of the rename attempt. If there were regressions due to these patches, we would likely see erroneous device properties (e.g. shown by udevadm info) on network devices after a rename failure.
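
One way to look for such stale properties is to compare the output of `udevadm info --query=property` with the interface's current kernel name. The sketch below assumes the usual KEY=VALUE output format, and assumes that INTERFACE and the last component of DEVPATH should both equal the kernel name after a failed rename; that consistency rule is an illustrative assumption, not something taken from the patches, and the helper names are hypothetical.

```python
from typing import Dict


def parse_udev_properties(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines as printed by `udevadm info --query=property`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value
    return props


def properties_match_ifname(props: Dict[str, str], ifname: str) -> bool:
    """True when INTERFACE and the DEVPATH basename both equal `ifname`."""
    devpath_name = props.get("DEVPATH", "").rsplit("/", 1)[-1]
    return props.get("INTERFACE") == ifname and devpath_name == ifname


if __name__ == "__main__":
    # Hypothetical property dump for an interface that kept the name eth1.
    sample = "DEVPATH=/devices/virtual/net/eth1\nINTERFACE=eth1\nIFINDEX=3\n"
    print(properties_match_ifname(parse_udev_properties(sample), "eth1"))
```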


Nick Rosbrook (enr0n)
description: updated
Changed in systemd (Ubuntu Focal):
status: New → Triaged
importance: Undecided → Medium
Changed in systemd (Ubuntu Jammy):
status: New → Triaged
importance: Undecided → Medium
Lukas Märdian (slyon) wrote :
Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Lunar):
status: New → Fix Committed
Chad Smith (chad.smith)
description: updated
description: updated
Chad Smith (chad.smith)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 252.5-2ubuntu1

---------------
systemd (252.5-2ubuntu1) lunar; urgency=medium

  * Merge 252.5-2 from Debian unstable
    - Drop test-handle-Debian-s-etc-default-locale-in-testsuite-74.f.patch.
      Applied upstream: https://github.com/systemd/systemd/commit/9b42646b22
      File: debian/patches/test-handle-Debian-s-etc-default-locale-in-testsuite-74.f.patch
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=1b0789416172ec60d8086fe2b458b5396bb7e857
    - Drop test-make-sure-mount-point-exists-in-testsuite-64.sh.patch.
      Applied upstream: https://github.com/systemd/systemd/commit/07e4787106
      File: debian/patches/test-make-sure-mount-point-exists-in-testsuite-64.sh.patch
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=f97b2d5ae1a1f35668c4648f1c7fc715a588de50
    - Drop test-remove-no-longer-needed-quirk-for-set-locale-on-Debi.patch.
      Fixed upstream: https://github.com/systemd/systemd-stable/commit/1c325f6d7f
      File: debian/patches/test-remove-no-longer-needed-quirk-for-set-locale-on-Debi.patch
      https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=5f85226d61393c08d7ea51c2f28db7fd4c79bcc6
  * udev: avoid NIC renaming race with kernel (LP: #2002445)
    Files:
    - debian/patches/lp2002445-sd-netlink-add-a-test-for-rtnl_set_link_name.patch
    - debian/patches/lp2002445-sd-netlink-do-not-swap-old-name-and-alternative-name.patch
    - debian/patches/lp2002445-sd-netlink-restore-altname-on-error-in-rtnl_set_link_name.patch
    - debian/patches/lp2002445-test-network-add-a-test-for-renaming-device-to-current-al.patch
    - debian/patches/lp2002445-udev-attempt-device-rename-even-if-interface-is-up.patch
    - debian/patches/lp2002445-udev-net-allow-new-link-name-as-an-altname-before-renamin.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=58d29c2b376f03c44ed5a719877c95b332018cdc
  * Deny-list TEST-74-AUX-UTILS on s390x.
    Since this currently is only known to fail on the autopkgtest
    infrastructure, we believe this is a temporary issue.
    File: debian/patches/Deny-list-TEST-74-AUX-UTILS-on-s390x.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=a3a059d86e2fe3a104419ae2afcab557171f9809

systemd (252.5-2) unstable; urgency=medium

  * Fix boot-and-services autopkgtest.

systemd (252.5-1) unstable; urgency=medium

  [ Nick Rosbrook ]
  * debian/tests: remove systemd-fsckd autopkgtest. This test never runs
    in Debian autopkgtest because of missing machine isolation
    requirements, and it never runs in Ubuntu because: SKIP: root file
    system is being checked by initramfs already. Since the test is not
    providing any good feedback, and generally has not been maintained,
    let's just remove it.

  [ Luca Boccassi ]
  * New upstream version 252.5
  * Drop patches merged in v252.5
  * Refresh patches
  * Set default status format to 'combined': show both unit name and
    description in logs/boot messages

systemd (252.4-2) unstable; urgency=medium

  [ Michael Biebl ]
  * Refresh patches
  * Tweak descript...


Changed in systemd (Ubuntu Lunar):
status: Fix Committed → Fix Released
Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Kinetic):
status: New → Triaged
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Nick, or anyone else affected,

Accepted systemd into kinetic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/251.4-1ubuntu7.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-kinetic to verification-done-kinetic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-kinetic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Kinetic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-kinetic
Steve Langasek (vorlon) wrote :

Hello Nick, or anyone else affected,

Accepted systemd into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/249.11-0ubuntu3.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Jammy):
status: Triaged → Fix Committed
tags: added: verification-needed-jammy
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/251.4-1ubuntu7.2)

All autopkgtests for the newly accepted systemd (251.4-1ubuntu7.2) for kinetic have finished running.
The following regressions have been reported in tests triggered by the package:

openrazer/3.4.0+dfsg-1ubuntu1 (i386)
asterisk/1:18.14.0~dfsg+~cs6.12.40431414-1 (arm64)
network-manager/1.40.0-1ubuntu2 (amd64)
gvfs/1.50.2-2 (amd64)
freedombox/unknown (armhf)
php8.1/unknown (s390x)
lemonldap-ng/2.0.14+ds-1 (s390x, arm64)
udisks2/2.9.4-3 (amd64)
mandos/1.8.15-1 (arm64)
libsoup3/3.2.0-1 (ppc64el)
postgresql-common/242ubuntu1 (s390x)
nftables/1.0.5-1 (arm64)
mutter/43.0-1ubuntu4 (amd64)
ostree/2022.5-3 (s390x)
dropbear/2022.82-4 (armhf)
lighttpd/1.4.65-2ubuntu1.1 (s390x)
exim4/4.96-3ubuntu1.1 (ppc64el)
resource-agents/unknown (amd64)
liblinux-systemd-perl/1.201600-3build1 (s390x, arm64)
tinyssh/20190101-1ubuntu1 (s390x)
linux-lowlatency/5.19.0-1020.21 (amd64, arm64)
mediawiki/1:1.35.7-1 (s390x)
linux-oem-5.17/5.17.0-1003.3 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/kinetic/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/249.11-0ubuntu3.8)

All autopkgtests for the newly accepted systemd (249.11-0ubuntu3.8) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

linux-intel-iotg/5.15.0-1027.32 (amd64)
linux-lowlatency/5.15.0-68.75 (arm64, amd64)
systemd/249.11-0ubuntu3.8 (arm64, ppc64el)
openrazer/3.2.0+dfsg-3 (i386)
multipath-tools/0.8.8-1ubuntu1.22.04.1 (s390x)
netplan.io/0.105-0ubuntu2~22.04.1 (amd64)
fwupd/1.7.9-1~22.04.1 (armhf)
udisks2/2.9.4-1ubuntu2 (amd64)
zfs-linux/unknown (s390x)
dbus/1.12.20-2ubuntu4.1 (s390x)
linux-ibm/5.15.0-1027.30 (amd64)
samba/unknown (s390x)
tang/11-1 (arm64)
gvfs/1.48.2-0ubuntu1 (arm64)
php8.1/unknown (s390x)
linux-nvidia/5.15.0-1018.18 (amd64)
csync2/2.0-25-gc0faaf9-1 (s390x)
tinyssh/20190101-1ubuntu1 (s390x)
mosquitto/2.0.11-1ubuntu1 (amd64)
casync/2+20201210-1build1 (ppc64el)
linux-gke/5.15.0-1029.34 (arm64, amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Nick Rosbrook (enr0n) wrote :

The fix is incomplete for jammy and newer, and we will need additional patches from upstream.

tags: added: verification-failed-jammy verification-failed-kinetic
removed: verification-needed-jammy verification-needed-kinetic
Changed in systemd (Ubuntu Lunar):
status: Fix Released → In Progress
Changed in systemd (Ubuntu Jammy):
status: Fix Committed → In Progress
Changed in systemd (Ubuntu Kinetic):
status: Fix Committed → In Progress
Changed in systemd (Ubuntu Jammy):
assignee: nobody → Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Kinetic):
assignee: nobody → Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Lunar):
assignee: nobody → Nick Rosbrook (enr0n)
Nick Rosbrook (enr0n)
description: updated
Lukas Märdian (slyon)
Changed in systemd (Ubuntu Lunar):
status: In Progress → Fix Committed
Changed in systemd (Ubuntu Focal):
assignee: nobody → Mustafa Kemal Gilor (mustafakemalgilor)
status: Triaged → In Progress
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Nick, or anyone else affected,

Accepted systemd into kinetic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/251.4-1ubuntu7.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-kinetic to verification-done-kinetic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-kinetic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Kinetic):
status: In Progress → Fix Committed
tags: added: verification-needed-kinetic
removed: verification-failed-kinetic
Changed in systemd (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed-jammy
removed: verification-failed-jammy
Timo Aaltonen (tjaalton) wrote :

Hello Nick, or anyone else affected,

Accepted systemd into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/249.11-0ubuntu3.9 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu Focal):
assignee: Mustafa Kemal Gilor (mustafakemalgilor) → Nick Rosbrook (enr0n)
Chad Smith (chad.smith) wrote :

Test script referenced in the bug description
https://paste.ubuntu.com/p/Xkc7bSZ8fB/

description: updated
Chad Smith (chad.smith) wrote :

Jammy: success for launches from -proposed. The rename race was seen and handled properly, with no boot delay caused by systemd-networkd-wait-online.service.

 *** 249.11-0ubuntu3.9 500
        500 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status

    ----- Test run complete: 100 attempted -----
    Successes without rename race: 93
    Successes with rename race and preserved altname: 7
    Failures due to network delay: 0
    Failures due to no altnames persisted: 93
    ===================================
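
When reading this summary, note that the script as posted increments the "no altnames persisted" counter for every boot where the race did not occur as well, because altnames are only inspected when eth1 keeps its kernel name. The 93 above therefore mirrors the 93 race-free boots. The sketch below parses the summary and subtracts that baseline; the field keys and helper names are illustrative, and the subtraction is only valid under the counting logic of the posted script.

```python
import re

# Summary text copied from the comment above.
SUMMARY = """
    ----- Test run complete: 100 attempted -----
    Successes without rename race: 93
    Successes with rename race and preserved altname: 7
    Failures due to network delay: 0
    Failures due to no altnames persisted: 93
"""

PATTERNS = {
    "attempted": r"Test run complete: (\d+) attempted",
    "no_race": r"Successes without rename race: (\d+)",
    "with_race": r"Successes with rename race and preserved altname: (\d+)",
    "network_delay": r"Failures due to network delay: (\d+)",
    "no_altnames": r"Failures due to no altnames persisted: (\d+)",
}


def parse_summary(text: str) -> dict:
    """Extract the integer counters from a test summary block."""
    return {key: int(re.search(pat, text).group(1)) for key, pat in PATTERNS.items()}


def altname_failures_when_raced(counts: dict) -> int:
    """Altname failures left after removing the race-free baseline."""
    return counts["no_altnames"] - counts["no_race"]


if __name__ == "__main__":
    counts = parse_summary(SUMMARY)
    print(counts)
    print("altname failures among raced boots:", altname_failures_when_raced(counts))
    print("observed race rate:", counts["with_race"] / counts["attempted"])
```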

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 252.5-2ubuntu3

---------------
systemd (252.5-2ubuntu3) lunar; urgency=medium

  * udev: gracefully handle rename failures (LP: #2002445)
    Files:
    - debian/patches/lp2002445/core-device-ignore-failed-uevents.patch
    - debian/patches/lp2002445/sd-device-introduce-device_get_property_int.patch
    - debian/patches/lp2002445/sd-device-make-device_set_syspath-clear-sysname-and-sysnu.patch
    - debian/patches/lp2002445/udev-restore-syspath-and-properties-on-failure.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=79536dbb165dbcc402629684e0911693626df5b1

 -- Nick Rosbrook <email address hidden> Mon, 20 Mar 2023 10:17:24 -0400

Changed in systemd (Ubuntu Lunar):
status: Fix Committed → Fix Released
Chad Smith (chad.smith) wrote :

Kinetic: success for launches from -proposed. The rename race was seen and handled properly, with no boot delay caused by systemd-networkd-wait-online.service.

 *** 251.4-1ubuntu7.3 500
        500 http://archive.ubuntu.com/ubuntu kinetic-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
...
    ----- Test run complete: 100 attempted -----
    Successes without rename race: 81
    Successes with rename race and preserved altname: 19
    Failures due to network delay: 0
    Failures due to no altnames persisted: 81
    ===================================

Łukasz Zemczak (sil2100) wrote :

Hello Nick, or anyone else affected,

Accepted systemd into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/245.4-4ubuntu3.21 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed-focal
Nick Rosbrook (enr0n)
tags: added: verification-done-jammy verification-done-kinetic
removed: verification-needed-jammy verification-needed-kinetic
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/249.11-0ubuntu3.9)

All autopkgtests for the newly accepted systemd (249.11-0ubuntu3.9) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

casync/2+20201210-1build1 (ppc64el)
dpdk/21.11.3-0ubuntu0.22.04.1 (arm64)
fwupd/1.7.9-1~22.04.1 (armhf)
init-system-helpers/unknown (s390x)
initramfs-tools/0.140ubuntu13.1 (amd64)
libsdl2/2.0.20+dfsg-2ubuntu1.22.04.1 (s390x)
libsoup3/3.0.7-0ubuntu1 (s390x)
libvirt/8.0.0-1ubuntu7.4 (s390x)
linux-lowlatency-hwe-5.19/5.19.0-1021.22~22.04.1 (amd64, arm64)
linux-nvidia/5.15.0-1018.18 (amd64)
mutter/42.5-0ubuntu1 (amd64)
netplan.io/0.105-0ubuntu2~22.04.3 (arm64, s390x)
network-manager/1.36.6-0ubuntu2 (s390x)
nextepc/unknown (s390x)
open-iscsi/2.1.5-1ubuntu1 (s390x)
openrazer/3.2.0+dfsg-3 (i386)
pdns-recursor/unknown (s390x)
postgresql-14/unknown (s390x)
pystemd/0.7.0-5build1 (s390x)
samba/unknown (s390x)
strongswan/unknown (s390x)
systemd/249.11-0ubuntu3.9 (s390x)
umockdev/0.17.7-1 (s390x)
zfs-linux/unknown (s390x)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/251.4-1ubuntu7.3)

All autopkgtests for the newly accepted systemd (251.4-1ubuntu7.3) for kinetic have finished running.
The following regressions have been reported in tests triggered by the package:

csync2/2.0-42-g83b3644-1 (s390x)
dpdk/21.11.3-0ubuntu0.22.10.1 (arm64)
initramfs-tools/0.140ubuntu17 (s390x)
lemonldap-ng/2.0.14+ds-1 (s390x)
liblinux-systemd-perl/unknown (s390x)
libsoup3/3.2.0-1 (armhf, s390x)
libvirt/unknown (s390x)
libvirt-dbus/unknown (s390x)
lighttpd/1.4.65-2ubuntu1.1 (s390x)
linux-lowlatency/5.19.0-1021.22 (amd64, arm64)
mutter/43.0-1ubuntu4 (amd64)
netplan.io/0.105-0ubuntu2.2 (s390x)
openrazer/3.4.0+dfsg-1ubuntu1 (i386)
openvpn/unknown (s390x)
procps/unknown (s390x)
prometheus-postfix-exporter/unknown (s390x)
remctl/3.18-1build1 (s390x)
resource-agents/unknown (s390x)
snapd/unknown (s390x)
tinyssh/20190101-1ubuntu1 (arm64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/kinetic/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/245.4-4ubuntu3.21)

All autopkgtests for the newly accepted systemd (245.4-4ubuntu3.21) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

apt/2.0.9 (armhf)
bilibop/unknown (s390x)
freeipa/unknown (s390x)
fwupd/1.7.9-1~20.04.1 (armhf)
libsoup2.4/2.70.0-1 (s390x)
libusb-1.0/unknown (s390x)
libvirt/unknown (s390x)
linux-aws-5.11/blacklisted (amd64, arm64)
linux-aws-5.13/blacklisted (amd64, arm64)
linux-aws-5.15/5.15.0-1033.37~20.04.1 (arm64)
linux-aws-5.15/unknown (amd64)
linux-aws-5.8/blacklisted (amd64)
linux-azure-5.11/blacklisted (amd64, arm64)
linux-azure-5.13/blacklisted (arm64)
linux-azure-5.8/blacklisted (amd64)
linux-azure-cvm/5.4.0-1105.111+cvm1 (amd64)
linux-gcp-5.11/blacklisted (amd64)
linux-gcp-5.13/blacklisted (amd64)
linux-gcp-5.15/5.15.0-1030.37~20.04.1 (amd64)
linux-gcp-5.15/5.15.0-1031.38~20.04.1 (arm64)
linux-gcp-5.8/blacklisted (amd64)
linux-gke-5.15/5.15.0-1029.34~20.04.1 (amd64, arm64)
linux-gkeop-5.15/5.15.0-1016.21~20.04.1 (amd64)
linux-hwe-5.11/blacklisted (amd64, arm64, armhf, ppc64el, s390x)
linux-hwe-5.13/blacklisted (amd64, arm64, armhf, ppc64el, s390x)
linux-hwe-5.15/5.15.0-69.76~20.04.1 (amd64, armhf)
linux-hwe-5.8/blacklisted (amd64, arm64, ppc64el, s390x)
linux-ibm/5.4.0-1046.51 (amd64)
linux-intel-5.13/blacklisted (amd64)
linux-intel-iotg-5.15/5.15.0-1027.32~20.04.1 (amd64)
linux-oem-5.10/blacklisted (amd64)
linux-oem-5.13/blacklisted (amd64)
linux-oem-5.14/5.14.0-1059.67 (amd64)
linux-oem-5.6/blacklisted (amd64)
linux-oracle-5.11/blacklisted (amd64)
linux-oracle-5.13/blacklisted (amd64, arm64)
munin/2.0.56-1ubuntu1 (arm64, ppc64el)
pdns-recursor/unknown (s390x)
php7.4/unknown (s390x)
polkit-qt-1/unknown (s390x)
puppet/5.5.10-4ubuntu3 (s390x)
tinyssh/unknown (s390x)
udisks2/2.8.4-1ubuntu2 (amd64)
upower/0.99.11-1build2 (s390x)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Nick Rosbrook (enr0n) wrote :

I used the test script to verify that 245.4-4ubuntu3.21 from focal-proposed fixes the issue:

    ----- Test run complete: 29 attempted -----
    Successes without rename race: 27
    Successes with rename race and preserved altname: 2
    Failures due to network delay: 0
    Failures due to no altnames persisted: 0
    ===================================

See attached log for full output.

The customer tested systemd and udev from 245.4-4ubuntu3.21 in their own environment and confirmed the fix as well.
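
For reference, the four counters in the summary above could be derived per instance roughly as follows. This is a minimal sketch and not the attached test script: `classify_instance`, `tally`, and the result labels are hypothetical names. It assumes the link data comes from parsing `ip -j link show` output (objects with an `ifname` key and an optional `altnames` list), that the network delay has already been detected separately (for example from systemd-analyze blame), and that "eth1" is the name left behind when the rename race is hit, as described in the test plan.

```python
import json
from collections import Counter

# Hypothetical labels mirroring the four lines of the test summary.
SUCCESS_NO_RACE = "success_no_race"
SUCCESS_RACE = "success_race_altname_preserved"
FAIL_DELAY = "failure_network_delay"
FAIL_NO_ALTNAMES = "failure_no_altnames"


def classify_instance(links, network_delayed, unrenamed_name="eth1",
                      expected_altnames=2):
    """Classify one instance from its parsed `ip -j link show` output.

    links: list of dicts, one per link, as parsed from the JSON output.
    network_delayed: True if the network-online delay was observed
        (detected elsewhere; not computed here).
    unrenamed_name: assumption, the kernel name that remains when the
        primary rename loses the race ("eth1" in the test plan).
    expected_altnames: assumption, the test plan expects two altnames.
    """
    if network_delayed:
        return FAIL_DELAY
    for link in links:
        if link.get("ifname") == unrenamed_name:
            # The race was hit: the rename failed, so the intended
            # name should still be reachable as an altname.
            altnames = link.get("altnames", [])
            if len(altnames) >= expected_altnames:
                return SUCCESS_RACE
            return FAIL_NO_ALTNAMES
    # No link kept the unrenamed name, so the rename succeeded.
    return SUCCESS_NO_RACE


def tally(results):
    """Count how many instances fell into each category."""
    return Counter(results)


if __name__ == "__main__":
    # Illustrative data only; the altnames below are made up.
    sample = json.loads(
        '[{"ifname": "eth0"},'
        ' {"ifname": "eth1", "altnames": ["enP1p0s2", "enP1s1"]}]'
    )
    print(tally([classify_instance(sample, network_delayed=False)]))
```

The delay check is passed in as a boolean so that the sketch makes no claim about how the script reads systemd-analyze output.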

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 251.4-1ubuntu7.3

---------------
systemd (251.4-1ubuntu7.3) kinetic; urgency=medium

  * udev: gracefully handle rename failures (LP: #2002445)
    Files:
    - debian/patches/lp2002445/core-device-ignore-failed-uevents.patch
    - debian/patches/lp2002445/sd-device-introduce-device_get_property_int.patch
    - debian/patches/lp2002445/sd-device-make-device_set_syspath-clear-sysname-and-sysnu.patch
    - debian/patches/lp2002445/udev-restore-syspath-and-properties-on-failure.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=deb435fbb84fde1fd39da47231a7473fc2a412e8

systemd (251.4-1ubuntu7.2) kinetic; urgency=medium

  * network/dhcp4: accept local subnet routes from DHCP (LP: #2004478)
    File: debian/patches/lp2004478-network-dhcp4-accept-local-subnet-routes-from-DHCP.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=751bac59b405025964d76c4ef8e0603457a605af
  * udev: avoid NIC renaming race with kernel (LP: #2002445)
    Files:
    - debian/patches/lp2002445/sd-netlink-add-a-test-for-rtnl_set_link_name.patch
    - debian/patches/lp2002445/sd-netlink-do-not-swap-old-name-and-alternative-name.patch
    - debian/patches/lp2002445/sd-netlink-restore-altname-on-error-in-rtnl_set_link_name.patch
    - debian/patches/lp2002445/udev-attempt-device-rename-even-if-interface-is-up.patch
    - debian/patches/lp2002445/udev-net-allow-new-link-name-as-an-altname-before-renamin.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=ffb1e85fdd3f0fe9b158b28a95cfa6d241fcbe70

 -- Nick Rosbrook <email address hidden> Mon, 20 Mar 2023 10:25:23 -0400

Changed in systemd (Ubuntu Kinetic):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. If you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 249.11-0ubuntu3.9

---------------
systemd (249.11-0ubuntu3.9) jammy; urgency=medium

  * udev: gracefully handle rename failures (LP: #2002445)
    Files:
    - debian/patches/lp2002445/core-device-ignore-failed-uevents.patch
    - debian/patches/lp2002445/sd-device-introduce-device_get_property_int.patch
    - debian/patches/lp2002445/sd-device-make-device_set_syspath-clear-sysname-and-sysnu.patch
    - debian/patches/lp2002445/udev-restore-syspath-and-properties-on-failure.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=a7ad4a9fc708500c61e3b8127f112d8c90049b2c

systemd (249.11-0ubuntu3.8) jammy; urgency=medium

  * network/dhcp4: accept local subnet routes from DHCP (LP: #2004478)
    File: debian/patches/lp2004478-network-dhcp4-accept-local-subnet-routes-from-DHCP.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=96928d5f45ebbfe682b47e842d63506fa0ac9583
  * udev: avoid NIC renaming race with kernel (LP: #2002445)
    Files:
    - debian/patches/lp2002445/sd-netlink-add-a-test-for-rtnl_set_link_name.patch
    - debian/patches/lp2002445/sd-netlink-do-not-swap-old-name-and-alternative-name.patch
    - debian/patches/lp2002445/sd-netlink-restore-altname-on-error-in-rtnl_set_link_name.patch
    - debian/patches/lp2002445/udev-attempt-device-rename-even-if-interface-is-up.patch
    - debian/patches/lp2002445/udev-net-allow-new-link-name-as-an-altname-before-renamin.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=20dc4d51a340669c26c446c23b5a84516e82ea74
  * network: create stacked netdevs after the underlying link is (LP: #2000880)
    File: debian/patches/lp2000880-network-create-stacked-netdevs-after-the-underlying-link-.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=ab620e709f3f62eda86af26fd66c00d6e5165a25
  * Enable /dev/sgx_vepc access for the group 'sgx' (LP: #2009502)
    File: debian/patches/lp2009502-Enable-dev-sgx_vepc-access-for-the-group-sgx.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=434480ae4059a16ccbde9613be0c26ff1983cc3a

 -- Nick Rosbrook <email address hidden> Mon, 20 Mar 2023 10:32:08 -0400

Changed in systemd (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 245.4-4ubuntu3.21

---------------
systemd (245.4-4ubuntu3.21) focal; urgency=medium

  * udev: avoid NIC renaming race with kernel (LP: #2002445)
    Files:
    - debian/patches/lp2002445-netlink-do-not-fail-when-new-interface-name-is-already-us.patch
    - debian/patches/lp2002445-netlink-introduce-rtnl_get-delete_link_alternative_names.patch
    - debian/patches/lp2002445-sd-netlink-restore-altname-on-error-in-rtnl_set_link_name.patch
    - debian/patches/lp2002445-udev-attempt-device-rename-even-if-interface-is-up.patch
    - debian/patches/lp2002445-udev-net-allow-new-link-name-as-an-altname-before-renamin.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=69ab4a02e828e20ea0ddbd75179324df7a8d1175
  * test-seccomp: accept ENOSYS from sysctl(2) too (LP: #1933090)
    Thanks to Roxana Nicolescu
    File: debian/patches/lp1933090-test-seccomp-accept-ENOSYS-from-sysctl-2-too.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=adaddd1441370ebcdb8bc33d7406b95d85b744f9
  * debian/test: ignore systemd-remount-fs.service failure in containers (LP: #1991285)
    File: debian/tests/boot-and-services
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=264bdc86f1e4dcd10e8d914d095581c54c33199a

 -- Nick Rosbrook <email address hidden> Wed, 15 Mar 2023 11:04:15 -0400

Changed in systemd (Ubuntu Focal):
status: Fix Committed → Fix Released