systemd-resolved configures no Current Scopes on start

Bug #1896772 reported by Dan Watkins
This bug affects 8 people
Affects            Status        Importance  Assigned to  Milestone
ifupdown (Ubuntu)  Fix Released  Undecided   Unassigned
  Impish           Won't Fix     Undecided   Unassigned
  Jammy            Fix Released  Undecided   Unassigned
isc-dhcp (Ubuntu)  Fix Released  High        Unassigned
  Impish           Won't Fix     Low         Unassigned
  Jammy            Triaged       Low         Unassigned
systemd (Ubuntu)   Invalid       High        Unassigned

Bug Description

Running groovy on the desktop, with the systemd packages that migrated today (or overnight, EDT).

# Steps to reproduce:

1) `systemctl restart systemd-resolved.service`

(This is a minimal reproducer, but I first saw this after an apt upgrade of systemd.)

# Expected behaviour:

DNS continues to work, status looks like this:

Link 2 (enp5s0)
      Current Scopes: DNS
DefaultRoute setting: yes
       LLMNR setting: yes
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
  Current DNS Server: 192.168.1.1
         DNS Servers: 192.168.1.1
          DNS Domain: ~.
                      lan

# Actual behaviour:

DNS is unconfigured:

Link 2 (enp5s0)
      Current Scopes: none
DefaultRoute setting: no
       LLMNR setting: yes
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
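
To detect this state from a script, something like the following might help. It is only a sketch: the function name `link_has_no_scopes` is hypothetical, and it assumes the per-link status text keeps the layout quoted in this report. On a live system the input would come from `resolvectl status enp5s0` rather than the embedded sample.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the status text on stdin contains a
# "Current Scopes: none" line, i.e. the link has no DNS scope configured.
link_has_no_scopes() {
    grep -Eq '^[[:space:]]*Current Scopes:[[:space:]]*none[[:space:]]*$'
}

# Sample taken from the "Actual behaviour" output in this report.
sample='Link 2 (enp5s0)
      Current Scopes: none
DefaultRoute setting: no'

if printf '%s\n' "$sample" | link_has_no_scopes; then
    echo "no DNS scope configured"
fi
```

The check matches the whole line, so a link reporting `Current Scopes: DNS` is not flagged.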

# Workaround

Disconnecting and reconnecting my network connection restored DNS functionality.

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: systemd 246.5-1ubuntu1
ProcVersionSignature: Ubuntu 5.8.0-18.19-generic 5.8.4
Uname: Linux 5.8.0-18-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu45
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: i3
CurrentDmesg: Error: command ['dmesg'] failed with exit code 1: dmesg: read kernel buffer failed: Operation not permitted
Date: Wed Sep 23 09:05:42 2020
InstallationDate: Installed on 2019-05-07 (504 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
MachineType: Gigabyte Technology Co., Ltd. B450M DS3H
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.8.0-18-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash resume=UUID=73909634-a75d-42c9-8f66-a69138690756 pcie_aspm=off vt.handoff=7
RebootRequiredPkgs: gnome-shell
SourcePackage: systemd
SystemdDelta:
 [EXTENDED] /lib/systemd/system/rc-local.service → /lib/systemd/system/rc-local.service.d/debian.conf
 [EXTENDED] /lib/systemd/system/user@.service → /lib/systemd/system/user@.service.d/timeout.conf

 2 overridden configuration files found.
SystemdFailedUnits:
 Error: command ['systemctl', 'status', '--full', '●'] failed with exit code 4: Invalid unit name "●" escaped as "\xe2\x97\x8f" (maybe you should use systemd-escape?).
 Unit \xe2\x97\x8f.service could not be found.
 ------
 Error: command ['systemctl', 'status', '--full', '●'] failed with exit code 4: Invalid unit name "●" escaped as "\xe2\x97\x8f" (maybe you should use systemd-escape?).
 Unit \xe2\x97\x8f.service could not be found.
UpgradeStatus: Upgraded to groovy on 2020-06-22 (92 days ago)
dmi.bios.date: 01/25/2019
dmi.bios.release: 5.13
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F4
dmi.board.asset.tag: Default string
dmi.board.name: B450M DS3H-CF
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF4:bd01/25/2019:br5.13:svnGigabyteTechnologyCo.,Ltd.:pnB450MDS3H:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnB450MDS3H-CF:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: B450M DS3H
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Revision history for this message
Dan Watkins (oddbloke) wrote :

I haven't been able to reproduce in a lxd container or an EC2 instance; I don't have a convenient way of testing a different NetworkManager system, unfortunately.

Revision history for this message
Balint Reczey (rbalint) wrote :

I can't reproduce that on an up to date 20.10 laptop with wifi connection only.
Could you please add some more details that could help in reproduction?
If you downgrade to systemd 246.4-1ubuntu1 do you still observe this bug?

Changed in systemd (Ubuntu):
status: New → Incomplete
Revision history for this message
Balint Reczey (rbalint) wrote :

The latest upload (246.6-1ubuntu1) may have fixed this as well.

Revision history for this message
Dan Watkins (oddbloke) wrote : Re: [Bug 1896772] Re: systemd-resolved configures no Current Scopes on start

On Thu, Sep 24, 2020 at 09:42:28PM -0000, Balint Reczey wrote:
> The latest upload (246.6-1ubuntu1) may have fixed this as well.

This happened again just now when I upgraded my system to the new
systemd, so I assume not.

Here's a log snippet of restarting:

Sep 29 09:28:14 surprise systemd[1]: Starting Network Name Resolution...
Sep 29 09:28:15 surprise systemd-resolved[31479]: Positive Trust Anchors:
Sep 29 09:28:15 surprise systemd-resolved[31479]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Sep 29 09:28:15 surprise systemd-resolved[31479]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test
Sep 29 09:28:15 surprise systemd-resolved[31479]: Using system hostname 'surprise'.
Sep 29 09:28:15 surprise systemd[1]: Started Network Name Resolution.

At this point, I do not have working DNS resolution. If I reconnect my
network interface, then I do get it, but I see this line in the log,
repeated multiple times:

Sep 29 09:28:23 surprise systemd-resolved[31479]: Failed to save link data /run/systemd/resolve/netif/2: Permission denied

/run/systemd/resolve is owned by systemd-resolve, but
/run/systemd/resolve/netif is owned by root.

Could this be related to the issue I'm observing?

Revision history for this message
Dan Watkins (oddbloke) wrote :

I've just tested: changing the ownership of /run/systemd/resolve/netif to systemd-resolve:systemd-resolve resolves (haha) this issue. The first restart of systemd-resolved after the change did not address it (because the permissions issue means that the state was not persisted); on a network interface reconnect, the state _is_ persisted, so future systemd-resolved restarts do not lose DNS resolution.
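
The ownership check and the manual fix described above can be sketched as follows. The helper names `owner_of` and `dir_owned_by` are hypothetical, and `stat -c` assumes GNU coreutils. The script only prints the suggested `chown`; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print the "user:group" that owns a path (GNU stat).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Hypothetical helper: succeed when the path is owned by the expected
# "user:group" pair.
dir_owned_by() {
    [ "$(owner_of "$1")" = "$2" ]
}

statedir=/run/systemd/resolve/netif   # path from this report
expected=systemd-resolve:systemd-resolve

if [ -d "$statedir" ] && ! dir_owned_by "$statedir" "$expected"; then
    echo "$statedir is owned by $(owner_of "$statedir"); to fix until next boot:"
    echo "  sudo chown $expected $statedir"
fi
```

As noted in the comment, the state is only persisted on the next network interface reconnect, so one reconnect is still needed after the `chown`.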

Revision history for this message
Dan Streetman (ddstreet) wrote :

How did resolve/netif get owned by root? It's created by systemd-resolved, which should be running as the systemd-resolve user, as that's the User= set in its service file

Revision history for this message
Dan Watkins (oddbloke) wrote :

> How did resolve/netif get owned by root?

I don't believe I've ever touched it before, so I'm not sure. I haven't rebooted since that last comment; I'll do that at some point today to check if ownership reverts to root.

If it does, what debugging can I perform to determine what's doing it?

Revision history for this message
Dan Watkins (oddbloke) wrote :

On Thu, Oct 01, 2020 at 01:41:46PM -0000, Dan Watkins wrote:
> > How did resolve/netif get owned by root?
>
> I don't believe I've ever touched it before, so I'm not sure. I haven't
> rebooted since that last comment, I'll do that at some point today to
> check if ownership reverts to root.

Ownership is `root` on boot; whatever is responsible for creating this
in /run is presumably to blame?

> If it does, what debugging can I perform to determine what's doing it?

Let me know!

Revision history for this message
Max (bubuta) wrote :

I've hit exactly the same problem:

tree -p -u -d -L 3 /run/systemd/
...
├── [drwxr-xr-x systemd-resolve] resolve
│   └── [drwxr-xr-x root ] netif
...

The issue happens with a Wi-Fi interface. Could it be somehow related to iwd?

I can add a workaround to force proper permissions on the folder, but I'd rather dig into the root cause.

Changed in systemd (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Dan Watkins (oddbloke) wrote :

When investigating another issue, I found this line in my journal, repeated a few times:

nm-dispatcher[3938]: /etc/network/if-up.d/resolved: 12: mystatedir: not found

Not sure if that's related, but it seems suspicious at least.

Revision history for this message
Freddy Angel (fhangel) wrote :

I have the same issue:

   systemd-resolved: Failed to save link data /run/systemd/resolve/netif/3: Permission denied

The netif folder is indeed owned by root. If I change the ownership to systemd-resolve and restart the service, there is no error, but as soon as I reboot the system, the ownership reverts to root and the error is back.

Also, I searched the log for other resolved messages and found the error below as well:

    nm-dispatcher[4641]: /etc/network/if-up.d/resolved: 12: mystatedir: not found

Revision history for this message
janl (janl) wrote :

I still have this

  Failed to save link data /run/systemd/resolve/netif/2: Permission denied

problem on 21.10 when getting a dhcp lease. The netif directory:

  drwxr-xr-x 2 root root 40 nov. 24 2021 /run/systemd/resolve/netif

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I see this on 22.04 after upgrading from 20.04.

$ journalctl |grep 'Failed to save link data'
Apr 17 15:25:52 hostname systemd-resolved[19095]: Failed to save link data /run/systemd/resolve/netif/3: Permission denied
Apr 17 15:25:52 hostname systemd-resolved[19095]: Failed to save link data /run/systemd/resolve/netif/3: Permission denied

$ ls -ld /run/systemd/resolve/netif
drwxr-xr-x 2 root root 40 Apr 17 14:46 /run/systemd/resolve/netif

(note, I had tried to restart systemd-resolved)

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I grep'd for 'netif' in /etc and noticed:

$ sudo grep -r netif /etc
/etc/network/if-down.d/resolved: statedir=/run/systemd/resolve/netif
/etc/network/if-up.d/resolved: statedir=/run/systemd/resolve/netif
/etc/dhcp/dhclient-exit-hooks.d/resolved: statedir=/run/systemd/resolve/netif

/etc/network/if-up.d/resolved, /etc/network/if-down.d/resolved and /etc/dhcp/dhclient-exit-hooks.d/resolved all have code like this:

statedir=/run/systemd/resolve/netif
mkdir -p $statedir

but do not have a corresponding chown of /run/systemd/resolve/netif itself. The only chown, `chown systemd-resolve:systemd-resolve "$statedir/$ifindex"` in /etc/network/if-up.d/resolved and /etc/dhcp/dhclient-exit-hooks.d/resolved, applies to the per-interface path inside the directory.
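
A rough way to repeat this search on another system is sketched below. The function name `scripts_missing_chown` is hypothetical, and the pattern assumes the hooks spell the assignment and the chown the same way as the scripts quoted here, so treat any result as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper: list files under the given directories that set
# statedir to the netif directory but never chown "$statedir" itself.
scripts_missing_chown() {
    grep -rl 'statedir=/run/systemd/resolve/netif' "$@" 2>/dev/null |
    while IFS= read -r f; do
        # Match a chown whose last argument is exactly $statedir, e.g.
        #   chown systemd-resolve:systemd-resolve "$statedir"
        # A chown of "$statedir/$ifindex" does not match.
        grep -Eq 'chown[[:space:]].*"?\$statedir"?[[:space:]]*$' "$f" ||
            printf '%s\n' "$f"
    done
}

# Example invocation for the directories mentioned in this report:
#   scripts_missing_chown /etc/network /etc/dhcp
```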

This system has been upgraded many, many times (at least since yakkety). dhclient is being used for this interface. ifupdown is installed.

I adjusted both /etc/network/if-up.d/resolved and /etc/dhcp/dhclient-exit-hooks.d/resolved to have

  chown systemd-resolve:systemd-resolve "$statedir"

after the `mkdir -p $statedir`, then rebooted and the directory has the correct permissions. `journalctl --unit systemd-resolved.service` doesn't show the 'systemd-resolved[19095]: Failed to save link data /run/systemd/resolve/netif/3: Permission denied' errors on boot any more either.
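
The adjustment amounts to one extra `chown` after the `mkdir`. The following is a minimal sketch, not the packaged hook: the function name `ensure_statedir` and its optional owner argument are hypothetical and exist only so the snippet can be tried on a scratch directory without root.

```shell
#!/bin/sh
# Hypothetical wrapper around the two lines discussed in this report.
#   $1: state directory
#   $2: owner, defaults to systemd-resolve:systemd-resolve
ensure_statedir() {
    statedir=$1
    owner=${2:-systemd-resolve:systemd-resolve}
    mkdir -p "$statedir"
    # The step the hooks were missing: without it, a hook running as root
    # leaves the directory owned by root, and systemd-resolved (running as
    # systemd-resolve) cannot save link data into it.
    chown "$owner" "$statedir"
}

# In a hook running as root, the call would be:
#   ensure_statedir /run/systemd/resolve/netif
```

To try it without root, pass a scratch path and your own numeric IDs, for example `ensure_statedir /tmp/netif-test "$(id -u):$(id -g)"`.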

UPDATE: my system uses NetworkManager, which is what spawns dhclient. I purged the `ifupdown` package from universe (my system was already using netplan rather than /etc/network/interfaces, so this was a safe operation for me), removed the chown I had added to /etc/dhcp/dhclient-exit-hooks.d/resolved, and rebooted; the directory then had the correct ownership. I think what is happening is that in the ifupdown case, something early in boot was calling /etc/network/if-up.d/resolved, which created the directory with the wrong ownership, but with NetworkManager as the netplan renderer, the dhclient script is called later and the dir is created correctly. This feels racy, and I believe the isc-dhcp-client package should also be updated to include the chown.

As ifupdown is in universe, I'll prepare an upload for it that includes the chown which will hopefully help people who upgrade who happen to have it installed.

Changed in ifupdown (Ubuntu):
status: New → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ifupdown - 0.8.36+nmu1ubuntu3

---------------
ifupdown (0.8.36+nmu1ubuntu3) jammy; urgency=medium

  * debian/if-up.d/resolved: also chown $statedir to
    systemd-resolve:systemd-resolve (LP: #1896772)

 -- Jamie Strandboge <email address hidden> Sun, 17 Apr 2022 21:21:49 +0000

Changed in ifupdown (Ubuntu):
status: In Progress → Fix Released
Changed in isc-dhcp (Ubuntu):
status: New → Triaged
Steve Langasek (vorlon)
tags: added: rls-jj-incoming
Changed in isc-dhcp (Ubuntu):
importance: Undecided → High
Changed in systemd (Ubuntu):
importance: Undecided → High
tags: added: fr-2319
Changed in ifupdown (Ubuntu Jammy):
status: New → Fix Released
Changed in isc-dhcp (Ubuntu Jammy):
status: New → Triaged
importance: Undecided → High
tags: removed: rls-jj-incoming
Revision history for this message
Lukas Märdian (slyon) wrote :

Thank you very much, Jamie, for your detailed analysis in #15!

I've applied the same fix to isc-dhcp https://launchpad.net/ubuntu/+source/isc-dhcp/4.4.1-2.3ubuntu3

We can consider SRUing this to Jammy and Impish, which are affected too. But it doesn't feel too critical, as systemd-resolved usually wins the race vs NetworkManager/dhclient, as you stated: "but with NetworkManager as the netplan renderer, the dhclient script is called later and the dir is created correctly."

I'm marking the systemd component as "Invalid", as the fix is needed in other packages.

no longer affects: systemd (Ubuntu Jammy)
Changed in systemd (Ubuntu):
status: Confirmed → Invalid
Changed in isc-dhcp (Ubuntu):
status: Triaged → Fix Committed
no longer affects: systemd (Ubuntu Impish)
Changed in isc-dhcp (Ubuntu Impish):
status: New → Triaged
Changed in isc-dhcp (Ubuntu Jammy):
importance: High → Low
Changed in isc-dhcp (Ubuntu Impish):
importance: Undecided → Low
Changed in ifupdown (Ubuntu Impish):
status: New → Triaged
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package isc-dhcp - 4.4.1-2.3ubuntu3

---------------
isc-dhcp (4.4.1-2.3ubuntu3) kinetic; urgency=medium

  * debian/resolved: chown $statedir to systemd-resolve (LP: #1896772)

 -- Lukas Märdian <email address hidden> Thu, 05 May 2022 10:27:34 +0200

Changed in isc-dhcp (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Lukas Märdian (slyon) wrote :

Impish is EOL

Changed in ifupdown (Ubuntu Impish):
status: Triaged → Won't Fix
Changed in isc-dhcp (Ubuntu Impish):
status: Triaged → Won't Fix