failed to install packages that are not in pool, flawed network detection

Bug #2004659 reported by Ken VanDine
This bug affects 9 people
Affects                   Status        Importance  Assigned to  Milestone
subiquity                 Fix Released  Undecided   Dan Bungert
ubuntu-desktop-installer  Fix Released  Undecided   Unassigned
curtin (Ubuntu)           Invalid       Medium      Unassigned

Bug Description

Failed to install missing packages

ProblemType: Bug
DistroRelease: Ubuntu 23.04
Package: subiquity (unknown)
ProcVersionSignature: Ubuntu 5.19.0-21.21-generic 5.19.7
Uname: Linux 5.19.0-21-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.24.0-0ubuntu2
Architecture: amd64
CasperMD5CheckResult: pass
CasperVersion: 1.479
CurtinAptConfig: /var/log/installer/subiquity-curtin-apt.conf
Date: Fri Feb 3 15:01:26 2023
ExecutablePath: /snap/ubuntu-desktop-installer/761/bin/subiquity/subiquity/cmd/server.py
InterpreterPath: /snap/ubuntu-desktop-installer/761/usr/bin/python3.8
LiveMediaBuild: Ubuntu 23.04 "Lunar Lobster" - Alpha amd64 (20230203)
MachineType: FUJITSU LIFEBOOK E744
ProcAttrCurrent: snap.hostname-desktop-installer.subiquity-server (complain)
ProcCmdline: /snap/hostname-desktop-installer/761/usr/bin/python3.8 -m subiquity.cmd.server --use-os-prober --storage-version=2 --postinst-hooks-dir=/snap/hostname-desktop-installer/761/etc/subiquity/postinst.d --autoinstall=
ProcEnviron:
 LANG=C.UTF-8
 PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=/casper/vmlinuz layerfs-path=minimal.standard.live.squashfs maybe-ubiquity --- quiet splash
Python3Details: /usr/bin/python3.10, Python 3.10.9, python3-minimal, 3.10.6-1ubuntu1
PythonDetails: N/A
SnapChannel:

SnapRevision: 761
SnapUpdated: False
SnapVersion: 0+git.d97c7eb6
SourcePackage: subiquity
Title: install failed crashed with CalledProcessError
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/06/2015
dmi.bios.release: 1.23
dmi.bios.vendor: FUJITSU // Phoenix Technologies Ltd.
dmi.bios.version: Version 1.23
dmi.board.name: FJNB26F
dmi.board.vendor: FUJITSU
dmi.board.version: P3
dmi.chassis.type: 10
dmi.chassis.vendor: FUJITSU
dmi.modalias: dmi:bvnFUJITSU//PhoenixTechnologiesLtd.:bvrVersion1.23:bd08/06/2015:br1.23:svnFUJITSU:pnLIFEBOOKE744:pvr:rvnFUJITSU:rnFJNB26F:rvrP3:cvnFUJITSU:ct10:cvr:sku:
dmi.product.name: LIFEBOOK E744
dmi.sys.vendor: FUJITSU

Ken VanDine (ken-vandine) wrote :
information type: Private → Public
Dan Bungert (dbungert) wrote :

Hey Ken. This is the part that failed.

        Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/target', 'eatmydata', 'apt-get', '--quiet', '--assume-yes', '--option=Dpkg::options::=--force-unsafe-io', '--option=Dpkg::Options::=--force-confold', 'install', 'efibootmgr', 'grub-efi-amd64', 'grub-efi-amd64-signed', 'shim-signed'] with allowed return codes [0] (capture=False)
        Reading package lists...
        Building dependency tree...
        Reading state information...
        Package efibootmgr is not available, but is referred to by another package.
        This may mean that the package is missing, has been obsoleted, or
        is only available from another source

        E: Package 'efibootmgr' has no installation candidate
        E: Unable to locate package grub-efi-amd64-signed
        E: Unable to locate package shim-signed

Dan Bungert (dbungert) wrote :

For u-d-i, please add the needed packages to the correct ship-live seed.
For curtin, curthooks are producing a flawed sources.list and that should be resolved.

tags: removed: need-duplicate-check
Changed in curtin (Ubuntu):
importance: Undecided → Medium
Dan Bungert (dbungert)
summary: - install failed crashed with CalledProcessError
+ failed to install efi packages, not in pool, bad apt config
Jean-Baptiste Lallement (jibel) wrote : Re: failed to install efi packages, not in pool, bad apt config

I confirm the issue on hardware with EFI + SB enabled.

Dan Bungert (dbungert)
Changed in curtin (Ubuntu):
status: New → Invalid
Dan Bungert (dbungert)
Changed in subiquity:
status: New → Confirmed
Dan Bungert (dbungert) wrote :

Summary of problem on Subiquity side:

TLDR: we want to patch subiquity to read the list of default routes when we receive a route_change event

* when we believe that we are offline, we intentionally limit the scope of where we look for packages to just the cdrom; this part is working fine
* when we are actually online, if we are acting as if we are offline (has_network == False), then that is incorrect behavior
* has_network is derived from the existence of default routes - if the default_route info is not correct, we risk the wrong has_network state
* subiquity receives route_change events from probert
* probert builds route_change events from libnl-route-3
* per https://github.com/thom311/libnl/issues/226#issuecomment-527888667, libnl-route-3 route events are hashed using certain keys, and metric is not among those keys
* on live-server, which is not using network manager, adjusting the link state to active / inactive will cause a single route event to be generated, and subiquity can rely on that to know if there is a default route or not
* on desktop, which is using network manager, a different sequence of network events is generated. `ip monitor route|grep default` is helpful here. Using `ip monitor`, we can see the following sequence:

link goes active - ip monitor route reports 3 events
1) default route added at metric 20100 for interface X for family Y
2) default route added at metric 100 for interface X for family Y
3) default route deleted at metric 20100 for interface X for family Y

libnl-route-3 reports only 2 events
- default route added for interface X for family Y
- default route deleted for interface X for family Y

* subiquity sees only that interface X has no default route (more correctly, it sees a default route for a few milliseconds) - the unintended coalescing of the two events with different metrics means the events aren't enough info for subiquity to know conclusively that there is a default route.
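
To illustrate the coalescing described above, here is a toy model in Python. It is only a sketch: the `replay` function and the dictionary cache are hypothetical and do not reproduce libnl-route-3's actual code. They only show what happens when the three events from `ip monitor route` are keyed with and without the metric.

```python
from typing import Dict, List, Tuple

# (action, interface, family, metric), in the order `ip monitor route` showed them.
# "X" and "Y" are the placeholders used in the comment above.
EVENTS = [
    ("add", "X", "Y", 20100),
    ("add", "X", "Y", 100),
    ("del", "X", "Y", 20100),
]


def replay(events, key_includes_metric: bool) -> List[Tuple[tuple, int]]:
    """Replay route events into a keyed cache and return what is left.

    Hypothetical model: one cache entry per key, value is the last metric seen.
    """
    cache: Dict[tuple, int] = {}
    for action, iface, family, metric in events:
        if key_includes_metric:
            key = (iface, family, metric)
        else:
            key = (iface, family)  # metric ignored, so the two routes collide
        if action == "add":
            cache[key] = metric
        else:
            cache.pop(key, None)
    return sorted(cache.items())


if __name__ == "__main__":
    print("metric in key:    ", replay(EVENTS, key_includes_metric=True))
    print("metric not in key:", replay(EVENTS, key_includes_metric=False))
```

With the metric in the key, the metric-100 route survives the delete. Without it, the delete of the metric-20100 route removes the only entry, which matches the "no default route" state that subiquity ends up with.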

In summary we need to use these events as a trigger to read the default routes instead of using the events directly to maintain the list.
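
A minimal sketch of that approach, assuming IPv4 only and reading the kernel's table from `/proc/net/route`. The names `NetworkState`, `DefaultRoute`, `parse_default_routes`, and `on_route_change` are made up for illustration and are not subiquity's actual code; IPv6 default routes would need a separate source such as `/proc/net/ipv6_route`.

```python
from dataclasses import dataclass
from typing import Callable, List

RTF_UP = 0x0001  # "route is usable" flag from linux/route.h


@dataclass(frozen=True)
class DefaultRoute:
    iface: str
    metric: int


def parse_default_routes(text: str) -> List[DefaultRoute]:
    """Extract IPv4 default routes from /proc/net/route-formatted text."""
    routes = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, flags, metric, mask = (
            fields[0], fields[1], fields[3], fields[6], fields[7])
        # Destination, mask, and flags are hex; metric is decimal.
        is_default = dest == "00000000" and mask == "00000000"
        if is_default and int(flags, 16) & RTF_UP:
            routes.append(DefaultRoute(iface, int(metric)))
    return routes


def read_proc_net_route() -> str:
    with open("/proc/net/route") as f:
        return f.read()


class NetworkState:
    """Hypothetical holder for has_network, refreshed on every route event."""

    def __init__(self, reader: Callable[[], str] = read_proc_net_route):
        self._reader = reader
        self.default_routes: List[DefaultRoute] = []

    def on_route_change(self, event=None) -> None:
        # The event is only a trigger; its contents are deliberately ignored
        # because coalesced events cannot be trusted to describe the table.
        self.default_routes = parse_default_routes(self._reader())

    @property
    def has_network(self) -> bool:
        return bool(self.default_routes)
```

The reader is injectable so the logic can be exercised with canned text instead of the live routing table.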

Some notes from testing
* why do VMs work better - I believe these VMs usually or always have ipv6 in them, which causes a different set of route events, which happens to produce a correct has_network state more by luck than anything
* why does server work better - simpler set of route events, with no coalescing hiding one of the ones we need
* why do physical installs of desktop work only sometimes on some hardware - different ordering of events from libnl-route-3

tags: added: fr-3364
Michael Hudson-Doyle (mwhudson) wrote :

Wow amazing debugging. What a mess.

A little further digging shows that the crazy high metric is related to connectivity checking: networkmanager adds a 20000 "penalty" to interfaces until they have demonstrated connectivity by accessing http://connectivity-check.ubuntu.com./ successfully. It's unfortunate (and maybe slightly hilarious) that this approach runs straight into the libnl-route issue you found.

Your proposed change sounds reasonable. I guess we should ignore routes that have a metric over 20000.
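
As a rough sketch of that filter, assuming the default-route metrics have already been read: the constant and function names below are hypothetical, and nothing in this report says whether the released fix applies such a filter.

```python
from typing import Iterable, List

# Penalty NetworkManager adds until the connectivity check succeeds,
# per the comment above.
NM_CONNECTIVITY_PENALTY = 20000


def usable_default_metrics(metrics: Iterable[int],
                           penalty: int = NM_CONNECTIVITY_PENALTY) -> List[int]:
    """Keep only default-route metrics that do not carry the penalty."""
    return [m for m in metrics if m <= penalty]
```

With the metrics from the sequence above, `usable_default_metrics([20100, 100])` keeps only the metric-100 route.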

I wonder how all this interacts with the changes ogayot is making around mirror checking. After all, when that is done we'll know if the configured archive is accessible, which is the most consequential decision hooked on the value of has_network today. I guess we probably don't even try to configure a mirror if has_network is false so we do want to get this right.

Dan Bungert (dbungert)
Changed in subiquity:
status: Confirmed → In Progress
assignee: nobody → Dan Bungert (dbungert)
summary: - failed to install efi packages, not in pool, bad apt config
+ failed to install packages that are not in pool, flawed network
+ detection
Dan Bungert (dbungert) wrote :
Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
https://iso.qa.ubuntu.com/qatracker/reports/bugs/2004659

tags: added: iso-testing
Dan Bungert (dbungert)
Changed in subiquity:
status: In Progress → Fix Committed
Dan Bungert (dbungert)
Changed in ubuntu-desktop-installer:
status: New → Fix Released
Dan Bungert (dbungert) wrote :

We believe this issue has been resolved in Subiquity 23.04.2.

If you tested this with a pre-final version of Ubuntu 23.04, it's
recommended to download the final install media.

For testing with Ubuntu Server 22.04.x or 20.04.x, when running
Subiquity, you should be offered a new version of the installer. Please
take that update to version 23.04.2 or later to get the fix.

If this is still a problem for you, please make a comment and set the state
back to New. Thank you for the bug report.

Changed in subiquity:
status: Fix Committed → Fix Released