Installer crash in s390x environments w/o OSA network adapters
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Skipper Bug Screeners |
subiquity | Fix Released | Undecided | Unassigned |
Bug Description
The subiquity installer should support installing s390x environments that have no OSA (CCW) devices but instead use Mellanox Connect-X (RoCE Express) PCIe-based cards, which act like regular Ethernet devices.
However, on s390x the installer seems to expect OSA devices (or perhaps any CCW devices) to be present as well: it always calls chzdev by default, which ends in a crash when no CCW / OSA devices are in place.
A fix is needed so that installations using only PCIe RoCE networking are properly supported.
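One way to picture the missing guard is a check for CCW devices before anything CCW-specific is invoked. The sketch below is illustrative only and is not taken from subiquity or curtin: the function name `has_ccw_devices` is hypothetical, and it assumes the kernel exposes CCW devices as entries under `/sys/bus/ccw/devices`.

```python
import os


def has_ccw_devices(ccw_dir='/sys/bus/ccw/devices'):
    """Return True if at least one CCW device is listed in sysfs.

    Hypothetical helper for illustration; the default path is an
    assumption about where the kernel lists CCW devices.
    """
    try:
        return len(os.listdir(ccw_dir)) > 0
    except (FileNotFoundError, NotADirectoryError):
        # No CCW bus directory at all (e.g. not s390x): nothing to configure.
        return False
```

On a RoCE-only LPAR such a check would presumably return False, and the installer could then skip the zdev handling altogether.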
(initial report)
In the Ubuntu installer there is a section that lists zdevs, and it shows none.
With OSA network cards you would see qeth devices to include.
Further along in the install a network card is detected and the IP address we assigned is shown.
Looking at the debug output, the errors occur around the chzdev command.
I am wondering whether this is because there are no zdevs and we are hitting a bug.
I think the kernel is correctly detecting the NIC as a Mellanox device via the mlx5 module (these IBM cards are indeed Mellanox hardware):
[ 109.848474] mlx5_core 0001:00:00.0 ens1: Link up
Here's the full debug output <see attachment>
And the failing output around the kernel install and chzdev is below:
Running command ['chzdev', '--quiet', '--active', '--online', '--export', '-'] with allowed return codes [0] (capture=True)
finish: cmd-install/
finish: cmd-install/
Traceback (most recent call last):
  File "/snap/
    ret = args.func(args)
  File "/snap/
    builtin_
    target, state)
  File "/snap/
    chzdev_
    target)
  File "/snap/
    (chzdev_conf, _) = chzdev_
  File "/snap/
    return util.subp(cmd, capture=True)
  File "/snap/
    return _subp(*args, **kwargs)
  File "/snap/
    raise ProcessExecuti
curtin.
Unexpected error while running command.
Command: ['chzdev', '--quiet', '--active', '--online', '--export', '-']
Exit code: 8
Reason: -
Stdout: ''
Stderr: chzdev: No settings found to export
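The log shows the export step failing only because exit code 8 is not among the allowed return codes. A minimal sketch of tolerating that case follows. It is not the actual curtin change: the helper name `export_zdev_config`, the `runner` parameter, and the choice to treat exit code 8 as "nothing to export" are assumptions based solely on the output above.

```python
import subprocess

# Exit code seen in the log above together with
# "chzdev: No settings found to export".
CHZDEV_NO_SETTINGS = 8


def export_zdev_config(cmd=None, runner=subprocess.run):
    """Return the exported zdev configuration, or '' if there is none.

    Hypothetical helper; `runner` exists only so the call can be
    replaced by a stub on machines without chzdev.
    """
    if cmd is None:
        cmd = ['chzdev', '--quiet', '--active', '--online', '--export', '-']
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return result.stdout
    if result.returncode == CHZDEV_NO_SETTINGS:
        # No CCW devices configured: treat as an empty configuration.
        return ''
    # Any other failure is still reported to the caller.
    raise subprocess.CalledProcessError(
        result.returncode, cmd, output=result.stdout, stderr=result.stderr)
```

With this shape, a system without any zdevs yields an empty configuration instead of an unhandled exception, while other chzdev failures are still raised.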
Background info:
Initially, 'RoCE Express' (Mellanox Connect-X based) non-OSA network adapters on s390x were intended to improve network connectivity between Linux on s390x and z/OS, and an OSA interface was needed in addition to RoCE for the initial handshake between those two systems.
But with the increasing popularity of LinuxONE systems, and a trend away from CCW devices such as OSA towards PCIe devices such as RoCE Express (and also NVMe), there are more and more cases where installations are needed in s390x environments that have no (CCW-based) OSA network adapters at all, only RoCE (Mellanox) adapters in Ethernet mode.
Related branches
- Michael Hudson-Doyle: Approve
- Server Team CI bot: Approve (continuous-integration)
Diff: 62 lines (+23/-3), 2 files modified:
curtin/commands/curthooks.py (+10/-2)
tests/unittests/test_curthooks.py (+13/-1)
information type: Private → Public
Changed in subiquity:
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Logs from an attempt to install 22.04 on an LPAR system with only (PCIe-based) Mellanox networking and only NVMe disk storage.
(So there are virtually no CCW devices at all in that system.)