boot failure: can't open /root/dev/console: no such file

Bug #833783 reported by Scott Moser
This bug affects 5 people
Affects          Status     Importance  Assigned to     Milestone
udev (Ubuntu)    Confirmed  High        Steve Langasek
  Oneiric        Confirmed  High        Steve Langasek
  Precise        Confirmed  High        Unassigned

Bug Description

On a nova compute instance running in kvm the initial launch failed, with the last bit of the console like:

Begin: Running /scripts/init-bottom ... [ 1.230178] Refined TSC clocksource calibration: 2532.590 MHz.
done.
/init: line 348: can't open /root/dev/console: no such file
[ 60.921374] Kernel panic - not syncing: Attempted to kill init!
[ 60.922434] Pid: 1, comm: init Not tainted 3.0.0-9-virtual #14-Ubuntu
[ 60.923760] Call Trace:
[ 60.924236] [<ffffffff815ea8e3>] panic+0x91/0x194
[ 60.925112] [<ffffffff81062255>] forget_original_parent+0x245/0x250
[ 60.926242] [<ffffffff81062277>] exit_notify+0x17/0x150
[ 60.927192] [<ffffffff81062b7b>] do_exit+0x1fb/0x440
[ 60.928164] [<ffffffff811653a0>] ? vfs_write+0x110/0x180
[ 60.929180] [<ffffffff81062f64>] do_group_exit+0x44/0xa0
[ 60.930143] [<ffffffff81062fd7>] sys_exit_group+0x17/0x20
[ 60.930772] [<ffffffff81608fc2>] system_call_fastpath+0x16/0x1b

It would seem that there is a race condition somewhere there, or for some reason /root/dev/console did not exist.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: initramfs-tools 0.99ubuntu2
ProcVersionSignature: Ubuntu 3.0.0-9.14-virtual 3.0.3
Uname: Linux 3.0.0-9-virtual x86_64
Architecture: amd64
Date: Thu Aug 25 13:52:15 2011
Ec2AMI: ami-0000004b
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: <nova.db.sqlalchemy.models.InstanceTypes object at 0x4a49490>
Ec2Kernel: aki-00000026
Ec2Ramdisk: ari-00000028
PackageArchitecture: all
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: initramfs-tools
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

I'm attaching initial console log, and then rebooted console log (which succeeded).

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

I've just now seen this again

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

and again http://paste.ubuntu.com/677476/
I feel somewhat compelled to explain that these are from an OpenStack kvm instance that has used 'kexec-loader' to load the kernel (https://code.launchpad.net/~smoser/+junk/kexec-loader). It is what it sounds like: a kernel/ramdisk that finds the root device and then kexecs a kernel from inside it.

Revision history for this message
Scott Moser (smoser) wrote :

Note, while I did say that above regarding kexec, I've not seen this on anything other than oneiric, and the same kexec kernel/ramdisk is used for my testing on lucid, maverick, and natty as well.

Changed in initramfs-tools (Ubuntu):
status: New → Confirmed
Revision history for this message
Dave Walker (davewalker) wrote :

I also experienced this. :(

Changed in initramfs-tools (Ubuntu):
importance: Undecided → High
Revision history for this message
Scott Moser (smoser) wrote :
tags: added: iso-testing
tags: added: rls-mgr-o-tracking
Brad Figg (brad-figg)
tags: removed: rls-mgr-o-tracking
tags: added: rls-mgr-o-tracking
Revision history for this message
Scott Moser (smoser) wrote :
Steve Langasek (vorlon)
Changed in initramfs-tools (Ubuntu Oneiric):
assignee: nobody → Steve Langasek (vorlon)
Revision history for this message
jamey0824 (jamesforyst) wrote :

I just had this happen to me on reboot of my Ubuntu 11.04. I actually have no clue how I can work around this. So if you know, let me know, please.

Revision history for this message
Steve Langasek (vorlon) wrote :

Patch to be tried against /usr/share/initramfs-tools/scripts/init-bottom/udev. Please test and report back if the problem is still reproducible with this change to the initramfs.

affects: initramfs-tools (Ubuntu Oneiric) → udev (Ubuntu Oneiric)
Steve Langasek (vorlon)
Changed in udev (Ubuntu Oneiric):
status: Confirmed → Incomplete
Revision history for this message
Scott Moser (smoser) wrote :

Over the weekend I ran 521 instances of an image with the patch from comment 12 (Steve Langasek's change to scripts/init-bottom/udev) in the initramfs, and 339 instances of a 'control' image. I ran this against a Canonical internal developer cloud.
Unfortunately, I failed to hit this race in any of them.

However, today, I just ran 2 instances of beta2 against the "CanoniStack", and hit it on both.

Possible reasons for the differing results:
 a.) developer cloud is just different hardware (cpu/disk)
 b.) developer cloud was running a more recent nova snapshot

I really don't put any stock in 'b' though, as I really think this is just a timing race in the kvm guest.
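
For runs of this size, tallying failures by hand gets tedious. A minimal sketch of one way to count them, assuming each instance's console output has been saved to its own file: the directory layout and the `*.log` naming are hypothetical, and the search string is the error message quoted in the bug description.

```shell
# count_console_failures DIR
# Print how many saved console logs under DIR contain the boot failure message.
# DIR and the *.log naming are assumptions made for this sketch.
count_console_failures() {
    dir=$1
    # -l lists each matching file once; -F matches the message as a fixed string.
    # If no *.log files exist, grep's error is discarded and the count is 0.
    grep -lF "can't open /root/dev/console" "$dir"/*.log 2>/dev/null | wc -l | tr -d ' '
}
```

Calling `count_console_failures ./console-logs` on a control run and on a patched run gives the two numbers needed to compare failure rates.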

Revision history for this message
Scott Moser (smoser) wrote :

Largely for my own future purposes, attaching what I did to re-pack the image with debug on.

Revision history for this message
Scott Moser (smoser) wrote :

OK.
  So I ran 57 instances of the control ramdisk on my cloud that was reproducing earlier.
2 of the 57 hit the failure. I ran [successfully] of the modified ramdisk, and 1 had a failure.

  So it seems just based on that, that we're not significantly more or less likely to hit the failure with the modified image.
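
As a rough sanity check on the earlier weekend run, the 2-in-57 control rate can be compared with the 339 failure-free control boots on the developer cloud. The sketch below treats 2/57 as the true per-boot failure probability and every boot as independent; both are assumptions, not measurements. The function name is made up for illustration.

```shell
# p_zero_failures FAILS RUNS TRIALS
# Probability of seeing no failure in TRIALS independent boots,
# if each boot fails with probability FAILS/RUNS (an assumed rate).
p_zero_failures() {
    awk -v f="$1" -v n="$2" -v t="$3" \
        'BEGIN { printf "%.3g\n", exp(t * log(1 - f / n)) }'
}

# Control rate observed here (2 of 57) applied to the 339 control boots.
p_zero_failures 2 57 339
```

Under those assumptions the result is on the order of a few in a million, which suggests the two clouds really do differ in how often the race is hit, rather than the weekend run simply being lucky.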

Scott Moser (smoser)
Changed in udev (Ubuntu Oneiric):
status: Incomplete → Confirmed
Revision history for this message
Steve Langasek (vorlon) wrote :

Since /dev/console is not created even with udevadm settle, I would like to get a dump of the udev log (/dev/.udev.log) from the initramfs in the failure case.

[ -e /dev/console ] || cat /dev/.udev.log
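
The check above tests for the node once. A polling variant can show whether the node turns up late or never turns up at all. This is a sketch only, not the patch attached to this bug: `wait_for_path` is a hypothetical helper, and the extra delay it adds could itself change the timing of the race.

```shell
# wait_for_path PATH [TRIES]
# Return 0 as soon as PATH exists, or 1 if it is still missing
# after TRIES one-second polls (default 5).
wait_for_path() {
    path=$1
    tries=${2:-5}
    while [ "$tries" -gt 0 ]; do
        [ -e "$path" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    # Final check after the last sleep.
    [ -e "$path" ]
}

# Hypothetical use in a debug initramfs: list /dev if the console never appears.
# wait_for_path /dev/console 5 || ls -l /dev
```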

Revision history for this message
Steve Langasek (vorlon) wrote :

Oh, /dev/.udev.log doesn't actually exist in the initramfs. Well, let's do something about that then.

Scott, could you apply the attached patch to the udev package for testing?

Revision history for this message
Steve Langasek (vorlon) wrote :

(This should generate a tremendous amount of console spew, which I assume you're able to capture)

Revision history for this message
Scott Moser (smoser) wrote :

Well, after upgrades to the nova installation that I was using, we are no longer running our images backed by a qcow2 compressed image. Instead, the root disk is a qcow2 disk backed by a "raw" disk image. This obviously changes timing conditions, as disk reads no longer take CPU time to do decompression.

I've now run 240 instances, and not been able to reproduce this with the modified ramdisk to catch the output.

Steve Langasek (vorlon)
Changed in udev (Ubuntu Oneiric):
milestone: none → oneiric-updates
Changed in udev (Ubuntu Precise):
status: New → Confirmed
importance: Undecided → High
tags: added: rls-mgr-p-tracking
removed: rls-mgr-o-tracking