Can't login after boot with Kernel 4.11.0-rc7, soft lockup in systemd-logind

Bug #1685865 reported by Stefan Daenzer
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Expired  Medium      Unassigned

Bug Description

Used distribution:

Ubuntu Gnome 16.04 with Kernel 4.11.0-rc7

After logging in at the GNOME login screen, the system freezes. Trying to log in on a text console (Ctrl+Alt+F1) also freezes the system. The following output appears on the console repeatedly:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-logind:913]
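
For anyone scanning saved console or kernel logs for the same symptom, the message above can be picked apart with a small regular expression. This is only a sketch: the pattern is inferred from the single line quoted in this report, not from any kernel format specification, and the helper name is made up.

```python
import re

# Pattern inferred from the one message quoted in this report; other kernel
# versions may word the line differently.
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]:]+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return CPU, seconds, task name and PID, or None if the line does not match."""
    m = SOFT_LOCKUP.search(line)
    if m is None:
        return None
    return {
        "cpu": int(m.group("cpu")),
        "secs": int(m.group("secs")),
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
    }


line = "NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-logind:913]"
print(parse_soft_lockup(line))
```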

Steps to reproduce the problem:
Boot Ubuntu Gnome 16.04 with Kernel 4.11.0-rc7 and try to login after boot.

Further testing results:
Using Mainline Kernel Version 4.11.0-rc6, login works fine
Using Kernel Version 4.10.11, login works fine
Using Kernel Version 4.10.12, the issue is present

==> The issue may have been introduced by a commit in 4.11.0-rc7 that was also backported to 4.10.12.
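
One way to narrow down that suspicion is to look for commit subjects that appear both between v4.11-rc6 and v4.11-rc7 and between v4.10.11 and v4.10.12, since stable backports normally keep the original subject line. The sketch below assumes a local clone that carries both the mainline and the stable tags; the function names and the repo path are hypothetical, and matching on subjects is a heuristic, not proof of a backport.

```python
import subprocess


def subjects(rev_range, repo="."):
    """Commit subject lines in rev_range, e.g. "v4.10.11..v4.10.12"."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%s", rev_range],
        check=True, capture_output=True, text=True,
    ).stdout
    return [s for s in out.splitlines() if s]


def shared_subjects(mainline, stable):
    """Subjects present in both lists, kept in mainline order."""
    stable_set = set(stable)
    return [s for s in mainline if s in stable_set]


# Usage, assuming the clone has all four tags:
# candidates = shared_subjects(subjects("v4.11-rc6..v4.11-rc7"),
#                              subjects("v4.10.11..v4.10.12"))
# for subject in candidates:
#     print(subject)
```

Any subject printed this way would be a candidate worth testing first, before or alongside a full bisect.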

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
tags: added: kernel-da-key
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with a pick of commit: f2200ac311302fcdca6556fd0c5127eab6c65a3e. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1685865/

Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.11-rc6 and v4.11-rc7. The kernel bisect will require testing of about 7-10 test kernels.
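
A bisect halves the candidate range with every test, so the number of test kernels grows with the base-2 logarithm of the number of commits in the range. As a rough sketch (the function name is made up, and git's own estimate can differ by a step):

```python
def bisect_steps(n_commits):
    """Rough number of test builds needed to bisect n_commits candidates."""
    if n_commits < 1:
        return 0
    # bit_length() - 1 is floor(log2(n)) for positive integers.
    return n_commits.bit_length() - 1


for n in (128, 1024, 16, 10):
    print(n, "commits ->", bisect_steps(n), "steps")
```

By this estimate, 7 to 10 test kernels corresponds to a range of somewhere between about a hundred and a couple of thousand commits. The figures git prints later in this thread (16 revisions, roughly 4 steps; 10 revisions, roughly 3 steps) fit the same rule.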

I built the first test kernel, up to the following commit:
82f1faa86727de976e38eade5e96a1846742d71e

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/82f1faa86727de976e38eade5e96a1846742d71e

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Stefan Daenzer (tisch) wrote :

I can confirm the problem persists with the kernel built from: http://kernel.ubuntu.com/~jsalisbury/lp1685865/82f1faa86727de976e38eade5e96a1846742d71e

Stefan Daenzer (tisch)
description: updated
description: updated
Joseph Salisbury (jsalisbury) wrote :

I built the first test kernel, up to the following commit:
ee921c762cf90652add60ebacb5b90636ac108df

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/ee921c762cf90652add60ebacb5b90636ac108df

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Joseph Salisbury (jsalisbury) wrote :

Make that "I built the NEXT test kernel". Sorry.

Stefan Daenzer (tisch) wrote :
Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
b9b3322f13f350587f17f0a76f008830e3a420d3

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/b9b3322f13f350587f17f0a76f008830e3a420d3

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Stefan Daenzer (tisch) wrote :

This kernel seems to fix the issue. I am able to login via the login screen.

This is the output of "uname -a"

tisch@tisch-XPS-15-9560:~$ uname -a
Linux tisch-XPS-15-9560 4.11.0-041100rc6-generic #201704251358 SMP Tue Apr 25 18:03:12 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
63987bfebd8869e00b34e2bdd12e59d71909bec0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/63987bfebd8869e00b34e2bdd12e59d71909bec0

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Stefan Daenzer (tisch) wrote :

This kernel seems to fix the issue. I am able to login via the login screen.

This is the output of "uname -a"

tisch@tisch-XPS-15-9560:~$ uname -a
Linux tisch-XPS-15-9560 4.11.0-041100rc4-generic #201704251553 SMP Tue Apr 25 19:55:31 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
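
Because several test kernels in this thread report similar release strings (rc4, rc5, rc6), the build stamp after the "#" is what tells them apart when checking which kernel actually booted. A minimal sketch for pulling both fields out of "uname -a" output follows; the regular expression is inferred from the outputs pasted here, and the function name is made up.

```python
import re

# Layout inferred from the uname -a lines pasted in this thread:
# "Linux <host> <release> #<build> SMP <date> ..."
UNAME = re.compile(r"^Linux (?P<host>\S+) (?P<release>\S+) #(?P<build>\d+) ")


def kernel_id(uname_line):
    """Return (release, build stamp) from a 'uname -a' line."""
    m = UNAME.match(uname_line)
    if m is None:
        raise ValueError("unrecognised uname -a output")
    return m.group("release"), m.group("build")


line = ("Linux tisch-XPS-15-9560 4.11.0-041100rc4-generic #201704251553 SMP "
        "Tue Apr 25 19:55:31 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux")
print(kernel_id(line))
```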

Gianfranco Costamagna (costamagnagianfranco) wrote :

@jsalisbury, your latest kernel does not seem to be in the range of the bisection. Did you say "bad" instead of "good" after message #8?

commit ee921c762cf90652add60ebacb5b90636ac108df BAD
commit 827c30a758392d00ce46d5d0c0a243cdecf5826e
commit 6f6266a561306e206e0e31a5038f029b6a7b1d89
commit f406270bf73d71ea7b35ee3f7a08a44f6594c9b1
commit c4a3fa261b16858416f1fd7db03a33d7ef5fc0b3
commit 5f9bf02a58f0f62d111994805212d0a775499862
commit 95149369c1c28b10f7318dfde54018ab107277d0
commit ab23d1146a8e7bd045507fe8a380827dc03e056d
commit 6dbd25a24599db07d88a805615bad1f4d48a5749
commit f4896fa502b81c5bce93f375bd17b14725c01826
commit 818249216d7dc9340a7d2332f6d07b462e818ee5
commit 2ca62d8a606a95e098799f128f6a40a6300d2a2a
commit 88b0b92bdaadfc43b11dc4172219ff0972673790
commit 97d93f35493f39c2b79e3379b30c17a2d00ec19d
commit c7aae6221f73312bb66464c353eb45d91433aea1
commit 11e63f6d920d6f2dfd3cd421e939a4aec9a58dcd
commit 956a4cd2c957acf638ff29951aabaa9d8e92bbc2
commit a4866aa812518ed1a37d8ea0c881dc946409de94
commit a2d6cbb0670d54806f18192cb0db266b4a6d285a
commit b9c1153f7a9cb2d53b845615a0edd510f7fe8341
commit 45abdf35cf82e4270328c7237e7812de960ac560
commit df7dd8fc965c665e83b71a649378cdf200ff36df
commit 0e1bfea999daa27c801b19617a6ef8b8ec4adc75
commit b9b3322f13f350587f17f0a76f008830e3a420d3 GOOD

this should be the list to test

Joseph Salisbury (jsalisbury) wrote :

@LocutusOfBorg, I did say good for commit b9b3322f1. This is my current bisect log:

git bisect log
git bisect start
# good: [39da7c509acff13fc8cb12ec1bb20337c988ed36] Linux 4.11-rc6
git bisect good 39da7c509acff13fc8cb12ec1bb20337c988ed36
# bad: [4f7d029b9bf009fbee76bb10c0c4351a1870d2f3] Linux 4.11-rc7
git bisect bad 4f7d029b9bf009fbee76bb10c0c4351a1870d2f3
# bad: [82f1faa86727de976e38eade5e96a1846742d71e] Merge tag 'fbdev-v4.11-rc6' of git://github.com/bzolnier/linux
git bisect bad 82f1faa86727de976e38eade5e96a1846742d71e
# bad: [ee921c762cf90652add60ebacb5b90636ac108df] Merge tag 'drm-fixes-for-v4.11-rc7' of git://people.freedesktop.org/~airlied/linux
git bisect bad ee921c762cf90652add60ebacb5b90636ac108df
# good: [b9b3322f13f350587f17f0a76f008830e3a420d3] Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit
git bisect good b9b3322f13f350587f17f0a76f008830e3a420d3

git bisect next
Bisecting: 16 revisions left to test after this (roughly 4 steps)
[63987bfebd8869e00b34e2bdd12e59d71909bec0] drm/i915: Suspend GuC prior to GPU Reset during GEM suspend

Marking commit 63987bfebd good asks for the next SHA1 to be tested:

git bisect good 63987bfebd8869e00b34e2bdd12e59d71909bec0
Bisecting: 10 revisions left to test after this (roughly 3 steps)
[88b0b92bdaadfc43b11dc4172219ff0972673790] Merge tag 'drm-intel-fixes-2017-04-12' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

I'll build that next kernel now.
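
To cross-check a bisect log like the one above against the test reports in a thread, the good/bad verdicts can be extracted mechanically. This is a sketch that only understands the "git bisect good <sha>" and "git bisect bad <sha>" lines shown here; the function name is made up.

```python
def parse_bisect_log(text):
    """Map each commit SHA in a 'git bisect log' dump to 'good' or 'bad'."""
    verdicts = {}
    for raw in text.splitlines():
        parts = raw.split()
        # Comment lines start with '#' and are skipped by this check.
        if (len(parts) == 4 and parts[:2] == ["git", "bisect"]
                and parts[2] in ("good", "bad")):
            verdicts[parts[3]] = parts[2]
    return verdicts


# First lines of the log quoted above.
log = """\
git bisect start
# good: [39da7c509acff13fc8cb12ec1bb20337c988ed36] Linux 4.11-rc6
git bisect good 39da7c509acff13fc8cb12ec1bb20337c988ed36
# bad: [4f7d029b9bf009fbee76bb10c0c4351a1870d2f3] Linux 4.11-rc7
git bisect bad 4f7d029b9bf009fbee76bb10c0c4351a1870d2f3
"""
for sha, verdict in parse_bisect_log(log).items():
    print(sha[:12], verdict)
```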

Joseph Salisbury (jsalisbury) wrote :

The tip of our trees might be different. If you run this, you should see the range of commits:

git log --oneline b9b3322f13f350587f17f0a76f008830e3a420d3..ee921c762cf90652add60ebacb5b90636ac108df

ee921c762cf9 Merge tag 'drm-fixes-for-v4.11-rc7' of git://people.freedesktop.org/~airlied/linux
827c30a75839 Merge tag 'pwm/for-4.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
2ca62d8a606a Merge branch 'linux-4.11' of git://github.com/skeggsb/linux into drm-fixes
88b0b92bdaad Merge tag 'drm-intel-fixes-2017-04-12' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
97d93f35493f Merge tag 'drm-misc-fixes-2017-04-11' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
c7aae6221f73 Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes
45abdf35cf82 drm/etnaviv: fix missing unlock on error in etnaviv_gpu_submit()
0c45b36f8acc drm/udl: Fix unaligned memory access in udl_render_hline
c053b5a506d3 drm/i915: Don't call synchronize_rcu_expedited under struct_mutex
63987bfebd88 drm/i915: Suspend GuC prior to GPU Reset during GEM suspend
e5199a37f7ed Merge tag 'gvt-fixes-2017-04-07' of https://github.com/01org/gvt-linux into drm-intel-fixes
a900152b5c29 pwm: rockchip: State of PWM clock should synchronize with PWM enabled state
b997e3edca4f pwm: lpss: Set enable-bit before waiting for update-bit to go low
3c1460e934f3 pwm: lpss: Split Tangier configuration
da2ba564a6dc drm/nouveau: initial support (display-only) for GP107
2907e8670b6e drm/nouveau/kms/nv50: fix double dma_fence_put() when destroying plane state
d639fbcc1027 drm/nouveau/kms/nv50: fix setting of HeadSetRasterVertBlankDmi method
f94773b9f5ec drm/nouveau/mmu/nv4a: use nv04 mmu rather than the nv44 one
83bce9c2baa5 drm/nouveau/mpeg: mthd returns true on success now
a34f83639490 drm/i915/gvt: set the correct default value of CTX STATUS PTR
cf082a4a264d Merge tag 'gvt-fixes-2017-04-01' of https://github.com/01org/gvt-linux into drm-intel-fixes
aa4ce4493c88 drm/i915/gvt: Fix firmware loading interface for GVT-g golden HW state
ecf8e89917d6 drm/i915: Use a dummy timeline name for a signaled fence
1383aeca92b7 drm/i915: Ironlake do_idle_maps w/a may be called w/o struct_mutex
9ba2a6261de4 drm/i915/gvt: remove the redundant info NULL check
729a0cd45c88 drm/i915/gvt: adjust mem size for low resolution type
6c9a8cdad48a drm/i915: Avoid lock dropping between rescheduling
f85726905745 drm/i915/gvt: exclude cfg space from failsafe mode
b79c52aef3cd drm/i915/gvt: Activate/de-activate vGPU in mdev ops.
dd68f2ba0720 drm/i915/execlists: Wrap tail pointer after reset tweaking
aa62acfd63e7 drm/i915/perf: remove user triggerable warn
4e5f713ffc20 drm/i915/perf: destroy stream on sample_flags mismatch
9e1764309f57 drm/i915: Align "unfenced" tiled access on gen2, early gen3
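
A quick way to see which subsystems dominate a range like this is to group the one-line log by subject prefix. The sketch below uses a handful of the lines above as sample input; the grouping rule (first two path components before the first colon) is an arbitrary choice for illustration, and the function name is made up.

```python
from collections import Counter


def subsystem(oneline):
    """Subsystem prefix of a 'git log --oneline' entry, or 'merge' for merges."""
    subject = oneline.split(" ", 1)[1]
    if subject.startswith("Merge "):
        return "merge"
    prefix = subject.split(":", 1)[0]
    # Collapse e.g. drm/nouveau/kms/nv50 to drm/nouveau.
    return "/".join(prefix.split("/")[:2])


lines = [
    "ee921c762cf9 Merge tag 'drm-fixes-for-v4.11-rc7' of git://people.freedesktop.org/~airlied/linux",
    "45abdf35cf82 drm/etnaviv: fix missing unlock on error in etnaviv_gpu_submit()",
    "63987bfebd88 drm/i915: Suspend GuC prior to GPU Reset during GEM suspend",
    "b997e3edca4f pwm: lpss: Set enable-bit before waiting for update-bit to go low",
    "da2ba564a6dc drm/nouveau: initial support (display-only) for GP107",
    "2907e8670b6e drm/nouveau/kms/nv50: fix double dma_fence_put() when destroying plane state",
]
counts = Counter(subsystem(entry) for entry in lines)
print(counts)
```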

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
88b0b92bdaadfc43b11dc4172219ff0972673790

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/88b0b92bdaadfc43b11dc4172219ff0972673790

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Stefan Daenzer (tisch) wrote :

This kernel seems to fix the issue. I am able to login via the login screen.

This is the output of "uname -a"

tisch@tisch-XPS-15-9560:~$ uname -a
Linux tisch-XPS-15-9560 4.11.0-041100rc6-generic #201704261323 SMP Wed Apr 26 17:25:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Gianfranco Costamagna (costamagnagianfranco) wrote :

thanks @jsalisbury, probably all the merges in the kernel trapped me :)

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
da2ba564a6dcf46df4f828624ff55531ff11d5b0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/da2ba564a6dcf46df4f828624ff55531ff11d5b0

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Stefan Daenzer (tisch) wrote :
Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
f94773b9f5ecd1df7c88c2e921924dd41d2020cc

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/f94773b9f5ecd1df7c88c2e921924dd41d2020cc

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Stefan Daenzer (tisch) wrote :

This kernel seems to fix the issue. I am able to login via the login screen.

This is the output of "uname -a"

tisch@tisch-XPS-15-9560:~$ uname -a
Linux tisch-XPS-15-9560 4.11.0-041100rc5-generic #201704271012 SMP Thu Apr 27 14:17:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
2907e8670b6ef253bffb33bf47fd2182969cf2a0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/2907e8670b6ef253bffb33bf47fd2182969cf2a0

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Stefan Daenzer (tisch) wrote :

This kernel seems to fix the issue. I am able to login via the login screen.

This is the output of "uname -a"

tisch@tisch-XPS-15-9560:~$ uname -a
Linux tisch-XPS-15-9560 4.11.0-041100rc5-generic #201704271247 SMP Thu Apr 27 16:49:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Gianfranco Costamagna (costamagnagianfranco) wrote :

so the issue is:
da2ba564a6dcf46df4f828624ff55531ff11d5b0 ?
This looks strange to me :(

Joseph Salisbury (jsalisbury) wrote :

Yes, the bisect reported commit da2ba564a6dcf46df4f828624ff55531ff11d5b0 as the culprit. To see if that is the case, I built a kernel with that commit reverted.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1685865/

Can you test that kernel and report back if it has the bug or not?
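
For readers wondering how the bisect ended up at da2ba564a6dc, the last few steps can be replayed as a plain binary search. This is a sketch, not what git does internally: it assumes the five nouveau commits from the git log listing earlier in this thread form a simple linear series (oldest first), and the test function is a stand-in that encodes the outcome reported here instead of building and booting a kernel.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit.

    commits is ordered oldest to newest, every commit after the first bad
    one is assumed bad, and the newest commit is assumed to be bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # culprit is mid or older
        else:
            lo = mid + 1      # culprit is newer than mid
    return commits[lo]


# Nouveau commits from the listing in this thread, reversed to oldest-first.
# The linear ordering is an assumption made for this illustration.
series = ["83bce9c2baa5", "f94773b9f5ec", "d639fbcc1027",
          "2907e8670b6e", "da2ba564a6dc"]


def is_bad(sha):
    # Stand-in for "build, boot, try to log in"; matches the thread's result.
    return sha == "da2ba564a6dc"


print(first_bad(series, is_bad))
```

With f94773b9f5ec and 2907e8670b6e both testing good, the only commit left in that series is da2ba564a6dc, which is why the revert test above is the natural confirmation step.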

Stefan Daenzer (tisch) wrote :

This kernel seems to fix the issue. I am able to login via the login screen.

This is the output of "uname -a"

tisch@tisch-XPS-15-9560:~$ uname -a
Linux tisch-XPS-15-9560 4.11.0-041100-generic #201705010951 SMP Mon May 1 13:53:56 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. I'll ping upstream and the patch author for feedback.

Stefan Daenzer (tisch) wrote :

Thanks for extracting the commit. And thanks for caring for this bug report!

Changed in linux (Ubuntu):
status: In Progress → Incomplete
assignee: Joseph Salisbury (jsalisbury) → nobody
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired