Fails to detect (second) display

Bug #1543683 reported by LaMont Jones
This bug affects 2 people

Affects          Status     Importance  Assigned to  Milestone
linux (Ubuntu)   Confirmed  Medium      Unassigned
Xenial           Confirmed  Medium      Unassigned

Bug Description

With the upgrade to 4.4.0-2, only one of the two displays in the machine is detected. With 4.3.0-7 (and earlier) both are detected.

In this case, both display connectors are on the MB.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-extra-4.4.0-2-generic 4.4.0-2.16
ProcVersionSignature: Ubuntu 4.4.0-2.16-generic 4.4.0
Uname: Linux 4.4.0-2-generic x86_64
ApportVersion: 2.19.4-0ubuntu2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: lamont 3923 F.... pulseaudio
CurrentDesktop: Unity
Date: Tue Feb 9 10:22:38 2016
HibernationDevice: RESUME=UUID=5d145c55-633f-4197-a4c9-0b165b6dbeb3
InstallationDate: Installed on 2014-09-22 (505 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: Shuttle Inc SG41
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-2-generic root=UUID=ba0ea95b-cde0-42ba-b06e-9fd6ca006881 ro nomdmonddf nomdmonisw
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-2-generic N/A
 linux-backports-modules-4.4.0-2-generic N/A
 linux-firmware 1.155
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to xenial on 2014-11-14 (452 days ago)
dmi.bios.date: 02/10/2011
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 080015
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: FG41
dmi.board.vendor: Shuttle Inc
dmi.board.version: V20
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Shuttle Inc
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr080015:bd02/10/2011:svnShuttleInc:pnSG41:pvrV20:rvnShuttleInc:rnFG41:rvrV20:cvnShuttleInc:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: SG41
dmi.product.version: V20
dmi.sys.vendor: Shuttle Inc

LaMont Jones (lamont) wrote :
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
tags: added: kernel-da-key
LaMont Jones (lamont) wrote :

As requested, here is the output of lspci -vvvnn on the good kernel.

Joseph Salisbury (jsalisbury) wrote :

There are over 400 changes to i915 between Ubuntu-4.3.0-7.18 and Ubuntu-4.4.0-2.16. We should probably narrow down the exact kernel version that introduced this first. At the same time, we can see if this is Ubuntu specific, so testing the upstream kernels is best.

We need to identify the earliest kernel that did not exhibit the bug and the first kernel that did exhibit the bug.

Can you test the following kernels and post back?

v4.4 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-wily/
If v4.4 final exhibits the bug, we should move on to testing some of the v4.4 release candidates.

v4.4-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-rc4-wily/

If v4.4-rc4 does not exhibit the bug then test v4.4-rc6:
v4.4-rc6: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-rc6-wily/

If v4.4-rc4 does exhibit the bug then test v4.4-rc2:
v4.4-rc2: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-rc2-wily/
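The test order above is a binary search over the release candidates. As a rough sketch only (the `first_bad` helper and the results passed to it are hypothetical, not something posted in this report), the procedure looks like this in Python, assuming the list is ordered oldest to newest, its first entry is known good, and its last entry is known bad:

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version that shows the bug.

    Assumes versions[0] is known good, versions[-1] is known bad,
    and that the bug stays present once it has been introduced.
    """
    lo, hi = 0, len(versions) - 1  # lo: newest known good, hi: oldest known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # bug already present: look at older versions
        else:
            lo = mid  # bug not present yet: look at newer versions
    return versions[hi]


if __name__ == "__main__":
    versions = ["v4.3", "v4.4-rc2", "v4.4-rc4", "v4.4-rc6", "v4.4"]
    # Made-up answers for illustration; replace with real boot-test results.
    made_up_results = {"v4.4-rc2": False, "v4.4-rc4": True, "v4.4-rc6": True}
    print(first_bad(versions, lambda v: made_up_results[v]))
```

Only the versions strictly between the two endpoints are ever tested, which is why each answer decides which release candidate to try next.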

LaMont Jones (lamont) wrote :

I went the route of bisecting the RCs...

rc2: no kernel present in page, which makes rc6 the obvious bisect point
rc6: bad.
rc4: bad.

FIN.

penalvch (penalvch) wrote :

LaMont Jones, in order to allow additional upstream developers to examine the issue, at your earliest convenience, could you please test the latest upstream kernel available from http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D ? Please keep in mind the following:
1) The one to test is on the very top line of the page (not the daily folder).
2) The release names are irrelevant.
3) The folder time stamps aren't indicative of when the kernel actually was released upstream.
4) Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds .

If testing on your main install would be inconvenient, one may:
1) Install Ubuntu to a different partition and then test this there.
2) Backup, or clone the primary install.

If the latest kernel did not allow you to test for the issue (e.g. you couldn't boot into the OS), please make a comment in your report about this, and continue to test the next most recent kernel version until you can test for the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X, and Y are the first two numbers of the kernel version, and Z is the release candidate number if it exists.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ

Please note, a failure to install the kernel does not fit the criteria of kernel-bug-exists-upstream.
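
The tag naming rule above can be written down as a short Python sketch. `upstream_tags` is a hypothetical helper, not a Launchpad API, and the version strings it accepts (for example `v4.5-rc7` or `4.5`) are an assumed input format:

```python
import re
from typing import List


def upstream_tags(version: str, fixed: bool) -> List[str]:
    """Build the two tags for a tested mainline kernel version.

    X and Y are the first two numbers of the version; the -rcZ suffix
    is only added when the version is a release candidate.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)(?:\.\d+)*(?:-rc(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    x, y, rc = match.groups()
    base = "kernel-fixed-upstream" if fixed else "kernel-bug-exists-upstream"
    specific = f"{base}-{x}.{y}"
    if rc is not None:
        specific += f"-rc{rc}"
    return [base, specific]


if __name__ == "__main__":
    print(upstream_tags("v4.5-rc7", fixed=False))
    print(upstream_tags("4.5", fixed=True))
```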

Once testing of the latest upstream kernel is complete, please mark this report's Status as Confirmed. Please let us know your results.

Thank you for your understanding.

tags: added: regression-release
tags: added: needs-bisect
Changed in linux (Ubuntu Xenial):
status: Confirmed → Incomplete
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.3 final and v4.4-rc1. The kernel bisect will require testing of about 13 test kernels.
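
For context on the "about 13" estimate: each bisect step roughly halves the remaining candidate commits, so 13 steps corresponds to a range of several thousand commits. The exact commit count between v4.3 and v4.4-rc1 is not stated in this report, so the Python sketch below only shows the arithmetic; `bisect_steps` is an illustrative helper, not a git command.

```python
def bisect_steps(n_candidates: int) -> int:
    """Worst-case number of test kernels needed to isolate one commit
    among n_candidates when the range is halved at every step.

    This is ceil(log2(n_candidates)), computed with integers only.
    """
    if n_candidates < 1:
        raise ValueError("need at least one candidate commit")
    return (n_candidates - 1).bit_length()


# Smallest and largest range sizes that need exactly 13 steps.
smallest = 2 ** 12 + 1  # 4097 commits
largest = 2 ** 13       # 8192 commits

if __name__ == "__main__":
    print(bisect_steps(smallest), bisect_steps(largest))
```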

I built the first test kernel, up to the following commit:
118c216e16c5ccb028cd03a0dcd56d17a07ff8d7

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 118c216e16c5ccb028cd03a0dcd56d17a07ff8d7 is good.

Joseph Salisbury (jsalisbury) wrote :

I built the first test kernel, up to the following commit:
e6604ecb70d4b1dbc0372c6518b51c25c4b135a1

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 e6604ecb70d4b1dbc0372c6518b51c25c4b135a1 is good.

Joseph Salisbury (jsalisbury) wrote :

I built the first test kernel, up to the following commit:
b44a3d2a85c64208a57362a1728efb58a6556cd6

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 b44a3d2a85c64208a57362a1728efb58a6556cd6 is bad.

Joseph Salisbury (jsalisbury) wrote :

I built the first test kernel, up to the following commit:
c0f3f90cf454dd845dcc443afa4f0e312a8eaee0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Joseph Salisbury (jsalisbury) wrote :

In the previous comments, 'first' should be 'next'. Cut and paste error.

Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Incomplete → Confirmed
LaMont Jones (lamont) wrote :

 c0f3f90cf454dd845dcc443afa4f0e312a8eaee0 is bad. (201602161109 timestamp on the kernel build, for confirmation)

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
717d84d67e3a95f440c37c7482681b3535fdc7e2

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 717d84d67e3a95f440c37c7482681b3535fdc7e2 is bad.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
6b6d5626750d72a22180a6e094cf95acd1d85c9b

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Joseph Salisbury (jsalisbury) wrote :

6b6d562 reported as good on IRC.

I built the next test kernel, up to the following commit:
40a4a5727f21a0e439d317aa99953e24467605eb

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

40a4a5727f21a0e439d317aa99953e24467605eb fails.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
e8cb8d69d125f56fa3ba5239b215a56718e2ca44

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 e8cb8d69d125f56fa3ba5239b215a56718e2ca44 is bad.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
23eafea6a9d1faac0588a5275d0c755cb261346e

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 23eafea6a9d1faac0588a5275d0c755cb261346e is good (with complaints from the monitor about the refresh rate being out of range)

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
9eca6832f7254d49d25494da7d47c0f8a24f7862

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

LaMont Jones (lamont) wrote :

 9eca6832f7254d49d25494da7d47c0f8a24f7862 is bad. In the interest of letting me do the rest in one interruption of my primary worksurface, would it be possible for you to build all 7 remaining versions (or throw me a script...), so that I can just reboot a few times for completion? I'm of the belief that the next kernel for me to try is d2e08c0f34438af791482de8abf2c8e4e573b1d3

LaMont Jones (lamont) wrote :

That is, if it's convenient to just loop through and build all 7, then throw them into ~jsalisbury/lp1543683/$REF, then I can smash through the reboots in minimal time. (Getting tired of rebuilding my workspace...)

Joseph Salisbury (jsalisbury) wrote :

I built all the remaining test kernels but the last one. The test kernels are available from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683

The first kernel is in the d2e08c0f3 directory. There are also two sub-directories:

bad.cfe01a5e
good.7f4c628

Test the kernel in the sub-directory that matches whether the d2e08c0f3 one is good or bad. Each of those sub-directories also has two sub-directories, again a good one and a bad one, so test the kernel in the sub-directory that matches whether the prior kernel was good or bad.
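
The directory layout described above is a small decision tree: test one kernel, then descend into the good or bad sub-directory. A minimal Python sketch of that walk follows. The `Node` class, the `walk` function, the answers in `example_answers`, and the placement of 237ed86c beneath the good branch are assumptions made for illustration; only the commit IDs themselves appear in this report.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    commit: str
    if_good: Optional["Node"] = None  # kernel to test next if this one is good
    if_bad: Optional["Node"] = None   # kernel to test next if this one is bad


def walk(root: Node, known_bad: str,
         is_bad: Callable[[str], bool]) -> Tuple[List[str], str]:
    """Follow the good/bad branches and return (kernels tested, first bad commit).

    known_bad is the most recent commit already reported bad before the walk.
    The result is only meaningful if the tree covers every remaining commit.
    """
    tested: List[str] = []
    first_bad = known_bad
    node: Optional[Node] = root
    while node is not None:
        tested.append(node.commit)
        if is_bad(node.commit):
            first_bad = node.commit  # the newest bad result narrows the range
            node = node.if_bad
        else:
            node = node.if_good
    return tested, first_bad


# Partial tree; branches not named in the report are simply left out.
tree = Node("d2e08c0f3",
            if_bad=Node("cfe01a5e"),
            if_good=Node("7f4c628", if_bad=Node("237ed86c")))

if __name__ == "__main__":
    # Hypothetical answers; replace with real boot-test results.
    example_answers = {"d2e08c0f3": False, "7f4c628": True, "237ed86c": True}
    print(walk(tree, "9eca6832", example_answers.__getitem__))
```

A complete three-level tree of this kind holds 2**3 - 1 = 7 kernels, which would match the seven remaining versions asked for earlier.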

LaMont Jones (lamont) wrote :

d2e0 is good, 74fc bad, 237e bad. We have a winner: 237ed86c is the first failing commit.

Joseph Salisbury (jsalisbury) wrote :

I built a Xenial test kernel with commit 237ed86c reverted. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1543683/Commit237ed86cReverted/

Can you test this kernel and see if it resolves the bug?

Note, with this test kernel, you need to install both the linux-image and linux-image-extra .deb packages.

LaMont Jones (lamont) wrote :

Works. Note that during boot, when it flashes (presumably over to the kernel's i915 driver?) there is a noticeable delay between the first (00:02.0) display and the second (failing, 00:02.1) display coming up - on the order of "under a second or two". I'm attaching the kernel log from the boot, which shows some screaming from the i915 driver (and some network packet screaming, which I believe is unrelated, though curious.)

lamont

Joseph Salisbury (jsalisbury) wrote :

I created a debug test kernel that bumps the number of tries to 300 and prints each try to the log. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1543683/DEBUG-KERNEL/

LaMont Jones (lamont) wrote :
  • zz (1.6 MiB, text/plain)

Here is the boot... Interestingly, it shows 2 displays in lspci. I'm not in front of the computer, so I can't exactly say if it found it or not, but ISTR that it didn't show the device at all (in lspci) in the bad kernels.

LaMont Jones (lamont) wrote :

Interestingly, while two displays show up in the lspci output, only the one was detected and used.

lamont
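
Since lspci lists PCI functions while the graphics driver reports which connectors have a display attached, one way to compare the two views is to read the connector status files that the DRM subsystem exposes under /sys/class/drm. The Python sketch below is only an illustration; `connector_status` is a hypothetical helper, and the connector names on this machine are not known from the report.

```python
from pathlib import Path
from typing import Dict


def connector_status(sysfs_drm: str = "/sys/class/drm") -> Dict[str, str]:
    """Map each DRM connector directory (e.g. card0-VGA-1) to the text in
    its status file, typically "connected" or "disconnected"."""
    result: Dict[str, str] = {}
    for status_file in sorted(Path(sysfs_drm).glob("card*-*/status")):
        result[status_file.parent.name] = status_file.read_text().strip()
    return result


if __name__ == "__main__":
    for name, status in connector_status().items():
        print(f"{name}: {status}")
```

Running this on a good and a bad kernel and comparing the output would show whether the second connector is missing entirely or present but reported as disconnected.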

Joseph Salisbury (jsalisbury) wrote :

Possibly a similar issue upstream:
https://lkml.org/lkml/2016/2/24/654

Joseph Salisbury (jsalisbury) wrote :

I built a Xenial test kernel with the following commit requested by upstream:

commit 8d409cb3e8a24196be7271defafd4638f3e0b514
Author: Ville Syrjälä <email address hidden>
Date: Wed Feb 10 19:59:05 2016 +0200

    drm/i915: Fix hpd live status bits for g4x

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1543683/patched-kernel/

Can you give this test kernel a try and see if it resolves the bug?

Note, with this test kernel, you need to install both the linux-image and linux-image-extra .deb packages.

LaMont Jones (lamont) wrote :

4.4.0-9.24~lp1543683Commit8d409cb is a good kernel. (with the now expected lag of ~1/2 second between the two displays refreshing during boot at driver cutover.)

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Committed
LaMont Jones (lamont) wrote :

On the other hand, for the "completely unacceptable" part of the diff: locking the screen results in the display going away or something, such that X randomly moves windows from the right hand to the left and vice versa. Just... No.

Joseph Salisbury (jsalisbury) wrote :

I built a test kernel without the new patch and with commit 237ed86c reverted. Can you see if this new issue still happens with this test kernel? It can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1543683/NoPatchCommit237ed86cReverted

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-12.28

---------------
linux (4.4.0-12.28) xenial; urgency=low

  * Miscellaneous Ubuntu changes
    - reconstruct: Work around orig tarball packaging limitiations
      Fixes FTBS

 -- Tim Gardner <email address hidden> Tue, 08 Mar 2016 13:26:08 -0700

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit 237ed86c reverted, like in comment #29. However, I also applied the patch for this bug from the latest Xenial kernel.

The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1543683/

LaMont Jones (lamont) wrote :

And I completely misstated myself. :(

What I think I want is: 237ed86c with the current patch for this bug applied.

Here is what I'm seeing; I'm trying to work out whether it is isolated to these two patches, or whether it's something newer that we'll need to bisect all over again:

With current xenial (4.4.0-13 and -14), but not with the trunk prior to 237ed86c, when the OS stops sending video to the displays (in my case after screen locks and there is still no keyboard/mouse activity), then the windows get jumbled around in a manner that is consistent with X deciding that it's single-headed again, and then back to double headed.

The kernel in #40 does not exhibit this behavior. It would be interesting to see if 237ed86c with the xenial fix does or does not. If it does, then this remains an open bug. If it does not, then we have a new bug, and a need to bisect again, with 4.4.0-13 (from the archive) being the initial bad, and 237ed86c+fix as the first good. Sadly, every single kernel in that bisect will need to have the fix applied post-bisect, so let's hope it's still this bug. As of right now, initial good is 237ed86c~1 (with current fix -- I don't think I tested for this when we were doing the initial bisect.)

Thanks

LaMont Jones (lamont) wrote :

Rereading #40, what exactly was that kernel? On rereading, it sounds like it was top-of-tree (ish), with 237ed86c reverted and the fix added -- or was it bona fide 237ed86c~1 + the fix? (on which hinges what the kernel to try next is, but the last 2 paragraphs in #41 are still valid)

Joseph Salisbury (jsalisbury) wrote :

So the kernel posted in comment #40 does not exhibit the original bug, or this possible new bug of the windows getting re-shuffled?

The kernel in comment #40 was the tip of 4.4.0-9 with commit 237ed86c reverted and the fix added.

LaMont Jones (lamont) wrote :

The kernel in comment #40 exhibits neither the original bug, nor the window-shuffling bug. On a lark, I'd like tip 4.4.0-9 with the "original" fix added (and 237ed86c NOT reverted). All of which assumes that 237ed86c was not reverted as part of landing the fix.

Something about this feels like it's likely an interaction between 237ed86c and the fix. If not, that will give us 4.4.0-9+fix as a base good, way ahead of where we would be if we went back to 237ed86c~1.

Joseph Salisbury (jsalisbury) wrote :

I built a 4.4.0-9 with the "original" fix added (and 237ed86c NOT reverted). The kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1543683/

LaMont Jones (lamont) wrote :

The kernel in #45:
1) exhibits the window shuffling issue.
2) if I lock the screen, and sit back for a few seconds, the display blanks, and then (almost immediately) unblanks. I would expect it to remain blanked until there was mouse or keyboard activity.

I suspect that the next step is to instrument things to produce an audit trail of what queries are hitting the driver about the display, and what it's saying... It's behaving like it only sees 1 display part of the time, while both are actually present.

Joseph Salisbury (jsalisbury) wrote :

Can you test one additional kernel? It is the current Xenial kernel with commit 237ed86c also reverted. If this one is good, then commit 7a5c500 (drm/i915: Fix hpd live status bits for g4x) needs some more work.

The latest test kernel is in the usual location:

http://kernel.ubuntu.com/~jsalisbury/lp1543683/

LaMont Jones (lamont) wrote :

Said kernel appears to be good. The reshuffling of windows does not occur (and I get two displays!!) In the worst case, I'd recommend shipping xenial with 237ed86c reverted, if they can't get a fixed version before our deadline.

Holler at me if there's another kernel for me to test.

thanks,
lamont

LaMont Jones (lamont) wrote :

Based on discussion, more work is needed.

Changed in linux (Ubuntu Xenial):
status: Fix Released → In Progress
LaMont Jones (lamont) wrote :

4.6 test kernel (2dcd0af568b0cf583645c8a317dd12e344b1c72a would seem to be the base) does not exhibit the bug.

Changed in linux (Ubuntu Xenial):
status: In Progress → Confirmed
Changed in linux (Ubuntu):
status: In Progress → Confirmed
Changed in linux (Ubuntu Xenial):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
Brad Figg (brad-figg)
tags: added: cscc