openbox crashed with SIGABRT

Bug #2011751 reported by Chris Guiver
This bug affects 2 people
Affects            Status         Importance   Assigned to      Milestone
glib2.0 (Ubuntu)   Invalid        Critical     Unassigned
openbox (Ubuntu)   Fix Released   Critical     Aaron Rainbolt

Bug Description

Lubuntu lunar (primary box) on
- dell [optiplex] 7050 (i5-6500, 16gb, intel hd530/i915)

I experienced a crash yesterday (openbox) but tend to ignore the first crashes... Whilst using the machine again today a crash occurred.

This is my primary box, so my setup is pretty consistent

- featherpad (restored session)
- hexchat (irc)
- qterminal (only single tab today; usually many)
- firefox (snap, most sites excluding google related)
- chromium (snap, used for anything google related)
- telegram (snap)
- element

I was using chromium full screen (trying to read my gmail inbox) when borders around windows disappeared & I lost the capacity to switch between windows; mouse would move, but keyboard appeared mostly dead (it wasn't; I think it was more that windows weren't responding, including to mouse clicks, though that was likely inconsistent. I could [not] work out how to describe it yesterday, but today's crash is almost identical).

I switched to text terminal (ctrl+alt+f4) & explored; returned to GUI and used
- ctrl+alt+t to open new terminal
- openbox &

session returned to what I expect...

** expected outcome

Openbox doesn't crash

** actual outcome

all windows lost borders, and I lost ability to switch between windows
I appeared to have minimal control (not true, more a fraction of what I expect)

** How to get crash

For me,

- I used chromium (browser)
- make chromium full screen (ie. F11)
- click on link/something & right-click to open in new window
OPENBOX CRASH.

ProblemType: Crash
DistroRelease: Ubuntu 23.04
Package: openbox 3.6.1-10
ProcVersionSignature: Ubuntu 6.1.0-16.16-generic 6.1.6
Uname: Linux 6.1.0-16-generic x86_64
ApportVersion: 2.26.0-0ubuntu2
Architecture: amd64
CasperMD5CheckResult: unknown
CrashCounter: 1
CurrentDesktop: LXQt
Date: Thu Mar 16 09:46:22 2023
ExecutablePath: /usr/bin/openbox
ExecutableTimestamp: 1643543619
InstallationDate: Installed on 2023-01-25 (49 days ago)
InstallationMedia: Lubuntu 23.04 "Lunar Lobster" - Alpha amd64 (20230124)
JournalErrors: -- No entries --
ProcCmdline: /usr/bin/openbox
ProcCwd: /home/guiverc
RebootRequiredPkgs: Error: path contained symlinks.
Signal: 6
SourcePackage: openbox
StacktraceTop:
 () at /lib/x86_64-linux-gnu/libobt.so.2
 <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
 client_calc_layer ()
 ()
 () at /lib/x86_64-linux-gnu/libobt.so.2
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
separator:

Chris Guiver (guiverc) wrote :
Apport retracing service (apport) wrote :

StacktraceTop:
 client_calc_layer (self=<optimized out>) at openbox/client.c:2726
 event_process (ec=<optimized out>, data=<optimized out>) at openbox/event.c:582
 event_read (source=<optimized out>, callback=<optimized out>, data=<optimized out>) at obt/xqueue.c:338
 g_main_dispatch (context=0x560f4b66f070) at ../../../glib/gmain.c:3460
 g_main_context_dispatch (context=0x560f4b66f070) at ../../../glib/gmain.c:4200

tags: removed: need-amd64-retrace
Chris Guiver (guiverc) wrote :
description: updated
Chris Guiver (guiverc) wrote :

I've experienced this issue now on Lubuntu lunar running on
- hp prodesk 400 g1 sff (i5-4570, 8gb, amd/ati cedar radeon hd 5000/6000/7350/8350)

qterminal was fullscreen (ie. F11) with chromium sitting idle on one display, and me 'fighting' with firefox issues on the other screen.. all of a sudden borders around windows disappeared & I lost pretty much access to my desktop

I noted the panel disappeared for a while, then was re-drawn? Though that could have been due to upgrades that were installing. This box did NOT experience an openbox crash until AFTER upgrades had all installed.

mouse control was still active, but unable to click on windows etc. and unable to open a terminal using ctrl+ALT+T to re-run openbox... (terminal actually opened; but it was not selected so no commands could be entered into window opened with ctrl+alt+T)

Aaron Rainbolt (arraybolt3) wrote :

From #ubuntu-devel IRC:

[21:08] <jbicha> I'm concerned that glib2.0 may be less stable than usual 😢
[21:09] <sarnold> interesting, guiverc's still-private bug https://bugs.launchpad.net/ubuntu/+source/openbox/+bug/2011751 has a crash going through glib/gmain.c
[21:09] <arraybolt3> jbicha: That's quite possible. At least one of the Openbox crashes seems Glib-related.
[21:09] <arraybolt3> sarnold: Heh, you saw it.

This might be a GLib bug.

Chris Guiver (guiverc)
information type: Private → Private Security
Aaron Rainbolt (arraybolt3) wrote :

I was able to reproduce this in a VM made with the latest Lunar ISO.

I'm not sure if this is helpful, but when I did this with Openbox running in a terminal, this message was printed when the crash occurred:

How are you gentlemen? All your base are belong to us. (Openbox received signal 11)

Aaron Rainbolt (arraybolt3) wrote :

I can *also* reproduce this by doing the same sequence of steps (fullscreen browser with F11, right-click a link, click "Open in new window") using Firefox instead of Chromium.

Aaron Rainbolt (arraybolt3) wrote :

This is a regression from Lubuntu Kinetic - I cannot reproduce this using Firefox on 22.10, while I can reliably reproduce it on 23.04.

Changed in openbox (Ubuntu):
importance: Undecided → Critical
Changed in glib2.0 (Ubuntu):
importance: Undecided → Critical
Sebastien Bacher (seb128) wrote :
Sebastien Bacher (seb128) wrote :

Upstream glib reply

> It's exceedingly unlikely that swapping the slice allocator with the system one will cause a
> crash in correct code; it's much more likely that the slice allocator was masking an incorrect
> deallocation or an invalid write. Valgrind would also have caught it, since we automatically
> disabled the slice allocator when using it.

which means it's probably an issue to fix in the openbox codebase

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in glib2.0 (Ubuntu):
status: New → Confirmed
Changed in openbox (Ubuntu):
status: New → Confirmed
Apport retracing service (apport) wrote :

StacktraceTop:
 client_calc_layer (self=<optimized out>) at openbox/client.c:2726
 event_process (ec=<optimized out>, data=<optimized out>) at openbox/event.c:582
 event_read (source=<optimized out>, callback=<optimized out>, data=<optimized out>) at obt/xqueue.c:338
 g_main_dispatch (context=0x560f4b66f070) at ../../../glib/gmain.c:3460
 g_main_context_dispatch (context=0x560f4b66f070) at ../../../glib/gmain.c:4200

Apport retracing service (apport) wrote : Stacktrace.txt
Apport retracing service (apport) wrote : StacktraceSource.txt
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Thomas Ward (teward) wrote :

After discussion with the Security Team, we do not believe this is a Security bug and that there currently is no need to keep it private as there is other public discourse via other related bugs on other trackers and here in Launchpad with other bugs (marked as dupes of this or such).

information type: Private Security → Public
Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
https://iso.qa.ubuntu.com/qatracker/reports/bugs/2011751

tags: added: iso-testing
Aaron Rainbolt (arraybolt3) wrote :

I believe we have a working patch for the problem. A quick summary of what it looks like is happening:

Openbox maintains a list of windows organized by stacking order from highest to lowest. This is a doubly-linked list.

There is a function in Openbox called client_calc_layer that uses a pointer into this list to walk through it while modifying it. When client_calc_layer modifies the list, it also messes up the pointer that it is actively using to walk through the list. The next element that Openbox loads contains a dangling pointer, and that promptly results in a segfault when Openbox tries to dereference that pointer.

To solve the problem, I added a patch that makes Openbox save a pointer to the list element *before* the current element before modifying the list. When the list is then modified, the still-valid pointer to the previous list element is used to get a new (and actually usable!) pointer to the current list element. The old, broken pointer is then overwritten by the new, working pointer, and the segfaults stop.
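
The failure and the repair described above can be modelled in a few lines. The sketch below is Python rather than Openbox's C, and everything in it (`Link`, `restack`, `walk`, the `freed` flag) is hypothetical scaffolding, not code from the patch. The flag stands in for memory that a C allocator has released, so reading a freed link raises an exception where the real program would segfault. It also assumes the restacked link sits in the middle of the list, so the saved previous link is never null.

```python
class Link:
    """One link of a doubly-linked list, loosely modelled on GLib's GList."""
    def __init__(self, data):
        self.data, self.prev, self.next = data, None, None
        self.freed = False  # stands in for memory the C allocator has released


class UseAfterFree(Exception):
    """Raised where C code would dereference a dangling pointer."""


def nxt(link):
    """Follow link.next, refusing to read a link that has been freed."""
    if link.freed:
        raise UseAfterFree(f"read of freed link {link.data!r}")
    return link.next


def build(items):
    """Build a list from `items` and return its head link."""
    head = tail = None
    for item in items:
        link = Link(item)
        link.prev = tail
        if tail is None:
            head = link
        else:
            tail.next = link
        tail = link
    return head


def restack(link):
    """Hypothetical restack: put a fresh link in the same slot and free the
    old one, so any pointer to the old link becomes dangling."""
    fresh = Link(link.data)
    fresh.prev, fresh.next = link.prev, link.next
    if link.prev is not None:
        link.prev.next = fresh
    if link.next is not None:
        link.next.prev = fresh
    link.prev = link.next = None
    link.freed = True


def walk(head, target, save_prev):
    """Visit every link, restacking the one that holds `target` on the way."""
    visited = []
    it = head
    while it is not None:
        visited.append(it.data)
        if it.data == target:
            it_prev = it.prev      # saved before the list is modified
            restack(it)            # `it` is dangling from here on
            if save_prev:
                it = nxt(it_prev)  # re-derive a usable pointer to this slot
        it = nxt(it)
    return visited


try:
    walk(build("ABC"), "B", save_prev=False)
    naive_result = "no crash"
except UseAfterFree:
    naive_result = "crash"

fixed_result = walk(build("ABC"), "B", save_prev=True)
print(naive_result, fixed_result)
```

Run as-is, this prints `crash ['A', 'B', 'C']`: the walk that keeps using the stale pointer fails on the step after the restack, while the walk that re-derives its position from the saved previous link visits every element.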

Using the patch, I am no longer able to reproduce this bug when following the testcase (open a browser fullscreen, right-click a link, and click "Open in new window"). I have the patched Openbox uploaded to a PPA here: https://launchpad.net/~arraybolt3/+archive/ubuntu/openbox

Changed in glib2.0 (Ubuntu):
assignee: nobody → Aaron Rainbolt (arraybolt3)
assignee: Aaron Rainbolt (arraybolt3) → nobody
Changed in openbox (Ubuntu):
assignee: nobody → Aaron Rainbolt (arraybolt3)
Changed in glib2.0 (Ubuntu):
status: Confirmed → Invalid
Aaron Rainbolt (arraybolt3) wrote :

I don't have upload access to this package yet (it's not in the Lubuntu packageset apparently) so I'm uploading the patch here so that it can be sponsored.

Dan Bungert (dbungert) wrote :

Partial review of the patch so far. I haven't read much openbox code so take this with a grain of salt.

https://bugzilla.icculus.org/show_bug.cgi?id=6669#c9 mentions a concern that itPrev may still be pointing to invalid data. I share this concern. I think some interesting testcases are what happens with lists that have one or two elements, which then are modified by this function. In the single element case, I believe your itPrev now points to the deleted item, and in the two element case, itPrev may be ok but itPrev->prev or itPrev->next are both probably not valid.

https://bugzilla.icculus.org/show_bug.cgi?id=6669#c5 mentions a work branch. I suggest tracking that down and considering it. It's presumably this patch https://bugzilla.icculus.org/attachment.cgi?id=3646&action=diff.

Aaron Rainbolt (arraybolt3) wrote :

@dbungert: While the patch that simply copies the list is a possible way of doing things, a very experienced user on the Arch Linux forums had concerns about doing things that way, see https://bbs.archlinux.org/viewtopic.php?pid=2090570#p2090570 "But creating a copy of the list is inefficient and only glosses over the actual problem (and probably will cause X11 errors and/or behavioral differences - maybe maintain them b/c of the status quo ante)"
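
For comparison, the copy-the-list idea can be sketched as follows. This is a Python illustration under assumptions, not the patch from either tracker: the stacking order is a plain Python list, and `lower_to_bottom` is a hypothetical restack that moves one window to the end. Python's list iterator does not crash when the list changes underneath it; it silently repeats or skips entries, which plays the role of the dangling pointer here.

```python
def lower_to_bottom(stacking, window):
    """Hypothetical restack: move `window` to the end of the stacking order."""
    stacking.remove(window)   # linear search for the entry
    stacking.append(window)


def walk_live(stacking, target):
    """Iterate the very list that is being modified."""
    visited = []
    for window in stacking:
        visited.append(window)
        if window == target:
            lower_to_bottom(stacking, window)
    return visited


def walk_snapshot(stacking, target):
    """Iterate a copy taken before any modification."""
    visited = []
    for window in list(stacking):   # O(n) copy on every call
        visited.append(window)
        if window == target:
            lower_to_bottom(stacking, window)
    return visited


live = walk_live(["A", "B", "C"], "B")
snapshot = walk_snapshot(["A", "B", "C"], "B")
print(live, snapshot)
```

The live walk sees `B` twice and never reaches `C`, while the snapshot walk visits each window exactly once. The price is the extra copy on every call, which is the inefficiency the forum comment objects to.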

In the two-item case, as long as the rest of Openbox code keeps the list consistent after removing *and adding* an item (because keep in mind the code doesn't just remove stuff, it also adds stuff), everything should be fine. itPrev->next should point to a valid item unless the underlying list has become corrupted.

In the one item case, the first iteration through the loop works, and with only one item, there will only be one iteration through the loop, so presumably the crash won't happen unless the one item case triggers a crash in some other way.

I'll test both and report back what happens.

Aaron Rainbolt (arraybolt3) wrote :

OK, so I'm not entirely sure testing the one item case is even possible since the crash happens when switching from a fullscreen window (which implies that there are at least two windows). However, this is what I did, and everything was successful (no crashes).

1. Boot a Lubuntu Lunar ISO, and install the patched Openbox into it.
2. Log out.
3. Log in, but with a plain Openbox session. No desktop, no panel, no LXQt, just Openbox.
4. Open a terminal emulator, and run "firefox &" followed by "disown <pidOfFirefox>". Then close the terminal emulator.
5. Fullscreen Firefox with F11.
6. Attempt to "switch" out of Firefox with Alt+Tab. No crashes occur.
7. Open Wikipedia.
8. Right-click on a link and click "Open in new window". No crashes occur (this is what was able to reliably trigger the crash before).
9. Switch between the fullscreen and non-fullscreen window with Alt+Tab. Still no crashes.
10. Fullscreen the second Firefox window and switch between them with Alt+Tab. Still no crashes.
11. Take both Firefox windows out of fullscreen and switch between them. Still no crashes.

I did a bit more than that, but that should cover the one-item and two-item test cases.

Chris Guiver (guiverc)
description: updated
Chris Guiver (guiverc) wrote :

Aaron asked me to test his version via PPA

As noted in the original bug report, this bug was annoying; it was hitting me often enough that I stopped using Lubuntu and used Xfce & GNOME instead for a while, before returning to LXQt, though with `openbox` replaced by `xfwm4` as the WM (to avoid issues).

(note: times are likely AEST as I'm getting this detail from my hexchat IRC logs; but if verification/UTC times are needed read https://irclogs.ubuntu.com/2023/03/20/%23lubuntu-devel.txt etc where 13:09:35 is recorded as [02:09] on irclogs.ubuntu.com)

Mar 20 12:53:24 <arraybolt3> OK, will push to a PPA and have available for testing shortly.
Mar 20 12:54:19 <guiverc> :) (but also followed by a rather quick :( as that'll require me to make a change (to test) & then LOGOUT so I can re-login; the logout is the :( )
Mar 20 13:09:35 <arraybolt3> https://launchpad.net/~arraybolt3/+archive/ubuntu/openbox

I installed the patched version on Aaron's PPA
---
openbox:
  Installed: 3.6.1-10ubuntu1~ppa1
  Candidate: 3.6.1-10ubuntu1~ppa1
  Version table:
 *** 3.6.1-10ubuntu1~ppa1 500
        500 https://ppa.launchpadcontent.net/arraybolt3/openbox/ubuntu lunar/main amd64 Packages
        100 /var/lib/dpkg/status
     3.6.1-10 500
        500 http://archive.ubuntu.com/ubuntu lunar/universe amd64 Packages
---
Selected copy/pastes from my hexchat log
---
Mar 20 14:35:17 <guiverc> quick play and no crashes (or FIRES!) arraybolt3 ; I'll use normally; but I was getting openbox crash maybe hourly as I recall prior to fix (why I switched to other DEs then swapped out openbox).. currently using 3.6.1-10ubuntu1~ppa1

Mar 21 09:38:45 <guiverc> fyi arraybolt3 , have had no openbox crashes...

Mar 22 13:14:28 <guiverc> arraybolt3, another day & no openbox crashes... (last was @Mar 16 09:46 my local time; then I switched to other DEs... then back to LXQt but xfwm4 instead of openbox... been using openbox again since your PPA)
---

I've been using my primary system since then without issue.

My usage is normal, I suspend at night then resume in the morning. I rebooted system earlier today, logged in and returned to normal usage.

I have experienced no issues, system has been stable as I expect from Lubuntu :)

---
guiverc@d7050-next:~$ neofetch --off
guiverc@d7050-next
------------------
OS: Lubuntu Lunar Lobster (development branch) x86_64
Host: OptiPlex 7050
Kernel: 6.1.0-16-generic
Uptime: 2 hours, 59 mins
Packages: 3100 (dpkg), 12 (snap)
Shell: bash 5.2.15
Resolution: 1680x1050, 1920x1080, 1920x1080
DE: LXQt 1.2.0
WM: Openbox
Theme: Arc-Darker [GTK2/3]
Icons: oxygen [GTK2/3]
Terminal: qterminal
Terminal Font: Ubuntu Mono 14
CPU: Intel i5-6500 (4) @ 3.600GHz
GPU: Intel HD Graphics 530
Memory: 7864MiB / 15852MiB

Aaron Rainbolt (arraybolt3) wrote :

(One extra bit of info that I realized was left out - according to the comment on the Arch Linux forums here: https://bbs.archlinux.org/viewtopic.php?pid=2090270#p2090270 the first loop iteration succeeds. It's not until the second iteration (after the list has been modified) that the crash occurs. That is why I believe the one-item case to be safe.)

tags: added: patch
Aaron Rainbolt (arraybolt3) wrote :

One worry I am having with my patch now is, what happens if the first element is the one being modified? Then itPrev is a null pointer. I *think* from looking at the GLib code that g_list_next will then return null when it is handed itPrev, which means "it" will be set to null, which I believe will trigger the end of the loop. Assuming I know what I'm looking at, that could put Openbox in an unusual state that could cause problems later down the road.

To get around that, I can add a check to see if itPrev is null, and if so, it will set "it" to stacking_list, which is where it is supposed to end up.
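
A small model of that head-of-list case, again in Python and again with hypothetical names (`Stack`, `restack`, `walk`) that are not taken from the patch. `list_next` mirrors the null-tolerant behaviour of GLib's `g_list_next`, and `Stack.head` plays the role of `stacking_list`. The restack is assumed to put a fresh link in the same slot as the old one.

```python
class Link:
    def __init__(self, data):
        self.data, self.prev, self.next = data, None, None


class Stack:
    """Owns the head pointer, standing in for a global such as stacking_list."""
    def __init__(self, items):
        self.head = tail = None
        for item in items:
            link = Link(item)
            link.prev = tail
            if tail is None:
                self.head = link
            else:
                tail.next = link
            tail = link

    def restack(self, link):
        """Hypothetical restack: replace `link` with a fresh link in its slot."""
        fresh = Link(link.data)
        fresh.prev, fresh.next = link.prev, link.next
        if link.prev is None:
            self.head = fresh          # the head pointer itself changes
        else:
            link.prev.next = fresh
        if link.next is not None:
            link.next.prev = fresh
        link.prev = link.next = None   # the old link is dead


def list_next(link):
    """Like g_list_next: a null argument yields null instead of crashing."""
    return link.next if link is not None else None


def walk(stack, target, fallback):
    visited = []
    it = stack.head
    while it is not None:
        visited.append(it.data)
        if it.data == target:
            it_prev = it.prev
            stack.restack(it)
            it = list_next(it_prev)    # None when the head was restacked
            if it_prev is None and fallback:
                it = stack.head        # restart from the new head link
        it = list_next(it)
    return visited


without_fallback = walk(Stack("ABC"), "A", fallback=False)
with_fallback = walk(Stack("ABC"), "A", fallback=True)
print(without_fallback, with_fallback)
```

Without the fallback the walk stops after the first element and never looks at `B` or `C`; with it, all three are visited.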

(I'm trying to *not* use the patch from Arch Linux's forums, since I really hate the fact that it has to loop through a bunch of the list every time it modifies an element. That seems quite inefficient.)

Aaron Rainbolt (arraybolt3) wrote :

New patch, same approach, but with a fallback added if itPrev turns out to be null.

Aaron Rainbolt (arraybolt3) wrote :

After a discussion with dbungert, we're using a different patch instead. This one is possibly upstream already, which makes it superior to the original patch I was suggesting since it means less maintenance overhead in the future.

Aaron Rainbolt (arraybolt3) wrote :

Gah, ok, one more try. I failed to update the debian/copyright file last time. This version has the updated copyright file.

Dan Bungert (dbungert) wrote :

Thanks for the patch, Aaron. I'm able to reproduce this bug using the
firefox f11 -> open link in new window, when using the live session of
the daily-live lubuntu iso. And the attached patch does seem to help.

Overall the patch is what I was hoping to see, but I will request some
changes.

* The Maintainer field currently reads "Maintainer: Lubuntu
  Developers". I'd like you to go with the more conventional "Ubuntu
  Developers" attribution. `seeded-in-ubuntu` says that openbox is
  also used in Ubuntu Mate, and of course people can use openbox
  otherwise. The `update-maintainer` script is a convenient way to set
  this field to the value I'm suggesting (when starting from the
  original value).
* There is a sponsorship process listed on the wiki, I think
  https://wiki.ubuntu.com/MOTU/Sponsorship/SponsorsQueue is a good
  summary. This can also help, in the future, find someone to take a
  look at the patch as doing so means your work shows up in the queue.
  http://reports.qa.ubuntu.com/reports/sponsoring/
  Since I'm already looking at this I won't demand the SponsorsQueue
  bug changes but please keep that in mind for the next one.
* I forget where I saw it but I like the convention that the patch has
  a change number, like it looks like you're on v4 of the patch. So
  it might have been called glib_crash_bugfix-v4.patch or something.
  A nice convention for when these keep changing.
* One item also mentioned on the wiki is forwarding this patch to
  Debian. It is likely that the same problem applies there.
  https://wiki.ubuntu.com/Debian/Bugs talks about this in detail.
  I will ask that you do this before upload. For this particular bug
  I'd feel comfortable sharing the patch with Debian without
  reproducing on Debian, since it's already known to be happening
  elsewhere. Just say so when forwarding.
* A tweak to the changelog would be nice. It isn't necessary to note
  the update to the maintainer field, that will be the case for most
  if not all packages with an Ubuntu delta. Also, the first segment
  about the patch is in my opinion too developer focused. I'd
  probably start with "Cherry-pick patch from A for crash issue when
  B + C happens", fill in appropriate details you think might be
  interesting for someone who might not look at the code.

So Maintainer + Debian + Changelog tweaks and I will be happy to
upload.

Aaron Rainbolt (arraybolt3) wrote :

Latest patch has requested changes. Bug has been submitted to upstream Debian here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1033385

Dan Bungert (dbungert) wrote :

Thanks, uploaded!

Changed in openbox (Ubuntu):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openbox - 3.6.1-10ubuntu1

---------------
openbox (3.6.1-10ubuntu1) lunar; urgency=medium

  * Cherry-pick patch from
    http://git.openbox.org/?p=mikachu/openbox.git;a=commit;h=d41128e5a1002af41c976c8860f8299cfcd3cd72
    to avoid crashing when switching from a fullscreen window (LP: #2011751)
  * Updated copyright file.

 -- Aaron Rainbolt <email address hidden> Thu, 23 Mar 2023 15:47:38 -0500

Changed in openbox (Ubuntu):
status: Fix Committed → Fix Released