smartctl assert failure: free(): invalid pointer

Bug #1966610 reported by Vladimir
This bug affects 2 people
Affects: smartmontools (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

smartctl -x crashed on an SPCC M.2 PCIe SSD (2 TB) at /dev/nvme1n1p1.
Command executed: sudo smartctl -x /dev/nvme1 | grep Temp
The crash is very rare and hard to reproduce.
Ubuntu 22.04
smartmontools 7.2-1build1

ProblemType: Crash
DistroRelease: Ubuntu 22.04
Package: smartmontools 7.2-1build1
ProcVersionSignature: Ubuntu 5.15.0-23.23-generic 5.15.27
Uname: Linux 5.15.0-23-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu79
Architecture: amd64
AssertionMessage: free(): invalid pointer
CasperMD5CheckResult: pass
Date: Sun Mar 27 23:36:07 2022
ExecutablePath: /usr/sbin/smartctl
InstallationDate: Installed on 2022-03-25 (1 days ago)
InstallationMedia: Ubuntu 22.04 LTS "Jammy Jellyfish" - Alpha amd64 (20220325)
ProcCmdline: smartctl -x /dev/nvme1
ProcEnviron:
 LANG=ru_RU.UTF-8
 TERM=xterm-256color
 PATH=(custom, no user)
 SHELL=/bin/bash
Signal: 6
SourcePackage: smartmontools
StacktraceTop:
 __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f0bc2dd8b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
 malloc_printerr (str=str@entry=0x7f0bc2dd6764 "free(): invalid pointer") at ./malloc/malloc.c:5664
 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at ./malloc/malloc.c:4439
 __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
 ?? ()
Title: smartctl assert failure: free(): invalid pointer
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
separator:
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu79
Architecture: amd64
CasperMD5CheckResult: pass
DistroRelease: Ubuntu 22.04
InstallationDate: Installed on 2022-03-25 (2 days ago)
InstallationMedia: Ubuntu 22.04 LTS "Jammy Jellyfish" - Alpha amd64 (20220325)
NonfreeKernelModules: nvidia_modeset nvidia zfs zunicode zavl icp zcommon znvpair
Package: smartmontools 7.2-1build1
PackageArchitecture: amd64
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=ru_RU.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 5.15.0-23.23-generic 5.15.27
Tags: jammy
Uname: Linux 5.15.0-23-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True

Revision history for this message
Vladimir (javafors) wrote :
information type: Private → Public
Revision history for this message
Apport retracing service (apport) wrote :

StacktraceTop:
 __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f0bc2dd8b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
 malloc_printerr (str=str@entry=0x7f0bc2dd6764 "free(): invalid pointer") at ./malloc/malloc.c:5664
 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at ./malloc/malloc.c:4439
 __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
 drive_database::~drive_database (this=<optimized out>, this=<optimized out>) at ./knowndrives.cpp:98

tags: removed: need-amd64-retrace
Revision history for this message
Apport retracing service (apport) wrote :

StacktraceSource:
 #0 0x00007f0bc2c93a7c in __nptl_setxid (cmdp=0x7f0bc2a7a840) at ./nptl/nptl_setxid.c:181
   [Error: nptl_setxid.c was not found in source tree]
 #1 0x00007ffdaf596478 in ?? ()
 #2 0x0000000000000000 in ?? ()
StacktraceTop:
 __nptl_setxid (cmdp=0x7f0bc2a7a840) at ./nptl/nptl_setxid.c:181
 ?? ()
 ?? ()

Revision history for this message
Apport retracing service (apport) wrote : Stacktrace.txt
Revision history for this message
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Changed in smartmontools (Ubuntu):
status: New → Invalid
Revision history for this message
Apport retracing service (apport) wrote : Crash report cannot be processed

Thank you for your report!

However, processing it in order to get sufficient information for the
developers failed (it does not generate a useful symbolic stack trace). This
might be caused by some outdated packages which were installed on your system
at the time of the report:

zlib1g version 1:1.2.11.dfsg-2ubuntu8 required, but 1:1.2.11.dfsg-2ubuntu7 is available

Please upgrade your system to the latest package versions. If you still
encounter the crash, please file a new report.

Thank you for your understanding, and sorry for the inconvenience!

Revision history for this message
Lena Voytek (lvoytek) wrote :

Thank you for taking the time to report this bug and help make Ubuntu better.

I tried reproducing the error but was unable to after multiple tries, which makes sense as you stated it was very rare. Have you seen this happen multiple times, and is there any additional information you have about the crash that may help to reproduce it?

Also we can try and debug the issue if you have a crash file available. If a file exists in /var/crash/...smartmontools.../_.crash can you try running:

sudo apport-collect 1966610

Thanks

Changed in smartmontools (Ubuntu):
status: Invalid → Incomplete
Revision history for this message
Vladimir (javafors) wrote : Dependencies.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
Vladimir (javafors) wrote : ProcCpuinfoMinimal.txt

apport information

description: updated
Revision history for this message
Vladimir (javafors) wrote :

I managed to reproduce the error.
The full listing:
 sudo smartctl -x /dev/nvme1
[sudo] password for vkovalen:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-23-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: SPCC M.2 PCIe SSD
Serial Number: 30044294457
Firmware Version: B00u7M10
PCI Vendor/Subsystem ID: 0x1e4b
IEEE OUI Identifier: 0x000000
Total NVM Capacity: 2 048 408 248 320 [2,04 TB]
Unallocated NVM Capacity: 0
Controller ID: 0
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 2 048 408 248 320 [2,04 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 202020 2020202020
Local Time is: Mon Mar 28 19:55:13 2022 MSK
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0006): Format Frmw_DL
Optional NVM Commands (0x000f): Comp Wr_Unc DS_Mngmt Wr_Zero
Log Page Attributes (0x03): S/H_per_NS Cmd_Eff_Lg
Maximum Data Transfer Size: 256 Pages
Warning Comp. Temp. Threshold: 120 Celsius
Critical Comp. Temp. Threshold: 130 Celsius

Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
 0 + 6.50W - - 0 0 0 0 0 0
 1 + 5.80W - - 1 1 1 1 0 0
 2 + 3.60W - - 2 2 2 2 0 0
 3 - 0.0800W - - 3 3 3 3 5000 10000
 4 - 0.0055W - - 4 4 4 4 5000 45000

Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
 0 + 512 0 0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 41 Celsius
Available Spare: 100%
Available Spare Threshold: 1%
Percentage Used: 0%
Data Units Read: 2 087 524 [1,06 TB]
Data Units Written: 7 394 867 [3,78 TB]
Host Read Commands: 36 739 993
Host Write Commands: 58 824 046
Controller Busy Time: 359
Power Cycles: 30
Power On Hours: 73
Unsafe Shutdowns: 9
Media and Data Integrity Errors: 0
Error Information Log Entries: 99
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 41 Celsius
Temperature Sensor 2: 41 Celsius
Temperature Sensor 3: 41 Celsius
Temperature Sensor 4: 41 Celsius
Temperature Sensor 5: 41 Celsius
Temperature Sensor 6: 41 Celsius
Temperature Sensor 7: 41 Celsius
Temperature Sensor 8: 41 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

free(): invalid pointer
Aborted

description: updated
Revision history for this message
Vladimir (javafors) wrote :

It fails once after a reboot; after that, repeated runs complete without error.

free(): invalid pointer

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737347295296) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737347295296) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737347295296) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737347295296, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff7b3e476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff7b247f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff7b856f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7cd7b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6 0x00007ffff7b9cd7c in malloc_printerr (str=str@entry=0x7ffff7cd5764 "free(): invalid pointer") at ./malloc/malloc.c:5664
#7 0x00007ffff7b9eac4 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at ./malloc/malloc.c:4439
#8 0x00007ffff7ba14d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
#9 0x00005555555ae341 in ?? ()
#10 0x00007ffff7b41495 in __run_exit_handlers (status=0, listp=0x7ffff7d15838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#11 0x00007ffff7b41610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#12 0x00007ffff7b25d97 in __libc_start_call_main (main=main@entry=0x555555580640, argc=argc@entry=3, argv=argv@entry=0x7fffffffe5f8) at ../sysdeps/nptl/libc_start_call_main.h:74
#13 0x00007ffff7b25e40 in __libc_start_main_impl (main=0x555555580640, argc=3, argv=0x7fffffffe5f8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe5e8)
    at ../csu/libc-start.c:392
#14 0x0000555555580b85 in ?? ()

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for smartmontools (Ubuntu) because there has been no activity for 60 days.]

Changed in smartmontools (Ubuntu):
status: Incomplete → Expired
Utkarsh Gupta (utkarsh)
Changed in smartmontools (Ubuntu):
status: Expired → New
Revision history for this message
Robie Basak (racb) wrote :

Thank you for the additional details.

It looks to me like the bug is in a library used by smartmontools, but the additional information gathered doesn't really provide much more information.

Unfortunately without the ability for a developer to reproduce it, I don't think this is likely to make any progress. We can leave the bug open though, in case other users are affected and can share any more information, or to coordinate if others are able to work on it.

Changed in smartmontools (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Christian Franke (christian-franke) wrote :

Upstream has not yet received any similar report.

Questions:
- Does it only occur with NVMe devices?
- Does it always occur after the end of the regular output?
- Does it also occur if only a subset of '-x' info is printed?
  (try subsets of '-H -i -c -A -l error')
- Does it also occur with JSON ('-j, --json[=cgiosuvy]') output mode?

Thanks,
Christian
smartmontools.org

Revision history for this message
Vladimir (javafors) wrote :

I have never seen it with SATA devices.
For me it happens with only one NVMe device (out of two installed),
namely the Silicon Power 2TB A80 (SP002TBP34A80M28).
The crash happens only once after a fresh reboot: if I repeat the command that caused the crash, it works fine. A reboot is needed to see the crash again.

smartctl -x /dev/nvme1n1                          crash
smartctl -a /dev/nvme1n1                          crash
smartctl -H /dev/nvme1n1                          no crash
smartctl -i /dev/nvme1n1                          no crash
smartctl -c /dev/nvme1n1                          no crash
smartctl -A /dev/nvme1n1                          no crash
smartctl -A -H -i -c /dev/nvme1n1                 no crash
smartctl -l error /dev/nvme1n1                    crash
smartctl -l error -j /dev/nvme1n1                 crash
smartctl -l error --json /dev/nvme1n1             crash
smartctl -l error --json=cgiosuvy /dev/nvme1n1    crash
(did a fresh reboot after each crash)
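
The matrix above can be reduced mechanically. A minimal sketch in Python, assuming (as the subset suggested earlier in this thread implies) that `-x` and `-a` expand to `-H -i -c -A -l error` on NVMe devices. The `runs` table and the helper name `suspect_options` are illustrative only and are not part of smartmontools:

```python
# Expansion of the composite options. This is an assumption taken from the
# subset '-H -i -c -A -l error' suggested earlier in this thread.
NVME_ALL = frozenset({"-H", "-i", "-c", "-A", "-l error"})

# (options passed, crashed?) transcribed from the test matrix above.
# The JSON runs are omitted: they crashed the same way as plain '-l error'.
runs = [
    (NVME_ALL, True),                                # smartctl -x
    (NVME_ALL, True),                                # smartctl -a
    (frozenset({"-H"}), False),
    (frozenset({"-i"}), False),
    (frozenset({"-c"}), False),
    (frozenset({"-A"}), False),
    (frozenset({"-A", "-H", "-i", "-c"}), False),
    (frozenset({"-l error"}), True),
]


def suspect_options(runs):
    """Return options present in every crashing run and in no clean run."""
    crashing = [opts for opts, crashed in runs if crashed]
    clean = [opts for opts, crashed in runs if not crashed]
    in_all_crashes = frozenset.intersection(*crashing)
    in_any_clean = frozenset().union(*clean)
    return in_all_crashes - in_any_clean


print(sorted(suspect_options(runs)))
```

Under that assumption the only option left is `-l error`, which matches the direct `-l error` runs in the matrix.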

Revision history for this message
Vladimir (javafors) wrote :
Revision history for this message
Christian Franke (christian-franke) wrote :

The required reboot suggests that there is a bug in the kernel driver or drive firmware. It may be related to the fact that `-l error` only reads the first 16 of 64 entries of the error log.

Please test:
smartctl -l error -l nvmelog,0x2,0x200 /dev/nvme1n1

This should print a hex dump of the SMART/Health Information after the 'Error Information'. If the crash appears before the dump, the previous NVMe pass-through call possibly returns more error log data than expected.
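
To make the suspected size mismatch concrete: an NVMe Error Information log entry is 64 bytes, so 16 entries need 1024 bytes while 64 entries need 4096. The sketch below is a toy model in Python, not smartmontools code. `fake_device_read` and the guard pattern are hypothetical stand-ins that simulate a device writing more entries than requested and show how a guard region placed after the buffer would reveal it:

```python
ENTRY_SIZE = 64          # bytes per NVMe Error Information log entry
REQUESTED = 16           # entries read by '-l error' ("16 of 64 entries")
SUPPORTED = 64           # entries the drive reports
GUARD = b"\xa5" * 4096   # hypothetical guard region placed after the buffer


def make_arena(entries):
    """Buffer for `entries` log entries followed by a guard region."""
    return bytearray(entries * ENTRY_SIZE) + bytearray(GUARD)


def fake_device_read(arena, entries_written):
    """Simulate a device or driver that fills `entries_written` entries,
    regardless of how many the caller asked for."""
    n = entries_written * ENTRY_SIZE
    arena[:n] = bytes(n)  # zero-filled entries, i.e. "No Errors Logged"


def overrun_bytes(arena, entries):
    """Count guard bytes that no longer hold the guard pattern."""
    guard = arena[entries * ENTRY_SIZE:]
    return sum(1 for got, want in zip(guard, GUARD) if got != want)


arena = make_arena(REQUESTED)
fake_device_read(arena, SUPPORTED)
print(overrun_bytes(arena, REQUESTED))  # (64 - 16) * 64 bytes past the buffer
```

In a real process there is no guard region, so bytes written past the buffer would land in whatever the allocator placed next to it.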

Revision history for this message
Vladimir (javafors) wrote :

The first run (with crash):

sudo smartctl -l error -l nvmelog,0x2,0x200 /dev/nvme1n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-37-lowlatency] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

NVMe Log 0x02 (0x0200 bytes)
 00 00 39 01 64 01 00 00 00 00 00 00 00 00 00 00 00 .9.d............
 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 20 f8 8e cb 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 30 91 9e bb 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 40 03 f0 6e 05 00 00 00 00 00 00 00 00 00 00 00 00 ..n.............
 50 ad 0d aa 04 00 00 00 00 00 00 00 00 00 00 00 00 ................
 60 dd 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 70 77 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 w...............
 80 fd 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 90 16 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 b0 5c 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 \...............
 c0 00 00 00 00 00 00 00 00 39 01 39 01 39 01 39 01 ........9.9.9.9.
 d0 39 01 39 01 39 01 39 01 00 00 00 00 00 00 00 00 9.9.9.9.........
 e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 1a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 1b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 1c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 1d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 1e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
 1f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

free(): invalid pointer
Aborted

The second run (without crash):

vkovalen@asus-tuf:~$ sudo smartctl -l error -l nvmelog,0x2,0x200 /dev/nvme1n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-37-lowlatency] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
Error I...
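
As a cross-check of the dump from the first run, the leading fields of the SMART/Health log can be decoded by hand. A minimal sketch in Python, assuming the standard NVMe layout for log page 0x02 (critical warning at byte 0, composite temperature in Kelvin at bytes 1-2, then available spare, spare threshold, and percentage used at bytes 3-5, all little-endian). `decode_health_header` is an illustrative helper, not a smartmontools function:

```python
# First 16 bytes of the 'NVMe Log 0x02' dump from the first run above.
first_row = bytes.fromhex("00 39 01 64 01 00 00 00 00 00 00 00 00 00 00 00")


def decode_health_header(log):
    """Decode the first six bytes of an NVMe SMART/Health log page."""
    kelvin = int.from_bytes(log[1:3], "little")
    return {
        "critical_warning": log[0],
        "temperature_celsius": kelvin - 273,
        "available_spare_pct": log[3],
        "spare_threshold_pct": log[4],
        "percentage_used": log[5],
    }


print(decode_health_header(first_row))
```

The decoded 100% spare, 1% threshold, and 0% used agree with the values smartctl printed in the earlier `-x` listing, which suggests the log 0x02 read itself returned sane data in the crashing run.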


Revision history for this message
Christian Franke (christian-franke) wrote :

The differing Error Information outputs show that something is possibly wrong at the driver or firmware level.

The 'free(): invalid pointer' possibly appears in exit processing after the return from 'main()'. Further diagnostics would require a developer running smartctl in a debugger on an affected system.

Revision history for this message
Vladimir (javafors) wrote :

I rebuilt smartmontools with debug symbols.
The gdb output:

(gdb) r
Starting program: /usr/sbin/smartctl -l error /dev/nvme1n1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-39-lowlatency] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

free(): invalid pointer

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737347258432) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737347258432) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737347258432) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737347258432, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff7b35476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff7b1b7f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff7b7c6f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7cceb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6 0x00007ffff7b93d7c in malloc_printerr (str=str@entry=0x7ffff7ccc764 "free(): invalid pointer") at ./malloc/malloc.c:5664
#7 0x00007ffff7b95ac4 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at ./malloc/malloc.c:4439
#8 0x00007ffff7b984d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
#9 0x00005555555bd0ac in drive_database::~drive_database (this=<optimized out>, this=<optimized out>) at ./knowndrives.cpp:98
#10 0x00007ffff7b38495 in __run_exit_handlers (status=0, listp=0x7ffff7d0c838 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:113
#11 0x00007ffff7b38610 in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:143
#12 0x00007ffff7b1cd97 in __libc_start_call_main (main=main@entry=0x555555588e08 <main(int, char**)>, argc=argc@entry=4, argv=argv@entry=0x7fffffffe5e8) at ../sysdeps/nptl/libc_start_call_main.h:74
#13 0x00007ffff7b1ce40 in __libc_start_main_impl (main=0x555555588e08 <main(int, char**)>, argc=4, argv=0x7fffffffe5e8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe5d8) at ../csu/libc-start.c:392
#14 0x0000555555581335 in _start ()

So it seems the problem is in the drive_database destructor.

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Thanks for the feedback.

It doesn't make much sense that the code is crashing at this place, unless one of the pointers that were copied into the vector is invalid.

A quick search for "smartmontools invalid pointer" led me to a post where the user says that they needed to update the SMART database in order to have smartctl properly read the information from the NVMe drive. Have you tried doing that? The following should be enough to update the DB:

# /usr/sbin/update-smart-drivedb

Revision history for this message
Christian Franke (christian-franke) wrote :

The drive database is always read but never used for NVMe drives, except if the drive is behind a USB->NVMe bridge. The related code including the dtor is 13+ years old, see
https://www.smartmontools.org/changeset/2653

Updating the drive database might change the behavior due to different number and size of allocated strings. To test with an empty drive database, use:
smartctl -B /dev/null ...
(The warning about a missing DEFAULT entry can be safely ignored.)

The crash within the drive_database dtor does not prove that the bug is in its code. The observed behavior still suggests that the code executed if and only if '-l error' is specified results in memory corruption which (at least) affects the area used by vector 'knowndrives.m_custom_strings'.

The bug could possibly be found by stepping through the 'if (options.error_log_entries) {...}' section in nvmeprint.cpp and watching the 'knowndrives.m_custom_strings' area.

Rebuilding without optimization may result in fewer '<optimized out>' messages in debug output.

Temporarily disabling ASLR may result in a reproducible memory layout:
echo 0 > /proc/sys/kernel/randomize_va_space
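
Why the abort can surface in the destructor even if the faulty write happened earlier can be shown with a toy model. The Python sketch below is purely illustrative: `ToyHeap`, its addresses, and the `custom_strings` list are hypothetical stand-ins for the allocator and for `knowndrives.m_custom_strings`, and the corruption is simulated rather than derived from the real code:

```python
class ToyHeap:
    """Tracks live allocations the way an allocator tracks valid chunks."""

    def __init__(self):
        self._live = set()
        self._next = 0x1000

    def malloc(self, size):
        addr = self._next
        self._next += max(size, 16)
        self._live.add(addr)
        return addr

    def free(self, addr):
        if addr not in self._live:
            raise RuntimeError("free(): invalid pointer")
        self._live.remove(addr)


heap = ToyHeap()
events = []

# Start-up: the drive database stores pointers to its custom strings.
custom_strings = [heap.malloc(32) for _ in range(4)]
events.append("database loaded")

# Mid-run: a simulated out-of-bounds write clobbers one stored pointer.
# Nothing fails at this point, so the regular output is printed in full.
custom_strings[2] = 0xDEADBEEF
events.append("error log printed")

# Exit: the destructor frees every stored pointer and trips on the bad one.
try:
    for ptr in custom_strings:
        heap.free(ptr)
    events.append("clean exit")
except RuntimeError as exc:
    events.append(str(exc))

print(events)
```

The failure is reported at the last step although the damage was done in the middle one, which is consistent with the backtraces showing the abort under `__run_exit_handlers`.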
