create_safety_net is brittle

Bug #825027 reported by Vincent Ladeuil
This bug affects 1 person
Affects  Status        Importance  Assigned to      Milestone
Bazaar   Fix Released  High        Vincent Ladeuil

Bug Description

mthaddon encountered a very weird failure when debugging the new pqm install.

All tests were failing with:

    KeyError: 'getpwuid(): uid not found: 2005'

with a backtrace including:

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/testtools/testcase.py", line 178, in _runCleanups
    function(*arguments, **keywordArguments)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/tests/__init__.py", line 2521, in _check_safety_net
    if (t.get_bytes('.bzr/checkout/dirstate') !=
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/transport/__init__.py", line 605, in get_bytes
    f = self.get(relpath)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/transport/local.py", line 168, in get
    self._translate_error(e, path)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/transport/__init__.py", line 298, in _translate_error
    raise errors.NoSuchFile(path, extra=e)
NoSuchFile: No such file: u'/tmp/testbzr-W7TYpR.tmp/.bzr/checkout/dirstate': [Errno 2] No such file or directory: u'/tmp/testbzr-W7TYpR.tmp/.bzr/checkout/dirstate'

i.e. something went really wrong when creating the safety net.

Indeed in the first test involving the safety net we had:

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/testtools/runtest.py", line 144, in _run_user
    return fn(*args)
  File "/usr/lib/python2.6/dist-packages/testtools/testcase.py", line 429, in _run_setup
    self.setUp()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/tests/__init__.py", line 2908, in setUp
    super(TestCaseWithTransport, self).setUp()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/tests/__init__.py", line 2679, in setUp
    super(TestCaseInTempDir, self).setUp()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/tests/__init__.py", line 2630, in setUp
    self._make_test_root()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/tests/__init__.py", line 2538, in _make_test_root
    self._create_safety_net()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/tests/__init__.py", line 2506, in _create_safety_net
    wt = bzrdir.BzrDir.create_standalone_workingtree(root)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/bzrdir.py", line 612, in create_standalone_workingtree
    format=format).bzrdir
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/bzrdir.py", line 322, in create_branch_and_repo
    bzrdir = BzrDir.create(base, format)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/bzrdir.py", line 1039, in create
    return format.initialize_on_transport(t)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/bzrdir.py", line 1428, in initialize_on_transport
    return self._initialize_on_transport_vfs(transport)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/bzrdir.py", line 1567, in _initialize_on_transport_vfs
    control_files.lock_write()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockable_files.py", line 162, in lock_write
    token_from_lock = self._lock.lock_write(token=token)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 673, in lock_write
    return self.wait_lock()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 603, in wait_lock
    return self.attempt_lock()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 552, in attempt_lock
    result = self._attempt_lock()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 236, in _attempt_lock
    tmpname = self._create_pending_dir()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 335, in _create_pending_dir
    info = LockHeldInfo.for_this_process(self.extra_holder_info)
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 777, in for_this_process
    user=get_username_for_lock_info(),
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/lockdir.py", line 859, in get_username_for_lock_info
    return config.GlobalConfig().username()
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/config.py", line 1033, in __init__
    super(GlobalConfig, self).__init__(file_name=config_filename())
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/config.py", line 1539, in config_filename
    return osutils.pathjoin(config_dir(), 'bazaar.conf')
  File "/home/pqm/pqm-workdir/bzr+ssh/new-pqm-test/bzrlib/config.py", line 1526, in config_dir
    xdg_dir = osutils.pathjoin(os.path.expanduser("~"), ".config")
  File "/usr/lib/python2.6/posixpath.py", line 256, in expanduser
    userhome = pwd.getpwuid(os.getuid()).pw_dir
KeyError: 'getpwuid(): uid not found: 2005'

While this is a bug in the chroot setup (the user running the
tests wasn't declared in /etc/passwd, something pretty hard to
encounter in real life), I think we should be more robust here if
only to make the diagnosis easier (in the pqm case, all output
was in the subunit format and redirected to a temp file).
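A minimal sketch of the kind of robustness asked for here, assuming a hypothetical helper (`home_dir_or_fallback` is not a bzrlib function) that is used in place of a bare `os.path.expanduser("~")`. It treats the `KeyError` from `pwd.getpwuid()` as "no home directory known" and falls back to the system temp directory; the fallback location is an assumption, not what the eventual fix does.

```python
import os
import tempfile

try:
    import pwd  # only available on POSIX platforms
except ImportError:
    pwd = None


def _passwd_home():
    """Return the passwd home directory of the current uid.

    Raises KeyError when the uid has no passwd entry, which is the
    failure shown in the traceback above.
    """
    if pwd is None:
        raise KeyError("no passwd database on this platform")
    return pwd.getpwuid(os.getuid()).pw_dir


def home_dir_or_fallback(environ=None, lookup=_passwd_home):
    """Hypothetical helper: find a usable home directory without raising.

    `environ` and `lookup` are injectable so the behaviour can be
    exercised without a broken chroot.
    """
    environ = os.environ if environ is None else environ
    home = environ.get("HOME")
    if home:
        return home
    try:
        return lookup()
    except KeyError:
        # The uid is not declared in /etc/passwd (the chroot case).
        # Falling back to the temp dir is an assumption of this sketch.
        return tempfile.gettempdir()
```

Calling `home_dir_or_fallback({}, lookup)` with a `lookup` that raises `KeyError` returns the temp directory instead of propagating the error.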

Vincent Ladeuil (vila) wrote :

To debug selftest in a new environment, it's better to use:

     bzr selftest -1

In the case at hand, the selftest would have failed far earlier (-1 tells selftest to stop on first failure).

On the other hand, -1 behaviour could be forced in 'create_safety_net' if a problem occurs as there is no point to keep running in this case.

John A Meinel (jameinel) wrote : Re: [Bug 825027] Re: create_safety_net is brittle

On 8/12/2011 10:43 AM, Vincent Ladeuil wrote:
> To debug selftest in a new environment, it's better to use:
>
> bzr selftest -1
>
> In the case at hand, the selftest would have failed far earlier (-1
> tells selftest to stop on first failure).
>
> On the other hand, -1 behaviour could be forced in
> 'create_safety_net' if a problem occurs as there is no point to keep
> running in this case.
>

True, but we intentionally don't use -1 on PQM because with the long
cycle time, it is better to find out all the failures so you can iterate
locally.

For a deb build or whatever, -1 would be fine if you only care about any
failures.

John
=:->

Jelmer Vernooij (jelmer) wrote :

Apparently fakeroot interacts badly with _check_safety_net. trac-bzr and bzr-cia fail to run their tests because of it:

bzr branch lp:bzr-cia
cd bzr-cia
BZR_PLUGINS_AT=cia@`pwd` fakeroot bzr selftest -s bp.cia

Changed in bzr:
importance: Medium → High
tags: added: regression

John A Meinel (jameinel) wrote :

=== modified file 'bzrlib/tests/__init__.py'
--- bzrlib/tests/__init__.py 2011-08-09 09:11:17 +0000
+++ bzrlib/tests/__init__.py 2011-08-30 11:09:47 +0000
@@ -2569,7 +2569,12 @@
                                                     suffix='.tmp'))
             TestCaseWithMemoryTransport.TEST_ROOT = root

-            self._create_safety_net()
+            try:
+                self._create_safety_net()
+            except Exception, e:
+                sys.stderr.write("We failed to initialize the safety net.\n"
+                                 "%s\nExiting\n" % (e,))
+                sys.exit(1)

             # The same directory is used by all tests, and we're not
             # specifically told when all tests are finished. This will do.

This at least would make it fail sensibly when our setUp isolation isn't working.
I'm guessing there is a better way by teaching the test suite runner something about 'no, this means the test run has to stop', but I don't know testtools/etc that well.
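One way the "this means the test run has to stop" idea could look, sketched with the standard library's `unittest` rather than testtools or bzrlib's own runner. `TestCaseWithSafetyNet`, `create_safety_net` and `run_demo` are hypothetical names for illustration. The sketch relies on `TestResult.stop()`, which sets `shouldStop`; a `unittest.TestSuite` checks that flag before starting each test.

```python
import unittest


class TestCaseWithSafetyNet(unittest.TestCase):
    """Hypothetical base class; not bzrlib's TestCaseWithMemoryTransport."""

    _result = None

    def run(self, result=None):
        # Remember the result object so setUp can ask the runner to stop.
        self._result = result
        return super().run(result)

    def create_safety_net(self):
        """Placeholder for the shared fixture; subclasses override it."""

    def setUp(self):
        super().setUp()
        try:
            self.create_safety_net()
        except Exception:
            if self._result is not None:
                # Ask the suite not to start any further tests.
                self._result.stop()
            # Still report the failure for the current test.
            raise


def run_demo():
    """Run two tests whose shared fixture always fails."""

    class BrokenNet(TestCaseWithSafetyNet):
        def create_safety_net(self):
            raise KeyError("getpwuid(): uid not found: 2005")

        def test_one(self):
            pass

        def test_two(self):
            pass

    suite = unittest.TestLoader().loadTestsFromTestCase(BrokenNet)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Unlike `sys.exit(1)`, this keeps the single error inside the normal result stream, so it would still show up in whatever format the runner emits.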

Vincent Ladeuil (vila) wrote :

One issue here is that create_safety_net indirectly refers to config files and searches for them in either HOME or BZR_HOME.

But both have already been set to None in TestCase.setUp(), which calls _cleanEnvironment().

So it cannot find the right config files (in the fakeroot case, because it's not allowed to read under /root, a situation which according to jelmer exists only on a few build hosts).

So *while building the safety net* we should either set them to some accessible place or restore the right values from test._original_os_environ which still contains the values seen when starting selftest.

The patch proposed by jam above should take care of any other failure.
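The "restore the right values while building the safety net" option could be sketched as a context manager. `restored_environ` is a hypothetical name, and `saved` stands in for a mapping such as `test._original_os_environ`; this is an illustration under those assumptions, not the change that was actually committed.

```python
import contextlib
import os


@contextlib.contextmanager
def restored_environ(saved, names=("HOME", "BZR_HOME")):
    """Temporarily put back the values of `names` found in `saved`.

    `saved` is assumed to be a dict captured when selftest started.
    Variables that are missing (or None) in `saved` are unset for the
    duration of the block.
    """
    previous = {name: os.environ.get(name) for name in names}
    try:
        for name in names:
            value = saved.get(name)
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
        yield
    finally:
        # Put back whatever the cleaned test environment had before.
        for name, value in previous.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

The safety net would then be created inside `with restored_environ(saved):`, so the cleaned environment is back in place before the test body runs.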

Vincent Ladeuil (vila)
Changed in bzr:
assignee: nobody → Vincent Ladeuil (vila)
status: Confirmed → In Progress
Vincent Ladeuil (vila)
Changed in bzr:
milestone: none → 2.4.1
status: In Progress → Fix Released