Failure to import non-ascii filenames

Bug #508258 reported by Andrew Starr-Bochicchio
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Ubuntu Distributed Development | Fix Released | High | Martin Packman |
bzr-builddeb | Fix Released | High | Martin Packman |

Bug Description

There are no bazaar source branches for the “gnome-games” package in Ubuntu on Launchpad. See:

https://code.edge.launchpad.net/ubuntu/+source/gnome-games/+branches

http://package-import.ubuntu.com/failures/gnome-games

Failed at 2010-01-12 21:11:15.048327

Traceback (most recent call last):
  File "./import_package.py", line 788, in <module>
    no_existing=options.no_existing))
  File "./import_package.py", line 713, in main
    import_package(temp_dir, importp, revid_db, bstore, possible_transports=possible_transports)
  File "./import_package.py", line 481, in import_package
    use_time_from_changelog=True)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1481, in import_package
    file_ids_from=file_ids_from)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1376, in _do_import_package
    timestamp=timestamp, file_ids_from=file_ids_from)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1244, in import_debian
    file_ids_from=parent_trees + debian_trees)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 160, in import_dir
    import_archive(tree, dir_file, file_ids_from=file_ids_from)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 235, in import_archive
    trans_id = tt.trans_id_tree_path(relative_path)
  File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 241, in trans_id_tree_path
    path = self.canonical_path(path)
  File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 1294, in canonical_path
    abs = self._tree.abspath(path)
  File "/usr/lib/python2.5/site-packages/bzrlib/workingtree.py", line 395, in abspath
    return pathjoin(self.basedir, filename)
  File "/usr/lib/python2.5/posixpath.py", line 65, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1: ordinal not in range(128)
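
The last two frames of the traceback can be illustrated with a small sketch. It assumes the working tree's basedir was a unicode string while the filename was a byte string beginning with the byte 0xc2, which is consistent with "position 1" in the message because a "/" is prepended before the join. The sketch is written for Python 3, where mixing text and bytes raises a TypeError instead, so the implicit ASCII decode that Python 2 performed is spelled out explicitly. The names `basedir`, `filename` and `join_like_python2` are hypothetical; the real filename is not shown in the report.

```python
# Hypothetical values: the actual path from the failing package is not in the report.
basedir = "/tmp/tree"              # text path, standing in for the tree's basedir
filename = b"\xc2\xa9-notes.txt"   # byte-string path whose first byte is 0xc2


def join_like_python2(base, name):
    """Mimic Python 2's `unicode + str`, which decoded the byte string as ASCII."""
    suffix = b"/" + name                  # same shape as `'/' + b` in posixpath.join
    return base + suffix.decode("ascii")  # the implicit decode, made explicit


error = None
try:
    join_like_python2(basedir, filename)
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

Under these assumptions the decode fails on the first non-ASCII byte of the filename, before the path is ever looked up on disk.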

Affects:
  http://package-import.ubuntu.com/status/bash-completion.html
  http://package-import.ubuntu.com/status/gnome-games.html
  http://package-import.ubuntu.com/status/gnome-themes-ubuntu.html
  http://package-import.ubuntu.com/status/ispell-fo.html
  http://package-import.ubuntu.com/status/kbd.html
  http://package-import.ubuntu.com/status/norwegian.html
  http://package-import.ubuntu.com/status/suomi-malaga.html
  http://package-import.ubuntu.com/status/swedish.html
  http://package-import.ubuntu.com/status/ubuntu-wallpapers.html

description: updated
John A Meinel (jameinel)
summary: - No source branches for “gnome-games” package in Ubuntu
+ Failure to import non-ascii filenames
description: updated
Changed in udd:
importance: Undecided → Medium
status: New → Confirmed
James Westby (james-w)
tags: added: bzr
James Westby (james-w)
description: updated
tags: added: import-failure main
James Westby (james-w)
description: updated
description: updated
James Westby (james-w)
description: updated
description: updated
James Westby (james-w)
description: updated
James Westby (james-w)
description: updated
James Westby (james-w)
description: updated
Jonathan Riddell (jr)
Changed in udd:
assignee: nobody → Jonathan Riddell (jr)
Revision history for this message
Jonathan Riddell (jr) wrote :

The current UnicodeDecodeErrors seem to be caused by non-UTF-8 filenames in packages. These are fine while held in str variables, but once they are converted to unicode they get re-encoded as UTF-8 when the file is read, and a file with that name does not exist.

I had some success by forcing strings to be encoded as iso-8859-1 on the package liblingua-de-ascii-perl, but the errors moved into bzrlib and I couldn't track them down.
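
The mismatch described in this comment can be sketched as follows. This is an illustration only, written for Python 3 and using a hypothetical filename (`raw_name`); it does not reproduce what import_dsc.py or bzrlib actually do. It shows that a name stored on disk as ISO-8859-1 bytes is not valid UTF-8, and that decoding it and re-encoding it as UTF-8 yields different bytes, so a lookup by the re-encoded name would miss the file.

```python
# Hypothetical filename: "café.txt" as stored on disk in ISO-8859-1.
raw_name = b"caf\xe9.txt"

# The bytes are not valid UTF-8, so a UTF-8 decode fails outright.
try:
    raw_name.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decoding as ISO-8859-1 always succeeds, since every byte maps to a code point.
text_name = raw_name.decode("iso-8859-1")

# Re-encoding the text as UTF-8 produces different bytes from those on disk.
reencoded = text_name.encode("utf-8")

# Only encoding back with the original codec round-trips to the on-disk name.
roundtrip_ok = text_name.encode("iso-8859-1") == raw_name

print(valid_utf8, raw_name, reencoded, roundtrip_ok)
```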

Changed in udd:
assignee: Jonathan Riddell (jr) → nobody
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

How are we converting these non-UTF-8 filenames to unicode at the moment? Do we just assume they're in ISO-8859-1?

Martin Pool (mbp)
Changed in udd:
importance: Medium → Low
importance: Low → High
Revision history for this message
Jonathan Riddell (jr) wrote :

It'll just use whatever the system locale is.
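
To make the locale dependence concrete, here is a minimal Python 3 sketch. It uses only standard-library calls (`locale.getpreferredencoding` and `sys.getfilesystemencoding`) to show what the current system reports, and then decodes one hypothetical byte-string filename under a few candidate encodings. Which function the importer actually consulted is not stated in the report, so this is an assumption for illustration.

```python
import locale
import sys

# What the running system reports; the values vary from machine to machine.
preferred = locale.getpreferredencoding(False)
fs_encoding = sys.getfilesystemencoding()
print(preferred, fs_encoding)

# Hypothetical on-disk filename containing one non-ASCII byte.
raw_name = b"caf\xe9.txt"

# The same bytes become different text depending on the assumed encoding.
decoded = {}
for encoding in ("iso-8859-1", "cp1252", "iso-8859-5"):
    decoded[encoding] = raw_name.decode(encoding)
    print(encoding, decoded[encoding])
```

The same package could therefore import with different unicode filenames, or fail, depending on the locale of the machine running the import.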

Martin Packman (gz)
Changed in bzr-builddeb:
assignee: nobody → Martin Packman (gz)
importance: Undecided → High
status: New → In Progress
Martin Packman (gz)
Changed in bzr-builddeb:
milestone: none → 2.8
status: In Progress → Fix Committed
Revision history for this message
Martin Packman (gz) wrote :

Deployed the bzr-builddeb fix and requeued packages with this issue.

Many should now succeed; those that don't should be categorised under:

<http://package-import.ubuntu.com/status/08eff66a5fe37a967e2f2b06210cc608.html>

Resolving those probably depends on bug 63324 in bzrlib.

Changed in udd:
assignee: nobody → Martin Packman (gz)
status: Confirmed → Fix Released
Jelmer Vernooij (jelmer)
Changed in bzr-builddeb:
status: Fix Committed → Fix Released