uscan should support parsing S3 bucket listings

Bug #798293 reported by Scott Moser
Affects               Status         Importance   Assigned to   Milestone
devscripts (Debian)   Fix Released   Unknown
devscripts (Ubuntu)   Fix Released   Wishlist     Unassigned

Bug Description

Binary package hint: devscripts

A few (admittedly very few) upstreams host their files on S3.
By default, S3 does not produce Apache-like file listings; instead it returns the bucket listing in XML format.

ec2-api-tools:
  Homepage: http://aws.amazon.com/developertools/351
  S3 listing: http://s3.amazonaws.com/ec2-downloads/
ec2-ami-tools:
  Homepage: http://aws.amazon.com/developertools/368
  S3 listing: http://s3.amazonaws.com/ec2-downloads/
rdscli (bug 797387):
  Homepage: http://aws.amazon.com/developertools/2928
  S3 listing: http://s3.amazonaws.com/rds-downloads/

I suggest that uscan be taught to understand the well-defined output of an S3 bucket listing. Below is an example. I've added line breaks; the content is usually only 2 lines (the XML header and the content).

<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>ec2-downloads</Name><Prefix></Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><IsTruncated>false</IsTruncated>
<Contents><Key>2006-06-26.ec2.wsdl</Key><LastModified>2006-10-23T12:22:30.000Z</LastModified><ETag>&quot;d4fa76ef26b78d3905e009de9db8bf7d&quot;</ETag><Size>28344</Size><StorageClass>STANDARD</StorageClass></Contents><Contents><Key>2006-10-01.ec2.wsdl</Key><LastModified>2006-12-11T15:16:07.000Z</LastModified><ETag>&quot;c10f3f9a199906e09eb3140b1f694e78&quot;</ETag><Size>38244</Size><StorageClass>STANDARD</StorageClass></Contents><Contents><Key>2007-01-03.ec2.wsdl</Key><LastModified>2007-02-10T14:57:28.000Z</LastModified><ETag>&quot;3315c7afe5a27444fddf8e0f21d20d1c&quot;</ETag><Size>39314</Size><StorageClass>STANDARD</StorageClass></Contents><Contents><Key>2007-01-19.ec2.wsdl</Key><LastModified>2007-03-22T16:29:10.000Z</LastModified><ETag>&quot;01d64aeb8a01b41f80e8bc6627daa313&quot;</ETag><Size>40501</Size><StorageClass>STANDARD</StorageClass></Contents><Contents><Key>2007-03-01.ec2.wsdl</Key><LastModified>2007-07-06T05:25:50.000Z</LastModified><ETag>&quot;a716de16be05c734b687f0612134d655&quot;</ETag><Size>43408</Size><StorageClass>STANDARD</StorageClass></Contents>
</ListBucketResult>
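
To illustrate how regular this format is, here is a minimal Python sketch that pulls the file names out of such a listing. It is not uscan's code; the function name `list_keys` and the shortened `SAMPLE` document are made up for this example, and it assumes the listing is not truncated.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example listing above.
S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def list_keys(listing_xml):
    """Return the text of every <Key> element, in document order."""
    root = ET.fromstring(listing_xml)
    return [key.text for key in root.iter("{%s}Key" % S3_NS)]


# Shortened version of the example listing (hypothetical test data).
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    b"<Name>ec2-downloads</Name><IsTruncated>false</IsTruncated>"
    b"<Contents><Key>2006-06-26.ec2.wsdl</Key><Size>28344</Size></Contents>"
    b"<Contents><Key>2006-10-01.ec2.wsdl</Key><Size>38244</Size></Contents>"
    b"</ListBucketResult>"
)

print(list_keys(SAMPLE))
```

Each `<Key>` plays the role that an href plays in an Apache-style index page.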

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: devscripts 2.11.0ubuntu1
ProcVersionSignature: Ubuntu 3.0-0.1-generic 3.0.0-rc2
Uname: Linux 3.0-0-generic x86_64
Architecture: amd64
Date: Thu Jun 16 11:09:58 2011
EcryptfsInUse: Yes
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta amd64 (20100318)
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, no user)
 LANG=en_US.utf8
 LC_MESSAGES=en_US.utf8
 SHELL=/bin/bash
SourcePackage: devscripts
UpgradeStatus: Upgraded to oneiric on 2010-11-15 (212 days ago)

Scott Moser (smoser) wrote :
Scott Moser (smoser)
tags: removed: running-unity unity-2d
Scott Moser (smoser) wrote :

Just as an example of a package/watch file, you can check out:
lp:~awstools-dev/ubuntu/oneiric/rdscli/oneiric/

$ /home/smoser/src/devscripts/trunk/scripts/uscan.pl --verbose
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   http://s3.amazonaws.com/rds-downloads/RDSCli-([0-9].*).zip
-- Found the following matching hrefs:
     RDSCli-1.0.001.zip
     RDSCli-1.0.004.zip
     RDSCli-1.0.005.zip
     RDSCli-1.0.006.zip
     RDSCli-1.1.004.zip
     RDSCli-1.1.005.zip
     RDSCli-1.2.006.zip
     RDSCli-1.3.003.zip
     RDSCli-1.4.006.zip
Newest version on remote site is 1.4.006, local version is 1.4.006
 => Package is up to date
-- Scan finished
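
For readers unfamiliar with what the scan above does, the following Python sketch mimics the matching step: apply the file-name part of the watch pattern to each listed name and pick the highest version. It is a simplification, not uscan's implementation (uscan is written in Perl and uses Debian version comparison); `newest_version` is a hypothetical helper that assumes purely numeric, dot-separated versions.

```python
import re

# File-name part of the watch line shown above.
PATTERN = r"RDSCli-([0-9].*).zip"

# Names as reported in the "matching hrefs" output above.
NAMES = [
    "RDSCli-1.0.001.zip", "RDSCli-1.0.004.zip", "RDSCli-1.0.005.zip",
    "RDSCli-1.0.006.zip", "RDSCli-1.1.004.zip", "RDSCli-1.1.005.zip",
    "RDSCli-1.2.006.zip", "RDSCli-1.3.003.zip", "RDSCli-1.4.006.zip",
]


def newest_version(names, pattern):
    """Return the highest version captured by group 1, or None if no match."""
    regex = re.compile(pattern)
    versions = []
    for name in names:
        match = regex.fullmatch(name)
        if match:
            versions.append(match.group(1))
    if not versions:
        return None
    # Simplified ordering: compare dotted components as integers.
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


print(newest_version(NAMES, PATTERN))
```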

Benjamin Drung (bdrung) wrote :

This patch should be applied in Debian first.

Scott Moser (smoser) wrote : Bug#630756: [uscan] support parsing S3 bucket listings

Package: devscripts
Version: 2.10.69ubuntu2
Severity: wishlist
File: /usr/bin/uscan
Tags: patch

A few (admittedly very few) upstreams host their files on S3.
By default, S3 does not produce Apache-like file listings, but instead returns
the bucket listing in XML format.

uscan can be modified to allow watch files that reference these listings.

Attached is a suggested patch.
See also ubuntu bug 798293 (http://bugs.launchpad.net/bugs/798293).

There is very little chance of false positives, and the content that is
found in S3 bucket listings is well-defined, so it is not likely to stop
working. Before considering the content to be an S3 bucket listing, it
checks:
 a.) that the file begins with "<?xml"
 b.) that it contains the string
     'xmlns="http://s3.amazonaws.com/doc/2006-03-01/"'
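
The two checks listed above could look roughly like the Python sketch below. The attached patch is not reproduced in this report, so this is only an illustration of the idea: the names `looks_like_s3_listing` and `keys_as_hrefs` are hypothetical, the namespace string is written with the quotes that appear in an actual listing, and turning each key into an anchor is an assumption about how the listing could be fed to href-based matching.

```python
import re

# Namespace attribute as it appears in an S3 listing (quotes included).
S3_XMLNS = 'xmlns="http://s3.amazonaws.com/doc/2006-03-01/"'


def looks_like_s3_listing(content):
    """Apply checks a.) and b.): XML prolog first, S3 namespace present."""
    return content.startswith("<?xml") and S3_XMLNS in content


def keys_as_hrefs(content):
    """Hypothetical helper: turn each <Key> into an HTML-style anchor."""
    keys = re.findall(r"<Key>([^<]*)</Key>", content)
    return ['<a href="%s">%s</a>' % (key, key) for key in keys]


# Small made-up listing used only to exercise the functions.
LISTING = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Name>rds-downloads</Name>"
    "<Contents><Key>RDSCli-1.4.006.zip</Key></Contents>"
    "</ListBucketResult>"
)

if looks_like_s3_listing(LISTING):
    for anchor in keys_as_hrefs(LISTING):
        print(anchor)
```

An ordinary HTML index page fails both checks, so it would be handled as before.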

-- Package-specific info:

--- /etc/devscripts.conf ---

--- ~/.devscripts ---
DEBSIGN_KEYID=024BC6F0
DEBUILD_DPKG_BUILDPACKAGE_OPTS="--source-option=--abort-on-upstream-changes"

-- System Information:
Debian Release: squeeze/sid
  APT prefers natty-updates
  APT policy: (500, 'natty-updates'), (500, 'natty-security'), (500, 'natty')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.38-8-server (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages devscripts depends on:
ii dpkg-dev 1.16.0~ubuntu7 Debian package development tools
ii libc6 2.13-0ubuntu13 Embedded GNU C Library: Shared lib
ii perl 5.10.1-17ubuntu4.1 Larry Wall's Practical Extraction

Versions of packages devscripts recommends:
ii 3.1.12-1ubuntu2 Delayed job execution and batch pr
ii 8.1.2-0.20100314cvs-1 simple mail user agent
ii 2.3.1-1ubuntu1 easy to use distributed version co
ii 7.21.3-1ubuntu1 Get a file from an HTTP, HTTPS or
ii 2.14.5 Command-line tools to process Debi
ii 0.9.6.1ubuntu1 Debian package upload tool
ii 1.14.4-1ubuntu1 Gives a fake root environment
ii 4.0.1+build1+nobinonly-0ubuntu0.11.04.3 Safe and easy web browser from Moz
ii 1:1.7.4.1-3 fast, scalable, distributed revisi
ii 1.4.11-3ubuntu1 GNU privacy guard - a free PGP rep
ii 2.1500-1 Authen::SASL - SASL Authentication
ii 2.27-1 Perl module to parse and convert t
ii 2.005-2 Easy OO parsing of Debian control-
ii 0.2-4build3 Perl extension for retrieving term
ii 1.2000-1 collection of modules to manipulat
ii 1.56-1 module to manipulate and access UR
ii 5.837-1 simple and consistent interface to
ii 2.5.0~rc2ubuntu3 Debian package checker
ii 4.0-0ubuntu11 Linux Standard Base version report
ii 4.43-14ubuntu2 Compression method of 7z format in
ii 2...

Changed in devscripts (Debian):
status: Unknown → New
Benjamin Drung (bdrung)
Changed in devscripts (Ubuntu):
status: New → Fix Committed
Benjamin Drung (bdrung)
Changed in devscripts (Ubuntu):
importance: Undecided → Wishlist
tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package devscripts - 2.11.8

---------------
devscripts (2.11.8) unstable; urgency=low

  [ David Prévot ]
  * French translation update.

  [ James McCoy ]
  * dd-list:
    + Recognize -h argument, as documented.
    + Don't error when given multiple binary packages from the same source.
      (Closes: #672309)
  * Also note DEBCHANGE_MAINTTRAILER change in NEWS entry for 2.11.7.
    (Closes: #672973)
  * dget:
    + Fix handling of sources.list entries with a port. (Closes: #672460) Still
      can't handle entries at the same domain but different ports until
      #154868 is fixed.
  * debcheckout:
    + Document the DEBCHECKOUT_SOURCE configuration variable.
    + Determine the source package name when downloading the source tarball.
      This ensures the downloaded files aren't incorrectly removed after being
      downloaded.
    + Adapt find_repo() to determine the tarball name for native packages.

  [ Benjamin Drung ]
  * debchange:
    + Add --vendor= and DEBCHANGE_VENDOR to override the distributor ID
      returned by dpkg-vendor.
    + Always perform Vendor check.
    + Fall back to Debian vendor when a Debian-specific command-line option
      has been supplied (--nmu, --qa, --bin-nmu, --bpo).
    + Adjust --security template for Ubuntu.
    + Add -R/--rebuild flag for Ubuntu's no-change rebuilds.
    + Append ubuntu1 to version when incrementing on Ubuntu, unless a
      -U/--upstream option is given.
    + On Ubuntu, don't copy the previous distribution name for a new changelog
      entry. Use the Ubuntu devel release.
    + Don't use NMU versioning for NMUs / Security uploads on Ubuntu.
    + dch --increment changes XbuildY to Xubuntu1 on Ubuntu (LP: #690230).
    + Try to guess the vendor based on the given distribution name (LP: #723715)
    + Prefer UBUMAIL over DEBEMAIL on Ubuntu (LP: #929846).
  * Add first tests for licensecheck.
  * Add online test for uscan.

  [ Stefano Rivera ]
  * devscripts.Logger Don't substitute arguments into logged strings unless
    they were provided. (LP: #968129)
  * debchange: Use distro-info to determine Ubuntu release names (LP: #997932).
  * Incorporate Ubuntu's delta:
    Move debian-keyring, equivs, libcrypt-ssleay-perl, and libsoap-lite-perl
    to Suggests when building on Ubuntu.

  [ Salvatore Bonaccorso ]
  * bts: When searching for usertags use tag= in the url (followed by
    the options containing users=). (Closes: #675071).

  [ Raphael Geissert ]
  * dget: ignore duplicate repository URLs. (Closes: #675258)

  [ Kees Cook ]
  * licensecheck: Catch LGPL more robustly. (Closes: #623283)

  [ Thijs Kinkhorst ]
  * debdiff: Do not generate warnings when debdiff'ing dpkg source format
    3.0 (git). (Closes: #668372)
  * debuild: Do not warn for missing upstream tarball if package is source
    format 3.0 (git). (Closes: #668372)

  [ Scott Moser ]
  * uscan: Support watch files that reference S3 bucket listings.
    (Closes: #630756, LP: #798293)

  [ Yaroslav Halchenko ]
  * licensecheck: Check licenses in .m (Octave/Matlab), .tex (LaTeX),
    and .pyx (Python's pyrex) files (Closes: #604529)

  [ Ivan Borzenkov ]
  * licensecheck: Add detection co...

Changed in devscripts (Ubuntu):
status: Fix Committed → Fix Released
Changed in devscripts (Debian):
status: New → Fix Released