LaunchpadDirectory shouldn't use xmlrpc

Bug #397739 reported by Martin Pool
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
High
John A Meinel
2.3
Won't Fix
Medium
John A Meinel
Launchpad itself
Fix Released
Low
Unassigned

Bug Description

Possibly bzr's LaunchpadDirectory (the thing that provides a directory service like "lp:bzr") shouldn't use xmlrpc lib but rather another protocol:

* xmlrpclib is an external dependency with some bugs like bug 186920
* it's inconsistent with other API-type services on Launchpad
* it's also inconsistent at the code level with the mechanisms used by bzr

We could alternatively send a (possibly authenticated) http REST request.

On the other hand, other than possibly implicitly fixing those bugs, there may not be much of a benefit in changing this, and it would require a new deployment on the Launchpad side.

Related branches

Revision history for this message
Jonathan Lange (jml) wrote :

Not as much of a deployment on the Launchpad side as you might think. We already have a public method that takes a branch URL (including an lp: URL) and returns a branch object.

There is already a patch to bzr providing a facility for doing authenticated http REST requests to Launchpad. Note that this introduces more external dependencies. These deps are in some ways worse than xmlrpclib, since they don't come with the standard library.

If Bazaar would like to avoid these dependencies, then we could perhaps provide our own implementation of a subset of launchpadlib.

I'm definitely personally interested in seeing this bug fixed.

Changed in launchpad-code:
importance: Undecided → Low
status: New → Triaged
Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 397739] Re: LaunchpadDirectory shouldn't use xmlrpc

2009/7/12 Jonathan Lange <email address hidden>:
> Not as much of a deployment on the Launchpad side as you might think. We
> already have a public method that takes a branch URL (including an lp:
> URL) and returns a branch object.
>
> There is already a patch to bzr providing a facility for doing
> authenticated http REST requests to Launchpad. Note that this introduces
> more external dependencies. These deps are in some ways worse than
> xmlrpclib, since they don't come with the standard library.

Would it be feasible to perhaps have Launchpad answer these queries by
a plain text response (ie one url per line), so that we don't need a
json parser in bzr? I think that was one of the objections.

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Jonathan Lange (jml) wrote :

On Mon, Jul 13, 2009 at 11:49 AM, Martin Pool<email address hidden> wrote:
> Would it be feasible to perhaps have Launchpad answer these queries by
> a plain text response (ie one url per line), so that we don't need a
> json parser in bzr?  I think that was one of the objections.
>

I don't think so. Other dependencies include wadllib, launchpadlib,
lazr.uri, oauth and whatever they depend on.

Revision history for this message
John A Meinel (jameinel) wrote :

Downgrading importance given that bug #186920 is fixed.

Changed in bzr:
importance: Low → Wishlist
Revision history for this message
Aaron Bentley (abentley) wrote :

At this point, we have plugins that use the API, and the API supports anonymous access. So it's certainly possible, but it's hard to imagine disabling the xmlrpc option for those without API support.

Martin Pool (mbp)
tags: added: lpurl
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> At this point, we have plugins that use the API, and the API supports
> anonymous access. So it's certainly possible, but it's hard to imagine
> disabling the xmlrpc option for those without API support.
>

Right. Not to mention that my experiments don't instill me with
confidence that going via the API will be any faster/provide a better
user experience.

Using 'timeit' to load all plugins right now:
$ TIMEIT "call([sys.executable, '-c', 'import bzrlib.plugin; \
                bzrlib.plugin.load_plugins()'])"
10 loops, best of 3: 496 msec per loop

Adding an import of the launchpad api:
$ TIMEIT "call([sys.executable, '-c', 'import bzrlib.plugin; \
 bzrlib.plugin.load_plugins();\
 from bzrlib.plugins.launchpad import lp_api])"
10 loops, best of 3: 1.04 sec per loop

Which is 500ms of overhead to load lp_api. On a Linux virtual guest,
things are a little bit better:
268ms baseline, 384ms import lp_api.

Though also that machine doesn't have as many plugins or search paths
installed.

It is still a rather significant overhead, though I suppose comparable
to an http round trip. So if lp_api could save us a few round trips to
resolve the lp: url, it could be a net win. (500ms is a lot, though.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkt0RN4ACgkQJdeBCYSNAAP4NQCfeSQuyMTctFilHRIMtUnnpv21
M9MAoMUukpMaQkyEi2tuI0MIe/ScWOIR
=nu3O
-----END PGP SIGNATURE-----

Revision history for this message
Martin Pool (mbp) wrote :

@john I think just changing to using lp rest apis would be not worth doing

Here's a straw man:

* always connect over ssh
* if we don't know your username when we first connect, ask (actually
a bit hard to implement on top of openssh...)
* if you want http, say so
* connect code into the Launchpad ssh server to remap branch names
when they're opened.

This would cut out all overhead for people who do have accounts, who
are our most active users. People without accounts are treated as a
bit second class at the moment because they get this warning every
time and access is a bit slower. There are other bugs open asking for
anonymous ssh etc.
--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Robert Collins (lifeless) wrote :

On Thu, 2010-02-11 at 17:56 +0000, John A Meinel wrote:

> It is still a rather significant overhead, though I suppose comparable
> to an http round trip. So if lp_api could save us a few round trips to
> resolve the lp: url, it could be a net win. (500ms is a lot, though.)

I think you should file a bug on this with the launchpad developers;
this is a _huge_ overhead: more than all of bzr itself, and really
unsuitable for command line use.

-Rob

Revision history for this message
Martin Pool (mbp) wrote :

https://bugs.edge.launchpad.net/launchpadlib/+bug/522963 and https://launchpad.net/bugs/521927 mean that using launchpadlib to resolve urls is really infeasible at the moment. We should be looking to eliminate the extra lookup altogether.

Revision history for this message
Jonathan Lange (jml) wrote :

FWIW, the API is already exposed.

A thing we could do is provide some server-side code to serve branches from their aliased locations. This would avoid the round-trip altogether. We'd probably need to do something like this to implement lp: aliases for private branches. Bug 261609 has a plan for this.

Revision history for this message
Martin Pool (mbp) wrote :

After the next rollout, the server side of this should be done for ssh access: we can go to bzr+ssh://bazaar.launchpad.net/+branch/bzr and get the development focus branch for bzr. When bzr knows it has an ssh account, it could just rewrite the url itself.

Later on Launchpad can also provide these redirects for http and we can do the same there.

Changed in bzr:
importance: Wishlist → Medium
tags: added: launchpad
Revision history for this message
Martin Pool (mbp) wrote :

> When bzr knows it has an ssh account, it could just rewrite the url itself.

Is this in fact true? If so, perhaps we should just do it. Or, we could just try it, and manually test all the relevant classes, including: series and product default branches, private branches, source package branches, etc.

Revision history for this message
Robert Collins (lifeless) wrote :

I think we no longer need this at all do we? usable urls are entirely predictable AIUI.

Revision history for this message
Jonathan Lange (jml) wrote :

On Mon, Jan 17, 2011 at 3:55 PM, Robert Collins
<email address hidden> wrote:
> I think we no longer need  this at all do we? usable urls are entirely
> predictable AIUI.

Correct.

Changed in launchpad:
status: Triaged → Invalid
Martin Pool (mbp)
tags: added: easy
Changed in launchpad:
status: Invalid → Fix Released
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

The translation from lp:foo to an http url when launchpad-login cannot be done client side yet, I think?

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

*when no launchpad-login is present, sorry

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/18/2011 5:22 PM, Michael Hudson-Doyle wrote:
> *when no launchpad-login is present, sorry
>

Well this works:
  http://bazaar.launchpad.net/+branch/bzr

I'm not 100% sure you can branch from that.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk02JmIACgkQJdeBCYSNAAMlygCfSpF591MoclDQskIQ/0wuDq5b
P+AAoJGGIQVjQoGZMl8/ThVhpKHW3oQK
=gX7E
-----END PGP SIGNATURE-----

Revision history for this message
Jonathan Lange (jml) wrote :

On Tue, Jan 18, 2011 at 5:22 PM, Michael Hudson-Doyle
<email address hidden> wrote:
> *when no launchpad-login is present, sorry

I thought http://bazaar.launchpad.net/+branch/foo worked?

jml

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

On Tue, 18 Jan 2011 17:46:42 -0600, John Arbash Meinel <email address hidden> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 1/18/2011 5:22 PM, Michael Hudson-Doyle wrote:
> > *when no launchpad-login is present, sorry
> >
>
> Well this works:
> http://bazaar.launchpad.net/+branch/bzr

Oh! I'm out of date then.

> I'm not 100% sure you can branch from that.

It seems not:

$ bzr info http://bazaar.launchpad.net/+branch/bzr
bzr: ERROR: Not a branch: "http://bazaar.launchpad.net/%2Bbranch/bzr/".

I did actually try this before posting :-)

Cheers,
mwh

Revision history for this message
Martin Pool (mbp) wrote :

Yes, I think at the moment we can do this for ssh and not http.

Revision history for this message
Martin Pool (mbp) wrote :

Removing this would take a couple of seconds off every connection, and fix a bunch of bugs, so I'll bump this up. For instance it will help with the performance problems in bug 680763.

Changed in bzr:
importance: Medium → High
Revision history for this message
John A Meinel (jameinel) wrote :

This is pretty trivial to do, so it is done in the associated branch.
I did do a fair amount of refactoring to make the code clearer.

My timing from Netherlands was 368 ms (using a TIMEIT loop). As Robert observed, this is a huge amount of time. In bug #680763 Martin observed it was as much as 2s+. Considering the overall push time was 19s, this is a fair amount of overhead.

Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
milestone: none → 2.3.2
status: Confirmed → In Progress
Revision history for this message
John A Meinel (jameinel) wrote :

Some times from IRC running:
python -m timeit -s "from bzrlib import plugin, initialize; initialize(); plugin.load_plugins(); \
from bzrlib.plugins.launchpad.lp_directory import LaunchpadDirectory; lpd = LaunchpadDirectory()" \
"lpd.look_up(None, 'lp:bzr')"

devpad: 140ms
vila: 300ms
jam: 323ms
fullermd:897ms
beuno: 1800ms

So the baseline time to resolve an 'lp:' url is 140ms, and you add lots of latency overhead on top of that.

Revision history for this message
John A Meinel (jameinel) wrote :

Jelmer noticed that we still have the "http://code.launchpad.net/PROJECT" redirects available (.bzr/branch/location pointing to the real URL).
We could fall back to those for the HTTP side if we didn't want to use XMLRPC even when the user isn't authenticated.

Changed in bzr:
milestone: 2.3.2 → 2.4b2
Revision history for this message
John A Meinel (jameinel) wrote :

The fix is available for 2.3, but we're planning on letting it cook in bzr.dev for a little bit first.

Revision history for this message
John A Meinel (jameinel) wrote :

    for instance, domain in LAUNCHPAD_DOMAINS.iteritems():
        LAUNCHPAD_INSTANCE[instance] = 'https://xmlrpc.%s/bazaar/' % domain

Note that this regressed recently, because we switched to HTTPS for all XMLRPC requests, instead of plain http. I'm guessing this is probably to resolve private branches, etc.
If you look at this log:
https://bugs.launchpad.net/launchpad/+bug/680763/+attachment/1933516/+files/lp1.txt

You can see:
           1 0 1.1641 1.1641 <built-in method do_handshake>

So it is costing ~1.2 seconds of real time to do the SSL handshake with the xmlrpc server.
Added incentive to just get rid of the xmlrpc request entirely.

Revision history for this message
Andrew Bennetts (spiv) wrote :
Revision history for this message
John A Meinel (jameinel) wrote :

The work is done in a branch available for 2.3. We decided for now to only land it on trunk.
Note that "2.3-avoid-xmlrpc-ssh-397739" shouldn't be used because of bug #741375.

Changed in bzr:
status: In Progress → Fix Released
Revision history for this message
Martin Pool (mbp) wrote :

This is is a nice speedup, but not a low-risk bugfix that we could put into an SRU. Given Natty is now in alpha freeze (iirc) and might get this uploaded before beta, I think it's arguably safe enough for 2.3 if we're sure it's safe in trunk, and if the lp stacking issues are addressed first. Otherwise, wontfix for 2.3.

Revision history for this message
Martin Pool (mbp) wrote :

When it's removed for http as well we can:

 * remove the import, which I believe is reasonably slow
 * close a bunch of bugs about problems caused by xmlrpc

At the moment xmlrpc is always imported, and we could perhaps save some time and memory by not doing that unless it's actually going to be used. Perhaps not very much if lp_directory only gets loaded when we're about to open a network connection.

Revision history for this message
John A Meinel (jameinel) wrote :

We decided not to backport this to the 2.3 series. The cost/benefit tradeoff wasn't sufficient.
Also, it doesn't get rid of XMLPRC entirely, because while *codebrowse* supports http://bazaar.launchpad.net/+branch/bzr/changes Apache itself does not, so you can't branch from http://bazaar.launchpad.net/+branch/bzr

Revision history for this message
Martin Pool (mbp) wrote :

Bug 813272 for doing this on http too.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.