loggerhead should be cache friendly (send etags)

Bug #248901 reported by Michael Hudson-Doyle
Affects            Status   Importance  Assigned to  Milestone
loggerhead         Triaged  Low         Unassigned
loggerhead-breezy  Triaged  Medium      Unassigned

Bug Description

Loggerhead can easily enough generate ETags for its pages, which if nothing else would make slapping Squid in front of it an effective way of improving performance.

Michael Hudson-Doyle (mwhudson) wrote :

James Henstridge had this advice on how to do this:

The guts of the caching code in ViewVC can be found in the
check_freshness() function found in viewvc.py:

    http://viewvc.tigris.org/source/browse/viewvc/trunk/lib/viewvc.py?view=markup

This function takes the etag value and last modified date for the
page, and does the following:
 * If an etag was passed in and an If-None-Match header is sent with
a matching value, consider the request as fresh.
 * If a last modified date was passed in and an If-Modified-Since
header is sent with a date equal or newer, consider the request fresh.
 * If the request is fresh, generate a "304 Not Modified" response.
 * Otherwise, add ETag and Last-Modified response headers.
 * Return whether the request is fresh.

If check_freshness() returns True, then the caller is expected to exit
with no further processing (so this should be done before as much
heavy work as possible).
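
The steps listed above could be sketched roughly as follows. This is not the ViewVC code; it is a minimal Python illustration under stated assumptions: request headers arrive as a plain dict, `Response` is a hypothetical stand-in for whatever response object the framework provides, and when both validators are sent the etag comparison takes precedence over the date comparison, as HTTP specifies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


@dataclass
class Response:
    """Hypothetical stand-in for a framework response object."""
    status: str = "200 OK"
    headers: dict = field(default_factory=dict)


def _opaque(tag):
    """Drop the weak prefix so two tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def check_freshness(request_headers, response, etag=None, last_modified=None):
    """Return True if the client's copy is fresh and a 304 was prepared.

    etag is a complete entity tag such as 'W/"abc"'; last_modified is a
    timezone-aware UTC datetime. Both are optional.
    """
    fresh = False
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if etag is not None and if_none_match is not None:
        # Simplification: split on commas, which assumes no commas
        # inside the quoted tags themselves.
        sent = [t.strip() for t in if_none_match.split(",")]
        fresh = "*" in sent or _opaque(etag) in [_opaque(t) for t in sent]
    elif last_modified is not None and if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: treat the request as stale
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates only have one-second resolution.
            fresh = since >= last_modified.replace(microsecond=0)

    if fresh:
        response.status = "304 Not Modified"
    else:
        if etag is not None:
            response.headers["ETag"] = etag
        if last_modified is not None:
            response.headers["Last-Modified"] = format_datetime(
                last_modified, usegmt=True)
    return fresh
```

A caller would run this before any expensive rendering and return immediately when it yields True, sending the prepared 304 with an empty body.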

As for picking etags, the main constraint is that they should change
if the content changes. For many ViewVC pages I used "weak etags"
since there were a few cases where the constraint could be broken
(software/template upgrades). If we ignore changes from templates,
then we can see that the rendering of most pages depends on a single
revision:
 * changes pages: the first revision in the list.
 * annotate pages: the revision of the file being annotated.
 * files pages: the revision of the tree being displayed
 * revision pages: the revision being displayed

So for these cases, it should be possible to use the relevant revision
ID as an etag. And if you've got a revision ID, you should be able to
look up its date fairly cheaply.
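
As an illustration of turning a revision ID into a weak etag, here is a small Python sketch. The function name is hypothetical, and hashing the revision ID is an assumption of this sketch rather than something the advice above asks for; it only guarantees that the tag contains no characters (such as a double quote) that are not allowed inside an entity tag.

```python
import hashlib


def weak_etag_for_revision(revision_id):
    """Build a weak entity tag from a revision ID (hypothetical helper).

    The tag is weak ("W/" prefix) because a template or software
    upgrade can change the rendered bytes without the revision changing.
    """
    digest = hashlib.sha1(revision_id.encode("utf-8")).hexdigest()
    return 'W/"%s"' % digest


# Made-up revision ID, for illustration only.
tag = weak_etag_for_revision("user@example.com-20080715120000-0123456789abcdef")
print(tag)
```

Because an etag is scoped to the URL it was served from, the same revision ID can be reused across the changes, annotate, files and revision pages without the tags clashing.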

Now if you have Squid sitting in front, it will use conditional GETs
to validate its cache, so the only points where Loggerhead would need
to fully render a page are:
 * the first time a particular page is rendered, or if the page falls
out of the cache.
 * the underlying branch has changed, so the given URL now points at a
different revision.
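
The two cases above can be modelled with a toy simulation. Everything here is hypothetical: `App` stands in for Loggerhead, `RevalidatingCache` for a proxy that revalidates on every request, and the etag is just the branch tip. Whether a real Squid behaves this way depends on its configuration and on the freshness headers it is given.

```python
class FakeBranch:
    """Hypothetical branch whose tip revision can move."""
    def __init__(self, tip):
        self.tip = tip


class App:
    """Stand-in for the application; counts full page renders."""
    def __init__(self, branch):
        self.branch = branch
        self.renders = 0

    def handle(self, if_none_match=None):
        etag = 'W/"%s"' % self.branch.tip
        if if_none_match == etag:
            return 304, etag, None  # cheap path: nothing rendered
        self.renders += 1           # expensive path: full render
        return 200, etag, "<html>changes up to %s</html>" % self.branch.tip


class RevalidatingCache:
    """Toy cache that sends a conditional GET for every request."""
    def __init__(self, app):
        self.app = app
        self.entry = None  # (etag, body) or None

    def get(self):
        validator = self.entry[0] if self.entry else None
        status, etag, body = self.app.handle(validator)
        if status == 200:
            self.entry = (etag, body)
        return self.entry[1]

    def evict(self):
        self.entry = None


branch = FakeBranch("rev-1")
app = App(branch)
cache = RevalidatingCache(app)
for _ in range(3):
    cache.get()          # only the first request renders
branch.tip = "rev-2"
cache.get()              # branch changed: rendered again
cache.evict()
cache.get()              # fell out of the cache: rendered again
print(app.renders)
```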

Martin Albisetti (beuno)
Changed in loggerhead:
importance: Undecided → Medium
status: New → Confirmed
description: updated
Michael Hudson-Doyle (mwhudson) wrote :

So the attached branch is a start on this -- it generates etags for all the templated pages -- but I couldn't manage to get squid to generate any kind of conditional get :/

Martin Albisetti (beuno) wrote :

Did this ever get fixed?

Changed in loggerhead:
status: Confirmed → Triaged
Michael Hudson-Doyle (mwhudson) wrote :

No, because I couldn't ever figure out how to get squid to behave in a way that was useful for us, and because I realized the way we display dates like "6 hours ago" isn't very etag friendly.

We could and should probably still land something like this; maybe we can do the relative dates using JavaScript?
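
To make the problem with relative dates concrete, here is a small Python sketch of the server side only. Both function names and the markup are hypothetical: the idea is that the server emits an absolute timestamp, which depends only on the revision, and a client-side script (not shown) would rewrite it as "6 hours ago".

```python
from datetime import datetime, timedelta, timezone


def render_relative(committed, now):
    """Server-side relative date: output changes as time passes."""
    hours = int((now - committed).total_seconds() // 3600)
    return "<span>%d hours ago</span>" % hours


def render_absolute(committed):
    """Absolute timestamp: output depends only on the revision."""
    iso = committed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return '<time class="relative-date" datetime="%s">%s</time>' % (iso, iso)


committed = datetime(2008, 7, 15, 6, 0, 0, tzinfo=timezone.utc)
noon = datetime(2008, 7, 15, 12, 0, 0, tzinfo=timezone.utc)

# Same revision, different request times: the relative markup differs,
# so an etag based on the revision alone would misdescribe the body.
print(render_relative(committed, noon))
print(render_relative(committed, noon + timedelta(hours=1)))

# The absolute markup is identical whenever it is rendered.
print(render_absolute(committed))
```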

Max Kanat-Alexander (mkanat) wrote :

I've started to work on this. I have an idea of how to make some generic code for this, although it will require a bit of work.

summary: - loggerhead should be cache friendly
+ loggerhead should be cache friendly (send etags)
Changed in loggerhead:
assignee: nobody → Max Kanat-Alexander (mkanat)
Robert Collins (lifeless) wrote :

I'm fairly sure Max isn't hacking on this at the moment.

Changed in loggerhead:
importance: Medium → Low
assignee: Max Kanat-Alexander (mkanat) → nobody
Jelmer Vernooij (jelmer)
Changed in loggerhead-breezy:
status: New → Triaged
importance: Undecided → Medium