
Tuesday, October 2, 2007

URLs for Pleiades

Lorcan Dempsey is thinking about URIs, with some good links. Among other things, he points at this tidbit from Richardson and Ruby:

a resource and its URI ought to have an intuitive correspondence

Sean's been a positive influence in this area for Pleiades. Here's what we're doing:

Places

We surface information about our place records under the intuitive URL fragment http://pleiades.stoa.org/places. We elaborate URLs below that level using various intuitive labels for thematic, non-hierarchical groupings of records, as well as the unique identifiers for specific place records. So, for example you get:

Geographic Names

Names are trickier. We'd like to provide users with an intuitive lookup of names like http://pleiades.stoa.org/names/apollonia, but that's problematic because names (whether geographic or personal) are non-unique proxies for identifiers (see recent excellent postings from Karen Coyle and Stuart Weibel). For example, when conversion of our legacy dataset is complete, we'll have 17 places with the name Apollonia.

Our names interface at http://pleiades.stoa.org/names/ points users first to our search form. Individual name records are surfaced there too, using ASCII (sic) transliterations of the name strings (our model aims at one record per unique attested variant in original language and script). Duplicates are presently handled by postfixing a hyphen plus a one-up numeral (e.g., http://pleiades.stoa.org/names/apollonia-1). I should point out here that right now we lack backlinks from the name records to the associated place records; that's an urgent to-do. At the main names page, we do have a link to a complete list, which will get unmanageably huge; we'll probably need to add alphabetic and/or max-per-page chunking of that list soon. But I digress ...
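To make the duplicate handling concrete, here is a minimal sketch in Python. It is not the actual Pleiades code: the function names are hypothetical, it assumes the input is already a romanized transliteration (so only diacritics need stripping), and it assumes the first record claims the bare string while later duplicates get -1, -2, and so on.

```python
import re
import unicodedata


def ascii_slug(name):
    """Reduce a romanized name to a lowercase ASCII URL segment.

    Assumption: the input is already transliterated into Latin script,
    so the only work left is dropping diacritics and punctuation.
    """
    # NFKD splits "ō" into "o" plus a combining macron; encoding to
    # ASCII with errors ignored then discards the combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumerics into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def unique_id(slug, taken):
    """Return slug if unused, otherwise slug-1, slug-2, ...

    `taken` is a set of ids already assigned; the chosen id is added to it.
    """
    candidate = slug
    n = 0
    while candidate in taken:
        n += 1
        candidate = "%s-%d" % (slug, n)
    taken.add(candidate)
    return candidate


taken = set()
ids = [unique_id(ascii_slug("Apollōnia"), taken) for _ in range(3)]
print(ids)
```

One consequence of this design is that the suffix carries no meaning: apollonia-1 says nothing about which Apollonia it is, which is the gap the rest of this section worries about.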

URL-wise, I'm thinking we could do more to help our users get at the name records. Perhaps we should take a page from Wikipedia and implement name disambiguation pages. Under such a scheme, a URL like http://pleiades.stoa.org/names/apollonia would take a user either to the one-and-only appropriate record or to a disambiguation page containing links to all the relevant records.
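The lookup rule could be sketched roughly as follows. This is only an illustration under stated assumptions: NameRecord, its fields, and the returned labels ("record", "disambiguation", "not_found") are all hypothetical, and a real implementation would live in whatever request-handling layer the site uses.

```python
from dataclasses import dataclass


@dataclass
class NameRecord:
    record_id: str  # hypothetical unique id, e.g. "apollonia-1"
    slug: str       # shared ASCII transliteration, e.g. "apollonia"


def resolve_name(slug, records):
    """Decide what a request for /names/<slug> should show.

    Returns a (kind, paths) pair: one path for a unique match,
    several for a disambiguation page, none when nothing matches.
    """
    matches = [r for r in records if r.slug == slug]
    paths = ["/names/" + r.record_id for r in matches]
    if not matches:
        return ("not_found", [])
    if len(matches) == 1:
        return ("record", paths)
    return ("disambiguation", paths)


records = [
    NameRecord("apollonia", "apollonia"),
    NameRecord("apollonia-1", "apollonia"),
    NameRecord("tarsus", "tarsus"),
]
print(resolve_name("apollonia", records))
print(resolve_name("tarsus", records))
```

Note that the sketch leaves one design question open: if the first duplicate keeps the bare string as its own id, that record and the disambiguation page compete for the same URL, so one of them would need a different address.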

These disambiguation pages would have to surface enough additional information from the records themselves (including their associated places and locations) to facilitate selection of the desired name record. Here we'd want to echo long-standing practice in print works for classical geography. When name ambiguity is a problem, add a regional qualifier (something like "Antioch in Pisidia" or "Pisidian Antioch"). Would that look something like http://pleiades.stoa.org/names/antioch-in-pisidia? When we implement place-to-place relationship tagging, maybe we can leverage that information for this purpose. It probably won't be foolproof for certain edge cases though: some unlocated places with common names may require an alternative or more verbose mechanism for disambiguation.

Locations

The right kind of intuitive access to locations (i.e., feature geometry and coordinates) is a geographic one. I'll table that for a separate future post.

Bibliographic Records

URLs for our bibliographic records are constructed on the basis of human-friendly short titles. For most modern works, these are either abbreviations or author-year combinations. For ancient literary works we follow conventional humanist practices for author and work short titles. For example:

Alternative Formats

Where we offer alternative formats (e.g., Atom+GeoRSS and KML, or MODS for our bibliography), we align them under the appropriate URL fragment. For example:

Now What?

I'd be grateful for critiques or suggestions for improvement. Now's the time to get this stuff right.

2 comments:

Sebastian said...

Quick queries...

1. For places, is it significant that there are sometimes trailing slashes, sometimes not? If yes, why would an individual site have a terminal slash but a group, which is somewhat akin to a folder, not?

I tried adding and removing the trailing slash; both worked. Do you want to decide, or is it better not to care?

2. Why ".html" at the end of bibliographic records but not on other unique identifiers?

3. Are there plans for the right side of unique identifiers? The answer seems to be yes? As in ".html" or ".xml". Or perhaps, http://pleiades.stoa.org/places/archaic[latitude > 43.00] ? Or is this document('http://pleiades.stoa.org/places/archaic.xml')/place[latitude > 43.00] ?

Tom said...

We've added redirects from naked bibliography URLs like bibliography/aasor to the canonical forms like bibliography/aasor.html.