Wednesday, February 1, 2012

Playing with PELAGIOS: Nomisma

So, I want to see how hard it is to query the RDF that PELAGIOS partners are putting together. The first experiment is documented below.

Step 1: Set up a Triplestore (something to load the RDF into and support queries)

Context: I'm a triplestore n00b. 

I found Jeni Tennison's Getting Started with RDF and SPARQL Using 4store and RDF.rb and, though I had no interest in messing around with Ruby as part of this exercise, the recommendation of 4store as a triplestore sounded good, so I went hunting for a Mac binary and downloaded it.

Step 2: Grab RDF describing content in Nomisma.org

Context: I'm a point-and-click expert.

I downloaded the PELAGIOS-conformant RDF data published by Nomisma.org at http://nomisma.org/nomisma.org.pelagios.rdf.

Background: "Nomisma.org is a collaborative effort to provide stable digital representations of numismatic concepts and entities, for example the generic idea of a coin hoard or an actual hoard as documented in the print publication An Inventory of Greek Coin Hoards (IGCH)."

Step 3: Fire up 4store and load in the nomisma.org data

Context: I'm a 4store n00b, but I can cut and paste, read and reason, and experiment.

I double-clicked the 4store icon in my Applications folder, which opened a terminal window.

To create and start up an empty database for my triples, I followed the 4store instructions and Tennison's post (mutatis mutandis) and so typed the following in the terminal window ("pelagios" is the name I gave to my database; you could call yours "ray" or "jay" if you like):
$ 4s-backend-setup pelagios
$ 4s-backend pelagios
Then I started up 4store's SPARQL HTTP server and aimed it at the still-empty "pelagios" database so I could load my data and try my hand at some queries:
$ 4s-httpd pelagios
Loading the nomisma data was then as simple as moving to the directory where I'd saved the RDF file and typing:
$ curl -T nomisma.org.pelagios.rdf 'http://localhost:8080/data/http://nomisma.org/nomisma.org.pelagios.rdf/'
Note how the base URI for the nomisma data (http://nomisma.org/nomisma.org.pelagios.rdf/) is appended after /data/ in the URL passed to curl. This is how you specify the "model URI" for the graph of triples that gets created from the RDF.
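
For anyone who would rather script the upload than type it, here is a minimal Python sketch of the request that the curl command sends. The function names are made up for illustration, it assumes 4store's HTTP server is listening on localhost:8080 as in this post, and the explicit RDF/XML content type is my assumption (curl -T does not send one). Treat it as a sketch, not as the official way to talk to 4store.

```python
import urllib.request

# Assumption: 4s-httpd is listening on localhost:8080, as in the curl command.
ENDPOINT = "http://localhost:8080"


def graph_upload_url(graph_uri, endpoint=ENDPOINT):
    """Return the URL to PUT to: the graph ("model") URI appended after /data/."""
    return endpoint.rstrip("/") + "/data/" + graph_uri


def build_upload_request(rdf_path, graph_uri, endpoint=ENDPOINT):
    """Build (but do not send) a PUT request carrying the RDF file's bytes."""
    with open(rdf_path, "rb") as handle:
        payload = handle.read()
    return urllib.request.Request(
        graph_upload_url(graph_uri, endpoint),
        data=payload,
        method="PUT",
        # Assumption: the file is RDF/XML, so we label it as such.
        headers={"Content-Type": "application/rdf+xml"},
    )
```

To actually send it, pass the result of build_upload_request("nomisma.org.pelagios.rdf", "http://nomisma.org/nomisma.org.pelagios.rdf/") to urllib.request.urlopen while the 4store server is running.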

Step 4: Try to construct a query and dig out some data

Context: I'm a SPARQL n00b, but I'd done some SQL back in the day and XML and namespaces are pretty much burned into my soul at this point. 

Following Tennison's example, I pointed my browser at http://localhost:8080/test/ and got 4store's SPARQL test query interface. I googled around, looking grumpily at different SPARQL "how-tos" and "getting starteds", trying things and pondering repeated failure, until this worked:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX oac: <http://www.openannotation.org/ns/>

# Every annotation whose body is Pleiades place 462086
SELECT ?x
WHERE {
  ?x oac:hasBody <http://pleiades.stoa.org/places/462086> .
}

That query says "find the ID of every OAC Annotation in the triplestore that's linked to Pleiades Place 462086" (i.e., Akragas/Agrigentum, modern Agrigento in Sicily). The result is a list like this:
  • http://nomisma.org/nomisma.org.pelagios.rdf#igch1910-agrigentum-5
  • http://nomisma.org/nomisma.org.pelagios.rdf#igch2089-agrigentum-24
  • http://nomisma.org/nomisma.org.pelagios.rdf#igch2101-agrigentum-32
  • ...
51 IDs in all.
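
If you'd rather run the query from a script than paste it into the test page, something like the following Python sketch should do it. I'm assuming here that 4store's HTTP server also exposes a SPARQL endpoint at /sparql/ and will return SPARQL JSON results when asked via the Accept header; the function names are hypothetical and only there for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: 4s-httpd serves a SPARQL endpoint at /sparql/ alongside /test/.
SPARQL_ENDPOINT = "http://localhost:8080/sparql/"

QUERY = """
PREFIX oac: <http://www.openannotation.org/ns/>

SELECT ?x
WHERE {
  ?x oac:hasBody <http://pleiades.stoa.org/places/462086> .
}
"""


def build_query_request(query, endpoint=SPARQL_ENDPOINT):
    """Build (but do not send) a GET request for a SPARQL SELECT query."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def values_for(variable, results_json):
    """Pull one variable's values out of a SPARQL JSON results document."""
    document = json.loads(results_json)
    return [
        binding[variable]["value"]
        for binding in document["results"]["bindings"]
        if variable in binding
    ]


def run_query(query, variable, endpoint=SPARQL_ENDPOINT):
    """Send the query and return the values bound to one variable."""
    with urllib.request.urlopen(build_query_request(query, endpoint)) as response:
        return values_for(variable, response.read().decode("utf-8"))
```

With the server running, run_query(QUERY, "x") should give back the same annotation IDs as the test interface.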

But what I really want is a list of the IDs of the nomisma entities themselves so I can go look up the details and learn things. Back to the SPARQL mines until I produced this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX oac: <http://www.openannotation.org/ns/>

# For each annotation whose body is Pleiades place 462086, return its target
SELECT ?nomismaid
WHERE {
  ?x oac:hasBody <http://pleiades.stoa.org/places/462086> .
  ?x oac:hasTarget ?nomismaid .
}

Now I have a list of 51 nomisma IDs: one for the mint and 50 for coin hoards that illustrate the economic network in which the ancient city participated (e.g., http://nomisma.org/id/igch2081).
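
What finally made the two-line query click for me is that both patterns share ?x, so the triplestore joins them on it. Here is a toy Python sketch of that idea over a handful of in-memory triples. The ex: identifiers are entirely made up for illustration (they are not real nomisma or Pleiades data), and this is certainly not how 4store does it internally.

```python
OAC = "http://www.openannotation.org/ns/"
PLACE = "http://pleiades.stoa.org/places/462086"

# Hypothetical triples for illustration only; the ex: identifiers are made up.
TRIPLES = [
    ("ex:annotation1", OAC + "hasBody", PLACE),
    ("ex:annotation1", OAC + "hasTarget", "ex:hoardA"),
    ("ex:annotation2", OAC + "hasBody", PLACE),
    ("ex:annotation2", OAC + "hasTarget", "ex:hoardB"),
    ("ex:annotation3", OAC + "hasBody", "ex:someOtherPlace"),
    ("ex:annotation3", OAC + "hasTarget", "ex:hoardC"),
]


def targets_for_place(triples, place):
    """Mimic the two-pattern query: match hasBody first, then join on ?x."""
    # Pattern 1: ?x oac:hasBody <place>
    annotations = {s for (s, p, o) in triples if p == OAC + "hasBody" and o == place}
    # Pattern 2: ?x oac:hasTarget ?nomismaid, keeping only ?x values from pattern 1
    return [o for (s, p, o) in triples if p == OAC + "hasTarget" and s in annotations]
```

Calling targets_for_place(TRIPLES, PLACE) keeps the targets of the first two annotations and drops the third, because its body is a different place.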

Cost: about 2 hours of time, 1 cup of coffee, and three favors from Sebastian Heath on IRC.

Up next: Arachne, the object database of the Deutsches Archäologisches Institut.



1 comment:

Sebastian Heath said...

Hey Tom, FWIW, there was a slight issue with the nomisma-pelagios rdf whereby Aegina was left out of the fun. The nomisma id for the mint as well as 78 hoards with coins from the island are now included in http://nomisma.org/nomisma.org.pelagios.rdf .

This is all way cool!

Thanks,
-Sebastian