Thursday, February 28, 2008

What do URLs "mean"?

Does anybody besides me find this bizarre:

gets you:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="" xml:lang="en" lang="en">

gets you:

<?xml version="1.0" encoding="UTF-8"?>
<?oxygen RNGSchema="" type="compact"?>
<TEI xmlns="" rend="home">

gets you:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="" xml:lang="en" lang="en">

Why not:

or some such (in which filename extensions actually bear some relationship to what the user gets, rather than the hidden underlying format or application/file hierarchy on the server) -- and thus throughout the whole website.


Gabriel Bodard said...

I don't know, but I'll pass this on to the Council, someone among whom no doubt has an opinion. :)

Daniel said...

And here's a good answer: quoting Sebastian Rahtz and Syd Bauman:

By [this] argument, all the .php and .asp pages in the world are wrong too. It's actually quite common for the suffix to reflect the source, not the output. (SR)

And Syd:

On the web it has been commonplace for years now to have in the URL [an extension referring to the type of data in] the source file that you ask the server for, rather than the type of data that the server returns in the URL (some of that information is stored elsewhere in the HTTP header stuff).

Syd adds that combining names and types of data together is bad practice anyway, but what can you do?
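To make the point about the HTTP header concrete, here is a minimal Python sketch that reads the media type from a Content-Type header value and checks whether it agrees with what a reader might guess from the URL suffix. The table `EXPECTED_BY_SUFFIX`, the function names, and the example.org URLs are all assumptions made up for illustration; nothing here describes how the TEI server actually behaves.

```python
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical table: media types a reader might expect from a given suffix.
EXPECTED_BY_SUFFIX = {
    ".html": {"text/html", "application/xhtml+xml"},
    ".xml": {"application/xml", "text/xml", "application/tei+xml"},
}


def media_type(content_type_header: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def suffix_agrees(url: str, content_type_header: str) -> bool:
    """True if the URL suffix and the served media type tell the same story."""
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    expected = EXPECTED_BY_SUFFIX.get(suffix)
    if expected is None:
        # No suffix, or one this table makes no claim about.
        return True
    return media_type(content_type_header) in expected
```

Under these assumptions, a URL ending in `.xml` that comes back with `Content-Type: text/html; charset=UTF-8` is flagged as a mismatch, while an extensionless URL never is.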

Tom Elliott said...

I would not disagree that unhealthy URLs like this are commonplace; the TEI website is neither alone nor the worst.

Syd's last point is actually the point. Apropos of which, there's an established meme on sane/cool ur[i|l]s.

What can one do? Do something better!

Drop extraneous or ambiguous extensions entirely then:

I should say, I applaud the TEI website design for making the three avatars of each resource separately addressable (instead of switched by JavaScript madness or some such), and discoverable by clicking through links.
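One common way to serve extensionless URLs is to let the client's Accept header choose the representation. The Python sketch below is a simplified illustration of that idea, not a full implementation of HTTP content negotiation: it ranks by q-value only, ignores specificity rules, and falls back to a default instead of answering 406. The `AVAILABLE` list and the function names are hypothetical, and only two representations are listed for brevity.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        mtype = fields[0].lower()
        if not mtype:
            continue
        q = 1.0  # q defaults to 1 when not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((mtype, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


# Hypothetical representations available for one extensionless resource.
AVAILABLE = ["text/html", "application/tei+xml"]


def negotiate(accept: str, available=AVAILABLE, default="text/html") -> str:
    """Pick the representation to serve for a given Accept header."""
    for mtype, q in parse_accept(accept):
        if q <= 0:
            continue
        if mtype == "*/*":
            return default
        for candidate in available:
            if mtype == candidate:
                return candidate
            # "type/*" matches any candidate with the same top-level type.
            if mtype.endswith("/*") and candidate.startswith(mtype[:-1]):
                return candidate
    return default
```

With this sketch, a browser-style Accept header gets the HTML view and a client asking for `application/tei+xml` gets the TEI source, both from the same URL. Separate per-representation URLs can still exist alongside it for linking to one particular avatar.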