Sunday, May 24, 2009

unAPI format and Semantic Web

Ross Singer's recent post, "One Data Format Identifier (and Registry) to Rule Them All", sparked some interesting discussions. I didn't read all of them and don't have much to add, but it gave me a chance to reflect on how the unAPI format wiki was created in the first place.

I have always had trouble understanding why the Semantic Web is so obsessed with URIs and ontologies. I am not a Semantic Web expert, but to me the strict URI approach seems to conflict directly with "convention over configuration" approaches such as tagging, wikis, and Twitter. And it doesn't seem likely that everyone will have the time to learn each other's ontologies anyhow. As a result, any sufficiently large RDF file makes my head spin because of all those long URIs, and I wonder how much bandwidth is used to carry them across the Internet.

So why not just use a word as the identifier, and let everyone pick up a dictionary to find out its meaning? I guess the Oxford dictionary is a better agreed-upon ontology. In retrospect, perhaps this is how the unAPI format list differs from SRU/OpenURL: we can choose a name and hope it will work. Does it work? I don't know.

Since multiple copies keep stuff safe, I am also pasting here the unAPI format list retrieved from the Internet Archive.

name | type | example | desc | doc
amazon | application/xml | opa | a convention used by OPA |
asn1 | text/plain | opa | Abstract Syntax Notation One | asn.1
bibtex | text/plain | hubmed | bibtex | bibtex
dc | text/plain | opa | unqualified Dublin Core |
didl | application/xml | TODO | MPEG-21 DIDL | didl
endnote | text/plain | refbase | endnote | endnote
latex | application/x-latex | refbase | latex | latex
marcxml | application/xml | Technosophia | MAchine Readable Cataloging in XML | marcxml
markdown | text/plain | refbase | markdown | markdown
mods | application/xml | Technosophia | Metadata Object Description Schema | mods
html | text/html | refbase | HTML | HTML
oai_citeseer | application/xml | opa | |
oai_dc | application/xml | Technosophia | unqualified OAI Dublin Core | oai_dc
pdf | application/pdf | refbase | Portable Document Format | PDF
pubmed | application/xml | opa | pubmed article | pubmed
rdf/xml | application/rdf+xml | hubmed | Resource Description Framework | RDF
ris | text/plain | hubmed | ris | ris
rss | application/xml | Technosophia | Really Simple Syndication | RSS
rtf | application/rtf | refbase | Rich Text Format | RTF
srw_dc | application/xml | Technosophia | unqualified SRW Dublin Core |
srw_mods | application/xml | refbase | unqualified SRW MODS |
text | text/plain | opa | |
wrap | application/x-javascript | opa | unalog json format |
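
To make the "just pick a word" idea concrete, here is a rough Python sketch that treats a few rows of the list above as a registry keyed by short name. The function names (mime_type, formats_response) are made up for illustration, and the XML it builds is only loosely modeled on the <formats> list an unAPI server returns, so take it as a sketch rather than a reference implementation.

```python
import xml.etree.ElementTree as ET

# A handful of rows copied from the list above: short name -> MIME type.
FORMATS = {
    "bibtex": "text/plain",
    "marcxml": "application/xml",
    "mods": "application/xml",
    "rdf/xml": "application/rdf+xml",
    "ris": "text/plain",
}


def mime_type(name):
    """Return the MIME type registered for a short name, or None if unknown."""
    return FORMATS.get(name)


def formats_response(names):
    """Serialize the given short names as a <formats> XML string.

    Hypothetical helper: raises KeyError for a name that is not in FORMATS.
    """
    root = ET.Element("formats")
    for name in names:
        ET.SubElement(root, "format", {"name": name, "type": FORMATS[name]})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(mime_type("mods"))
    print(formats_response(["bibtex", "rdf/xml"]))
```

The whole "ontology" here is a plain dictionary lookup: a client that knows the word "mods" needs nothing else, which is both the appeal of the approach and the reason two sites can silently disagree about what a name means.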