Welcome! This is an ARCHIVED page from my old blog

2005-04-15 16:59 UTC My Semantic Web reading for the weekend (and the next couple of months)

I just stumbled on Dave Beckett's Resource Description Framework (RDF) Resource Guide. I wish I had known about this when I was writing my semantic web essay last month... An absolutely amazing list of resources.

April 14, 2005

The RDF data model and databases

I've been thinking about the RDF data model a lot lately, including reading up on SPARQL. Initially I didn't like it. However, after a while it struck me that the RDF model + SPARQL actually matches most of what I do with databases a lot better than the relational model + SQL.

The problem with the relational model is that I normally work with "resources" that consist of data items that are tightly related and generally accessed at the same time, yet if I want a properly normalised relational database, I end up with insanely complex queries to gather all the information back together again.

With SPARQL queries on RDF data this suddenly becomes simplicity itself because I'm not required to try to figure out a way to map the data I want to query about back into a single row per entity - instead I'm figuring out a way to find the triples I need, and optionally provide a pattern for extracting just the data I want, or alternatively return all the found RDF triples.
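To make the idea of "finding the triples I need" concrete, here is a minimal sketch in Python. It is not SPARQL and not any real RDF library: the sample data, the prefixed names and the helper functions `match`, `describe` and `titles_by` are all made up for illustration. It only shows how wildcard patterns over triples can pull a whole resource back together, or extract a single value through a join, without mapping anything into a single row.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical sample data: each "resource" is spread over several triples.
TRIPLES: List[Triple] = [
    ("post:1", "dc:title", "Turtle parser update"),
    ("post:1", "dc:creator", "person:me"),
    ("post:1", "ex:tag", "rdf"),
    ("post:2", "dc:title", "Tag ontology design"),
    ("post:2", "dc:creator", "person:me"),
    ("person:me", "foaf:name", "Example Author"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) \
                and (o is None or t[2] == o):
            yield t


def describe(triples: Iterable[Triple], subject: str) -> List[Triple]:
    """Return all triples about one subject, i.e. the whole resource."""
    return list(match(triples, s=subject))


def titles_by(triples: List[Triple], creator: str) -> List[str]:
    """Join two patterns: ?post dc:creator <creator> . ?post dc:title ?title"""
    return [
        title
        for post, _, _ in match(triples, p="dc:creator", o=creator)
        for _, _, title in match(triples, s=post, p="dc:title")
    ]
```

`describe(TRIPLES, "post:1")` corresponds to returning all the found triples, while `titles_by(TRIPLES, "person:me")` corresponds to providing a pattern that extracts just the data wanted. A linear scan like this says nothing about the performance question, which depends entirely on how a real store indexes its triples.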

The question is whether performance will be good enough - I haven't yet had a chance to experiment with a large scale RDF model.

The ease of querying for a graph of data, as opposed to being constrained to a very simplistic row/column model, is a compelling incentive to spend more time on it.

April 11, 2005

Semantic Web for Extending and Linking Formalisms

I came across this paper by chance while searching for material on Z notation and RDF. It's an interesting read on the use of RDF for expressing languages used for formal methods.

While reading it, something occurred to me (I'm sure it's not an original idea, but I haven't seen any implementations): It would be great to generate an RDF representation of source code. I'm tempted to spend some time considering whether there's an easy way of bolting RDF generation support on to my parser assembler.

It would enable a rapidly growing number of tools that understand RDF to manipulate the code, and would be an interesting way of achieving some of the same as GCC-XML.

With RDF mappings for UML and Z or similar formal languages, I'm sure someone could come up with interesting data mining tools based on combining various data (specifications, models, source) and cross referencing extracted data...
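As a rough illustration of what an RDF representation of source code might look like, here is a small Python sketch. It uses Python's own `ast` module as a stand-in for the parser assembler, and the `code:` vocabulary (`code:Function`, `code:defines`, `code:parameter`, `code:calls`) is entirely hypothetical, invented just for this example.

```python
import ast
from typing import List, Tuple

Triple = Tuple[str, str, str]


def code_to_triples(source: str, module_uri: str = "code:module") -> List[Triple]:
    """Walk a Python syntax tree and emit (subject, predicate, object) triples.

    The 'code:' terms are a made-up vocabulary, not an existing ontology.
    """
    triples: List[Triple] = [(module_uri, "rdf:type", "code:Module")]
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.FunctionDef):
            continue
        func_uri = f"{module_uri}#{node.name}"
        triples.append((func_uri, "rdf:type", "code:Function"))
        triples.append((module_uri, "code:defines", func_uri))
        # One triple per positional parameter.
        for arg in node.args.args:
            triples.append((func_uri, "code:parameter", arg.arg))
        # One triple per call to a plainly named function inside the body.
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                callee = f"{module_uri}#{inner.func.id}"
                triples.append((func_uri, "code:calls", callee))
    return triples
```

Once the code is in this form, a question such as "which functions call `helper`?" becomes a simple pattern match on `code:calls` triples, which is the kind of cross referencing the mapping is meant to enable.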

April 09, 2005

lowercase semantic web

Related to my previous post, I found Lucas Gonze's entry on the lowercase semantic web via Marc's voice. I'll definitely be following what Lucas writes.

Lucas expresses some of the issues I've had with micro formats in a much more succinct (and perhaps less brutal) way:

I don't mean to overcommit to this argument. The problem I'm having is emotional, I think. lowercase semantic web feels like yet another religion, which I don't want. I see lots of value in being strict about HTML semantics, however I don't yet see why I'd want HTML semantics everywhere.

Against Micro Formats for the Semantic Web

I just finished reading Danny Ayers's ramblings on MicroFormats, and while he comes out a bit conflicted, it in some ways helped me get a clear view of my own position on it:

MicroFormats using XHTML is to me what tables for layout was to older HTML.

It is a semantic overloading that tries to apply meaning to tags far beyond their original intent, and while it is tempting because of the immediate advantages, I believe it will come back to bite people badly.

One of the claims that often crops up is that the main advantage of using XHTML is that it is immediately stylable via CSS, yet the current main browsers have no problems styling plain XML. A quick look at my feed (via Feedburner) demonstrates that - Feedburner have done a wonderful job of it.

If you need any more, writing cross browser javascript to use XSLT to transform the XML is trivial - my first attempt, which was also my first attempt at using "Ajax"/XMLHttpRequest etc. and my first real javascript app, took me less than half a day to throw together. For that matter, if the page returned is static, specifying an XSL stylesheet works great across at least all Mozilla based browsers and IE if you're a little bit careful about how you write your XSLT.

I just don't buy the convenience argument. I find a well defined XML vocabulary much more convenient, because it is often a lot clearer what the data represents.

Personally I'm becoming a fan of GRDDL, because it allows us to pick simple XML syntaxes (or for that matter those obnoxious Micro Formats) and have an automatic way of deriving RDF, so that we can use the most convenient and most expressive syntax we want in the main document.

I'm increasingly moving towards using XML + XSLT for publishing documents online as well. My first experiment was an online change request list for my team at work - instead of maintaining an HTML version, I moved to XML + client side XSLT conversion exclusively. Works great, and resulted in a significantly smaller page (not that size matters for that particular application).

XHTML for me is mostly something to be generated from other sources, not something I'd want to use as a source of document data itself anymore.

As such, Micro Formats seem to me a distraction at best, and a repeat of huge mistakes of the past at worst - instead, please give me a well defined XML vocabulary; and if you must, please just write some XSLT to generate the XHTML.

Turtle parser update

My Turtle parser is getting along quite nicely. The parser bytecode itself still weighs in at less than a KB, and I've built a very simple RDF model and code to generate triples.

As it stands it now passes most of the pre-requisite tests specified in the Turtle spec. Once it passes the full set, and I'm done with my damn exam, I think it's time to put up a page with the source code and some documentation.

The RDF model is in no way suitable for production use, but I'll abstract out the interface used by the parser to add triples to it, so that it'll be easy to replace it with an adapter to a proper RDF model implementation.

The parser is still fairly messy, so after that I want to go back and improve the assembler to allow local labels and constants, and see whether I should consider some changes to the opcodes. Then I have three candidates for what to continue with: a BNF parser to generate assembly for my VM, extending the Turtle parser to full N3 and writing code to generate assembly from the N3 representation of the N3 grammar, or starting on an XML parser.

I suspect I'll start with the BNF tool, as that would massively simplify the XML parser in particular. I've done BNF parser generation tools before, but never quite liked the approach I chose to generating events - I feel the trigger mechanism I've chosen now is fairly nice. The only thing I might want to do is make it easier to pass additional data from the parser to the trigger callback.


April 07, 2005

Parser assembler, RDF and Turtle

I've kept working on my parser assembler, but it's moving slowly thanks to actually having a day job to do... However, tonight I mostly finished a parser for Turtle - a subset of N3 that allows convenient specification of RDF triples.

It ended up at about 1KB of bytecode, which is quite reasonable. I still need to add some error handling, and then I can start writing some code to actually do some useful stuff with it.

The current code uses an instruction to trigger callbacks from the VM on specific events, and it's proven to work very well. I'm still considering whether or not to add higher level constructs, or possibly replacing a couple of the current instructions, but I want to get more experience with it first.

The current Turtle parser just triggers on each subject, verb and object, and on prefix directives.
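The trigger idea can be sketched in a few lines of Python. This is not the bytecode VM described above, just a hand-written stand-in for a very small Turtle-like subset (prefix directives, plus statements using `;` and `,`), with no error handling at all. The names `parse` and `TripleBuilder` and the event names are assumptions made for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

# URIs in <...>, quoted strings, punctuation, or any other bare token.
TOKEN = re.compile(r'<[^>]*>|"[^"]*"|[;,.]|[^\s;,.<"]+')


def parse(text: str, on_event: Callable[[str, object], None]) -> None:
    """Fire a callback for each prefix, subject, verb and object seen.

    Handles only a tiny subset of Turtle and assumes well-formed input.
    """
    tokens = TOKEN.findall(text)
    i = 0
    while i < len(tokens):
        if tokens[i] == "@prefix":
            # @prefix name: <uri> .
            on_event("prefix", (tokens[i + 1], tokens[i + 2]))
            i += 4
            continue
        on_event("subject", tokens[i])
        i += 1
        while True:
            on_event("verb", tokens[i])
            i += 1
            while True:
                on_event("object", tokens[i])
                i += 1
                if tokens[i] != ",":   # "," repeats subject and verb
                    break
                i += 1
            if tokens[i] != ";":       # ";" repeats the subject only
                break
            i += 1
        i += 1                          # skip the closing "."


class TripleBuilder:
    """A callback that turns the event stream into a list of triples."""

    def __init__(self) -> None:
        self.prefixes: Dict[str, str] = {}
        self.triples: List[Tuple[str, str, str]] = []
        self._subject = ""
        self._verb = ""

    def __call__(self, kind: str, value: object) -> None:
        if kind == "prefix":
            name, uri = value  # type: ignore[misc]
            self.prefixes[name] = uri
        elif kind == "subject":
            self._subject = str(value)
        elif kind == "verb":
            self._verb = str(value)
        elif kind == "object":
            self.triples.append((self._subject, self._verb, str(value)))
```

The parser itself knows nothing about triples or storage. Swapping `TripleBuilder` for a different callback is all it takes to feed another RDF model, which is the appeal of keeping the events this coarse.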

Redland (RDF) backend for Movable Type

Kasei is working on a Redland backend for Movable Type - article and screenshots here

This entry points to some of the promise of the semantic web: Harvesting data from the vast amount of semi-structured data stores already out there. Blogging software already collects data in nicely structured ways, and stores them in databases.

Adding the ability to add slightly more semantic information comes at a very low cost in terms of time spent preparing the data.

Adding the ability of querying that data and linking information together using RDF comes at no extra cost in terms of human work once the software to do so is there.

Marking up all kinds of static information may be interesting, but the promise lies in MT Redland and similar applications that help us take advantage of structure that is already there but not made accessible or interoperable.

Imagine the untold terabytes of data tied up in databases that are not exported because there wouldn't be a simple way for people to make use of the data in a sensible way. Now connect it together, and the potential value of that data will in many cases skyrocket.

April 05, 2005

Tag ontology design

Richard Newman has a writeup on his Tag ontology design including N3 notation for the draft ontology itself. The purpose is to create a generic way to associate tags with content with richer context than what is used for instance by Technorati.

One of the advantages of Richard's design is that, by using RDF, tags can be distributed separately from the content itself - for instance, tags occurring in an RSS feed (say from Technorati or del.icio.us) and in your blog could assert that a certain post is tagged with a certain tag in exactly the same way, or third parties could publish collections of tags from their own categorisation, again in the same format.

More over at Richard's blog

March 31, 2005

BlogPulse Conversation Tracker

Via Mike Linksvayer: BlogPulse Conversation Tracker is a specialized search that attempts to build a view of the "conversation" that is created by people commenting on a blog entry around the web.

This is the kind of application that would be so much easier with widespread use of the previously mentioned Thread Description Language, by explicitly annotating the pages to describe the relationship of the posts and comments to posts.

Instead of having to search, it becomes a simple matter of traversing the links in the documents. A Firefox extension that presents a tree view based on TDL annotation would be great... Unless someone else gets to it first, perhaps it's time to spend some time experimenting (please, let someone else get to it first, I'm spreading myself way too thin these days... :) )
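A rough sketch of the "traverse the links and show a tree" step, in Python. It assumes the annotations have already been harvested into plain (post, parent) pairs meaning "this post is a reply to that one"; the pair format and the function names are made up, and no actual TDL property names are used. It also assumes the reply links contain no cycles.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_thread(links: Iterable[Tuple[str, str]], root: str) -> dict:
    """Turn (post, parent) reply links into a nested tree rooted at `root`."""
    children: Dict[str, List[str]] = defaultdict(list)
    for post, parent in links:
        children[parent].append(post)

    def subtree(node: str) -> dict:
        replies = [subtree(child) for child in children.get(node, [])]
        return {"post": node, "replies": replies}

    return subtree(root)


def render(tree: dict, depth: int = 0) -> List[str]:
    """Flatten the tree into indented lines, one per post."""
    lines = ["  " * depth + tree["post"]]
    for reply in tree["replies"]:
        lines.extend(render(reply, depth + 1))
    return lines
```

With links such as `[("b", "a"), ("c", "a"), ("d", "b")]`, `render(build_thread(links, "a"))` gives one indented line per post, with `d` nested under `b`. The hard part in practice is collecting the links, which is exactly what explicit annotation would make trivial.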

March 30, 2005

The Temporal content of Web pages

OWL-Time is an OWL ontology for describing temporal aspects of web pages or web-services.

One very useful aspect of it is that it's fairly readable and well documented, and comes with several example files - as such it's a great way of getting more familiar with OWL.

Threaded Description Language

Thread Description Language (TDL)


TDL is an RDF vocabulary for describing threaded discussions, such as Usenet, weblogs, bulletin boards, and e-mail conversations.

So what could it be used for? One obvious thing would be to enable client software to access web based message archives without having to care about scraping the HTML to see if all the header information is there - just embed the RSS in a page.

Another would be as a uniform way of storing meta data about messages and their relationships, or exchanging that information with other applications.

March 29, 2005

Exploring the Semantic Web: MeNow and MusicBrainz

crsmith.net has an interesting entry on using RDF data from MusicBrainz to export information about the music tracks he's currently listening to, and how it'll allow him to link that information to, for instance, license data, review information, FOAF data etc. without having to explicitly combine the data sets: MeNow and MusicBrainz

March 28, 2005

Playing with RSS

Been busy all day programming, and testing out RSS enabling assorted stuff, including my e-mail - just love how easy it is to churn out feeds from any information source available and instantly have it accessible from a wide variety of applications, including Firefox.

March 23, 2005

Improve my bookmarks!

It just hit me that it's extremely annoying to manage bookmarks, and I REALLY want people to stick more metadata in their document headers, and for bookmark managers to extract it and annotate the bookmarks and let me use the data to search the stored bookmarks. It's one of those blindingly obvious uses of RDF/RDF-A, and one where Dublin Core entities are already widely used for Search Engine Optimisation purposes so a lot of data is already in there, some of it in a form that fits exactly or almost with RDF-A.

Couple it with other RDF data sources available for webpages, such as RSS feeds and Open Directory, and you could get quite good coverage.

The upside of marking up your static content this way is that it would make it trivial to put together a program to scan your site and generate an RSS file of all recently changed pages as an added value for your regular users.
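The site-scanning program could look something like the following Python sketch. It is deliberately simplified: it uses file modification times and file names rather than extracted document metadata, only looks at `.html` files, and the function name, the `base_url` parameter and the seven-day window are all assumptions chosen for the example.

```python
import os
import time
import xml.etree.ElementTree as ET
from email.utils import formatdate
from typing import Optional


def recent_pages_rss(site_dir: str, base_url: str,
                     max_age_days: int = 7,
                     now: Optional[float] = None) -> str:
    """Return an RSS 2.0 document listing recently modified .html files."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Recently changed pages"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = (
        "Pages modified in the last %d days" % max_age_days)

    # Collect (mtime, path) for every page changed since the cutoff.
    changed = []
    for root, _dirs, files in os.walk(site_dir):
        for name in files:
            if name.endswith(".html"):
                path = os.path.join(root, name)
                mtime = os.path.getmtime(path)
                if mtime >= cutoff:
                    changed.append((mtime, path))

    # Newest first; the relative path doubles as a placeholder title.
    for mtime, path in sorted(changed, reverse=True):
        rel = os.path.relpath(path, site_dir).replace(os.sep, "/")
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rel
        ET.SubElement(item, "link").text = base_url.rstrip("/") + "/" + rel
        ET.SubElement(item, "pubDate").text = formatdate(mtime)

    return ET.tostring(rss, encoding="unicode")
```

If the pages carried proper metadata in their headers, the placeholder title would be replaced by the real title and description pulled from each document, which is where the markup starts paying for itself.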

RDF and the Semantic Web ludicrous ideas?

I came across this: RDF and the Semantic Web are ludicrous ideas | Semiologic, which is a short and thought provoking alternative view of the Semantic Web. I had to post a response, the most important parts of which I quote here (go visit semiologic to read the rest):


People won't mark up every little bit they put online, but that isn't needed: People will mark up the bits they care about. I'd rather have info that matters marked up than all kinds of fluff.

Companies selling online will mark up their catalogues because 1-2% extra sales for adding some extra processing of their products database is worth it, and it won't take much of an audience to a new product search engine before they'll be able to reach that.

(...)

Your example is contrived, because there's no point in marking up everything - you mark up whatever will have its value increased through markup. That means data that it's important for you that people can find and reason about.

(...)

A vast amount of the web HAS semantic information associated with it in the databases and content management systems it is generated from - but that information is lost when it is output into a form that is only human readable and not easily machine parsable.

Unlock 5-10% of the database content that is tied to the net and we already have the Semantic Web.

March 21, 2005

Semantic web as future reality

This entry at Fred on Something neatly summarises my painful experiences while reading the W3 specs and assorted tutorials this weekend:

The thing is that RDF is not intended to be easily understood by humans like simple XML documents. RDF is intended to be understood by machines.

However, I still think the lack of accessibility of the W3 specs is a big problem. The XML spec is reasonably accessible. Even the XML Schema spec is. I can sit down with them, read them, and start writing a parser. Granted, it wouldn't be a very good parser if I didn't know more than I'd learned from a single reading of the specs, but I'd be able to.

It's less important that the formats are inaccessible if the specs are easily accessible so we get good tools to deal with them.

Nobody cares that Postscript is painfully obtuse to read in a text editor, and that doing so won't really tell you much about the document it describes, because we have good tools to manipulate postscript files and few of us need to interpret the files directly.

However, the RDF and OWL specs are painfully dense, painfully fluffy, and full of mathematical terms that for me and most software engineers I know read as mostly nonsense.

This massively complicates the issue of getting good tools to work with it, and at this early stage even makes it hard to get people to understand the potentials of the technology.

I'm sure these specs represent great work, but it could have been so much better if more effort had been put into 1) examples and 2) presenting the normative semantics by specifying the intended effects in terms of observable effects on the RDF graph, or conceptual addition of RDF triples (even if the implementation wouldn't necessarily have to store these triples).

The triples aren't hard to understand. The RDF graph isn't hard to understand. The bloody description of the OWL semantics IS.

I wish the W3 would take some cues from ECMA, and do what ECMA did for ECMA 262 (the ECMAScript / Javascript specification), where the document specifies the semantics of the language by presenting expected results in terms of code rather than abstract mathematical terms.

Personally I have this intense hate for these kinds of specs as they're hardly ever needed.

I have no problems understanding how to implement a backpropagation neural network, for instance. However that is thanks to plain English or pseudo code descriptions of the algorithms involved. If somebody tried showing me a mathematical representation of it I'd glaze over instantly.

I've yet to see a single example of something presented in this kind of notation that isn't possible to do just as well in natural language, and that will be significantly more accessible to a significantly larger audience.

If you want to win a major prize in maths, then accessibility to the general public isn't needed as long as other leading scientists understand you. If you are trying to write specifications with the goal of transforming the web, which became successful largely because it was accessible and anybody could easily understand how to make use of the technology, it is.

N3 as a logic language

After yesterday's entry Understanding the Semantic Web: N3 to the rescue I went on to spend some time actually starting to write an N3 parser. The language is straightforward enough, though the BNF grammar was a bit awkward.

That might be because it was actually generated from an N3 description of N3 itself. I ended up reading the N3 description instead of the BNF, and got most of the way to having a working language checker (as in, it parses most of the language but throws away the result) in a couple of hours, and plan to fill it in to create a proper parser later this week.

N3 seems promising to me both as a way of exploring RDF and OWL and as a data format in its own right.

I'll want to implement a basic RDF storage model as well, but that seems quite straightforward (I'm looking at a testing ground, not production quality code)

I was looking at Redland yesterday too, and while I'm sure it's a fine system, it just seems far too complex for my taste.

N3 really drove home the idea that what RDF-S and OWL and the rest are really about is simple logic programming based around Horn clauses. It's a very constrained and simple model, which is good because apart from some toying with Prolog when I was a kid I haven't spent much time on it - this is a great opportunity to read up now that I have real world applications for it.
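To show what "Horn clauses over triples" amounts to in practice, here is a minimal forward-chaining sketch in Python. It implements just two RDFS-style rules (subclass transitivity, and propagating a type up the class hierarchy) by repeatedly adding inferred triples until nothing new appears. The function name and the example data are made up, and this is nowhere near a complete RDFS or OWL reasoner.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer(triples: Iterable[Triple]) -> Set[Triple]:
    """Apply two rules until a fixed point is reached.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A),       (A subClassOf B) => (x type B)
    """
    facts = set(triples)
    while True:
        new = set()
        for a, p1, b in facts:
            for c, p2, d in facts:
                if b != c or p2 != SUBCLASS:
                    continue
                if p1 == SUBCLASS:
                    new.add((a, SUBCLASS, d))
                elif p1 == TYPE:
                    new.add((a, TYPE, d))
        if new <= facts:      # nothing new was inferred: done
            return facts
        facts |= new
```

Given that a dog is a mammal, a mammal is an animal, and Rex is a dog, the result contains the triple saying Rex is an animal even though nobody wrote it down. Each rule is a Horn clause: a conjunction of triple patterns implying one new triple, much like a Prolog rule.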

March 20, 2005

Understanding the Semantic Web: N3 to the rescue

I've spent most of today reading up on RDF, OWL and other painful stuff. Things were going really slowly (what f******d decided using set theory to describe the RDF semantics was a good idea, when it could have been so "simple" if they'd instead just explained things in terms of what triples could be inferred) until I came across N3.

I briefly mentioned Metalog earlier, and that was a great start - allowing me to play around with "human readable" assertions. But N3 is a step closer to the "real thing", and in fact N-Triples, a reduced form of N3, can be generated by Metalog.

N3 is part of a Semantic Web Application Platform (or Playground) set up to facilitate practical demonstrations of semantic web technology. So far it's succeeded for me - it's told me far more about the Semantic Web than any of the specifications.

The N3 grammar seems clumsy and badly documented, but there is a great tutorial covering N3 and how to apply it to the Semantic Web.

If you feel brave, you might also want to take a look at Euler - a Java app to verify conclusions by inferring proofs for them. The Java code for Euler is some of the nastiest stuff I've seen (1700 lines in one class and pages upon pages in a single function) but it seems like something worth investigating further once I've digested more of the N3 stuff.

Ah... I get it now: It's Prolog all over again

Metalog - the semantic web query/logical system seems to be exactly what I've been looking for in terms of allowing simple exploration of Semantic Web technologies without all the lofty promises clouding things up.

Take a look at the quick guide and if you've ever read an introductory article on Prolog you'll be right at home... (In fact Metalog uses Prolog for its reasoning support)

It's helpful in that it allows you to translate the Metalog input into RDF triples and RDF/XML format, while playing with it in a pseudo natural language, so it drives home the mappings much more effectively than most tutorials I've seen.

I wish Metalog had OWL support too, but it's a start, I guess.

Ontology Development 101: A Guide to Creating Your First Ontology

Stanford has a great tutorial online as part of their Protégé project (an open source ontology editor in Java w/an OWL plugin). You can find the tutorial here

I found the link to Protégé at Chaz Blog - seems he's running into some of the same problems I do with how to apply this stuff in practice.

What on earth are Ontologies and taxonomies?

Search Science has a great little summary here

March 19, 2005

The complexities of OWL

I've spent today working on an essay on the Semantic Web, and reading up more on OWL in particular.

What hits me is the complexity. The OWL Guide was a big help, but I still find it difficult to see how to apply it to the real world. I mean, I can see the potential - the idea of being able to effectively convey semantics, even in the face of data using different ontologies, and the promise of being able to query about properties that are not explicitly written out in the data through machine reasoning.

But I've yet to find a tutorial or introduction that addresses more directly useful scenarios instead of the "let's build a complex ontology" scenario.

That, and the lack of a wide choice of tools to reason about OWL ontologies means that we're likely still years away from seeing the real promise of the Semantic Web realised.

In the meantime there is still lots we can do to approach the Semantic Web gradually. One of the main things is to embrace RDF directly or through RDF-A or GRDDL. Without OWL we're stuck doing things like inferring mappings between various ontologies by ourselves, but the more widespread RDF datasources become, the more incentive we create to invest in tools that can solve the interoperability issue (whether by making OWL usable, or by finding something else).

I'm curious about to what extent the complexity of OWL is needed, or whether it is complex because the problem is still not sufficiently well understood and a simpler solution may come along.

David Weinberger on taxonomy and tags

David Weinberger has posted some short notes from his birds of a feather session on taxonomy and tags at etech that are well worth a read.

They sum up quite simply the differences between "real world" taxonomies, structured around the concept of straight subdivision of concepts, and the emerging categorisation that occurs with tagging, where everything is "one big pile" of concepts with added semantic markup that leaves the categorisation to users (whether human or software).

March 17, 2005

Semantic Web Round Up

I've written a few entries about the Semantic Web already, but since the deadline for my essay for the MSc. course I'm doing is nearing, I've started rounding up a few links that I think are worth sharing as well.

So what does the Semantic Web look like, then?

That's a debate that seems to be getting more and more heated.
On the more artistic side Dan Cooney has an interesting interpretation (via Edward Vielmetti).

The Semantic Web is here is a great introductory presentation by Eric Miller (via hannes.kaywa.com)

At the heart of this debate is the discussion on whether folksonomies or ontologies provide the most value in a distributed, uncontrollable medium like the internet.

This also tends to translate into a debate on whether to use micro formats or RDF as the carrier of semantic information.

Some thoughts on RDF vs micro formats

A quick and dirty RDF tutorial (via Ebiquity blog at UMBC - Thanks!) is a good way to start if RDF and the semantic web are completely new to you.

Regardless of past experience with the semantic web I would also recommend Tantek's presentation The Elements of Meaningful XHTML which provides a great overview of Micro formats and how to convey as much semantic information as possible through the use of XHTML alone.

However, while I see micro formats and overloading XHTML as useful to some extent, it misses a lot of the potential of the Semantic Web by not making the semantics of the markup easily discoverable, for instance through RDF-S or OWL.

RDF without the nasty syntax

One way of getting that benefit while at the same time achieving much of the simplicity and bottom-up approach to the semantic web is RDF-A.

Instead of specifying individual micro formats, RDF-A is an attempt to provide a trivially simple way of attaching attributes and elements to an XHTML/XML document from which it is easy to derive RDF triples.

The benefit is that you can do all the fancy stuff that people are working towards doing with RDF, while at the same time getting most of the simplicity that Tantek and the guys at Technorati are going for with the micro formats.

You can have your XHTML and still get RDF too

Another approach to solving this problem is GRDDL:

A mechanism for using transformations (in XSLT in particular) to express the relationship between XHTML dialects and RDF in order to expose the data in these dialects to the Semantic Web. The mechanism extends straightforwardly to XML formats in general.

GRDDL would allow micro formats to live alongside RDF eating agents by letting the XHTML specify a transformation, for instance an XSL document that specifies how to transform the XHTML into RDF.
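The discovery half of that mechanism can be sketched in Python as below. It assumes the convention from the GRDDL drafts, where the XHTML `head` carries the profile `http://www.w3.org/2003/g/data-view` and `link` elements with `rel="transformation"` point at the stylesheets; the function name is made up, and actually running the XSLT is left out because Python's standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET
from typing import List

XHTML = "{http://www.w3.org/1999/xhtml}"
# Profile URI as used in the GRDDL drafts (assumed here).
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def transformation_links(xhtml_text: str) -> List[str]:
    """Return the hrefs of the transformations an XHTML page declares.

    Only the discovery step is shown; applying each stylesheet to the
    page to obtain RDF would be done by a separate XSLT processor.
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(XHTML + "head")
    if head is None:
        return []
    # The page must opt in by naming the profile on its head element.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    return [
        link.get("href")
        for link in head.iter(XHTML + "link")
        if "transformation" in link.get("rel", "").split() and link.get("href")
    ]
```

An RDF-consuming agent would call something like this on each page it fetches, run every listed transformation over the page, and merge the resulting triples, without ever needing to know which micro format or dialect the page was written in.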

GRDDL could be used for RDF-A as well, obviating the need for an RDF processor to specifically know RDF-A or future alternative syntaxes as long as it knows how to apply the transformations.

See GRDDL as the glue that makes it possible for you to more or less ignore the "war" between RDF and micro formats as markup if all you want to do is write apps that consume RDF.

In the end I prefer RDF-A over Microformats because it seems to give more potential for reuse.

Mike Linksvayer knows this stuff better than I do, and has this to say (this was where I found cool stuff like RDF-A)

Who'll tag all this stuff?

In this article Russel Glass raises exactly that question and goes on to say:

Just as the Web, however, allowed an upstart like Amazon to compete with Barnes & Noble, the Semantic Web has the potential to level the playing field again for a whole new generation of startups. With the Semantic Web in place, any vendor will have the ability to tag their product information and make it as easily accessible as Amazon.

I agree with this. One of the key values of the semantic web is that it breaks down virtual monopolies. Today, it takes a tremendous effort to gather together product information to set up a product search, for instance, because the information is harder to find than need be. Contrast it to how easy it is to find news items via RSS. Now imagine that a retailer can achieve only a one percent sales increase thanks to aggregators and new search services if they tag their data. They'd jump on it instantly.

Will the Semantic Web become a success?

Notes from The Semantic Web: Promising Future or Utter Failure, a panel discussion with Linksvayer, Galbraith, Marlow, Haughey, and Champeon is a good read for a quick introduction to the various viewpoints.

philwilson.org: Finding related items in your RSS datastore points out another issue: Once you have all this information tagged, as we do with blog entries via RSS for instance, how the f**k do we actually find what we want in between all the cruft that's bound to show up (take a look at the list of blogs at blo.gs for instance, and you'll see a worrying number of pure spam blogs), as well as all the stuff posted with a good intention that simply isn't your cup of tea.

March 16, 2005

The lowercase semantic web

Tantek Çelik and Kevin Marks have a presentation titled Real world semantics covering micro formats and other building blocks of a "lo-tech" version of the semantic web focusing on an evolutionary, developer led approach rather than the committee approach.

Danny Ayers: XML and/or RDF

Danny Ayers has another great article on RDF, using Amazon's OpenSearch Description Document. He briefly ties it in to FOAF and DOAP (an RDF schema describing a vocabulary for describing an open source project).

March 15, 2005

The Long Tail of Recombinant Components - Bridging the Semantic Gap

In Manageability - The Long Tail of Recombinant Components, Carlos E. Perez expands on the previously discussed article The Long tail of Software. Millions of Markets of Dozens by addressing how this relates to recombinant computing - or the ability to assemble an application from configurable components.

(See also Strong Signals: The Recombinant Corporation by John Parkinson for a higher level look at recombinant components.)

The idea of recombinant components is about more than just being able to configure components and tie them together - it is about being able to adapt and reuse software without engineering in the traditional sense.

One of the key problems that needs to be solved in order to allow proper interaction between disparate components is semantic disparities in the vocabularies they use to communicate. Well defined ontologies will likely play a key role, and as such Semantic Web technology is likely to play an important role by bridging the semantic gap between components created in different environments.

This semantic gap is present in many areas of software engineering - different components operate at different levels of complexity, or make different assumptions about how a system should work.

Even components that operate on the same level of complexity will often end up using disparate technologies to communicate with the outside world, or different vocabularies if they happen to share the same external representation (such as XML).

To allow a user to reconfigure an application, you can't rely on components where "glue" code has to be written in a programming language to modify how the application works. Even basic scripting often proves a significant barrier, and is brittle in the face of evolving software.

OWL - the Web Ontology Language, a component of the Semantic Web, may prove to be an enabling technology for recombinant components by allowing software to reason about software - provide facts about the software in OWN and RDF, and some degree of bridging the Semantic Gap can be automated, provided that components are built with it in mind.

Some work has already been done to represent UML in RDF (see also A Discussion of the Relationship Between RDF-Schema and UML), allowing UML diagrams describing parts of a component to be converted to a form that can be reasoned about automatically.

With components being well documented with UML, the UML being represented as RDF, and the appropriate RDF or OWL schemas, one would be approaching a situation where one could imagine a visual interface for recombining components based on automatically generated "bridges" between disparate vocabularies, where the software would be able to give guidance as to which connections make sense from a semantic perspective.
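
As a rough illustration of what such guidance might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the Port class, the "crm:" and "billing:" terms and the hand-written BRIDGE table are hypothetical, and in the scenario above the bridge would be derived from the RDF/OWL descriptions rather than typed in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    component: str  # name of the component that owns the port
    name: str       # name of the port itself
    term: str       # vocabulary term describing the data on this port


# Hypothetical bridge: each vocabulary term mapped to a shared concept.
BRIDGE = {
    "crm:customerName": "concept:PersonName",
    "billing:payer": "concept:PersonName",
    "billing:amountDue": "concept:Money",
}


def suggest_connections(outputs, inputs, bridge):
    """Pair each output with inputs that carry the same shared concept."""
    suggestions = []
    for out in outputs:
        concept = bridge.get(out.term)
        if concept is None:
            continue  # nothing known about this term, so offer no guidance
        for inp in inputs:
            same_concept = bridge.get(inp.term) == concept
            if inp.component != out.component and same_concept:
                suggestions.append((out, inp))
    return suggestions
```

A visual tool could call suggest_connections with the ports of the components on screen and only highlight the pairs it returns.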

I'll probably revisit this subject once I've spent some more time reading up on the semantic web.

March 14, 2005

UMBC Semantic Web Reference Card

Via: Danny Ayers, Raw Blog:

"The UMBC Semantic Web Reference Card is a handy "cheat sheet" for semantic web developers and programmers. It can be printed double sided on one sheet of paper and tri-folded. The card lists common RDF/RDFS/OWL classes and properties, popular namespaces and terms, XML datatypes, reserved terms, grammars and examples for encodings, etc."

TV Listings via RSS/Atom?

I want my TV listings via RSS or Atom. Atom would be great because it allows arbitrary XML to be inserted, so you could embed RDF triples to provide machine readable versions of most of the information.
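
A minimal sketch of the idea in Python, using only the standard library. The "tv" vocabulary at example.org, the listing_entry function and the sample values are all made up for illustration; a real feed would need an agreed vocabulary for channels and start times.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TV = "http://example.org/tv#"  # hypothetical vocabulary, not a real schema

ET.register_namespace("atom", ATOM)
ET.register_namespace("rdf", RDF)
ET.register_namespace("tv", TV)


def listing_entry(programme_uri, title, channel, start, updated):
    """Build one Atom entry with the listing facts embedded as RDF."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = programme_uri
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated

    # Machine readable copy: one rdf:Description about the programme.
    rdf = ET.SubElement(entry, f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": programme_uri}
    )
    ET.SubElement(desc, f"{{{TV}}}channel").text = channel
    ET.SubElement(desc, f"{{{TV}}}start").text = start
    return entry


entry = listing_entry(
    "http://example.org/listings/1",
    "Example Programme",
    "Example Channel",
    "2005-03-14T20:00:00Z",
    "2005-03-14T09:00:00Z",
)
xml_text = ET.tostring(entry, encoding="unicode")
```

A feed reader that knows nothing about the "tv" terms can still show the title, while a schedule tool can pick out the channel and start time from the RDF.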

It would in particular be interesting to allow easy publication of customised schedules for fan sites etc.

Maybe I'll have to hack something together based on XML-TV

Now that I think of it, this would work great for things like playlists as well...

Do we need the Semantic Web?

ZDNet UK takes a brief look at the current status of the Semantic Web in this article.

I'm currently preparing an essay on the Semantic Web as part of my MSc studies with the Open University, so it's a subject I'm particularly interested in, and in some ways it was actually my interest in the Semantic Web that finally got me to set up a blog. Why?

Because bloggers are among the first to actively embrace elements of the Semantic Web, such as RDF used to varying extents in RSS and Atom.

As such, we do need the Semantic Web. Every time we link an RSS/Atom feed to a page, or add a FOAF profile to a page, we're building the Semantic Web.

Longer term, the work on ontologies and schema languages such as RDF Schema and OWL promises to bring a lot more advanced features, such as reasoning about data to allow agents to "understand" new data types and integrate data types using new vocabularies automatically into their data models.

However, in the short term, whenever the scripts you use to maintain your blog autodiscover trackback info, or you fire up your RSS reader, you are seeing the early fruits.

As for benefits, consider this: Bloggers are currently dealing with older RSS, RSS 2 and Atom, all of which have different vocabularies.

One of the most immediate benefits of the Semantic Web is unifying vocabularies. Instead of asking for the author of a feed, you'll use a library to query for a specific concept. When somebody releases YetAnotherFeedFormat v1, and it's RDF linking to an OWL schema, your software would automatically know that your "author" concept matches YetAnotherFeedFormat's "SomeDudeWhoWroteAnEntry" tag, except that the latter also includes "CoolNickName", and would be able to use the data from the new format without you having to release a new version.
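
A minimal sketch of that lookup in Python, assuming the feed data is held as (subject, predicate, object) triples and that the schema declares the new tag equivalent to a known property. The "yaff:" terms and the sample entries are invented for the example, and a real implementation would use an RDF library and an OWL reasoner rather than this hand-rolled loop.

```python
# Hypothetical sketch: look up a concept through declared equivalences.
# The "yaff:" prefix stands in for YetAnotherFeedFormat.

EQUIVALENT_PROPERTY = "owl:equivalentProperty"

schema = {
    ("yaff:SomeDudeWhoWroteAnEntry", EQUIVALENT_PROPERTY, "dc:creator"),
}

feed = {
    ("entry:1", "yaff:SomeDudeWhoWroteAnEntry", "Alice"),
    ("entry:1", "yaff:CoolNickName", "al"),
    ("entry:2", "dc:creator", "Bob"),
}


def equivalents(prop, schema):
    """Return prop plus every property declared equivalent to it."""
    found = {prop}
    changed = True
    while changed:
        changed = False
        for s, p, o in schema:
            if p != EQUIVALENT_PROPERTY:
                continue
            # Equivalence is symmetric, so follow it in both directions.
            for a, b in ((s, o), (o, s)):
                if a in found and b not in found:
                    found.add(b)
                    changed = True
    return found


def values(subject, prop, data, schema):
    """Collect values of prop on subject, via any equivalent property."""
    names = equivalents(prop, schema)
    return sorted(o for s, p, o in data if s == subject and p in names)
```

Asking for "dc:creator" on "entry:1" finds the value stored under the new format's tag, without the calling code having to know that the tag exists.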




About me

E-mail: vidar@hokstad.com Skype: vhokstad
Twitter: vhokstad
View my LinkedIn profile.

I was born April 21st, 1975, in Oslo, Norway. Since 2000 I've been living in London, UK. I'm married and we just had our first child, Tristan Ikemefuna Hokstad.

I'm working for Aardvark Media as Director of Technology. I'm also currently on the board of SpatialQ, a startup in the GIS space, and an advisor to Skoach, a startup doing a time management app for people with ADD.
