Here is a partially simplified UML diagram of Atom-FOAF.
The reality is a little more complicated because there are in fact two ways to represent an Entry:
- the simple default one shown here
- another way that takes into account the possible states an Entry can have over time.
There is a simple logical relation between the two views, which I will get into in a later blog entry.
Two important things to notice here are the yellow and green background zones.
The classes on the green background come from the FOAF namespace. Those on the yellow background have until recently been thought to belong to the Atom namespace. It is my contention here (arrived at after a long conversation with Ken McLeod) that these Feed classes are in fact much more general, and don't in any particular way belong to Atom. We can find similar structures in many places we look on the web - pretty much anywhere we need to chunk a potentially large list of results into smaller sections - such as for example search engine results, WebDav search results(?)... So this is a first attempt at simplification. By pushing out everything Atom related into the Blog class located on the white background reserved for Atom concepts, we end up with a little 'Feed' structure that could be nicely useful elsewhere (after due renaming perhaps) and with a Blog class where we can place a lot of the 'introspection' information.
The UML diagram is of course backed up by the formally specified Atom OWL spec that ships with this release.
|title||It's all about the Entries, stupid!|
|author||bob wyman||created||13 Aug 2004 03:30:00 GMT||N3 file||entry.2004-08-13-0530.n3||RDF file||entry.2004-08-13-0530.rdf||Atom-like RDF file||atom.2004-08-13-0530.rdf|
This also very nicely illustrates what I was trying to get at with my "It's about the Entries, Stupid!" post, which I sent a few months back. By simplifying the model down to the core it becomes apparent that it is indeed the Entry that is at the core of Atom. The Person and the Feed concepts are not central to Atom. They are generic concepts that can be found and used elsewhere.
|title||N3 illustration - The Feed|
|author||H. Story||created||13 Aug 2004 08:47:00 GMT||N3 file||entry.2004-08-13-1047.n3||RDF file||entry.2004-08-13-1047.rdf||Atom-like RDF file||atom.2004-08-13-1047.rdf|
To start off let us look at the feed files. There are two sets of these files:
- feed.n3, which is the head of the feed, the dynamic file that changes whenever a new entry is added to the blog. This is the file that blog readers will be polling every so often.
- feed-entries_x_to_y.n3,... each of which is an archive of older feed entries. These files SHOULD NOT change, making them prime candidates for caching.
Each of these files is a part of the whole result. I don't yet have a concept for the union of all the content in these files. This may be something that needs adding. Each file points to the previous results with code such as
[ a :Link ;
:href <feed-entries_0_to_3.n3> ;
:mime-type "application/rdf+n3"^^xsd:string ;
:text "previous 4 entries"^^xsd:string ]
which says that the previous entries can be found at the resource <feed-entries_0_to_3.n3>.
<feed-entries_0_to_3.n3> itself points to the dynamic element of the feed thus
<> a :Feed ;
:about <blog.n3> ;
:dynamic <feed.n3> ;
which points to the dynamic part of the feed. It also points to the blog file that contains the so-called 'introspection' information about the blog: namely where the URL for adding new entries is located, and other things which I know are not yet fully thought through.
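As a rough illustration of how a reader might walk this chain of files, here is a minimal Python sketch. It assumes the feed files have already been parsed into plain dictionaries. The `FEEDS` table, its `entries` and `previous` keys, the `walk_feed` function and the archived entry names are all hypothetical, invented for the example; they are not part of the Atom OWL vocabulary.

```python
# Hypothetical, already-parsed feed files. Each one lists its entries and
# names the archive holding the previous entries (None when there is none).
FEEDS = {
    "feed.n3": {
        "entries": ["entry.2004-08-13-1047.n3", "entry.2004-08-13-1445.n3"],
        "previous": "feed-entries_0_to_3.n3",
    },
    "feed-entries_0_to_3.n3": {
        "entries": ["older-entry-a.n3", "older-entry-b.n3"],  # made-up names
        "previous": None,
    },
}


def walk_feed(feeds, head="feed.n3"):
    """Yield entry locations from the head of the feed back through the archives."""
    seen = set()
    current = head
    while current is not None and current not in seen:
        seen.add(current)  # guard against a chain that loops back on itself
        page = feeds[current]
        yield from page["entries"]
        current = page["previous"]


all_entries = list(walk_feed(FEEDS))
```

Since the archive files should not change, a client built along these lines would only ever need to re-fetch the head of the feed.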
Notice: The current feeds contain very little information. They point to the entries themselves.
feed.n3 for example points to four (only four for illustrative purposes) entries as shown here:
<> a :Feed ;
:dynamic <> ;
:entry <entry.2004-08-13-1047.n3> , <entry.2004-08-13-1445.n3> ,
<entry.2004-08-13-1752.n3> , <entry.2004-08-13-1632.n3> ;
To help clients tell which entries they have or have not downloaded we can add further information such as the EntryID and the EntryVersion of each of these entries. That is done further down in the file:
:entry-version <tag:bblfish.net/20040813/1047/blog1#version1> ;
:id <tag:bblfish.net/20040813/1047/blog1> .
:entry-version <tag:bblfish.net/20040813/1445/blog1#version1> ;
:id <tag:bblfish.net/20040813/1445/blog1> .
Clearly a lot more could be added. One could add the title (an obvious addition), perhaps the publication date, the last changed date... One could of course add everything, as with the AllInOneDatabase.n3, but that would be extremely wasteful in bandwidth and very un-RESTful. What to add and not to add is really an empirical research topic. Having very little information is not really a problem. As long as the client can determine where the entries are and which entries it has already fetched (hence the entry-version field) it will only need to fetch the content once. With HTTP 1.1 Persistent Connections, having to make multiple requests is not at all a problem.
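To make the client side of this concrete, here is a small Python sketch of how a reader could decide what to fetch. It assumes the feed has been parsed into (location, id, entry-version) tuples and that the client keeps a dictionary of the versions it already holds; the `entries_to_fetch` function and both data structures are hypothetical choices for the example, not something the feed format prescribes.

```python
def entries_to_fetch(listing, cache):
    """Return locations whose advertised version is not the one already cached.

    listing: list of (location, entry_id, entry_version) tuples read from the feed.
    cache:   dict mapping entry_id to the entry_version the client already holds.
    """
    return [
        location
        for location, entry_id, entry_version in listing
        if cache.get(entry_id) != entry_version
    ]


listing = [
    ("entry.2004-08-13-1047.n3",
     "tag:bblfish.net/20040813/1047/blog1",
     "tag:bblfish.net/20040813/1047/blog1#version1"),
    ("entry.2004-08-13-1445.n3",
     "tag:bblfish.net/20040813/1445/blog1",
     "tag:bblfish.net/20040813/1445/blog1#version1"),
]

# Assumed client state: only version1 of the 10:47 entry has been fetched.
cache = {
    "tag:bblfish.net/20040813/1047/blog1":
        "tag:bblfish.net/20040813/1047/blog1#version1",
}

to_fetch = entries_to_fetch(listing, cache)
```

An entry is fetched again only when its entry-version differs from the cached one, which is exactly why the version is worth listing in the feed.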
|title||Two perspectives on a blog entry|
|author||Henry Story||created||13 Aug 2004 12:45:00 GMT||N3 file||entry.2004-08-13-1445.n3||RDF file||entry.2004-08-13-1445.rdf||Atom-like RDF file||atom.2004-08-13-1445.rdf|
The current model proposes two views on an entry:
- the simple Entry, that can be found at a certain retrievable location, and shows only its current state.
- the Entry as a historical thing, that encompasses all the changes that occurred to it in the actual world (we don't deal with counterfactual entries). This is the EntryID and its associated EntryVersion-s.
This is illustrated by the following diagram:
Again I have tried to highlight the two areas by placing their classes on differently colored backgrounds. On the yellow background is the main class for the temporal Entry representation, and on the green background we have the atemporal Entry representation. Given any one of these one can deduce the other, i.e. they are logical consequences of one another.
Some of the main points distinguishing them are:
- An Entry has a URL that allows one to fetch the information (for example a relative URI such as entry.2004-06-29-1010.n3), whereas EntryIDs and EntryVersions are URNs such as tag:bblfish.net/20040629/1010/blog1#version1, which will indeed uniquely identify an Entry, but will not allow one to retrieve it without a search engine. This creates a fundamental difference in use between these two ways of looking at the entry. An Entry is what people should be editing and fetching in a RESTful manner using GET, POST, and PUT. An EntryID is how a client would identify the Entry-s it downloaded to keep track of the changes to them, and that to which they were responding, so it could follow how the entry propagated around the web, etc. The EntryID and EntryState classes are key elements in databases such as AllInOneDatabase.n3, which contains all the information about all the entries in this directory.
- An entry must have an id and of course an entry-version. The unchanging parts of the Entry, its essential properties, go into the EntryID structure. The contingent properties of an Entry go into the EntryVersion. An EntryID may on the other hand have a number of EntryVersion-s. Each of these EntryVersions represents the state of the Entry over a particular span of time. From an Entry one can easily deduce the EntryVersion and EntryID fields. To go in the opposite direction one first needs to select the latest EntryVersion of an EntryID.
- An Entry can be a reply to another EntryVersion. It is important to keep track of which version of an entry one is replying to, as this can significantly change the meaning of a response. This could help clients flag responses that might need to be updated or even deleted, or it could make readers aware that a response may no longer be relevant to the entry it relates to.
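The last two points can be sketched in a few lines of Python: picking the latest EntryVersion of an EntryID, and flagging a reply whose target version has since been superseded. This is only an illustration under stated assumptions. The `valid_from` field, the `#version2` tag and both timestamps are made up for the example; the model itself does not say how versions are ordered.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class EntryVersion:
    tag: str              # version URN, e.g. "...blog1#version1"
    valid_from: datetime  # hypothetical field: when this state began


def latest_version(versions):
    """Select the most recent EntryVersion of one EntryID."""
    return max(versions, key=lambda v: v.valid_from)


def reply_is_stale(in_reply_to, versions):
    """True when the replied-to version is no longer the latest one."""
    return latest_version(versions).tag != in_reply_to


# Assumed history of one entry: version2 and both times are invented.
history = [
    EntryVersion("tag:bblfish.net/20040813/1445/blog1#version1",
                 datetime(2004, 8, 13, 14, 45)),
    EntryVersion("tag:bblfish.net/20040813/1445/blog1#version2",
                 datetime(2004, 8, 14, 9, 0)),
]

stale = reply_is_stale("tag:bblfish.net/20040813/1445/blog1#version1", history)
```

A client could use such a check to warn a reader that the entry being answered has changed since the reply was written.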
|title||Graphical illustration of the two perspectives|
|author||H. Story||created||13 Aug 2004 14:32:00 GMT||N3 file||entry.2004-08-13-1632.n3||RDF file||entry.2004-08-13-1632.rdf||Atom-like RDF file||atom.2004-08-13-1632.rdf|
Let me illustrate the two views on an Entry graphically, so as not to have to take sides among the many possible serialisations of semantic web triples: N3, N-Triples, RDF, ... Each of these serialisation formats can be mapped onto a graph of triples, as explained in the W3C's RDF Semantics paper. Here I represent resources by rounded rectangles, blank nodes by circles, literals by rectangles, and of course predicates by named arrows.
Let us start off with a simple graphical representation of an Entry written in a file entry1.n3, written by Karl Dubost, where he asserts the cryptic '2b v not2b'.
The id of the entry is tag:e1, and it is the first version, as hinted at by the entry-version, which is tag:e1#v1. The entry was created on 11 Jun 2002 at 5pm, and was published (issued) shortly thereafter, at 10 minutes past 5. (Note that since we know that the entry is written by Karl Dubost, we may be able to find who his friends are if we have access to some FOAF files that mention him.)
Perhaps a little later Karl finds that he wants to make a change to his entry. He prefers titles to start with capitals, and changes his statement to a question. He is still thinking about this change, so it does not yet have a publication date. (How we got this file is of course a problem for my story now.) As a result the graph we have is as follows:
Here I have highlighted in green the changes to the graph. Gone is the issued field, a modified date has appeared, and the data fields of the title and entry fields have slightly changed. Of course we have a new version id.
Any person who fetches entry1.n3 after the change (and after he issues it) will not be able to retrieve the original version, as it will have been completely replaced by the new one. They will know when the file was last modified though. But suppose someone were to keep track of all these changes: either the editor that Karl is using, in order to allow him to backtrack to previous versions were he to think he had made a mistake, or some aggregator that wanted to keep a fuller view of the changes made to the posts on Karl's web site (perhaps in order to notify the aggregator's owner that a post of Karl's he had replied to had changed). Such a tool would presumably want to keep the changes stored in its local database by organising the entries by EntryID, in a database similar to our AllInOneDatabase.n3. The graph for this entry would then look like this:
Here the root of the tree is the EntryID, which points to the two EntryVersions. Notice that in this case the EntryVersions have an entry-location, to help find the original entry file. The location is not attached to the EntryID as the location of an Entry could change over time. In this case the entry has remained in the same position.
It should be very easy to specify a logical relation allowing one to deduce one of the views from the other. Since we are speaking of ontology, there is no conceptual priority of one of these views over the other. They both exist simultaneously.
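One informal way to picture that relation is the Python sketch below, which folds successive snapshots of an Entry into an EntryID-keyed history and rebuilds the simple Entry view from it. It is a sketch under several assumptions, not the formal relation itself: entries are modelled as plain dictionaries, the split into essential and contingent properties (`ESSENTIAL`) is guessed, and the `tag:e1#v2` tag, the modified time and the changed content are invented, since the text does not give them.

```python
# Assumed split between unchanging and contingent properties.
ESSENTIAL = ("author", "created")


def record_snapshot(database, entry, location):
    """Fold one fetched Entry (a dict) into an EntryID-keyed history."""
    record = database.setdefault(
        entry["id"],
        {**{k: entry[k] for k in ESSENTIAL if k in entry}, "versions": {}},
    )
    contingent = {
        k: v for k, v in entry.items()
        if k not in ESSENTIAL and k not in ("id", "entry-version")
    }
    contingent["entry-location"] = location  # where this version was found
    record["versions"][entry["entry-version"]] = contingent
    return database


def current_entry(database, entry_id, version_tag):
    """Rebuild the simple Entry view from one chosen version."""
    record = database[entry_id]
    essential = {k: v for k, v in record.items() if k != "versions"}
    contingent = {
        k: v for k, v in record["versions"][version_tag].items()
        if k != "entry-location"
    }
    return {"id": entry_id, "entry-version": version_tag, **essential, **contingent}


v1 = {"id": "tag:e1", "entry-version": "tag:e1#v1", "author": "Karl Dubost",
      "created": "2002-06-11T17:00:00", "issued": "2002-06-11T17:10:00",
      "content": "2b v not2b"}
# Hypothetical second state: tag, modified time and content are made up.
v2 = {"id": "tag:e1", "entry-version": "tag:e1#v2", "author": "Karl Dubost",
      "created": "2002-06-11T17:00:00", "modified": "2002-06-11T17:30:00",
      "content": "2b v not2b?"}

db = {}
record_snapshot(db, v1, "entry1.n3")
record_snapshot(db, v2, "entry1.n3")
rebuilt = current_entry(db, "tag:e1", "tag:e1#v2")
```

Going from snapshots to the history loses nothing, and going back only requires choosing which version counts as current.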
|title||N3 illustration on the two pespectives|
|author||H. Story||created||13 Aug 2004 15:52:00 GMT||N3 file||entry.2004-08-13-1752.n3||RDF file||entry.2004-08-13-1752.rdf||Atom-like RDF file||atom.2004-08-13-1752.rdf|
The entry file using the Entry class is pretty easy to understand. As an example let us take the file describing this entry, namely entry.2004-08-13-1752.n3.
<> a :Entry ;
:author [ a <http://xmlns.com/foaf/0.1/Person> ;
The first line just specifies that this file ('<>' in N3) is an Entry. It then continues by specifying the author of the Entry using the FOAF classes. Everything else in the file is pretty self-evident. Perhaps the following requires a little closer look:
:id <tag:bblfish.net/20040813/1752/blog1> ;
:entry-version <tag:bblfish.net/20040813/1752/blog1#version1> ;
:in-reply-to <tag:bblfish.net/20040813/1445/blog1#version1> ;
:title [ a :Content ;
:data "N3 illustration on the two pespectives"^^rdf:string ;
Here we specify the id and entry-version tags. Notice that with a well designed URI structure one should be able to guess the id tag from the version tag. Here the version tag just consists of the entry id with '#version1' appended.
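With the URI structure used here, guessing the id from the version tag amounts to dropping the fragment. A minimal Python sketch follows; the `entry_id_from_version` function is a hypothetical helper, and the convention it relies on is specific to this blog's tags rather than anything the model requires.

```python
def entry_id_from_version(version_tag):
    """Derive the EntryID tag by dropping the '#versionN' fragment.

    Only valid under the convention used in this blog, where the version
    tag is the entry id with a fragment appended.
    """
    entry_id, _, fragment = version_tag.partition("#")
    if not fragment:
        raise ValueError("not a version tag: " + version_tag)
    return entry_id


entry_id = entry_id_from_version("tag:bblfish.net/20040813/1752/blog1#version1")
```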
Next comes the in-reply-to property. It relates the current Entry not to another Entry (with a URL) but to another EntryVersion. How do we retrieve the location of the EntryVersion to which this Entry is a reply? Well, further down in the file we have the following statement:
:entry-location <entry.2004-08-13-1445.n3> .
which associates that entry version with an Entry URL, namely the above entry.2004-08-13-1445.n3.
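In client code the lookup might go roughly as in the Python sketch below: find the entry-location recorded for the replied-to version, then resolve that relative URI against the URL of the file it was read from. The base URL `http://example.org/blog/` is a placeholder, and the `ENTRY_LOCATIONS` table and `reply_target_url` function are hypothetical names for the example.

```python
from urllib.parse import urljoin

# entry-location statements gathered from the entry file (version tag -> relative URI).
ENTRY_LOCATIONS = {
    "tag:bblfish.net/20040813/1445/blog1#version1": "entry.2004-08-13-1445.n3",
}


def reply_target_url(in_reply_to, locations, base_url):
    """Return the absolute URL of the replied-to version, or None if unknown."""
    relative = locations.get(in_reply_to)
    if relative is None:
        return None
    return urljoin(base_url, relative)


# Placeholder URL standing in for wherever this entry file was fetched from.
base = "http://example.org/blog/entry.2004-08-13-1752.n3"
target = reply_target_url(
    "tag:bblfish.net/20040813/1445/blog1#version1", ENTRY_LOCATIONS, base
)
```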
How does this Entry appear in the AllInOneDatabase.n3? We just need to search that file for the EntryID
a :EntryID ;
:author _:b2 ;
:created "2004-08-13T17:52:00+0200"^^xsd:dateTime ;
:in-reply-to <tag:bblfish.net/20040813/1445/blog1#version1> ;
:state <tag:bblfish.net/20040813/1752/blog1#version1> .
This just gives us the author _:b2, which is a blank node that is specified in more detail elsewhere in the file, the creation time, what this EntryID was in reply to, and the EntryState tag. The values for that tag are also to be found in the file, and its content should correspond to the text you are now reading.