
[ba-ohs-talk] Fenfire, RDF (re "Towards a Standard Graph-Based...")

Dear Eugene,    (01)

Toni Alatalo pointed me to your paper, `Towards a Standard Graph-Based 
Data Model for the Open Hyperdocument System`__. You make the point that 
a standard graph-based model could allow different applications to be 
integrated into a single whole. You say that RDF is one possible choice 
for a modeling language, but reject it because of its complicated syntax.    (02)

__ http://www.eekim.com/ohs/papers/graphmodel/    (03)

The Fenfire project (formerly `Gzz`_, formerly GZigZag), which I'm a 
developer on, has recently `adopted`_ RDF as its model (i.e., all data 
is stored in RDF).    (04)

.. _Gzz: http://savannah.nongnu.org/projects/gzz/
.. _adopted:
    http://mail.nongnu.org/archive/html/gzz-dev/2003-02/msg00058.html    (05)

Our aim is very close to what you describe; we want information from any 
application on a computer system (or network) to be available for 
linking with information from any other application, in any linking 
structure (for example, an IBIS discussion). For a person I'm in contact 
with, there should be a single node on my computer, connected to their 
address, their birthday, my appointments with them, emails I received 
from them, photos of them, and so on.    (06)

(We originally intended to implement this vision based on Ted Nelson's 
zzstructure, but we had to stop using that due to `patent problems`_. 
Toni has `noticed`_ that zzstructure was avoided by (some in) the OHS 
community `because of the patent`_.)    (07)

.. _patent problems:
.. _noticed:
.. _because of the patent:
    http://www.bootstrap.org/lists/ba-ohs-talk/0204/msg00187.html    (08)

So, I'm hoping for some discussion about whether RDF is the right choice 
for these goals. Your concern about RDF is the RDF/XML serialization 
syntax. However, the RDF model is specified independently (`Concepts and 
Abstract Syntax`_) from its serialization (`RDF/XML Syntax 
Specification`_).    (09)

.. _Concepts and Abstract Syntax:  http://www.w3.org/TR/rdf-concepts/
.. _RDF/XML Syntax Specification:
    http://www.w3.org/TR/rdf-syntax-grammar/    (010)

It is entirely possible to specify an alternate syntax for RDF, though 
RDF/XML is of course preferred as the standard. `N3`_ (Notation 3) is 
another syntax for RDF in actual use, designed to be practical for 
humans to read and write. There is ongoing work on XML `Schema 
annotations`_ that would allow any schema-conforming XML to be interpreted as 
RDF. For Fenfire, we need a canonical format for RDF graphs, so that 
equal RDF graphs are always serialized to the same byte sequence; we 
might invent our own serialization language for that.    (011)
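
To illustrate the canonical-format requirement, here is a minimal Python sketch. It assumes a graph with no blank nodes and only URI or plain-literal terms, writes one N-Triples-like line per triple, and sorts the lines so that equal graphs give identical bytes regardless of insertion order. The names ``term`` and ``canonical_bytes`` and the ``example.org`` URIs are hypothetical, and this is not a format Fenfire has adopted. Blank nodes would need a real canonical labelling scheme, which this sketch does not attempt.

```python
def term(t):
    """Render a term given as ("uri", value) or ("lit", value)."""
    kind, value = t
    if kind == "uri":
        return "<" + value + ">"
    # Escape backslashes, quotes and newlines so each triple stays on one line.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def canonical_bytes(triples):
    """Serialize ground triples (no blank nodes) to a deterministic byte string."""
    # A set removes duplicate triples; sorting fixes the line order.
    lines = sorted({" ".join(term(t) for t in triple) + " ." for triple in triples})
    return ("\n".join(lines) + "\n").encode("utf-8")


EX = "http://example.org/"  # hypothetical namespace
g1 = [
    (("uri", EX + "benja"), ("uri", EX + "name"), ("lit", "Benja")),
    (("uri", EX + "benja"), ("uri", EX + "knows"), ("uri", EX + "toni")),
]
g2 = list(reversed(g1))  # same graph, different insertion order
same = canonical_bytes(g1) == canonical_bytes(g2)
```

Sorting the serialized lines is the simplest way to make the output independent of how the graph was built; it only works here because every term already has a globally fixed name.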

.. _N3:  http://www.w3.org/2000/10/swap/Primer
.. _Schema annotations: http://www.w3.org/2003/02/schema-annotation.html    (012)

You `propose`_ `GSIX`_ and `GXL`_ as possible alternatives to RDF. I 
must admit that I have not surveyed the two as thoroughly as RDF, but 
from a cursory look, I feel that RDF is better suited to our goals 
(Fenfire's, and maybe also yours) than either of them.    (013)

.. _propose:  http://www.eekim.com/ohs/papers/graphmodel/#nid074
.. _GSIX:  http://www.concept67.fsnet.co.uk/gsix/
.. _GXL:  http://www.gupro.de/GXL/    (014)

Firstly, RDF's stated goal is that "Anybody can say anything about 
anything." To achieve this goal, it uses URIs to identify both nodes 
and edge types. This provides--    (015)

Orthogonality
     Two applications can say things about the same node without ever
     knowing about each other. Since each has its own set of properties
     (graph edge types), neither 'sees' or interferes with the other's.    (016)

     This allows different structures to overlap, as you
     `too describe`__.    (017)

     __ http://www.eekim.com/ohs/papers/graphmodel/#nid066    (018)

Location-independent links
     A node identified by a URI can appear in different graphs-- for
     example, my personal information manager, your design proposal,
     and an email to this list. When I view the mail, I see my
     personal connections to the nodes appearing in it; when you
     view it, you see yours. Anybody can publish links to it without
     modifying the original context (e.g. the mail).    (019)

     This seems pretty essential for a hypertext/hypermedia system.    (020)
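
The two properties above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples of strings, and the vocabularies ``PIM`` and ``MAIL``, the node URI, and all data values are made up for the example.

```python
# Hypothetical vocabularies for two applications that know nothing of each other.
PIM = "http://example.org/pim#"
MAIL = "http://example.org/mail#"
PERSON = "http://example.org/people/alice"  # one shared node URI (made up)

# Each application stores its own graph as a set of (subject, predicate, object).
pim_graph = {(PERSON, PIM + "birthday", "1990-01-01")}
mail_graph = {(PERSON, MAIL + "sent", "http://example.org/mail/1")}


def properties_of(graph, node, vocabulary):
    """Return the (predicate, object) pairs of `node` that use one vocabulary."""
    return {(p, o) for (s, p, o) in graph if s == node and p.startswith(vocabulary)}


# Location-independent links: the same URI names the same node in both graphs,
# so a plain set union joins the edges at that node.
merged = pim_graph | mail_graph

# Orthogonality: each application looks only at its own edge types.
pim_view = properties_of(merged, PERSON, PIM)
mail_view = properties_of(merged, PERSON, MAIL)
```

Neither application had to change its data or know the other's vocabulary for the merge to work.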

From my cursory look, it seems that GXL uses URIs for edges (thus 
providing orthogonality) but not for nodes (thus not providing 
location-independent links). GSIX seems to use local identifiers for 
both edges and nodes.    (021)

Secondly, RDF has a really simple model. Here's my summary (I'm 
structuring it into 'steps' to make it easier to follow-- step 3 is the 
real model):    (022)

     Step 1. A directed labelled graph where each node is a URI
     (actually, URIref-- it can contain a fragment identifier), and each
     arc is labelled by a URI. In other words, a set of triples of URIs.
     Triples are interpreted as [subject, predicate, object].    (023)

     Step 2. In addition to step 1, the objects of triples can be
     *literals* instead of URIs. A literal is a Unicode string, with an
     optional datatype (URIref) and an optional language attribute ('de',
     'fi' etc).    (024)

     Step 3. In addition to step 2, the subjects and objects of triples
     can be *blank nodes*. A blank node is like a URIref, but local to a
     graph (if you join two graphs, the blank nodes in each one are
     different).    (025)
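
The three steps above could be modelled roughly as follows in Python. The class and function names (``URIRef``, ``Literal``, ``BlankNode``, ``make_triple``, ``merge``) are hypothetical and chosen for this sketch; it mirrors the summary above rather than the full RDF specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class URIRef:
    value: str  # step 1: a URI, possibly with a fragment identifier


@dataclass(frozen=True)
class Literal:
    text: str  # step 2: a Unicode string
    datatype: Optional[URIRef] = None
    language: Optional[str] = None  # e.g. 'de', 'fi'


@dataclass(frozen=True)
class BlankNode:
    label: str  # step 3: only meaningful inside one graph


def make_triple(subject, predicate, obj):
    """Check which kinds of term each position allows, then return the triple."""
    if not isinstance(subject, (URIRef, BlankNode)):
        raise TypeError("subject must be a URIref or blank node")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URIref")
    if not isinstance(obj, (URIRef, Literal, BlankNode)):
        raise TypeError("object must be a URIref, literal or blank node")
    return (subject, predicate, obj)


def merge(graph_a, graph_b):
    """Join two graphs, relabelling blank nodes so the two sets stay distinct."""
    def relabel(graph, prefix):
        def fix(t):
            return BlankNode(prefix + t.label) if isinstance(t, BlankNode) else t
        return {(fix(s), p, fix(o)) for (s, p, o) in graph}
    return relabel(graph_a, "a_") | relabel(graph_b, "b_")


EX = "http://example.org/"  # hypothetical namespace
name = URIRef(EX + "name")
g1 = {make_triple(BlankNode("x"), name, Literal("Benja"))}
g2 = {make_triple(BlankNode("x"), name, Literal("Toni", language="fi"))}
merged = merge(g1, g2)
subjects = {s for (s, _, _) in merged}
```

Both graphs use the blank-node label ``x``, yet after ``merge`` the two subjects are different nodes, which is the behaviour step 3 describes.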

GSIX seems to be reasonably simple, too-- it has 'wildcard placeholders' 
and 'cspaces' which I haven't taken the time to understand, but I 
suppose it would still be ok. GXL, on the other hand, tries to express 
*any* graph-- directed, labelled, attributed, hierarchical, whatnot.    (026)

For Fenfire, we want a very simple model because we want moderately 
skilled users to learn it and use it for their own benefit. We don't 
want users to be locked into expressing only the kinds of associations some 
programmer or designer came up with; rather, when they find they 
need to express their own kind of structure or relationship, they 
should be able to do so. And all structures should be viewable in a 
single 'structure editor' (even though we'll have all sorts of different 
views for application-specific data which you can switch back and forth 
between-- we share the OHS's vision here).    (027)

I also believe that having e.g. attributes on nodes in addition to 
connections on nodes (as possible in GXL) is exactly wrong, because it 
doesn't allow different applications to add other attributes, 
orthogonally. In RDF, all attributes would be represented as properties 
(edges in the graph). This way of expressing the same information is 
orthogonal as defined above.    (028)

The simplicity is also a reason for me to favor RDF over Nodal: in 
Nodal, you have predefined types like sequence or map; in RDF, the same 
information is expressed, but in terms of a very simple underlying 
structure (which also provides for orthogonality and 
location-independent links).    (029)

Finally, RDF has a user community-- not as big as XML's, but there are 
e.g. extensive libraries for Java, Python or C, and there's a lot of 
research going on based on it. There are parsers and writers, schema 
languages and tools for checking conformance to a schema, there are 
defined vocabularies (i.e., sets of relationships and nodes with an 
assigned meaning), etc. I even hope that we'll profit from the work 
being done w.r.t. semantics: If tools are developed for translating one 
web service's RDF vocabulary to another's, these tools should be usable 
to translate one hypertext application's vocabulary into another's; for 
example, we could use a view developed for one vocabulary to view data 
stored in another.    (030)
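
As a very rough picture of what such a translation might look like, here is a Python sketch that renames predicates from one vocabulary to another. Everything in it is an assumption made for illustration: the vocabularies ``A`` and ``B``, the property names, and the idea that a one-to-one predicate mapping is enough. Real translation tools would likely have to do considerably more.

```python
def translate(graph, mapping):
    """Rewrite predicates found in `mapping`; leave all other triples unchanged."""
    return {(s, mapping.get(p, p), o) for (s, p, o) in graph}


A = "http://example.org/app-a#"  # hypothetical vocabulary of application A
B = "http://example.org/app-b#"  # hypothetical vocabulary of application B
a_to_b = {A + "title": B + "heading", A + "linksTo": B + "refersTo"}

notes = {
    ("http://example.org/note/1", A + "title", "Graph models"),
    ("http://example.org/note/1", A + "linksTo", "http://example.org/note/2"),
    ("http://example.org/note/1", "http://example.org/other#flag", "draft"),
}

# A view written for vocabulary B can now read data stored with vocabulary A.
translated = translate(notes, a_to_b)
```

Triples whose predicate is not in the mapping pass through untouched, so data from other applications is preserved.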

If the work on XML Schema annotations takes off, we can access XML data 
from our system without building additional conversion tools.    (031)

So these are our reasons for selecting RDF-- *orthogonality*, 
*location-independent links* and *simple structure* being probably the 
most important. I'm hoping for, and looking forward to, discussion.    (032)

- Benja    (033)