[N.B. I've moved this discussion to the OHS development list...]
In message <3A525655.FF59E087@eng.sun.com>, Eric Armstrong writes:
>In conversation with Eugene, he's been asking
>the question: Are multiple parents for a node
>really necessary? Basically, if that concept
>adds complexity, maybe we should plan on not
>implementing it.
>
>I have felt that it was vital, but so far have
>not presented a compelling enough use case to
>settle the argument. This message is an attempt
>to do so. (And it led to another interesting
>insight that prompted me to jot down my thoughts.)
I'm sort of on Eric's side on this one. I do believe that content
re-use is so fundamental as to require some kind of means of managing
multiple parenthood. Of course as has been pointed out, with
node-level versioning this can get complicated.
Now, there is a simple way of handling this without causing many
problems (I believe) and that is to conceptually separate the nodes
from the paths used to access them (this is a strategy used in Scene
Graphs such as Inventor etc. where reuse of scene components for
essential for efficiency).
So, now I'd finally like to insert myself into the design discussions
in a serious way. Let's assume that we want to build a system that
allows (at minimum) the integration of arbitrary XML documents. What
I'll outline below gets us nearly there (and could easily be used as a
framework upon which to build DOM or SAX interfaces).
Let's walk through some simple IDL:
---- interface Node { readonly attribute ID id; };interface ElementNode : Node { readonly attribute string tag;
readonly attribute unsigned int numChildren; Node child (in unsigned long index);
Attr getAttrFromName (in string name); readonly attribute unsigned int numAttrs; Node attr (in unsigned long index);
};
interface TextNode : Node { readonly attribute string text; };
interface Path { readonly attribute Node node;
readonly attribute Document document; readonly attribute Path parent;
Path firstChild (); Path lastChild (); Path previousSibling (); Path nextSibling (); };
interface Document { readonly attribute Path root; };
----
So, what we've done above is to separate the structure of our tree into a set of *placeless* Nodes (some of which have children) and a Document tree defined in terms of Paths which define where Nodes are *placed* in a Document.
Given this basic structure, we can navigate easily and nodes need have *no parent whatsoever*. All of the tree-based relationships are defined wrt. the Path used to access a Node and not by the Node itself.
Now this division of labour does introduce some issues (and usually simple ways to resolve them).
1) Node IDs: Since we give each node a unique ID, searching for a node is most easily done be providing a mapping from Document->Node via the ID. We can extend Document as follows:
interface Document { ... Path getNodeFromID (in ID id); };
At a level higher than documents, we will also need to provide a map from 2) Attribution: Clearly it is the nodes themselves which should be tagged with creator/contributor tags. Thus:
interface User; interface Timestamp;
interface Node { ... readonly attribute User creator; readonly attribute UserList contributors; readonly attribute Timestamp creationTime; readonly attribute Timestamp modificationTime; };
3) Versioning: Again, it is the nodes which are the units for maintaining version information. Thus:
interface Node { ... readonly attribute Node olderVersion; readonly attribute Node newerVersion;
Node versionDated (in Timestamp time); };
In order to view a document as it was at some previous time, we may need to access dated root, so we'll need to modify the Document as well:
interface Document { ... Path rootDated (in Timestamp time); };
4) Editing: For a number of reasons (more on this later) it is useful to separate the interfaces for access from editing (and maybe even to granularize editing interfaces more finely). But since editing is *very* contextual, the editing interfaces need to be derived from Paths (not Nodes).
interface Path { ... EditablePath edit (in User user); };
interface EditableNode : Node { };
interface EditableElementNode : EditableNode { void setTag (in string name); void setAttr (in string name, in string value); void removeAttr (in string name); };
interface EditableTextNode : EditableNode { void setText (in string text); void splitText (in unsigned long index); };
interface EditablePath : Path { EditableNode editNode ();
Path insertBefore (in Node node); Path insertAfter (in Node node); void remove (); };
Note: All attribution and version updates are managed automatically by these methods. Since an EditablePath is created with a User specifier we have all ensured that all necessary information is available at that time.
Note: Since all insertions are performed wrt. a Path, it is easy to verify both permission and validity of the operation.
5) Creating: All Nodes are created via the EditablePath interface. (N.B. A newly-created node has not actually been inserted into the graph and must be passed as an argument to insertBefore or insertAfter).
interface EditablePath : Path { ... ElementNode createElement (in string tag); TextNode createText (in string text); };
General notes:
* As far as possible, this is a minimal interface description. It provides (usually) exactly one way to ask that a particular action be performed (unlike the DOM for example). It should be easy to see how some of the alternate DOM interface methods could be implemented with this framework.
* The public interfaces expose only readonly attributes. Without the ability to create an EditablePath this is entirely readonly. All modifications are done via methods of EditablePath and its subclasses.
* One subtle issue here is how remote node references change when a new version of a Node is created by modification of an existing Node. A naive implementation would leave old references to Nodes in place in the document tree, so there is a little bit of subtle implementation involved here. But since we had the foresight to establish a locus for representing the mapping from document tree to Nodes, this can all be hidden from the user in the Path implementation...
------------------------------------------------------------------------------- Lee Iverson SRI International leei@ai.sri.com 333 Ravenswood Ave., Menlo Park CA 94025 http://www.ai.sri.com/~leei/ (650) 859-3307
This archive was generated by hypermail 2.0.0 : Tue Aug 21 2001 - 17:57:58 PDT