Doug/Eugene summary -- August 22, 2000

From: Eugene Eric Kim
Date: Fri Aug 25 2000 - 08:12:45 PDT

Doug and I went through the exercise of recreating our respective
pictures of the OHS architecture from scratch. Doug then explained
some of his requirements for the OHS and some of his concerns about
its linking capabilities. Finally, we talked about the e-mail
component of the OHS.

OHS Architecture

I'll discuss the architectural diagram in detail in a separate e-mail,
because it'll take some time for me to regenerate it using xfig, and I
want to get this summary out quickly. I will make a few quick points
here, however.

At the center of the architecture is the OHS document. The OHS
document has structure and granular addressability. It should also be
able to support embedded links, although in our picture, we separate
those into a different box, the linkbase. In all of our discussions
thus far, we have been assuming that this document uses XML as its
file format, because XML supports all of its requirements: structure,
addressability, and links.

There are two crucial requirements for stage one of the OHS: it must
support multiple views of the OHS document for multiple clients, and
it must extend the capabilities of legacy documents. Both
requirements involve translation. The former means translating the
OHS document into some presentation form. The latter means
translating the legacy document into an OHS-supported document type.

For the sake of clarity, Doug has been using "transcoding" to
represent the former translation, and "translating" to represent the
latter translation. I personally don't like his choice of terms, but
the important thing is that he has chosen something, so I'll stick
with his terminology.

Also, note that the transcoding can occur in either the client or the
server stage. I'll discuss this in more detail in my other e-mail.
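
As a rough illustration of transcoding, here is a minimal Python
sketch that renders one structured document as two different views.
The element names ("document", "para"), the "id" attribute, and both
function names are assumptions made up for this example; nothing here
is an agreed OHS format.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical OHS document: structured, with an ID on every node.
OHS_DOC = """<document>
  <para id="n1">Structure and addressability.</para>
  <para id="n2">Links live in a linkbase.</para>
</document>"""


def transcode_to_text(root):
    """Plain-text view: one line per node, prefixed with its ID."""
    return "\n".join(
        "[{}] {}".format(p.get("id"), p.text) for p in root.iter("para")
    )


def transcode_to_html(root):
    """HTML view: one <p> per node, reusing the node ID as an anchor."""
    return "\n".join(
        '<p id="{}">{}</p>'.format(escape(p.get("id")), escape(p.text))
        for p in root.iter("para")
    )


root = ET.fromstring(OHS_DOC)
print(transcode_to_text(root))
print(transcode_to_html(root))
```

Both views are produced from the same source document, so the same
functions could run in either the client or the server stage.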

One very important thing that we haven't yet done is mock up some
user interfaces for this project. It's something that Doug would very
much like to see us develop. I know that some people on ohs-dev are
doing this right now, but I just want to reiterate the importance
of doing this.

Supporting Legacy Documents

One of the crucial features of the OHS is integration with legacy
documents. For our initial purposes, this essentially means the
ability to link to and from legacy documents. Initially, we also want
to support multiple viewing capabilities through our architecture, and
we want to enable people to add links to and from legacy documents
using legacy tools. In other words, I should be able to view a source
code file with links to and from it using the OHS, and be able to add
links to and from the source code using existing text editors.

XML gives us high resolution addressability. To get the same
resolution in legacy documents, we can dynamically translate those
documents into some XML format, and then link into those XML
documents.
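
A minimal sketch of this kind of translation, assuming the legacy
document is a plain-text file and that one line is a fine enough
unit to address. The "document" and "line" element names, the "L<n>"
ID scheme, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET


def translate_text_to_xml(text, doc_name):
    """Wrap each line of a plain-text document in an addressable node."""
    root = ET.Element("document", {"name": doc_name})
    for number, line in enumerate(text.splitlines(), start=1):
        node = ET.SubElement(root, "line", {"id": "L{}".format(number)})
        node.text = line
    return root


def resolve(root, node_id):
    """Return the text of the node with the given ID, or None."""
    for node in root.iter("line"):
        if node.get("id") == node_id:
            return node.text
    return None


source = "int main(void) {\n    return 0;\n}\n"
doc = translate_text_to_xml(source, "main.c")
print(resolve(doc, "L2"))
```

Note that these IDs are positional: inserting a line above the
target shifts every ID after it.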

For static documents, like e-mail, this solution works great.
However, for constantly changing documents, this solution leads to
link rot. The only fool-proof way around this is to have permanent
IDs (or in Augment lingo, SIDs) attached to every node. However,
legacy documents will not have these.

Doug is really concerned about this problem -- I, less so, for reasons
I'll discuss in a second -- so we talked about possible solutions.
First, we explored some automatic solutions. Doug suggested creating
an ID for each node in a legacy document by hashing the node. This is
only a partial solution. If you link to an ID that is a hashed node,
then that link will still be good if the node moves but its content
stays unchanged. However, if the content changes, the ID changes, and
so you've got link rot again.
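
The behavior of hashed IDs can be seen in a short Python sketch. It
assumes each node is a single string and uses a truncated SHA-1
digest as the ID; both choices, and the function names, are
illustrative only.

```python
import hashlib


def node_id(content):
    """Derive an ID from a node's content alone (hypothetical scheme)."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()[:12]


def index_nodes(nodes):
    """Map each node's hashed ID to its position in the document."""
    return {node_id(text): position for position, text in enumerate(nodes)}


original = ["alpha", "beta", "gamma"]
link_target = node_id("beta")  # a link made against the original

moved = ["beta", "alpha", "gamma"]             # same content, new position
edited = ["alpha", "beta, revised", "gamma"]   # content changed

print(index_nodes(moved).get(link_target))   # still resolves
print(index_nodes(edited).get(link_target))  # no longer resolves
```

One further limitation of this sketch: two nodes with identical
content hash to the same ID, so they cannot be told apart.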

My proposed partial solution was to require version information in
links. That way, links are always good, although they won't always be
current. Doug agreed that this was a good solution, although I could
tell that it didn't satisfy him.
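
A minimal sketch of version-qualified links, assuming every
committed version of a document stays retrievable. The VersionedLink
and VersionStore names and their methods are invented for this
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedLink:
    document: str
    version: int
    node: str


class VersionStore:
    """Toy in-memory store that keeps every committed version."""

    def __init__(self):
        self._versions = {}  # (document, version) -> {node id: content}

    def latest(self, document):
        return max(
            (v for (d, v) in self._versions if d == document), default=0
        )

    def commit(self, document, nodes):
        version = self.latest(document) + 1
        self._versions[(document, version)] = dict(nodes)
        return version

    def resolve(self, link):
        """Always succeeds for a link made against a committed version."""
        return self._versions[(link.document, link.version)][link.node]

    def is_current(self, link):
        return link.version == self.latest(link.document)


store = VersionStore()
first = store.commit("notes.txt", {"1": "first draft"})
link = VersionedLink("notes.txt", first, "1")
store.commit("notes.txt", {"1": "second draft"})

print(store.resolve(link))     # the content the link was made against
print(store.is_current(link))  # False once a newer version exists
```

The link stays good because it never points at "whatever the
document says now", only at a fixed version.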

My compromise solution was, when users commit new versions of their
legacy documents into version control, we run a script that
automatically attempts to map links to and from the old versions of
documents to new ones using whatever clever algorithms people come up
with. The system would then prompt users to confirm or edit these
links. This, of course, assumes that people are using version control
in the first place.
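
One possible shape for such a script, sketched in Python. It stands
in for the "clever algorithms" with the standard library's difflib
fuzzy matching, treats each line as a node, and returns proposals
for a person to confirm rather than rewriting links itself. The
function name and the 0.6 cutoff are assumptions.

```python
import difflib


def propose_remaps(old_lines, new_lines, linked_indexes, cutoff=0.6):
    """For each linked old line, propose (new index, similarity score).

    A proposal of (None, 0.0) means no plausible target was found and
    the user has to fix the link by hand.
    """
    proposals = {}
    for index in linked_indexes:
        old = old_lines[index]
        if old in new_lines:
            # Unchanged content: exact match, wherever it moved to.
            proposals[index] = (new_lines.index(old), 1.0)
            continue
        close = difflib.get_close_matches(old, new_lines, n=1, cutoff=cutoff)
        if close:
            score = difflib.SequenceMatcher(None, old, close[0]).ratio()
            proposals[index] = (new_lines.index(close[0]), score)
        else:
            proposals[index] = (None, 0.0)
    return proposals


old = ["def load(path):", "    return open(path).read()", "print('done')"]
new = [
    "import sys",
    "def load(path, mode='r'):",
    "    return open(path, mode).read()",
    "print('done')",
]

for old_index, (new_index, score) in propose_remaps(old, new, [0, 2]).items():
    print(old_index, "->", new_index, "score", round(score, 2))
```

A real version would run at commit time and show each proposal next
to the old and new text so the user can confirm or edit it.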

We talked a little bit about source code documents as being an early
candidate for translation. I agree that this is a good candidate for
some limited work, because source code is inherently structured and
labelled. There are already systems that generate multiple,
hyperlinked views from source code, and we can certainly incorporate
those features into our system.


E-mail

Doug asked, what's the stage we need to reach in order for people to
use our system for viewing (and responding to) e-mail rather than
their existing clients? Doug wondered why people prefer to read
unrev-ii and ohs-dev in their e-mail clients rather than in
hypermail. Doug prefers the latter, because people can link to those
messages. I explained that linking to hypermail messages requires a
lot more effort than hitting "r" in your e-mail client, and having the
entire message included in your editor. Regarding his first question,
I felt that having multiple views and linkability for e-mail will be
immediately compelling.

One area where Doug and I were apart was, what's the document type we
need to translate in order to support e-mail archives? Doug's feeling
is that we should translate existing hypermail archives. I think that
we should start by translating RFC822 files. I feel strongly about
the latter, but I need to write up my ideas more extensively, so that
they are clear and understandable.
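
To make the RFC822 option concrete, here is a minimal Python sketch
that translates one raw message into XML with addressable
paragraphs. It assumes a single-part plain-text message; the
"message", "headers", "header", "body", and "p" element names, the
"p<n>" IDs, and the sample address are all made up for the example.

```python
import xml.etree.ElementTree as ET
from email import message_from_string


def translate_rfc822(raw):
    """Translate a single-part RFC822 message into an XML tree."""
    message = message_from_string(raw)
    root = ET.Element("message")

    headers = ET.SubElement(root, "headers")
    for name, value in message.items():
        ET.SubElement(headers, "header", {"name": name}).text = value

    # Assumption: the payload is one plain-text string, and blank
    # lines separate paragraphs.
    body = ET.SubElement(root, "body")
    paragraphs = [p for p in message.get_payload().split("\n\n") if p.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        node = ET.SubElement(body, "p", {"id": "p{}".format(number)})
        node.text = paragraph.strip()
    return root


raw = (
    "From: sender@example.org\n"
    "Subject: Doug/Eugene summary\n"
    "\n"
    "First paragraph.\n"
    "\n"
    "Second paragraph.\n"
)
doc = translate_rfc822(raw)
print(doc.find("headers/header[@name='Subject']").text)
print(doc.find("body/p[@id='p2']").text)
```

Working from the raw message keeps every header available, which a
rendered archive page may not.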

Bootstrapping Commentary

I think it's important for us to keep the bootstrapping strategy in
mind when designing this system. Supporting legacy documents is
crucial for the bootstrapping strategy. However, guaranteeing 100
percent compatibility with legacy documents is not.

For example, we could spend years coming up with solutions for link
rot with legacy documents, but we'll never completely solve it.
However, an OHS document with SIDs shouldn't have these problems.
(This is a somewhat simplified claim that isn't entirely true, but
I'll reserve my commentary for a later time.)

What we want to do is to enable as many OHS features as we can for
legacy document types and tools, then encourage people who find these
features compelling to migrate over to native OHS documents and
tools. We'd also like to encourage other developers to incorporate
OHS support into existing tools.


+=== Eugene Eric Kim ===== ===== ===+
|       "Writer's block is a fancy term made up by whiners so they        |
+=====  can have an excuse to drink alcohol."  --Steve Martin  ===========+
