Re: Link Evolution

From: Eugene Eric Kim (eekim@eekim.com)
Date: Fri Apr 13 2001 - 00:04:54 PDT


On Thu, 12 Apr 2001, Murray Altheim wrote:

> One of the things that I've been thinking about that might avoid this
> problem would be to keep a site map topic map that would act as a sort
> of online revision control system. As documents are checked in as
> replacements for older revisions, a link mapper would create a topic
> map that mapped the old links to the new. URLs to a previous version
> would be handled by a server preprocessor that would provide the updated
> link from the topic map.

This would be cool. The real difficulty, though, is deciding what the
mapping algorithm should be.

Over the past few months, I've been writing down my thoughts on links and
link databases. The write-up isn't quite ready to put on my web site -- I
still have to convert some scribbled notes into prose -- but my section on
link integrity is more or less complete. I've attached it below as some food
for thought. I'd love to get some feedback, especially from Roger, as I'm
sure the Xanadu guys have already worked through many of the issues I
raise.

-Eugene

-- 
+=== Eugene Eric Kim ===== eekim@eekim.com ===== http://www.eekim.com/ ===+
|       "Writer's block is a fancy term made up by whiners so they        |
+=====  can have an excuse to drink alcohol."  --Steve Martin  ===========+

----- Link Integrity

We want the OHS to maintain link integrity across all documents. In other words, once you create a link to something, it should never break.

The first requirement for link integrity is that documents are never deleted from the system. If you link to a document, and that document is subsequently removed, the link breaks. The only way to fix that link is to put the document back into the system.

The second requirement is to have a logical naming scheme that is separate from the physical name and location of a document. On the web, if you have the document http://foo.com/bar.html, and you move it to http://foo.com/new/bar.html, links to the first URL break. You need a name for that document that will always point to the right place, even if the document is physically moved to a different part of the system.
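
As a rough illustration of this second requirement, here is a minimal Python sketch of a registry that keeps a permanent logical name separate from a document's current location. The NameRegistry class and the "doc:bar" name are hypothetical, invented for this example; only the two URLs come from the text above.

```python
class NameRegistry:
    """Hypothetical registry mapping permanent logical names to locations."""

    def __init__(self):
        self._locations = {}

    def register(self, logical_name, location):
        # A logical name is assigned once and never reused.
        if logical_name in self._locations:
            raise ValueError(f"{logical_name!r} is already registered")
        self._locations[logical_name] = location

    def move(self, logical_name, new_location):
        # Moving a document changes only its location, never its name.
        if logical_name not in self._locations:
            raise KeyError(logical_name)
        self._locations[logical_name] = new_location

    def resolve(self, logical_name):
        # Links store the logical name and look up the location on use.
        return self._locations[logical_name]


registry = NameRegistry()
registry.register("doc:bar", "http://foo.com/bar.html")
registry.move("doc:bar", "http://foo.com/new/bar.html")
resolved = registry.resolve("doc:bar")
```

A link that stores "doc:bar" rather than the URL keeps working after the move, because only the registry entry changes.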

The third requirement is version control. This is where things start to get a little hairy. Version-controlled systems are insert-only: in theory, nothing is ever removed. This satisfies the first requirement.
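
To make "insert-only" concrete, here is a small sketch of an append-only version store. The VersionedDocument class and its method names are assumptions made for illustration, not part of any existing OHS design.

```python
class VersionedDocument:
    """Hypothetical append-only store: versions are added, never removed."""

    def __init__(self, name, text):
        self.name = name
        self._versions = [text]  # index 0 holds version 1

    def commit(self, text):
        # Append a new version and return its 1-based version number.
        self._versions.append(text)
        return len(self._versions)

    def get(self, version):
        # Every version ever committed stays retrievable.
        if not 1 <= version <= len(self._versions):
            raise KeyError(f"{self.name} has no version {version}")
        return self._versions[version - 1]

    @property
    def latest(self):
        # Number of the most recent version.
        return len(self._versions)


doc = VersionedDocument("notes.txt", "draft one")
new_version = doc.commit("draft two")
```

There is deliberately no delete method, so a link to version 1 can always be resolved.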

However, in a useful DKR, links should do more than just not break; they should also evolve. Suppose you have a document, foo.txt, that contains the following text:

These are the dasy that try men's souls.

Example. foo.txt, version 1.

Note that there's a typo -- "dasy" should be "days."

Now suppose someone creates a link to this sentence in this version of the document. Suppose that afterwards, you notice the typo and correct it. This results in a new version of the document:

These are the days that try men's souls.

Example. foo.txt, version 2.

If your links neither broke nor evolved, then the original link would continue to point to version 1 of the document, not this new version. However, this does not always seem to be desirable behavior. If I created a link to this sentence -- essentially designating it interesting and relevant content -- when the typo is corrected, I'd prefer that the link now point to the corrected document, version 2.

This is certainly doable. The system could automatically assume that the link pointing to the first sentence in version 1 should now point to the first sentence in version 2.
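
A naive version of this rule might look like the sketch below, which re-points a link at the same sentence position in the newest version. The Link type, the HISTORY table, and the evolve function are hypothetical names for this example, and storing each version as a list of sentences is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    document: str
    version: int   # 1-based version number
    sentence: int  # 0-based position of the sentence in that version


# Each version is stored as a list of sentences; index 0 holds version 1.
HISTORY = {
    "foo.txt": [
        ["These are the dasy that try men's souls."],
        ["These are the days that try men's souls."],
    ]
}


def evolve(link, history):
    """Re-point a link at the same sentence position in the newest version."""
    versions = history[link.document]
    newest = versions[-1]
    if link.sentence >= len(newest):
        # The position no longer exists, so leave the link where it was.
        return link
    return Link(link.document, len(versions), link.sentence)


original = Link("foo.txt", version=1, sentence=0)
evolved = evolve(original, HISTORY)
```

This rule looks only at position and never at content.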

However, there are two scenarios in which this would not be the correct behavior. First, what if, instead of fixing the typo, someone changed the sentence to:

Livin' la vida loca.

Example. foo.txt, version 3.

If the purpose of the link is to designate the target content as relevant, then the link no longer applies to the first sentence of this third version, because the meaning of the sentence has changed completely.

Second, what if the link is from an annotation that says, "There's a typo in this sentence"? In this case, you would want the link to point only to version 1, since the typo does not exist in version 2 (and, for that matter, in version 3).
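
The first of these scenarios could perhaps be caught automatically by comparing the old and new text before moving a link. The sketch below uses Python's standard difflib module for that comparison. This heuristic is not something proposed in the text above, and the 0.8 cutoff is an arbitrary assumption. It does nothing for the second scenario, which depends on why the link was made rather than on what the text says.

```python
from difflib import SequenceMatcher

V1 = "These are the dasy that try men's souls."
V2 = "These are the days that try men's souls."
V3 = "Livin' la vida loca."

# Assumed cutoff; a real system would need to tune or replace it.
THRESHOLD = 0.8


def should_follow(old_sentence, new_sentence, threshold=THRESHOLD):
    """Return True if the new sentence is similar enough to the old one."""
    score = SequenceMatcher(None, old_sentence, new_sentence).ratio()
    return score >= threshold


follows_typo_fix = should_follow(V1, V2)  # near-identical text
follows_rewrite = should_follow(V1, V3)   # unrelated text
```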

How can we accommodate these scenarios? One solution would be to allow the user to define how the link should evolve with new versions of the document. So, for example, you could specify that the link that points to the first sentence in version 1 should also point to the first sentence in some number of subsequent versions of foo.txt.
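
One way to picture such a user-defined rule is a link that carries its own version range. In this sketch the EvolvingLink type and its field names are hypothetical, and the three example links and the version numbers given to them are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvolvingLink:
    document: str
    sentence: int                # 0-based sentence position
    first_version: int           # version the link was created against
    last_version: Optional[int]  # None means "follow every future version"

    def resolve(self, latest_version):
        """Return the version this link should point to right now."""
        if self.last_version is None:
            return latest_version
        return min(latest_version, self.last_version)


# An annotation about the typo stays pinned to version 1.
typo_note = EvolvingLink("foo.txt", 0, first_version=1, last_version=1)
# A "this is relevant" link follows the typo fix but goes no further.
relevance = EvolvingLink("foo.txt", 0, first_version=1, last_version=2)
# A link that always tracks the newest version.
floating = EvolvingLink("foo.txt", 0, first_version=1, last_version=None)
```

The person creating the link chooses the range, so the system never has to guess the link's intent.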

Another solution would be to have the system automatically notify everyone who has linked to a document (or who has otherwise registered for notification) that the document has changed, and have those people update their links manually. The system could also suggest how each link might be updated.
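
A bare-bones sketch of the notification approach follows. The ChangeNotifier class, its method names, and the wording of the suggestion are all assumptions for illustration; a real system would deliver messages to people rather than append to a list.

```python
from collections import defaultdict


class ChangeNotifier:
    """Hypothetical registry of people to notify when a document changes."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, document, callback):
        # Creating a link could subscribe its author automatically.
        self._subscribers[document].append(callback)

    def document_changed(self, document, new_version, suggestion=None):
        # Tell every subscriber about the new version; the optional
        # suggestion is the system's guess at how to update the link.
        for callback in self._subscribers[document]:
            callback(document, new_version, suggestion)


inbox = []
notifier = ChangeNotifier()
notifier.subscribe("foo.txt", lambda doc, ver, hint: inbox.append((doc, ver, hint)))
notifier.document_changed("foo.txt", 2, suggestion="sentence 0 of version 2")
```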


