Re: [unrev-II] TopicMaps, Ted Nelson, Virtual Files, and everything

From: Eugene Eric Kim
Date: Tue Jun 05 2001 - 11:11:15 PDT


    On Wed, 30 May 2001, Lee Iverson wrote:

    > I believe that there is a problem here. I don't believe Ted "gets"
    > the shifts in understanding that have taken place in moving from SGML
    > and HTML to XML. In almost all circumstances, embedded tags are now
    > (reading the most-enlightened XML literature) considered to be
    > semantic type identifiers.

    Okay, I tried to resist the urge to be pedantic, but to no avail. :-)
    Aren't all forms of syntax in essence semantic type identifiers? And in
    this sense, aren't HTML tags also semantic type identifiers? They
    represent formatting semantics -- i.e. this element is a headline, this
    element should be bold, etc. -- but aren't those still semantics?

    I actually agree with your original point, Lee. I didn't quite understand
    Ted's brief explanation last Tuesday of why he didn't like markup
    languages, so I dug up his paper "Embedded Markup Considered Harmful" (WWW
    Journal, Winter 1997). It's not on the Web, unfortunately, so I'll
    summarize his objections here:

    1. Transclusion of a marked-up document may result in a syntactically
    invalid excerpt. In other words, if I want to transclude the phrase "To
    be, or not to be" from the XML document "<quote>To be, or not to be. That
    is the question.</quote>", the textual excerpt returned will not be
    well-formed XML, because it won't include the end tag.

    2. The quoting author may want to change the look-and-feel or even the
    structure of the quotation. If this is done by copy-and-paste, no
    problem, but if this is done using transclusion, then you need to have
    some form of view control.

    3. Embedded structure limits the kinds of structure that can be expressed.
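
    Objection 1 can be reproduced in a few lines. The following is a
    minimal sketch, assuming Python and its standard xml.etree parser
    (neither is mentioned in the paper or in this thread); the helper name
    is_well_formed is hypothetical. It transcludes by character range, as
    in the example above, and then checks the excerpt for well-formedness.

```python
import xml.etree.ElementTree as ET

# The marked-up source document from objection 1.
SOURCE = "<quote>To be, or not to be. That is the question.</quote>"


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a complete XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Naive transclusion: copy the characters from the start of the document
# through the end of the wanted phrase, markup included.
phrase = "To be, or not to be"
end = SOURCE.index(phrase) + len(phrase)
excerpt = SOURCE[:end]  # the start tag comes along, the end tag does not

print(repr(excerpt))
print("source well-formed: ", is_well_formed(SOURCE))
print("excerpt well-formed:", is_well_formed(excerpt))
```

    The excerpt carries the opening <quote> tag but not its closing tag,
    so the parser rejects it, while the full source parses cleanly.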

    For brevity's sake, I won't comment on the first two objections here,
    nor on his general feelings about markup, but I do want to say a few words
    about his third objection. Nelson's solution is to store content as raw
    text, and structure as a layer over the content. I agree with this in
    theory, but in reality, there isn't a true separation between the two.
    Raw text contains just as much embedded structure as an XML file, the only
    difference being that valid syntax is well-defined for XML documents,
    whereas it is undefined for raw text.
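
    The "structure as a layer over the content" idea can be sketched as
    standoff annotation: the text is stored raw, and structure is a
    separate list of character ranges. This is only an illustration in
    Python under assumed names (Span, LAYER, and transclude are all
    hypothetical), not a description of Nelson's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset into the raw text
    end: int    # exclusive character offset
    label: str


# Content layer: raw text with no embedded tags.
TEXT = "To be, or not to be. That is the question."

first_sentence = "To be, or not to be."

# Structure layer: spans that point into TEXT. The "highlight" span
# crosses the sentence boundary, an overlap that properly nested
# embedded tags cannot express directly.
LAYER = [
    Span(0, len(TEXT), "quote"),
    Span(0, len(first_sentence), "sentence"),
    Span(len(first_sentence) + 1, len(TEXT), "sentence"),
    Span(TEXT.index("not"), TEXT.index("is"), "highlight"),
]


def transclude(text, layer, start, end):
    """Return text[start:end] plus every span clipped to that range."""
    clipped = []
    for span in layer:
        lo, hi = max(span.start, start), min(span.end, end)
        if lo < hi:  # keep only spans that overlap the requested range
            clipped.append(Span(lo - start, hi - start, span.label))
    return text[start:end], clipped


excerpt, spans = transclude(TEXT, LAYER, 0, len("To be, or not to be"))
print(excerpt)
for span in spans:
    print(span.label, repr(excerpt[span.start:span.end]))
```

    Any character range yields a usable excerpt here, because the spans
    are clipped rather than broken. Note, though, that the sentence spans
    in this sketch were placed by looking at the period in the raw text,
    which is the kind of structure already embedded in the content.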

    The challenge for us is defining what goes into the content layer and at
    what level of granularity.

    > I wasn't particularly impressed with ZigZag, since it didn't seem to
    > give me anything that I couldn't get with generic (and
    > well-understood!) graph structures and algorithms. Ted's search for
    > "revolutionary" data structures seems to be too much of a barrier for
    > most programmers, let alone the users of his systems.

    I don't think the data structure is meant to be revolutionary; it's how
    that data structure is used for organizing and viewing information. I
    found Ted's spreadsheet analogy compelling. A table structure wasn't
    revolutionary, even in the early 1980s. :-) It was the fact that Joe
    Schmoe user could map out an entire table of dependencies and perform and
    view all sorts of sophisticated calculations using a fairly simple

    > I'm not entirely convinced that all structure wants to be "above the
    > information" . I'd suggest that useful information has inherent
    > structure. The real issue is the flexibility of the structural
    > building blocks and the ability to reference and reuse this structured
    > information with a variety of higher-level structures. In order to
    > reuse a birthday, I want to maintain the fact that it is So-and-so's
    > birthday, no matter what the context that nugget is being used in.
    > So, with your permission, I'd say that the real failure of current
    > systems is that the only level of reusable structure "above the
    > information" is the document. That is just way too coarse.

    So we agree on this point. A while back, Paul Fernhout recommended
    William Kent's book Data and Reality. I checked it out, and have to
    second Paul's enthusiastic endorsement. Kent really does a wonderful job
    of discussing all of these complexities, and his writing and thinking are
    extremely clear and easy to follow.


    +=== Eugene Eric Kim ===== ===== ===+
    |       "Writer's block is a fancy term made up by whiners so they        |
    +=====  can have an excuse to drink alcohol."  --Steve Martin  ===========+


    This archive was generated by hypermail 2b29 : Tue Jun 05 2001 - 11:31:28 PDT