[unrev-II] Eugene's "purple numbers" talk / XML editors

From: Eric Armstrong (eric.armstrong@sun.com)
Date: Fri Oct 26 2001 - 16:26:11 PDT


    Eugene Eric Kim wrote:

    > Slides from my presentation on purple numbers are available at my web
    > site:
    > http://www.eekim.com/talks/purple/
    > Comments and questions are welcome.

    Although Eugene posted this message to a smaller group, I am cross-posting
    here, because I think it deserves a lot of attention.

    The presentation Eugene gave was, in a word, brilliant. We all know what
    "purple numbers" are, and most of us have an idea of why we want them,
    but Eugene did a magnificent job of summarizing the important points and
    putting them in bullet form succinctly, so they hit you right between the eyes.

    I'm happy to say that I originated (or at least think I originated) some of the
    thoughts that were captured in the presentation. The two that come to mind
    are document as a (structured) "view" of paragraph nodes, and email replies
    that take advantage of granular addressing. But contributing one or two
    thoughts is not the same as creating a masterful summary of a huge, complex
    picture that can easily become overwhelming.

    By zeroing in on "purple numbers", Eugene reduced the problem to one
    of manageable proportions -- the problem of getting stuff authored with
    them. He focused on a few easily recognized benefits among HTML
    users, thereby sidestepping the potential for other, more grandiose uses
    (and the problems that inevitably attend such attempts).

    In this section of this message, I'm going to try to capture the most salient
    of Eugene's remarks, in order to highlight them. In the next section, I'll list
    a few thoughts that were sparked by the presentation.

    1. Purple numbers (p#s) are important because they provide for
        granular addressability.

    2. Granular addressability is important for:
        a) Quoting
            (Transcluding information from one document in another.)
        b) Responding
             (Copying an email message and removing parts you're not
              concerned about is "a poor man's granular addressability".)
        c) Categorizing
             (The foundation for knowledge-based systems.)
        d) Annotating and Revising
            (A small quarrel, here. I would have divided this into two
             items. Annotating would be one. Referencing would be
             the other. When he spoke about "revising", he appeared
             to mean the ability to name the section under discussion.)
        e) Referencing
             (I'm adding this one. The ability to reference a subset of
              a document, as with a hypertext link -- only without the author
              having to have had the prescience to create an anchor that would
              allow it, and without requiring the user to inspect the underlying
              HTML to find what the anchor is.)

    3. Purple numbers also offer a quick "visual revision history".
        (You can see where things have been removed and

    4. Hierarchical IDs present a couple of problems:
        a) In table structures, they often don't apply well (especially with
            complex row- and column-spanning cells).
        b) The "hierarchy" in an HTML document can be difficult to
            discern. (HTML heading tags aren't nested, and there is no enforcement
            of ordering. For example, I use H1 for a document title, H5 for
            a byline, and then H2 for a section heading. Or H1 could exist
            in a table, and on and on...)

    5. Eugene's Purple script (Perl and XSLT) translates XML structures
        to HTML, adding purple numbers. It also rewrites the original
        XML, adding purple numbers so they can be preserved from
        version to version.

           a) His document structure for writing looked great! What was it?
           b) He said he could construct PDF, too. (Yes? Today? Or is
                that a future thing? If today, how????)
           c) He's thinking about redoing the Perl translator in Java.
               (He'll love the regular expression package added in 1.4!)

    6. Murray's Plink program (Java) works on XHTML. It does a
        great job of adding p#'s, if you're working in XHTML.
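
    To make points 4 and 5 concrete, here is a minimal Python sketch of the
    general idea. It is not Eugene's Perl/XSLT script or Murray's Plink,
    whose internals aren't described here. The `nid` attribute name, the set
    of addressable tags, and the flat integer numbering (chosen to avoid the
    hierarchical-ID problems in point 4) are all assumptions made for
    illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical choices for this sketch: which elements get a number,
# and the attribute used to store it in the source XML.
ADDRESSABLE = ("h1", "h2", "p", "li")
ATTR = "nid"


def add_purple_numbers(root):
    """Give every addressable element a flat, stable integer ID.

    IDs already present in the source are kept, so a node keeps its
    number from version to version; new nodes get the next free number.
    """
    used = {
        int(el.get(ATTR))
        for el in root.iter()
        if el.tag in ADDRESSABLE and el.get(ATTR, "").isdigit()
    }
    next_id = max(used, default=0) + 1
    for el in root.iter():
        if el.tag in ADDRESSABLE and ATTR not in el.attrib:
            el.set(ATTR, str(next_id))
            next_id += 1
    return root


def to_html(root):
    """Render addressable elements as HTML, each with an anchor and a
    visible link to itself (the "purple number").

    Only the element's direct text is rendered; nested inline markup
    is ignored to keep the sketch short.
    """
    lines = []
    for el in root.iter():
        if el.tag in ADDRESSABLE:
            nid = el.get(ATTR)
            text = html.escape((el.text or "").strip())
            lines.append(
                f'<{el.tag} id="nid{nid}">{text} '
                f'<a class="purple" href="#nid{nid}">({nid})</a></{el.tag}>'
            )
    return "\n".join(lines)
```

    Calling add_purple_numbers() and then writing the tree back out (for
    example with ET.tostring) corresponds to the "rewrite the original XML"
    step; to_html() corresponds to the translation step. Because deleted
    numbers are never reused while the surviving nodes are present, gaps
    and out-of-order numbers hint at where material was removed or added.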

    A few thoughts that occurred to me as the presentation went on:

    1. I wonder if there is a version of Smalltalk that runs on the Java VM
        yet? If so, I feel sure that the Augment client could be transported to

    2. I wonder how difficult it would be to modify a browser so that it
        recognizes transclude links? I mean, suppose a reference looked
        like this: <a href="..." type="include">. Would a regular browser
        ignore the "type" attribute, and display the link? If so, a custom
        version could do the processing to quote the material in the
        document stream.

    3. Interestingly, whether a link should point to the latest version of a
        node, or to the original version, is a function both of the user and
        the node. For example, if I quote a particularly pithy saying,
        I might want to quote that version, so I know I get that pithy saying,
        even if the author mucks it up later. On the other hand, even when
        I think I'm interested in a particular paragraph, even if it is modified,
        such wholesale modifications could occur that it no longer says
        anything remotely like it used to say. In that case, the authoring
        tool would need to create a branch, so that existing references could
        refer to something relevant. (Eugene raised the latter issue. It's a
        fascinating one to consider. I confess to having no great insights on
        the subject, at this point.)

    4. The ability to use the Purple system boiled down, at the end, to having
        a good XML editor that people could use to author documents in
        reasonably WYSIWYG form. Possible options:
        * ArborText Adept
           The original and still the best. But very pricy. Around $1,000
           a seat, I think.

        * XMetaL
           Works well when you have a stylesheet. Doesn't work so
           well when you don't.

        * Xeena
           Java-based. Open source. Available at alphaWorks.

        * XmlWriter
           Simple. Works well. I use it.

        * Stilo
           An editor that looked interesting, but which I never got around to
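
    Returning to the second thought above (transclude links), the custom
    processing step could look roughly like the Python sketch below. It
    is only an illustration under stated assumptions: the class and function
    names are hypothetical, the `fetch` callable that returns the quoted
    text for an href is supplied by the caller, and the quoted material is
    emitted as an inline <q cite="..."> element. Comments and doctype
    declarations in the input are dropped for brevity. As far as I know, an
    ordinary browser treats "type" on an anchor as an advisory hint and
    still displays the link, so such documents should degrade gracefully.

```python
from html import escape
from html.parser import HTMLParser


class TranscludingRenderer(HTMLParser):
    """Copy HTML through, replacing <a type="include"> links with the
    material they point at."""

    def __init__(self, fetch):
        super().__init__()
        self.fetch = fetch      # callable: href -> plain text to quote
        self.out = []           # output fragments, joined at the end
        self.skipping = False   # True while inside an include link

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and a.get("type") == "include" and a.get("href"):
            href = a["href"]
            quoted = escape(self.fetch(href))
            self.out.append(f'<q cite="{escape(href)}">{quoted}</q>')
            self.skipping = True          # drop the original link text
        elif not self.skipping:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if not self.skipping:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skipping:
            if tag == "a":
                self.skipping = False
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skipping:
            # The parser hands over unescaped text, so escape it again.
            self.out.append(escape(data, quote=False))


def render(html_text, fetch):
    """Return html_text with include links replaced by quoted content."""
    renderer = TranscludingRenderer(fetch)
    renderer.feed(html_text)
    renderer.close()
    return "".join(renderer.out)
```

    A caller might pass fetch as a lookup into a store keyed by
    "document#node-id", which is where granular addressing comes in:
    the link names exactly the node to quote. Whether fetch returns the
    latest or the originally quoted version of that node is the question
    raised in the third thought above, and this sketch leaves it open.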

    Finally, here's a great summary page that reviews many of the XML-
    editing options:


    This archive was generated by hypermail 2.0.0 : Fri Oct 26 2001 - 16:14:03 PDT