[unrev-II] historical considerations for the OHS

From: Eugene Eric Kim (eekim@eekim.com)
Date: Tue Jul 18 2000 - 14:01:42 PDT


    I'm putting together a DTD for e-mail, and remembered an important
    consideration I want to raise here. As a historian, I find it very
    valuable to have a pristine copy of the original document. For example, I
    got a hold of some e-mail archives circa 1983 for one of my projects, and
    I found the format of the e-mail as well as some of the Received headers
    just as valuable historically as the content itself. There's very little
    anthropological information attached to digital documents, and so anything
    that is there should be kept.

    What does this mean for the OHS? Well, specifically for an e-mail DTD,
    it's valuable to have various meta-information stored in their own
    elements, such as <from> and <to>. However, it would be impossible to
    restore perfectly the original e-mail from an e-mail broken down into XML
    elements. This means storing the original e-mail as well as the
    translated e-mail.
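
    As a rough illustration of keeping both forms together, here is a
    minimal Python sketch. It is not a proposed DTD: the element names
    <email>, <received> and <original>, and the choice of base64 for the
    pristine copy, are assumptions made only for this example.

```python
import base64
import email
import xml.etree.ElementTree as ET


def email_to_xml(raw: bytes) -> ET.Element:
    """Break an e-mail into elements, keeping the untouched original too."""
    msg = email.message_from_bytes(raw)
    root = ET.Element("email")  # hypothetical root element name

    # Meta-information gets its own elements (<from>, <to>, ...).
    for header in ("From", "To", "Subject", "Date"):
        value = msg.get(header)
        if value is not None:
            ET.SubElement(root, header.lower()).text = str(value)

    # Received headers are kept individually, in the order they appear.
    for received in msg.get_all("Received", []):
        ET.SubElement(root, "received").text = str(received)

    # The pristine message is stored byte-for-byte. base64 is an assumption
    # made here so that odd bytes and line endings survive inside XML.
    original = ET.SubElement(root, "original", encoding="base64")
    original.text = base64.b64encode(raw).decode("ascii")
    return root


def original_from_xml(root: ET.Element) -> bytes:
    """Recover the exact original e-mail from the <original> element."""
    return base64.b64decode(root.findtext("original"))
```

    The parsed elements are there for convenience; only the <original>
    element is relied on to restore the message exactly, since header
    order, folding and spacing are lost once the fields are split out.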

    Does transcoding solve this problem? In some cases, yes. But it may not
    always be an optimal solution to transcode on the fly every time,
    especially for journaled documents such as e-mail. For example, when I do
    research on the Web, I like to save Web pages locally. Unless the HTML is
    transcoded into XHTML + our extensions, I won't be able to do the fancy
    things the OHS will be capable of doing. However, I'd like to have the
    original, pristine HTML source for the page, and I'd like to have a
    viewable HTML page that looks exactly like the original (essentially the
    same as the former, but with URLs rewritten so that embeddable components
    like images can be saved locally). This means the OHS needs to save:

        - original HTML source
        - rewritten HTML source for displaying locally and offline
        - transcoded/translated HTML
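
    The three copies listed above could be held together roughly as in
    the Python sketch below. Everything named here (SavedPage,
    rewrite_for_offline, the assets/ directory) is hypothetical, the
    rewriting only handles double-quoted <img src="..."> attributes, and
    the transcoding step is left as an empty slot rather than guessed at.

```python
import hashlib
import posixpath
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple
from urllib.parse import urljoin, urlparse


class _ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def local_name(absolute_url: str) -> str:
    """Derive a stable local file name for a remote component."""
    digest = hashlib.sha1(absolute_url.encode("utf-8")).hexdigest()[:12]
    ext = posixpath.splitext(urlparse(absolute_url).path)[1]
    return f"assets/{digest}{ext}"


def rewrite_for_offline(html: str, base_url: str) -> Tuple[str, Dict[str, str]]:
    """Return (rewritten HTML, {remote URL: local path}) for offline viewing."""
    collector = _ImageCollector()
    collector.feed(html)
    mapping: Dict[str, str] = {}
    rewritten = html
    for src in collector.sources:
        remote = urljoin(base_url, src)
        mapping[remote] = local_name(remote)
        # Naive textual replacement: an assumption that keeps the sketch short.
        rewritten = rewritten.replace(f'src="{src}"', f'src="{mapping[remote]}"')
    return rewritten, mapping


@dataclass
class SavedPage:
    original: str                     # pristine HTML source, never modified
    offline: str                      # same page with URLs pointing at local copies
    transcoded: Optional[str] = None  # XHTML + extensions, filled in by a transcoder


def save_page(html: str, base_url: str) -> SavedPage:
    offline, _ = rewrite_for_offline(html, base_url)
    return SavedPage(original=html, offline=offline)
```

    The mapping returned by rewrite_for_offline says which remote
    components still need to be fetched and where to store them; the
    fetching itself is left out of the sketch.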

    I don't want to be too blase about disk space considerations, but I really
    think saving all of this additional data is important, and should at least
    be an option within the system.


    +=== Eugene Eric Kim ===== eekim@eekim.com ===== http://www.eekim.com/ ===+
    |       "Writer's block is a fancy term made up by whiners so they        |
    +=====  can have an excuse to drink alcohol."  --Steve Martin  ===========+


    This archive was generated by hypermail 2b29 : Tue Jul 18 2000 - 14:20:55 PDT