[unrev-II] XML limits (Was: [Fwd: Tepid water ...])

From: Paul Fernhout (pdfernhout@kurtz-fernhout.com)
Date: Sun Apr 02 2000 - 19:46:30 PDT

    Eric -

    Excellent URL. I enjoyed the essay there.

    My main issue with XML has always been that it does not solve the
    problem of semantic agreement among programs (or people). It also does
    not address semantic shift -- the meaning of fields or terms changing
    over time as end-user needs change.

    This article touches on this -- both by pointing out that you need code
    to process the XML, and that you can fool yourself into thinking such
    code would be easy to write by picking seemingly "simple" names for XML
    tags which in reality stand for complex concepts.

    Here's a typical example of why even the simplest knowledgebase (DKR)
    can be difficult to maintain over time. We create an XML specification
    for a "User" (seems like a simple enough concept!) for a DKR so we can
    record who has made changes to the DKR by referencing that user, and we
    also want to send them "snail" mail newsletters occasionally.

    We make an XML DTD defining a record like:
      <Name>John Doe</Name>
      <Address>3 Lookinglass Lane, Minnetonka, MN 55555 USA</Address>

    And then we find out things like:
    * This person has multiple names (nicknames, aliases, pen names). We
    want to record these for use in a web clipping service to give users
    lists of pages that might contain text references to them.
    * This person's name is shared by multiple different users, so now we
    need unique IDs or something like them.
    * It turns out this person is also a temporary employee and we want to
    cross index those records with the user records. Oh, and they are also a
    customer, purchasing some services. Unfortunately, those systems were
    set up with their own unique IDs and did not anticipate users being
    employees or customers.
    * One "user" is actually a company account used by multiple people.
    Worse, that company has spun off a subsidiary and both accounts are now
    users. Oh, and by the way, now that subsidiary has merged with another
    company which was also one of our previous users. Oh, and some of those
    people using the original company accounts are now independent users. We
    would still like to be able to trace who made what entry when to the
    best of our ability, and also still generate consistent good looking
    * Some of the entries attributed to one user were actually made by
    another user and entered erroneously as belonging to the first user,
    but we want to remember that for a while we thought this user had made
    them, and various previously generated reports were based on that

    XML doesn't help solve any of these issues. It does nothing for us as
    the semantics of the data fields shift or as the meaning of the "user"
    concept itself shifts. We run into problems with multiple versions of
    our XML Document Type Definition (DTD) as we attempt to accommodate new
    needs. XML cannot in any special way help us resolve differences among
    multiple DTDs.
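
    As a rough illustration of the point above, here is a minimal Python
    sketch of two versions of the "User" record. The <User> wrapper, the
    id attribute, the kind attribute, and the alias are all hypothetical;
    none of them come from a real DTD. The parser reads both versions
    happily, but what a bare <Name> or a missing id means is decided only
    by the code we write ourselves.

```python
import xml.etree.ElementTree as ET

# Version 1 of the hypothetical record: one name, one free-form address.
V1_RECORD = """<User>
  <Name>John Doe</Name>
  <Address>3 Lookinglass Lane, Minnetonka, MN 55555 USA</Address>
</User>"""

# Version 2 (also hypothetical): a unique ID plus any number of names.
V2_RECORD = """<User id="u-0042">
  <Name kind="legal">John Doe</Name>
  <Name kind="alias">J. D. Lookinglass</Name>
  <Address>3 Lookinglass Lane, Minnetonka, MN 55555 USA</Address>
</User>"""


def load_user(xml_text):
    """Normalize either record version into one dict shape.

    The XML parser accepts both documents without complaint; every
    semantic decision below is ours, not XML's.
    """
    root = ET.fromstring(xml_text)
    names = [
        # Assumption: a <Name> without a kind attribute is a legal name.
        {"kind": n.get("kind", "legal"), "text": (n.text or "").strip()}
        for n in root.findall("Name")
    ]
    return {
        # Version 1 records carry no id, so this is None for them.
        "id": root.get("id"),
        "names": names,
        "address": (root.findtext("Address") or "").strip(),
    }


old = load_user(V1_RECORD)
new = load_user(V2_RECORD)
```

    Nothing in either document says that the two records describe the
    same person; the old record has no ID to match on, so that decision
    also has to live in code or in human judgment.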

    William Kent's classic book "Data and Reality" from 1979 delves into
    these sorts of issues in depth. I'd highly recommend it.

    Even the simple address field by itself is a minefield. Let's say the
    user has moved a few times and we want to remember where they moved to.
    Some of these addresses are outside the USA, and sometimes we send mail
    from offices outside the USA, so the address requires different
    (relative) routing information based on the country sending the mail and
    the country the mail is being sent to. So we need to store lots of
    address information, and know which is the right information to use in
    any particular situation. Again, XML by itself cannot help us. We might
    use XML as part of a solution, but only by also creating lots of code
    (in some language) and data (maybe in XML, maybe in a database) related
    to international addressing.
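
    To make the address problem concrete, here is a small Python sketch
    of the kind of code and data this implies. The AddressRecord type,
    the address_on and envelope_lines helpers, the dates, and the Toronto
    address are all made up for illustration, and the formatting rule
    (add a country line only for cross-border mail) is a deliberate
    oversimplification of real postal conventions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AddressRecord:
    lines: List[str]                  # street line, then city/postal line
    country: str                      # country the address is in
    valid_from: date                  # first day the address was in use
    valid_to: Optional[date] = None   # None means "still current"


def address_on(history, when):
    """Return the address that was valid on the given date, or None."""
    for rec in history:
        if rec.valid_from <= when and (rec.valid_to is None or when < rec.valid_to):
            return rec
    return None


def envelope_lines(rec, sending_country):
    """Format an address relative to the country the mail is sent from.

    Simplifying assumption: the country line is added only when the
    sender and the recipient are in different countries.
    """
    lines = list(rec.lines)
    if rec.country != sending_country:
        lines.append(rec.country.upper())
    return lines


# Hypothetical move history for one user.
history = [
    AddressRecord(["3 Lookinglass Lane", "Minnetonka, MN 55555"], "USA",
                  date(1995, 1, 1), date(1999, 6, 1)),
    AddressRecord(["12 Example Street", "Toronto, ON M5V 0A0"], "Canada",
                  date(1999, 6, 1)),
]

current = address_on(history, date(2000, 4, 2))
print(envelope_lines(current, "USA"))      # mailed from a US office
print(envelope_lines(current, "Canada"))   # mailed from a Canadian office
```

    Whether these records are stored as XML, in a database, or in memory
    makes no difference to the logic; the knowledge of which address to
    use, and how to write it, sits in the code.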

    I do think XML is sometimes useful as a data transmission
    format because it makes it easier to reverse engineer the intent of the
    information structure -- if that structure was put together with the
    intent of being reverse engineered. It is also useful for quick and
    dirty data encoding as one bootstraps up a program -- much the same way
    you can easily encode LISP data structures using lots of parentheses.
    But as the article points out, the XML hype paints XML as a data
    exchange panacea, which it is not.
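
    The LISP comparison can be sketched in a few lines of Python. This
    is only an illustration: the nested-tuple layout and the to_sexpr and
    to_element helpers are invented here, and the sketch assumes text
    appears only as the sole child of an element (no mixed content).

```python
import xml.etree.ElementTree as ET

# A small nested record as plain data: (tag, child, child, ...),
# where each child is either a string or another such tuple.
record = ("User",
          ("Name", "John Doe"),
          ("Address", "3 Lookinglass Lane, Minnetonka, MN 55555 USA"))


def to_sexpr(node):
    """Encode the nested tuple as a LISP-style S-expression."""
    if isinstance(node, str):
        # Quote strings, escaping backslashes and double quotes.
        return '"' + node.replace('\\', '\\\\').replace('"', '\\"') + '"'
    tag, *children = node
    return "(" + " ".join([tag] + [to_sexpr(c) for c in children]) + ")"


def to_element(node):
    """Encode the same nested tuple as an XML element tree."""
    tag, *children = node
    elem = ET.Element(tag)
    for child in children:
        if isinstance(child, str):
            # Assumption: text is never interleaved with child elements.
            elem.text = (elem.text or "") + child
        else:
            elem.append(to_element(child))
    return elem


sexpr = to_sexpr(record)
xml_text = ET.tostring(to_element(record), encoding="unicode")
print(sexpr)
print(xml_text)
```

    Both encodings carry the same structure and both are easy to
    produce; neither one says anything about what a "Name" or an
    "Address" is supposed to mean.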

    -Paul Fernhout
    Kurtz-Fernhout Software
    Developers of custom software and educational simulations
    Creators of the Garden with Insight(TM) garden simulator

    Eric Armstrong wrote:
    > A fascinating look at some of the limitations of XML.
    > The discussion of how MathML was *actually* done is quite
    > fascinating, along with a similar business example.
    > It shows that even "XML" data may in fact require
    > proprietary engines to do the processing.
    > We'll have to keep the MathML model in mind, *just in
    > case* XML by itself doesn't get us where we need to go...
    > -------- Original Message --------
    > Subject: Tepid water ...
    > Date: Thu, 30 Mar 2000 17:28:56 -0800
    > To: xml-tech@eng.sun.com
    > For those that haven't seen this, I thought it
    > was quite interesting:
    > http://www.interlog.com/~gray/markup-abuse.html
    > Philip

    This archive was generated by hypermail 2b29 : Sun Apr 02 2000 - 19:52:52 PDT