[unrev-II] XML limits (Was: [Fwd: Tepid water ...])

From: Paul Fernhout (pdfernhout@kurtz-fernhout.com)
Date: Sun Apr 02 2000 - 19:46:30 PDT

Next message: Eric Armstrong: "Re: [unrev-II] XML limits (Was: [Fwd: Tepid water ...])"

Previous message: Eugene Kim: "[unrev-II] Lore -- an XML DBMS"
In reply to: Eric Armstrong: "[unrev-II] [Fwd: Tepid water ...]"
Next in thread: Eric Armstrong: "Re: [unrev-II] XML limits (Was: [Fwd: Tepid water ...])"
Reply: Eric Armstrong: "Re: [unrev-II] XML limits (Was: [Fwd: Tepid water ...])"

Eric -

Excellent URL. I enjoyed the essay there.

My issue with XML has always mainly been that it does not solve the
semantic agreement issue among programs (or people). It also does not
address the issue of semantic shift -- the meaning of fields or terms
changes over time as end user needs change.

This article touches on this -- both by pointing out that you need code
to process the XML, and that you can fool yourself into thinking such
code would be easy to write by picking seemingly "simple" names for XML
tags which in reality stand for complex concepts.

Here's a typical example of why even the simplest knowledgebase (DKR)
can be difficult to maintain over time. We create an XML specification
for a "User" (seems like a simple enough concept!) for a DKR so we can
record who has made changes to the DKR by referencing that user, and we
also want to send them "snail" mail newsletters occasionally.

We make an XML DTD defining a record like:
<User>
<Name>John Doe</Name>
<Address>3 Lookinglass Lane, Minnetonka, MN 55555 USA</Address>
</User>
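
As an illustration of how little the markup alone tells a program, here
is a minimal Python sketch (not from the article; parse_user is a
hypothetical helper name) that reads the record above with the standard
library parser. All the parser can return is tag names and strings.
Deciding what a "Name" is, or which part of the address is the postal
code, is still left to code we write.

```python
import xml.etree.ElementTree as ET

# The example record from above, copied verbatim.
RECORD = """
<User>
<Name>John Doe</Name>
<Address>3 Lookinglass Lane, Minnetonka, MN 55555 USA</Address>
</User>
"""

def parse_user(xml_text):
    """Return the child tags of a <User> record as a plain dict of strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}

user = parse_user(RECORD)
print(user["Name"])     # just a string; nothing says whether it is unique
print(user["Address"])  # one opaque string; street, city and country are not separated
```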

And then we find out things like:
* This person has multiple names (nicknames, aliases, pen names). We
want to record these for use in a web clipping service to give users
lists of pages that might contain text references to them.
* This person's name is shared by multiple different users, so now we
need unique IDs or something like them.
* It turns out this person is also a temporary employee and we want to
cross index those records with the user records. Oh, and they are also a
customer, purchasing some services. Unfortunately, those systems were
set up with their own unique IDs and did not anticipate users being
employees or customers.
* One "user" is actually a company account used by multiple people.
Worse, that company has spun off a subsidiary and both accounts are now
users. Oh, and by the way, now that subsidiary has merged with another
company which was also one of our previous users. Oh, and some of those
people using the original company accounts are now independent users. We
would still like to be able to trace who made what entry when to the
best of our ability, and also still generate consistent good looking
reports.
* Some of the entries attributed to one user were actually made by
another user and entered erroneously as belonging to the first user,
but we want to remember that for a while we thought this user had made
them, and various previously generated reports were based on that
information.
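
A rough sketch of the kind of model these discoveries push us toward,
in Python. Everything here is an assumption made for illustration: the
class names Party and Attribution, the field names, and the sample IDs
are all hypothetical. The point is that unique IDs, aliases,
cross-references to other systems, and a record of what we used to
believe are modeling decisions. They would look the same whether the
data were stored in XML or anywhere else.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

@dataclass
class Party:
    party_id: str                  # stable ID, because names are not unique
    kind: str                      # "person" or "organization"
    names: List[str] = field(default_factory=list)             # nicknames, aliases, pen names
    external_ids: Dict[str, str] = field(default_factory=dict)  # IDs used by other systems

@dataclass
class Attribution:
    entry_id: str
    party_id: str
    recorded_on: date                      # when we started believing this
    superseded_on: Optional[date] = None   # when we learned otherwise (None = current)

def author_as_of(history, entry_id, when):
    """Return the party we believed wrote entry_id on the date `when`."""
    for a in history:
        if a.entry_id != entry_id:
            continue
        still_believed = a.superseded_on is None or when < a.superseded_on
        if a.recorded_on <= when and still_believed:
            return a.party_id
    return None

# Hypothetical sample data: entry-1 was first credited to u1, then corrected to u2.
history = [
    Attribution("entry-1", "u1", date(2000, 1, 10), superseded_on=date(2000, 3, 1)),
    Attribution("entry-1", "u2", date(2000, 3, 1)),
]
print(author_as_of(history, "entry-1", date(2000, 2, 1)))  # u1
print(author_as_of(history, "entry-1", date(2000, 4, 1)))  # u2
```

Keeping the superseded attribution instead of overwriting it is what
lets an old report be regenerated the way it originally looked.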

XML doesn't help solve any of these issues. It does nothing for us as
the semantics of the data fields shift or as the meaning of the "user"
concept itself shifts. We end up with multiple versions of our XML
Document Type Definition (DTD) as we attempt to accommodate new needs,
and XML cannot in any special way help us resolve differences among
multiple DTDs.
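
For instance, moving a record from one version of the format to the
next takes hand-written code. The sketch below is only an illustration:
the "version 2" layout (an id attribute, a Names list, an Addresses
list) and the function name upgrade_v1_to_v2 are invented here, not
taken from any real DTD. Note that the unique ID has to be supplied from
outside, because the first version never recorded one.

```python
import xml.etree.ElementTree as ET

V1 = (
    "<User><Name>John Doe</Name>"
    "<Address>3 Lookinglass Lane, Minnetonka, MN 55555 USA</Address></User>"
)

def upgrade_v1_to_v2(v1_text, user_id):
    """Convert a v1 <User> record into a hypothetical v2 layout."""
    old = ET.fromstring(v1_text)
    new = ET.Element("User", {"id": user_id, "version": "2"})

    # v2 allows several names; the single v1 name becomes the primary one.
    names = ET.SubElement(new, "Names")
    primary = ET.SubElement(names, "Name", {"kind": "primary"})
    primary.text = old.findtext("Name")

    # v2 allows several addresses; the v1 address is carried over unparsed.
    addresses = ET.SubElement(new, "Addresses")
    address = ET.SubElement(addresses, "Address")
    address.text = old.findtext("Address")
    return new

v2 = upgrade_v1_to_v2(V1, "u-0001")
print(ET.tostring(v2, encoding="unicode"))
```

Every choice in that function (which name is primary, where the ID
comes from) is a semantic decision that the markup could not make for
us.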

William Kent's classic book "Data and Reality" from 1979 delves into
these sorts of issues in depth. I'd highly recommend it.
http://www.bkent.net/

Even the simple address field by itself is a minefield. Let's say the
user has moved a few times and we want to remember each address they
moved to. Some of these addresses are outside the USA, and sometimes we
send mail from offices outside the USA, so the address requires different
(relative) routing information based on the country sending the mail and
the country the mail is being sent to. So we need to store lots of
address information, and know which is the right information to use in
any particular situation. Again, XML by itself cannot help us. We might
use XML as part of a solution, but only by also creating lots of code
(in some language) and data (maybe in XML, maybe in a database) related
to international addressing.
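
A minimal Python sketch of the sort of code this implies. The names
AddressPeriod, address_on, and mailing_label are hypothetical, the
second address is made up, and the single rule used here (add a country
line only when the mail crosses a border) is a deliberate
oversimplification of real postal practice.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class AddressPeriod:
    lines: List[str]                  # street, city, postal code as written locally
    country: str                      # short country code, e.g. "US"
    country_name: str
    valid_from: date
    valid_to: Optional[date] = None   # None means the address is still current

def address_on(history, when):
    """Return the address that was valid on the date `when`, if any."""
    for period in history:
        not_ended = period.valid_to is None or when < period.valid_to
        if period.valid_from <= when and not_ended:
            return period
    return None

def mailing_label(name, address, sending_country):
    """Build a label; the country line depends on where we mail from."""
    label = [name] + address.lines
    if address.country != sending_country:
        label.append(address.country_name.upper())
    return "\n".join(label)

# Hypothetical move history for the user in the example record.
history = [
    AddressPeriod(["3 Lookinglass Lane", "Minnetonka, MN 55555"], "US", "USA",
                  date(1995, 1, 1), date(1999, 6, 1)),
    AddressPeriod(["12 Example Road", "London"], "GB", "United Kingdom",
                  date(1999, 6, 1)),
]

current = address_on(history, date(2000, 4, 2))
print(mailing_label("John Doe", current, "US"))  # ends with UNITED KINGDOM
print(mailing_label("John Doe", current, "GB"))  # no country line
```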

I do think XML is sometimes useful as a data transmission
format because it makes it easier to reverse engineer the intent of the
information structure -- if that structure was put together with the
intent of being reverse engineered. It is also useful for quick and
dirty data encoding as one bootstraps up a program -- much the same way
you can easily encode LISP data structures using lots of parentheses.
But as the article points out, the XML hype paints XML as a data
exchange panacea, which it is not.

-Paul Fernhout
Kurtz-Fernhout Software
=========================================================
Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator
http://www.kurtz-fernhout.com

Eric Armstrong wrote:
>
> A fascinating look at some of the limitations of XML.
> The discussion of how MathML was *actually* done is quite
> fascinating, along with a similar business example.
> It shows that even "XML" data may in fact require
> proprietary engines to do the processing.
>
> We'll have to keep the MathML model in mind, *just in
> case* XML by itself doesn't get us where we need to go...
>
> -------- Original Message --------
> Subject: Tepid water ...
> Date: Thu, 30 Mar 2000 17:28:56 -0800
> To: xml-tech@eng.sun.com
>
> For those that haven't seen this, I thought it
> was quite interesting:
>
> http://www.interlog.com/~gray/markup-abuse.html
>
> Philip


This archive was generated by hypermail 2b29 : Sun Apr 02 2000 - 19:52:52 PDT