On Wed, 30 May 2001, Lee Iverson wrote:
> I believe that there is a problem here. I don't believe Ted "gets"
> the shifts in understanding that have taken place in moving from SGML
> and HTML to XML. In almost all circumstances, embedded tags are now
> (reading the most-enlightened XML literature) considered to be
> semantic type identifiers.
Okay, I tried to resist the urge to be pedantic, but to no avail. :-)
Aren't all forms of syntax in essence semantic type identifiers? And in
this sense, aren't HTML tags also semantic type identifiers? They
represent formatting semantics -- i.e. this element is a headline, this
element should be bold, etc. -- but aren't those still semantics?
I actually agree with your original point, Lee. I didn't quite understand
Ted's brief explanation last Tuesday of why he didn't like markup
languages, so I dug up his paper "Embedded Markup Considered Harmful" (WWW
Journal, Winter 1997). It's not on the Web, unfortunately, so I'll
summarize his objections here:
1. Transclusion of a marked-up document may result in a syntactically
invalid excerpt. In other words, if I want to transclude the phrase "To
be, or not to be" from the XML document "<quote>To be, or not to be. That
is the question.</quote>", the textual excerpt returned will not be
well-formed XML, because it won't include the end tag.
2. The quoting author may want to change the look-and-feel or even the
structure of the quotation. If this is done by copy-and-paste, no
problem, but if this is done using transclusion, then you need to have
some form of view control.
3. Embedded structure limits the kinds of structure that can be expressed.
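
To make the first objection concrete, here is a minimal Python sketch. It
treats transclusion as a plain character-range excerpt, which is my
assumption for illustration and not a mechanism taken from Ted's paper; the
helper name is_well_formed is likewise made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a single well-formed XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


source = "<quote>To be, or not to be. That is the question.</quote>"

# Excerpt by character offsets: everything up to the end of the phrase.
# The start tag comes along for the ride, but the end tag does not.
phrase = "To be, or not to be"
end = source.index(phrase) + len(phrase)
excerpt = source[:end]

print(repr(excerpt), is_well_formed(excerpt))
print(repr(source), is_well_formed(source))
```

The full document parses, while the excerpt is rejected because the quote
element is never closed.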
For brevity's sake, I won't comment on the first two objections here,
nor on his general feelings about markup, but I do want to say a few words
about his third objection. Nelson's solution is to store content as raw
text, and structure as a layer over the content. I agree with this in
theory, but in reality, there isn't a true separation between the two.
Raw text contains just as much embedded structure as an XML file, the only
difference being that valid syntax is well-defined for XML documents,
whereas it is undefined for raw text.
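
Here is a rough sketch of what "structure as a layer over the content" could
look like, assuming the layer is just a list of labelled character ranges.
The names Span and transclude are hypothetical, and this is only my reading
of the idea, not Ted's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive offset into the raw text
    end: int    # exclusive offset into the raw text
    label: str


def transclude(text, spans, start, end):
    """Return text[start:end] plus the overlapping spans, clipped to the
    excerpt and rebased to excerpt-relative offsets."""
    excerpt = text[start:end]
    clipped = [
        Span(max(s.start, start) - start, min(s.end, end) - start, s.label)
        for s in spans
        if s.start < end and s.end > start
    ]
    return excerpt, clipped


text = "To be, or not to be. That is the question."
spans = [
    Span(0, len(text), "quote"),
    Span(0, 20, "sentence"),
    # Crosses the sentence boundary, so it could not nest inside
    # <sentence> as an embedded element.
    Span(14, 28, "emphasis"),
]

excerpt, layer = transclude(text, spans, 0, 19)
print(repr(excerpt))
for s in layer:
    print(s.label, repr(excerpt[s.start:s.end]))
```

The excerpt is always usable because no tags live inside the text, and the
overlapping emphasis range is no problem for the layer. Note, though, that
the "raw" text still carries structure of its own: the period and the spaces
are doing the work of sentence and word markup, just without a defined
syntax.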
The challenge for us is defining what goes into the content layer and at
what level of granularity.
> I wasn't particularly impressed with ZigZag, since it didn't seem to
> give me anything that I couldn't get with generic (and
> well-understood!) graph structures and algorithms. Ted's search for
> "revolutionary" data structures seems to be too much of a barrier for
> most programmers, let alone the users of his systems.
I don't think the data structure is meant to be revolutionary; what matters
is how that data structure is used for organizing and viewing information. I
found Ted's spreadsheet analogy compelling. A table structure wasn't
revolutionary, even in the early 1980s. :-) It was the fact that Joe
Schmoe user could map out an entire table of dependencies and perform and
view all sorts of sophisticated calculations using a fairly simple
interface.
> I'm not entirely convinced that all structure wants to be "above the
> information". I'd suggest that useful information has inherent
> structure. The real issue is the flexibility of the structural
> building blocks and the ability to reference and reuse this structured
> information with a variety of higher-level structures. In order to
> reuse a birthday, I want to maintain the fact that it is So-and-so's
> birthday, no matter what the context that nugget is being used in.
>
> So, with your permission, I'd say that the real failure of current
> systems is that the only level of reusable structure "above the
> information" is the document. That is just way too coarse.
So we agree on this point. A while back, Paul Fernhout recommended
William Kent's book Data and Reality. I checked it out, and have to
second Paul's enthusiastic endorsement. Kent really does a wonderful job
of discussing all of these complexities, and his writing and thinking are
extremely clear and easy to follow.
-Eugene
--
+=== Eugene Eric Kim ===== eekim@eekim.com ===== http://www.eekim.com/ ===+
|       "Writer's block is a fancy term made up by whiners so they        |
+=====  can have an excuse to drink alcohol."  --Steve Martin  ===========+