Re: [unrev-II] XML limits

From: Jack Park (
Date: Wed Apr 05 2000 - 15:54:21 PDT

    A paper called The Story of Squeak can be found at:

    A trivial google search turned up

    ----- Original Message -----
    From: Eric Armstrong <>
    To: <>
    Sent: Wednesday, April 05, 2000 3:22 PM
    Subject: Re: [unrev-II] XML limits

    > First, some general notes:
    > * At the moment, I'm focusing on in-memory data structures
    > and the required manipulation mechanisms. I'm not focusing
    > on representation of the information until after there is a
    > sense of "completeness" in the internal data model. That's
    > the point at which it will become clear whether or not XML
    > will work as an external representation.
    > * The major problem with XML that I see at the moment is
    > data export. Given that I have a massively-interconnected
    > graph of information nodes, any one "slice" of those nodes
    > may constitute a document. That document can be represented
    > in XML. Differences between documents and messages that
    > transmit the differences can also be represented in XML.
    > But if you wanted to export the repository so you could
    > import it into another system, would XML be very useful
    > for that?
    > * Thanks for the new and repeated references. They're on my
    > priority reading list. (Which is to say I'll probably be
    > able to get to them by next month...)
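
One way to look at the repository-export question above is a flat dump: every node is written once and links become id references, so the XML never has to pretend the graph is a tree. This is only a minimal sketch under stated assumptions. The `repo` dictionary, the element names (`repository`, `node`, `text`, `link`) and both function names are hypothetical and are not taken from any existing DKR/OHS design.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory repository: node id -> (text, ids of linked nodes).
# The links form a graph with a cycle, not a tree.
repo = {
    "n1": ("XML limits", ["n2", "n3"]),
    "n2": ("Data export", ["n3"]),
    "n3": ("Knowledge representation", ["n1"]),
}

def export_repository(repo):
    """Serialize every node once, as a flat list; links become id references."""
    root = ET.Element("repository")
    for node_id, (text, links) in repo.items():
        node = ET.SubElement(root, "node", id=node_id)
        ET.SubElement(node, "text").text = text
        for target in links:
            ET.SubElement(node, "link", ref=target)
    return ET.tostring(root, encoding="unicode")

def import_repository(xml_text):
    """Rebuild the same dictionary from the exported XML."""
    root = ET.fromstring(xml_text)
    return {
        node.get("id"): (
            node.findtext("text"),
            [link.get("ref") for link in node.findall("link")],
        )
        for node in root.findall("node")
    }

xml_text = export_repository(repo)
assert import_repository(xml_text) == repo  # the round trip preserves the graph
```

The hierarchy in this file is only two levels deep (repository, then nodes); all of the real structure lives in the references. Whether that counts as XML being "useful" for export is exactly the open question, since the importing system still has to resolve the references itself.
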
    > Paul Fernhout wrote:
    > >
    > > My longer point is that the knowledge management / representation
    > > problem is a deep one, and XML doesn't address it in a serious way,
    > > and confuses the subject by the hype making it sound like XML does
    > > address the topic of knowledge representation in a serious way.
    > >
    > Hmmm. I never had that impression. I got that if I have data, I can
    > represent it in XML -- especially if the data is structured. What I
    > keep wrestling with is that any individual *view* of the data benefits
    > from hierarchy -- it helps to organize the info and orient the reader.
    > But the underlying data is a multi-connected graph, not a hierarchy.
    > So maybe what's really needed is:
    >                      +- GUI operations
    >                      V
    > ?repository? --> XML-based view +--> HTML representation
    >                                 +--> PDF representation
    >                                      etc.
    > Identifying the structure of the repository is my major quest at
    > the moment.
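
The repository-to-view-to-representation pipeline can be sketched roughly as follows, assuming the repository is a plain adjacency map. A view is produced by unfolding the graph from a chosen root: each node is expanded the first time it is reached, and later encounters become references. The `graph` data, the `item` and `ref` element names, and the functions `xml_view` and `to_html` are all illustrative assumptions, not a proposed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository: node id -> (title, ids of linked nodes).
graph = {
    "a": ("Repository design", ["b", "c"]),
    "b": ("XML views", ["c"]),
    "c": ("Graph model", ["a"]),
}

def xml_view(graph, root_id):
    """Unfold the graph from root_id into a tree-shaped XML view.

    Each node is expanded the first time it is reached; later encounters
    become <ref> elements, so cycles and shared nodes stay finite.
    """
    seen = set()

    def build(node_id):
        title, links = graph[node_id]
        seen.add(node_id)
        elem = ET.Element("item", id=node_id, title=title)
        for target in links:
            if target in seen:
                ET.SubElement(elem, "ref", to=target)
            else:
                elem.append(build(target))
        return elem

    return build(root_id)

def to_html(item):
    """Render the XML view as nested HTML list items."""
    li = ET.Element("li")
    li.text = item.get("title")
    refs = [r.get("to") for r in item.findall("ref")]
    if refs:
        li.text += " (see also: " + ", ".join(refs) + ")"
    children = item.findall("item")
    if children:
        ul = ET.SubElement(li, "ul")
        for child in children:
            ul.append(to_html(child))
    return li

view = xml_view(graph, "a")
html = ET.tostring(to_html(view), encoding="unicode")
print(html)
```

Calling `xml_view(graph, "c")` instead yields a different hierarchy over the same three nodes, which is the sense in which the hierarchy belongs to the view rather than to the repository.
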
    > > Squeak, Python, Common Lisp (less so) are interesting choices.
    > > I'm starting to think Squeak might be the best choice for prototyping
    > > (for me) given that it is completely cross-platform and open. Its
    > > cross-platform GUI does the best job of addressing the DKR design
    > > requirement of shareable screens.
    > >
    > Can you tell me more about Squeak (again), and why I'm going to like it,
    > and where to find it?
    > > ...in a talk last year by Marvin Minsky he went on at
    > > length about the need for multiple representational strategies for
    > > problem solving. He argued the human mind may perceive problems using
    > > five or six strategies (ex. geometrical reasoning, formal logic,
    > > heuristic rules of thumb, pattern recognition, semantic networks,
    > > others) and continuously picks the best one at the time to progress in
    > > thinking.
    > >
    > This seems fundamental. Has he written this up anywhere, to your
    > knowledge?
    > > Maybe what we need is an overview of the AI and knowledge management
    > > fields and how each area or major problem/topic would affect a
    > > DKR/OHS.
    > >
    > That strikes me as a profitable enumeration of issues.
    > Any thoughts on how we should get started?
    > > Also, what will evolve over time for an OHS/DKR project is a set of
    > > useful code that can manipulate data structures that are related to
    > > knowledge representation. We might also wish to have a survey of such
    > > existing code.
    > >
    > Yeah. I started the reference list with things like that in mind.
    > I've fallen behind in keeping the list up to date, much less producing
    > even preliminary evaluations of different papers. I've seen a lot of
    > stuff that doesn't excite me. IBIS was a notable exception. This is an
    > area where we desperately need even a preliminary DKR, so we can track
    > evaluations of different papers, and start sorting them by relevance and
    > other criteria (like readability and explanatory power).
    > > ...as time goes on, any restrictions will become obsolete.
    > > One needs a representational system that can adapt to user needs.
    > >
    > Can you give an example of that? Something simple will do. Maybe my
    > sixth grade view of physics vs. my college-level view, for example.
    > Does that make sense? (A specific adaptation would be even better.)
    > > while XML could be a part of that solution, the important issues go
    > > beyond that -- to standards creation and revision and communication,
    > > and to coin a phrase "data upgrading".
    > >
    > I understand about standards creation. That's where the interesting work
    > is going on even as we speak. I don't see how revision and communication
    > go beyond XML. And I'm not sure what you mean by data upgrading. Can you
    > elaborate?
    > > The deeper issue is that rather than focus on ways to limit
    > > representations (DTDs) we need to focus on ways to transform, extend,
    > > and simplify representations as needed (sort of along the multi-level
    > > approach I mentioned earlier).
    > >
    > As I mentioned, DTDs only give you minimal validation. Like Lisp or
    > Smalltalk apps, the "interesting" validation will probably occur within
    > the context of the app -- as long as you are doing "interesting"
    > validation.
    > However, I think the better strategy is to punt on that issue. I'm not
    > interested in AI-level reasoning about statements like "Horses fly". I
    > am totally uninterested in any sort of automatic verification for such
    > things.
    > I am interested in one person having the ability to assert "Horses fly",
    > another person to argue against it, and for individuals to estimate the
    > value and usefulness of a document based on the assertions it contains.
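
A data model for that kind of exchange needs very little machinery. The sketch below is only an illustration: the `Assertion` class, its fields, and the `tally` helper are hypothetical names, and the counting is deliberately naive. Nothing in it checks whether a statement is true; it only records who asserted what and who argued for or against it.

```python
from dataclasses import dataclass, field

@dataclass
class Assertion:
    author: str
    text: str
    # Each response is a (supports, Assertion) pair, so replies can be
    # argued with in turn.
    responses: list = field(default_factory=list)

    def respond(self, author, text, supports):
        """Attach a supporting or opposing reply and return it."""
        reply = Assertion(author, text)
        self.responses.append((supports, reply))
        return reply

def tally(assertion):
    """Count direct supporting and opposing responses; no reasoning involved."""
    pro = sum(1 for supports, _ in assertion.responses if supports)
    con = len(assertion.responses) - pro
    return pro, con

claim = Assertion("alice", "Horses fly")
claim.respond("bob", "No horse has wings", supports=False)
claim.respond("carol", "Pegasus is a myth, not a horse", supports=False)
print(tally(claim))
```

A reader estimating the value of a document could start from counts like these, although how to weigh them is left entirely to the person.
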
    > Here are two analogies:
    > 1) "Decorative" tags vs. "Structure" tags.
    > In DocBook, these are called "inline" tags (like bold and italic)
    > vs. "block" tags (like sect1 and sect2). One thing that XML does
    > *not* give me is a good way to make a clean separation between
    > those two. That distinction is important, too, for two major
    > reasons:
    > a) When displaying a document, I want to know which elements
    > belong in the outline (table of contents, tree view) and
    > which elements belong only in the content-display.
    > b) For structure elements, the sub-structure should always
    > consist of (1) content -- any combination of text and
    > decorative elements -- *followed* by structure elements.
    > In other words, any structure element can have one piece
    > of content, followed by substructure elements, and there
    > is never any overlap between them. XML gives me no such
    > mechanism. (The DocBook solution is to define a <title>
    > element for each <sectN>. That introduces two tags where
    > only one is really needed, and complicates the processing.)
    > The point of this analogy is that I frequently want to separate
    > structure from content, so I can treat them separately.
    > 2) The second analogy is in the graphic representation of computer
    > programs. In graphics, hierarchy is expressed by "diving in".
    > You look inside a graphic object to see what it contains. Here
    > again, I need a distinction between control elements and normal
    > statements. The reason: graphical representation of a = b;
    > does me no good whatsoever. It consumes space for the graphics
    > that has no value whatsoever for understanding the program.
    > Graphical representations of programs, therefore, need to stop
    > at the control-flow statements. A graphical representation of
    > all the if, for, and case statements in a program may be of use.
    > In any one block, though, a simple listing of the normal
    > statements is sufficient.
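
The content-then-substructure rule from the first analogy is the kind of check that ends up in application code rather than in a DTD. A minimal sketch, assuming two hypothetical tag sets loosely modeled on DocBook names (`STRUCTURE` and `DECORATIVE` are illustrative, and a real vocabulary would differ):

```python
import xml.etree.ElementTree as ET

# Assumed tag sets, loosely modeled on DocBook; any real schema would differ.
STRUCTURE = {"doc", "sect1", "sect2"}
DECORATIVE = {"bold", "italic"}

def content_precedes_structure(elem):
    """True if, in every structure element, all text and inline children
    come before the first structure child."""
    if elem.tag not in STRUCTURE:
        return True
    seen_structure = False
    for child in elem:
        if child.tag in STRUCTURE:
            seen_structure = True
            if child.tail and child.tail.strip():
                return False  # text after a structure child
            if not content_precedes_structure(child):
                return False
        elif seen_structure:
            return False  # inline element after a structure child
    return True

good = ET.fromstring(
    "<sect1>Intro with <bold>emphasis</bold>."
    "<sect2>First point</sect2><sect2>Second point</sect2></sect1>"
)
bad = ET.fromstring(
    "<sect1>Intro<sect2>First point</sect2>stray text</sect1>"
)
print(content_precedes_structure(good), content_precedes_structure(bad))
```

With the rule enforced, an outline view can take the leading content of each structure element as its label, without needing a separate title element.
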
    > I see the same issue with respect to knowledge representation.
    > Attempting to solve the whole problem by representing "tree", "apple",
    > "green", "red", etc. is just too hard. Let the human interpret the
    > meaning of the words. But there is an underlying structure that it makes
    > sense to automate. Perhaps it is Noam Chomsky's deep structure, or
    > perhaps a logic model, or perhaps one of several representations as
    > identified by Minsky.
    > If we can construct a system within which we can model those
    > relationships and reason about them, we can make a ton of progress
    > without having to make a computer into a "thinking" machine.
    > > ...Any DKR/OHS will need to be more
    > > than a bunch of passive data in a database. It will need many programs
    > > to do things to that data to make new data (search, format, summarize,
    > > repackage, interpret, transfer, upgrade, etc.).
    > > A more important issue than data transmission format (the one XML
    > > tries to address) is to build a robust platform for doing those
    > > algorithmic things.
    > >
    > Oddly enough, I haven't seen that the knowledge repository needs a lot
    > of functionality. I've been looking for it, but most of the operations
    > you mention I see as either aspects of the UI (like searching) or
    > operations best conducted by the user (summarizing).
    > > ...As a deeper approach, one tries to represent the knowledge and
    > > algorithms in an abstract enough way as to be ideally programming
    > > language neutral or at least programming language retargetable
    > > (generating whatever code in whatever language as needed).
    > >
    > This would of course be ideal, assuming that the manipulations need to
    > be part of the repository system. I am as yet unconvinced that they have
    > to be, but I am open to argument on the subject.
    > (thanks for another great, thought-provoking note.)
    > Community email addresses:
    > Post message:
    > Subscribe:
    > Unsubscribe:
    > List owner:
    > Shortcut URL to this page:

    This archive was generated by hypermail 2b29 : Wed Apr 05 2000 - 15:58:10 PDT