Re: [unrev-II] XML limits

From: Eric Armstrong (eric.armstrong@eng.sun.com)
Date: Wed Apr 05 2000 - 15:22:52 PDT

Next message: Jack Park: "Re: [unrev-II] XML limits"

Previous message: Jon Winters: "[unrev-II] Video of the "Spiritual Robots" Symposium (fwd)"
In reply to: Paul Fernhout: "Re: [unrev-II] XML limits"
Next in thread: Jack Park: "Re: [unrev-II] XML limits"
Next in thread: Jack Park: "Re: [unrev-II] [Fwd: Tepid water ...]"
Reply: Jack Park: "Re: [unrev-II] XML limits"

First, some general notes:
* At the moment, I'm focusing on in-memory data structures
   and the required manipulation mechanisms. I'm not focusing
   on representation of the information until after there is a
   sense of "completeness" in the internal data model. That's
   the point at which it will become clear whether or not XML
   will work as an external representation.

* The major problem with XML that I see at the moment is
   data export. Given that I have a massively-interconnected
   graph of information nodes, any one "slice" of those nodes
   may constitute a document. That document can be represented
   in XML. Differences between documents and messages that
   transmit the differences can also be represented in XML.
   But if you wanted to export the repository so you could
   import it into another system, would XML be very useful
   for that?

* Thanks for the new and repeated references. They're on my
   priority reading list. (Which is to say I'll probably be
   able to get to them by next month...)
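
On the export question in the second note above, here is a minimal sketch of one possible answer, not a claim about how the repository will actually be stored. It assumes the graph can be flattened into a list of nodes plus a list of links that refer to nodes by id, so XML's hierarchy is barely used. The element names (repository, node, link) and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

def export_graph(nodes, links):
    """Serialize a graph as flat XML.

    nodes: dict mapping node id -> text
    links: list of (source_id, target_id, kind) tuples; cycles are fine
    because links refer to nodes by id instead of nesting them.
    """
    root = ET.Element("repository")
    for node_id, text in sorted(nodes.items()):
        node = ET.SubElement(root, "node", id=node_id)
        node.text = text
    for source, target, kind in links:
        ET.SubElement(root, "link", {"from": source, "to": target, "kind": kind})
    return ET.tostring(root, encoding="unicode")

def import_graph(xml_text):
    """Rebuild the (nodes, links) pair produced by export_graph."""
    root = ET.fromstring(xml_text)
    nodes = {n.get("id"): n.text or "" for n in root.findall("node")}
    links = [(l.get("from"), l.get("to"), l.get("kind"))
             for l in root.findall("link")]
    return nodes, links
```

A round trip through export_graph and import_graph gives back the same nodes and links, which suggests XML can carry the whole repository. Whether a flat id/reference dump counts as "very useful" compared with any other serialization is still the open question.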

Paul Fernhout wrote:
>
> My longer point is that the knowledge management / representation
> problem is a deep one, and XML doesn't address it in a serious way,
> and confuses the subject by the hype making it sound like XML does
> address the topic of knowledge representation in a serious way.
>
Hmmm. I never had that impression. I got that if I have data, I can
represent it in XML -- especially if the data is structured. What I
keep wrestling with is that any individual *view* of the data benefits
from hierarchy -- it helps to organize the info and orient the reader.
But the underlying data is a multi-connected graph, not a hierarchy.
So maybe what's really needed is:

                         +- GUI operations
                         V
  ?repository? --> XML-based view +--> HTML representation
                                  +--> PDF representation
                                       etc.

Identifying the structure of the repository is my major quest at
the moment.
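
To make the "hierarchical view over a multi-connected graph" idea concrete, here is a rough sketch under stated assumptions. The repository is assumed to be an adjacency list (a dict from node id to a list of neighbour ids), and a view is cut by walking outward from a chosen starting node. A link back to a node already on the current path becomes a ref element instead of a nested copy, so cycles cannot recurse forever. The names view_as_xml, node, and ref are invented for this example.

```python
import xml.etree.ElementTree as ET

def view_as_xml(graph, texts, start, max_depth=3):
    """Cut a tree-shaped XML view out of a graph.

    graph: dict mapping node id -> list of neighbour ids
    texts: dict mapping node id -> text content
    start: id of the node that becomes the root of the view
    """
    on_path = set()  # nodes between the root and the current node

    def build(node_id, depth):
        elem = ET.Element("node", id=node_id)
        elem.text = texts.get(node_id, "")
        if depth >= max_depth:
            return elem  # stop expanding; the view is only a slice
        on_path.add(node_id)
        for neighbour in graph.get(node_id, []):
            if neighbour in on_path:
                # A cycle: point back to the ancestor instead of nesting it.
                ET.SubElement(elem, "ref", to=neighbour)
            else:
                elem.append(build(neighbour, depth + 1))
        on_path.discard(node_id)
        return elem

    return build(start, 0)
```

Note that a node reachable by two different paths shows up twice in the view. That is acceptable for a view, but it is exactly why the view format says little about the structure of the repository itself.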

> Squeak, Python, Common Lisp (less so) are interesting choices.
> I'm starting to think Squeak might be the best choice for prototyping
> (for me) given that it is completely cross-platform and open. Its
> cross-platform GUI does the best job of addressing the DKR design
> requirement of shareable screens.
>
Can you tell me more about Squeak (again), and why I'm going to like it,
and where to find it?

> ...at a talk last year by Marvin Minsky he went on at
> length about the need for multiple representational strategies for
> problem solving. He argued the human mind may perceive problems using
> five or six strategies (ex. geometrical reasoning, formal logic,
> heuristic rules of thumb, pattern recognition, semantic networks,
> others) and continuously picks the best one at the time to progress in
> thinking.
>
This seems fundamental. Has he written this up anywhere, to your
knowledge?

> Maybe what we need is an overview of the AI and knowledge management
> fields and how each area or major problem/topic would affect a
> DKR/OHS.
>
That strikes me as a profitable enumeration of issues.
Any thoughts on how we should get started?

> Also, what will evolve over time for an OHS/DKR project is a set of
> useful code that can manipulate data structures that are related to
> knowledge representation. We might also wish to have a survey of such
> existing code.
>
Yeah. I started the reference list with things like that in mind.
I've fallen behind in keeping the list up to date, much less producing
even preliminary evaluations of different papers. I've seen a lot of
stuff that doesn't excite me. IBIS was a notable exception. This is an
area where we desperately need even a preliminary DKR, so we can track
evaluations of different papers and start sorting them by relevance and
other criteria (like readability and explanatory power).

> ...as time goes on, any restrictions will become obsolete.
> One needs a representational system that can adapt to user needs.
>
Can you give an example of that? Something simple will do. Maybe my
sixth grade view of physics vs. my college-level view, for example.
Does that make sense? (A specific adaptation would be even better.)

> while XML could be a part of that solution, the important issues go
> beyond that -- to standards creation and revision and communication,
> and, to coin a phrase, "data upgrading".
>
I understand about standards creation. That's where the interesting work
is going on even as we speak. I don't see how revision and communication
go beyond XML. And I'm not sure what you mean by data upgrading. Can you
elaborate?

> The deeper issue is that rather than focus on ways to limit
> representations (DTDs) we need to focus on ways to transform, extend,
> and simplify representations as needed (sort of along the multi-level
> approach I mentioned earlier).
>
As I mentioned, DTDs only give you minimal validation. Like Lisp or
Smalltalk apps, the "interesting" validation will probably occur within
the context of the app -- as long as you are doing "interesting"
validation.

However, I think the better strategy is to punt on that issue. I'm not
interested in AI-level reasoning about statements like "Horses fly". I
am totally uninterested in any sort of automatic verification for such
things. I am interested in one person having the ability to assert
"Horses fly", another person being able to argue against it, and
individuals being able to estimate the value and usefulness of a
document based on the assertions it contains.
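
As a minimal sketch of what that might look like in memory, assuming nothing more than "an assertion has an author, some text, and a list of responses that either support or dispute it": the class name, the two stance labels, and the tally method below are all hypothetical, and the system makes no attempt to decide who is right.

```python
from dataclasses import dataclass, field

STANCES = ("supports", "disputes")  # assumed labels, not a fixed vocabulary

@dataclass
class Assertion:
    author: str
    text: str
    # Each response is a (stance, Assertion) pair, so responses can
    # themselves be argued with.
    responses: list = field(default_factory=list)

    def respond(self, author, text, stance):
        """Attach a supporting or disputing assertion and return it."""
        if stance not in STANCES:
            raise ValueError(f"unknown stance: {stance!r}")
        reply = Assertion(author, text)
        self.responses.append((stance, reply))
        return reply

    def tally(self):
        """Count direct responses by stance; judging them is left to readers."""
        counts = {stance: 0 for stance in STANCES}
        for stance, _ in self.responses:
            counts[stance] += 1
        return counts
```

The tally is only raw material. Estimating the value of a document from it stays with the individual reader, which is the point of punting on automatic verification.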

Here are two analogies:
  1) "Decorative" tags vs. "Structure" tags.
     In DocBook, these are called "inline" tags (like bold and italic)
     vs. "block" tags (like sect1 and sect2). One thing that XML does
     *not* give me is a good way to make a clean separation between
     those two. That distinction is important, too, for two major
     reasons:
       a) When displaying a document, I want to know which elements
          belong in the outline (table of contents, tree view) and
          which elements belong only in the content-display.

       b) For structure elements, the sub-structure should always
          consist of (1) content -- any combination of text and
          decorative elements -- *followed* by structure elements.
          In other words, any structure element can have one piece
          of content, followed by substructure elements, and there
          is never any overlap between them. XML gives me no such
          mechanism. (The DocBook solution is to define a <title>
          element for each <sectN>. That introduces two tags where
          only one is really needed, and complicates the processing.)

The point of this analogy is that I frequently want to separate
structure from content, so I can treat them separately.
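
Since the markup gives me no way to state the rule in (b), it would have to be checked in the application. Here is a rough sketch of such a check, assuming a hand-maintained set of structure tags (here just sect1 and sect2) and treating every other element as decorative. The function name and the tag set are illustrative only.

```python
import xml.etree.ElementTree as ET

STRUCTURE_TAGS = {"sect1", "sect2"}  # assumption: everything else is decorative

def content_precedes_structure(elem):
    """True if, inside elem, all text and decorative elements come
    before the first structure element, recursively for substructure.
    Decorative elements are not descended into.
    """
    seen_structure = False
    for child in elem:
        if child.tag in STRUCTURE_TAGS:
            seen_structure = True
            if not content_precedes_structure(child):
                return False
            # child.tail is the text that follows the child's closing tag.
            if child.tail and child.tail.strip():
                return False  # stray content after a structure element
        elif seen_structure:
            return False  # decorative element after a structure element
    return True
```

For example, `<sect1>Intro with <b>bold</b> text<sect2>Sub</sect2></sect1>` passes, while `<sect1>Intro<sect2>Sub</sect2>trailing text</sect1>` does not.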

  2) The second analogy is in the graphic representation of computer
     programs. In graphics, hierarchy is expressed by "diving in".
     You look inside a graphic object to see what it contains. Here
     again, I need a distinction between control elements and normal
     statements. The reason: graphical representation of a = b;
     does me no good whatsoever. It consumes space for the graphics
     that has no value whatsoever for understanding the program.

     Graphical representations of programs, therefore, need to stop
     at the control-flow statements. A graphical representation of
     all the if, for, and case statements in a program may be of use.
     In any one block, though, a simple listing of the normal
     statements is sufficient.

I see the same issue with respect to knowledge representation.
Attempting to solve the whole problem by representing "tree", "apple",
"green", "red", etc. is just too hard. Let the human interpret the
meaning of the words. But there is an underlying structure that it makes
sense to automate. Perhaps it is Noam Chomsky's deep structure, or
perhaps a logic model, or perhaps one of several representations as
identified by Minsky. If we can construct a system within which we can
model those relationships and reason about them, we can make a ton of
progress without having to make a computer into a "thinking" machine.

> ...Any DKR/OHS will need to be more
> than a bunch of passive data in a database. It will need many programs
> to do things to that data to make new data (search, format, summarize,
> repackage, interpret, transfer, upgrade, etc.).
> A more important issue than data transmission format (the one XML
> tries to address) is to build a robust platform for doing those
> algorithmic things.
>
Oddly enough, I haven't seen that the knowledge repository needs a lot
of functionality. I've been looking for it, but most of the operations
you mention I see as either aspects of the UI (like searching) or
operations best conducted by the user (summarizing).

> ...As a deeper approach, one tries to represent the knowledge and
> algorithms in an abstract enough way as to be ideally programming
> language neutral, or at least programming language retargetable
> (generating whatever code in whatever language as needed).
>
This would of course be ideal, assuming that the manipulations need to
be part of the repository system. I am as yet unconvinced that they have
to be, but I am open to argument on the subject.

(thanks for another great, thought-provoking note.)

This archive was generated by hypermail 2b29 : Wed Apr 05 2000 - 15:30:18 PDT