First, some general notes:
* At the moment, I'm focusing on in-memory data structures
and the required manipulation mechanisms. I'm not focusing
on representation of the information until after there is a
sense of "completeness" in the internal data model. That's
the point at which it will become clear whether or not XML
will work as an external representation.
* The major problem with XML that I see at the moment is
data export. Given that I have a massively-interconnected
graph of information nodes, any one "slice" of those nodes
may constitute a document. That document can be represented
in XML. Differences between documents and messages that
transmit the differences can also be represented in XML.
But if you wanted to export the repository so you could
import it into another system, would XML be very useful?
* Thanks for the new and repeated references. They're on my
priority reading list. (Which is to say I'll probably be
able to get to them by next month...)
Paul Fernhout wrote:
> My longer point is that the knowledge management / representation
> problem is a deep one, and XML doesn't address it in a serious way,
> and confuses the subject by the hype making it sound like XML does
> address the topic of knowledge representation in a serious way.
Hmmm. I never had that impression. I got that if I have data, I can
represent it in XML -- especially if the data is structured. What I
keep wrestling with is that any individual *view* of the data benefits
from hierarchy -- it helps to organize the info and orient the reader.
But the underlying data is a multi-connected graph, not a hierarchy.
So maybe what's really needed is:
                                  +- GUI operations
?repository? --> XML-based view --+--> HTML representation
                                  +--> PDF representation
Identifying the structure of the repository is my major quest at
the moment.
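
A minimal sketch of the pipeline in the diagram above, assuming a toy repository held as a Python dict. The names `REPO`, `xml_view`, and `to_html` are hypothetical, and the rule used here (expand a node the first time it is reached, emit a `<ref>` afterwards) is only one possible way to flatten a multi-connected graph into a hierarchical view:

```python
import xml.etree.ElementTree as ET

# Hypothetical repository: node id -> (text, ids of linked nodes).
# The links form a graph, not a tree (note the cycle a -> c -> a).
REPO = {
    "a": ("Knowledge repositories", ["b", "c"]),
    "b": ("XML as a view format", ["c"]),
    "c": ("Graph-structured data", ["a"]),
}


def xml_view(repo, root_id):
    """Build a hierarchical XML view of the graph, starting at root_id.

    A node is expanded the first time it is reached; any later
    encounter becomes a <ref> element, so the tree stays finite.
    """
    seen = set()

    def build(node_id):
        if node_id in seen:
            return ET.Element("ref", {"to": node_id})
        seen.add(node_id)
        text, links = repo[node_id]
        elem = ET.Element("node", {"id": node_id})
        elem.text = text
        for target in links:
            elem.append(build(target))
        return elem

    return build(root_id)


def to_html(elem):
    """Render the XML view as a nested HTML list item."""
    if elem.tag == "ref":
        target = elem.get("to")
        return f'<li><a href="#{target}">see {target}</a></li>'
    children = "".join(to_html(child) for child in elem)
    nested = f"<ul>{children}</ul>" if children else ""
    return f'<li id="{elem.get("id")}">{elem.text}{nested}</li>'


view = xml_view(REPO, "a")
print(ET.tostring(view, encoding="unicode"))
print("<ul>" + to_html(view) + "</ul>")
```

Starting the view at a different node yields a different hierarchy over the same graph, which is the sense in which each view is only one "slice" of the repository.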
> Squeak, Python, Common Lisp (less so) are interesting choices.
> I'm starting to think Squeak might be the best choice for prototyping
> (for me) given that it is completely cross-platform and open. Its
> cross-platform GUI does the best job of addressing the DKR design
> requirement of shareable screens.
Can you tell me more about Squeak (again), and why I'm going to like it,
and where to find it?
> ...at a talk last year by Marvin Minsky he went on at
> length about the need for multiple representational strategies for
> problem solving. He argued the human mind may perceive problems using
> five or six strategies (ex. geometrical reasoning, formal logic,
> heuristic rules of thumb, pattern recognition, semantic networks,
> others) and continuously picks the best one at the time to progress in
This seems fundamental. Has he written this up anywhere, to your
knowledge?
> Maybe what we need is an overview of the AI and knowledge management
> fields and how each area or major problem/topic would affect a
That strikes me as a profitable enumeration of issues.
Any thoughts on how we should get started?
> Also, what will evolve over time for an OHS/DKR project is a set of
> useful code that can manipulate data structures that are related to
> knowledge representation. We might also wish to have a survey of such
> existing code.
Yeah. I started the reference list with things like that in mind.
I've fallen behind in keeping the list up to date, much less producing
even preliminary evaluations of different papers. I've seen a lot of
stuff that doesn't excite me. IBIS was a notable exception. This is an
area where we desperately need even a preliminary DKR, so we can track
evaluations of different papers, and start sorting them by relevance and
other criteria (like readability and explanatory power).
> ...as time goes on, any restrictions will become obsolete.
> One needs a representational system that can adapt to user needs.
Can you give an example of that? Something simple will do. Maybe my
sixth grade view of physics vs. my college-level view, for example.
Does that make sense? (A specific adaptation would be even better.)
> while XML, could be a part of that solution, the important issues go
> beyond that -- to standards creation and revision and communication,
> and to coin a phrase "data upgrading".
I understand about standards creation. That's where the interesting work
is going on even as we speak. I don't see how revision and communication
go beyond XML. And I'm not sure what you mean by data upgrading. Can you
> The deeper issue is that rather than focus on ways to limit
> representations (DTDs) we need to focus on ways to transform, extend,
> and simplify representations as needed (sort of along the multi-level
> approach I mentioned earlier).
As I mentioned, DTDs only give you minimal validation. Like Lisp or
Smalltalk apps, the "interesting" validation will probably occur within
the context of the app -- as long as you are doing "interesting"
However, I think the better strategy is to punt on that issue. I'm not
interested in AI-level reasoning about statements like "Horses fly". I
am totally uninterested in any sort of automatic verification for such
I am interested in one person having the ability to assert "Horses fly",
another person to argue against it, and for individuals to estimate the
value and usefulness of a document based on the assertions it contains.
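
A minimal sketch of that idea, assuming hypothetical `Assertion` and `Document` classes. The system stores who asserted what, who argued against it, and how readers rated it. It makes no attempt to decide whether "Horses fly" is true, and the averaging rule in `estimated_value` is just a placeholder:

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    author: str
    text: str
    # Other Assertion objects arguing against this one.
    objections: list = field(default_factory=list)
    # Reader name -> score between 0.0 (worthless) and 1.0 (valuable).
    ratings: dict = field(default_factory=dict)

    def rate(self, reader, score):
        self.ratings[reader] = score

    def mean_rating(self):
        """Average reader score, or None if nobody has rated it yet."""
        if not self.ratings:
            return None
        return sum(self.ratings.values()) / len(self.ratings)


@dataclass
class Document:
    title: str
    assertions: list = field(default_factory=list)

    def estimated_value(self):
        """Average of the mean ratings of the rated assertions."""
        scores = [a.mean_rating() for a in self.assertions]
        scores = [s for s in scores if s is not None]
        return sum(scores) / len(scores) if scores else None


claim = Assertion("alice", "Horses fly")
rebuttal = Assertion("bob", "Horses have no wings")
claim.objections.append(rebuttal)

claim.rate("bob", 0.0)
claim.rate("carol", 0.25)

doc = Document("Notes on horses", [claim])
print(doc.estimated_value())
```

All of the judgment stays with the people: the code only records assertions, objections, and ratings, and aggregates them per document.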
Here are two analogies:
1) "Decorative" tags vs. "Structure" tags.
In DocBook, these are called "inline" tags (like bold and italic)
vs. "block" tags (like sect1 and sect2). One thing that XML does
*not* give me is a good way to make a clean separation between
those two. That distinction is important, too, for two major reasons:
a) When displaying a document, I want to know which elements
belong in the outline (table of contents, tree view) and
which elements belong only in the content-display.
b) For structure elements, the sub-structure should always
consist of (1) content -- any combination of text and
decorative elements -- *followed* by structure elements.
In other words, any structure element can have one piece
of content, followed by substructure elements, and there
is never any overlap between them. XML gives me no such
mechanism. (The DocBook solution is to define a <title>
element for each <sectN>. That introduces two tags where
only one is really needed, and complicates the processing.)
The point of this analogy is that I frequently want to separate
structure from content, so I can treat them separately.
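
A minimal sketch of the constraint described in (b), assuming hypothetical `Inline` and `Section` classes: each structure element holds exactly one piece of content (text plus decorative elements) followed by its substructure, so the two can never interleave, and the outline in (a) falls out of a walk over the structure alone:

```python
from dataclasses import dataclass, field


@dataclass
class Inline:
    """A run of text with an optional decorative style."""
    text: str
    style: str = ""  # e.g. "", "bold", "italic"


@dataclass
class Section:
    """A structure element: one piece of content, then substructure."""
    content: list                                   # list of Inline
    children: list = field(default_factory=list)    # list of Section

    def plain_content(self):
        return "".join(piece.text for piece in self.content)


def outline(section, depth=0):
    """Return the tree view: one indented line per structure element."""
    lines = ["  " * depth + section.plain_content()]
    for child in section.children:
        lines.extend(outline(child, depth + 1))
    return lines


doc = Section(
    [Inline("Data "), Inline("model", "bold")],
    [
        Section([Inline("Nodes")]),
        Section([Inline("Links")], [Section([Inline("Typed links")])]),
    ],
)
print("\n".join(outline(doc)))
```

Because decorative elements live only inside `content`, the outline code never has to ask whether a given tag is "inline" or "block".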
2) The second analogy is in the graphic representation of computer
programs. In graphics, hierarchy is expressed by "diving in".
You look inside a graphic object to see what it contains. Here
again, I need a distinction between control elements and normal
statements. The reason: graphical representation of a = b;
does me no good whatsoever. It consumes space for the graphics
that has no value whatsoever for understanding the program.
Graphical representations of programs, therefore, need to stop
at the control-flow statements. A graphical representation of
all the if, for, and case statements in a program may be of use.
In any one block, though, a simple listing of the normal
statements is sufficient.
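
A minimal sketch of the second analogy, using Python's standard `ast` module (and `ast.unparse`, which needs Python 3.9 or later). The function name `flow_outline` and the sample program are hypothetical. Control statements become bracketed "boxes" that can be nested, and the normal statements in each block are simply listed as text:

```python
import ast

SOURCE = '''
def total(items):
    count = 0
    result = 0
    for item in items:
        if item > 0:
            result = result + item
            count = count + 1
        else:
            count = count - 1
    return result
'''

# Statement types treated as control elements in this sketch.
CONTROL = (ast.If, ast.For, ast.While)


def flow_outline(body, depth=0):
    """List a block: control statements as boxes, the rest as plain text."""
    pad = "  " * depth
    lines = []
    for stmt in body:
        if isinstance(stmt, CONTROL):
            lines.append(f"{pad}[{type(stmt).__name__.lower()}]")
            lines.extend(flow_outline(stmt.body, depth + 1))
            if stmt.orelse:
                lines.append(f"{pad}[else]")
                lines.extend(flow_outline(stmt.orelse, depth + 1))
        else:
            # A normal statement: no graphics, just its source text.
            lines.append(pad + ast.unparse(stmt))
    return lines


func = ast.parse(SOURCE).body[0]
print("\n".join(flow_outline(func.body)))
```

Only the bracketed lines would need a graphical treatment; everything else stays a simple listing inside its block.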
I see the same issue with respect to knowledge representation.
Attempting to solve the whole problem by representing "tree", "apple",
"green", "red", etc. is just too hard. Let the human interpret the
meaning of the words. But there is an underlying structure that it makes
sense to automate. Perhaps it is Noam Chomsky's deep structure, or
perhaps a logic model, or perhaps one of several representations as
identified by Minsky.
If we can construct a system within which we can model those
relationships and reason about them, we can make a ton of progress
without having to make a computer into a "thinking" machine.
> ...Any DKR/OHS will need to be more
> than a bunch of passive data in a database. It will need many programs
> to do things to that data to make new data (search, format, summarize,
> repackage, interpret, transfer, upgrade, etc.).
> A more important issue than data transmission format (the one XML
> tries to address) is to build a robust platform for doing those
> algorithmic things.
Oddly enough, I haven't seen that the knowledge repository needs a lot
of functionality. I've been looking for it, but most of the operations
you mention I see as either aspects of the UI (like searching) or
operations best conducted by the user (summarizing).
> ...As a deeper approach, one tries to represent the knowledge and
> algorithms in an abstract enough way as to be ideally programming
> language neutral or at least programming language retargetable
> (generating whatever code in whatever language as needed).
This would of course be ideal, assuming that the manipulations need to
be part of the repository system. I am as yet unconvinced that they have
to be, but I am open to argument on the subject.
(thanks for another great, thought-provoking note.)
This archive was generated by hypermail 2b29 : Wed Apr 05 2000 - 15:30:18 PDT