RE: [unrev-II] Augment + categories = OHS v0.1

From: Warren Stringer
Date: Sat Jun 24 2000 - 00:01:35 PDT

    A) I would like to tender an opinion from the following vantage
        1) Since 1976 I've designed and/or written about 5 DBMSs
            a) three of which were full-on network DBMSs
            b) two of which were hierarchical DBMSs
        2) I am stronger in parallel, visual, depth-first, and syntax
        3) I am weaker in linear, auditory, breadth-first, and semantics
        4) I see knowledge management as a compression problem
            a) how to move entropy through the narrowest channel
            b) Huffman codes are a good starting place

    B) Some principles I would like to toss out (using myself as a use case)
        1) we should support people who are strong in my strong areas (see above)
            a) which implies unlabeled nodes and arcs
        2) we should support people who are strong in my weak areas (see above)
            b) which implies labeled nodes and arcs
        3) assume that we have unlimited resources to implement
            a) so, assume that we are writing from scratch
            b) assume that the system self improves
                1) so a kludge of open code can evolve towards code written from
                2) is this possible in the real world?
                3) alternative is to choose real world compromises
                    a) starting from ideal design
        5) design and deploy the simplest case first
            a) add more complex case as second tier
            b) allow a gentle learning curve
                1) for future practitioners
                2) for old practitioners who have low semantic persistence
        6) human augmentation systems should map easily between human and system
            a) take into account human limitations in assimilating information
                1) such as GOMS and 7±2 short term memory constraints
            b) software should map easily to wetware
                1) information structures should map easily to brain structures
                2) facilitate dialog between designers and researchers
                    a) such as software designers and brain research
        7) try to keep a list of principles to 7 or fewer.

    So, my proposal is to keep nodes and arcs pure. This yields about 5-10K of
    source code that deals with issues like: how to attach nodes together, how
    to represent an arc (as a node with a single predecessor and single
    successor -- for those who enjoy paradox). This represents about 3-6 months
    of work to implement, or a couple weeks of learning curve.
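
A minimal sketch of the "pure" node-and-arc idea above, in Python. The names `Node`, `attach`, and `link` are hypothetical and chosen only for illustration; the message itself gives no code. It assumes an arc is simply another node that sits between two nodes, so that it has exactly one predecessor and one successor.

```python
class Node:
    """A pure node with an optional label. An arc is also a Node."""

    def __init__(self, label=None):
        self.label = label        # None models an unlabeled node or arc
        self.predecessors = []    # nodes wired into this one
        self.successors = []      # nodes this one is wired into

    def is_arc(self):
        # Structural test taken from the message: an arc is a node with
        # a single predecessor and a single successor.
        return len(self.predecessors) == 1 and len(self.successors) == 1


def attach(src, dst):
    """Low-level attachment: wire src directly to dst."""
    src.successors.append(dst)
    dst.predecessors.append(src)


def link(src, dst, label=None):
    """Connect src to dst through a new arc-node and return the arc."""
    arc = Node(label)
    attach(src, arc)
    attach(arc, dst)
    return arc


a, b = Node("a"), Node("b")
arc = link(a, b)        # an unlabeled arc between two labeled nodes
print(arc.is_arc())     # True
print(a.is_arc())       # False: "a" has no predecessor
```

Note that any node in the middle of a simple chain also passes `is_arc()` under this structural test, which is presumably the paradox being enjoyed.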

    From the perspective of compression, language can be seen as a somewhat
    static network of probabilities. At the lowest level (from an ASCII
    standpoint) you have the probabilities of letters -- which is handled quite
    nicely by Huffman codes. By adding a priori probabilities, you can even
    improve Huffman codes by about 30% -- you know: given the letter Q, the most
    likely successor letter is U ... Words could be treated as the next level
    up: as an aggregation of letter nodes that determine the probabilities
    of the next aggregation of letter nodes. Ideas are the next step up from
    that. So, the phrase "the quick brown ___" should yield "fox", for most
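
As a rough illustration of the letter-level case described above, here is a small Huffman coder in Python. It is a sketch under simple assumptions: symbol frequencies are counted from the input text itself, and the function names (`huffman_codes`, `encode`, `decode`) are hypothetical. It does not model the a priori successor probabilities (such as U following Q) mentioned above; those would need a separate code table per preceding letter.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a {symbol: bitstring} table built from symbol counts in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:       # codes are prefix-free, so this is unambiguous
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


sample = "the quick brown fox"
table = huffman_codes(sample)
bits = encode(sample, table)
print(len(bits), "bits versus", 8 * len(sample), "bits as 8-bit characters")
print(decode(bits, table) == sample)
```

Frequent symbols end up with shorter codes, which is the sense in which the code table is itself a small network of probabilities.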

    Why do I bring this up? Well, compression can be thought of as an operation upon
    a set of nodes with probabilities between the arcs. Thus, language can be
    thought of as a fairly static set of nodes with a commonly accepted range of
    probabilities between the arcs. After hearing Jeff Rulifson talk about the
    compiler-compiler in his part of the '68 Demo video, it would make sense to
    support a future DKR of DKRs. In this case, a "Categorized" node would
    represent a relationship between two network topologies: 1) a fairly static
    language topology and 2) a fairly volatile dialogue topology. Does this
    imply that we need to create a special arc type between topologies? The
    answer is "no no no!" (heheh) All we need is to perform a query on arc-nodes
    that have arcs to the "language topology" node. This being a very common
    query, one would hope that the code for such would compress into a very
    small instruction set -- a very nice use for the Transmeta code-morphing
    paradigm -- but I digress once too often.
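
The query described above (find the arc-nodes that themselves have an arc to the "language topology" node) might look something like the following Python sketch. Everything here is an assumption made for illustration: the `Node` class is a self-contained variant that stores an arc's two endpoints explicitly, and the labels other than "language topology", "dialogue topology", and "Categorized" are invented sample data.

```python
class Node:
    """A node; when source and target are set, it is acting as an arc."""

    def __init__(self, label=None, source=None, target=None):
        self.label = label
        self.source = source     # set only on arc-nodes
        self.target = target     # set only on arc-nodes
        self.out_arcs = []       # arc-nodes that leave this node

    @property
    def is_arc(self):
        return self.source is not None and self.target is not None


def link(src, dst, label=None):
    """Create an arc-node from src to dst; no special arc types needed."""
    arc = Node(label, src, dst)
    src.out_arcs.append(arc)
    return arc


def arcs_pointing_to(nodes, topology):
    """Return arc-nodes that have an outgoing arc ending at topology."""
    return [n for n in nodes
            if n.is_arc and any(a.target is topology for a in n.out_arcs)]


language = Node("language topology")
dialogue = Node("dialogue topology")
email = Node("email")                # hypothetical sample node
word = Node("fox")                   # hypothetical sample node
reply = Node("reply")                # hypothetical sample node

categorized = link(email, word, "Categorized")
replied = link(reply, email, "ReplyTo")
tag_language = link(categorized, language)   # plain arc, no special type
tag_dialogue = link(replied, dialogue)

everything = [language, dialogue, email, word, reply,
              categorized, replied, tag_language, tag_dialogue]
found = arcs_pointing_to(everything, language)
print([n.label for n in found])      # ['Categorized']
```

The point of the sketch is only that the cross-topology relationship falls out of an ordinary query over ordinary arcs.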

    Popping these digressions back to the task at hand. I would suggest that we
    treat nodes, arcs, and categories separately. For now, I would suggest a simple
    keyword index to everything in the DKR. An email node with category "Foo"
    could be structured along the lines of a
        tree with the following nodes
            "Category", "Email-ID", "Transaction"
            "Foo", "ID103040", <link Foo Category Email-ID>
    Yawn, well it's late and I'm making up syntax in my sleep -- my apologies.
    At least I can spare you all from any more rant, alluded to by the above
    principles, that were merely iterated without elucidation.
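
For what it is worth, the simple keyword index suggested above could be sketched in Python roughly as follows. The `KeywordIndex` class and its methods are hypothetical, and reading the made-up `<link Foo Category Email-ID>` syntax as a plain tuple is a guess, not something the message specifies.

```python
from collections import defaultdict


class KeywordIndex:
    """Map each keyword to the set of node IDs filed under it."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, node_id, keywords):
        for keyword in keywords:
            self._index[keyword.lower()].add(node_id)

    def lookup(self, keyword):
        # .get() avoids creating empty entries for unknown keywords
        return sorted(self._index.get(keyword.lower(), set()))


# The email node from the message, held as a flat record. The
# "Transaction" value is one guessed reading of <link Foo Category Email-ID>.
email_node = {
    "Category": "Foo",
    "Email-ID": "ID103040",
    "Transaction": ("link", "Foo", "Category", "Email-ID"),
}

index = KeywordIndex()
index.add(email_node["Email-ID"], [email_node["Category"]])
print(index.lookup("foo"))    # ['ID103040']
print(index.lookup("bar"))    # []
```

Keeping the index outside the nodes means categories can change without touching nodes or arcs, which matches the suggestion to treat the three separately.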

    (Hey! Another thought: perhaps we could entice Joe to expand his glossary a
    bit to -- say -- the English language and all its metrics? heheh)

    Warren Stringer

    "no node knows another like another knows node, no?"
    -- nobody in particular
      -----Original Message-----
      From: Eric Armstrong
      Sent: Friday, June 23, 2000 7:00 PM
      Subject: Re: [unrev-II] Augment + categories = OHS v0.1

      Jack Park wrote:
    > ... Gil uses the term "rigidify." That works for me, but there
    > are other points of view as well. At issue is the fact that we
    > all categorize the world in our own way. Production-line education
    > tends to enforce standardization in that arena, but we are still
    > individuals with our own non-linearities and so forth.
      Ah... Now I understand the point that Gil was trying to make.
      Yes, this is a system usage issue. The larger the system gets,
      the more rigid the categories become -- to the degree that they
      become standards. To the degree they don't, similar and redundant
      categories are continually added to the system.

      On the other hand, categories with various "shades of meaning"
      might even be useful. If someone develops a formulation for
      defining near-equivalences, of the form:
        "hyper" = 90% match with "intense"
                = 80% match with "over the top"
                = xx% match with conceptX

      Then some interesting fuzzy search capabilities begin to be
      possible. I don't intend to work on that layer of the system,
      but it is interesting that the foundation we are building may
      just enable it.
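
A minimal Python sketch of how such near-equivalences might drive a fuzzy search. The 90% and 80% scores are the ones from the example above; the table layout, the function names (`expand`, `fuzzy_search`), and the sample messages are all assumptions made for illustration.

```python
# Near-equivalence table: term -> {similar category: match score}.
NEAR = {
    "hyper": {"intense": 0.90, "over the top": 0.80},
}


def expand(term, threshold):
    """Return {category: score} for term plus its close enough matches."""
    matches = {term: 1.0}               # a term always matches itself
    for other, score in NEAR.get(term, {}).items():
        if score >= threshold:
            matches[other] = score
    return matches


def fuzzy_search(term, documents, threshold=0.8):
    """Rank documents by their best-scoring category match for term."""
    candidates = expand(term, threshold)
    results = []
    for doc_id, categories in documents.items():
        best = max((score for cat, score in candidates.items()
                    if cat in categories), default=0.0)
        if best > 0:
            results.append((doc_id, best))
    # Highest score first; ties broken by document ID for stable output.
    return sorted(results, key=lambda r: (-r[1], r[0]))


# Hypothetical messages, each tagged with a set of categories.
docs = {
    "msg-1": {"intense"},
    "msg-2": {"over the top"},
    "msg-3": {"hyper"},
    "msg-4": {"calm"},
}
print(fuzzy_search("hyper", docs, threshold=0.85))
# [('msg-3', 1.0), ('msg-1', 0.9)]
```

Lowering the threshold to 0.8 would also bring in "msg-2", so the threshold acts as the dial between exact and fuzzy matching.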

      --As you point out, there is still the problem of mapping from
      *my* concepts to some "shared" conceptual framework out there.

    > The fundamental architecture being espoused within the meeting
    > was that of an engine that mutates original documents by adding
    > links to them. The fundamental approach taken in the architecture
    > I present here is one in which absolutely no modifications are
    > ever performed on original documents. All linkages are formed
    > "above" the permanent record of human discourse and experience.
    > I strongly believe that the extra effort required to avoid
    > building a system that simply plays with original documents will
    > prove to be of enormous value in the larger picture.
      This idea deserves careful consideration. I have a suspicion you
      may be right about that. Our talks about how to use Wiki effectively
      have really centered on how we control modifications to underlying
      documents. I haven't come at things from the perspective you
      suggest. It's time to take a detailed look at that approach, I think.
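
One way to picture the "links above the documents" approach is a separate link layer that refers to spans of text by position and never writes to the originals. The Python sketch below is only an illustration under that assumption; `Link`, `LinkLayer`, the character-offset addressing, and the sample documents are all hypothetical.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Link:
    source_doc: str
    start: int          # character offsets into the source document
    end: int
    target_doc: str
    label: str = ""


class LinkLayer:
    """Holds links "above" a read-only set of original documents."""

    def __init__(self, documents):
        # Read-only view of a private copy: originals are never modified.
        self._docs = MappingProxyType(dict(documents))
        self._links = []

    def add_link(self, source_doc, start, end, target_doc, label=""):
        text = self._docs[source_doc]
        if not 0 <= start < end <= len(text):
            raise ValueError("span lies outside the source document")
        if target_doc not in self._docs:
            raise KeyError(target_doc)
        link = Link(source_doc, start, end, target_doc, label)
        self._links.append(link)
        return link

    def links_from(self, doc_id):
        return [l for l in self._links if l.source_doc == doc_id]

    def anchor_text(self, link):
        return self._docs[link.source_doc][link.start:link.end]


originals = {
    "doc-a": "Categories tend to rigidify as a system grows.",
    "doc-b": "Notes on how categories become standards.",
}
layer = LinkLayer(originals)
ref = layer.add_link("doc-a", 0, 10, "doc-b", label="see also")
print(layer.anchor_text(ref))    # Categories
```

Offset-based anchors only stay valid if the originals really are permanent, which is the property the quoted architecture insists on.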

      Also: I'm delighted that we're not going for a full ontology in
      version 1. Yay! But I am equally delighted that the system we seem to
      be zeroing in on may help provide a basis for that work. Life should
      be interesting

    This archive was generated by hypermail 2b29 : Sat Jun 24 2000 - 00:12:10 PDT