
Re: Deep contexts WAS: Re: [ba-ohs-talk] Re: A modest proposal


Let me try adding some more explanation to that.

There's a regress problem.
The regress can be stopped by using PSIs.
That's fine for indexing, but it also throws away a lot of information
that could be very useful in KR, imho.
So I want to find a way to halt the regress and preserve that extra
information, and maintain the use of PSIs (because they
are quite handy at one level).
Ergo, my proposals.
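
To make the regress concrete, here is a minimal Python sketch. It is a toy
model only, not anything defined in XTM 1.0 or TMPM4: the Topic class, the
should_merge and chain helpers, the street names, and the example.org PSI
address are all hypothetical, and each topic is assumed to have exactly one
base name and at most one scoping topic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Topic:
    """Toy topic: one base name, at most one scoping topic, optional PSI."""
    name: str
    scope: Optional["Topic"] = None  # None stands for the unconstrained scope
    psi: Optional[str] = None        # hypothetical published subject indicator


def should_merge(a: Topic, b: Topic, depth: int = 0) -> Tuple[bool, int]:
    """Return (merge?, number of scope levels climbed to decide)."""
    # PSI shortcut: when both topics carry an indicator, that settles the
    # question and nothing further up the scope chain is examined.
    if a.psi is not None and b.psi is not None:
        return a.psi == b.psi, depth
    if a.name != b.name:
        return False, depth
    if a.scope is None or b.scope is None:
        # Both unconstrained: the names decide. Only one: the scopes differ.
        return a.scope is b.scope, depth
    # Same name, both scoped: the scopes must themselves be "the same",
    # which asks the same question one level up (the regress).
    return should_merge(a.scope, b.scope, depth + 1)


def chain(names: List[str], psi_for: Optional[Dict[str, str]] = None) -> Topic:
    """Build a topic scoped by the next name in the list, and so on."""
    psi_for = psi_for or {}
    topic = None
    for name in reversed(names):
        topic = Topic(name, scope=topic, psi=psi_for.get(name))
    return topic


# Without PSIs the comparison climbs the whole scope chain.
t1 = chain(["Cecil the cat", "house 51", "Acacia Road"])
t2 = chain(["Cecil the cat", "house 51", "Acacia Road"])
print(should_merge(t1, t2))  # (True, 2)

# With a shared PSI on 'house 51' the climb stops after one level, so the
# differing streets above it are never looked at.
psi = {"house 51": "http://example.org/psi/house-51"}
t3 = chain(["Cecil the cat", "house 51", "Acacia Road"], psi)
t4 = chain(["Cecil the cat", "house 51", "Elm Street"], psi)
print(should_merge(t3, t4))  # (True, 1)
```

In this sketch the PSI check is what halts the climb, and the second case
shows the cost: whatever sits above the PSI-bearing scope plays no part in
the decision.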

HTH,
Peter

----- Original Message -----
From: "Peter Jones" <ppj@concept67.fsnet.co.uk>
To: <ba-ohs-talk@bootstrap.org>
Sent: Sunday, March 24, 2002 9:21 PM
Subject: Re: Deep contexts WAS: Re: [ba-ohs-talk] Re: A modest proposal


> Hi Joe
>
> That part of the argument was merely expository, not real. The merge is
> never automagic under that rule. It is a logical regress.
>
> The _real_ set of rules in TMPM4 relies on PSIs, which take care of that
> problem.
>
> All I am trying to do in relation to topic maps is strengthen the data
> model so that it will better support KR.
> For the KR aims I have in mind, my view is that the data model for Topic
> Maps needs a slight modification, and also I don't think the merger
> rules in TMPM4 are strong enough.
>
> It might not change the fact that one might have to manually merge,
> say, 4 topics in a million, but it might prevent one having to manually
> untangle 100,000 in a million.
> That's the plan anyway.
>
> Cheers,
> --
> Peter
>
>
> ----- Original Message -----
> From: "Joe D Williams" <JOEDWIL@earthlink.net>
> To: <ba-ohs-talk@bootstrap.org>
> Sent: Saturday, March 23, 2002 1:49 PM
> Subject: Re: Deep contexts WAS: Re: [ba-ohs-talk] Re: A modest proposal
>
>
> > > What are the scopes of S1 and S2's baseNames though? Well, they appear
> > > to have the same baseName in the same scope so maybe they
> > > should be merged too? And so on.
> > > Is there an end to the regress?
> >
> > If the merge is not automagic, then
> > You do the merge manually and submit the result to the group.
> > If accepted, it becomes the new mapping.
> > If the merge is automagic, then
> > You untangle the merge and submit the result to the group.
> > If accepted, it becomes the new mapping.
> >
> > Best Regards,
> > Joe
> >
> >
> >
> >
> >
> > ----- Original Message -----
> > From: "Peter Jones" <ppj@concept67.fsnet.co.uk>
> > To: <ba-ohs-talk@bootstrap.org>
> > Sent: March 23, 2002 5:29 AM
> > Subject: Deep contexts WAS: Re: [ba-ohs-talk] Re: A modest proposal
> >
> >
> > > Hi Murray,
> > >
> > > OK, let me try and explain the idea of deep contexts some more first,
> > > then I hope to be able to present the argument as to why
> > > the division of the example topic you've provided into 4 topics is not
> > > really making any great change for the user, although it
> > > does make a big change behind the scenes for processing and system
> > > significance.
> > >
> > > {Warning: Very long post riding roughshod over sensitive terrain in
> > > places.}
> > >
> > > Deep contexts:
> > > In XTM 1.0 and ISO13250 there was a 'catch' with scoping topics.
> > > Imagine you have TM1 that contains a topic, T1, 'Cecil the cat', in
> > > scope S1, 'house 51', and a TM2 that contains a topic, T2, 'Cecil the
> > > cat', in scope S2, 'house 51'.
> > > TM1 and TM2 are separate documents ostensibly about different subjects.
> > > I load TM1 into my TM processor. I surf its contents and via one of its
> > > occurrences I am accidentally led to click on a link that imports TM2
> > > into the processor.
> > > T1 and T2 appear to be about the same thing, and they appear to be in
> > > the same scope. So let's try merging them.
> > > Topic maps say that the same baseName in the same scope indicates a
> > > topic that should be merged.
> > > So it looks like T1 and T2 should be merged. But how do we determine
> > > whether S1 and S2 are in fact the same scope?
> > > Well, they appear to have the same baseName in the same scope, so maybe
> > > they should be merged too.
> > > What are the scopes of S1 and S2's baseNames though? Well, they appear
> > > to have the same baseName in the same scope so maybe they
> > > should be merged too? And so on.
> > > Is there an end to the regress?
> > > There are three methods, which can be combined.
> > > 1-  Restrict the application TM set so that you know that there is an
> > > upper bound in advance.
> > > 2-  Don't look too closely at the scoped properties of scoping topics
> > > (or put otherwise, things aren't so bad if every topic has a baseName in
> > > the unconstrained scope or has a Subject that is a PSI - which is the
> > > same thing). We are "strongly encouraged to use common Published Subject
> > > Indicators" in TMPM4.
> > > On the subject of PSIs TMPM4 says, "Such organizations [that serve
> > > communities of interest] should commit themselves to preserving the
> > > longterm validity of the published addresses of such identity points, in
> > > order to protect the value and mergeability of the topic maps that use
> > > them."
> > > 3-  Ignore the Name merger rule and only go for merger when a topic
> > > Subject is the same resource address in both cases.
> > > Although cf. section 7 of TMPM4 at www.topicmaps.net
> > > "The Subject-based Merging Rule requires conforming topic map processing
> > > systems to merge t-nodes that are known to such systems to have the same
> > > subject, *on the basis of whatever information is available to them*. In
> > > addition, the Subject-based Merging Rule requires conforming topic map
> > > processing systems to conclude, on the basis of *certain conditions*,
> > > that two t-nodes have the same subject, and that they therefore must be
> > > merged into a single t-node." [my added emphasis]
> > > What are the conditions TMPM4 proposes? To quote at length:
> > >
> > > "Whenever two t-nodes both have identity points that are subject
> > > constituting resources, they must be merged if and only if the two
> > > subject constituting resources are known to the processing system to be
> > > one and the same resource, regardless of how that resource may have been
> > > differently addressed. In other words, merging is required if and only
> > > if the two addresses are known to the processing system to be
> > > equivalent.
> > >
> > > "All t-nodes have at least one subject indicator resource. (If nothing
> > > else, a t-node must at least have the syntactic construct that demanded
> > > its existence as one of its subject indicators.) Two t-nodes that do not
> > > have subject constituting resources shall be merged if and only if:
> > >
> > > either:
> > >
> > > "one of the two t-nodes has at least one subject indicator resource that
> > > is known to the processing system to be the same resource that serves as
> > > one of the subject indicators of the other t-node,
> > > [PPJ: and it seems to me that if the resource is within a topic map the
> > > TM1 vs. TM2 regress continues here.]
> > >
> > > or:
> > >
> > > "the two subject indicator resources indicating the subject are known
> > > (on account of machine intelligence or human intervention) to the
> > > processing system to describe the same subject.
> > > [PPJ: which some might say might be passing the buck a tad too much or
> > > encouraging an approach that has great risks for the integrity of the
> > > information. But perhaps that would be asking too much of XTM 1.0 here.]
> > >
> > > "For purposes of the Subject-based Merging Rule, it is irrelevant
> > > whether two subject indicator resources, or two subject constituting
> > > resources, contain the same data or are the same string. A simple string
> > > comparison of the two subject indicator resources is not, in the general
> > > case, a reliable indication of whether or not the same subject is being
> > > described. For example, different products in different sales catalogs
> > > may coincidentally have the same catalog number, and a comparison of the
> > > two catalog numbers does not indicate that they are the same product.
> > > Therefore, the Subject-based Merging Rule is not based on comparing the
> > > data content of the resources that serve as identity points. Merging
> > > must occur if and only if:
> > >
> > > "either both subject identity points are subject indicators, or both
> > > subject identity points are subject constituters (i.e., they can't be
> > > mixed), and
> > >
> > > "they are one and the same resource, meaning that they exist in exact
> > > same addressable context, even though there may be multiple different
> > > equivalent addressing expressions that can arrive at that same resource
> > > in that same addressable context."
> > > [PPJ: But note that this approach does not cure issues with polysemy
> > > where two topics with the same resource indicator and baseName
> > > nonetheless have different types. And up the scoping regress we go
> > > again.]
> > >
> > > All three of these approaches, imho, arise from a not altogether
> > > satisfactory limiting of the contextual information in terms of the
> > > depth of investigation you can make into the properties of a scoping
> > > topic.
> > > If in XTM 1.0 you could investigate the properties of such topics,
> > > instead of getting answers as you investigate more deeply into the
> > > scoping hierarchy you would get more questions - a negative regress.
> > > (Hold that thought for a moment.)
> > >
> > > Now, my objections (but see comment below these):
> > > Objection to 1: it prevents the TM system being opened out into
> > > the web as an open collaborative enterprise.
> > > Objection to 2: it's buck passing on to another less than satisfactory
> > > system, because, let's face it, agreeing on the words to use for
> > > something is a perennial human problem.
> > > Objection to 3: it's buck passing with less accuracy than in 2. You
> > > can't really nail _that much_ from an address on the web.
> > >
> > > If all I were speaking of were ISO13250 then I would admit that I am
> > > being far too harsh. The aims of ISO13250 were to index structured
> > > collections, and merge indexes. Indexing is not knowledge representation
> > > in the strict sense of both terms.
> > > So ISO13250 arguably does what it is supposed to.
> > > However, it appeared to me in XTM 1.0 meetings that for some the
> > > agenda went further, and that attempts were made to extend the game to
> > > KR.
> > > And then things became very strained and confusing and messy.
> > >
> > > But now, I still believe, after much thought, that XTM could be
> > > converted to do full blown KR, and that it might prove to be a solid
> > > lingua franca for such applications, and do so globally, and openly on
> > > the web.
> > > One would need to remove negative regressions and make topic property
> > > comparison stricter.
> > > Then as you investigate up the scoping hierarchy you would get answers,
> > > not questions, and the answers would run as far up the scheme of things
> > > as you needed to look to be sure that you were putting something in the
> > > right place.
> > > You would have what I am calling 'deep contexts'.
> > >
> > > Now, to come back to the 'splitting into four topics' issue. Given the
> > > above, I hope you can now see why there is at least an argument (and
> > > only an argument as yet!) for removing internal scoping from topics and
> > > restricting baseName cardinality, etc.
> > > Then, do you agree that if I split your topic into four, each with its
> > > own scope, then the only information that is lost is that they all
> > > relate to the same binding point?
> > > My answer to that is: in indexing maybe the binding point is a useful
> > > shorthand; in KR perhaps it creates too much implicit information that
> > > would be better held in explicit scoped associations.
> > >
> > > I hope that makes my approach somewhat clearer.
> > >
> > > The discussion and debate, as ever, much appreciated.
> > >
> > > --
> > > Peter
> > >
> > >
> > > ----- Original Message -----
> > > From: "Murray Altheim" <m.altheim@open.ac.uk>
> > > To: <ba-ohs-talk@bootstrap.org>
> > > Sent: Friday, March 22, 2002 5:23 PM
> > > Subject: Re: [ba-ohs-talk] Re: A modest proposal
> > >
> > >
> > > > Peter Jones wrote:
> > > >
> > > > > Yes, you would.
> > > > > > But then presumably if you were Swahili and wanted to look up
> > > > > > giraffes you would
> > > > > > be searching in Swahili and not in English.
> > > > > > Swahili would be a scope.
> > > >
> > > >
> > > > Yes, exactly.
> > > >
> > > >
> > > > > > Then it makes no difference to what the user would see, it only
> > > > > > enables
> > > > > > deep contexts within the system if that is desirable.
> > > >
> > > >
> > > > > But this is how topic maps work. Your use of "deep contexts" I
> > > > > don't quite understand.
> > > > >
> > > > > Now, for reasons I'll not elaborate, let's try an example of this
> > > > > with "elephant" in Zulu instead of a "giraffe" in Swahili (okay,
> > > > > I could locate a speaker of Zulu but not Swahili on short notice,
> > > > > and "giraffe" isn't translated into Japanese or Korean AFAIK). The
> > > > > topic element looks like this:
> > > >
> > > > >    <topic id="ele34">
> > > > >      <subjectIdentity>
> > > > >        <subjectIndicatorRef
> > > > >          xlink:href="http://www.altheim.com/zoo/elephant.html"/>
> > > > >      </subjectIdentity>
> > > > >      <baseName><!-- Zulu -->
> > > > >        <scope><topicRef xlink:href="../language.xtm#zu"/></scope>
> > > > >        <baseNameString>ndofu</baseNameString>
> > > > >      </baseName>
> > > > >      <baseName><!-- Korean -->
> > > > >        <scope><topicRef xlink:href="../language.xtm#ko"/></scope>
> > > > >        <baseNameString>코끼리</baseNameString>
> > > > >      </baseName>
> > > > >      <baseName><!-- Japanese -->
> > > > >        <scope><topicRef xlink:href="../language.xtm#ja"/></scope>
> > > > >        <baseNameString>象</baseNameString>
> > > > >      </baseName>
> > > > >      <baseName><!-- English -->
> > > > >        <scope><topicRef xlink:href="../language.xtm#en"/></scope>
> > > > >        <baseNameString>elephant</baseNameString>
> > > > >      </baseName>
> > > > >      ...
> > > > >    </topic>
> > > >
> > > > > I've included "elephant" basenames in Zulu, Korean, Japanese
> > > > > and English. This is completely typical XTM, as shown in the
> > > > > examples in the XTM spec. I'm not sure why you'd want to do
> > > > > anything different, such as divide it into four topics. If
> > > > > we assume that
> > > > >
> > > > >    "http://www.altheim.com/zoo/elephant.html"
> > > > >
> > > > > is an acceptable indicator of the subject "elephant", then all
> > > > > four of the divided topics would appropriately contain a
> > > > > <subjectIdentity> pointing to that URL, and would be merged
> > > > > back into what you see above the first time they were processed
> > > > > through a compliant topic map engine. People in four languages
> > > > > could locate the topic (whose ID is "ele34") by searching in
> > > > > the scope of any of the four languages provided.
> > > >
> > > > Murray
> > > >
> > > >
> > > > > ......................................................................
> > > > > Murray Altheim              <mailto:m.altheim @ open.ac.uk>
> > > > Knowledge Media Institute
> > > > The Open University, Milton Keynes, Bucks, MK7 6AA, UK
> > > >
> > > >       In the evening
> > > >       The rice leaves in the garden
> > > >       Rustle in the autumn wind
> > > >       That blows through my reed hut.  -- Minamoto no Tsunenobu
> > > >
> > > >
> > >
> >
> >
>
>